Deleting persisted seedlist data

You can free up disk space by deleting persisted seedlists from your system using the SearchService.flushPersistedCrawlContent command.

Before you begin

See Starting the wsadmin client for information about how to start the wsadmin command-line tool.

About this task

Persisted seedlists can take up a large amount of disk space when your deployment contains a large volume of content. If you know that a particular set of crawled content is no longer needed, you can free up disk space by using the SearchService.flushPersistedCrawlContent command to delete the persisted data. This command clears only the persisted seedlists in the default persistence location. If you want to delete seedlists that were crawled using the startBackgroundCrawl, startBackgroundFileContentExtraction, or startBackgroundIndex commands, you must delete them manually.

You might also want to use the SearchService.flushPersistedCrawlContent command to remove old data when you are about to recrawl the entire system with the persistence option enabled. Where previously persisted data still exists, you can use the command to purge old data from the system before generating a more up-to-date copy.
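Because seedlists written outside the default persistence location must be deleted manually, it can help to check how much space such a directory occupies before removing it. The following is a minimal sketch that assumes a standard Python interpreter run outside wsadmin. The function name directory_size_bytes is illustrative and is not part of the product.

```python
import os


def directory_size_bytes(root):
    """Return the total size, in bytes, of all regular files under root."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip entries that are not regular files (for example, broken links).
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total
```

Call the function with the directory that you passed as the persistence location when you ran the background crawl. That path is specific to your deployment.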

Procedure

To delete persisted seedlists, complete the following steps.
  1. Start the wsadmin client from one of the following directories on the system on which you installed the Deployment Manager:

    Linux: app_server_root/profiles/dm_profile_root/bin

    Windows: app_server_root\profiles\dm_profile_root\bin

    where app_server_root is the WebSphere® Application Server installation directory and dm_profile_root is the Deployment Manager profile directory, typically dmgr01.

    You must start the client from this directory. Otherwise, subsequent commands that you enter do not run correctly.

  2. After the wsadmin command environment has initialized, enter the following command to initialize the Search environment and start the Search script interpreter:
    execfile("searchAdmin.py")
    If you are prompted to specify a service to connect to, type 1 to pick the first node in the list. Most commands can run on any node. If a command reads from or writes to a file using a local file path, you must pick the node where the file is stored.
    When the command runs successfully, the following message is displayed:
    Search Administration initialized
  3. Run the following command:
    SearchService.flushPersistedCrawlContent()
    This command deletes the current persisted seedlists. It does not take any input parameters.
    Note: This command clears only the persisted seedlists in the default persistence location. Seedlists that were crawled using the startBackgroundCrawl, startBackgroundFileContentExtraction, or startBackgroundIndex commands must be deleted manually.
    Note: Do not run this command while a crawl is in progress.

    When the command runs successfully, 1 is printed to the wsadmin console. If the command does not run successfully, 0 is printed to the wsadmin console.
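If you run the command from a script rather than interactively, you might want to check the 1 or 0 result explicitly. The following is a minimal sketch that assumes the value printed to the console is also the value returned by the call. The helper name flush_and_check is hypothetical and is not part of searchAdmin.py.

```python
def flush_and_check(flush_command):
    """Run a zero-argument flush command and interpret its 1/0 result.

    Inside wsadmin, flush_command would be
    SearchService.flushPersistedCrawlContent (assumption: the call
    returns the same 1 or 0 that is printed to the console).
    """
    result = flush_command()
    # 1 means that the persisted seedlists were deleted; anything else is a failure.
    if str(result).strip() == "1":
        return "Persisted seedlists deleted."
    raise RuntimeError("flushPersistedCrawlContent failed (result: %r)" % (result,))
```

In a wsadmin session where searchAdmin.py has already been loaded, the call would be flush_and_check(SearchService.flushPersistedCrawlContent). As with the command itself, do not run it while a crawl is in progress.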