Configuring post-filtering for community libraries

You can use SearchCellConfig commands to configure post-filtering for community libraries. Post-filtering is disabled by default.

Before you begin

To access configuration files, you must use the IBM® WebSphere® Application Server wsadmin client. See Starting the wsadmin client for details.

About this task

Pre-filtering is a process that happens before you enter a search query. It involves collecting all the access control lists (ACLs) for user content that is not public. These ACLs are added to the search query and are used for searching private content in addition to public content. For community library files that are part of a private community, the ACL of the private community is added to the member's search query.

Post-filtering takes place when the search results are returned from the index after a search query is run. After the search results are returned, the Search application must verify with the filtering service that the user who performed the search has the appropriate level of access to see returned documents. If the user is not allowed to see a file, that file is excluded from the search results and other documents are returned to take the place of the excluded file. Post-filtering is only relevant to community libraries.
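
As a rough illustration of the flow described above, the following Python sketch walks the ranked results, asks an access check about each one, and lets later documents take the place of excluded ones. The names post_filter and can_read are hypothetical stand-ins for the Search application and the filtering service; they are not part of any Connections API.

```python
def post_filter(ranked_results, can_read, page_size):
    """Return up to page_size results that the user is allowed to see.

    ranked_results: documents in the order the index returned them.
    can_read: callable standing in for the filtering service (hypothetical).
    """
    page = []
    for doc in ranked_results:
        if len(page) == page_size:
            break
        if can_read(doc):
            page.append(doc)
        # Documents that fail the check are skipped, so the next
        # readable document takes their place on the page.
    return page


# Illustrative data: the user is not allowed to see "lib-2".
results = ["lib-1", "lib-2", "lib-3", "lib-4"]
visible = post_filter(results, lambda doc: doc != "lib-2", 3)
```

In this sketch, "lib-2" is dropped and "lib-4" moves up to complete the page of three results.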

When you use post-filtering, the sI (start index) parameter is not supported; however, you can use the page and ps (page size) parameters to control the size of the results list.
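
For instance, a request for the second page of 25 results would carry page=2 and ps=25 rather than a start index. The short sketch below only builds such a query string with the Python standard library; the query parameter name and the search term are assumptions for illustration, and no endpoint URL is implied.

```python
from urllib.parse import urlencode

# "query" and its value are illustrative assumptions;
# page and ps are the parameters named in the text above.
params = [("query", "budget"), ("page", 2), ("ps", 25)]
query_string = urlencode(params)
```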

Note: Post-filtering is not required under the following circumstances:
  • You are using IBM Connections Content Manager 4.5 CR 1 or later, and
  • You are using features that are only exposed in the Connections user interface and APIs.
Post-filtering must be enabled if you are using FileNet features to restrict access to a greater degree than the restrictions made in the Connections user interface.

Procedure

To configure post-filtering for community libraries, complete the following steps.
  1. Start the wsadmin client from one of the following directories on the system on which you installed the Deployment Manager:

    Linux: app_server_root/profiles/dm_profile_root/bin

    Windows: app_server_root\profiles\dm_profile_root\bin

    where app_server_root is the WebSphere® Application Server installation directory and dm_profile_root is the Deployment Manager profile directory, typically dmgr01.

    You must start the client from this directory or subsequent commands that you enter do not execute correctly.

  2. After the wsadmin command environment has initialized, enter the following command to initialize the Search environment and start the Search script interpreter:
    execfile("searchAdmin.py")
    If prompted to specify a service to connect to, type 1 to pick the first node in the list. Most commands can run on any node. If the command writes or reads information to or from a file using a local file path, you must pick the node where the file is stored.
    When the command is run successfully, the following message displays:
    Search Administration initialized
  3. Check out the Search cell-level configuration file, search-config.xml, with the following command:

    SearchCellConfig.checkOutConfig("working_dir", "cellName")

    Where:
    • working_dir is the temporary directory to which you want to check out the cell-level configuration file. This directory must exist on the server where you are running the wsadmin client. Use forward slashes to separate directories in the file path, even if you are using the Microsoft Windows operating system.
      Note: AIX® and Linux only: The directory must grant write permissions or the command does not run successfully.
    • cellName is the name of the cell that the Search node belongs to. The command is case-sensitive. If you do not know the cell name, you can determine it by typing the following command in the wsadmin command processor:

      print AdminControl.getCell()

    For example:
    SearchCellConfig.checkOutConfig("c:/search_temp", "SearchServerNode01Cell")
  4. Use the following commands as needed:
    SearchCellConfig.enableEcmPostFiltering()

    Enables post-filtering for community libraries. Post-filtering is disabled by default.

    This command does not take any parameters.

    SearchCellConfig.disableEcmPostFiltering()

    Disables post-filtering for community libraries. Post-filtering is disabled by default.

    This command does not take any parameters.

    SearchCellConfig.setEcmPostFilteringMultiplier(multiplier)

    Sets the multiplier for post-filtering.

    When a user requests a certain page size for search results, the Search application attempts to populate the page with the specified number of results. Because post-filtering can exclude some documents, the application must check more documents than the page size; for example, if the user requests a page size of 10, the Search application checks more than 10 documents. However, a limit is required to avoid performance issues. A multiplier of 3 specifies that up to 30 documents are loaded to identify 10 documents to which the user has access. In most cases, this should be enough to fill the page. If the page cannot be fully populated after checking all 30 documents, a page with fewer search results is returned to the user.

    If you frequently receive partially filled search result pages in Connections, consider increasing this parameter.

    This command takes a single parameter:
    • multiplier. A positive integer by which the requested page size is multiplied to determine how many documents are checked in the attempt to populate the search results page.
    For example:
    SearchCellConfig.setEcmPostFilteringMultiplier(20)
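
The effect of the multiplier can be sketched in plain Python. This is only a model of the behavior described above, not the product's implementation; fill_page, readable, and the sample data are hypothetical names introduced for illustration.

```python
def fill_page(ranked_results, can_read, page_size, multiplier):
    """Check at most page_size * multiplier documents to fill one page."""
    max_checked = page_size * multiplier
    page = []
    for doc in ranked_results[:max_checked]:
        if can_read(doc):
            page.append(doc)
            if len(page) == page_size:
                break
    return page


# 100 candidate documents; the user can read only every fourth one.
candidates = list(range(100))
readable = lambda doc: doc % 4 == 0

# Multiplier 3: up to 30 documents are checked, so the page is only partly filled.
partial_page = fill_page(candidates, readable, 10, 3)

# Multiplier 5: up to 50 documents are checked, enough to fill the page of 10.
full_page = fill_page(candidates, readable, 10, 5)
```

With these made-up numbers, the smaller multiplier returns a partially filled page, which is the symptom that a larger multiplier is meant to address.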

    SearchCellConfig.setEcmPostFilteringMaxGapSize(maxGapSize)

    Sets the maximum gap size that is allowed for post-filtering.

    If a user uses the pagination controls in the Search user interface, post-filtering calculation is performed when jumping from page 1 of the search results to, for example, page 4. However, you might not want to allow post-filtering calculation when jumping to page 100 for performance reasons. This command specifies the maximum gap that is allowed for post-filtering calculations between the current page and the requested page.

    This command takes a single parameter:
    • maxGapSize. A positive integer that specifies the maximum gap that is allowed between the current page (for which the accurate index is known) and the requested page for post-filtering calculations.
    For example:
    SearchCellConfig.setEcmPostFilteringMaxGapSize(250)
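
A minimal sketch of the gap check follows. It assumes that the gap is counted in pages, which the text above does not state explicitly, and jump_allowed is a hypothetical helper rather than a product function.

```python
def jump_allowed(current_page, requested_page, max_gap_size):
    """Return True if the jump stays within the configured maximum gap.

    Assumption: the gap is counted in pages; the unit used by the
    product is not stated in the documentation.
    """
    return abs(requested_page - current_page) <= max_gap_size


near = jump_allowed(1, 4, 250)    # small jump, within the gap
far = jump_allowed(1, 400, 250)   # large jump, exceeds the gap
```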

    SearchCellConfig.setEcmPostFilteringConnectionTimeOut(connectionTimeOutInMillis)

    Sets the connection timeout value for post-filtering.

    If the timeout occurs, community library documents are removed from the search results. Results for community documents that have no access control are still shown.

    This command takes a single parameter:
    • connectionTimeOutInMillis. A positive integer that specifies the connection timeout for post-filtering in milliseconds.
    For example:
    SearchCellConfig.setEcmPostFilteringConnectionTimeOut(1000)

    SearchCellConfig.setEcmPostFilteringSocketDataTimeOut(socketDataTimeOutInMillis)

    Sets the socket data timeout value for post-filtering.

    If the timeout occurs, community library documents are removed from the search results. Results for community documents that have no access control are still shown.

    This command takes a single parameter:
    • socketDataTimeOutInMillis. A positive integer that specifies the socket data timeout for post-filtering in milliseconds.
    For example:
    SearchCellConfig.setEcmPostFilteringSocketDataTimeOut(3000)

    SearchCellConfig.setEcmPostFiltering(multiplier,maxGapSize,connectionTimeOutInMillis,socketDataTimeOutInMillis)

    Enables post-filtering settings for community libraries with the values that you specify.

    This command takes the following parameters:
    • multiplier. A positive integer by which the requested page size is multiplied to determine how many documents are checked in the attempt to populate the search results page.
    • maxGapSize. A positive integer that specifies the maximum gap that is allowed between the current page (for which the accurate index is known) and the requested page for post-filtering calculations.
    • connectionTimeOutInMillis. A positive integer that specifies the connection timeout for post-filtering in milliseconds.
    • socketDataTimeOutInMillis. A positive integer that specifies the socket data timeout for post-filtering in milliseconds.
    For example:
    SearchCellConfig.setEcmPostFiltering(100,5100,30000,60000)
    Note: This example would be suitable for a community library with approximately 500,000 ECM files. You might need to experiment with the parameters to find the values that give the best search results.
  5. Check in the updated search-config.xml configuration file using the following wsadmin client command:

    SearchCellConfig.checkInConfig()

  6. To exit the wsadmin client, type exit at the prompt.
  7. Stop the affected servers and then start them again to put the configuration changes into effect. If the change you made affects a node, you must stop and restart all of the servers on that node. Similarly, if the change you made affects a cell, you must stop and restart all of the servers in that cell.
    Note: For a high-availability deployment, stop and start the servers in turn to ensure that the Search application is still available to your users.
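
Steps 3 through 5 can be collected into a small helper, sketched below. The function name configure_post_filtering and its argument names are hypothetical; the methods it calls are the SearchCellConfig commands listed in this procedure. Calling enableEcmPostFiltering before setEcmPostFiltering is a conservative assumption, because the text does not say whether setEcmPostFiltering alone turns the feature on.

```python
def configure_post_filtering(cfg, working_dir, cell_name,
                             multiplier, max_gap_size,
                             connection_timeout_ms, socket_timeout_ms):
    """Check out search-config.xml, apply post-filtering settings, check in.

    cfg is expected to be the SearchCellConfig object that is available
    after execfile("searchAdmin.py") has run in the wsadmin client.
    """
    settings = (("multiplier", multiplier),
                ("max_gap_size", max_gap_size),
                ("connection_timeout_ms", connection_timeout_ms),
                ("socket_timeout_ms", socket_timeout_ms))
    # Every setting is documented as a positive integer, so reject
    # anything else before the configuration file is checked out.
    for name, value in settings:
        if not isinstance(value, int) or value <= 0:
            raise ValueError("%s must be a positive integer" % name)

    cfg.checkOutConfig(working_dir, cell_name)
    cfg.enableEcmPostFiltering()
    cfg.setEcmPostFiltering(multiplier, max_gap_size,
                            connection_timeout_ms, socket_timeout_ms)
    cfg.checkInConfig()


# Inside wsadmin, using the example values from this procedure:
# configure_post_filtering(SearchCellConfig, "c:/search_temp",
#                          "SearchServerNode01Cell", 100, 5100, 30000, 60000)
```

Passing the configuration object as an argument keeps the helper usable with a stand-in object for a dry run. The servers must still be restarted as described in step 7 for the change to take effect.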