Site content crawling in a staging environment

In a staging environment, the site content crawler crawls site content and can be configured to start indexing of the internal site automatically (auto-index=true), so that the indexed content can be tested in staging. After all content is approved, the site content index update can be replicated to the repeater as part of indexprop. Unmanaged content asset files must be copied to production manually. When you manually publish site content from staging to production, ensure that the manifest.txt file is also replicated, because this file is later used for emergency updates.

For regular updates, indexing is done on the staging server and the resulting index is then replicated to production. In this case, the manifest.txt file does not need to be copied, because it is needed only for indexing, which takes place in staging.

For emergency updates, indexing is done on the production server against the repeater. In this case, the manifest.txt file must be copied to production.
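
The difference between the two cases can be summarized in a small helper. This is only an illustrative sketch: the names UpdateType and files_to_copy are hypothetical and are not part of the product. It assumes that the only difference in what is copied is whether manifest.txt travels with the asset files.

```python
from enum import Enum


class UpdateType(Enum):
    """Hypothetical labels for the two update paths described above."""
    REGULAR = "regular"      # indexed in staging, index replicated to production
    EMERGENCY = "emergency"  # indexed in production against the repeater


def files_to_copy(update_type, asset_files, manifest_name="manifest.txt"):
    """Return the files to copy manually from staging to production.

    Asset files are always copied. The manifest is added only for
    emergency updates, because indexing then runs in production.
    """
    to_copy = list(asset_files)
    if update_type is UpdateType.EMERGENCY and manifest_name not in to_copy:
        to_copy.append(manifest_name)
    return to_copy


if __name__ == "__main__":
    assets = ["docs/returns.pdf", "docs/faq.html"]
    print(files_to_copy(UpdateType.REGULAR, assets))
    print(files_to_copy(UpdateType.EMERGENCY, assets))
```

A checklist or copy script could call such a helper so that the manifest is not forgotten on the emergency path.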

The following outline shows the high-level procedure for applying emergency fixes to site content:
  1. After the site content is tested in staging, the updated asset files are manually copied to production.
  2. In the production system, an IT Administrator updates the correct manifest.txt file to include only the set of files that must be reindexed.
  3. The site content index rebuild script is started from the production system so that the repeater is updated with the latest site content changes.
  4. After the repeater is reindexed, the subordinate search servers in production detect these updates and immediately initiate the replication.
  5. Optional: If the updated site content is cached, a manual invalidation is required.
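
Step 2 can be scripted rather than edited by hand. The following is a minimal sketch, not the product's own tooling. The layout of manifest.txt is not described in this topic, so the sketch makes a loud assumption: each manifest entry occupies one line and contains the path of the file it refers to. The function name filter_manifest and the sample paths are hypothetical.

```python
def filter_manifest(manifest_lines, changed_files):
    """Keep only the manifest lines that mention one of the changed files.

    Assumption: one entry per line, and the entry text contains the file
    path. Lines are kept verbatim (minus surrounding whitespace), so
    whatever other fields an entry carries are preserved.
    """
    kept = []
    for line in manifest_lines:
        entry = line.strip()
        # Skip blank lines; keep entries that reference a changed file.
        if entry and any(name in entry for name in changed_files):
            kept.append(entry)
    return kept


if __name__ == "__main__":
    # Hypothetical manifest content, for illustration only.
    manifest = [
        "/content/docs/returns.pdf",
        "/content/docs/faq.html",
        "/content/docs/shipping.pdf",
    ]
    for entry in filter_manifest(manifest, ["returns.pdf", "shipping.pdf"]):
        print(entry)
```

Substring matching is deliberately loose and can over-match similar names (for example, a.pdf also matches data.pdf), so the result should be reviewed before the index rebuild script is started.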

In both cases (regular updates and emergency updates), you must manually copy the updated unmanaged content asset files to production. For deleted files, the corresponding index entries must be deleted manually on the server where indexing occurs, by running the di-buildindex utility with webcontentDelete set to true.