Limiting search terms and characters from the search query

You can limit the terms and characters that are processed in a search query by removing unimportant words, preventing stemming, or disabling wildcard and other character searches.

Procedure

  • Removing unimportant words from the search query:
    Stop words are common words that are typically unimportant to a search, such as "the", "and", or "for", and are removed from the query. They are defined in the following file:
    • solrhome/MC_masterCatalogID/locale/CatalogEntry/conf/stopwords.txt

    Stop words are considered at both indexing and querying time.

    For example, if a shopper searches for "the shirt" in the storefront, Solr skips "the".

    If you are using the AND search type, no search results are returned, because "the" is defined in the stopwords.txt file. For more information, see StopFilterFactory.

  • Preventing stemming:
    To protect certain words from being stemmed, add them to the protwords.txt file.
  • Disabling wildcard and other character searches:
    Wildcard searching is enabled by default, but if necessary, you can disable it for runtime performance or security reasons:
    • Performance might be impacted, as a wildcard search that uses a common term might return many documents from the search index.
    • Security might be a consideration, as Solr does not analyze and apply filters to wildcard searches.

    A prohibited words list stops a matching search request from being processed any further. The list is configurable in the wc-component.xml file.

    For example, by default, when you search for *, the request is routed to the Prohibited Characters store page.

    The default configuration is:
    
    <_config:property name="StopPatterns" 
    value="\*,~,\?,&apos;&apos;,&quot;&quot;,.*\\.*,.*/.*,.*\|.*" /> 

    To disable wildcard (*) searches or searches that contain other characters, update this configuration by adding patterns in regular expression format.
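
The following Python sketch illustrates how a comma-separated list of regular expressions such as the default StopPatterns value might be evaluated. It is not the server's implementation. It assumes that a search term is prohibited when any pattern matches the entire term, and the function names and the extra patterns in strict_patterns are hypothetical. Verify the actual matching behavior in your own environment before you change the configuration.

```python
import re

# Default StopPatterns value from wc-component.xml, with the XML entities
# decoded (&apos; becomes ' and &quot; becomes ").
DEFAULT_STOP_PATTERNS = r"""\*,~,\?,'',"",.*\\.*,.*/.*,.*\|.*"""


def compile_stop_patterns(value):
    """Split the comma-separated property value and compile each entry."""
    return [re.compile(pattern) for pattern in value.split(",")]


def is_prohibited(search_term, patterns):
    """Assumption: a term is prohibited if any pattern matches the whole term."""
    return any(pattern.fullmatch(search_term) for pattern in patterns)


default_patterns = compile_stop_patterns(DEFAULT_STOP_PATTERNS)

# Hypothetical customization: also reject any term that contains * or ?
strict_patterns = compile_stop_patterns(DEFAULT_STOP_PATTERNS + r",.*\*.*,.*\?.*")

for term in ["*", "shirt", "shirt*", "a/b"]:
    print(term, is_prohibited(term, default_patterns), is_prohibited(term, strict_patterns))
```

Under these assumptions, a bare * is rejected by the default patterns, while a term such as shirt* is rejected only after a pattern such as .*\*.* is added to the list.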