Using regular expressions in hit attributes

In many situations, the start and end tags of a hit attribute cannot provide an exclusive and unique match to an item of interest on the page. For example, suppose you have the following HTML on the page:


<error id="35">Coupon Code is invalid</error>

In the above case, the value for the id may change with each message. As a result, a basic hit attribute cannot be constructed to retrieve this error message every time.

Optionally, you can apply a regular expression to the strings that are matched by the hit attribute. After the hit attribute's start and end tags find one or more matches in the scanned text, the regular expression is applied to the matching strings to further refine the results, returning one or more values for the hit attribute.

When you configure a hit attribute as an event condition, the Match Count and Hit Attribute Found operators are based on the number of values that are returned by the hit attribute after the regular expression is applied.

Suppose the hit attribute is defined with the following tags:

  • Start tag: <error id="
  • End tag: </error>

In a hit, the following error messages are detected by the hit attribute:


<error id="35">Coupon Code is invalid</error>
<error id="15">Please enter a zip code</error>
<error id="12">Please enter a state</error>
<error id="13">The credit card is invalid</error>

Suppose you want the hit attribute to track only the error messages that are displayed when a required text entry is not provided at all. If these messages all begin with Please enter, the following regular expression matches them:


^Please enter

In the above example, the regular expression matches errors 12 and 15. If the event is configured to record the match count, the count is 2.

  • The first returned value is Please enter a zip code, and the second (and last) value is Please enter a state.
  • If a regular expression is not applied to the above set, the match count is 4. The first value is 35">Coupon Code is invalid, and the last value is 13">The credit card is invalid.
    Note: When the hit attribute is used as a condition in an event, the values that are returned after the regular expression is applied are the values available for evaluation.
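
The two-stage behavior described above (tag matching first, regular expression filtering second) can be approximated outside the product. The following Python sketch is an illustration only and is not how Discover is implemented: extract_between and apply_filter are hypothetical helper names. Because the strings extracted in this sketch still begin with the tail of the opening tag (for example 15">), the sketch uses the unanchored pattern Please enter.* instead of ^Please enter; how the product itself positions the ^ anchor is not something the sketch asserts.

```python
import re

# Sample hit content, using the four error messages from the example above.
HIT = (
    '<error id="35">Coupon Code is invalid</error>\n'
    '<error id="15">Please enter a zip code</error>\n'
    '<error id="12">Please enter a state</error>\n'
    '<error id="13">The credit card is invalid</error>\n'
)


def extract_between(text, start_tag, end_tag):
    """Stage 1 (hypothetical helper): collect every string between the tags."""
    values = []
    pos = 0
    while True:
        start = text.find(start_tag, pos)
        if start == -1:
            break
        start += len(start_tag)
        end = text.find(end_tag, start)
        if end == -1:
            break
        values.append(text[start:end])
        pos = end + len(end_tag)
    return values


def apply_filter(values, pattern):
    """Stage 2 (hypothetical helper): keep the matched text of passing strings."""
    regex = re.compile(pattern)
    results = []
    for value in values:
        match = regex.search(value)
        if match:
            results.append(match.group(0))
    return results


raw = extract_between(HIT, '<error id="', '</error>')
filtered = apply_filter(raw, r'Please enter.*')

print(len(raw), raw[0], raw[-1])   # match count without a regular expression
print(len(filtered), filtered)     # match count after filtering
```

In this sketch, len(raw) plays the role of the unfiltered match count and len(filtered) the role of the match count after the regular expression is applied.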

After you define the regular expression, you can test the hit attribute in the Event Tester. See Event Tester.

When the configured hit attribute is functioning properly, you can create a copy of it and modify the regular expression to detect invalid entry error messages. If these messages all end with invalid, the following regular expression matches them:


invalid$

In the above example, the regular expression matches errors 13 and 35.
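
As a quick check of the pattern above, the following Python sketch applies invalid$ to the four strings that the example tags would extract. It is a minimal illustration using Python's re module, not the product's own matching engine; the variable names are assumptions made for the example.

```python
import re

# Strings between the start tag <error id=" and the end tag </error>,
# as listed in the example above.
extracted = [
    '35">Coupon Code is invalid',
    '15">Please enter a zip code',
    '12">Please enter a state',
    '13">The credit card is invalid',
]

# $ anchors the pattern to the end of each extracted string.
ends_with_invalid = re.compile(r'invalid$')

kept = [s for s in extracted if ends_with_invalid.search(s)]

# Recover the id portion (the text before the first double quote).
ids = [s.split('"', 1)[0] for s in kept]

print(ids)
```

Only the strings for errors 35 and 13 pass the filter, so a match count based on this result would be 2.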

Note: Regular expressions are considered a developer-level method for matching strings. When improperly specified, they can consume significant resources. Discover recommends applying them cautiously.
Note: You may also use regular expressions in conditions through Advanced Mode. See Advanced Mode for Events.

Limitations in the use of regular expressions in hit attributes

The hit attribute matchCount() returns the number of strings in the hit that passed the regular expression filtering.

  • If the hit attribute matches multiple times in a single string, the first matching instance in the string is returned. The Match Count, however, is still the number of matches for the hit.
  • Regular expressions may be up to 256 characters in length.
    Note: Avoid creating regular expressions that match multiple instances in a single string.
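
To see why multiple instances in one string are discouraged, consider the following Python sketch. It is only an analogy under stated assumptions: first_instance is a hypothetical helper, the sample string is invented, and the length check simply mirrors the 256-character limit described above.

```python
import re

MAX_PATTERN_LENGTH = 256  # limit stated in the list above


def first_instance(pattern, value):
    """Hypothetical helper: return only the first matching instance."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern exceeds 256 characters")
    match = re.search(pattern, value)
    return match.group(0) if match else None


# Invented sample string in which the pattern occurs twice.
value = 'invalid coupon, invalid zip'

all_instances = re.findall(r'invalid \w+', value)
first = first_instance(r'invalid \w+', value)

print(all_instances)  # every instance in the string
print(first)          # the single value a first-match rule would return
```

The second instance is never returned under a first-match rule, which is why a pattern that matches a string exactly once is easier to reason about.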

If a grouping operator, which is specified by enclosing parentheses, is present in the regular expression, the first group pattern is returned. Grouping operators beyond the first group are not returned.

  • If no grouping operators are specified, the entire matching string is returned.
  • If a grouping operator is required before the group that you want returned, make the earlier group non-capturing by adding ?: after its opening parenthesis. In the following example, the parenthesized text is not captured, so the first capturing group that follows it is the one returned:
    
    (?:not capturing this group)
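
The grouping rules above can be tried out with Python's re module, whose (?:...) syntax is the same. This is a sketch of the described behavior, not the product's code: returned_value is a hypothetical helper, and the patterns are invented for illustration.

```python
import re


def returned_value(pattern, value):
    """Hypothetical helper: first capturing group if any, else whole match."""
    match = re.search(pattern, value)
    if match is None:
        return None
    # Pattern.groups is the number of capturing groups in the pattern.
    if match.re.groups >= 1:
        return match.group(1)
    return match.group(0)


value = '15">Please enter a zip code'

# No grouping operator: the entire matching string is returned.
no_group = returned_value(r'Please enter.*', value)

# Two capturing groups: only the first group (the id) is returned.
two_groups = returned_value(r'(\d+)">(Please enter.*)', value)

# The first group is non-capturing, so the message group becomes the first.
non_capturing = returned_value(r'(?:\d+)">(Please enter.*)', value)

print(no_group)
print(two_groups)
print(non_capturing)
```

Switching the leading group to (?:...) is what moves the message text into the first, and therefore returned, group.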
    
    

Case-sensitive regular expression filtering is controlled by the Case Sensitive check box in the Hit Attribute definition.
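
The effect of the check box can be pictured with the following Python sketch. It assumes, for illustration only, that clearing Case Sensitive behaves like a case-insensitive flag on the regular expression; compile_filter and its case_sensitive parameter are hypothetical names, and the uppercase sample string is invented.

```python
import re


def compile_filter(pattern, case_sensitive=True):
    """Hypothetical helper: compile the pattern with or without case folding."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(pattern, flags)


# Invented sample in which the message casing differs from the pattern.
value = '15">PLEASE ENTER a zip code'

sensitive = compile_filter(r'Please enter', case_sensitive=True).search(value)
insensitive = compile_filter(r'Please enter', case_sensitive=False).search(value)

print(sensitive is None)      # the case-sensitive filter rejects the string
print(insensitive.group(0))   # the case-insensitive filter accepts it
```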