Extracting values from inconsistent patterns using regular expressions

Sometimes the pattern that you want to find on a hit is not consistent. For example, an error message could be formatted like the following:


<error id="35">Coupon Code is Invalid</error>

The id="35" part is variable, since it identifies the actual message. If you only want to retrieve the text part (Coupon Code is Invalid), you cannot use a Basic Mode hit attribute alone, since the hit attribute requires strictly consistent patterns to match. Instead, use a hit attribute that returns a larger string of text, and then extract the wanted portion to be the value.

So, first make a hit attribute that matches the consistent parts of the pattern:


Hit Attribute Matching a Consistent Pattern

This configuration returns 35">Coupon Code is Invalid as the value. However, the wanted value may be just the Coupon Code is Invalid message.

To limit the pattern to only match for the message, you must apply a regular expression to the pattern to extract the wanted text. Below is the modified JavaScript:


function NS$E__ERROR_IN_RESPONSE_WITH_REGEXP__634290902173496582()
{
  if ($P["NS.P__ERROR_IN_RESPONSE__634290901401436582"].patternFound())
  {
    $P["NS.P__ERROR_IN_RESPONSE__634290901401436582"].lastValue().
      match('.*?">(.*)$');

    $F.setFact
     ("NS.F_E__ERROR_IN_RESPONSE_WITH_REGEXP__634290902173496582", RegExp.$1);
  }
}

The regular expression is defined in the following snippet:


$P["NS.P__ERROR_IN_RESPONSE__634290901401436582"].lastValue().
  match('.*?">(.*)$');
  • The match('.*?">(.*)$') part runs the regular expression on the value that is returned by the last match of hit attribute NS.P__ERROR_IN_RESPONSE__634290901401436582 on the hit.
    • If you want to run the regular expression on the first value, replace lastValue() with firstValue().
    • Since the starting value is 35">Coupon Code is Invalid and we want Coupon Code is Invalid, we want to match on the part after 35">. The .*?"> part of the regular expression matches everything up to and including the first ">. The (.*)$ part captures the rest of the value, which is the text that is extracted.

$F.setFact("NS.F_E__ERROR_IN_RESPONSE_WITH_REGEXP__634290902173496582",
  RegExp.$1);
  • This snippet is a standard setFact call, except that the fact value is RegExp.$1, which holds the first value extracted by the regular expression operation.
    • Regular expressions can theoretically extract multiple values. To use the second extracted value, insert RegExp.$2. To use the third value, insert RegExp.$3.
  • RegExp is a global variable on an event. You do not have to declare it or set it. It is set automatically.
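
The extraction described above can be tried outside of an event. The following is a minimal sketch in plain JavaScript, assuming a runtime that supports the legacy RegExp.$1 static properties (Node.js, for example). The sampleValue string stands in for the value returned by lastValue(), and the helper names extractMessage and extractIdAndMessage are hypothetical; they are not part of the product.

```javascript
// Stand-in for the value that the hit attribute returns in the example above.
var sampleValue = '35">Coupon Code is Invalid';

// Hypothetical helper: returns the text after the first "> sequence, or null.
function extractMessage(value) {
  // match() with a string argument compiles the string as a regular expression.
  var result = value.match('.*?">(.*)$');
  // Read RegExp.$1 only when the match succeeded.
  return result ? RegExp.$1 : null;
}

// Hypothetical helper with two capture groups: the id digits and the message.
function extractIdAndMessage(value) {
  var result = value.match('^(\\d+)">(.*)$');
  return result ? { id: RegExp.$1, message: RegExp.$2 } : null;
}

console.log(extractMessage(sampleValue));         // Coupon Code is Invalid
console.log(extractIdAndMessage(sampleValue).id); // 35
```

Checking the return value of match() before reading RegExp.$1 is a defensive choice in this sketch: it avoids storing a value when the pattern did not match.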

The basic syntax is as follows:


function <EVENT>()
{
  if (<CONDITION>)
  {
    <OBJECT>.match('<REGULAR EXPRESSION>');

    $F.setFact("<FACT>", RegExp.$<EXTRACTED VALUE#>);
  }
}

The RegExp variable gets its values from the nearest previous match function. Suppose the code looks like the following:


<OBJECT 1>.match('<REGULAR EXPRESSION 1>');
$F.setFact("<FACT>", RegExp.$1);
<OBJECT 2>.match('<REGULAR EXPRESSION 2>');
$F.setFact("<FACT>", RegExp.$1);

The second RegExp.$1 reference uses the first extracted value from the second regular expression, which ran on object 2, instead of the value from the regular expression that ran on object 1.
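
This behavior can be observed with two plain strings. The sketch below is illustrative only and assumes a runtime with the legacy RegExp.$1 properties (Node.js, for example). The second sample value and the helper name extractBoth are hypothetical.

```javascript
// Same pattern as in the example above.
var pattern = '.*?">(.*)$';

// Hypothetical helper: runs two matches in sequence and records RegExp.$1
// immediately after each one.
function extractBoth(value1, value2) {
  value1.match(pattern);
  var fromFirst = RegExp.$1;  // copy now; the next match overwrites RegExp.$1

  value2.match(pattern);
  var fromSecond = RegExp.$1; // reflects the second match only

  return [fromFirst, fromSecond];
}

var results = extractBoth('35">Coupon Code is Invalid',
                          '72">Card Number is Missing'); // second value is made up
console.log(results[0]); // Coupon Code is Invalid
console.log(results[1]); // Card Number is Missing
```

If a value from an earlier match is needed later, copying RegExp.$1 into a local variable right after that match, as the sketch does, keeps it available.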