Ingest Attribute index pipeline

The complete data mappings from specification, database and schema are shown for the Attribute category.

Attribute index field mapping from database

​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​The following sequence of steps ​illustrates the Attribute indexing pipeline implemented in Apache NiFi. For information on calling the Ingest service, see Search Ingest Service API. For a complete listing of Elasticsearch index fields and parameters, see Elasticsearch index field types.

The flow consists of mainly two stages:
  1. Creating Attribute document
  2. Associating attribute values

Stage 1: Creating an Attribute ​​document​​​​​​​​​​​​​​​​​

This stage describes how to transform the Attribute data and load it into the Attribute index. It starts with running the following SQL to retrieve Attribute data from the Commerce database.
 SELECT A.ATTR_ID, COALESCE(AD.LANGUAGE_ID, L.LANGUAGE_ID) LANGUAGE_ID, 
		A.STOREENT_ID, A.ATTRTYPE_ID, A.ATTRUSAGE, A.IDENTIFIER, A.SEQUENCE ATTR_SEQUENCE,
		A.DISPLAYABLE, A.SEARCHABLE, A.COMPARABLE, A.FACETABLE, A.MERCHANDISABLE, A.SWATCHABLE, 
		AD.NAME, AD.DESCRIPTION, AD.GROUPNAME, QD.QTYUNIT_ID, QD.DESCRIPTION QTY_DESCRIPTION, 
        L.LOCALENAME, F.MAX_DISPLAY, F.SELECTION,
		F.SEQUENCE FACET_SEQUENCE, F.FACET_ID, F.SORT_ORDER, F.ZERO_DISPLAY, F.GROUP_ID, F.KEYWORD_SEARCH,
		LOWER(S.SRCHFIELDNAME) || '_ntk_cs' FIELDNAME, A.STOREDISPLAY
	      FROM LANGUAGE L, ATTR A
	    	   LEFT JOIN ATTRDESC AD ON (A.ATTR_ID = AD.ATTR_ID AND AD.LANGUAGE_ID = ${param.langId})
	    	   LEFT JOIN QTYUNITDSC QD ON (AD.QTYUNIT_ID = QD.QTYUNIT_ID AND QD.LANGUAGE_ID = AD.LANGUAGE_ID)
	    	   LEFT JOIN FACET F ON (A.ATTR_ID = F.ATTR_ID AND
	    	 					     F.STOREENT_ID IN (SELECT RELATEDSTORE_ID FROM STOREREL
	    	 					   					    WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId}))
	    	   LEFT JOIN ATTRDICTSRCHCONF S ON (A.ATTR_ID = S.ATTR_ID AND
	    	                                    S.MASTERCATALOG_ID IN (SELECT C.CATALOG_ID FROM STORECAT C
	    	                                                            WHERE C.MASTERCATALOG = 1 AND C.STOREENT_ID IN 
	                                                                          (SELECT RELATEDSTORE_ID FROM STOREREL
	                                                                            WHERE STATE = 1 AND STRELTYP_ID = -4
	                                                                              AND STORE_ID = ${param.storeId})))
	     WHERE A.STOREENT_ID IN (SELECT RELATEDSTORE_ID FROM STOREREL
	    	 					    WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId}) AND L.LANGUAGE_ID = ${param.langId} ${extAttributeAnd}
         UNION
        SELECT S.SRCHATTR_ID, L.LANGUAGE_ID, ${param.storeId}, NULL, NULL, 'facet' || TO_CHAR(S.SRCHATTR_ID), NULL,
               1, 1, NULL, 1, NULL, 0, D.NAME, D.DESCRIPTION, NULL, NULL, NULL,  L.LOCALENAME, F.MAX_DISPLAY, F.SELECTION,
	           F.SEQUENCE FACET_SEQUENCE, F.FACET_ID, F.SORT_ORDER, F.ZERO_DISPLAY, F.GROUP_ID, F.KEYWORD_SEARCH, S.PROPERTYVALUE, 0
          FROM SRCHATTRPROP S, SRCHATTR A, FACET F, FACETDESC D, LANGUAGE L
         WHERE S.PROPERTYNAME = 'facet' AND S.PROPERTYVALUE NOT LIKE 'ad%' AND S.PROPERTYVALUE NOT LIKE 'price_%'
           AND S.SRCHATTR_ID = A.SRCHATTR_ID AND A.SRCHATTR_ID = F.SRCHATTR_ID
           AND (A.INDEXSCOPE IN   
		       (SELECT C.CATALOG_ID FROM STORECAT C
		         WHERE C.MASTERCATALOG = 1 AND C.STOREENT_ID IN
		               (SELECT RELATEDSTORE_ID FROM STOREREL
		                 WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId}))
		    OR  A.INDEXSCOPE = 0)
		   AND L.LANGUAGE_ID = ${param.langId} AND D.LANGUAGE_ID = L.LANGUAGE_ID AND F.FACET_ID = D.FACET_ID
         ORDER BY 1
 OFFSET ${param.offset} ROWS FETCH NEXT ${param.pageSize} ROWS ONLY         
                
Note that the 2nd part of the UNION expression above is used for explicitly include facet entries for brand, category, and prices. Next, the result set is passed to the CreateAttributeDocumentFromDatabase processor for transformation, using the following table to ​map the database field returned from the SQL above to an index field in the Attribute index:​
​Index Field​ Name​ ​Index Field Type ​​​Description
​​Document Identifier​​
​id/store id_string Internal id of the owning store; mapped to ATTR.STOREENT_ID
id/​language id_string The identifier of the language​; mapped to ATTRDESC.LANGUAGE_ID
id/​attribute id_string ​The internal id of the attribute; mapped to ATTR.ATTR_ID
​​identifier/specification id_string Set to "​attribute"
identifier/​store id_string A string that uniquely identifies the owning store; mapped to ATTR.STOREENT_ID
identifier/​language id_string The language locale of this attribute; mapped from ATTRDESC.LANGUAGE_ID
​identifier/attribute/raw raw The original form of the attribute identifier; mapped to ATTR.IDENTIFIER
​identifier/attribute/normalized normalilzed ​Normalized form of the attribute identifier; mapped to ATTR.IDENTIFIER
​​Language Sensitive Data​​​
​name/raw raw ​The language-dependent name of this attribute; mapped to ATTRDESC.NAME
​​name/normalized normalized ​Same as above
​​description/raw raw ​A short description of this attribute; mapped to ATTRDESC.DESCRIPTION
Properties​​​
​displayable ​boolean Identifies if this attribute is displayabe at the storefront​; mapped to ATTR.DISPLAYABLE
​searchable ​boolean Identifies if this attribute can be searched​​; mapped to ATTR.SEARCHABLE
​facetable ​boolean Specifies that the attribute is used as a facet in the storefront for faceted navigation​​; mapped to ATTR.FACETABLE
​comparable ​boolean Identifier if this attribute can be used for comparison​​; mapped to ATTR.COMPARABLE
​merchandisable ​boolean Specifies that the attribute is used in creating merchandising rules​​; mapped to ATTR.MERCHANDISABLE
​swatchable ​boolean ​​Identifies if this attribute can be displayed with swatch image​​; mapped to ATTR.SWATCHABLE
ribbon boolean ​Identifies if this attribute can be used as a ribbon for display; mapped to ATTR.STOREDISPLAY​
​group ​id_string ​​Specifies the name of the group of attributes. All related attributes should be created with the same group name.
​unit/identifier ​​id_string The units in which this attribute is measured; mapped to ATTRDESC.QTYUNIT_ID
​unit/name/raw ​raw ​The description of this quantity unit; mapped to QTYUNITDSC.DESCRIPTION
​Navigational Data​​​​
​​sequence ​float The display order of attributes in an attribute group or in the root of the attribute dictionary; mapped to ATTR.SEQUENCE
​facet/limit ​integer ​The maximum values to display in the storefront for the facet; mapped to FACET.MAX_DISPLAY
​facet/zero ​boolean ​Describes whether the facetable attribute should display zero count values; mapped to FACET.ZERO_DISPLAY
​facet/multiple boolean​ ​Describes whether the facetable attribute allows multiple selections; mapped to FACET.SELECTION
​facet/order ​integer​ ​The display order to use when displaying the values for the facet; mapped to FACET.SORT_ORDER
​facet/search ​boolean ​Describes whether the facet should be included in keyword search; mapped to FACET.KEYWORD_SEARCH
​​facet/sequence ​float ​The sequence of the facet showing in the storefront; mapped to FACET.SEQUENCE
​facet/key ​​id_string ​The normalized key that is used for Search Rule​s; mapped to ATTRDICTSRCHCONF.SRCHFIELDNAME
​facet/group ​id_string​ ​The internal group identifier for the facet to be used in the storefront; mapped to FACET.GROUP_ID
For example code, see Stage 1 samples.

Stage 2: Associating Attribute values

This stage describes how to transform the Attribute value data and load it into the Attribute index. It starts with running the following SQL to retrieve Attribute value data from the Commerce database:
SELECT LISTAGG(V.IDENTIFIER, '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) ATTRVAL_IDENTIFIER,
	       LISTAGG(COALESCE(VD.STRINGVALUE, ' '), '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) STRINGVALUE, 
	       LISTAGG(COALESCE(TO_CHAR(VD.INTEGERVALUE), ' '), '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) INTEGERVALUE, 
	       LISTAGG(COALESCE(TO_CHAR(VD.FLOATVALUE), ' '), '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) FLOATVALUE, 
	       LISTAGG(TO_CHAR(VD.SEQUENCE), ', ') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) ATTRVAL_SEQUENCE, 
	       LISTAGG(COALESCE(VD.QTYUNIT_ID, ' '), '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) QTYUNIT_ID,
	       LISTAGG(COALESCE(NULLIF(VD.IMAGE1,''), ' '), '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) IMAGE1, 
	       LISTAGG(COALESCE(NULLIF(VD.IMAGE2,''), ' '), '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) IMAGE2,
	       LISTAGG(V.ATTRVAL_ID, ', ') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) ATTRVAL_ID,
	       LISTAGG(COALESCE(QD.DESCRIPTION, ' '), '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) QTY_DESCRIPTION,
	       LISTAGG(A.ATTRTYPE_ID, '###') WITHIN GROUP (ORDER BY V.ATTRVAL_ID) ATTRTYPE_ID,
	       A.ATTR_ID
	  FROM ATTR A, ATTRVAL V, ATTRVALDESC VD, QTYUNITDSC QD
	 WHERE A.ATTR_ID = V.ATTR_ID AND V.ATTRVAL_ID = VD.ATTRVAL_ID AND VD.LANGUAGE_ID = ${param.langId} 
	   AND VD.QTYUNIT_ID = QD.QTYUNIT_ID AND QD.LANGUAGE_ID = VD.LANGUAGE_ID
	   AND A.STOREENT_ID IN (SELECT RELATEDSTORE_ID 
	                           FROM STOREREL WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId}) ${extAttributeAnd}
	 GROUP BY A.ATTR_ID
         ORDER BY A.ATTR_ID
         OFFSET ${param.offset} ROWS FETCH NEXT ${param.pageSize} ROWS ONLY​    ​   

The result set is passed to the FindAttributeValuesFromDatabase processor for transformation, using the following table to ​map the database field returned from the SQL above to an index field in the Attribute index:​

​Index Field​ Name​ ​Index Field Type ​​​Description
​​Attribute Values​​​
values/id​ ​id_string​ The internal unique id for this attribute value; mapped to ATTRVAL.ATTRVAL_ID​​
​​values/identifier ​​id_string The external identifier for this attribute value; mapped to ATTRVAL.​​IDENTIFIER
​​​values/sequence​ ​float ​​A number that determines the display order of a list of allowable attribute values for this attribute; mapped to ATTRVALDESC.SEQUENCE
​​​​​values/value/raw ​​id_string ​​​The string value of this attribute value; mapped to ATTRVALDESC.STRINGVALUE​, ATTRVALDESC.FLOATVALUE, or INTEGERVALUE
​​​​​values/value/normalized normalized Same as above
​​​​​values/value/image1 raw ​​​The image1 path of this attribute value; mapped to ATTRVALDESC.IMAGE1
​​​​​values/value/image2 raw ​​​The image2 path of this attribute value; mapped to ATTRVALDESC.IMAGE2
values/unit/identifier ​​id_string ​​The unit of measure identifier in which this attribute is measured; mapped to ATTRVALDESC.QTYUNIT_ID
values/unit/name/raw ​raw ​​The description of the quantity unit; mapped to QTYUNITDSC.DESCRIPTION
For example code, see Stage 2 samples.

Stage 1 samples

The following code is an example of the input data for the CreateAttributeDocumentFromDatabase processor.

{
  "LANGUAGE_ID": -1,
  "STOREENT_ID": 1,
  "ATTRTYPE_ID": null,
  "ATTR_ID": -1013,
  "ATTRUSAGE": null,
  "IDENTIFIER": "facet-1013               ",
  "ATTR_SEQUENCE": null,
  "DISPLAYABLE": 1,
  "SEARCHABLE": 1,
  "COMPARABLE": null,
  "FACETABLE": 1,
  "MERCHANDISABLE": null,
  "SWATCHABLE": 0,
  "NAME": "Category",
  "DESCRIPTION": "The category",
  "GROUPNAME": null,
  "QTYUNIT_ID": null,
  "QTY_DESCRIPTION": null,
  "LOCALENAME": "en_US           ",
  "MAX_DISPLAY": 20,
  "SELECTION": 0,
  "FACET_SEQUENCE": 0,
  "FACET_ID": -1001,
  "SORT_ORDER": 0,
  "ZERO_DISPLAY": 0,
  "GROUP_ID": 0,
  "KEYWORD_SEARCH": 1,
  "FIELDNAME": "parentCatgroup_id_search"
}
The CreateAttributeDocumentFromDatabase processor transforms the input data into the following output data based on the index mapping table:

{ "update": { "_id": "1--1--1013", "_index": ".auth.1.attribute.202006160325", "retry_on_conflict": 5, "_source": false } }
{
  "doc": {
    "identifier": {
      "specification": "attribute",
      "language": "en_US",
      "attribute": {
        "normalized": "facet-1013",
        "raw": "facet-1013"
      },
      "key": "facet-1013"
    },
    "sequence": 0,
    "name": {
      "normalized": "Category",
      "raw": "Category"
    },
    "displayable": true,
    "description": {
      "raw": "The category"
    },
    "facetable": true,
    "id": {
      "language": "-1",
      "attribute": "-1013",
      "store": "1"
    },
    "facet": {
      "zero": false,
      "sequence": 0,
      "search": true,
      "limit": 20,
      "multiple": false,
      "key": "parentCatgroup_id_search",
      "order": 0,
      "group": "0"
    },
    "searchable": true,
    "__meta": {
      "created": "2020-07-28T14:55:54.911Z",
      "modified": "2020-07-28T14:55:54.911Z",
      "version": {
        "min": 0,
        "max": 0
      }
    },
    "swatchable": false
  },
  "doc_as_upsert": true
}

Stage 2 samples

The following code is an example of the input data for the FindAttributeValuesFromDatabase processor.

{
  "ATTRVAL_IDENTIFIER": "Multi-color###Blue###Ivory###Light blue###Gray###White###Purple###Black###Denim###Taupe",
  "STRINGVALUE": "Multi-color###Blue###Ivory###Light blue###Gray###White###Purple###Black###Denim###Taupe",
  "INTEGERVALUE": " ### ### ### ### ### ### ### ### ### ",
  "FLOATVALUE": " ### ### ### ### ### ### ### ### ### ",
  "ATTRVAL_SEQUENCE": "1, 2, 3, 4, 5, 6, 7, 8, 9, 10",
  "QTYUNIT_ID": "C62             ###C62             ###C62             ###C62             ###C62             ###C62             ###C62             ###C62             ###C62             ###C62             ",
  "IMAGE1": " ### ### ### ### ### ### ### ### ### ",
  "IMAGE2": " ### ### ### ### ### ### ### ### ### ",
  "ATTRVAL_ID": "7000000000000000785, 7000000000000000786, 7000000000000000787, 7000000000000000788, 7000000000000000789, 7000000000000000790, 7000000000000000791, 7000000000000000792, 7000000000000000793, 7000000000000000794",
  "QTY_DESCRIPTION": "one###one###one###one###one###one###one###one###one###one",
  "ATTRTYPE_ID": "STRING          ###STRING          ###STRING          ###STRING          ###STRING          ###STRING          ###STRING          ###STRING          ###STRING          ###STRING          ",
  "ATTR_ID": 7000000000000000000
}
The FindAttributeValuesFromDatabase processor transforms the input data with the store id and language id passed from NiFi FlowFile class as flow attributes into the following output:

{ "update": { "_id": "1--1-7000000000000000000", "_index": ".auth.1.attribute.202006160325", "retry_on_conflict": 5, "_source": false } }

{
  "doc": {
    "values": [
      {
        "identifier": "Multi-color",
        "sequence": 1,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000785",
        "value": {
          "normalized": "Multi-color",
          "raw": "Multi-color"
        }
      },
      {
        "identifier": "Blue",
        "sequence": 2,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000786",
        "value": {
          "normalized": "Blue",
          "raw": "Blue"
        }
      },
      {
        "identifier": "Ivory",
        "sequence": 3,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000787",
        "value": {
          "normalized": "Ivory",
          "raw": "Ivory"
        }
      },
      {
        "identifier": "Light blue",
        "sequence": 4,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000788",
        "value": {
          "normalized": "Light blue",
          "raw": "Light blue"
        }
      },
      {
        "identifier": "Gray",
        "sequence": 5,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000789",
        "value": {
          "normalized": "Gray",
          "raw": "Gray"
        }
      },
      {
        "identifier": "White",
        "sequence": 6,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000790",
        "value": {
          "normalized": "White",
          "raw": "White"
        }
      },
      {
        "identifier": "Purple",
        "sequence": 7,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000791",
        "value": {
          "normalized": "Purple",
          "raw": "Purple"
        }
      },
      {
        "identifier": "Black",
        "sequence": 8,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000792",
        "value": {
          "normalized": "Black",
          "raw": "Black"
        }
      },
      {
        "identifier": "Denim",
        "sequence": 9,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000793",
        "value": {
          "normalized": "Denim",
          "raw": "Denim"
        }
      },
      {
        "identifier": "Taupe",
        "sequence": 10,
        "unit": {
          "identifier": "C62",
          "name": {
            "raw": "one"
          }
        },
        "id": "7000000000000000794",
        "value": {
          "normalized": "Taupe",
          "raw": "Taupe"
        }
      }
    ],
    "__meta": {
      "modified": "2020-07-28T15:18:30.965Z"
    }
  }
}