Ingest Category index pipeline

This topic shows the complete data mappings for the Category index: from the data specification, from the Commerce database, and to the index schema.

Category data specifications

For information on calling the Ingest service, see Search Ingest Service API. For a complete listing of Elasticsearch index fields and parameters, see Elasticsearch index field types.

Data specification:

| Data Field Name | Data Type | Data Value |
|---|---|---|
| /uniqueId | long | The Commerce catalog group internal identifier. |
| /identifier | string | The external catalog group identifier. |
| /parentCategories/parentCategoryId | long | The Commerce category internal identifier of the parent category of this catalog group. |
| /parentCategories/categoryIdentifier | string | The external category identifier of this parent category. |
| /parentCategories/catalogId | long | The Commerce catalog internal identifier. If none is specified, the store default catalog will be used. |
| /parentCategories/catalogIdentifier | string | The external catalog identifier for this parent category. |
| /parentCategories/sequence | float | The sequence number used to determine the display order within this parent category. |
| /descriptions/languageId | integer | The Commerce language internal identifier. If not specified, “locale” will be used. If “locale” is not specified, then the store default language will be used. |
| /descriptions/locale | string | The locale identifier. |
| /descriptions/name | string | The language dependent name of this catalog group. |
| /descriptions/description | string | The short description of this catalog group. |
| /descriptions/longDescription | string | The long description of this catalog group. |
| /descriptions/keywords | string | A list of keywords used for searching. |
| /descriptions/thumbnail | string | The thumbnail image path of this catalog group. |
| /descriptions/fullImage | string | The full image path of this catalog group. |
| /descriptions/published | boolean | Indicates whether this catalog group should be displayed for the current language. Defaults to false. |
| /descriptions/sequence | float | |

Category index field mapping from data specification

The following diagram illustrates the Category indexing pipeline implemented in Apache NiFi. The flow consists of two stages:
  1. Loading category documents
  2. Building catalog hierarchy and navigation paths for each sales category

Stage 1: Loading a Category document

This stage describes how to transform the Category data using the CreateCategoryDocument Groovy script and load it into the Category index. The following mapping table defines how data from the Category Data Specification can be mapped into the Category Index Schema in Elasticsearch.

| Index Field Name | Index Field Type | How Value Can Be Assigned |
|---|---|---|
| id/store | id_integer | Set by InjectMetaData processor |
| id/language | id_integer | Assigned directly from data field "languageId" |
| id/catalog | id_long | Set by InjectMetaData processor |
| id/catgroup | id_long | Assigned directly from data field "uniqueId" |
| id/member | id_long | |
| identifier/specification | id_string | Always set to "category" |
| identifier/store | id_string | Set by InjectMetaData processor |
| identifier/language | id_string | Extracted from only the language part of the data field "locale" |
| identifier/catalog | id_string | Set by InjectMetaData processor |
| identifier/category/raw | raw | Assigned directly from data field "identifier" |
| identifier/category/normalized | normalized | Same as above |
| name/raw | raw | Assigned directly from data field "name" |
| name/normalized | normalized | Same as above |
| name/text | text | Same as above |
| description/raw | raw | Assigned directly from data field "description" |
| description/text | text | Same as above |
| keyword/text | text | Assigned directly from data field "keywords" |
| displayable | boolean | Assigned directly from data field "published" of current language |
| url/thumbnail | raw | Assigned directly from data field "thumbnail" |
| url/image | raw | Assigned directly from data field "fullImage" |
| url/seo | raw | TBD |

Definitions for the field type aliases are described in Index Field Type Aliases and Usages.
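
The mapping table above can be sketched in Python as follows. This is an illustration only and not the CreateCategoryDocument Groovy script itself. The function name, the `meta` argument (a stand-in for the values set by the InjectMetaData processor), and the assumption that a locale such as en_US separates language and country with an underscore are all hypothetical.

```python
def create_category_document(category, description, meta):
    """Map one category record and one of its descriptions to index fields.

    `meta` is a hypothetical stand-in for the store and catalog values that
    the InjectMetaData processor sets in the real flow.
    """
    locale = description.get("locale", "")
    identifier = category["identifier"]
    name = description.get("name")
    short_description = description.get("description")
    return {
        "id": {
            "store": meta["storeId"],
            "language": description.get("languageId"),
            "catalog": meta["catalogId"],
            "catgroup": category["uniqueId"],
        },
        "identifier": {
            "specification": "category",  # constant for this index
            "store": meta["storeIdentifier"],
            # Assumption: the language part is the text before the underscore.
            "language": locale.split("_")[0],
            "catalog": meta["catalogIdentifier"],
            # The same value feeds both sub-fields; any normalization is
            # assumed to be applied by the index field type.
            "category": {"raw": identifier, "normalized": identifier},
        },
        "name": {"raw": name, "normalized": name, "text": name},
        "description": {"raw": short_description, "text": short_description},
        "keyword": {"text": description.get("keywords")},
        # "published" defaults to false in the data specification.
        "displayable": bool(description.get("published", False)),
        "url": {
            "thumbnail": description.get("thumbnail"),
            "image": description.get("fullImage"),
        },
    }
```

The url/seo field is left out because the table lists its assignment as TBD.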

Stage 2: Building a catalog hierarchy for navigation

This stage describes how to generate the Navigation data and load it into the Category index. It starts with running the following Elasticsearch query against the current Category index:
{
  "stored_fields": [
    "id.*",
    "category.*"
  ],
  "size": 10000,
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "category.parent"
        }
      },
      "filter": [
        {
          "term": {
            "id.store": "${param.storeId}"
          }
        },
        {
          "term": {
            "id.catalog": "${param.catalogId}"
          }
        },
        {
          "term": {
            "id.language": "${param.langId}"
          }
        }
      ]
    }
  }
}
Next, the result set is passed to the BuildCatalogHierarchyForCategory Groovy script for transformation, which uses the following logic to generate the navigation tree.
  • The initial result set from the above search query contains all the top-level categories of the given sales catalog. The categories in this set are then used as the parent values in another Elasticsearch query that looks for the next level of child categories in the current sales catalog.
    {
      "stored_fields": [
        "id.*",
        "category.*"
      ],
      "size": 10000,
      "query": {
        "bool": {
          "must": {
            "match_all": {}
          },
          "filter": [
            {
              "terms": {
                "category.parent": [ "10020", ... ]
              }
            },
            {
              "term": {
                "id.store": "1"
              }
            },
            {
              "term": {
                "id.catalog": "10502"
              }
            },
            {
              "term": {
                "id.language": "-1"
              }
            }
          ]
        }
      }
    }
    The above query will then be sent back to the dataflow in NiFi for re-execution.
  • For all the nodes returned at the current level of the catalog hierarchy, walk through each category to generate the full navigation path to it in the current sales catalog.
    | Index Field Name | Index Field Type | Description |
    |---|---|---|
    | path | text | Tokenized field for the full navigation path to the current category node in the sales catalog |
    | path/tree | hierarchy | e.g. /apparel/women/dresses, /apparel/women, /apparel |
    | path/reversed | hierarchy_reversed | e.g. /dresses, /women/dresses, /apparel/women/dresses |

Category index field mapping from database

Building a Category index from the Commerce database happens in three stages:
  1. Creating a Category document
  2. Updating with Facet information
  3. Building catalog hierarchy for navigation

Stage 1 - Creating a category document

This stage describes how to transform the Category data and load it into the Category index. It starts with running the following SQL to retrieve Category data from the Commerce database:
(SELECT C.CATGROUP_ID, C.MEMBER_ID, C.IDENTIFIER, 
            COALESCE(D.LANGUAGE_ID, L.LANGUAGE_ID) LANGUAGE_ID, 
            D.NAME, D.SHORTDESCRIPTION, D.THUMBNAIL, D.FULLIMAGE, 
            D.PUBLISHED, D.DISPLAY, D.KEYWORD,
            R.CATGROUP_ID_PARENT, R.CATALOG_ID, R.SEQUENCE, L.LOCALENAME
	   FROM CATGRPREL R, LANGUAGE L, CATGROUP C ${TI_DELTA_CG_JOIN_QUERY}
            LEFT OUTER JOIN CATGRPDESC D ON (D.CATGROUP_ID = C.CATGROUP_ID AND D.LANGUAGE_ID = ${param.langId})
	  WHERE R.CATALOG_ID = ${param.catalogId}
	    AND R.CATALOG_ID IN (SELECT CATALOG_ID FROM STORECAT WHERE STOREENT_ID IN
	        (SELECT RELATEDSTORE_ID FROM STOREREL WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId}))
	    AND R.CATGROUP_ID_CHILD = C.CATGROUP_ID AND C.MARKFORDELETE = 0 AND L.LANGUAGE_ID = ${param.langId} ${extCatgroupAndSQL}
      UNION
	 SELECT C.CATGROUP_ID, C.MEMBER_ID, C.IDENTIFIER, 
            COALESCE(D.LANGUAGE_ID, L.LANGUAGE_ID) LANGUAGE_ID,
            D.NAME, D.SHORTDESCRIPTION, D.THUMBNAIL, D.FULLIMAGE, 
            D.PUBLISHED, D.DISPLAY, D.KEYWORD,
            NULL, R.CATALOG_ID, R.SEQUENCE, L.LOCALENAME
	   FROM CATTOGRP R, LANGUAGE L, CATGROUP C ${TI_DELTA_CG_JOIN_QUERY}
            LEFT OUTER JOIN CATGRPDESC D ON (D.CATGROUP_ID = C.CATGROUP_ID AND D.LANGUAGE_ID = ${param.langId})
	  WHERE R.CATALOG_ID = ${param.catalogId}
	    AND R.CATALOG_ID IN (SELECT CATALOG_ID FROM STORECAT WHERE STOREENT_ID IN
	        (SELECT RELATEDSTORE_ID FROM STOREREL WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId}))
	    AND R.CATGROUP_ID = C.CATGROUP_ID AND C.MARKFORDELETE = 0 AND L.LANGUAGE_ID = ${param.langId} ${extCatgroupAndSQL}) ORDER BY CATGROUP_ID	
Next, the result set is passed to the CreateCategoryDocumentFromDatabase processor for transformation, using the following table to map each database field returned from the SQL above to an index field in the Category index:

| Index Field Name | Index Field Type | Description |
|---|---|---|
| Document Identifier | | |
| id/store | id_string | Internal id of the owning store; mapped to table STORECGRP |
| id/language | id_string | The identifier of the language; mapped to CATGRPDESC.LANGUAGE_ID |
| id/catalog | id_string | The internal id of the sales catalog; mapped to CATGRPREL.CATALOG_ID |
| id/catgroup | id_string | The internal id of the current category; mapped to CATGRPREL.CATGROUP_ID_CHILD |
| id/member | id_string | The internal reference number that identifies the owner of the catalog group; mapped to CATGROUP.MEMBER_ID |
| identifier/specification | id_string | Set to "category" |
| identifier/store | id_string | A string that uniquely identifies the owning store; mapped to table STOREENT |
| identifier/language | id_string | The language locale of this catalog group; mapped from CATGRPDESC.LANGUAGE_ID |
| identifier/catalog | id_string | The external identifier of the catalog; mapped to CATGRPREL.CATALOG_ID |
| identifier/category/raw | raw | Same as below |
| identifier/category/normalized | normalized | Catgroup's basic attributes; mapped to CATGROUP.IDENTIFIER |
| Language Sensitive Data | | |
| name/raw | raw | The language-dependent name of this catalog group; mapped to CATGRPDESC.NAME |
| name/normalized | normalized | Same as above |
| keyword/text | text | A keyword used for searching; mapped to CATGRPDESC.KEYWORD |
| url/thumbnail | raw | The thumbnail image path of this catalog group; mapped to CATGRPDESC.THUMBNAIL |
| url/image | raw | The full image path of this catalog group; mapped to CATGRPDESC.FULLIMAGE |
| Properties | | |
| displayable | boolean | Indicates whether this catalog group should be displayed for the language; mapped to CATGRPDESC.PUBLISHED |
| Navigational Data | | |
| category/catalog | id_string | The sales catalog of this current document used for sequence; mapped to CATGRPREL.CATALOG_ID |
| category/parent | id_string | The parent sales category of this current document used for sequence; mapped to CATGRPREL.CATGROUP_ID_PARENT |
| category/sequence | float | The leaf category level (shallow) sequence defined in CMC; mapped to CATGRPREL.SEQUENCE |
For example code, see the Stage 1 samples.
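
As a rough illustration of this mapping, the following Python sketch turns one SQL result row into the two lines of an Elasticsearch bulk update request. It is not the CreateCategoryDocumentFromDatabase processor itself. The function name and arguments are hypothetical, the document id pattern (store, language, catalog, and catalog group joined by hyphens) is read off the Stage 1 sample, and leaving out category.parent for rows with a NULL parent is an assumption.

```python
import json


def build_category_update(row, store_id, index_name):
    """Build the bulk action line and partial document for one SQL row."""
    language_id = str(row["LANGUAGE_ID"])
    catalog_id = str(row["CATALOG_ID"])
    catgroup_id = str(row["CATGROUP_ID"])

    # Id pattern taken from the sample output: store-language-catalog-catgroup.
    doc_id = "-".join([str(store_id), language_id, catalog_id, catgroup_id])
    action = {"update": {"_id": doc_id, "_index": index_name,
                         "retry_on_conflict": 5, "_source": False}}

    doc = {
        "id": {"store": str(store_id), "language": language_id,
               "catalog": catalog_id, "catgroup": catgroup_id,
               "member": str(row["MEMBER_ID"])},
        "identifier": {
            "specification": "category",
            # LOCALENAME can carry trailing padding, so it is trimmed here.
            "language": row["LOCALENAME"].strip(),
            "category": {"raw": row["IDENTIFIER"],
                         "normalized": row["IDENTIFIER"]},
        },
        "name": {"raw": row["NAME"], "normalized": row["NAME"]},
        "keyword": {"text": row["KEYWORD"]},
        "url": {"thumbnail": row["THUMBNAIL"], "image": row["FULLIMAGE"]},
        "displayable": row["PUBLISHED"] == 1,
        "category": {"catalog": catalog_id, "sequence": row["SEQUENCE"]},
    }
    # Assumption: top-level categories (NULL parent) get no category.parent,
    # so that a later "must_not exists" query can find them.
    if row.get("CATGROUP_ID_PARENT") is not None:
        doc["category"]["parent"] = str(row["CATGROUP_ID_PARENT"])

    body = {"doc": doc, "doc_as_upsert": True}
    return json.dumps(action) + "\n" + json.dumps(body)
```

Called with the Stage 1 sample row, store id 1, and the sample index name, the first line of the result carries the document id "1--1-10001-5".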

Stage 2 - Updating with facet information

This stage describes how to transform the Facet related data and load it into the Category index. It starts with running the following SQL to retrieve Category data from the Commerce database:
SELECT I.CATALOG_ID, I.CATGROUP_ID, I.CATGROUP_ID_PARENT,
	       LISTAGG(A.NAME, '###') WITHIN GROUP (ORDER BY A.ATTR_ID) NAME,
	       LISTAGG(A.ATTR_ID, ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) ATTR_ID,
	       LISTAGG(A.IDENTIFIER, '###') WITHIN GROUP (ORDER BY A.ATTR_ID) IDENTIFIER,
	       LISTAGG(F.MAX_DISPLAY, ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) MAX_DISPLAY,
	       LISTAGG(F.SELECTION, ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) SELECTION,
	       LISTAGG(TO_CHAR(COALESCE(FCG.SEQUENCE, F.SEQUENCE)), ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) SEQUENCE,
	       LISTAGG(F.FACET_ID, ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) FACET_ID,
	       LISTAGG(F.SORT_ORDER, ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) SORT_ORDER,
	       LISTAGG(F.ZERO_DISPLAY, ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) ZERO_DISPLAY,
	       LISTAGG(COALESCE(F.GROUP_ID, 0), ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) GROUP_ID,
	       LISTAGG(F.KEYWORD_SEARCH, ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) KEYWORD_SEARCH,
	       LISTAGG(A.FACETABLE, ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) FACETABLE,
	       LISTAGG(A.SWATCHABLE,  ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) SWATCHABLE,
               LISTAGG(COALESCE(A.STOREDISPLAY, 0),  ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) STOREDISPLAY,
	       LISTAGG(COALESCE(FCG.STOREENT_ID, A.STOREENT_ID), ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) STOREENT_ID,
	       LISTAGG(COALESCE(FCG.DISPLAYABLE, 1), ', ') WITHIN GROUP (ORDER BY A.ATTR_ID) DISPLAYABLE
	  FROM (SELECT A.ATTR_ID, A.IDENTIFIER, A.FACETABLE, A.SWATCHABLE, A.STOREDISPLAY, A.STOREENT_ID, COALESCE(D.NAME, DF.NAME) NAME
	          FROM ATTR A
	               LEFT JOIN ATTRDESC D ON D.ATTR_ID = A.ATTR_ID AND D.LANGUAGE_ID = ${param.langId}
	               LEFT OUTER JOIN ATTRDESC DF ON DF.ATTR_ID = A.ATTR_ID AND DF.LANGUAGE_ID = ${default.language}) A,
	               FACET F
				   LEFT JOIN FACETCATGRP FCG ON F.FACET_ID = FCG.FACET_ID AND FCG.STOREENT_ID = ${param.storeId},
	               (SELECT DISTINCT P.CATALOG_ID, P.CATGROUP_ID, C.ATTR_ID, H.CATGROUP_ID_PARENT
	                  FROM CATGPENREL P, CATENTRYATTR C,
                           (SELECT G.CATGROUP_ID, R.CATALOG_ID, R.CATGROUP_ID_PARENT
                              FROM CATGRPREL R, CATGROUP G
                             WHERE R.CATALOG_ID = ${param.catalogId}
                              AND R.CATALOG_ID IN (SELECT CATALOG_ID FROM STORECAT WHERE STOREENT_ID IN
                                          (SELECT RELATEDSTORE_ID FROM STOREREL WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId}))
                              AND R.CATGROUP_ID_CHILD = G.CATGROUP_ID AND G.MARKFORDELETE = 0
                            UNION
                           SELECT G.CATGROUP_ID, R.CATALOG_ID, NULL 
                             FROM CATTOGRP R, CATGROUP G
                            WHERE R.CATALOG_ID = ${param.catalogId}
                              AND R.CATALOG_ID IN (SELECT CATALOG_ID FROM STORECAT WHERE STOREENT_ID IN
                                          (SELECT RELATEDSTORE_ID FROM STOREREL WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId}))
                              AND R.CATGROUP_ID = G.CATGROUP_ID AND G.MARKFORDELETE = 0) H
	                  WHERE P.CATALOG_ID = H.CATALOG_ID AND P.CATGROUP_ID = H.CATGROUP_ID
	                    AND P.CATENTRY_ID = C.CATENTRY_ID ${extCatgroupAndSQL1a}) I
	 ${TI_DELTA_CG_FACET_JOIN_QUERY}
	 WHERE I.ATTR_ID = A.ATTR_ID AND A.ATTR_ID = F.ATTR_ID
	   AND F.STOREENT_ID IN
	       (SELECT RELATEDSTORE_ID FROM STOREREL WHERE STATE = 1 AND STRELTYP_ID = -4 AND STORE_ID = ${param.storeId})
	 GROUP BY I.CATALOG_ID, I.CATGROUP_ID, I.CATGROUP_ID_PARENT

Next, the result set is passed to the FindFacetsForCategoryFromDatabase processor for transformation, using the following table to map each database field returned from the SQL above to an index field in the Category index:

| Index Field Name | Index Field Type | Description |
|---|---|---|
| Navigational Data | | |
| facets/<id>/id | id_string | The internal facet identifier of the corresponding facet; mapped to FACET.FACET_ID |
| facets/<id>/key | id_string | The normalized facet identifier that is used as the key for this current facet entry; generated from ATTR.IDENTIFIER |
| facets/<id>/search | boolean | Describes whether the facet should be included in keyword search; mapped to FACET.KEYWORD_SEARCH |
| facets/<id>/displayable | boolean | Describes whether the facet should be displayed in the storefront; mapped to FACETCATGRP.DISPLAYABLE |
| facets/<id>/sequence | float | The sequence of the facet showing in the storefront; mapped to FACET.SEQUENCE |
| facets/<id>/group | id_string | The internal group identifier for the facet to be used in the storefront; mapped to FACET.GROUP_ID |
| facets/<id>/attribute/id | id_string | The corresponding internal attribute id of the current facet; mapped to FACET.ATTR_ID |
| facets/<id>/attribute/name | id_string | The corresponding language specific attribute name of the current facet |
| facets/<id>/attribute/identifier | id_string | The corresponding attribute identifier of the current facet; mapped to ATTR.IDENTIFIER |
| facets/<id>/attribute/displayable | boolean | Identifies if this facet should be displayed in the storefront; mapped to ATTR.DISPLAYABLE |
| facets/<id>/attribute/swatchable | boolean | Identifies if this facet can be used with images for faceting; mapped to ATTR.SWATCHABLE |
| facets/<id>/attribute/ribbon | boolean | Identifies if this attribute can be used as a ribbon for display; mapped to ATTR.STOREDISPLAY |
| facets/<id>/attribute/usage | raw | Describes the usage of this attribute; mapped to ATTR.ATTRUSAGE |
| facets/<id>/display/limit | integer | The maximum values to display in the storefront for the facet; mapped to FACET.MAX_DISPLAY |
| facets/<id>/display/zero | boolean | Describes whether the facetable attribute should display zero count values; mapped to FACET.ZERO_DISPLAY |
| facets/<id>/display/multiple | boolean | Describes whether the facetable attribute allows multiple selections; mapped to FACET.SELECTION |
| facets/<id>/display/order | integer | The display order to use when displaying the values for the facet; mapped to FACET.SORT_ORDER |

For example code, see the Stage 2 samples.
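
Because the SQL above aggregates every facet of a category into one row with LISTAGG, the processor has to split those columns back into one entry per facet. The Python sketch below illustrates that step under stated assumptions: the function name is hypothetical, the facet key is assumed to be the lowercased ATTR.IDENTIFIER (as the Stage 2 sample suggests), and the flag columns are assumed to use "1" for true.

```python
def split_facets(row):
    """Split the aggregated columns of one SQL row into per-facet entries."""
    # NAME and IDENTIFIER are joined with '###'; all other columns with ', '.
    separators = {"NAME": "###", "IDENTIFIER": "###"}
    columns = [
        "NAME", "ATTR_ID", "IDENTIFIER", "MAX_DISPLAY", "SELECTION",
        "SEQUENCE", "FACET_ID", "SORT_ORDER", "ZERO_DISPLAY", "GROUP_ID",
        "KEYWORD_SEARCH", "SWATCHABLE", "DISPLAYABLE",
    ]
    values = {c: row[c].split(separators.get(c, ", ")) for c in columns}

    facets = {}
    for i, identifier in enumerate(values["IDENTIFIER"]):
        key = identifier.lower()  # assumption: normalized key is lowercase
        facets[key] = {
            "id": values["FACET_ID"][i],
            "key": key,
            "search": values["KEYWORD_SEARCH"][i] == "1",
            "displayable": values["DISPLAYABLE"][i] == "1",
            "sequence": float(values["SEQUENCE"][i]),
            "group": values["GROUP_ID"][i],
            "attribute": {
                "id": values["ATTR_ID"][i],
                "name": values["NAME"][i],
                "identifier": identifier,
                "swatchable": values["SWATCHABLE"][i] == "1",
            },
            "display": {
                "limit": int(values["MAX_DISPLAY"][i]),
                "zero": values["ZERO_DISPLAY"][i] == "1",
                # Assumption: SELECTION "1" means multiple selections.
                "multiple": values["SELECTION"][i] == "1",
                "order": int(values["SORT_ORDER"][i]),
            },
        }
    return facets
```

Every aggregate in the query is ordered by A.ATTR_ID, which is what allows the values at the same position in each column to be treated as one facet.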

Stage 3 - Building a Catalog hierarchy for navigation

This stage describes how to generate the Navigation data and load it into the Category index. It starts with running the following Elasticsearch query against the current Category index:

{
  "stored_fields": [
    "id*",
    "category.*",
    "name.*"
  ],
  "size": ${es.pageSize},
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "category.parent"
        }
      },
      "filter": [
        {
          "term": {
            "id.store": "${param.storeId}"
          }
        },
        {
          "term": {
            "id.catalog": "${param.catalogId}"
          }
        },
        {
          "term": {
            "id.language": "${param.langId}"
          }
        }
      ]
    }
  }
}
Next, the result set is passed to the BuildCatalogHierarchyForCategory processor for transformation, which uses the following logic to generate the navigation tree.
Note: The initial result set from the above search query contains all the top-level categories of the given sales catalog. The categories in this set are then used as the parent values in another Elasticsearch query that looks for the next level of child categories in the current sales catalog:
{
    "stored_fields": [
        "id*",
        "category.*",
        "name.*"
    ],
    "size": 10000,
    "query": {
        "bool": {
            "must": {
                "match_all": {}
            },
            "filter": [
                {
                    "terms": {
                        "category.parent": [
                            "10020", ...
                        ]
                    }
                },
                {
                    "term": {
                        "id.store": "1"
                    }
                },
                {
                    "term": {
                        "id.catalog": "10502"
                    }
                },
                {
                    "term": {
                        "id.language": "-1"
                    }
                }
            ]
        }
    }
}

The above query will then be sent back to the dataflow in NiFi for re-execution.
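
A minimal Python sketch of how the next-level query could be derived from a search response is shown below. It is not the BuildCatalogHierarchyForCategory processor itself. The function name and parameters are hypothetical, and returning None when no categories are found is an assumed stopping condition.

```python
def build_next_level_query(search_response, store_id, catalog_id, lang_id,
                           page_size=10000):
    """Build the query for the children of the categories in a response."""
    # Stored fields come back as lists, so take the first value of each.
    parent_ids = [hit["fields"]["id.catgroup"][0]
                  for hit in search_response["hits"]["hits"]]
    if not parent_ids:
        return None  # assumption: an empty level ends the walk

    return {
        "stored_fields": ["id*", "category.*", "name.*"],
        "size": page_size,
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": [
                    {"terms": {"category.parent": parent_ids}},
                    {"term": {"id.store": str(store_id)}},
                    {"term": {"id.catalog": str(catalog_id)}},
                    {"term": {"id.language": str(lang_id)}},
                ],
            }
        },
    }
```

Each response produced by the generated query can be passed to the same function again, which yields one query per level of the hierarchy.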

Note: For all the nodes returned at the current level of the catalog hierarchy, walk through each category to generate the full navigation path to it in the current sales catalog:

| Index Field Name | Index Field Type | Description |
|---|---|---|
| Navigational Data | | |
| category/child | id_string | The child sales category of this current document used for sequence; mapped to CATGRPREL.CATGROUP_ID_CHILD |
| category/sequence | float | The leaf category level (shallow) sequence defined in CMC; mapped to CATGRPREL.SEQUENCE |
| path | raw | Tokenized field for the full navigation path to the current category node in the sales catalog. For example, when a "dress" category (id:10001) with path "/1/3/10001" is indexed for sales catalog 10502, this field stores the original form of the path |
| path/tree | hierarchy | This is the tokenized version for the path field above, e.g. 1, 3, 10001 |
| path/list | raw | This is the canonical version for the list of path field names, e.g. Aurora, Women, Dress |
For example code, see the Stage 3 samples.
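
The level-by-level path generation described in this stage can be sketched in Python as follows. The function and the in-memory structures are hypothetical and only illustrate the idea: each node's path is its parent's path extended by its own id, with matching lists of ids and names. The sketch assumes one parent per category.

```python
def extend_paths(parent_paths, level_nodes):
    """Compute path data for the nodes of one hierarchy level.

    parent_paths: {catgroup id: {"path": str, "id": [...], "name": [...]}}
                  for the level above (empty for the top level).
    level_nodes:  list of {"id": str, "name": str, "parent": str or None}.
    """
    root = {"path": "", "id": [], "name": []}
    result = {}
    for node in level_nodes:
        parent = parent_paths.get(node["parent"], root)
        result[node["id"]] = {
            "path": parent["path"] + "/" + node["id"],
            "id": parent["id"] + [node["id"]],
            "name": parent["name"] + [node["name"]],
        }
    return result


# Walk three levels using the ids and names from the table above.
top = extend_paths({}, [{"id": "1", "name": "Aurora", "parent": None}])
second = extend_paths(top, [{"id": "3", "name": "Women", "parent": "1"}])
third = extend_paths(second, [{"id": "10001", "name": "Dress", "parent": "3"}])
print(third["10001"]["path"])
```

Only the paths of the previous level need to be kept between iterations, which fits a flow where each level arrives as a separate search response.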

Database mapping samples:

Stage 1

The following code is an example of the input data for the CreateCategoryDocumentFromDatabase processor:

{
  "CATGROUP_ID": 5,
  "MEMBER_ID": 7000000000000001000,
  "IDENTIFIER": "Girls",
  "LANGUAGE_ID": -1,
  "NAME": "Girls",
  "SHORTDESCRIPTION": "Girls",
  "LONGDESCRIPTION": null,
  "THUMBNAIL": "images/catalog/apparel/girls/category/catr_app_girls.png",
  "FULLIMAGE": "images/catalog/apparel/girls/category/catr_app_girls.png",
  "PUBLISHED": 1,
  "DISPLAY": null,
  "KEYWORD": "casual, sporty, skirt, sweater, kid",
  "CATGROUP_ID_PARENT": 1,
  "CATALOG_ID": 10001,
  "SEQUENCE": 1,
  "LOCALENAME": "en_US           "
}
The CreateCategoryDocumentFromDatabase processor transforms the input data into the following output data:

{ "update": { "_id": "1--1-10001-5", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{
  "doc": {
    "identifier": {
      "specification": "category",
      "language": "en_US",
      "category": {
        "normalized": "Girls",
        "raw": "Girls"
      }
    },
    "name": {
      "normalized": "Girls",
      "raw": "Girls"
    },
    "displayable": true,
    "description": {
      "raw": "Girls"
    },
    "id": {
      "catalog": "10001",
      "member": "7000000000000001000",
      "catgroup": "5",
      "language": "-1",
      "store": "1"
    },
    "keyword": {
      "text": "casual, sporty, skirt, sweater, kid"
    },
    "category": {
      "parent": "1",
      "sequence": 1,
      "catalog": "10001"
    },
    "url": {
      "image": "images/catalog/apparel/girls/category/catr_app_girls.png",
      "thumbnail": "images/catalog/apparel/girls/category/catr_app_girls.png"
    },
    "__meta": {
      "created": "2020-07-30T13:10:34.018Z",
      "modified": "2020-07-30T13:10:34.046Z",
      "version": {
        "min": 0,
        "max": 0
      }
    }
  },
  "doc_as_upsert": true
}

Stage 2

The following code is an example of the input data for the FindFacetsForCategoryFromDatabase processor:

{
  "CATALOG_ID": 10001,
  "CATGROUP_ID": 10001,
  "NAME": "Available Sizes",
  "ATTR_ID": "7000000000000000001",
  "IDENTIFIER": "swatchSize",
  "MAX_DISPLAY": "-1",
  "SELECTION": "0",
  "SEQUENCE": "0",
  "FACET_ID": "3074457345618269104",
  "SORT_ORDER": "0",
  "ZERO_DISPLAY": "0",
  "GROUP_ID": "0",
  "KEYWORD_SEARCH": "1",
  "FACETABLE": "1",
  "SWATCHABLE": "0",
  "STOREENT_ID": "10501",
  "DISPLAYABLE": "1"
}
The FindFacetsForCategoryFromDatabase processor transforms the above input data, together with the store ID, language ID, and catalog ID passed in as NiFi FlowFile attributes, into the following output data:

{ "update": { "_id": "1--1-10001-10001", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{
  "doc": {
    "__meta": {
      "modified": "2020-07-30T13:42:12.819Z"
    },
    "facets": {
      "swatchsize": {
        "sequence": 0,
        "search": true,
        "display": {
          "zero": false,
          "limit": -1,
          "multiple": false,
          "order": 0
        },
        "displayable": true,
        "id": "3074457345618269104",
        "attribute": {
          "identifier": "swatchSize",
          "name": "Available Sizes",
          "id": "7000000000000000001",
          "swatchable": false
        },
        "key": "swatchsize",
        "group": "0"
      }
    }
  }
}

Stage 3

The following code is an example of the input data for the BuildCatalogHierarchyForCategory processor:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 6,
      "relation": "eq"
    },
    "max_score": 0,
    "hits": [
      {
        "_index": ".auth.1.category.202006182302",
        "_type": "_doc",
        "_id": "1--1-10502-10",
        "_score": 0,
        "fields": {
          "name.normalized": [
            "newsletters & magazines"
          ],
          "id.member": [
            "7000000000000001001"
          ],
          "identifier.category.normalized": [
            "newslettersandmagazines"
          ],
          "name.raw": [
            "Newsletters & Magazines"
          ],
          "identifier.category.raw": [
            "NewslettersAndMagazines"
          ],
          "category.catalog": [
            "10502"
          ],
          "identifier.language": [
            "en_US"
          ],
          "id.catgroup": [
            "10"
          ],
          "id.catalog": [
            "10502"
          ],
          "identifier.specification": [
            "category"
          ],
          "id.language": [
            "-1"
          ],
          "id.store": [
            "1"
          ],
          "category.sequence": [
            6
          ]
        }
      },
      {
        "_index": ".auth.1.category.202006182302",
        "_type": "_doc",
        "_id": "1--1-10502-7",
        "_score": 0,
        "fields": {
          "name.normalized": [
            "grocery"
          ],
          "id.member": [
            "7000000000000001001"
          ],
          "identifier.category.normalized": [
            "grocery"
          ],
          "name.raw": [
            "Grocery"
          ],
          "identifier.category.raw": [
            "Grocery"
          ],
          "category.catalog": [
            "10502"
          ],
          "identifier.language": [
            "en_US"
          ],
          "id.catgroup": [
            "7"
          ],
          "id.catalog": [
            "10502"
          ],
          "identifier.specification": [
            "category"
          ],
          "id.language": [
            "-1"
          ],
          "id.store": [
            "1"
          ],
          "category.sequence": [
            3
          ]
        }
      },
      {
        "_index": ".auth.1.category.202006182302",
        "_type": "_doc",
        "_id": "1--1-10502-8",
        "_score": 0,
        "fields": {
          "name.normalized": [
            "health"
          ],
          "id.member": [
            "7000000000000001001"
          ],
          "identifier.category.normalized": [
            "health"
          ],
          "name.raw": [
            "Health"
          ],
          "identifier.category.raw": [
            "Health"
          ],
          "category.catalog": [
            "10502"
          ],
          "identifier.language": [
            "en_US"
          ],
          "id.catgroup": [
            "8"
          ],
          "id.catalog": [
            "10502"
          ],
          "identifier.specification": [
            "category"
          ],
          "id.language": [
            "-1"
          ],
          "id.store": [
            "1"
          ],
          "category.sequence": [
            4
          ]
        }
      },
      {
        "_index": ".auth.1.category.202006182302",
        "_type": "_doc",
        "_id": "1--1-10502-6",
        "_score": 0,
        "fields": {
          "name.normalized": [
            "electronics"
          ],
          "id.member": [
            "7000000000000001001"
          ],
          "identifier.category.normalized": [
            "electronics"
          ],
          "name.raw": [
            "Electronics"
          ],
          "identifier.category.raw": [
            "Electronics"
          ],
          "category.catalog": [
            "10502"
          ],
          "identifier.language": [
            "en_US"
          ],
          "id.catgroup": [
            "6"
          ],
          "id.catalog": [
            "10502"
          ],
          "identifier.specification": [
            "category"
          ],
          "id.language": [
            "-1"
          ],
          "id.store": [
            "1"
          ],
          "category.sequence": [
            2
          ]
        }
      },
      {
        "_index": ".auth.1.category.202006182302",
        "_type": "_doc",
        "_id": "1--1-10502-9",
        "_score": 0,
        "fields": {
          "name.normalized": [
            "home & furnishing"
          ],
          "id.member": [
            "7000000000000001001"
          ],
          "identifier.category.normalized": [
            "home furnishings"
          ],
          "name.raw": [
            "Home & Furnishing"
          ],
          "identifier.category.raw": [
            "Home Furnishings"
          ],
          "category.catalog": [
            "10502"
          ],
          "identifier.language": [
            "en_US"
          ],
          "id.catgroup": [
            "9"
          ],
          "id.catalog": [
            "10502"
          ],
          "identifier.specification": [
            "category"
          ],
          "id.language": [
            "-1"
          ],
          "id.store": [
            "1"
          ],
          "category.sequence": [
            5
          ]
        }
      },
      {
        "_index": ".auth.1.category.202006182302",
        "_type": "_doc",
        "_id": "1--1-10502-1",
        "_score": 0,
        "fields": {
          "name.normalized": [
            "apparel"
          ],
          "id.member": [
            "7000000000000001001"
          ],
          "identifier.category.normalized": [
            "apparel"
          ],
          "name.raw": [
            "Apparel"
          ],
          "identifier.category.raw": [
            "Apparel"
          ],
          "category.catalog": [
            "10502"
          ],
          "identifier.language": [
            "en_US"
          ],
          "id.catgroup": [
            "1"
          ],
          "id.catalog": [
            "10502"
          ],
          "identifier.specification": [
            "category"
          ],
          "id.language": [
            "-1"
          ],
          "id.store": [
            "1"
          ],
          "category.sequence": [
            1
          ]
        }
      }
    ]
  }
}

The BuildCatalogHierarchyForCategory processor transforms this input data into a set of output data: one for the next relationship and one for the update relationship, as shown in the NiFi flow chart.

The following sample output data is shown for the next relationship:

{
  "stored_fields": [
    "id*",
    "category.*",
    "name.*"
  ],
  "size": 10000,
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": [
        {
          "terms": {
            "category.parent": [
              "11",
              "1",
              "6",
              "7",
              "8",
              "9",
              "10"
            ]
          }
        },
        {
          "term": {
            "id.store": "1"
          }
        },
        {
          "term": {
            "id.catalog": "10001"
          }
        },
        {
          "term": {
            "id.language": "-1"
          }
        }
      ]
    }
  }
}
The following sample output data is shown for the update relationship:

{ "update": { "_id": "1--1-10001-7", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{ "doc": {"path":"/7","category":{"path":{"name":["Grocery"],"id":["7"]}},"__meta":{"modified":"2020-07-31T12:57:33.942Z"}} }
{ "update": { "_id": "1--1-10001-9", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{ "doc": {"path":"/9","category":{"path":{"name":["Home & Furnishing"],"id":["9"]}},"__meta":{"modified":"2020-07-31T12:57:34.409Z"}} }
{ "update": { "_id": "1--1-10001-6", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{ "doc": {"path":"/6","category":{"path":{"name":["Electronics"],"id":["6"]}},"__meta":{"modified":"2020-07-31T12:57:34.410Z"}} }
{ "update": { "_id": "1--1-10001-1", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{ "doc": {"path":"/1","category":{"path":{"name":["Apparel"],"id":["1"]}},"__meta":{"modified":"2020-07-31T12:57:34.410Z"}} }
{ "update": { "_id": "1--1-10001-11", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{ "doc": {"path":"/11","category":{"path":{"name":["Hardware"],"id":["11"]}},"__meta":{"modified":"2020-07-31T12:57:34.411Z"}} }
{ "update": { "_id": "1--1-10001-8", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{ "doc": {"path":"/8","category":{"path":{"name":["Health"],"id":["8"]}},"__meta":{"modified":"2020-07-31T12:57:34.411Z"}} }
{ "update": { "_id": "1--1-10001-10", "_index": ".auth.1.category.202006160325", "retry_on_conflict": 5, "_source": false } }
{ "doc": {"path":"/10","category":{"path":{"name":["Newsletters & Magazines"],"id":["10"]}},"__meta":{"modified":"2020-07-31T12:57:34.412Z"}} }