HCL Commerce Version 9.1.15.0 or later

Statistics service for the Indexing build summary

The statistics service gathers data on catalog dimensions. You can store this data in the HCL Cache for use in creating Ingest run summaries, and for tuning parameter values.

Using the service

The statistics service runs once per day by default. If you want to change the run frequency, update the Run Schedule variable under the Generate FlowFile Process inside the Statistics Auth or Statistics Live stage of the indexing service. It gathers data on all relevant catalog related dimensions, such as products or items, categories, attributes and their values, facets, Search Term Associations, contracts, and prices. The service adds this information to the Indexing Build summary document. Once the statistics have been collected, you can store them in the HCL Cache or even an index of their own, so that they can be looked up when creating an Ingest Run summary. This data is also needed when calculating certain tuning parameter values.

If you are using a single environment configuration, the service gathers dimensions related only to that environment. For example, if you run the service in the Auth environment the setup service will fail to gather data for the Live JDBC and an error will be written to the logs, which you can access using using GET /log/_search?q=severity:E with error code (logger.message.code) DI2150E and DI2151E for Live and Auth respectively. This can be resolved by creating a dual environment setup, as described in Elasticsearch with dedicated Auth and Live nodes. After changing your environment, run the service again to get the latest Catalog data profile summary. For more information about dual

The data gathered by the service can be accessed through the index status API for all reindex operations.

HCL Commerce Version 9.1.15.0 or laterThe SQL count method is used to gather the dimensions. Each SQL statement requires a unique identifier and the location (file system address) of the SQL code file that contains it. Some examples are provided below to illustrate the kinds of dimensions to be collected. You can adjust this SQL to account for other use cases.
Table 1. Default dimension queries
SQL CODE (ingest.database.sql) IDENTIFIER (ingest.database.identifier)
SELECT COUNT(*) COUNTER FROM STORE; total.stores
SELECT COUNT(*) COUNTER FROM STORE WHERE STORETYPE = 'CPS';
total.catalog.asset.stores
SELECT COUNT(*) COUNTER FROM STORE WHERE STORETYPE = 'MHS';
total.b2c.stores
SELECT COUNT(*) COUNTER FROM STORE WHERE STORETYPE = 'BMH';
total.b2b.stores
SELECT COUNT(*) COUNTER FROM STORE WHERE STORETYPE = 'MPS';
total.b2c.storefront.asset.stores
SELECT COUNT(*) COUNTER FROM STORE WHERE STORETYPE = 'BMP';
total.b2b.storefront.asset.stores
SELECT COUNT(DISTINCT LANGUAGE_ID) COUNTER FROM STORELANG   s
                WHERE STOREENT_ID IN (SELECT STORE_ID FROM STORE WHERE STORETYPE IN ('MHS',
                  'BMH'));
total.supported.languages
SELECT COUNT(*) COUNTER FROM STORECAT; total.catalogs
SELECT COUNT(*) COUNTER FROM STORECAT WHERE MASTERCATALOG   = 1; 
total.master.catalogs
SELECT COUNT(*) COUNTER FROM CATGROUP WHERE MARKFORDELETE   = 0;
total.categories 
                
SELECT COUNT(*) COUNTER FROM CATENTRY WHERE MARKFORDELETE   = 0;
total.catentries
SELECT COUNT(*) COUNTER FROM CATENTRY c WHERE   CATENTTYPE_ID
                IN ('ProductBean') AND MARKFORDELETE = 0;
total.products
SELECT COUNT(*) COUNTER FROM CATENTRY c WHERE   CATENTTYPE_ID
                IN ('ItemBean') AND MARKFORDELETE = 0;
total.items
SELECT COUNT(*) COUNTER FROM CATENTRY c WHERE   CATENTTYPE_ID
                NOT IN ('ProductBean', 'ItemBean') AND MARKFORDELETE = 0;
total.other.catenties
SELECT COUNT(*) COUNTER FROM CONTRACT WHERE STATE = 3; total.contracts
SELECT COUNT(*) COUNTER FROM ATTR; total.attributes
SELECT COUNT(*) COUNTER FROM ATTR WHERE ATTRUSAGE = 1; total.allowed.value.attributes
SELECT COUNT(*) COUNTER FROM ATTR WHERE ATTRUSAGE = 2 OR
                ATTRUSAGE IS NULL; 
total.assigned.value.attributes
SELECT COUNT(*) COUNTER FROM ATTR WHERE FACETABLE = 1; total.facetable.attributes
SELECT COUNT(*) COUNTER FROM ATTRVAL; total.attribute.values
SELECT COUNT(*) COUNTER FROM ATTRVAL WHERE VALUSAGE IN(1,2);
total.allowed.attribute.values
SELECT COUNT(*) COUNTER FROM ATTRVAL WHERE VALUSAGE IS NULL;
total.assigned.attribute.values
SELECT COUNT(*) COUNTER FROM CATENTRYATTR; total.catentry.attributes
SELECT COUNT(*) COUNTER FROM CATENTRYATTR WHERE USAGE= 1;
total.catentry.defining.attributes
SELECT COUNT(*) COUNTER FROM CATENTRYATTR WHERE USAGE= 2;
total.catentry.descriptive.attributes
SELECT COUNT(*) COUNTER FROM TRADEPOSCN WHERE   MARKFORDELETE = 0;
total.pricelists
SELECT COUNT(*) COUNTER FROM OFFER WHERE ENDDATE > CURRENT_TIMESTAMP 
OR ENDDATE IS NULL;
total.offers
SELECT COUNT(*) COUNTER FROM OFFERPRICE; total.offerprices
SELECT COUNT(*) COUNTER FROM CATGRPTPC; total.offer.pricelists
SELECT COUNT(*) COUNTER FROM INVENTORY; total.inventories
Note:

The service is used to capture statistics for tuning purpose. It is not a real-time Catalog data profile summary.

If there have been significant changes in Catalog size or you have customized the SQL, re-run the service by changing the Run Schedule in the Generate FlowFile Process inside the Statistics Auth or Statistics Live stage of the indexing process, and do a full reindexing run to generate a real-time Catalog data profile summary.

The Ingest summary is a record of the dimension sizes at the time that it was run. It is not updated in real-time, so any changes made to the underlying tables will not be registered until the next time you run the report.

Adding custom SQL to the service

You can add your own SQL to the Statistics Service by adding your SQL inside the sql/statistics directory. The file name should be the same as the identifier for the key. Store the SQL file address along with its identifier in the HCL Cache in the key-value pair format shown in the following example.
{	
		"identifier" : "total.stores",
		"location" : "sql/statistics/TotalStores.sql"
	},
  1. In the SQL statement, store the value extracted by the query in COUNTER. This value corresponds to the value in the key-value pair used to store the dimension in HCL-Cache.
  2. Create a file with the same name as you will be using for the identifier in the SQL code inside sql/statistics and use the SQL file name of the IDENTIFIER key. Examples of the key can be found in the second column of the Default dimension queries table.
  3. Once you have created the SQL file and its corresponding identifier, add the identifier/location pair in JSON format to the Replacement Value variable inside the Catalog Flowfile processor. Use the convention Identifier-SQL JSON for the identifier and sql/statistics/sql_filename.sql for the SQL file. For example, the default dimension identifiers and their corresponding SQL code files are defined as follows:
    [
    	{	
    		"identifier" : "total.stores",
    		"location" : "sql/statistics/TotalStores.sql"
    	},
    	{	
    		"identifier" : "total.catalog.asset.stores",
    		"location" : "sql/statistics/TotalCatalogAssetStores.sql"
    	},
    	{	
    		"identifier" : "total.b2c.stores",
    		"location" : "sql/statistics/Totalb2cStores.sql"
    	},
    	{	
    		"identifier" : "total.b2b.stores",
    		"location" : "sql/statistics/Totalb2bStores.sql"
    	},	
    	{	
    		"identifier" : "total.b2c.storefront.asset.stores",
    		"location" : "sql/statistics/Totalb2cStorefrontAssetStores.sql"
    	},	
    	{	
    		"identifier" : "total.b2b.storefront.asset.stores",
    		"location" : "sql/statistics/Totalb2bStorefrontAssetStores.sql"
    	},	
    	{	
    		"identifier" : "total.supported.languages",
    		"location" : "sql/statistics/TotalSupportedLanguages.sql"
    	},	
    	{	
    		"identifier" : "total.catalogs",
    		"location" : "sql/statistics/TotalCatalogs.sql"
    	},	
    	{	
    		"identifier" : "total.master.catalogs",
    		"location" : "sql/statistics/TotalMasterCatalogs.sql"
    	},	
    	{	
    		"identifier" : "total.categories",
    		"location" : "sql/statistics/TotalCategories.sql"
    	},	
    	{	
    		"identifier" : "total.catentries",
    		"location" : "sql/statistics/TotalCatentries.sql"
    	},	
    	{	
    		"identifier" : "total.products",
    		"location" : "sql/statistics/TotalProducts.sql"
    	},	
    	{	
    		"identifier" : "total.items",
    		"location" : "sql/statistics/TotalItems.sql"
    	},	
    	{	
    		"identifier" : "total.other.catenties",
    		"location" : "sql/statistics/TotalOtherCatenties.sql"
    	},	
    	{	
    		"identifier" : "total.contracts",
    		"location" : "sql/statistics/TotalContracts.sql"
    	},	
    	{	
    		"identifier" : "total.attributes",
    		"location" : "sql/statistics/TotalAttributes.sql"
    	},	
    	{	
    		"identifier" : "total.allowed.value.attributes",
    		"location" : "sql/statistics/TotalAllowedValueAttributes.sql"
    	},	
    	{	
    		"identifier" : "total.assigned.value.attributes",
    		"location" : "sql/statistics/TotalAssignedValueAttributes.sql"
    	},	
    	{	
    		"identifier" : "total.facetable.attributes",
    		"location" : "sql/statistics/TotalFacetableAttributes.sql"
    	},	
    	{	
    		"identifier" : "total.attribute.values",
    		"location" : "sql/statistics/TotalAttributeValues.sql"
    	},	
    	{	
    		"identifier" : "total.allowed.attribute.values",
    		"location" : "sql/statistics/TotalAllowedAttributeValues.sql"
    	},	
    	{	
    		"identifier" : "total.assigned.attribute.values",
    		"location" : "sql/statistics/TotalAssignedAttributeValues.sql"
    	},	
    	{	
    		"identifier" : "total.catentry.attributes",
    		"location" : "sql/statistics/TotalCatentryAttributes.sql"
    	},	
    	{	
    		"identifier" : "total.catentry.defining.attributes",
    		"location" : "sql/statistics/TotalCatentryDefiningAttributes.sql"
    	},
    	{	
    		"identifier" : "total.catentry.descriptive.attributes",
    		"location" : "sql/statistics/TotalCatentryDescriptiveAttributes.sql"
    	},
    	{	
    		"identifier" : "total.pricelists",
    		"location" : "sql/statistics/TotalPricelists.sql"
    	},
    	{	
    		"identifier" : "total.offers",
    		"location" : "sql/statistics/TotalOffers.sql"
    	},
    	{	
    		"identifier" : "total.offerprices",
    		"location" : "sql/statistics/TotalOfferprices.sql"
    	},
    	{	
    		"identifier" : "total.offer.pricelists",
    		"location" : "sql/statistics/TotalOfferPricelists.sql"
    	},
    	{	
    		"identifier" : "total.inventories",
    		"location" : "sql/statistics/TotalInventories.sql"
    	}	
    ]