Connectors are sets of data pipes with built-in processing, which together form a
pipeline. To create a new connector, define a descriptor that describes the schema,
the pipeline the connector will use, and the schedule (interval, recurrence) that it
follows.
Before you begin
Important: When the Ingest service starts, it attempts to
verify the required connector settings in both ZooKeeper and NiFi. If no prior
connector configuration is detected, the Ingest service creates all the
mandatory HCL Commerce Search connectors automatically. This one-time procedure can
take up to 20 minutes to complete, depending on the computing power of your system.
External mounts with standalone Docker containers, or persistent
volumes in Kubernetes, are highly recommended for this process. Refer to
the sample docker-compose files or Helm charts for details.
This verification check exists because data inside the
application containers can be lost once a container is deleted. Storing the
application configuration and internal metadata in external storage allows the
application to resume immediately from its most recent state and continue to
function even after the container has been redeployed.
About this task
In this topic you will learn how to build a NiFi connector for use with the Ingest
service. You create the connector by defining a descriptor and making a request to
the Ingest service at POST: /connectors.
Note: The following instructions describe the Commerce side of an architecture that uses
the open-source Apache NiFi project. As such, documentation about NiFi is
necessarily maintained by the Apache project. For complete instructions on
programming for NiFi, refer to the
NiFi Documentation on the project
site.
Procedure
-
Build your NiFi pipes. Each pipe is a NiFi process group. You can extend the
existing default pipes by creating new connectors and storing your pipes in the
NiFi Registry. By using the registry, you can maintain each pipe separately and
at its own version number.
The pre-built pipelines provided in Version 9.1 are Store, Catalog,
Attribute, Category, Product, SEO, Price, Inventory, Entitlement, and
Merchandising.
- Store
- Name, description, store level defaults, supported languages and
currencies
- Catalog
- Name, description, catalog filters
- Attribute
- Name, description, attribute values, and facet properties
- Category
- Name, short description, sales catalog hierarchy and navigation
properties, facets, SEO URL
- Product
- Name, short description, brand, price lists, inventory counts,
natural language, sales catalog hierarchy and navigation properties,
SEO URL, spell correction, suggestion, entitlement, merchandising
associations, attachments
- URL
- Search engine optimization properties and URL resolution for
products and categories
- Description
- Long description for product and category.
-
Define your connector. A connector is a set of pipes or pipelines, bundled as a
single processing unit. The standard tool for constructing connectors is the
Apache NiFi interface. You can use the
NiFi console to describe the relationships between pipelines that process
incoming data. Multiple processing pipelines, including custom pipelines that
you create, can be linked to one another to form a pipeline series inside a
connector. The output is a Connector Descriptor, which you store in
ZooKeeper. The console is located at
http://nifi_hostname:30600/nifi.

You can set connectors to run once, or on a recurring
schedule.
-
Design your connector by defining the needed attributes in a descriptor. The
descriptor serves as a blueprint for a connector and has the following required
attributes (a sample descriptor follows this list):
- Name
- Each descriptor (and by extension a connector) must have a unique
name. If a connector with the given name already exists, a new
connector will not be created and a 400: Bad Request will be
returned.
- Description
- A description of what this connector does. Give each connector a
description so that its purpose is easy to recall.
- Pipes
- The list of pipes that make up a connector. Each pipe in a connector
performs one or more ETL operations. Each pipe must have a name,
which corresponds to a pipe that exists in the NiFi Registry. For
more information, see the Apache NiFi Registry documentation.
- You can define the properties that you want set in a pipe;
PROCESS_GROUP and CONTROLLER_SERVICE level properties are supported.
To define a property, specify its name and value alongside its scope
(the immediate parent process group or controller service that it
should be defined inside).
- When used with PROCESS_GROUP scope, the property defined inside this
section is used as a variable that is set in the given process
group. For example:
{
  "name": "connector.wait.limit",
  "value": "3",
  "scope": {
    "name": "Wait for Completion",
    "type": "PROCESS_GROUP"
  }
}
In this example, you define a variable named
connector.wait.limit inside the process group named
Wait for Completion, and assign it a value of 3.
Variables in NiFi are inherited by child processors and child process
groups, so setting a variable at the pipe level is sufficient for all
of its sub-components to access it.
- When used with CONTROLLER_SERVICE scope, the property defined inside
this section is treated as one of the existing properties belonging
to the specified controller service. For example:
{
  "name": "Database Connection URL",
  "value": "${AUTH_JDBC_URL}",
  "scope": {
    "name": "Database Connection Pool",
    "type": "CONTROLLER_SERVICE"
  }
}
In this example, the existing property "Database Connection
URL" of the controller service named "Database Connection
Pool" is updated with the given value
${AUTH_JDBC_URL}. This value can be an absolute value or an
environment variable name, as illustrated here. Using an
environment variable allows the value to be configured at
deployment time, when the NiFi application container is set up.
- HCL Commerce Version 9.1 provides default support for a
core set of Commerce-specific document types. These data
specifications can be used for both full and delta updates. For an
example of a Commerce data specification, see Product data specification.
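The following is a minimal sketch of a descriptor that combines these
attributes. It is illustrative only: the connector name
custom.product, the pipe name, and the properties key that
groups the scoped property definitions under each pipe are assumptions, so
model a real descriptor on the sample .json connector files
referenced later in this procedure.
{
  "name": "custom.product",
  "description": "Indexes product data from a custom source",
  "pipes": [
    {
      "name": "custom.product",
      "properties": [
        {
          "name": "connector.wait.limit",
          "value": "3",
          "scope": {
            "name": "Wait for Completion",
            "type": "PROCESS_GROUP"
          }
        }
      ]
    }
  ]
}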
-
Create connectors in Elasticsearch.
-
Use the REST interface of your choice to create the connector. Run the
following request with the .json connector file you
have created.
POST: http://ingest_hostname:30800/connectors
Body: select "JSON"
Copy and paste the contents of the connector
file into the Body and submit the request. The process can take 10 to 20
minutes to complete.
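For example, to create the connector sketched in the previous step (the
hostname and descriptor values are placeholders):
POST: http://ingest_hostname:30800/connectors
Body:
{
  "name": "custom.product",
  "description": "Indexes product data from a custom source",
  "pipes": [
    { "name": "custom.product" }
  ]
}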
You can find examples of
.json connector files in the connectors.zip installation file, which is
up to date for Version 9.1.9.0. To fetch and download connectors for
your current version of HCL Commerce, use the Ingest API
/connectors endpoint. You can use the
GET: /connectors endpoint to get the
descriptors of all existing connectors in NiFi. For more
information, see Managing connectors in the Ingest service.
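As a rough illustration, the GET request returns the stored connector
descriptors. The exact response envelope can vary between versions, so
treat this shape as an assumption rather than a contract:
GET: http://ingest_hostname:30800/connectors

[
  {
    "name": "custom.product",
    "description": "Indexes product data from a custom source",
    "pipes": [
      { "name": "custom.product" }
    ]
  }
]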
-
Check the NiFi interface to ensure that the pipeline is running. Wait
until there are 0 stopped components, 0 invalid components, and 0/0 bytes
of total queued data.
-
In your REST interface, use the GET method to verify your connectors:
GET: http://ingest_hostname:30800/connectors
Note: If you encounter a "No processor defined" error after you have created the
connectors, restart them. In the NiFi
Operate panel, click the
Stop button, then the
Start button, to restart all connectors.
The NiFi interface is at the following
address:
http://nifi_hostname:30600/nifi/
List of built-in connectors:
- auth.validate
- This pipeline checks the health of the index by
comparing and counting Elasticsearch documents against the
database. Currently it checks the integrity of the store, category,
product, attribute, and URL indexes.
- auth.content / live.content
-
Creates static pages inside the URL index. Each page layout
is constructed using a selection of widgets from the
Commerce Composer widget library.
- auth.delete / live.delete
- When a delete event for a category, product, or attribute
occurs, this pipe is called. It deletes the products or
categories and sends an update event to the parents or children
of those products or categories.
- push-to-live
- This pipeline is called when you are ready to send all
index data to the live environment. It is used as a locking
mechanism for writing data into the live environment. When write
access is granted in a live environment, authentication is
disabled.
Results
You now have a data definition, a pipe or set of pipes, and their relationships, as
described in a data descriptor. You are ready to use the connector with the Ingest
service.