D. Import and configure the BigDataODBCHiveTemplate data source template in Campaign

This is the fourth step to integrate HCL® Campaign with Hive-based Apache Hadoop data sources.

Before you begin

Complete C. Map existing HBase tables to Hive.

About this task

To enable Campaign to communicate with your Hive-based Hadoop system, you must do the following actions:
  • Import the BigDataODBCHive.xml template into HCL Campaign. You must import the template only once. Importing a template makes it available for creating data sources.
  • Use the template to create and configure a data source for each Hive implementation that communicates with HCL Campaign.
  • For each data source, configure the HiveQueryMode property in the Campaign configuration.

Procedure

  1. Use the configTool utility to import the BigDataODBCHive.xml template into Campaign.
    • BigDataODBCHive.xml is in <Campaign_Home>/conf.
    • configTool is in <Marketing_Platform_Home>/tools/bin. For more information, see the HCL Marketing Platform Administrator's Guide at http://help.hcltechsw.com/.

    The following example imports the template into the default Campaign partition, partition1. Replace <Campaign_Home> with the complete path to the HCL Campaign installation directory.

    ./configTool -i -p "Affinium|Campaign|partitions|partition1|dataSources" -f <Campaign_Home>/conf/BigDataODBCHive.xml
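If you import the template into a partition other than partition1, only the partition segment of the -p path changes. The following is a minimal sketch that prints the import command for a given installation path and partition so that you can review it before you run it from <Marketing_Platform_Home>/tools/bin. The function name build_import_cmd and the sample path in the usage note are hypothetical and are not part of the product.

```shell
#!/bin/sh
# Sketch only: prints the configTool import command, it does not run it.
# build_import_cmd is a hypothetical helper name, not a product utility.
build_import_cmd() {
    campaign_home=$1            # complete path to the Campaign installation
    partition=${2:-partition1}  # defaults to the default partition
    printf './configTool -i -p "Affinium|Campaign|partitions|%s|dataSources" -f %s/conf/BigDataODBCHive.xml\n' \
        "$partition" "$campaign_home"
}
```

For example, build_import_cmd /opt/HCL/Campaign partition2 prints the command with partition2 in the path and /opt/HCL/Campaign/conf/BigDataODBCHive.xml as the template file. The path /opt/HCL/Campaign is a placeholder.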

  2. Create a data source based on BigDataODBCHiveTemplate. Do this for each Hive implementation that communicates with Campaign. For example, if you have four implementations (MapR, Cloudera, Hortonworks, BigInsights®), create four separate data sources, and configure each one.
    1. In HCL Campaign, choose Settings > Configuration.
    2. Go to Campaign|partitions|partition[n]|dataSources.
    3. Select BigDataODBCHiveTemplate.
    4. Supply a New category name that identifies the Hive data source, for example Hive_MapR, Hive_Cloudera, Hive_HortonWorks, or Hive_BigInsights.
    5. Complete the fields to set the properties for the new data source, then save your changes.
      Important: Some properties do not have default values, so you must supply them. Pay special attention to the properties described below. This is only a partial list of the properties included in this template. For complete information, see the IBM Campaign Administrator's Guide.
    Configuration property Description
    ASMUserForDBCredentials No default value defined. Specify the Campaign system user.
    DSN The DSN name as specified in the odbc.ini file for the Hive-based Hadoop big data instance.
    HiveQueryMode For data sources that use the DataDirect ODBC driver, use Native.

    For data sources that use the Cloudera ODBC driver or Hortonworks Hive ODBC driver, use SQL.

    JndiName Not needed for a user data source.
    SystemTableSchema No default value defined. Specify the user of the database that you connect to.
    OwnerForTableDisplay No default value defined. Specify the user of the database that you connect to.
    LoaderPreLoadDataFileCopyCmd SCP is used to copy data from HCL Campaign to a temp folder called /tmp on the Hive-based Hadoop system. The location must be called /tmp and it must be on the Hive server (the file system location, not the HDFS location). This value can either specify the SCP command or call a script that specifies the command.

    For more information and detailed instructions about how to export data from Campaign to a Hive-based Hadoop system, see the IBM Campaign Administrator's Guide.

    LoaderPostLoadDataFileRemoveCmd Data files are copied from HCL Campaign to a temp folder on the Hive-based Hadoop system. You must use the SSH "rm" command to remove the temporary data file.

    For more information and detailed instructions about how to export data from Campaign to a Hive-based Hadoop system, see the IBM Campaign Administrator's Guide.

    LoaderDelimiter No default value defined. Specify the delimiter, such as a comma (,) or semicolon (;), that separates fields in the temporary data files that are loaded into the big data instance. Tab (\t) is not supported.

    The delimiter value must match the ROW FORMAT delimiter that was used when the big data database table was created. In this example, a comma is used: ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ;

    SuffixOnTempTableCreation
    SuffixOnSegmentTableCreation
    SuffixOnSnapshotTableCreation
    SuffixOnExtractTableCreation
    SuffixOnUserBaseTableCreation
    SuffixOnUserTableCreation
    No default value defined for any of these properties. Use the same character as specified for LoaderDelimiter.
    UseExceptForMerge Set to FALSE. Hive does not support the EXCEPT clause, so a setting of TRUE can result in process failures.

    DateFormat
    DateTimeFormat
    DateTimeOutputFormatString
    For all three properties, date strings must use the dash "-" character to separate the parts of a date. Hive does not support any other separator characters for dates. Example: %Y-%m-%d %H:%M:%S
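To see what the example pattern produces, you can render the current time with the same directives. This is a minimal sketch that assumes the directives have the same meaning as in the POSIX date utility. It only illustrates the dash-separated output and does not change any Campaign setting.

```shell
#!/bin/sh
# Render the current time with the dash-separated pattern from the example.
# Assumption: the directives mean the same as in the POSIX date utility.
pattern='%Y-%m-%d %H:%M:%S'
formatted=$(date "+$pattern")
printf '%s\n' "$formatted"
```

The output has the form YYYY-MM-DD HH:MM:SS, with dashes between the year, month, and day.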
    Type BigDataODBC_Hive
    UseSQLToRetrieveSchema Set to FALSE.
    DataFileStagingFolder The default location is /tmp. You can change the location. Example: /opt/campaign/
    Note: The value for this folder must end with a trailing slash.
    If you have written a shell script to copy the Campaign data file to the Hive server, you must update the script to use the new location. Example:
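    #!/bin/sh
    # Copy the Campaign data file ($1) to the staging folder on the Hive server.
    scp "$1" root@emm52.in.ibm.com:/opt/campaign/
    # Make the copied file readable and writable by all users.
    ssh root@emm52.in.ibm.com "chmod 0666 /opt/campaign/$(basename "$1")"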
    If you are using LoaderPreLoadDataFileCopyCmd, then you need to update the file location. Example:
    scp <DATAFILE> <USER>@[hostname]:/opt/campaign/
    
    If you are using LoaderPostLoadDataFileRemoveCmd, then you need to update the file location. Example:
    ssh <USER>@[hostname] "rm /opt/campaign/<DATAFILE>"
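Because the copy command, the remove command, and DataFileStagingFolder must all point to the same folder, it can help to derive both commands from one folder value. The following is a minimal sketch under that assumption. The function names normalize_folder, copy_cmd, and remove_cmd are hypothetical, the functions only print the commands, and the user, host, and folder values in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers that print the commands without running them.

# Append a trailing slash to the staging folder if it is missing.
normalize_folder() {
    case $1 in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

# Print a copy command in the style of LoaderPreLoadDataFileCopyCmd.
# $1 = data file, $2 = user, $3 = host, $4 = staging folder
copy_cmd() {
    printf 'scp %s %s@%s:%s\n' "$1" "$2" "$3" "$(normalize_folder "$4")"
}

# Print the matching remove command in the style of LoaderPostLoadDataFileRemoveCmd.
# Takes the same arguments as copy_cmd.
remove_cmd() {
    printf 'ssh %s@%s "rm %s%s"\n' "$2" "$3" "$(normalize_folder "$4")" "$(basename "$1")"
}
```

For example, copy_cmd /data/out.dat campaign hivehost /opt/campaign prints scp /data/out.dat campaign@hivehost:/opt/campaign/, and remove_cmd with the same arguments prints ssh campaign@hivehost "rm /opt/campaign/out.dat".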

What to do next

E. Configure SSH on the Campaign listener server