Synchronizing the Profiles database with your organization's user data

To keep your organization's user data up-to-date, regularly synchronize the IBM® Connections Profiles database with your data source, such as an LDAP directory or an employee database.

About this task

The Profiles database is the core repository within Connections for information about the Connections users in your organization. Typically, the primary source of this user data is your organization's LDAP directory, but you can also synchronize from non-LDAP sources, either exclusively or in combination with an LDAP directory. For more information about synchronizing non-LDAP sources, see Using a custom source repository connector.

The synchronization process is controlled by properties in the profiles_tdi.properties file.

Use the sync_all_dns command, which is the recommended method, to transfer changes in your organization's user data repository to the Profiles database. If you want to keep the Profiles database closely synchronized with your LDAP directory, run this command nightly or at another frequency that suits your organization.

Note: An alternative approach is to synchronize by using a change log that is maintained by the LDAP directory. However, this approach is difficult to get working and is not recommended. For more information about using this approach, see Synchronizing IBM Tivoli Directory Server and Microsoft Active Directory LDAP changes.

During synchronization, the values of attributes that are mapped from the LDAP directory to the Profiles database in the file map_dbrepos_from_source.properties are evaluated to determine which users need updating. In the Profiles database, existing users are updated, deleted, or deactivated, and new users are created. If you configure extension attributes, that data is also compared and synchronized. The comparison includes all user data within the search scope, including extension data.
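The comparison described above can be pictured with a small model. The following is only an illustrative sketch, not the TDI assembly-line code: the classify function, the record layout, and the sample attribute values are hypothetical, and the hash_field argument stands in for the sync_updates_hash_field property that is described in the procedure.

```python
# Illustrative model only; this is not how TDI implements sync_all_dns.
def classify(source, database, hash_field="uid"):
    """Return (adds, updates, removals) as lists of hash-field values.

    source and database are lists of dicts that hold already-mapped attributes.
    """
    src = {rec[hash_field]: rec for rec in source}
    db = {rec[hash_field]: rec for rec in database}

    adds = [key for key in src if key not in db]
    # Users missing from the source are candidates for delete or inactivate.
    removals = [key for key in db if key not in src]
    # A user needs updating when any mapped attribute differs.
    updates = [key for key in src if key in db and src[key] != db[key]]
    return adds, updates, removals


# Hypothetical sample data.
source = [
    {"uid": "A", "email": "a@zetabank.com", "title": "Teller"},
    {"uid": "B", "email": "b@zetabank.com", "title": "Manager"},
    {"uid": "D", "email": "d@zetabank.com", "title": "Analyst"},
]
database = [
    {"uid": "A", "email": "a@zetabank.com", "title": "Teller"},
    {"uid": "B", "email": "b@zetabank.com", "title": "Clerk"},
    {"uid": "C", "email": "c@zetabank.com", "title": "Auditor"},
]

adds, updates, removals = classify(source, database)
print(adds, updates, removals)  # ['D'] ['B'] ['C']
```

The same model suggests why the hash field should be stable: if the uid of user B changed in the source, B would appear as a removal and the new uid as an add, rather than as an update to the existing record.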

The sync_all_dns command processes one user at a time, so as the number of users and extension attributes increases, a daily synchronization can take too long. Two performance-related options are available for sync_all_dns: multi-processing and time stamp tracking. The multi-processing option divides the work into independent processes that run simultaneously, so that multiple users are processed concurrently. The time stamp tracking option tracks the time of last update that is recorded in the LDAP directory, which all LDAP directories support, and eliminates the need for most comparisons. In certain circumstances the time stamp cannot be used, for example, when you are switching LDAP directories. For details about these options, see Improving the performance of the sync_all_dns command.

The sync_all_dns command creates temporary files that are used during the synchronization process. Use the sync_updates_working_directory property to specify the location of the temporary files. Use the sync_updates_clean_temp_files property to specify whether to delete or retain the temporary files after synchronization. Retaining the files is useful when you are troubleshooting a problem.

In addition to the temporary files, the following files in the TDI solution directory record the changes that were made during synchronization:

  • employee.adds
  • employee.delete
  • employee.error
  • employee.skip
  • employee.update

When the sync_updates_show_summary_only property is set to true, these files list the records that would be changed, but no changes are made to the Profiles database.

Like the other TDI tasks, the sync_all_dns command writes log information to the log file ibmdi.log in the TDI\logs directory. You can check the log to see whether the command finishes successfully, and look for error information if necessary.

For more information about how the sync_all_dns command works, see Understanding how the sync_all_dns process works.

Procedure

To synchronize LDAP directory changes with Profiles:
  1. Use the properties in the following table to control the synchronization process.
    Option Description

    sync_updates_hash_field

    sync_updates_hash_field is a key property. This property specifies the field that is used to match a user record in the Profiles database with the corresponding user information in the source. The supported fields are uid, guid, and email. The default is uid.

    It is critical that you choose a field that does not ordinarily change over time, so that the match will remain intact. If the value in the field in the source does change, the match is broken, and the existing database information for this person could be deleted.

    If the value of the hash field in the source does change, you must set this property to a different field that has not changed, for at least one run of sync_all_dns. For example, if the value for uid changes in the source, you must set the property to either guid or email. After one run of sync_all_dns, you can change the property back to uid.

    Note: If the value is guid and you change LDAP providers, you must change the value to uid or email temporarily because guid is LDAP-specific.

    perform_deletion_or_inactivate_for_sync

    The default value is true. Set this property to false when you don't want to delete or mark as inactive those users who are no longer in the LDAP directory.

    The sync_all_dns command checks the value of the property and acts by using the following logic:

    • If the value is true, look at the sync_delete_or_inactivate property to determine which action to take. The action is either delete or inactivate.
    • If the value is false, perform neither the delete action nor the inactivate action.

    sync_delete_or_inactivate

    Controls what happens to a user record when the user is not found in the LDAP directory. The value must be either delete or inactivate, and is case-sensitive. Inactivating a user is effectively a soft delete. By default, the property is set to inactivate. Whether the user is deleted or inactivated, the inactive state is propagated to all the other Connections applications.

    For information about hard deleting users who have been inactive for some time, see Using supplied scripts to delete inactive users based on inactivity length.

    source_ldap_iterate_with_filter

    The default value is false. When set to true, the source_ldap_iterate_with_filter_functions_file property is used to locate the file that contains the JavaScript code that affects chunking. This is needed when there is a limit to the number of users that can be obtained with an LDAP query.

    For more information about how to use this property, see Populating a large user set.

    This property is not configurable when you are using the population wizard.

    source_ldap_iterate_with_filter_functions_file

    Used only when source_ldap_iterate_with_filter is set to true.

    Set this property to the name of a JavaScript file that contains the filter code that affects chunking.

    Use this property when the size of the data to be retrieved from the LDAP directory exceeds the search limit of the directory or exceeds the memory capacity of the TDI LDAP connector. For example, if your search parameters would return 250K records but your LDAP directory allows only 10K to be returned at a time, you can use this property.

    For more information about how to use this property, see Populating a large user set.

    This property is not configurable when using the population wizard.

    sync_updates_double_check

    The default value is false. When set to true, the assembly line that is defined by the sync_check_if_remove property runs.
    Note: This property applies to delete/inactivate only.

    sync_check_if_remove

    Used only when sync_updates_double_check is set to true.

    Specifies the name of an assembly line in profiles_tdi.xml that verifies the delete operation or the inactivate operation.

    By default, the name of the assembly line is set to sync_all_dns_check_if_remove. The sync_all_dns_check_if_remove assembly line looks up the distinguished name of the about-to-be-deleted user in the LDAP directory. If the user is found, sync_all_dns_check_if_remove returns a status that causes the main assembly to bypass the delete or inactivate action.

    For more information about this property, see Customizing the logic used for the delete operation.

    sync_updates_clean_temp_files

    The default value is true. When set to false, temporary files are not deleted until the next time that sync_all_dns is run.

    sync_updates_hash_partitions

    Number of partitions to divide the temporary files into. The default of 10 is sufficient in most cases. If problems develop, you can increase the value. The typical problem is running out of memory during the update phase because all the data related to all users in a partition is held in memory during the update, delete, and add phases. For more information about partitions, see Understanding how the sync_all_dns process works.

    sync_updates_show_summary_only

    The default value is false. When set to true, the employee.* files in the TDI solution directory list the records that would be changed, but no changes are made.

    sync_updates_working_directory

    The directory where the working files are stored. The path can be relative to the TDI solution directory or an absolute path. The default value is sync_updates, which is a relative path.

    sync_updates_size_model

    The default value is single. This property is used for enhancing the performance of the sync_all_dns command. Possible values are single, multi4, multi6, or multi8. For more information about this property, see Improving the performance of the sync_all_dns command.

    sync_updates_use_ldap_timestamp

    The default value is false. This property is used for enhancing the performance of the sync_all_dns command. For more information about this property, see Improving the performance of the sync_all_dns command.

  2. If you are storing data from multiple LDAP branches or multiple LDAP directories in the same Profiles database, you must synchronize each LDAP branch or LDAP directory separately. To accomplish this task, you can set the following properties in the profiles_tdi.properties file.
    Note: These properties can only be used with the sync_all_dns command. They cannot be used with the process_tds_changes and process_ad_changes commands.
    Option Description

    sync_source_url_enforce

    The default value is false. When set to true, synchronizes only those users where the stored source URL matches the current source URL. The current source URL is the concatenation of the source_ldap_url, source_ldap_search_base, and source_ldap_search_filter properties. That is, it limits the scope of the set of data in the database, and skips the records that do not match the current source URL.

    sync_source_url_override

    The default value is false. This property is effective only when sync_source_url_enforce is true. When sync_source_url_enforce is true and sync_source_url_override is false, records where the current source URL and the stored source URL do not match are skipped.

    When sync_source_url_override is true, the records that would have been skipped are checked for a match of the hash field in the current LDAP branch or LDAP directory. If there is a match and at least one field needs to be updated, the record is updated, and the source URL is set to the current value. If no fields need to be updated, no change is made. That outcome is very unusual in the cases where you would use this override: the major use case is switching LDAP directories, and in that case the guid is sure to change.

    Make sure that you clearly understand this property before you set it to true. The last example later in this document shows it in action.

    sync_store_source_url

    The default value is true. Stores the source LDAP URL in the prof_source_url field in the database. The source LDAP URL is needed to determine the source of the data to correctly synchronize it when there is more than one LDAP branch or LDAP directory. Even if there is only one LDAP, it is best to leave this property set to true.

  3. Run the sync_all_dns command. The command name is either sync_all_dns.sh or sync_all_dns.bat, depending on your operating system.
    Note: When the sync_all_dns command runs, a lock file is created in the TDI solution directory. The lock file prevents others from starting a sync_all_dns process in the same TDI solution directory. The name of the lock file is sync_all_dns.lck. The lock file is deleted after the sync_all_dns command completes. If the command does not complete, the lock file is not deleted. You can delete it yourself, or you can run the clearLock.sh or the clearLock.bat script, located in the TDI solution directory.
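The removal-related properties in step 1 (perform_deletion_or_inactivate_for_sync, sync_delete_or_inactivate, and sync_updates_double_check) combine into a single decision for each user who is no longer found in the source. The following sketch models that decision under the defaults stated in the table. It is not the product's implementation: the removal_action function is hypothetical, and the found_on_recheck argument stands in for the result of the sync_all_dns_check_if_remove assembly line.

```python
# Illustrative model of the documented decision logic; names are hypothetical.
def removal_action(props, found_on_recheck=False):
    """Return 'delete', 'inactivate', or 'skip' for a user missing from the source."""
    # Default is true: deletion or inactivation is performed.
    if props.get("perform_deletion_or_inactivate_for_sync", "true") != "true":
        return "skip"

    # Default is false: the double-check assembly line runs only when enabled.
    double_check = props.get("sync_updates_double_check", "false") == "true"
    if double_check and found_on_recheck:
        # The user was found after all, so the removal is bypassed.
        return "skip"

    # Default is inactivate; the value is case-sensitive.
    action = props.get("sync_delete_or_inactivate", "inactivate")
    if action not in ("delete", "inactivate"):
        raise ValueError("sync_delete_or_inactivate must be delete or inactivate")
    return action


print(removal_action({}))                                                    # inactivate
print(removal_action({"sync_delete_or_inactivate": "delete"}))               # delete
print(removal_action({"perform_deletion_or_inactivate_for_sync": "false"}))  # skip
print(removal_action({"sync_updates_double_check": "true"}, found_on_recheck=True))  # skip
```

Note that a value such as Delete raises an error in this model because the property value is case-sensitive.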

Example employee tables

The sample EMPLOYEE table illustrates results from a scenario for the fictional Zeta Bank company, in which you have pulled users A, B, and C from the Littleton LDAP branch and users X, Y, and Z from the Westford LDAP branch.

Following the best practice, you have used two TDI solution directories with slightly different profiles_tdi.properties files. In the following discussion, one of the solution directories is called ABC, and the other is called XYZ.

Each of the profiles_tdi.properties files includes the following entries:

source_ldap_url=ldap://ldap.zetabank.com:389
source_ldap_search_filter=((objectClass=inetOrgPerson)(uid=*))
sync_store_source_url=true
sync_source_url_enforce=true
sync_source_url_override=false

The ABC/profiles_tdi.properties file, for the Littleton branch, includes the following entry:

source_ldap_search_base=cn=users,location=littleton,dc=zetabank,dc=com

The XYZ/profiles_tdi.properties file, for the Westford branch, includes the following entry:

source_ldap_search_base=cn=users,location=westford,dc=zetabank,dc=com

After running collect_dns and populate_from_dn_file in each of the two TDI solution directories, the EMPLOYEE table contains the following data:

Table 1. Example employee table after running collect_dns and populate_from_dn_file in each TDI solution directory.
uid PROF_SOURCE_URL
A ldap://ldap.zetabank.com:389/cn=users,location=littleton,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
B ldap://ldap.zetabank.com:389/cn=users,location=littleton,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
C ldap://ldap.zetabank.com:389/cn=users,location=littleton,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
X ldap://ldap.zetabank.com:389/cn=users,location=westford,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
Y ldap://ldap.zetabank.com:389/cn=users,location=westford,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
Z ldap://ldap.zetabank.com:389/cn=users,location=westford,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))

Notice that the only difference between the values for PROF_SOURCE_URL is the location parameter.

If you run sync_all_dns in the ABC TDI solution directory, you get updates for people A, B, and C, but not for people X, Y, and Z because sync_source_url_enforce is set to true and the PROF_SOURCE_URL for those people does not match the concatenation of the source_ldap_url, source_ldap_search_base, and source_ldap_search_filter properties. Setting sync_source_url_enforce to false causes the people X, Y, and Z to be deleted from the database because they don’t exist in the Littleton branch.
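The scoping behavior in this scenario can be modeled in a few lines. This is a minimal sketch, not the product's code: the function names are hypothetical, and the "/" and "?" separators used to join the three properties are inferred from the PROF_SOURCE_URL values shown in Table 1.

```python
# Illustrative model only; function names are hypothetical.
def current_source_url(props):
    # Separators "/" and "?" are inferred from the PROF_SOURCE_URL column.
    return "{}/{}?{}".format(
        props["source_ldap_url"],
        props["source_ldap_search_base"],
        props["source_ldap_search_filter"],
    )


def records_in_scope(records, props):
    """Return the database records that a sync run would consider."""
    if props.get("sync_source_url_enforce", "false") != "true":
        return list(records)  # no scoping: every record is compared
    current = current_source_url(props)
    return [rec for rec in records if rec["PROF_SOURCE_URL"] == current]


common = {
    "source_ldap_url": "ldap://ldap.zetabank.com:389",
    "source_ldap_search_filter": "((objectClass=inetOrgPerson)(uid=*))",
    "sync_source_url_enforce": "true",
}
abc = dict(common, source_ldap_search_base="cn=users,location=littleton,dc=zetabank,dc=com")
xyz = dict(common, source_ldap_search_base="cn=users,location=westford,dc=zetabank,dc=com")

employee = [{"uid": u, "PROF_SOURCE_URL": current_source_url(abc)} for u in "ABC"]
employee += [{"uid": u, "PROF_SOURCE_URL": current_source_url(xyz)} for u in "XYZ"]

print([rec["uid"] for rec in records_in_scope(employee, abc)])  # ['A', 'B', 'C']

relaxed = dict(abc, sync_source_url_enforce="false")
print(len(records_in_scope(employee, relaxed)))  # 6
```

With enforcement turned off, all six records are in scope for the Littleton run, which is why X, Y, and Z would be treated as missing from the source.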

If the people A, B, and C move from Littleton to Waltham, the ABC/profiles_tdi.properties would then have the following entry for source_ldap_search_base:

source_ldap_search_base=cn=users,location=waltham,dc=zetabank,dc=com

An update is required because the location value in the PROF_SOURCE_URL column is different from the location value in ABC/profiles_tdi.properties. To correctly update the value in PROF_SOURCE_URL, set sync_source_url_override to true in ABC/profiles_tdi.properties and then run sync_all_dns in the ABC TDI solution directory.

As a safety precaution, the PROF_SOURCE_URL is not updated if it is the only attribute that changes. Also, you should set sync_source_url_override to false after running sync_all_dns.

After the command completes, the EMPLOYEE table contains the following data:

Table 2. Example employee table after running sync_all_dns in the ABC TDI solution directory.
uid PROF_SOURCE_URL
A ldap://ldap.zetabank.com:389/cn=users,location=waltham,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
B ldap://ldap.zetabank.com:389/cn=users,location=waltham,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
C ldap://ldap.zetabank.com:389/cn=users,location=waltham,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
X ldap://ldap.zetabank.com:389/cn=users,location=westford,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
Y ldap://ldap.zetabank.com:389/cn=users,location=westford,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
Z ldap://ldap.zetabank.com:389/cn=users,location=westford,dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))
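
The override behavior used in the Waltham scenario can be sketched as follows. This is an illustrative model rather than the product's logic: the sync_mismatched_record function and the office attribute are hypothetical, and the handling of a record with no hash-field match is an assumption because the documentation does not describe that case.

```python
# Illustrative model only; function and attribute names are hypothetical.
def sync_mismatched_record(record, source_by_hash, current_url, override, hash_field="uid"):
    """Handle a record whose stored source URL differs from the current one.

    Assumes sync_source_url_enforce is true. Returns 'skipped', 'unchanged',
    or 'updated', and modifies record in place when it is updated.
    """
    if not override:
        return "skipped"
    entry = source_by_hash.get(record[hash_field])
    if entry is None:
        # Assumption: without a hash-field match the record stays skipped.
        return "skipped"
    changed = {k: v for k, v in entry.items() if record.get(k) != v}
    if not changed:
        # Safety precaution: the URL is not updated when nothing else changes.
        return "unchanged"
    record.update(changed)
    record["PROF_SOURCE_URL"] = current_url
    return "updated"


suffix = ",dc=zetabank,dc=com?((objectClass=inetOrgPerson)(uid=*))"
old_url = "ldap://ldap.zetabank.com:389/cn=users,location=littleton" + suffix
new_url = "ldap://ldap.zetabank.com:389/cn=users,location=waltham" + suffix

rec_a = {"uid": "A", "office": "Littleton", "PROF_SOURCE_URL": old_url}
rec_b = {"uid": "B", "office": "Waltham", "PROF_SOURCE_URL": old_url}
source = {"A": {"uid": "A", "office": "Waltham"}, "B": {"uid": "B", "office": "Waltham"}}

print(sync_mismatched_record(rec_a, source, new_url, override=False))  # skipped
print(sync_mismatched_record(rec_a, source, new_url, override=True))   # updated
print(sync_mismatched_record(rec_b, source, new_url, override=True))   # unchanged
```

In this model, record B keeps its old source URL because no other attribute differs, which mirrors the safety precaution described earlier.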