Improving stagingprop performance

The stagingprop utility is enhanced to better process large amounts of data.

Improvements made to stagingprop

The stagingprop performance enhancements include the following improvements:
  • Improvements to the record-consolidation logic
  • A reduction in the number of calls that are made to the database to fetch and update metadata and the actual data that is being propagated
With these enhancements, you can experience a shorter propagation time for the same amount of data.
Note: Propagation times vary depending on the characteristics of the data that is being processed.

The STAGLOG table in the WebSphere Commerce database has two new indexes. These indexes enable the improved stagingprop logic to use performance capabilities that are available in the supported database types.

JDBC driver batch functionality for Oracle and DB2 is now used for tables with all column types. Previously, tables that contain BLOB or CLOB columns were excluded from being propagated with JDBC batch.

Propagating large amounts of data

Tuning the -transaction and -batchsize parameters

One key focus of the performance improvements is the handling of large amounts of data. As your business grows, so does the amount of data that must be moved from an authoring or staging environment to a production environment. To support the movement of this data with stagingprop, consider using large transaction sizes and large batch sizes when you propagate data.
transaction
Specifies the record-count for each transaction. This record-count is the number of changes that are propagated before a commit is issued on the production and staging databases. Each transaction can include multiple batches, which are all committed when the specified record-count is reached or exceeded. The improved stagingprop logic relies on this value being large, yet still within the constraints of both databases, to move data faster. If you do not specify a value for this parameter, the change logs are committed as a single transaction.
batchsize
Specifies the number of consolidated records to include in a batch. As with the transaction parameter, set a large value. A large batchsize value allows large amounts of propagation data to be buffered in the JVM before the data is sent to the production database in a single trip. The default value is "100".
It is recommended that you use both the transaction and batchsize parameters when you need to propagate large amounts of data.
Note: When you use both parameters, the record-count that you specify for the transaction parameter can be exceeded depending on the value that you set for the batchsize parameter. For example, if you set the transaction parameter value to be "10000" and the batchsize parameter value to be "3000", the relationship between the parameters works as follows.

With this relationship, the changes that are propagated to the production database are committed every 4 batches. The records are not committed until the fourth batch is full. In this scenario, the number of records that are committed in each transaction can be 12000 (four full batches of 3000 records), even though the value of the transaction parameter is only "10000".
When you are tuning the batchsize and transaction parameters, consider the following recommended settings (a sample invocation follows this list):
  • For optimal performance, ensure that the value of the transaction parameter is a multiple of the value of the batchsize parameter. The recommended ratio of the batchsize parameter value to the transaction parameter value is 1:10. For example, in the preceding scenario, the transaction parameter value is "10000". For the batchsize parameter to use the recommended ratio, its value should be "1000". If the batchsize parameter value is to remain "3000", however, then the transaction parameter value should be "30000".
  • To resolve general performance issues, consider setting larger values for both parameters. The optimal values for IBM internal testing were "100000" for the batchsize parameter and "1000000" for the transaction parameter. However, keep in mind that the larger the batch size, the bigger the heap that is required to hold the SQL statements in memory before a commit occurs. You might need to adjust your JVM heap when you set a large batchsize parameter value.
  • If the stagingprop utility takes a long time to complete a propagation, consider setting the parameters to process less data at a time. If you are propagating too many records at a time, consider running the utility more frequently. Increasing the frequency at which you propagate data can result in fewer records per propagation.
  • To help troubleshoot when the stagingprop utility fails to propagate data, adjust the value of the batchsize parameter. For example, the default value of "100" can make it difficult to find the records that are causing errors. By setting the parameter value to "0", you turn off JDBC batching, which can help you identify the records that are causing an error.
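
For example, a stagingprop invocation that applies the scenario values from this section (a batchsize of "3000" and a transaction size of "30000") might resemble the following command. This is a sketch only: the database names, user ID, and password are placeholders, and the connection parameters that you need can vary by database type, so verify them against the stagingprop reference for your environment.

    ./stagingprop.sh -scope _all_ -sourcedb stagingdb -destdb proddb \
      -destdb_user produser -destdb_passwd mypassword \
      -batchsize 3000 -transaction 30000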

JVM runtime heap

As you increase the constraints on the staging and production databases to improve performance, you might also need to adjust the constraints of the JVM so that stagingprop and the JDBC driver can buffer large amounts of data in memory during the consolidation and propagation phases.

You can increase the maximum amount of runtime heap that can be used by the JVM by specifying the argument and value combination -Xmx<n> on the invocation of stagingprop in either the stagingprop.sh or stagingprop.bat file.

In WC_installdir/bin/stagingprop.sh:
  1. Locate the following line of code:
    ${JAVA_EXE?} ... -Dwclogdir=$WCLOGDIR ...  com.ibm.commerce.staging.StagingProp $* $DEFAULTLOG $DBTYPE
    
  2. Update this line of code to be the following code:
    ${JAVA_EXE?} ... -Dwclogdir=$WCLOGDIR -Xmx3072m ... com.ibm.commerce.staging.StagingProp $* $DEFAULTLOG $DBTYPE

In WC_installdir/bin/stagingprop.bat:

  1. Locate the following line of code:
    "%JAVA_HOME%\bin\java" ... -Dwclogdir=%WCLOGDIR%  ... com.ibm.commerce.staging.StagingProp %args%
    
  2. Update this line of code to be the following code:
    "%JAVA_HOME%\bin\java" ... -Dwclogdir=%WCLOGDIR% -Xmx3072m ... com.ibm.commerce.staging.StagingProp %args%

In the examples above, the maximum amount of runtime heap that can be used by the JVM during invocation of stagingprop is increased to 3 gigabytes.

If you set the -actionOnError parameter, you might need to adjust the size of the stack that is reserved by the JVM during invocation when certain JDBC drivers are being used. This adjustment can be done in a similar way to the runtime heap, but by using the argument and value combination of -Xss<n>. For example, a specification of -Xss1280k adjusts the reserved JVM stack size to 1.25 megabytes.
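For example, in WC_installdir/bin/stagingprop.sh, the -Xss argument can be added to the same line of code that was updated for the runtime heap. The following sketch assumes that the -Xmx3072m value from the earlier example is kept:
    ${JAVA_EXE?} ... -Dwclogdir=$WCLOGDIR -Xmx3072m -Xss1280k ... com.ibm.commerce.staging.StagingProp $* $DEFAULTLOG $DBTYPE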
Note: If you adjust the -Xmx and -Xss JVM arguments, make sure that you do not use values that are too large. Values that are too large divert memory resources that can be used elsewhere.
For reference, in large data tests that were conducted during development, the following JVM values were used:
  • -Xmx3072m
  • -Xss1280k
If you make these adjustments, you can use 100000 as a possible value for the -batchsize parameter, and 1000000 for the -transaction parameter.
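
As a combined sketch, an invocation that pairs those JVM settings with the tested parameter values might resemble the following command. As in the earlier example, the database names and credentials are placeholders only:

    ./stagingprop.sh -scope _all_ -sourcedb stagingdb -destdb proddb \
      -destdb_user produser -destdb_passwd mypassword \
      -batchsize 100000 -transaction 1000000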