Throughput and CPU measurements

Many factors affect system throughput and CPU performance. Measuring them is the first step toward improving performance.

The execution of a single map occurs in the context of a single thread. Therefore, additional processors will not improve the execution time of a single transformation. The Command Server executes a simple map in this fashion. Execution time measurements of a single transformation performed with the Command Server will only account for the computational power of a single processor.

In gauging performance, it is important to choose a measure of performance appropriate to the architecture of the mapping process. The performance measurement should agree with the performance goal for the system as a whole. If the system includes a number of maps running in parallel on multi-processor hardware, transactional throughput is probably more important than the execution time of any single transformation.

For example, a given map might execute in 10 seconds. On a 4-processor box, 4 simultaneous executions of that same map might all complete within 11 seconds. The transactional throughput is nearly four times that of the single execution scenario (4 maps in 11 seconds versus 1 map in 10 seconds), but no individual execution became faster: each of the 4 simultaneous executions still took at least 10 seconds.
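The arithmetic behind this example can be checked with a short sketch. The function name throughput is introduced here for illustration only; the figures are the ones given above.

```python
def throughput(completed_maps: int, elapsed_seconds: float) -> float:
    """Maps completed per second of wall-clock time."""
    return completed_maps / elapsed_seconds


single = throughput(1, 10.0)    # one map finishing in 10 seconds
parallel = throughput(4, 11.0)  # four simultaneous maps finishing in 11 seconds
gain = parallel / single

print(f"single execution:    {single:.3f} maps/s")
print(f"4 simultaneous runs: {parallel:.3f} maps/s")
print(f"throughput gain:     {gain:.2f}x")
```

The gain works out to 40/11, or roughly 3.6 times the single-execution throughput, which is why the text describes it as nearly (not exactly) four times.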

Conversely, if the process involves a batch of data with a single map, map execution time might be the most appropriate primary measurement. Execution time is also quite appropriate when running a series of maps. In such cases, all of the work occurs in a single thread as previously mentioned.
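Both measurements can be collected with the same small harness. The following is a minimal sketch, not a product utility: run_map is a hypothetical callable that performs one map execution (for example, by launching whatever command is used to run the map and waiting for it to finish), and stand_in_map is a placeholder workload used only so that the sketch runs on its own. Threads are used on the assumption that each call spends its time waiting on an external process.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(run_map):
    """Return the wall-clock seconds taken by one call to run_map()."""
    start = time.perf_counter()
    run_map()
    return time.perf_counter() - start


def measure(run_map, concurrency):
    """Start `concurrency` executions at once and report both measurements."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        durations = list(
            pool.map(lambda _: timed_run(run_map), range(concurrency))
        )
    elapsed = time.perf_counter() - start
    return {
        "execution_times": durations,         # per-execution time, in seconds
        "elapsed": elapsed,                   # wall-clock time for the batch
        "throughput": concurrency / elapsed,  # executions per second
    }


def stand_in_map():
    # Placeholder workload (an assumption); replace with the real map invocation.
    time.sleep(0.05)


result = measure(stand_in_map, concurrency=4)
print(f"slowest execution: {max(result['execution_times']):.3f} s")
print(f"throughput:        {result['throughput']:.1f} executions/s")
```

With concurrency set to 1, the harness reports plain execution time. With higher values, the throughput figure becomes the one to watch.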

Assuming that CPU processing power is the constraining factor in both cases, improving throughput and improving execution time require different approaches. Adding processors of the same type will not improve map execution time, because each map still runs on a single processor. Improving map execution time requires more computationally powerful (faster) processors. Conversely, adding processors of the same power can improve throughput, because more maps can execute simultaneously.
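The distinction can be illustrated with an idealized model. This sketch assumes that maps are purely CPU-bound, that each map occupies exactly one processor, and that there is no contention or scheduling overhead. The work_units and speed parameters are illustrative abstractions, not product settings.

```python
import math


def execution_time(work_units: float, speed: float) -> float:
    """Seconds for one map on one processor; speed is work units per second."""
    return work_units / speed


def batch_time(n_maps: int, work_units: float, speed: float, processors: int) -> float:
    """Seconds to finish n_maps identical maps, one map per processor at a time."""
    waves = math.ceil(n_maps / processors)
    return waves * execution_time(work_units, speed)


def throughput(n_maps: int, work_units: float, speed: float, processors: int) -> float:
    """Maps completed per second for the whole batch."""
    return n_maps / batch_time(n_maps, work_units, speed, processors)


WORK = 10.0  # one map costs 10 work units (assumed)
MAPS = 8     # number of maps in the batch (assumed)

configs = [
    ("baseline: 1 processor, speed 1", 1.0, 1),
    ("4 processors, same speed", 1.0, 4),
    ("1 processor, twice as fast", 2.0, 1),
]
for label, speed, processors in configs:
    t = execution_time(WORK, speed)
    tp = throughput(MAPS, WORK, speed, processors)
    print(f"{label}: execution time {t:.1f} s, throughput {tp:.2f} maps/s")
```

In this model, adding processors leaves the execution time at 10 seconds while raising throughput from 0.10 to 0.40 maps per second. A faster processor halves the execution time and also raises throughput, but only to 0.20 maps per second.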

This same principle applies when comparing map performance across hardware platforms. A hardware configuration might have more processors available, but if an individual processor lacks computational power, the execution time of a single map on that processor will suffer. However, even with weaker processors, this same configuration might provide better throughput. Computational power varies considerably among processors, and clock frequency is only a loose predictor of relative computational strength.