Workload analysis

Workload analysis is a process of designing an effective data mart.

You can do the workload analysis manually.

To get the best performance from accelerated queries, you need to have an optimal data mart that includes all the necessary columns and tables, but excludes any columns and tables that are not used in your warehousing queries. The data mart must also specify the necessary joins between the tables as they are used in the queries. If your data warehouse has a complex database schema and a workload that includes reports that might run hundreds of different queries, it is challenging to determine the optimal data mart definition. Workload analysis gathers information about your warehousing queries as you run them, known as query probing, and analyzes the resulting data to determine which queries are good candidates for acceleration. The columns, tables, and joins used by the queries that can be accelerated are included in the data mart definition.

Workload analysis needs to determine the fact table of a query. If parallel database query (PDQ) is active, you can specify the fact table with the FACT optimizer directive. If the FACT optimizer directive is not set, and for inner join queries, the fact table is identified as the table with the most number of rows.

By default, information from all queries that you run during query probing is included in the data mart definition. You can customize the data mart definition by including specific queries in the definition. If you turn SQL tracing on, you can identify the probing data that resulted from a specific SQL statement by its statement ID.

If you do workload analysis manually, you can identify specific statements, for example statements that took a certain length of time to process, or statements that accessed specific tables. With that information, you can include the probing data that resulted from only these statements in the mart definition. To view SQL tracing information, run the onstat -g his command or query the syssqltrace table in the sysmaster database. For example, in DB-Access, run this SQL statement:

SELECT sql_runtime, sql_statement FROM syssqltrace
WHERE sql_stmtname matches "SELECT" ORDER BY sql_runtime DESC

By default, query probing is based on the running time of each query. To run the query probing more quickly, issue the SET EXPLAIN ON AVOID_EXECUTE statement before you run your query workload. When you issue this statement, the queries are optimized and the probing data is collected, but a result set is not determined or returned. However, if you avoid running the queries, you do not know exactly how long it takes to run the queries.

The query probing data is stored in memory. You can view the query probing data by running the onstat -g probe command or by querying the system monitoring interface.