Example of creating a fragmented virtual table

This example shows how to create a fragmented virtual table that is based on a table that contains time series data and that is fragmented by expression.

Prerequisites

Before you run the statements in this example, create three dbspaces named dbs1, dbs2, and dbs3. Otherwise, substitute your dbspace names in the example.

About this example

When you create the stores_demo database, all the setup tasks for creating and loading a time series table are complete. The ts_data table in the stores_demo database contains time series data. The ts_data_v table is a virtual table that is based on the ts_data table. The ts_data table and the ts_data_v virtual table are not fragmented.

In this example, you alter the ts_data table to fragment it by expression into three dbspaces. You re-create the ts_data_v virtual table as a fragmented virtual table. Then, you run queries in parallel against the ts_data_v virtual table.

Preparing for parallel queries

Run the SQL statements in these steps from an SQL editor, such as DB-Access .

To prepare for running parallel queries on the ts_data_v virtual table:

  1. If necessary, create the stores_demo database by running the following command:
    dbaccessdemo
  2. If necessary, set the PDQPRIORITY environment variable to a value other than OFF to enable parallel database queries. For example, run the following SQL statement:
    SET PDQPRIORITY 100
  3. Specify an explain output file and enable explain output so that you can determine whether your queries run in parallel by running the following SQL statements:
    SET EXPLAIN FILE TO 'c:/test/parallelvtq.out';
    SET EXPLAIN ON;

    You can specify a different directory and file name for the explain output file.

  4. Drop the existing virtual table on time series data, ts_data_v, in the stores_demo database by running the following SQL statement:
    DROP TABLE ts_data_v;
  5. Alter the ts_data table to fragment it by expression by running the following SQL statement:
    ALTER FRAGMENT ON TABLE ts_data init 
           FRAGMENT BY EXPRESSION 
               PARTITION part1 (loc_esi_id <= "4727354321355594") in dbs1, 
               PARTITION part2 (loc_esi_id <= "4727354321510846") in dbs2, 
               REMAINDER IN dbs3;
  6. Create the fragmented virtual table, ts_data_v, by running the following SQL statement:
    EXECUTE PROCEDURE TSCreateVirtualTab(
       'ts_data_v', 
       'ts_data', 
       'origin(2010-11-10 00:00:00.00000),calendar(cal15min),
        container(raw_container),threshold(0),regular', 
       'fragment', 
       'raw_reads'
      );

    The difference between this definition of the ts_data_v virtual table and the original definition is the value of the fourth parameter, the TSVTMode parameter. In this definition, the TSVTMode parameter is set to fragment to fragment the virtual table. In the original definition of the ts_data_v virtual table, the TSVTMode parameter is set to 0, which indicates the default behavior.

  7. Set the isolation level to DIRTY READ by running the following SQL statement:
    SET ISOLATION TO DIRTY READ;

    The DIRTY READ isolation level ensures that a parallel query on a virtual table succeeds if the data in the base table is being modified at the same time.

Run parallel queries

The following SQL statement selects the number of values, the sum of the values, and the average of the values for a 15-minute period on February 7, 2011 for each customer who lives in the state of Arizona:

SELECT state,
        COUNT(value) num_values,
        SUM(value) sum,
        AVG(value) average
FROM ts_data_v v, customer_ts_data l, customer c
WHERE
        v.loc_esi_id=l.loc_esi_id AND l.customer_num=c.customer_num
	AND state = "AZ"
        AND v.tstamp BETWEEN '2011-02-07 23:30:00.00000' 
                         AND '2011-02-07 23:45:00.00000'
GROUP BY state, value
ORDER BY state, value;

The result of the query displays the information for the three qualifying customers:

state  (count)  (sum)   (avg)
AZ        1     0.011   0.011
AZ        1     0.012   0.012
AZ        2     0.050   0.025

The explain output file, parallelvtq.out, describes the query plan for this query. The information (Parallel, fragments: ALL) for the second and third scans indicates that the scans ran in parallel and accessed all fragments:

...

Estimated Cost: 14
Estimated # of Rows Returned: 1
Maximum Threads: 5
Temporary Files Required For: Order By  Group By 

  1) informix.c: SEQUENTIAL SCAN

        Filters: informix.c.state = 'AZ' 

  2) informix.l: INDEX PATH

    (1) Index Keys: customer_num   (Parallel, fragments: ALL)
        Lower Index Filter: informix.l.customer_num = informix.c.customer_num 
NESTED LOOP JOIN

  3) informix.v: VTI SCAN  (Parallel, fragments: ALL)

        VTI Filters: (informix.lessthanorequal(informix.v.tstamp,datetime
             (2011-02-07 23:45:00.00000) year to fraction(5) ) AND 
             informix.greaterthanorequal(informix.v.tstamp,datetime
             (2011-02-07 23:30:00.00000) year to fraction(5) ))

        Filters: informix.v.loc_esi_id = informix.l.loc_esi_id 
NESTED LOOP JOIN

The following SQL statement selects the number of values, the sum of the values, and the average of the values after 11:30 PM on February 7, 2011 for each customer:

SELECT state,
        COUNT(value) num_values,
        SUM(value) sum,
        AVG(value) average
FROM ts_data_v v, customer_ts_data l, customer c
WHERE
        v.loc_esi_id=l.loc_esi_id
        AND l.customer_num=c.customer_num
        AND greaterthan(v.tstamp,'2011-02-07 23:30:00.00000')
GROUP BY state, value
ORDER BY state, value;

The result of the query displays the information for all the qualifying customers:

state  (count)  (sum)   (avg)
AZ       1     0.011     0.011
AZ       1     0.025     0.025
CA       1     0.020     0.02
CA       1     0.023     0.023
CA       1     0.056     0.056
CA       1     0.065     0.065
CA       1     0.071     0.071
CA       1     0.073     0.073
CA       1     0.088     0.088
CA       1     0.162     0.162
CA       1     0.204     0.204
CA       1     0.226     0.226
CA       1     0.246     0.246
CA       1     0.277     0.277
CA       1     0.323     0.323
CA       1     0.340     0.34
CA       1     0.415     0.415
CA       1     0.469     0.469
CA       1     0.670     0.67
CA       1     1.412     1.412
CO       1     0.118     0.118
DE       1     0.256     0.256
FL       1     3.470     3.47
MA       1     4.388     4.388
NJ       2     0.374     0.187
NY       1     0.239     0.239
OK       1     0.086     0.086

The GROUP BY and ORDER BY clauses order the results by state.

The explain output for this query shows that the query ran in parallel.

...

Estimated Cost: 63
Estimated # of Rows Returned: 1
Maximum Threads: 5
Temporary Files Required For: Order By  Group By 

  1) informix.l: SEQUENTIAL SCAN

  2) informix.v: VTI SCAN  (Parallel, fragments: ALL)

        VTI Filters: informix.greaterthan(informix.v.tstamp,datetime
         (2011-02-07 23:30:00.00000) year to fraction(5) )

        Filters: informix.v.loc_esi_id = informix.l.loc_esi_id 
NESTED LOOP JOIN

  3) informix.c: INDEX PATH

    (1) Index Keys: customer_num   (Parallel, fragments: ALL)
        Lower Index Filter: informix.l.customer_num = informix.c.customer_num 
NESTED LOOP JOIN