Sharded queries

You can query a sharded table as if it is a single table on one database server. However, restrictions for distributed queries between database servers and restrictions specific to sharded queries apply.

When you run a sharded query, do not include server name qualifications for remote servers.

If the SHARD_ID configuration parameter is set to unique values on each shard server in the shard cluster, sharded queries are run in parallel on each shard server.

If you set the USE_SHARDING session environment option, insert, update, and delete operations on shard tables use the two-phase commit protocol. Otherwise, sharded insert, update, and delete operations follow the eventually consistency model where data is moved to the appropriate shard server after the transaction is committed.

If your shard servers have high-availability secondary servers, you can run sharded queries from the secondary servers.

Data types

A sharded query can return the following data types: non-opaque atomic built-in data types, LVARCHAR, Boolean, BSON, and JSON. Sharded queries cannot return distinct data types.

To run sharded queries on time series data in a TimeSeries data type, shard a virtual table that is based on the time series table.

Restrictions

You cannot include the following SQL syntax elements in a query that includes a sharded table:

  • DataBlade API routines
  • Java user-defined routines
  • Triggers
  • A FOR UPDATE clause in a SELECT statement

You cannot run an EXECUTE FUNCTION or EXECUTE PROCEDURE statement for a routine to operate on a sharded table.

You cannot run a statement that contains an update to a shard key that requires the row to move to another shard server. To update the shard key of a row, delete the row and then insert it with the new values.

You cannot shard data in an XA environment.

Sharded Table Joins

HCL Informix® supports joining between multiple sharded tables with parallel execution. However, such parallel joins between two shard tables are allowed ONLY when the following conditions are met:

  • both the tables must have the joining column as key
  • both the tables must have exactly the same strategy defined on the key
  • both the tables must have exact partitioning conditions defined using the key
  • both the tables must have same set of the participating nodes
  • only equi-joins are allowed in case of other non-expression strategies

All the table filters on the sharded tables will be pushed to their respective participants

When the shard-join is rejected due to any restriction, Informix server will attempt a fall back mechanism.

Shard join fallback is enabled using the following command:

SET ENVIRONMENT SHARDJOIN_FALLBACK ON. For more information, see SHARDJOIN_FALLBACK session environment option.

Performance tips

You can improve the speed of sharded queries by customizing how shared memory for sharded queries is allocated. You can control shared memory allocation by setting the SHARD_MEM configuration parameter on each shard server.

If your sharded queries frequently include joins to another table, replicate that table to all the shard servers to improve query performance.

If your sharded queries included stored routines as a filter, define the routines on all the shard servers. Queries run faster when the data is filtered on each shard server before being returned.

If the SHARD_ID configuration parameter is set on all shard servers, the shard servers use server multiplexer group (SMX) connections. You can reduce latency between shard servers by increasing the number of pipes that are used for the SMX connections. Set the SMX_NUMPIPES configuration parameter to the number of pipes.