Defining a sharding schema with an expression

The MongoDB shell db.runCommand command with shardCollection command syntax creates a definition for distributing data across the database servers of a shard cluster.

Procedure

To create a shard-cluster definition that uses an expression for distributing data across database servers:
  1. Run the mongo command.
    The command starts the MongoDB shell.
  2. Run the db.runCommand from the MongoDB shell, with shardCollection command syntax.
    The shardCollection command syntax for using an expression is shown in the following diagram:
    
    1 db.runCommand ({"shardCollection":"database.
    2.1 collection
    2.1 table
    1", key:{
    2.1 column
    2.1 field
    1:1},expressions:{ + , " ER_group_name ": expression "  " ER_group_name ":" remainder " })
    Element Description Restrictions
    collection The name of the collection that is distributed across database servers. The collection must exist.
    column The shard key that is used to distribute data across the database servers of a shard cluster. The column must exist.

    Composite shard keys are not supported.

    database The name of the database that contains the collection that is distributed across database servers. The database must exist.
    ER_group_name The Enterprise Replication group name of a database server that receives copied data.

    The default Enterprise Replication group name for a database server is the database server's name prepended with g_. For example, the default Enterprise Replication group name for a database server that is named myserver is g_myserver.

    None.
    expression The expression that is used to select documents by shard key value. None.
    field The shard key that is used to distribute data across the database servers of a shard cluster. The field must exist.

    Composite shard keys are not supported.

    remainder Specifies a database server that receives documents with shard key values that are not selected by expressions. The remainder expression is required.
    table The name of the table that is distributed across database servers. The table must exist.
  3. For optimal query performance, connect to the wire listener and run the MongoDB ensureIndex command on the shard key of each of a cluster's shard servers. The ensureIndex command ensures that an index is created for the collection or table on the shard server.

Results

The name of a shard-cluster definition that is created by a shardCollection command that is run through the wire listener is:

1 sh_database_
2.1 collection
2.1 table

Examples

Define a shard cluster that uses an expression to distribute data across multiple database servers
The following command defines a shard cluster that uses an expression on the field value state for distributing collection1 across multiple database servers.
> db.runCommand({"shardCollection":"database1.collection1",
   key:{state:1},expressions:{"g_shard_server_1":"in ('KS','MO')",
   "g_shard_server_2":"in ('CA','WA')","g_shard_server_3":"remainder"}})
The name of the created shard-cluster definition is sh_database1_collection1.
  • Inserted documents with KS and MO values in the state field are sent to g_shard_server_1.
  • Inserted documents with CA and WA values in the state field are sent to g_shard_server_2.
  • All inserted documents that do not have KS, MO, CA, or WA values in the state field are sent to g_shard_server_3.
Define a shard cluster that uses an expression to distribute data across multiple database servers
The following command defines a shard cluster that uses an expression on the column value animal for distributing table2 across multiple database servers.
> db.runCommand({"shardCollection":"database1.table2",
   key:{animal:1},expressions:{"g_shard_server_1":"in ('dog','coyote')",
   "g_shard_server_2":"in ('cat')","g_shard_server_3":"in ('rat')",
   "g_shard_server_4":"remainder"}})
The name of the created shard-cluster definition is sh_database2_table2.
  • Inserted rows with dog or coyote values in the animal column are sent to g_shard_server_1.
  • Inserted rows with cat values in the animal column are sent to g_shard_server_2.
  • Inserted rows with rat data values in the animal column are sent to g_shard_server_3.
  • All inserted rows that do not have dog, coyote, cat, or rat values in the animal column are sent to g_shard_server_4.
Define a shard cluster that uses an expression to distribute collections across multiple database servers
The following command defines a shard cluster that uses an expression on the field value year for distributing collection3 across multiple database servers.
> db.runCommand({"shardCollection":"database1.collection3",
   key:{year:1},expressions:{"g_shard_server_1":"between 1980 and 1989",
   "g_shard_server_2":"between 1990 and 1999",
   "g_shard_server_3":"between 2000 and 2009",
   "g_shard_server_4":"remainder"}})
The name of the created shard-cluster definition is sh_database3_collection3.
  • Inserted documents with values of 1980 to 1989 in the year field are sent to g_shard_server_1.
  • Inserted documents with values of 1990 to 1999 in the year field are sent to g_shard_server_2.
  • Inserted documents with values of 1980 to 1989 in the year field are sent to g_shard_server_3.
  • Inserted documents with values below 1980 or above 2009 in the year field are sent to g_shard_server_4.