Clustering spatial data on the disk

About this task

If the rows of a table with an R-tree index are clustered on disk the same way as the corresponding entries in the R-tree index that indexes the column, the performance of the retrieval of the data is improved. This section describes how you can cluster existing spatial data on the disk to reflect the ordering in the R-tree index.
Important: Because the following procedure requires that the data in the original table be temporarily deleted, make a backup copy of the table either by loading all the rows into a new table or by taking a full backup of the database.

To cluster existing spatial data on the disk to reflect the ordering in an R-tree index:

Procedure

  1. Create a new table that is exactly the same as the original table and insert all rows from the original table into the new table.
    For example, if the original table is called circle_tab, the following SQL statements create an exact copy called circle_tab_temp and insert all rows from the circle_tab table into the circle_tab_temp table:
    CREATE TABLE circle_tab_temp
    (
        id        INTEGER,
        circles   MyCircle
    );    
    
    INSERT INTO circle_tab_temp
    SELECT * FROM circle_tab;
  2. Create an R-tree index on the relevant column of the new table.
  3. Update statistics on the new table.
  4. Drop the R-tree index on the original table and delete all rows.
  5. Insert all rows from the new table back into the original table with a SELECT statement that returns all rows in the new table and uses the R-tree index at the same time.
    Be careful that you design this SELECT statement carefully so it satisfies both restrictions.

    You might consider using the Overlap strategy function in your query, passing as the second parameter the coordinates of the entire space in which the spatial objects in the table exist. Because each spatial object obviously overlaps with the entire possible space, the query returns every row in the table. In addition, because the Overlap strategy function is specified in the WHERE clause of the query, the query must use the R-tree index.

    For example, assume all the spatial objects in the table exist within a box defined by the coordinates (-1000,-1000,1000,1000). In this case, the query might look like the following example:
    INSERT INTO circle_tab
    SELECT * FROM circle_tab_temp 
    WHERE Overlap(circles, 'box(-1000,-1000,1000,1000)::MyBox');
  6. Create a new R-tree index on the appropriate column of the original table.
  7. Drop the new table.

What to do next

If your original table is fragmented, be sure to use the same fragmentation scheme throughout the procedure. In other words, fragment the new table and its index the same way the original table and index are fragmented and make sure that the data is re-inserted into the correct fragment of the original table.

Subsequent updates will gradually degrade the clustering of data achieved with this procedure.