Effectiveness of bounding box representation

The characteristics of the data stored in an R-tree indexed column can affect the performance of queries that search the data. The higher the selectivity of the data, the faster the queries execute. Although you might not have any control over what your data looks like, it is useful to know how it can affect queries.

The selectivity of data indexed with the R-tree access method is affected by two characteristics of the data: how much the bounding boxes of the objects overlap, and the relative sizes of objects that are close together. The more the bounding boxes overlap, the lower the selectivity of the data. Grouping many small bounding boxes close to one large bounding box lowers the selectivity of the small bounding boxes.
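The effect of overlap can be illustrated with a minimal sketch. It assumes axis-aligned two-dimensional bounding boxes and measures selectivity informally, as the fraction of boxes that a bounding-box test cannot eliminate for a given search area. The `Box` type, the helper functions, and both data sets are hypothetical and are not part of the R-tree access method.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Axis-aligned bounding box (hypothetical type for this sketch)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Box, b: Box) -> bool:
    """Return True if the two boxes share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def candidate_fraction(query: Box, boxes: Sequence[Box]) -> float:
    """Fraction of boxes that the bounding-box test cannot eliminate."""
    hits = sum(1 for b in boxes if intersects(query, b))
    return hits / len(boxes)


# Invented data: 100 small, well-separated boxes on a grid (little overlap).
compact = [Box(x, y, x + 1, y + 1)
           for x in range(0, 100, 10)
           for y in range(0, 100, 10)]

# Invented data: 10 large boxes that each span most of the same extent.
overlapping = [Box(i, 0, 91 + i, 100) for i in range(10)]

# A small search area.
query = Box(40, 40, 42, 42)

compact_fraction = candidate_fraction(query, compact)
overlapping_fraction = candidate_fraction(query, overlapping)
print(compact_fraction, overlapping_fraction)
```

In this sketch, the small search area intersects only one of the well-separated boxes but every one of the overlapping boxes, so the bounding-box test eliminates almost nothing in the second data set.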

An example of data that has high selectivity is the set of lakes on a map. Although the lakes might be oddly shaped, they are compact and well represented by bounding boxes. A search of a small area therefore does not encounter the bounding boxes of faraway lakes.

An example of data that has low selectivity is satellite ground tracks. Over time, the tracks cover most of the earth, so the bounding box of one satellite's tracks greatly overlaps the bounding boxes of other satellites' tracks. Checking for bounding boxes that overlap a particular place on earth does not eliminate many satellites, unless time can also be used for finer resolution. Airline routes behave similarly.
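One possible reading of "finer resolution" is to treat time as an extra dimension alongside longitude and latitude. The following sketch makes that assumption. The `Track` type, the helper functions, and the three track records are invented for illustration and do not describe real satellites or any particular data type.

```python
from typing import List, NamedTuple, Sequence


class Track(NamedTuple):
    """Ground-track record: spatial bounding box plus a time interval (hypothetical)."""
    name: str
    lon_min: float
    lat_min: float
    lon_max: float
    lat_max: float
    t_start: float  # hours since an arbitrary epoch
    t_end: float


def covers_place(t: Track, lon: float, lat: float) -> bool:
    """Spatial test only: is the place inside the track's bounding box?"""
    return t.lon_min <= lon <= t.lon_max and t.lat_min <= lat <= t.lat_max


def covers_place_at(t: Track, lon: float, lat: float, when: float) -> bool:
    """Spatial test plus time test, treating time as a third dimension."""
    return covers_place(t, lon, lat) and t.t_start <= when <= t.t_end


def names(tracks: Sequence[Track]) -> List[str]:
    return [t.name for t in tracks]


# Invented records: each bounding box covers most of the globe.
tracks = [
    Track("sat-A", -180, -60, 180, 60, 0, 2),
    Track("sat-B", -180, -80, 180, 80, 2.5, 4),
    Track("sat-C", -170, -50, 170, 50, 4.5, 6),
]

lon, lat, when = 10.0, 45.0, 3.0

# The spatial test alone eliminates nothing.
spatial_only = names([t for t in tracks if covers_place(t, lon, lat)])

# Adding the time interval eliminates all but one record.
with_time = names([t for t in tracks if covers_place_at(t, lon, lat, when)])

print(spatial_only, with_time)
```

With these invented records, the spatial test keeps all three tracks, while the combined test keeps only the track whose time interval contains the requested time.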