Select a data type

Every column in a table must have a data type. The choice of data type is important for the following reasons:
  • It establishes the set of valid data items that the column can store.
  • It determines the kinds of operations that you can perform on the data.

    For example, you cannot apply aggregate functions, such as SUM, to columns that are defined on a character data type.

  • It determines how much space each data item occupies on disk.

    The space required to accommodate data items is not as important for small tables as it is for tables with hundreds of thousands of rows. When a table reaches that many rows, the difference between a 4-byte and an 8-byte data type can be crucial.

The following figure shows a decision tree that summarizes the choices among built-in data types. The choices are explained in the following sections.

Figure 1: Select a data type

If your data is boolean (true or false) then use BOOLEAN. If your data is made up only of integers then the type to use depends on the range of values required. If all of the numbers are between negative two to the fifteenth minus one and two to the fifteenth minus one then use SMALLINT. If some of the values are outside that range but all are in the range of negative two to the thirty-first minus one and two to the thirty-first minus one then use INTEGER. If some of the values are outside that range but all are within the range of negative two to the sixty fourth minus one and two to the sixty fourth minus one then use INT8. If some of the values are outside that range then use DECIMAL(p,0). If not all of the values are integers and the number of fractional digits is fixed then use DECIMAL(p,s). If not all of the values are integers and the number of fractional digits is variable then use SMALLFLOAT, FLOAT, or DECIMAL(p) depending on the largest number of significant digits. If there are never more than 8 significant digits then use SMALLFLOAT. If there are sometimes more than 8 but never more than 16 significant digits use FLOAT. If there are ever more than 16 significant digits then use DECIMAL(p). If your data is chronological and the data represents a span of time rather than a specific point in time use INTERVAL If your data is chronological and represents a point in time then use either DATE or DATETIME. Use DATE if the data is precise to the day; otherwise use DATETIME. If your data contains non-English-language characters, use either NCHAR(n) or NVARCHAR(m,r). Use NCHAR(n) if there is no or little variance in item lengths; otherwise use NVARCHAR(m,r). If your data is made up of ASCII characters and there is little or no variance in the lengths of the items then use either CHAR(n), CLOB, or TEXT. Use CHAR(n) if none of the items are over 32 766 bytes in length; otherwise use CLOB if you require random access to the data (reading or writing to any portion of it) or TEXT if you do not. If your data is made up of ASCII characters but there is significant variance in the sizes of the items then use LVARCHAR if any of the lengths will exceed 255 bytes; otherwise use VARCHAR(m,r) or its synonym CHARACTER VARYING(m,r). If some part of the data is not ASCII then use BLOB if you must have random access to the data; otherwise use BYTE.

Figure 2: Select a data type (continued)

begin figure description - This figure is described in the surrounding text. - end figure description