Avoid modification of global and static variables

To be threadsafe, a well-behaved C UDR must not modify global or static variables and must not depend on their addresses.

Global and static variables are stored in the address space of a virtual processor (VP), in the data segment of a shared-object file. These variables belong to the address space of the VP, not of the thread itself. Modifying global or static variables, or taking pointers to them, is not safe across VP migration boundaries.

When an SQL statement contains a C UDR, the routine manager loads the shared-object file that contains the UDR object code into each VP. Therefore, each VP receives its own copy of the data and text segments of a shared-object file and all VPs have the same initial data in their shared-object data segments. The following figure shows a schematic representation of a virtual processor and indicates the location of global and static variables.
Figure 1: Location of global and static variables in a VP

As the preceding figure shows, global and static variables are not stored in database server shared memory, but in the data and text segments of a VP. These segments in one VP are not visible after a thread migrates to another VP. Therefore, if a C UDR modifies global or static data in the data segment of one VP, the modified data is not available if the thread migrates.

The following code fragment shows an implementation of a C UDR named bad_rowcount() that creates an incremented row count for the results of a query.
/* bad_rowcount()
 *     Increments a counter for each row in a query result.
 *     This is the WRONG WAY to implement the function
 *     because it updates a static variable.
 */
mi_integer
bad_rowcount(MI_FPARAM *Gen_fparam)
{
   static mi_integer bad_count = 0;
   bad_count++;
   return bad_count;
}
Suppose the following SELECT statement executes:
SELECT bad_rowcount(), customer_id FROM customer;
The CPU VP that is processing this query (for example, CPU-VP 1) executes the bad_rowcount() function. The bad_rowcount() function is not well-behaved because it uses a static variable to hold the row count. Use of this static bad_count variable creates the following problems:
  • The updated bad_count value is not visible when the thread migrates to another VP.

    When bad_rowcount() increments the bad_count variable to 1, it updates the static variable in the shared-object data segment of CPU-VP 1. If the thread now migrates to a different CPU VP (for example, CPU-VP 2), this incremented value of bad_count is not available to the bad_rowcount() function. The next invocation of bad_rowcount() finds the initial value of zero, instead of 1.

  • Concurrent invocations of the bad_rowcount() function interfere with one another.

    For example, suppose CPU-VP 1 and CPU-VP 2 are processing session threads for three client applications, each of which executes the bad_rowcount() function. Only two copies of the bad_count static variable are now being incremented on behalf of three client applications, so a count that is returned to one application can include rows that belong to another.
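The first problem can be reproduced with a small model in plain C. This is only a sketch under stated assumptions: vp_data_segment, bad_rowcount_on(), and run_query() are hypothetical names that stand in for a VP's private copy of the data segment and for a thread that migrates partway through a query. Nothing here calls the DataBlade API.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL model of one VP's private copy of the shared-object
 * data segment. Each VP starts with the same initial value (zero). */
typedef struct {
    int bad_count;              /* this VP's copy of the static variable */
} vp_data_segment;

/* Models bad_rowcount(): the routine can see only the copy of the
 * static variable that belongs to the VP it is currently running on. */
int bad_rowcount_on(vp_data_segment *vp)
{
    assert(vp != NULL);
    vp->bad_count++;
    return vp->bad_count;
}

/* Models one query that returns nrows rows. The thread runs on vp1 for
 * the first migrate_after rows and on vp2 for the remaining rows.
 * The value returned for each row is written to out[]. */
void run_query(vp_data_segment *vp1, vp_data_segment *vp2,
               int migrate_after, int nrows, int *out)
{
    assert(out != NULL && nrows >= 0);
    for (int i = 0; i < nrows; i++) {
        vp_data_segment *current = (i < migrate_after) ? vp1 : vp2;
        out[i] = bad_rowcount_on(current);
    }
}
```

In this model, a four-row query that migrates after the second row sees the counter start over on the second VP instead of continuing the sequence, which is the behavior described in the first item above.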

A well-behaved C UDR can avoid use of global and static data with the following workarounds.
Workaround: Use only local (stack) variables and user memory (which the DataBlade® API memory-management functions allocate).
Description: Both of these types of memory remain accessible when a thread migrates to another VP:
  • Because the stack is maintained as part of the thread, reads and writes of local variables are maintained when the thread migrates among VPs. Write reentrant code that keeps variables on the stack.
  • User memory is in database server shared memory and therefore is accessible by all VPs.

For more information, see Manage user memory.

Workaround: Use a function-parameter structure, named MI_FPARAM, to track private state information for a C UDR.
Description: The MI_FPARAM structure is available to all invocations of a UDR within a routine sequence. The code example in The MI_FPARAM structure for holding private-state information shows the implementation of the rowcount() function, which uses the MI_FPARAM structure to correctly implement the row counter that bad_rowcount() attempts to implement. For more information, see Saving a user state.
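The shape of the MI_FPARAM workaround can be sketched in plain C. This is an illustration of the pattern only, not the rowcount() implementation that the referenced section provides. The names demo_fparam, demo_get_state(), demo_set_state(), demo_rowcount(), and demo_release() are hypothetical stand-ins so that the fragment compiles without the DataBlade API headers. In a real UDR, the state pointer would be kept in the MI_FPARAM structure (the DataBlade API provides mi_fp_funcstate() and mi_fp_setfuncstate() for this purpose), and the counter would be allocated from user memory, for example with mi_dalloc(), rather than with malloc().

```c
#include <assert.h>
#include <stdlib.h>

/* HYPOTHETICAL stand-in for MI_FPARAM. It models only the user-state
 * pointer that travels with the routine sequence, not with a VP. */
typedef struct {
    void *user_state;
} demo_fparam;

/* HYPOTHETICAL accessors that play the role of the user-state
 * accessor functions of the real structure. */
void *demo_get_state(demo_fparam *fp)              { return fp->user_state; }
void  demo_set_state(demo_fparam *fp, void *state) { fp->user_state = state; }

/* Row counter that keeps its state with the caller-supplied structure
 * instead of in a static variable. Returns -1 if allocation fails. */
int demo_rowcount(demo_fparam *fp)
{
    assert(fp != NULL);

    int *count = demo_get_state(fp);
    if (count == NULL) {
        /* First invocation in this sequence: allocate and initialize.
         * A real UDR would allocate user memory here, not call malloc(). */
        count = malloc(sizeof *count);
        if (count == NULL)
            return -1;
        *count = 0;
        demo_set_state(fp, count);
    }
    (*count)++;
    return *count;
}

/* Frees the counter. In this sketch the caller does it explicitly. */
void demo_release(demo_fparam *fp)
{
    free(fp->user_state);
    fp->user_state = NULL;
}
```

Because each query carries its own structure, two queries that run at the same time keep separate counters, and a counter does not reset when the thread that runs the query moves to another VP.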
If necessary, you can use read-only static or global variables because the values of these variables remain the same in each CPU VP. Keep in mind, however, that addresses of global and static variables as well as addresses of functions are not stable when the UDR migrates across VPs.
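One way to respect the address restriction is to save a position in a read-only table rather than a pointer into it. The following sketch assumes a hypothetical unit_names table and two hypothetical helper functions, unit_index() and unit_name(). It is plain C and does not use the DataBlade API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read-only data: every VP's copy of this table holds the same values,
 * although the table can be located at a different address in each VP. */
static const char *const unit_names[] = { "mm", "cm", "m", "km" };
#define UNIT_COUNT (sizeof unit_names / sizeof unit_names[0])

/* Returns the position of name in the table, or -1 if it is not found.
 * A position has the same meaning in every copy of the table, so it is
 * safe to keep across invocations. */
int unit_index(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < UNIT_COUNT; i++) {
        if (strcmp(unit_names[i], name) == 0)
            return (int)i;
    }
    return -1;
}

/* Resolves a saved position against the local copy of the table.
 * The returned pointer is meant for immediate use only and should not
 * be saved for a later invocation. Returns NULL for an invalid index. */
const char *unit_name(int index)
{
    if (index < 0 || (size_t)index >= UNIT_COUNT)
        return NULL;
    return unit_names[index];
}
```

A routine that follows this pattern would store the integer that unit_index() returns in its private state and call unit_name() again on each invocation, instead of saving the pointer itself.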

If your C UDR cannot avoid using global or static variables, it is an ill-behaved routine. You can execute the ill-behaved routine in a nonyielding user-defined VP class but not in the CPU VP. A nonyielding user-defined VP prevents the UDR from yielding and thus from migrating to another VP. Because the nonyielding VP executes the UDR to completion, any global (or static) value is valid for the duration of a single invocation of the UDR. The nonyielding VP prevents other invocations of the same UDR from migrating into the VP and updating the global or static variables. However, it does not guarantee that the UDR will return to the same VP for the next invocation.

For the global (or static) value to be valid across a single UDR instance (all invocations of the UDR), define a single-instance user-defined VP. This VP class contains one nonyielding VP. It ensures that all instances of the same UDR execute on the same VP and update the same global variables. A single-instance user-defined VP is useful if your UDR must access a global or static variable by its address.

For more information, see Choose the user-defined VP class.