Rerunning failed grid routines

You can rerun a grid routine that failed on one or more servers in the grid.

About this task

If a grid routine failed on one or more servers in the grid, you can run the cdr list grid command with the --nacks option to see why it failed. If a server in the grid is offline or is not connected to the network, the grid routine remains pending on that server and runs when the server reconnects to the grid; a pending routine does not need to be rerun.

In some cases, you should not rerun a failed routine, because the failure is expected. For example, if a server already has the database object that a grid routine is creating, then that routine fails on that server. If a command failed on all grid servers, you can run the original command again instead of running the ifx_grid_redo() procedure.

The grid must exist and you must run the grid routine as an authorized user from an authorized server.

Procedure

To rerun a grid routine, run the ifx_grid_redo() procedure.

If you run the ifx_grid_redo() procedure with the grid name as its only argument, every routine that failed is rerun on each server on which it failed. To narrow the scope, you can pass additional arguments that specify the servers on which to rerun routines and the routines to rerun.
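The following sketch shows one way to build a narrowed ifx_grid_redo() call from the shell. It assumes that the procedure accepts five positional arguments in the order grid name, source server, target server, tag, and command ID string, and that NULL means "no restriction" for an argument; verify this against the ifx_grid_redo() reference for your version. The function names sql_arg and grid_redo_sql are hypothetical helpers, not part of the product.

```shell
# Hypothetical helper: quote a value as an SQL string, or emit NULL if empty.
# Values are not escaped, so do not pass names that contain single quotes.
sql_arg() {
    if [ -n "$1" ]; then
        printf "'%s'" "$1"
    else
        printf 'NULL'
    fi
}

# Hypothetical helper: print an ifx_grid_redo() statement.
# Assumed argument order: grid, source server, target server, tag, command IDs.
grid_redo_sql() {
    printf 'execute procedure ifx_grid_redo(%s, %s, %s, %s, %s);\n' \
        "$(sql_arg "$1")" "$(sql_arg "$2")" "$(sql_arg "$3")" \
        "$(sql_arg "$4")" "$(sql_arg "$5")"
}
```

For example, grid_redo_sql grid1 "" gserv_2 "" 4 prints a statement that reruns only command 4 on the server gserv_2. You can review the statement first and then pipe it to dbaccess db1 - to run it.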

Example

Suppose you have a grid, named grid1, that contains the servers gserv_1 and gserv_2, which have a database named db1.

You create a dbspace named dbsp2 on the server gserv_1, but not on gserv_2. While connected to the grid from gserv_1, you then create a table in that dbspace with the following commands:

$ dbaccess db1 -
execute procedure ifx_grid_connect('grid1');
create table t100 (c1 int primary key) in dbsp2;
execute procedure ifx_grid_disconnect();

The cdr list grid command shows that the command failed on the server gserv_2:

$ cdr list grid grid1 --nacks
Grid                     Node                     User
------------------------ ------------------------ ------------------------
grid1                    gserv_1*                 user1
                         gserv_2
Details for grid grid1

Node:gserv_1 Stmtid:4 User:user1 Database:db1 2011-02-24 09:27:44
create table t100 (c1 int primary key) in dbsp2
NACK gserv_2 2011-02-24 09:27:45 SQLERR:-261 ISAMERR:-130
     Grid Apply Transaction Failure

The error codes indicate that the table could not be created on gserv_2 (SQL error -261) because the specified dbspace, dbsp2, does not exist on that server (ISAM error -130).
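When many routines have failed, it can help to reduce the cdr list grid output to one line per failure. The following sketch assumes that the output has the layout shown in this example: a line that starts with Node: and contains a Stmtid: field, followed by NACK lines that contain SQLERR: and ISAMERR: fields. The function name list_nacks is hypothetical.

```shell
# Hypothetical helper: read cdr list grid output on stdin and print
# "server stmtid sqlerr isamerr" for each NACK line.
list_nacks() {
    awk '
        /^Node:/ {
            # Remember the statement ID of the current command entry.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^Stmtid:/) { stmtid = $i; sub(/^Stmtid:/, "", stmtid) }
        }
        /^NACK / {
            # Pull the error codes out of the NACK line.
            sqlerr = ""; isamerr = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^SQLERR:/)  { sqlerr = $i;  sub(/^SQLERR:/, "", sqlerr) }
                if ($i ~ /^ISAMERR:/) { isamerr = $i; sub(/^ISAMERR:/, "", isamerr) }
            }
            print $2, stmtid, sqlerr, isamerr
        }
    '
}
```

To use the function, pipe the command output into it, for example: cdr list grid grid1 --nacks | list_nacks. Each output line names a server, the ID of the command that failed on it, and the SQL and ISAM error codes.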

You create a dbspace named dbsp2 on the server gserv_2 and run the ifx_grid_redo() procedure to rerun the original command on gserv_2:

$ dbaccess db1 -
execute procedure ifx_grid_redo('grid1');

The output of the cdr list grid command shows that the command succeeded on both servers:

$ cdr list grid grid1 -v
Grid                     Node                     User
------------------------ ------------------------ ------------------------
grid1                    gserv_1*                 user1
                         gserv_2
Details for grid grid1
...
Node:gserv_1 Stmtid:4 User:user1 Database:db1 2011-02-24 09:27:44
create table t100 (c1 int primary key) in dbsp2
ACK gserv_1 2011-02-24 09:27:44
ACK gserv_2 2011-02-24 09:31:09