6.2.1 How to emulate the Oracle decode function/crosstab
6.2.2 How to implement if-then-else within a select-clause.
6.2.3 deleted due to copyright hassles with the publisher
6.2.4 How to pad with leading zeros an int or smallint.
6.2.5 Divide by zero and nulls.
6.2.6 Convert months to financial months.
6.2.7 Hierarchy traversal - BOMs.
6.2.8 Is it possible to call a UNIX command from within a stored procedure or a trigger?
6.2.9 Information on Identities and Rolling your own Sequential Keys
6.2.10 How can I execute dynamic SQL with ASE
6.2.11 Is it possible to concatenate all the values from a column and return a single row?
6.2.12 Selecting rows N to M without Oracle's rownum?
6.2.13 How can I return number of rows that are returned from a grouped query without using a temporary table?
If you are using ASE version 11.5 or later, the simplest way to emulate the Oracle decode is with the CASE statement. The following code snippet should be compared with the example using a characteristic function given below.
SELECT STUDENT_ID,
       SUM(CASE WHEN COURSE_ID = 101 THEN 1 ELSE 0 END) AS COURSE_101,
       SUM(CASE WHEN COURSE_ID = 105 THEN 1 ELSE 0 END) AS COURSE_105,
       SUM(CASE WHEN COURSE_ID = 201 THEN 1 ELSE 0 END) AS COURSE_201,
       SUM(CASE WHEN COURSE_ID = 210 THEN 1 ELSE 0 END) AS COURSE_210,
       SUM(CASE WHEN COURSE_ID = 300 THEN 1 ELSE 0 END) AS COURSE_300
  FROM registration   -- your student registration table
 GROUP BY STUDENT_ID
 ORDER BY STUDENT_ID
However, if you have a version of ASE that does not support the CASE statement, then you will have to try the following. There may be other reasons to try characteristic functions. If you go to the Amazon web site and look at the reviews of Rozenshtein's book, Advanced SQL, you will see that one reviewer believes that a true crosstab is not possible with the CASE statement. I am not sure. I have also not done any performance tests to see which is quicker.
There is a neat way to use boolean logic to perform cross-tab or rotation queries easily, and very efficiently. Using the aggregate 'Group By' clause in a query and the ISNULL(), SIGN(), ABS(), SUBSTRING() and CHARINDEX() functions, you can create queries and views to perform all kinds of summarizations.
This technique does not produce easily understood SQL statements.
If you want to test a field to see if it is equal to a value, say 100, use the following code:
SELECT (1- ABS( SIGN( ISNULL( 100 - <field>, 1))))
The innermost function will return 1 when the field is null, a positive value if the field < 100, a negative value if the field is > 100 and will return 0 if the field = 100. This example is for Sybase or Microsoft SQL server, but other servers should support most of these functions or the COALESCE() function, which is the ANSI equivalent to ISNULL.
The SIGN() function returns zero for a zero value, -1 for a negative value and 1 for a positive value. The ABS() function returns zero for a zero value and a positive value for any non-zero value. In this case it will return 0 or 1, since its argument is the SIGN() function, thus acting as a binary switch.
Put it all together and you get '0' if the values match, and '1' if they don't. This is not that useful, so we subtract the return value from '1' to invert it, giving us a TRUE value of '1' and a FALSE value of '0'. These return values can then be multiplied by the value of another column, or used within the parameters of another function like SUBSTRING() to return a conditional text value.
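Before building a full crosstab, it can help to check the building block by hand. This sketch simply plugs sample literal values in where <field> would go, one column per case:

```sql
-- Each column exercises one case of the characteristic function
-- (1 - ABS(SIGN(ISNULL(100 - <field>, 1)))):
select (1 - abs(sign(isnull(100 - 99,   1)))),  -- field < 100   -> 0
       (1 - abs(sign(isnull(100 - 100,  1)))),  -- field = 100   -> 1
       (1 - abs(sign(isnull(100 - 101,  1)))),  -- field > 100   -> 0
       (1 - abs(sign(isnull(100 - null, 1))))   -- field is null -> 0
```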
For example, to create a grid from a student registration table containing STUDENT_ID and COURSE_ID columns, where there are 5 courses (101, 105, 201, 210, 300) use the following query:
Compare this version with the case statement above.
SELECT STUDENT_ID,
       SUM(1 - ABS( SIGN( ISNULL( 101 - COURSE_ID, 1)))) COURSE_101,
       SUM(1 - ABS( SIGN( ISNULL( 105 - COURSE_ID, 1)))) COURSE_105,
       SUM(1 - ABS( SIGN( ISNULL( 201 - COURSE_ID, 1)))) COURSE_201,
       SUM(1 - ABS( SIGN( ISNULL( 210 - COURSE_ID, 1)))) COURSE_210,
       SUM(1 - ABS( SIGN( ISNULL( 300 - COURSE_ID, 1)))) COURSE_300
  FROM registration   -- your student registration table
 GROUP BY STUDENT_ID
 ORDER BY STUDENT_ID
ASE 11.5 introduced the case statement, which can be used to replace a lot of this 'trick' SQL with more readable (and standard) code. With a case statement, an if then else is as easy as:
declare @val char(20)
select @val = 'grand'
select case when @val = 'small' then 'petit' else 'grand' end
However, quite a number of people are still using pre-11.5 implementations, including those people using the free 11.0.3.3 Linux release. In that case you can use the following recipe.
To implement the following condition in a select clause:
if @val = 'small' then print 'petit' else print 'grand' fi
in versions of ASE prior to 11.5 do the following:
select isnull(substring('petit', charindex('small', @val), 255), 'grand')
To test it out, try this:
declare @val char(20)
select @val = 'grand'
select isnull(substring('petit', charindex('small', @val), 255), 'grand')
This code is not readily understandable by most programmers, so remember to comment it well.
For example:
declare @Integer int

/* Good for positive numbers only. */
select @Integer = 1000
select "Positives Only" =
    right( replicate("0", 12) + convert(varchar, @Integer), 12)

/* Good for positive and negative numbers. */
select @Integer = -1000
select "Both Signs" =
    substring( "- +", (sign(@Integer) + 2), 1) +
    right( replicate("0", 12) + convert(varchar, abs(@Integer)), 12)

select @Integer = 1000
select "Both Signs" =
    substring( "- +", (sign(@Integer) + 2), 1) +
    right( replicate("0", 12) + convert(varchar, abs(@Integer)), 12)
go
Produces the following results:
Positives Only
--------------
000000001000

Both Signs
-------------
-000000001000

Both Signs
-------------
+000000001000
During processing, if a divide by zero error occurs you will not get the answer you want. If you want the result set to come back and null to be displayed where divide by zero occurs do the following:
1> select * from total_temp
2> go
 field1      field2
 ----------- -----------
          10          10
          10           0
          10        NULL

(3 rows affected)

1> select field1,
          field1/(field2*convert(int, substring('1',1,abs(sign(field2)))))
   from total_temp
2> go
 field1
 ----------- -----------
          10           1
          10        NULL
          10        NULL
To convert months to financial year months (i.e. July = 1, Dec = 6, Jan = 7, June = 12 )
select ...
    ((sign(sign((datepart(month, GetDate()) - 6) * -1) + 1)
          * (datepart(month, GetDate()) + 6))
   + (sign(sign(datepart(month, GetDate()) - 7) + 1)
          * (datepart(month, GetDate()) - 6)))
    ...
from ...
select charindex(datename(month, getdate()),
"          July      August    September October   November  December  January   February  March     April     May       June") / 10
In the above example, the embedded blanks are significant.
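As a sanity check on the sign() version, you can substitute a month number by hand. This sketch plugs in 9 (September), which should map to financial month 3:

```sql
-- With datepart(month, ...) = 9:
--   first term : sign(sign((9-6) * -1) + 1) * (9+6) = sign(0) * 15 = 0
--   second term: sign(sign(9-7) + 1) * (9-6)        = sign(2) * 3  = 3
-- so the expression returns 3, September being the third month of a
-- July-to-June financial year.
select ((sign(sign((9 - 6) * -1) + 1) * (9 + 6))
      + (sign(sign(9 - 7) + 1) * (9 - 6)))
```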
Alright, so you wanna know more about representing hierarchies in a relational database? Before I get into the nitty gritty I should at least give all of the credit for this algorithm to: "_Hierarchical_Structures:_The_Relational_Taboo!_, _(Can_Transitive_Closure_Queries_be_Efficient?)_", by Michael J. Kamfonas, as published in the 1992 "Relational Journal" (I don't know which volume or issue).
The basic algorithm goes like this, given a tree (hierarchy) that looks roughly like this (forgive the ASCII art--I hope you are using a fixed font to view this):
         a
        / \
       /   \
      /     \
     b       c
    / \    / | \
   /   \  /  |  \
  d     e f  g   h
Note, that the tree need not be balanced for this algorithm to work.
The next step assigns two numbers to each node in the tree, called left and right numbers, such that the left and right numbers of each node fall between the left and right numbers of that node's ancestors (I'll get into the algorithm for assigning these left and right numbers later, but, hint: use a depth-first search):
        1a16
        /  \
       /    \
      /      \
   2b7       8c15
   /  \     / | \
  /    \   /  |  \
3d4   5e6 9f10 11g12 13h14
Side Note: The careful observer will notice that these left and right numbers look an awful lot like a B-Tree index.
So, you will notice that all of the children of node 'a' have left and right numbers between 1 and 16, and likewise all of the children of 'c' have left and right numbers between 8 and 15. In a slightly more relational format this table would look like:
Table: hier

node  parent  left_nbr  right_nbr
----  ------  --------  ---------
a     NULL           1         16
b     a              2          7
c     a              8         15
d     b              3          4
e     b              5          6
f     c              9         10
g     c             11         12
h     c             13         14
So, given a node name, say @node (in Sybase variable format), if you want to know all of the children of the node you can do:
SELECT h2.node
  FROM hier h1, hier h2
 WHERE h1.node     = @node
   AND h2.left_nbr > h1.left_nbr
   AND h2.left_nbr < h1.right_nbr
If you had a table that contained, say, the salary for each node in your hierarchy (assuming a node is actually an individual in a company) you could then figure out the total salary for all of the people working underneath of @node by doing:
SELECT sum(s.salary)
  FROM hier h1, hier h2, salary s
 WHERE h1.node     = @node
   AND h2.left_nbr > h1.left_nbr
   AND h2.left_nbr < h1.right_nbr
   AND s.node      = h2.node
Pretty cool, eh? And, conversely, if you wanted to know how much it cost to manage @node (i.e. the combined salary of all of the boss's of @node), you can do:
SELECT sum(s.salary)
  FROM hier h1, hier h2, salary s
 WHERE h1.node      = @node
   AND h2.left_nbr  < h1.left_nbr
   AND h2.right_nbr > h1.right_nbr
   AND s.node       = h2.node
Now that you can see the algorithm in action everything looks peachy; however, the sticky point is the method by which left and right numbers get assigned. And, unfortunately, there is no easy method to do this relationally (it can be done, it just ain't that easy). For a real-world application that I have worked on, we had an external program used to build and maintain the hierarchies, and it was this program's responsibility to assign the left and right numbers.
But, in brief, here is the algorithm to assign left and right numbers to every node in a hierarchy. Note while reading this that this algorithm uses an array as a stack, however since arrays are not available in Sybase, they are (questionably) emulated using a temp table.
DECLARE @skip     int,
        @counter  int,
        @idx      int,
        @left_nbr int,
        @node     varchar(10)

/*-- Initialize variables --*/
SELECT @skip    = 1000, /* Leave gaps in left & right numbers */
       @counter = 0,    /* Counter of next available left number */
       @idx     = 0     /* Index into array */

/*
 * The following table is used to emulate an array for Sybase,
 * for Oracle this wouldn't be a problem. :(
 */
CREATE TABLE #a (
    idx      int         NOT NULL,
    node     varchar(10) NOT NULL,
    left_nbr int         NOT NULL
)

/*
 * I know that I always preach about not using cursors, and there
 * are ways to get around it, but in this case I am more worried
 * about readability over performance.
 */
DECLARE root_cur CURSOR FOR
    SELECT h.node
      FROM hier h
     WHERE h.parent IS NULL
    FOR READ ONLY

/*
 * Here we are populating our "stack" with all of the root
 * nodes of the hierarchy.  We are using the cursor in order
 * to assign an increasing index into the "stack"...this could
 * be done using an identity column and a little trickery.
 */
OPEN root_cur
FETCH root_cur INTO @node
WHILE (@@sqlstatus = 0)
BEGIN
    SELECT @idx = @idx + 1
    INSERT INTO #a VALUES (@idx, @node, 0)
    FETCH root_cur INTO @node
END
CLOSE root_cur
DEALLOCATE CURSOR root_cur

/*
 * The following cursor will be employed to retrieve all of
 * the children of a given parent.
 */
DECLARE child_cur CURSOR FOR
    SELECT h.node
      FROM hier h
     WHERE h.parent = @node
    FOR READ ONLY

/*
 * While our stack is not empty.
 */
WHILE (@idx > 0)
BEGIN
    /*
     * Look at the element on the top of the stack.
     */
    SELECT @node     = node,
           @left_nbr = left_nbr
      FROM #a
     WHERE idx = @idx

    /*
     * If the element at the top of the stack has not been assigned
     * a left number yet, then we assign it one and copy its children
     * on the stack as "nodes to be looked at".
     */
    IF (@left_nbr = 0)
    BEGIN
        /*
         * Set the left number of the current node to be @counter + @skip.
         * Note, we are doing a depth-first traversal, assigning left
         * numbers as we go.
         */
        SELECT @counter = @counter + @skip
        UPDATE #a
           SET left_nbr = @counter
         WHERE idx = @idx

        /*
         * Append the children of the current node to the "stack".
         */
        OPEN child_cur
        FETCH child_cur INTO @node
        WHILE (@@sqlstatus = 0)
        BEGIN
            SELECT @idx = @idx + 1
            INSERT INTO #a VALUES (@idx, @node, 0)
            FETCH child_cur INTO @node
        END
        CLOSE child_cur
    END
    ELSE
    BEGIN
        /*
         * It turns out that the current node already has a left
         * number assigned to it, so we just need to assign the
         * right number and update the node in the actual
         * hierarchy.
         */
        SELECT @counter = @counter + @skip
        UPDATE hier
           SET left_nbr  = @left_nbr,
               right_nbr = @counter
         WHERE hier.node = @node

        /*
         * "Pop" the current node off our "stack".
         */
        DELETE #a WHERE idx = @idx
        SELECT @idx = @idx - 1
    END
END /* WHILE (@idx > 0) */
DEALLOCATE CURSOR child_cur
While reading through this, you should notice that assigning the left and right numbers to the entire hierarchy is very costly, especially as the size of the hierarchy grows. If you put the above code in an insert trigger on the hier table, the overhead for inserting each node would be phenomenal. However, it is possible to reduce the overall cost of an insertion into the hierarchy.
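Because the numbering above leaves a gap of @skip between consecutive values, one cheap-insert strategy is to carve unused numbers out of the parent's gap instead of renumbering. This is only a sketch: @parent and @newnode are hypothetical variables, and it assumes the gap just below the parent's right number has not been exhausted.

```sql
-- Insert @newnode as a new last child of @parent without renumbering,
-- taking two unused values from just below the parent's right number.
declare @r int
select @r = right_nbr from hier where node = @parent
insert hier (node, parent, left_nbr, right_nbr)
values (@newnode, @parent, @r - 2, @r - 1)
```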
Deletes on this table should never cause the left and right numbers to be re-assigned (you could even have a trigger automagically re-parent orphaned hierarchy nodes).
All-in-all, this algorithm is very effective as long as the structure of the hierarchy does not change very often, and even then, as you can see, there are ways of getting around a lot of its inefficiencies.
The Adaptive Server (11.5) will allow O/S calls from within stored procedures and triggers. These stored procedures are known as extended stored procedures.
Periodically folks ask if it's possible to make a system command or call a UNIX process from a Trigger or a Stored Procedure.
The typical ways people have implemented this capability are:
This release includes a new built-in function called syb_sendmsg(). Using this function you can send a message up to 255 bytes in size to another application from the ASE. The arguments that need to be passed to syb_sendmsg() are the IP address and port number on the destination host, and the message to be sent. The port number specified can be any UDP port, excluding ports 1-1024, not already in use by another process. An example is:
1> select syb_sendmsg("120.10.20.5", 3456, "Hello")
2> go
This will send the message "Hello" to port 3456 at IP address '120.10.20.5'. Because this built-in uses the UDP protocol to send the message, the ASE does not guarantee the receipt of the message by the receiving application.
Also, please note that there are no security checks with this new function. It is possible to send sensitive information with this command and Sybase strongly recommends caution when utilizing syb_sendmsg to send sensitive information across the network. By enabling this functionality, the user accepts any security problems which result from its use (or abuse).
To enable this feature you should run the following commands as the System Security Officer.
1> sp_configure "allow sendmsg", 1
2> go
1> sp_configure "syb_sendmsg port number", <port number>
2> go
1> reconfigure with override  -- Not necessary with 11.0 and above
2> go
The server must be restarted to set the port number.
Since syb_sendmsg() installs the configuration parameter "allow sendmsg", existing scripts that contain the syntax
1> sp_configure allow, 1
2> go
to enable updates to system tables should be altered to be fully qualified as in the following:
1> sp_configure "allow updates", 1
2> go
If existing scripts are not altered they will fail with the following message:
1> sp_configure allow, 1
2> go
Configuration option is not unique.
 duplicate_options
 ----------------------------
 allow updates
 allow sendmsg

(return status = 1)
(The above error is a little out of date for the latest releases of ASE, there are now 8 rows that contain "allow", but the result is the same.)
The syb_sendmsg() function requires the addition of two configuration values. If it becomes necessary to roll back to a previous ASE version which does not include syb_sendmsg(), please follow the instructions below.
/* udpmon - a simple UDP listener, e.g. for messages sent with syb_sendmsg() */
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    struct sockaddr_in sadr;
    int portnum, sck, msglen;
    socklen_t dummy;
    char msg[256];

    if (argc < 2) {
        printf("Usage: udpmon <udp portnum>\n");
        exit(1);
    }

    if ((portnum = atoi(argv[1])) < 1) {
        printf("Invalid udp portnum\n");
        exit(1);
    }

    if ((sck = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) < 0) {
        printf("Couldn't create socket\n");
        exit(1);
    }

    sadr.sin_family      = AF_INET;
    sadr.sin_addr.s_addr = inet_addr("0.0.0.0");
    sadr.sin_port        = htons(portnum);

    if (bind(sck, (struct sockaddr *)&sadr, sizeof(sadr)) < 0) {
        printf("Couldn't bind requested udp port\n");
        exit(1);
    }

    for (;;) {
        if ((msglen = recvfrom(sck, msg, sizeof(msg), 0, NULL, &dummy)) < 0)
            printf("Couldn't recvfrom() from udp port\n");

        printf("%.*s\n", msglen, msg);
    }
}
This has several sections, culled from various sources. It is better described as "Everything you've ever wanted to know about identities." It will serve to answer the following frequently asked questions:
What are the Features and Advantages of using Identities?
What are the Problems with and Disadvantages of Identities?
Common Questions about Identities
How do I optimize the performance of a table that uses Identities?
How do I recover from a huge gap in my identity column?
How do I fix a table that has filled up its identity values?
OK, I hate identities. How do I generate sequential keys without using the Identity feature?
How do I optimize a hand-made sequential key system for best performance?
- Question 8.1 of the comp.databases.sybase FAQ has a quick blurb about identities and sequential numbers. Search down in the page for the section titled "Generating Sequential Numbers." Question 8.1 is a general document describing Performance and Tuning topics to be considered and thus doesn't go into as much detail as this page.
- There's a white paper by Malcolm Colton available from the Sybase web site. Go to the Sybase web site http://www.sybase.com and type Surrogate in the search form. Select the "Surrogate Primary Keys, Concurrency, and the Cache Hit Ratio" document.
Advantages/Features of Using Identities
There's an entire section devoted to Identity columns in the ASE Reference manual, Chapter 5
Sybase System 10 introduced many changes over the 4.9.x architecture. One of these changes was the Identity feature. The identity column is a special column type that is automatically populated by the server upon a new row insert. Its purpose is to guarantee a unique row identifier not based on the other data in the row. It was integrated with the server and made memory based for fast value retrieval and no locking (which was/is unavoidable with homegrown sequential key generation schemes).
The Advantages and Features of Identities include:
1> set identity_insert [database]..[table] on
2> go
Note however that the System will not verify the uniqueness of the value you specifically insert (unless of course you have a unique index existing on the identity column).
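For example, a sketch of an explicit insert (the pubs2 database and the ident_test table created in the example further below are reused purely for illustration):

```sql
1> set identity_insert pubs2..ident_test on
2> go
1> insert ident_test (text_field, ident_field)
2> values ("explicit row", 999)
3> go
1> set identity_insert pubs2..ident_test off
2> go
```

Note that with identity_insert on you must name the identity column explicitly in the insert's column list, as above.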
% bcp [database]..[new_table] in [bcp datafile] -Usa -S[server] -f [fmt file] -E
For procedures on resetting identity values during a bcp, see the section regarding Recovery from Large Identity value gaps below.
1> sp_dboption [dbname], "auto identity", true
2> go

or

1> sp_dboption [dbname], "identity in nonunique index", true
2> go
To tune the size of the auto identity (it defaults to precision 10):
1> sp_configure "size of auto identity", [desired_precision]
2> go
(The identity in nonunique index db_option and the size of auto identity sp_configure value are new with System 11; the auto identity existed with the original Identity feature introduction in System 10.)
Like other dboptions, you can set these features on the model database before creating new databases and all your future databases will be configured. Be warned of the pitfalls of large identity gaps however; see the question regarding Burn Factor Vulnerability in the Common Questions about Identities section.
Disadvantages/Drawbacks of Using Identities
NOTE: Sybase 11.02.1 (EBF 6717) and below had a bug (bugid 96089) which would cause "large gaps to occur in identity fields after polite shutdowns." The Sybase 11.02.2 rollup (EBF 6886) fixed this problem. If you're at or below 11.02.1 and you use identities, you should definitely upgrade.
NOTE: there are several other System 11 bugs related to Identities. EBF 7312 fixes BugId 97748, which caused duplicate identity values to be inserted at times. EBF 6886 fixed (in addition to the above described bug) an odd bug (#82460) which caused a server crash when bcping into a table w/ an identity added via alter table. As always, try to stay current on EBFs.
Common questions about Identities
Answer: More or less yes. Oracle's auto-sequencing feature is somewhat transparent to the end user and automatically increments if created as a primary key upon a row insert. The Sybase Identity column is normally specified at table creation and thus is a functional column of the table. If however you set the "auto identity" feature for a database, the tables created will have a "hidden" identity column that doesn't even appear when you execute a select * from [table]. See the Advantages of Identities for more details.
1> create table ident_test
2> (text_field varchar(10),
3> ident_field numeric(5,0) identity)
4> go
Or alter an existing table and add an identity column:
1> alter table existing_table
2> add new_identity_field numeric(7,0) identity
3> go
When you alter a table and add an identity column, the System locks the table while systematically incrementing and adding unique values to each row. IF YOU DON'T SPECIFY a precision, Sybase defaults the size to 18! That's 10^18 - 1 possible values, and some major problems if you ever crash your ASE and burn a default number of values... (10^18 with the default burn factor will burn 5 x 10^14, or 500,000,000,000,000, values... yikes).
1> sp_configure "identity burning set factor"
2> go
The default value set upon install is 5000. The number 5000 in this case is read as "0.05% of all the potential identity values you can have in this particular case will be burned upon an unexpected shutdown." The actual number burned depends on the size of the identity field as you specified it when you created your table.
To set the burn factor, type:
1> sp_configure "identity burning set factor", [new value]
2> go
This is a static change; the server must be rebooted before it takes effect.
1> select object_name(id) "table", name "column", prec "precision"
2> from syscolumns
3> where convert(bit, (status & 0x80)) = 1
4> go
This will list all the tables, the field within each table that serves as an identity, and the size of the identity field.
1> select identity_field from id_test
2> go
 identity_field
 --------------
              1
              2
              3
              4
              5
         500006
         500007
         500008

(8 rows affected)
Here's your Identity burning options (based on a precision of 10^9 as above):
Burn value   % of values   # values burned during crash
----------   -----------   ----------------------------
      5000   .05%          500,000
      1000   .01%          100,000
       100   .001%         10,000
        10   .0001%        1,000
         1   .00001%       100
So, the absolute lowest amount of numbers you'll burn, assuming you configure the burn factor down to 1 (sp_configure "identity burning set factor", 1) and a precision of 9, is 100 values.
Optimizing your Identity setup for performance and maintenance
If you've chosen to use Identities in your database, here are some configuration tips to avoid typical Identity pitfalls:

- See the Burn Factor Vulnerability section for a discussion of what happens to identity values upon ASE crashes. Large jumps in values can crash front ends that aren't equipped to handle and process numbers upwards of 10 Trillion. I've seen Powerbuilder applications crash and/or not function properly when trying to display these large identity values.
1> sp_configure "identity grab size", [number]
2> go
You can prefetch larger numbers of values for each user as they log into the server and insert rows. The downside of this is that if the user doesn't use all of the prefetched block of identity values, the unused values are lost (seeing as, if another user logs in, the next block gets assigned to him/her). This can quickly accelerate the depletion of identity values and can cause gaps in Identity values.
(this feature is new with System 11)
- There is an excellent discussion in document http://www.sybase.com/detail?id=860 on the performance and tuning aspects of Identities. It supplements some of the information located here (Note: this will open in a new browser window).
Recovery from Large Identity value gaps or Recovery from Identity insert errors/Full Identity tables
This section will discuss how to re-order the identity values for a table following a crash/abnormal shutdown that has resulted in huge gaps in the values. The same procedure is used in cases where the identity field has "filled up" and does not allow inserts anymore. Some applications that use Identities are not truly candidates for this process (i.e., applications that depend on the identity field for business purposes as opposed to simple unique row identifiers). Applications like this that wish to rid their dependence on identities will have to re-evaluate their database design.
% bcp database..table out [data_file] -Usa -S[server] -N
This will create a binary bcp datafile and will force the user to create a .fmt file.
The -N option tells the server to skip the identity field while bcp'ing out.
- Drop and recreate the table in question from ddl (make sure your table ddl specifies the identity field).
- Now bcp back in:
% bcp database..table in [data_file] -Usa -S[server] -f [fmt file] -N
The -N option during bcp in tells the server to ignore the data file's placeholder column for the defined identity column.
Coincidentally, if you bcp out w/o the -N option, drop the table, recreate from ddl specifying the identity field, and bcp back in w/o the -N option, the same effect as above occurs.
(note: if you bcp out a table w/ identity values and then want to preserve the identity values during the bcp back in, use the "-E" option.)
1> select [all columns except identity column],
2> [identity column name] = identity(desired_precision)
3> into [new_table]
4> from [old table]
5> go
How do I generate Sequential Keys w/o the Identity feature?
There are many reasons not to use the Identity feature of Sybase. This section will present several alternative methods, along with their advantages and drawbacks. The methods are presented in increasing order of complexity. The most often implemented is Method 3, which is a more robust version of Method 2 and which uses a surrogate-key storage table.
Throughout this section the test table I'm adding lines to and generating sequential numbers for is table inserttest, created like this:
1> create table inserttest
2> (testtext varchar(25), counter int)
3> go
1> begin tran
2> declare @nextkey int
3> select @nextkey = max(counter) + 1 from inserttest holdlock
4> insert inserttest (testtext, counter) values ("test_text", @nextkey)
5> go
1> commit tran
2> go
This method is rather inefficient, as large tables will take minutes to return a max(column) value, plus the entire table must be locked for each insert (since the max() will perform a table scan). Further, the select statement does not guarantee an exclusive lock when it executes unless you have the "holdlock" option; so either duplicate values might be inserted to your target table or you have massive deadlocking.
1> create table keystorage
2> (tablename varchar(25),
3> lastkey int)
4> go
And initially populate it with the tablenames and last values inserted (enter in a 0 for tables that are brand new).
1> insert into keystorage (tablename, lastkey)
2> select "inserttest", max(counter) from inserttest
3> go
Now, whenever you go to insert into your table, go through a process like this:
1> begin tran
2> update keystorage set lastkey = lastkey + 1 where tablename = "inserttest"
3> go
1> declare @lastkey int
2> select @lastkey = lastkey from keystorage where tablename = "inserttest"
3> insert inserttest (testtext, counter) values ("nextline", @lastkey)
4> go
1> commit tran
2> go
There is plenty of room for error checking with this process: for example (code adapted from Colm O'Reilly (colm@mail.lk.blackbird.ie) post to Sybase-L 6/20/97):
1> begin tran
2> update keystorage set lastkey = lastkey + 1 where tablename = "inserttest"
3> if @@rowcount = 1
4> begin
5> declare @lastkey int
6> select @lastkey = lastkey from keystorage where tablename = "inserttest"
7> end
8> commit tran
9> begin tran
10> if @lastkey is not null
11> begin
12> insert inserttest (testtext, counter) values ("third line", @lastkey)
13> end
14> commit tran
15> go
This provides a pretty failsafe method of guaranteeing the success of the select statements involved in the process. You still have a couple of implementation decisions though:
Some DBAs experienced with this keystorage table method warn of large amounts of blocking in high insert activity situations, a potential drawback.
DECLARE @sql_err int, @sql_count int

begin tran

select @out_seq = 0

UPDATE NEXT_SEQUENCE
   SET next_seq_id =
       ( next_seq_id *
         ( sign(1 + sign(max_seq_id - next_seq_id)) -- evaluates: 0 [when next > max]; else 1
           * sign(max_seq_id - next_seq_id)         -- evaluates: 0 [when next = max];
                                                    --            1 [next < max];
                                                    --           -1 [next > max]
         )                                          -- both evaluate to 1 when next < max
       ) + 1                                        -- increment by [or restart at] 1
 WHERE seq_type = @in_seq_type

select @sql_err = @@error, @sql_count = @@rowcount

IF @sql_err = 0 and @sql_count = 1
BEGIN
    select @out_seq = next_seq_id
      from NEXT_SEQUENCE
     where seq_type = @in_seq_type

    commit tran
    return 0
END
ELSE
BEGIN
    RAISERROR 44999 'Error %1! returned from proc derive_next_sequence...no update occurred', @sql_err
    rollback tran
END
Optimizing your home grown Sequential key generating process for any version of Sybase (adapted from a post by gmeyer@netcom.com):
1> create table keystorage
2> (tablename varchar(25),
3> lastkey int,
4> filler1 char(255) not null,
5> filler2 char(255) not null,
6> filler3 char(255) not null,
7> filler4 char(255) not null,
8> filler5 char(255) not null,
9> filler6 char(255) not null,
10> filler7 char(255) not null)
11> with fillfactor = 100
12> go
We use 7 char(255) fields to pad our small table. We also specify the fillfactor create table option to be 100. A fillfactor of 100 tells the server to completely fill every data page. Now, during your initial insertion of a line of data, do this:
1> insert into keystorage
2> (tablename, lastkey,
3> filler1, filler2, filler3, filler4, filler5, filler6, filler7)
4> values
5> ("yourtable", 0,
6> replicate("x",250), replicate("x",250),
7> replicate("x",250), replicate("x",250),
8> replicate("x",250), replicate("x",250),
9> replicate("x",250))
10> go
This pads the row with 1750 bytes of junk, almost guaranteeing that, given a row's byte size limit of 1962 bytes (a row cannot span more than one page, thus the 2048 page size minus server overhead == 1962), we will be able to simulate row level locking.
1> create table keystorage
2> (tablename varchar(25),
3> lastkey int)
4> with max_rows_per_page = 1
5> go
- There is an excellent discussion located in the whitepapers section of Sybase's home page discussing the performance and tuning aspects of any type of Sequential key use. It supplements the information here (note: this page will open in a new browser window).
ASE 12 supports dynamic SQL, allowing the following:
declare @sqlstring varchar(255)
select @sqlstring = "select count(*) from master..sysobjects"
exec (@sqlstring)
go
There is a neat trick that was first reported by Bret Halford (bret@sybase.com). (If anyone knows better, point me to the proof and I will change this!) It utilises the CIS features of Sybase ASE.
Remember to ensure that all of the databases referred to in the SQL string are fully qualified since the call to sp_remotesql places you back in your default database.
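A sketch of the trick (the loopback server name LOOPBACK is an arbitrary choice here, and CIS must be enabled on your server):

```sql
-- Register the local server with itself under a "loopback" name.
exec sp_addserver LOOPBACK, null, @@servername
go
-- Bounce an arbitrary SQL string off the loopback server.
declare @sqlstring varchar(255)
select @sqlstring = "select count(*) from master..sysobjects"
exec sp_remotesql LOOPBACK, @sqlstring
go
```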
Before System 11.5 there was no real way to execute dynamic SQL. Rob Verschoor has some very neat ideas that fill some of the gaps (http://www.euronet.nl/~syp_rob/dynsql.html).
With System 10, Sybase introduced the ability to execute a stored procedure dynamically.
declare @sqlstring varchar(255)
select @sqlstring = "sp_who"
exec @sqlstring
go
For some reason Sybase chose never to document this feature.
Obviously, all of this concerns executing dynamic SQL within the server itself, i.e. in stored procedures and triggers. Dynamic SQL within client applications is a different matter altogether.
Hey, this was quite cool I thought. It is now possible to concatenate a series of strings into a single value, in a manner analogous to summing all of the numbers in a column.
Obviously, in versions before 12.5 the longest string that you can build this way is 255 characters, but with 12.5's very long varchars this may prove useful to someone.
Use a case statement, a la,
1> declare @string_var varchar(255)
2>
3> select @string_var = ""
4>
5> select @string_var = @string_var +
6>        (case 1 when 1
7>              then char_col
8>         end)
9> from tbl_a
10>
11> print "%1!", @string_var
12> go
(1 row affected)
ABCDEFGH
(8 rows affected)
1> select * from tbl_a
2> go
 char_col
 --------
 A
 B
 C
 D
 E
 F
 G
 H

(8 rows affected)
1>
Sybase does not have a direct equivalent to Oracle's rownum but its functionality can be emulated in a lot of cases.
If you are simply trying to retrieve the first N rows of a table, then simply use:

set rowcount <N>

replacing <N> with your desired number of rows. (set rowcount 0 restores normality.) If it is the last N rows you want, then add a descending order-by clause to the select.
1> set rowcount <N>
2> go
1> select foo
2> from bar
3> order by barID desc
4> go
Suppose you are trying to retrieve rows 100 to 150, say, from a table in a given order. You could use this to retrieve rows for a set of web pages, although there are probably more efficient ways using cursors, well-written queries or even Sybperl! The general idea is to select the rows into a temporary table, adding an identity column at the same time. Only select enough rows to do the job, using the rowcount trick. Finally, return the rows from the temporary table where the identity column is between 100 and 150. Something like this:
set rowcount 150

select pseudo_key = identity(3),
       col1,
       col2
  into #tempA
  from masterTable
 where clause...
 order by 2,3

select col1, col2
  from #tempA
 where pseudo_key between 100 and 150
Remember to reset rowcount back to 0 before issuing any more SQL or you will only get back 150 rows!
A small optimisation would be to select only the key columns for the source table together with the identity key. Once you have the set of rows you require in the temporary table, join this back to the source using the key columns to get any data that you require.
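That optimisation might look like the following sketch. The column names (keyCol, col1, col2) are illustrative, and the where clause is left elided as in the example above:

```sql
set rowcount 150

-- Keep only the source table's key in the temp table.
select pseudo_key = identity(3),
       keyCol
  into #keys
  from masterTable
 where clause...
 order by keyCol

set rowcount 0

-- Join back to the source table to pick up the remaining
-- columns for rows 100 to 150 only.
select m.col1, m.col2
  from #keys k, masterTable m
 where k.keyCol = m.keyCol
   and k.pseudo_key between 100 and 150
```

This keeps the temporary table narrow, which matters if the source rows are wide.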
An alternative, which might be better if you needed to join back to this table a lot, would be to insert enough rows to cover the range as before, but then delete the set of unwanted rows. This would be a very efficient mechanism if the majority of your queries involved the first few rows of a table. A typical application for this might be a search engine displaying relevant items first. The chances are that the user is going to be bored after the first couple of pages and go back to playing 'Internet Doom'.
set rowcount 150

select col1, col2
  into #tempA
  from masterTable
 where clause...

set rowcount 100

delete #tempA
Sybase does not guarantee to return rows in any particular order, so the delete may not delete the correct set of rows. In the above example, you should add an order-by to the 'select' and build a clustered index on a suitable key in the temporary table.
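Putting those two fixes together, the delete-based approach might be sketched as follows (column names illustrative, where clause elided as before):

```sql
set rowcount 150

-- Fix 1: impose an order on the select so the identity values
-- reflect the desired row order.
select pseudo_key = identity(3),
       col1,
       col2
  into #tempA
  from masterTable
 where clause...
 order by 2,3

-- Fix 2: a clustered index on the identity column so the delete
-- walks the rows in pseudo_key order.
create clustered index tempA_idx on #tempA (pseudo_key)

set rowcount 100
delete #tempA        -- rows 101-150 remain

set rowcount 0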
The following stored proc was posted to the Sybase-L mailing list and uses yet another mechanism. You should check that it works as expected in your environment since it relies on the fact a variable will be set using the last row that is returned from a result set. This is not published behaviour and is not guaranteed by Sybase.
CREATE PROCEDURE dbo.sp_get_posts
    @perpage    INT,
    @pagenumber INT
WITH RECOMPILE
AS
-- if we're on the first page no need to go through the @postid push
IF @pagenumber = 1
BEGIN
    SET ROWCOUNT @perpage
    SELECT ...
    RETURN
END

-- otherwise
DECLARE @min_postid NUMERIC( 8, 0 ),
        @position   INT

SELECT @position = @perpage * ( @pagenumber - 1 ) + 1

SET ROWCOUNT @position

-- What happens here is it will select through the rows
-- and order the whole set.
-- It will stop pushing postid into @min_postid when it hits
-- ROWCOUNT, and does this out of the ordered set (a work
-- table).
SELECT @min_postid = postid
  FROM post
 WHERE ...
 ORDER BY postid ASC

SET ROWCOUNT @perpage

-- we know where we want to go (say the 28th post in a set of 50).
SELECT ...
  FROM post
 WHERE postid >= @min_postid
   ...
 ORDER BY postid ASC
Yet another solution would be to use a loop and a counter. Probably the least elegant, but again, it would depend on what you were trying to do as to what would be most appropriate.
As you can see, none of these are particularly pretty. If you know of a better method, please forward it to dowen@midsomer.org.
This question is certainly not rocket science, but it is often nice to know how many rows are returned as part of a group by. This might be for a report or a web query, where you would want to tell the user how many rows were returned on page one. It is easy using a temp table, but doing it without a temp table is a little harder. I liked this solution and thought that it might not be obvious to everyone; it was certainly educational to me. Thanks go to Karl Jost for a very nice answer.
So, given data like:
name    item
----    ----
Brown      1
Smith      2
Brown      5
Jones      7
you wish to return a result set of the form:
name    sum(item)   rows
----    ---------   ----
Brown           6      3
Jones           7      3
Smith           2      3
rather than
name    sum(item)   rows
----    ---------   ----
Brown           6      2
Jones           7      1
Smith           2      1
Use the following, beguilingly simple query. (Each group's count(*) is at least 1, so sign(count(*)) is 1 for every group; the outer sum then adds these up across all groups, giving the total number of groups on every row.)
select name, sum(item), sum(sign(count(*))) from data group by name