Connection to legacy Sybase DBMS sometimes hangs

On a Windows Groundplex, we are using the “Generic JDBC - Execute” snaps to issue several different kinds of queries to a legacy Sybase Adaptive Server Anywhere 9 database. (Also sometimes called Sybase “SQL Server Anywhere.”)

We are using the connection options recommended by Snaplogic in the online documentation. [TBD: put a link to that here.]

Every once in a while, a query that has previously worked just fine will “hang” the connection. Once this has happened, attempts to query via that connection go unresolved and time out. We believe that when the problem happens, what may be happening is that the query is either failing or else not finishing in time, and that the connection is left open in an “off hook” state, with the result set still pending. Subsequent queries on such a channel wouldn’t get acknowledged because it would just be sitting there waiting for the previous action to complete… which never does.

Has anyone else experienced this with a Sybase connection? (Or with any other non-standard legacy database connection?)

Any ideas how to troubleshoot and fix this?

OH MAN! This brings back memories! You might have fallen into a trap I consider a design flaw! The default primary key for this variant of sybase, IIRC, was CLUSTERED! Is that what it is on YOUR system?

http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc32300.1570/html/sqlug/X50317.htm

If so, you must do things IN KEY ORDER! If you do them OUT of order, it will take LONGER!

How much longer? Well, I was on a contract where I often had to run things at the end of the week just before I flew out. NO PROBLEM! It was a reliable process, and generally took 30 MINUTES to run!

Well, sometimes I would get to the airport and be struck with HORROR! I would suddenly remember that the DBA might make a change that week. Whenever he “made a change”, he would drop and recreate the table, WITH NO SPECIFICATION! So the primary key would be clustered, and the routine I was running would take about THREE DAYS!!!

During this period, the connection will APPEAR to hang! It is really just waiting for the operation to complete.

CLUSTERED on SYBASE, at least with the first variant they had, AND with M/S SQL Server up to 6.5(or was it 7), since they shared the same code. Was GREAT for READING, since it was faster and took less space. It was great for writing, IN SEQUENCE! It was HORRIBLE for writing OUT of sequence. The delay on a particular record is based on where the update or delete occurs. If it is towards the end, it might not be very noticeable. If it is at the beginning, it could take a LONG time!

Of course it is ALSO possible that you simply have a lock, or a deadlock. If it is a lock, it will likely eventually go away. If it is a deadlock, it will stay locked for a while and eventually kill, iirc, the longest running transaction. Of course the transaction on the sacrificed job will roll back also.

1 Like

Stephen:

Thanks for the feedback!

First of all, exactly what do you mean by “do things” in key order?

Here’s some more background; hoping this will be helpful!

In this case, we are issuing SELECT statements that involve joining several tables.

To expand on my earlier statement of the queries otherwise being “just fine”, these queries normally complete very quickly regardless of where it’s being run from: SnapLogic “Generic JDBC Execute” Snap, or a tool like “averSQL”, or Sybase’s own “Sybase Central.”

The query hangs only in SnapLogic, and it does not do it every time.
We have yet to be able to reproduce it on demand; it’s sporadic.

Note that averSQL uses Windows-native ODBC, not JDBC, and Sybase Central uses the Sybase-native connection API. That makes me suspect the problem is driver-specific. A few days ago, I found a comment elsewhere online where someone using the JConnect (jconn4.jar) driver complained about similar behavior, and Sybase recommended using the (newer) “iAnywhere” JDBC-ODBC bridge drivers. But… (a) I haven’t been able to re-find the comment, and (b) we haven’t been able to get the bridge drivers working on our GroundPlex. (For some reason it can’t find the dbjodbc9.dll file.)

We suspect that the hangs happen when the queries involve one particular table, which is a change log. The change log table has NO primary key defined! (I know: bleah!) It does have 1 unique index (on 5 columns), and 3 other non-unique indexes (one on 4 columns, two others on two columns… hm… and the two others seem to be identical: wow, what’s up with that?)

I think that a deadlock is extremely unlikely, as we’re experiencing this in testing where the only two processes accessing the database are our application (1 QA user) and the SnapLogic pipelines themselves. The QA person makes some changes with the application, then kicks off the pipelines, then checks our central database to see if certain changes have taken place in the data there.

To the best of my knowledge, most if not all of the other tables have PKs. If I can reproduce the problem, I’ll post more details regarding the actual query and the specs of the tables involved. Maybe we can find a problem hiding in those details.


Second, I’m not sure how to find the defaults for Sybase.
I’ve worked with MS SQL Server, Oracle, PostgreSQL, MySQL, and MaxDB… but it’s been over 20 years since I’ve touched Sybase, so I’m essentially approaching it anew. How would I check to see if it is defaulting to clustered? Does it still matter since the potentially problematic table has no PK?

Well, clustered indexes are only created on tables with primary keys, so it sounds like that might not be a problem here. What I mean by key order is that keys can be anything. Say the key was numeric, and you had records 1-1000000 listed out there, with their number as the primary key.

MOST databases store data OUT OF ORDER, and have a LINKED set of pages that hold the keys, and pointers to their tables. The overall hit on a read isn’t that noticeable, and a write can be VERY fast.

A CLUSTERED index on Sybase, and the older versions of MS/SQL Server is different. They store data IN ORDER, and kind of use THAT as the index. Reads are a bit faster, and a write can be very slow, if written out of order.

In my example, with a non clustered index, writes and reads would be roughly the same whereever they are. As I said though, that index, and the random order will slow down reads a bit. If you wrote record 1000000 or later, the write would actually be a bit faster than the unclustered. If you wrote record 1, it could be MUCH slower, because it has to make room for that record.

It has been a long while since I have used sybase, or older versions of MS SQL. I believe you can just use DESCRIBE mytablename

Stephen:

Aha. Well, that makes sense. “Clustered” generally means “index stored with or within the row”, so it does stand to reason that an unsophisticated implementation would have to physically rewrite a bunch of stuff if you wrote things out of order. I imagine if I had an integer PK clustered (ascending) and wrote records {10,9,8,7,6,5,4,3,2,1} in that order, the last row written would take 9 read-writes to move everything down before it could write row 1 at the top.

So, I verified, none of the indexes are clustered. Plus we’re not writing.

So this is something different. We just now managed to provoke it into doing it again, but still don’t have enough information to be able to reproduce it. Fun!! (Not.)

Yeah, I think sybase made it the default on the idea that people would always have a write once mentality, where changes would have a reversing entry, or things would be only read. It is a nice concept, but I HATE that they made it the default. It gave me a lot of grief because the DBA insisted on doing ALL changes, and just wouldn’t listen. So that computer probably did over 2 months work for NOTHING. Luckily that was generally on the weekends, so it wan’t as bad as it could have been.

Outside of transactions, locks, and deadlocks, I think the database should be relatively fast, meaning that you wouldn’t notice a significant pause. And I am ASSUMING that that is what you mean. The page size at least WAS small, so it would do more page splits than say M/S SQL TODAY, or Oracle. So there might be more gaps where it seems to almost hesitate, but we are talking barely susceptible hesitations that may be tens or hundreds of milliseconds.

If you can, it might be a good idea to bring it up with sybase. It COULD be a jdbc problem, or even some oddity in sybase. If you are doing enough processing, it could even be garbage collection.

It could be a windows problem. Windows NEVER handled virtual memory well, and at least earlier versions of windows didn’t generally handle memory over a certain amount properly. They always wanted people to pay more to be able to do that. And what else is happening on that windows system?