View Issue Details

IDProjectCategoryView StatusLast Update
0002919Ham Radio DeluxeBugpublic2019-11-08 02:31
ReporterKB3NPHAssigned ToWA9PIE 
PrioritynormalSeveritymajorReproducibilitysometimes
Status closedResolutionfixed 
Product Version 
Target VersionFixed in Version6.7.0.244 
Summary0002919: DX Cluster WSI Lookups stalling DX Cluster
DescriptionOSTicket #509156 - Didn't see anything else in Mantis related to this issue.

Did a remote into the customer's computer. He is running Windows 8.1 and has 6+ Gigs of ram and HRD 6.4.0.893. Customer said this same issue has been going on for quite a few releases now. He is reverting back to a release where everything was working as it should.

When the DX cluster is running, from time to time he receives maybe one spot that "stalls" the DX cluster. When this happens there is a little icon indicating something going on in the very left column of the DX cluster display. It appears that during this time it may be searching the log for worked status. When this happens it "stalls" the complete logbook, meaning you can't click on any menu option or do anything in the logbook. After a short period, ranging from about 1/2 a second to as much as several seconds, it appears to locate what it is looking for, populates the WSI column and releases the logbook so it can then function normally.

In watching the DX Cluster, sometimes this happens with only 1 spot. Sometimes it happens with 5 or 6 spots in a row if they come in rapid succession. I've included a screenshot from the customer's screen when there were several of these stalls. You can see that the WSIs are NOT populated during this stall and the little folder icon is showing in the left most column of the image for the stalled spots. Once the lookup was completed and all of the WSIs were populated for each of those spots, the Logbook returned to allow normal operation.

Hope this info helps.
TagsNo tags attached.
ModuleLogbook
Sub-ModuleDX Cluster
Testing Beta Successful

Relationships

related to 0002797 closedWA9PIE Ham Radio Deluxe Logbook stops displaying spots normally (they're grey'd out and have the search folder icon in the far left column) 
related to 0003299 new 1 - Backlog V6.5.0.207 Connecting to DX cluster locks Logbook 

Activities

KB3NPH

2018-10-17 11:57

administrator  

DX Cluster Stalls.jpg (81,510 bytes)
DX Cluster Stalls.jpg (81,510 bytes)

K7ZCZ

2018-10-17 13:56

administrator   ~0006288

The "Logfile" view should be used to collect information about the actions of the application in this state. Marking the "debug" and "timing" buttons will cause the logbook to display information about each database query executed, including information about how long the query takes to run. That information should be valuable toward troubleshooting.

Further, information about the user's database configuration would be very valuable. A screenshot of the "Logbook Databases" dialog box would be a good start. This dialog is reachable using the "Databases" command in the "Configure" tear-off menu of the "Logbook" tear-off menu in the "Tools" menu in the Logbook's main window.

KB3NPH

2018-10-19 11:23

administrator   ~0006304

Last edited: 2018-10-19 11:29

View 2 revisions

Posted two logbook logfiles in Mantis 2919 folder in the DUMPS folder. These logfiles are from OSTicket #509156
Also included I've attached a screenshot of this tickets Logbook Database Config dialog.

KB3NPH

2018-10-19 11:29

administrator  

K7ZCZ

2018-10-21 17:36

administrator   ~0006324

The screenshot of the database manager shows that there's only one Logbook database configured. That's good news, since some users manage to have multiple databases and even multiple copies of the same database configured to handle lookups. The screenshot also tells us that Microsoft Access is the backing store for the single logbook database in use.

When the Logbook receives a DX cluster notificaiton, it does about 16 different queries against each of the databases configured to handle WSI lookups. The queries are all pretty poor as far as ideal efficiency goes, as most of them end up being implemented by complete tables scans. But they normally execute quite quickly--in just a few milliseconds.

From the provided logs, we can see that most lookups are very fast. However, some are terribly slow. From the 165817.TXT file, for example, we see that it takes more than ten seconds to execute just one of the queries to look up the "PU2UAF" callsign, and about 11 seconds for one of the queries to look up "DK3PO".

16:53:38  Stopwatch      BackgroundProcessingThreadLookups.cpp (2307): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 11923904 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (2307): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 10326181 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (2460): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 3461 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (2460): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 920641 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (2761): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 1610 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (2761): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 1762 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3034): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 1173 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3083): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 568 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3034): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 2502 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3118): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 769 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3167): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 819 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3083): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 939 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3118): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 676 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3167): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 1022 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3560): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 1294 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3560): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 2802 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3617): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 5952 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3560): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 762 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3617): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 715 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3675): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 675 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3617): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 5210 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3560): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 1313 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3617): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 649 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (3675): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 515 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (4436): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 2108 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (4481): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 659 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (4481): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 1314 microseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (4481): [connection = 0A813260, threadNumber = 0, callsign = "PU2UAF"] 3708 microseconds
16:53:39  CDXCObject     lookup of "PU2UAF" took 10375 milliseconds
16:53:39  Stopwatch      BackgroundProcessingThreadLookups.cpp (4481): [connection = 0A814620, threadNumber = 1, callsign = "DK3PO"] 9151 microseconds
16:53:39  CDXCObject     lookup of "DK3PO" took 12875 milliseconds


The remainder of the queries are pretty quick, but these two slow queries dominate the lookup operation for the two callsigns -- and they take more than 10 seconds and just less than 13 seconds to look up.

The timing information in the log is very specific to the database access, so we know with great certainty that it's really the database that took so long to execute the query. At this point in the product's lifecycle, there's not too much we can do for the database design; a more efficient design is possible, but it would involve significant re-work.

One feasible approach for the application is to not use multiple threads to handle lookups. Lookups are split across two threads; if we assume the stall is because the Access engine can't handle multi-threaded access, then the proposed fix is appropriate. However, I can't find any Microsoft documentation that says the Access engine can (or cannot) handle access from multiple concurrent threads.

In fact, we know that not all users experience these symptoms, which suggests that the issue of slow query performance is due to the database engine configuration on some users' machines; due to a specific version or patch level of the database engine; or due to a certain pattern of cardinality in the data being searched. Before a remediative plan is chosen, we should do what we can to better understand the cause of the symptoms.

K7ZCZ

2018-10-22 17:13

administrator   ~0006329

Similar to the report in 2797

K7ZCZ

2019-07-25 18:19

administrator   ~0008273

For other customers, we've found that upgrading to the current Access drivers and adjusting the data source definition to use that new driver has improved this situation.

What's the status of this issue with this customer in current builds? Have they installed a more modern Access runtime, and moved their data sources to that driver? How did the symptoms change?

K7ZCZ

2019-08-01 14:23

administrator   ~0008318

In the 2019-08-01 call, we confirmed that updating to the current Access runtime resolves this issue for customers.

WA9PIE

2019-09-22 19:35

administrator   ~0008589

So is this something that we're going to write code for? Or is this something we'll use as an after-the-fact procedure for resolving customer problems?

g3ucq

2019-09-23 05:44

viewer   ~0008595

No problems for me either with a MYSQL database or Access.

KB3NPH

2019-10-10 13:50

administrator   ~0008783

This appears to be resolved.

WA9PIE

2019-10-23 07:56

administrator   ~0008967

Personally, I think this is a problem with a very old Access driver. With the new driver, I've never had a problem... and users I've moved to the latest driver (which requires migrating the data to a new database) no longer have this problem.

Closing.

Issue History

Date Modified Username Field Change
2018-10-17 11:57 KB3NPH New Issue
2018-10-17 11:57 KB3NPH Status new => assigned
2018-10-17 11:57 KB3NPH Assigned To => K7ZCZ
2018-10-17 11:57 KB3NPH File Added: DX Cluster Stalls.jpg
2018-10-17 13:56 K7ZCZ Note Added: 0006288
2018-10-18 17:03 K7ZCZ Assigned To K7ZCZ => KB3NPH
2018-10-19 11:23 KB3NPH Note Added: 0006304
2018-10-19 11:29 KB3NPH Note Edited: 0006304 View Revisions
2018-10-19 11:29 KB3NPH File Added: 2018-10-19 12_26_38-logbook databases (1).jpg
2018-10-21 17:36 K7ZCZ Note Added: 0006324
2018-10-22 17:13 K7ZCZ Relationship added related to 0002797
2018-10-22 17:13 K7ZCZ Note Added: 0006329
2019-04-18 14:31 K7ZCZ Relationship added related to 0003299
2019-07-25 18:19 K7ZCZ Note Added: 0008273
2019-08-01 14:23 K7ZCZ Status assigned => resolved
2019-08-01 14:23 K7ZCZ Resolution open => fixed
2019-08-01 14:23 K7ZCZ Note Added: 0008318
2019-08-30 13:32 K7ZCZ Project 1 - Backlog => 3 - Current Dev List
2019-08-30 16:00 K7ZCZ Fixed in Version => 6.7.0.226
2019-09-22 19:35 WA9PIE Note Added: 0008589
2019-09-23 05:44 g3ucq Note Added: 0008595
2019-10-10 13:50 KB3NPH Note Added: 0008783
2019-10-10 13:53 KB3NPH Assigned To KB3NPH => WA9PIE
2019-10-10 13:53 KB3NPH Description Updated View Revisions
2019-10-23 07:56 WA9PIE Status resolved => closed
2019-10-23 07:56 WA9PIE Testing Not Started => Beta Successful
2019-10-23 07:56 WA9PIE Note Added: 0008967
2019-11-08 02:16 WA9PIE Fixed in Version 6.7.0.226 => 6.7.0.244
2019-11-08 02:31 WA9PIE Project 3 - Current Dev List => Ham Radio Deluxe