View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0002862||Ham Radio Deluxe||Bug||public||2018-08-28 15:33||2018-09-11 13:18|
|Priority||normal||Severity||crash||Reproducibility||have not tried|
|Target Version||Fixed in Version||188.8.131.526|
|Summary||0002862: Logbook: Bad thread synchronization around lookup calls and REQUEST_SIGNAL_DATA structure|
|Description||We've been trying to find a crash in the Logbook that corrupts memory and causes the Logbook application to shut down. See the related bugs for more details and more minidump files, including a couple of changes made in speculation of a fix.|
I recently found a dump file on the Dev Dashboard which is stored in the Dumps directory on Google Drive for this issue. The call stack for the dump is provided below. Note that the nThreadID parameter to the CountryLookup() member is -1.
This dump comes from build 873 being run on current Windows 10 bits (build 10.0.17134.112)
0:021> kb *** Stack trace for last set context - .thread/.cxr resets it # ChildEBP RetAddr Args to Child 00 171ed234 774f4e2e 00000001 00000009 774f2bb3 ntdll!RtlReportCriticalFailure+0x88 01 171ed240 774f2bb3 00000002 774a2a5f e1a95355 ntdll!RtlpReportHeapFailure+0x2f 02 171ed248 774a2a5f e1a95355 00000000 1d31f844 ntdll!RtlpHeapHandleError+0x11 03 171ed27c 774f9af9 1d31f844 1d31f844 00000009 ntdll!RtlpHpHeapHandleError+0x65 04 171ed290 774afb7d 1d31f844 00000000 00000000 ntdll!RtlpLogHeapFailure+0x41 05 171ed2ec 00f4e2b1 00920000 00000000 1d31f844 ntdll!RtlFreeHeap+0x74fad 06 171ed300 012dfbee 1d31f844 12f365f9 1bfe7a50 HRDLogbook!free+0x1a [f:\dd\vctools\crt\crtw32\heap\free.c @ 51] 07 171ef908 012e820c 1d31f844 1d31f6c4 ffffffff HRDLogbook!CBackgroundProcessingThreadLookups::CountryLookup+0x894e [c:\ham radio\logbook\hrdlogbook\backgroundprocessingthreadlookups.cpp @ 5308] 08 171ef93c 012d04ff 1bfe7a50 00000000 12f36585 HRDLogbook!CBackgroundProcessingThreadLookups::ProcessData+0xbc [c:\ham radio\logbook\hrdlogbook\backgroundprocessingthreadlookups.cpp @ 178] 09 171ef974 014b3a60 12f3656d 0dd22e60 00988a90 HRDLogbook!CBackgroundProcessingThread::DoWork+0xaf [c:\ham radio\logbook\hrdlogbook\backgroundprocessingthread.cpp @ 305] 0a 171ef99c 014b39a6 12f36529 00f534d7 0dd22e60 HRDLogbook!CThinThread::Run+0x80 [c:\ham radio\logbook\hrdlogbook\thinthread.cpp @ 188] 0b 171ef9d8 00f5342b 0179a718 12f366e1 00f534d7 HRDLogbook!CThinThread::Start+0x46 [c:\ham radio\logbook\hrdlogbook\thinthread.cpp @ 236] 0c 171efa10 00f53553 00f534d7 171efa30 7734a1a4 HRDLogbook!_callthreadstartex+0x1b [f:\dd\vctools\crt\crtw32\startup\threadex.c @ 376] 0d 171efa1c 7734a1a4 0dd22e60 7734a180 8d2ac2f5 HRDLogbook!_threadstartex+0x7c [f:\dd\vctools\crt\crtw32\startup\threadex.c @ 354] 0e 171efa30 7746174e 0dd22e60 e1a97b51 00000000 kernel32!BaseThreadInitThunk+0x24 0f 171efa78 77461722 ffffffff 774a62f3 00000000 ntdll!__RtlUserThreadStart+0x2b 10 171efa88 00000000 00f534d7 0dd22e60 00000000 ntdll!_RtlUserThreadStart+0x1b
|Tags||No tags attached.|
There's only a couple of places in the code which make the call to perform this lookup. One is asynchronous, and doesn't wait for a return -- it passes pointers to help-allocated memory. The other location uses stack-based structures and is synchronous; it has to be, because otherwise the stack frame would disappear, rendering the pointers passed to the worker function invalid.
The whole notion of having the same code handle synchronous and asynchronous requests is a bit iffy; why use a separate thread just to wait for it to complete its work?
The -1 parameter noted above gives a clue; we know that the call must be coming from the synchronous code. It turns out that the synchronous code is synchronous -- but has a time out of seven seconds. The time out duration was arbitrarily chosen -- but how could it possibly take more than 7 seconds to do a callsign lookup?
If the time out is exceeded, the stack frame of the calling function is torn down, taking the REQUEST_SIGNAL_DATA and lookup result data structures with it. When the lookup code in the other thread derefereneces those pointers, it's walking over memory it doesn't own. The first symptom is this crash, where the REQUEST_SIGNAL_DATA structure is deallocated. It's puzzling that the structure is deallocated, but that's because it's a flag in the structure itself that indicates it is meant to be deallocated; as soon as the structure becomes corrupt (overwritten with other data after the calling stack frame disappears) the flag indicating deletion is required is very likely to look like a non-zero "TRUE" value, causing the deallocation to work.
The chorus of offsetting problems disguised this issue pretty well, but the fix should take care of things. Instead of waiting indefinitely, we still wait for seven seconds -- but log, and conitnue waiting. That way, if a hang actually occurs, the code is still showing some output and some evidence of where it might be hung.
Last edited: 2018-09-01 14:41
Fixed with this checkin, which first went out in the 877 build:
Jose made a couple of good code review comments, so I'll address those before resolving this issue fully ...
I addressed Jose's CR comments in this checkin, so let's resolve this one now.
||Unable to test|
||Putting FIV back... I changed it by accident.|
||Given that we can't test this (it was a dump collected on the dev dashboard), I'm going to give this item a pass.|
||Relationship added||related to 0002813|
||Relationship added||related to 0002789|
||Relationship added||related to 0002735|
||Note Added: 0006015|
||Note Added: 0006036|
||Note Edited: 0006036||View Revisions|
||Assigned To||=> K7ZCZ|
||Status||new => resolved|
||Resolution||open => fixed|
||Note Added: 0006044|
||Fixed in Version||=> 184.108.40.2068|
|2018-09-10 02:45||g3ucq||Note Added: 0006167|
|2018-09-10 11:01||WA9PIE||Fixed in Version||220.127.116.118 =>|
|2018-09-10 11:05||WA9PIE||Fixed in Version||=> 18.104.22.1688|
|2018-09-10 11:05||WA9PIE||Description Updated||View Revisions|
|2018-09-10 11:05||WA9PIE||Note Added: 0006179|
|2018-09-10 16:53||WA9PIE||Status||resolved => closed|
|2018-09-10 16:53||WA9PIE||Testing||Not Started => Not Tested|
|2018-09-10 16:53||WA9PIE||Note Added: 0006192|
|2018-09-11 13:15||WA9PIE||Project||3 - Current Dev List => Ham Radio Deluxe|
|2018-09-11 13:18||WA9PIE||Fixed in Version||22.214.171.1248 => 126.96.36.1996|