When a thread hangs after calling CoInitialize, the first thing that comes to mind is a deadlock.
0:008> !cs -l
-----------------------------------------
DebugInfo          = 0x00432418
Critical section   = 0x75d60280 (USER32!gcsUserApiHook+0x0)
LOCKED
LockCount          = 0x1
WaiterWoken        = No
OwningThread       = 0x00000f0c
RecursionCount     = 0x1
LockSemaphore      = 0x200
SpinCount          = 0x00000000
-----------------------------------------
DebugInfo          = 0x0043e7a0
Critical section   = 0x7580773c (ole32!g_mxsSingleThreadOle+0x18)
LOCKED
LockCount          = 0x0
WaiterWoken        = No
OwningThread       = 0x00000f4c
RecursionCount     = 0x1
LockSemaphore      = 0x0
SpinCount          = 0x00000000
Using the OwningThread values, look at these two threads:
0:001> ~~[f0c]
   0  Id: d40.f0c Suspend: 2 Teb: 7efdd000 Unfrozen
      Start: XXX_exe+0x15e2 (000915e2)
      Priority: 0  Priority class: 32  Affinity: 1
0:000> ~~[f4c]
   1  Id: d40.f4c Suspend: 1 Teb: 7efda000 Frozen
      Start: XXX+0x37a7 (744537a7)
      Priority: 0  Priority class: 32  Affinity: 1
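The two locks are owned by thread 0 and thread 1 respectively, and both are in the LOCKED state, which looks very much like a deadlock.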
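The owner lookup above can also be done mechanically. The following is a minimal sketch, assuming the `!cs -l` text has been saved with one field per line. `parse_cs_records`, `CS_OUTPUT` and `THREAD_NUMBERS` are hypothetical names made up for this illustration and are not part of WinDbg. The thread-number table is copied by hand from the two `~~[tid]` results.

```python
# Sample text retyped from the `!cs -l` output, one field per line.
CS_OUTPUT = """\
-----------------------------------------
DebugInfo          = 0x00432418
Critical section   = 0x75d60280 (USER32!gcsUserApiHook+0x0)
LOCKED
LockCount          = 0x1
WaiterWoken        = No
OwningThread       = 0x00000f0c
RecursionCount     = 0x1
LockSemaphore      = 0x200
SpinCount          = 0x00000000
-----------------------------------------
DebugInfo          = 0x0043e7a0
Critical section   = 0x7580773c (ole32!g_mxsSingleThreadOle+0x18)
LOCKED
LockCount          = 0x0
WaiterWoken        = No
OwningThread       = 0x00000f4c
RecursionCount     = 0x1
LockSemaphore      = 0x0
SpinCount          = 0x00000000
"""

# Thread id -> debugger thread number, read off the two `~~[tid]` results.
THREAD_NUMBERS = {0xF0C: 0, 0xF4C: 1}


def parse_cs_records(text):
    """Split `!cs` output into one dict per critical section (hypothetical helper)."""
    records, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("-----"):
            # A dashed line starts a new critical-section record.
            current = {"locked": False}
            records.append(current)
        elif current is None:
            continue
        elif line == "LOCKED":
            current["locked"] = True
        elif "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            current[key] = value
    return records


records = parse_cs_records(CS_OUTPUT)
owners = {}
for rec in records:
    # "0x75d60280 (USER32!gcsUserApiHook+0x0)" -> "USER32!gcsUserApiHook+0x0"
    symbol = rec["Critical section"].split("(", 1)[1].rstrip(")")
    tid = int(rec["OwningThread"], 16)
    owners[symbol] = THREAD_NUMBERS[tid]
    print(f"{symbol}: locked={rec['locked']}, owner tid {tid:#x} = thread {owners[symbol]}")
```

This only restates what the debugger already shows; it is useful mainly when a dump contains many locked critical sections and the owners have to be matched in bulk.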
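Note that thread 0 has a Suspend count of 2. One of those counts was added by the debugger when it broke in; subtracting it still leaves a count of 1, so this thread really was suspended.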
So why is thread 1 stuck?
0:001> kv
 # ChildEBP RetAddr  Args to Child
00 01eed258 77288dd4 00000200 00000000 00000000 ntdll!NtWaitForSingleObject+0x15 (FPO: [3,0,0])
01 01eed2bc 77288cb8 00000000 00000000 00000000 ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo])
02 01eed2e4 75cfaccc 75d60280 00000011 6dfe0000 ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo])
03 01eed360 75cfab0e 6dfe0000 6dff4571 0000c03b USER32!InitUserApiHook+0x21 (FPO: [Non-Fpo])
04 01eed49c 7726011a 01eed4b4 00000000 01eee758 USER32!__ClientLoadLibrary+0xb1 (FPO: [Non-Fpo])
05 01eed4dc 75cfa8e8 00000000 0000c03b 0000c03b ntdll!KiUserCallbackDispatcher+0x2e (FPO: [0,0,0])
06 01eed788 75cfaa3c 00000000 0000c03b 01eed824 USER32!VerNtUserCreateWindowEx+0x1a9 (FPO: [Non-Fpo])
07 01eed83c 75cf8a5c 00000000 0000c03b 01eed824 USER32!_CreateWindowEx+0x210 (FPO: [Non-Fpo])
08 01eed878 7570644f 00000000 0000c03b 75706470 USER32!CreateWindowExW+0x33 (FPO: [Non-Fpo])
09 01eed8b0 7570650c 75806b04 75807724 00000006 ole32!InitMainThreadWnd+0x3e (FPO: [0,0,4]) (CONV: stdcall) [d:\w7rtm\com\ole32\com\objact\mainthrd.cxx @ 160]
0a 01eed8c8 75700b81 00000000 00000006 00000000 ole32!wCoInitializeEx+0xef (FPO: [Non-Fpo]) (CONV: stdcall) [d:\w7rtm\com\ole32\com\class\compobj.cxx @ 2437]
0b 01eed8e8 75f0f4ae 00000002 00000006 75ed79be ole32!CoInitializeEx+0x29d (FPO: [Non-Fpo]) (CONV: stdcall) [d:\w7rtm\com\ole32\com\class\compobj.cxx @ 2110]
WARNING: Stack unwind information not available. Following frames may be wrong.
0c 01eedd48 75f95090 00000008 00000005 02912c28 SHELL32!SHGetSpecialFolderLocation+0x136d
0d 01eedd64 75f2dbd1 02912c28 00000000 01eee754 SHELL32!Ordinal711+0x174b
0e 01eedf9c 75f2db76 00000000 02912c28 00000000 SHELL32!SHCreateDirectoryExW+0x70
0f 01eedfb4 10053d2d 00000000 02912c28 00000000 SHELL32!SHCreateDirectoryExW+0x15
... // frames omitted here
1a 01eef7e8 77289ed2 00000000 7721419c 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
1b 01eef828 77289ea5 744537a7 00000000 00000000 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])
1c 01eef840 00000000 744537a7 00000000 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])
Note the call to RtlEnterCriticalSection in frame 02. Its declaration is:
NTSTATUS RtlEnterCriticalSection(RTL_CRITICAL_SECTION* crit);
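The first argument is a pointer to the critical section we are looking for. In frame 02 of the stack above, the first value under "Args to Child" is 75d60280.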
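Reading that argument off a frame line is simple column counting. Here is a small sketch, assuming a 32-bit `kv` frame line in the layout shown in this post (frame number, ChildEBP, RetAddr, then three argument words). `first_arg` is a hypothetical helper, and the values in "Args to Child" are just the first three stack words of the frame, so they may not be real arguments for other calling conventions or on x64.

```python
def first_arg(kv_line):
    """Return the first 'Args to Child' word of an x86 `kv` frame line as an int."""
    parts = kv_line.split()
    # Column layout: frame#, ChildEBP, RetAddr, arg1, arg2, arg3, symbol...
    return int(parts[3], 16)


# Frame 02 copied from the stack trace of thread 1.
FRAME_02 = (
    "02 01eed2e4 75cfaccc 75d60280 00000011 6dfe0000 "
    "ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo])"
)

cs_address = first_arg(FRAME_02)
# Build the debugger command that inspects this critical section.
command = f"!cs {cs_address:08x}"
print(command)
```

The resulting address is what gets passed to `!cs` in the next step.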
0:001> !cs 75d60280
-----------------------------------------
Critical section   = 0x75d60280 (USER32!gcsUserApiHook+0x0)
DebugInfo          = 0x00432418
LOCKED
LockCount          = 0x1
WaiterWoken        = No
OwningThread       = 0x00000f0c
RecursionCount     = 0x1
LockSemaphore      = 0x200
SpinCount          = 0x00000000
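OwningThread is 0x00000f0c, which is thread 0, and as we saw earlier thread 0 is suspended. Unless thread 0 is resumed, it never releases the lock and thread 1 will wait forever.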
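The failure mode can be reproduced in miniature. The sketch below is a portable simulation, not the Win32 code path: Python has no SuspendThread, so "suspended while holding the lock" is modeled by the owner thread blocking on an event inside the locked region, and setting that event plays the role of ResumeThread. All names here (`lock`, `resume`, `try_enter`) are made up for the illustration.

```python
import threading

lock = threading.Lock()              # stands in for USER32!gcsUserApiHook
owner_has_lock = threading.Event()   # signals that the owner is inside the lock
resume = threading.Event()           # setting it simulates ResumeThread


def owner():
    # Plays thread 0: takes the lock, then is "suspended" while still holding it.
    with lock:
        owner_has_lock.set()
        resume.wait()


def try_enter(timeout):
    """Try to take the lock as thread 1 does; False means it is still blocked."""
    got = lock.acquire(timeout=timeout)
    if got:
        lock.release()
    return got


t = threading.Thread(target=owner, daemon=True)
t.start()
owner_has_lock.wait()

# While the owner is "suspended", the waiter cannot get in.
blocked_while_suspended = not try_enter(0.2)

# "Resume" the owner; it leaves the locked region and the waiter succeeds.
resume.set()
t.join()
acquired_after_resume = try_enter(0.2)

print(blocked_while_suspended, acquired_after_resume)
```

The real waiter in the dump has no timeout: RtlEnterCriticalSection blocks in NtWaitForSingleObject until the owner leaves the critical section, which a suspended thread never does.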
PS:
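Do not use TerminateThread and SuspendThread carelessly. After tracking down the cause, I came across the article below; it is worth a read and reasonably well written.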
"TerminateThread and SuspendThread, the Natural Enemies of Threads" (original title: 线程天敌TerminateThread与SuspendThread)