-
Notifications
You must be signed in to change notification settings - Fork 5k
Fix unhandled exception in tiered compilation thread #116716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix unhandled exception in tiered compilation thread #116716
Conversation
The detection the unhandled exception is not working in the case of the tiered compilation thread, because there is no indication that the exception is going to be caught in the `CEEInfo::runWithErrorTrap`. So the EH assumes the exception is going to be unhandled and exits the process. The unhandled exception message was not seen in the repro case of the issue likely because the tiered compilation thread console output didn't get flushed. When running under a debugger and stepping over the `InternalUnhandledExceptionFilter_Worker`, it was actually printed out. The fix is to add `DebuggerU2MCatchHandlerFrame` to the `CEEInfo::runWithErrorTrap`. That allows the EH to see that the exception will be caught there and let it flow through the CallDescrWorker to reach that trap. Closes dotnet#116676
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR ensures that exceptions in the tiered compilation thread are correctly detected as handled by adding a debugger catch frame and improves clarity by renaming a confusing local variable.
- Register a
DebuggerU2MCatchHandlerFrame
inCEEInfo::runWithErrorTrap
so EH knows exceptions will be caught. - Rename
invalidRevPInvoke
toisPropagatingToNativeCode
inSfiNext
and update related logic and log messages.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/coreclr/vm/jitinterface.cpp | Added DebuggerU2MCatchHandlerFrame catchFrame(true) at the start of runWithErrorTrap . |
src/coreclr/vm/exceptionhandling.cpp | Renamed invalidRevPInvoke to isPropagatingToNativeCode and updated assignments, checks, and logs. |
Comments suppressed due to low confidence (3)
src/coreclr/vm/jitinterface.cpp:10588
- [nitpick] Add a brief comment explaining why this catch frame is needed (e.g., "// Ensure EH sees that this thread has a catch handler for tiered compilation").
DebuggerU2MCatchHandlerFrame catchFrame(true /* catchesAllExceptions */);
src/coreclr/vm/exceptionhandling.cpp:3913
- [nitpick] The name
isPropagatingToNativeCode
overlaps withisPropagatingToExternalNativeCode
and may be ambiguous; consider a more specific name likehasReversePInvokeFrame
to clarify its intent.
CLR_BOOL isPropagatingToNativeCode = FALSE;
src/coreclr/vm/jitinterface.cpp:10588
- No tests were added to verify that exceptions in the tiered compilation thread are now correctly caught; consider adding a repro unit test covering this scenario.
DebuggerU2MCatchHandlerFrame catchFrame(true /* catchesAllExceptions */);
Should we add a regression test for this? |
EH_LOG((LL_INFO100, "SfiNext: reached native frame at IP=%p, SP=%p, isPropagatingToNativeCode=%d\n", | ||
GetIP(pThis->m_crawl.GetRegisterSet()->pCurrentContext), GetSP(pThis->m_crawl.GetRegisterSet()->pCurrentContext), isPropagatingToNativeCode)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EH_LOG((LL_INFO100, "SfiNext: reached native frame at IP=%p, SP=%p, isPropagatingToNativeCode=%d\n", | |
GetIP(pThis->m_crawl.GetRegisterSet()->pCurrentContext), GetSP(pThis->m_crawl.GetRegisterSet()->pCurrentContext), isPropagatingToNativeCode)); | |
EH_LOG((LL_INFO100, "SfiNext: reached native frame at IP=%p, SP=%p, isPropagatingToNativeCode=%s\n", | |
GetIP(pThis->m_crawl.GetRegisterSet()->pCurrentContext), GetSP(pThis->m_crawl.GetRegisterSet()->pCurrentContext), (isPropagatingToNativeCode ? "true" : "false"))); |
It needs to be compiled as optimized and the Console.WriteLine is needed to repro the issue.
{ | ||
for (int i = 0; i < 1_000_000; i++) | ||
{ | ||
Console.WriteLine(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will produce megabytes of output that will get copied around, etc. Can we do something less visible, like Thread.Sleep(10)
and change the for-loop to go till like 200? We should only need to hit the loop about CallCountThreshold
times (30) to trigger the background tiered compilation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It usually triggers on hundreds of thousands of iterations for me, so even a Sleep(1) was too long. Let me try if a longer sleep and less iterations would give the tiered compilation enough time to crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've modified it to use the Sleep(10) and less loops. It reproduces the problem without the fix reliably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use the env var for aggressive tiering here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is needed, the test now repros the problem reliably (with the fix from this PR removed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
Fix contract Co-authored-by: Jan Kotas <[email protected]>
The detection the unhandled exception is not working in the case of the tiered compilation thread, because there is no indication that the exception is going to be caught in the
CEEInfo::runWithErrorTrap
. So the EH assumes the exception is going to be unhandled and exits the process.The unhandled exception message was not seen in the repro case of the issue likely because the tiered compilation thread console output didn't get flushed. When running under a debugger and stepping over the
InternalUnhandledExceptionFilter_Worker
, it was actually printed out.The fix is to add
DebuggerU2MCatchHandlerFrame
to theCEEInfo::runWithErrorTrap
. That allows the EH to see that the exception will be caught there and let it flow through theCallDescrWorker
to reach that trap.I have also taken this opportunity and renamed the confusing
invalidRevPInvoke
local variable.Closes #116676