-
Notifications
You must be signed in to change notification settings - Fork 239
Error logging and observation
WIL's Error handling helpers allow you to observe and log messages generated by helper macros such as RETURN_IF_FAILED
, THROW_IF_FAILED
, and LOG_IF_FAILED
, either right when the errors are reported (for logging) or after the fact.
If this is not enough for your needs, WIL's error handling helpers also provide other error handling customization features.
By default, WIL will log failures using OutputDebugString
if a user-mode debugger is attached to the application process, but do nothing with them otherwise.
So if you have a program like this:
#include <exception>
#include "wil/result.h"
void BlowUp() try
{
throw std::exception{ "Too bad!" };
} CATCH_LOG()
int main()
{
LOG_HR(E_UNEXPECTED);
BlowUp();
}
then when you run it under a debugger you will see WIL log messages like these:
X:\scratch.cpp(12)\scratch.exe!00073551: (caller: 00074A6C) LogHr(1) tid(4ac0) 8000FFFF Catastrophic failure
X:\scratch.cpp(8)\scratch.exe!00073512: (caller: 00073559) LogHr(2) tid(4ac0) 8007023E {Application Error}
The exception %s (0x Msg:[std::exception: Too bad!]
but when you run it without a debugger, you won't see any output at all.
If necessary, WIL lets you change how it detects a debugger; see Custom debugger detection.
You can add to the default logging behavior by registering callback functions with WIL. Be careful: don't write code in these callback functions that affects your code's logic.
void wil::SetResultLoggingCallback(CustomLoggingCallback* callback);
Use wil::SetResultLoggingCallback
to set a process-wide callback function
that WIL will call each time it logs a failure. The callback returns void and
takes a single wil::FailureInfo
parameter that describes the failure being logged:
void __stdcall CustomLoggingCallback(wil::FailureInfo const& failure) noexcept;
If you're writing the top-level code of an application, you might place a call to your
preferred logging library in the SetResultLoggingCallback
callback function.
For this purpose, you can use wil::GetFailureLogString
to compose a default
log message string.
Because SetResultLoggingCallback
changes process-wide state, you should call it
only at process startup, not in the middle of normal execution, in order to avoid
race conditions between threads setting and/or calling the callback.
You can clear a previously set callback by calling SetResultLoggingCallback(nullptr)
,
but you cannot change the callback pointer from one non-null value directly to another.
int main()
{
// Print every log message to standard error.
wil::SetResultLoggingCallback([](wil::FailureInfo const &failure) noexcept
{
constexpr std::size_t sizeOfLogMessageWithNul = 2048;
wchar_t logMessage[sizeOfLogMessageWithNul];
if (SUCCEEDED(wil::GetFailureLogString(logMessage, sizeOfLogMessageWithNul, failure)))
{
std::fputws(logMessage, stderr);
}
});
// ...
}
void wil::SetResultTelemetryFallback(CustomTelemetryFallback *callback);
If your app has a component that uploads reliability metrics, analytics, or telemetry,
consider using wil::SetResultTelemetryFallback
to set a per module (DLL or EXE) callback function
that will be called as an opportunity to send failure messages to this telemetry client.
This callback is syntactically similar to the one taken by SetResultLoggingCallback
,
but not the same:
void __stdcall CustomTelemetryFallback(bool alreadyReportedToTelemetry, wil::FailureInfo const& failure) noexcept;
The difference is that the SetResultTelemetryFallback
callback also takes a alreadyReportedToTelemetry
parameter.
WIL uses this parameter to tell your callback's code whether the message has already been sent up to telemetry
via a previously registered ThreadFailureCallback
(see below).
The same warnings apply regarding race conditions when calling SetResultTelemetryFallback
from multiple threads as for SetResultLoggingCallback
.
You can call both SetResultTelemetryFallback
and SetResultLoggingCallback
in your program, and both callbacks will run.
wil::SetResultTelemetryFallback([](bool alreadyReportedToTelemetry, wil::FailureInfo const &failure) noexcept
{
// ...
});
auto wil::ThreadFailureCallback(CustomFailureCallback* callback);
Use wil::ThreadFailureCallback
to add a thread-local callback function
that WIL will call before outputting each error handling helper log message.
The callback is similar to the one accepted by SetResultLoggingCallback
,
but returns a bool
true/false flag instead of void.
bool CustomFailureCallback(wil::FailureInfo const& failure) noexcept;
Return true from the callback when you want to tell the process's
SetResultTelemetryFallback
callback that the thread
has already uploaded a metric or telemetry data point for the failure, so that
your process-wide SetResultTelemetryFallback callback doesn't send a duplicate
telemetry data point.
You can register as many callbacks as you like with ThreadFailureCallback
, and they all will run;
each call to ThreadFailureCallback
returns an RAII object that will
automatically unregister the callback when it is destructed.
auto monitor = wil::ThreadFailureCallback([](wil::FailureInfo const &failure) noexcept
{
bool didReportToTelemetry = false;
// ...
// Set didReportToTelemetry to true if we've sent the failure to the telemetry client.
// If at least one ThreadFailureCallback callback returns true,
// then a true alreadyReportedToTelemetry flag will be passed to the
// SetResultTelemetryFallback callback.
return didReportToTelemetry;
});
The ThreadFailureCallback
RAII object must be constructed and destructed on the same thread,
and it must follow stack-like behavior: The last one constructed on a thread
must be the first one destructed.
If you violate this rule, you are likely to crash in
ThreadFailureCallbackHolder::GetThreadContext
as it tries to walk a corrupted list of callbacks.
Violations of the LIFO rule can occur if an exception occurs that is handled by an outer stack frame, and that exception bypasses C++ destructors, typically because the exception is not covered by the current module's exception handling model. Examples:
- A structured exception like an access violation occurs
and is caught by an outer frame. The
/Eha
exception handling model detects some but not all of these cases. - A C++ exception is thrown from C code and is caught by an outer frame,
and the module is compiled with
/Ehc
(assume C does not throw exceptions). - A C++ exception escapes a function marked
noexcept
and is caught by an outer frame. The/Ehr
exception handling model detects and handles this case.
If your module is an out-of-process COM server, make sure to use the
IGlobalOptions
interface to disable automatic handling
of structured exceptions:
wil::CoCreateInstance<IGlobalOptions>(CLSID_GlobalOptions)->
Set(COMGLB_EXCEPTION_HANDLING, COMGLB_EXCEPTION_DONOT_HANDLE_ANY);
Another source of non-LIFO use is a coroutine that suspends
with an active ThreadFailureCallback
.
The ThreadFailureCallback
does not destruct even though control
has returned to the caller, which will commonly lead to non-LIFO usage.
HRESULT wil::GetFailureLogString(PWSTR destBuffer, size_t destBufferSize, wil::FailureInfo const &failure) noexcept;
wil::GetFailureLogString
outputs the same error message that WIL would print to the debugger output for the given error. If you've registered callback functions in order to send log messages to your own logger, but you like the string format already used for the debugger output, you can call GetFailureLogString
to obtain the debugger output string without having to parse the error data structure yourself.
You need to allocate your own wchar_t
buffer to hold the string output by GetFailureLogString
, and then pass the buffer's size including a space for the null terminator.
If you want to change both the debugger output and the result of GetFailureLogString
, you can use SetResultMessageCallback
.
See the example for SetResultLoggingCallback.
The ThreadFailureCache
remembers the most recent unique failure that occurred on its thread.
A failure is considered "unique" if has a different error code from the previous error. Errors are commonly propagated up the call stack, resulting in the same error code being reported multiple times. The ThreadFailureCache
records only the first such error, which is most likely to be the point of origination.
The ThreadFailureCache
is movable but not copyable. See details below for move semantics.
Warning: The ThreadFailureCache
relies upon its destructor to clean up per-thread state. If the destructor is bypassed (for example, due to a handled structured exception), the per-thread state points to stack structures that no longer exist, and the code will probably crash in mysterious ways.
-
ThreadFailureCache()
Constructs aThreadFailureCache
and begins watching for errors. (SeeWatchCurrentThread
.) -
ThreadFailureCache(ThreadFailureCache&& other)
Move constructor transfers the cached failure from the other object.The new object begins watching for errors even if the source object was not watching for errors.
The source object's error-watching state is unchanged. You must stop it manually.
The source object's error state is invalid. Do not call
GetFailureInfo
on the old object.
-
~ThreadFailureCache()
If the object was watching for errors, it must be destructed from the same thread that calledWatchCurrentThread()
.
-
ThreadFailureCache&& operator=(ThreadFailureCache&& other)
Move assignment operator transfers the cached failure from the other object.The error-watching state of the source and destination objects is unchanged.
The source object's failure information is invalid. Do not call
GetFailureInfo
on the source object.
-
IgnoreCurrentThread()
Pause watching for errors from the current thread. It is an error to call this method when errors are already being ignored. This method must be called from the same thread that calledWatchCurrentThread()
most recently. -
WatchCurrentThread()
Watch for errors from the current thread, which can different from the thread that calledIgnoreCurrentThread()
. It is an error to call this method when errors are already being watched. -
FailureInfo const* GetFailure()
If a failure has been observed, returns aFailureInfo
pointer that describes the most recent unique failure. Returnsnullptr
if no failure has been observed.The returned
FailureInfo
pointer is valid only until the thread next observes an error. Therefore, you should callGetFailure()
only from the thread that is watching, or when the object is not watching at all.
The ThreadFailureCache
can be used as part of an RAII type
that tracks a sequence of operations leading to a goal.
If the enclosing type did not reach the success state prior to its
destruction (due to early return or exception unwinding),
it can use the ThreadFailureCache
to infer a possible reason for the failure.
class OperationLogger
{
public:
OperationLogger() = default;
~OperationLogger()
{
if (!m_complete)
{
auto info = m_cache.GetFailure();
if (info)
{
// log info->hr, info->pszFile, info->uLineNumber, etc.
}
else
{
// failure not logged by WIL error macro.
// Could be C++ exception like std::bad_alloc or premature return.
}
}
void Complete() { m_complete = true; }
private:
bool m_complete = false;
ThreadFailureCache m_cache;
};
Leak detection tools may report a memory leak in
wil::details_abi::ProcessLocalStorageData<wil::details_abi::ProcessLocalData>::MakeAndInitialize
.
The first WIL client that uses the error reporting services allocates this data, which is then shared with all other WIL clients. The last WIL client to unload frees the memory. If the process terminates with active WIL clients, then the memory is intentionally leaked. (See Shutdown-aware objects.)
As a result, leak detection tools may generate false positives.
- The module that allocated the memory is not necessarily the one that will free it. Leak detection tools may expect modules to free all the memory they allocated, unaware that this memory is shared between modules.
- The memory is intentionally leaked if the process shuts down while there are still active WIL clients.
See Internals Local data for implementation details.