Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINIFICPP-2494 windows event log: fix event message formatting, refactor #1900

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

szaszm
Copy link
Member

@szaszm szaszm commented Nov 26, 2024

Changes:

  • change enum METADATA to enum class Metadata
  • rewrite getEventData and getEventMessage. They no longer try with a stack buffer first. That looks like a premature optimization, since there are allocations in the function either way, and a user reported issues with the call that may be resolved by this. The rewrite aims to follow modern C++ best practices, instead of the unsafe C-like code it used to be.
  • remove a few unused includes
  • make getComputerName thread-safe and simpler. The max computer name is reduced to 256, based on the MSDN example. I think the overwhelming majority of computer FQDNs will fit in that just fine. Also changed it to truncate when it's longer, instead of bailing out and returning "N/A"
  • other various drive-by refactors

Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file?
  • If applicable, have you updated the NOTICE file?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.

@@ -74,7 +74,7 @@ const short event_type_index = 178; // NOLINT short comes from WINDOWS API

class FakeWindowsEventLogMetadata : public WindowsEventLogMetadata {
public:
[[nodiscard]] std::string getEventData(EVT_FORMAT_MESSAGE_FLAGS flags) const override { return "event_data_for_flag_" + std::to_string(flags); }
[[nodiscard]] nonstd::expected<std::string, std::error_code> getEventData(EVT_FORMAT_MESSAGE_FLAGS flags) const noexcept override { return "event_data_for_flag_" + std::to_string(flags); }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this to noexcept. There are allocations here. If they fail, the test will terminate and report failure, which is an acceptable behavior IMO.

}
// TODO(szaszm): add error logging
Copy link
Member Author

@szaszm szaszm Nov 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One behavior change in getEventData: allocation failure now returns an error_code, instead of an empty string. Since both empty strings and errors are ignored here, there is no change in overall behavior. Added a comment to do proper error handling later, but I didn't wanna do it in the context of this change. (MetadataWalker has no logger.)

Comment on lines -154 to -159
std::map<std::string, std::string> MetadataWalker::getFieldValues() const {
return fields_values_;
}

std::map<std::string, std::string> MetadataWalker::getIdentifiers() const {
return replaced_identifiers_;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved inline to the header

Comment on lines 137 to 138
case USER: // TODO(szaszm): unhandled before refactoring
case UNKNOWN: // TODO(szaszm): unhandled before refactoring
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to properly handle these, so I opted to not change the behavior.

Comment on lines -206 to -208
EVT_HANDLE WindowsEventLogHandler::getMetadata() const {
return metadata_provider_.get();
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved inline to the header

Comment on lines -90 to -91
static std::string getMetadataString(METADATA val) {
static std::map<METADATA, std::string> map = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was unused

} else {
return METADATA::UNKNOWN;
}
static Metadata getMetadataFromString(std::string_view val) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this to be thread-safe, and avoid a static map, at the expense of linear lookup instead of logarithmic. I think it doesn't matter at such a small and constant size.

static std::string computer_name;
if (computer_name.empty()) {
char buff[10248];
static std::string computer_name = []() -> std::string {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this to an immediately invoked lambda: this way the thread-safe static initialization takes care of only running the initialization once, and subsequent calls will just copy the static string. The old version was not thread-safe.

for (const auto& option : header_names_) {
auto name = option.second;
const auto& name = option.second;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

option.second is a std::string

@@ -296,7 +296,7 @@ int64_t OsUtils::getTotalPagingFileSize() {
return total_paging_file_size;
}

std::error_code OsUtils::windowsErrorToErrorCode(DWORD error_code) {
std::error_code OsUtils::windowsErrorToErrorCode(DWORD error_code) noexcept {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No allocations happen here, so it might as well be noexcept

// this is a continuous enum, so we can rely on the array

using METADATA_NAMES = std::vector<std::pair<METADATA, std::string>>;
nonstd::expected<std::string, std::error_code> formatEvent(EVT_HANDLE metadata, EVT_HANDLE event, EVT_FORMAT_MESSAGE_FLAGS flags) noexcept;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EVT_HANDLE is void*, but I couldn't wrap it in gsl::not_null, because it's incompatible with void*. (operator* cannot return a reference to void)

@szaszm szaszm marked this pull request as ready for review November 28, 2024 14:38
@adamdebreceni
Copy link
Contributor

I wouldn't call this premature optimization, this rewrite had a measurable 20% throughput increase: #928 (comment)

@szaszm
Copy link
Member Author

szaszm commented Nov 28, 2024

my bad then. Back to draft, and I'll wait for confirmation to see if this fixes the crash.

@szaszm szaszm marked this pull request as draft November 28, 2024 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants