-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] New MessageLogger configuration syntax #31910
Comments
A new Issue was created by @Dr15Jones Chris Jones. @Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign core |
New categories assigned: core @Dr15Jones,@smuzaffar,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks |
Variant 1Move the destinations Service("MessageLogger",
messageSummaryToJobReport = cms.untracked.bool(False),
generate_preconfiguration_message = cms.untracked.string(), #Intended only for testing
waiting_threshold = cms.untracked.optional.uint32,
modules = cms.untracked.PSet( # all use module labels
suppressInfo = cms.untracked.vstring(),
suppressFwkInfo = cms.untracked.vstring(),
suppressWarning = cms.untracked.vstring(),
suppressDebug = cms.untracked.vstring(),
suppressError = cms.untracked.vstring(),
debugModules = cms.untracked.vstring()
)
cerr = _destination_base_.clone(enabled = cms.untracked.bool(True))
cout = _destination_base_.clone(enabled = cms.untracked.bool(False))
files = cms.untracked.PSet(
allowAnyLabel_ = cms.optional.untracked.PSetTemplate(_destination_file)
)
) An advantage here is it keeps the syntax a bit closer to the original and makes it clear that the other destinations are files. |
Variation 2In the original syntax, there was the option to set default values to be applied to all categories, all destinations or to all categories within a given destination. That could be retained Service("MessageLogger",
...
destinations = cms.untracked.PSet(
defaults = _default
cerr = _destination_base_.clone(enabled = cms.untracked.bool(True))
cout = _destination_base_.clone(enabled = cms.untracked.bool(False))
allowAnyLabel_ = cms.optional.untracked.PSetTemplate(_destination_file)
)
) with #defaults to apply to all destinations and their embedded categories
_default = _destination_base.clone(_category, default = None ) and _destination_base = cms.untracked.PSet(
...
default = _category, #applies to categories, DEBUG, INFO, WARNING and ERROR
...
) I'm not a big fan of this since it makes it more difficult to see exactly what messages will be printed since one needs to check the values of the message expliciltly and the defaults. Also, the same behavior can be accomplished programmatically by the users. |
So, using this new syntax, how would one suppress all INFO messages except for 'MyLabel'? The following does exactly that process.MessageLogger.destinations.cerr.threshold = 'INFO'
process.MessageLogger.destinations.cerr.INFO.limit = 0 #default all INFOs to not write anything
process.MessageLogger.destinations.cerr.categories.MyLabel = dict(limit=-1) #explicitly say to write all of these |
Maybe it's worth considering the most common use cases, and how to make those as simple as possible? I think users usually want one of these cases:
e.g. for SonicTriton tests, I do this: cmssw/HeterogeneousCore/SonicTriton/test/tritonTest_cfg.py Lines 63 to 73 in 9d0ce20
Something like a "MessageLoggerHelper" function with reasonable defaults and clear parameter explanations could help users do common actions without having to keep track of nested PSet structures etc. (We should also make it as easy as possible to activate and view LogDebug messages... this is a frequent source of frustration.) |
NOTE: After testing I did determine that my first guess as to how to keep only 1 particular INFO message was the correct configuration. I've update the previous comment accordingly. |
@kpedro88 process.load('FWCore/MessageService/MessageLogger_cfi')
_cerr = process.MessageLogger.destinations.cerr
_cerr.FwkReport.reportEvery = 500
keep_msgs = [options.producer,options.producer+':TritonClient','TritonClient']
for msg in keep_msgs:
setattr(_cerr, msg, dict(limit = 10000000)) Since I'd propose making the default limit effectivetly infinite, the setattr would then just be setattr(_cerr, msg, dict()) |
That's definitely simpler. I would propose a helper (similar to from FWCore.MessageService.MessageLoggerHelper import MessageLoggerHelper
msgHelper = MessageLoggerHelper(process.MessageLogger)
msgHelper.activate(categories = keep_msgs, dest = 'cerr') (or |
(just to record here: WM and CRAB do not customize MessageLogger configuration) |
I'd add an ability to disable something specifically, not just enable I can't recall a use case to need to control different destinations for logging of outputs of one processing job (since ever on CMS or before). |
I've occasionally done something like this, but in practice, it's also quite simple just to put everything into cout/cerr and then grep for the desired messages. |
fine, I guess having separate destinations is the only way to disentangle Log{Print,Verbatim,Trace}, which have no MSG-generated headers |
A possible use case is given by tests (possibly unit tests) were you want to isolate a subset of categories into a file, but for debugging purposes you may want to optionally enable a few others. For instance if you debug geometry of one subdetector you may not be interested into all the troubles of others, or of unrelated parts of the code. A practical example I have recently played with is https://github.com/cms-sw/cmssw/blob/master/Geometry/MTDGeometryBuilder/test/mtd_cfg.py Grepping files may quickly become very hard, depending on the code executed and/or the number of events. In any case to simplify the analysis of log files I had mad available in CMSSW and old script of mine https://github.com/cms-sw/cmssw/blob/master/FWCore/MessageLogger/scripts/edmMLParser which gives a bit of options to parse output. One part of the MessageLogger that usually looks cumbersome to users is
|
Is there any use case for having both I don't think we actually use it anywhere in production... and every time I send them to different files (e.g. |
Right. I think something that could make it easier would be to have a way to pass to Something like
that would add |
what is the purpose of |
By the way, here is a recent MessageLogger configuration used at HLT: process.MessageLogger = cms.Service("MessageLogger",
categories = cms.untracked.vstring(
'FwkJob',
'FwkReport',
'FwkSummary',
'Root_NoDictionary'),
destinations = cms.untracked.vstring(
'errors',
'warnings',
'infos',
'debugs',
'cout',
'cerr'),
threshold = cms.untracked.string("INFO"),
fwkJobReports = cms.untracked.vstring('FrameworkJobReport'),
FrameworkJobReport = cms.untracked.PSet(
default = cms.untracked.PSet(limit = cms.untracked.int32(0)),
FwkJob = cms.untracked.PSet(limit = cms.untracked.int32(10000000))
),
# not sure what this does
statistics = cms.untracked.vstring('cerr'),
debugModules = cms.untracked.vstring(),
suppressDebug = cms.untracked.vstring(),
suppressInfo = cms.untracked.vstring(),
suppressWarning = cms.untracked.vstring(
'hltOnlineBeamSpot',
'hltCtf3HitL1SeededWithMaterialTracks',
'hltL3MuonsOIState',
'hltPixelTracksForHighMult',
'hltHITPixelTracksHE',
'hltHITPixelTracksHB',
'hltCtfL1SeededWithMaterialTracks',
'hltRegionalTracksForL3MuonIsolation',
'hltSiPixelClusters',
'hltActivityStartUpElectronPixelSeeds',
'hltLightPFTracks',
'hltPixelVertices3DbbPhi',
'hltL3MuonsIOHit',
'hltPixelTracks',
'hltSiPixelDigis',
'hltL3MuonsOIHit',
'hltL1SeededElectronGsfTracks',
'hltL1SeededStartUpElectronPixelSeeds',
'hltBLifetimeRegionalCtfWithMaterialTracksbbPhiL1FastJetFastPV',
'hltCtfActivityWithMaterialTracks'),
suppressError = cms.untracked.vstring(
'hltOnlineBeamSpot',
'hltL3MuonCandidates',
'hltL3TkTracksFromL2OIState',
'hltPFJetCtfWithMaterialTracks',
'hltL3TkTracksFromL2IOHit',
'hltL3TkTracksFromL2OIHit'),
# destinations
errors = cms.untracked.PSet(
placeholder = cms.untracked.bool(True),
threshold = cms.untracked.string("INFO"),
suppressInfo = cms.untracked.vstring(),
suppressWarning = cms.untracked.vstring(),
suppressDebug = cms.untracked.vstring(),
suppressError = cms.untracked.vstring()
),
warnings = cms.untracked.PSet(
placeholder = cms.untracked.bool(True),
threshold = cms.untracked.string("INFO"),
suppressInfo = cms.untracked.vstring(),
suppressWarning = cms.untracked.vstring(),
suppressDebug = cms.untracked.vstring(),
suppressError = cms.untracked.vstring()
),
infos = cms.untracked.PSet(
placeholder = cms.untracked.bool(True),
threshold = cms.untracked.string("INFO"),
Root_NoDictionary = cms.untracked.PSet(
limit = cms.untracked.int32(0)
),
suppressInfo = cms.untracked.vstring(),
suppressWarning = cms.untracked.vstring(),
suppressDebug = cms.untracked.vstring(),
suppressError = cms.untracked.vstring()
),
debugs = cms.untracked.PSet(
placeholder = cms.untracked.bool(True),
threshold = cms.untracked.string("INFO"),
suppressInfo = cms.untracked.vstring(),
suppressWarning = cms.untracked.vstring(),
suppressDebug = cms.untracked.vstring(),
suppressError = cms.untracked.vstring()
),
cout = cms.untracked.PSet(
placeholder = cms.untracked.bool(True)
),
cerr = cms.untracked.PSet(
threshold = cms.untracked.string("INFO"),
suppressInfo = cms.untracked.vstring(),
suppressWarning = cms.untracked.vstring(),
suppressDebug = cms.untracked.vstring(),
suppressError = cms.untracked.vstring(),
noTimeStamps = cms.untracked.bool(False),
default = cms.untracked.PSet(
limit = cms.untracked.int32(10000000)
),
INFO = cms.untracked.PSet(
limit = cms.untracked.int32(0)
),
FwkReport = cms.untracked.PSet(
reportEvery = cms.untracked.int32(1),
limit = cms.untracked.int32(0)
),
Root_NoDictionary = cms.untracked.PSet(
limit = cms.untracked.int32(0)
),
FwkJob = cms.untracked.PSet(
limit = cms.untracked.int32(0)
),
FwkSummary = cms.untracked.PSet(
reportEvery = cms.untracked.int32(1),
limit = cms.untracked.int32(10000000)
),
),
# not sure what this does, either
cerr_stats = cms.untracked.PSet(
optionalPSet = cms.untracked.bool(True),
threshold = cms.untracked.string("WARNING"),
output = cms.untracked.string("cerr"),
)
) As you can see there is a lot of stuff that we should probably remove... unfortunately I think it gets added back because we parse the autogenerated cfi file to begin with :-( |
did I read it correctly that the PSet |
I think they don't actually do anything, because they have
|
You can find cases in CMSSW doing that by |
Unfortunately, the vast majority of use cases configuring the MessageLogger are not production (since that would make it much easier to do changes :) ). Again, doing As for the stack trace, it isn't possible to safely redirect the stack trace output to the MessageLogger as the stack trace happens in a unix signal handler which has restrictions on what one is allowed to call, with dynamic memory handling being one of the forbidden items. |
So by default if one were to write (in the present configuration syntax) process.add_(cms.Service("MessageLogger",
destinations = cms.untracked.vstring("foo"),
foo = cms.untracked.PSet( ... )
) ) Then it would write a file named process.add_(cms.Service("MessageLogger",
destinations = cms.untracked.vstring("foo"),
foo = cms.untracked.PSet( extension = cms.untracked.string("mylog", ...)
) ) Then it would write a file named Just to be clear, I am in no way trying to defend the present interface :), I'm just trying to explain what I've found out about it recently. |
@fwyzard is correct. That parameter tells the MessageLogger to just ignore that destination even though it appears in the All this stuff dates back to the dawn of time when the original developer (who wrote this code for CDF) was still the maintainer of the code. It was pointed out to him that the configuration was difficult to use and his solution was to inject these placeholders under the theory that then people could do simple additions. |
The FWCore.MessageService.MessageLogger_cfi is actually hand written since the present syntax required by the MessageLogger can't be expressed by the ParameterSet validation code. Another reason for this RFC was to change the syntax so we could use the standard validation code. As a first step, the pull request already removes some of the obsolete options from the default logger cfi, so the HLT will inherit that "for free" :). I'd love to get rid of more of those not so useful options but in a way that doesn't break existing configuration in CMSSW. By the way, do you know where the list of HLT module suppressions are being injected? That would be useful in the case we do wind up making an interface change. |
It's maintained in ConfDB, so changing the syntax is not trivial but should be doable. |
Ah, good to know. [note earlier this message said under this RFC nothing would change but that is incorrect] |
By the way - I have a couple of comments / feature requests for the MessageLogger that are not about the syntax used to configure it... I'll mention them here, but I can also open dedicated issues for them :-) Add support for HTML-formatted log files ?It could be something as simple as adding few standard headers, some CSS styling and a Add (optional) colours to the Log messages ?Not the whole message - just the
It could be "disabled" (by default), "auto" (enable only if the output is a console) and "enabled" (explicitly to force it also for files). |
Yes, please do open a dedicated issue for those requests. It makes it easier for us to track specific items. |
It tells the message logger to write is job statistics (that huge tabulated list) to cerr at the end of the job. # not sure what this does, either
cerr_stats = cms.untracked.PSet(
optionalPSet = cms.untracked.bool(True),
threshold = cms.untracked.string("WARNING"),
output = cms.untracked.string("cerr"),
) I think it would be doing the same thing as the other line if |
Variant 3This variant should require the smallest number of files to be modified to still be able to get the goal of
The changes would be as follows
This change could be done in multiple steps to ease migration
|
+1 Variant 3 was implemented in #32138 |
This issue is fully signed and ready to be closed. |
The present configuration syntax used by the MessageLogger service can be confusing to use. Many different parameters often must be changed simultaneously to control the output. Here we propose simplifying some of the options by providing better grouping and avoiding previously required multiple declarations of parameter names. This would entail no changes to the internals of the MessageLogger, just in the way it is configured.
Internally, the MessageLogger has two major concepts: destinations and categories.
A destination sets up where messages are ultimately sent. This could be cout, cerr or to files. Many different options for filtering which messages are ultimately sent are available.
A category is the label passed to the C++ MessageLogger calls. E.g.
edm::LogError("BadFoo")
, "BadFoo" is the category.The MessageLogger also has the ability to filter messages based on the label of the cmsRun module which is running when the message was issued.
The structure of the proposed configuration change is below. {NOTE: all variables beginning with
_
are for illustrative purposes and not for implementation. They avoid duplicating the same description multiple times}with the PSet description for
cout
andcerr
beingand the file destinations having a few extra parameters used to control the file name
The options for the categories are
The text was updated successfully, but these errors were encountered: