-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix eager dtype conversion of value column #1353
Conversation
circleCI failure is a bit mysterious (failure to cross-ref |
OK, the cross-ref failure looks like sphinx-doc/sphinx#13178; sadly I can't replicate the |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1353 +/- ##
==========================================
- Coverage 97.43% 97.42% -0.02%
==========================================
Files 40 40
Lines 8903 8918 +15
==========================================
+ Hits 8675 8688 +13
- Misses 228 230 +2 ☔ View full report in Codecov by Sentry. |
ok was able to reproduce the twine failure, and found a workaround here: astral-sh/rye#1446 (comment) TL;DR some kind of mismatch between twine and hatchling, related to metadata versions. |
Thanks for looking into this, @drammock |
Hi @drammock To understand the original issue I actually had to create a test myself that would be caught by the try-catches added in this PR, so maybe it is indeed worth adding a test? Something like this to # Test with only a `value` column, but where the values in this `value` column don't
# meet our assumption that they can be coerced to integers/floats.
events = {
"onset": [10, 15],
"duration": [1, 1],
"value": ["Condition 1", "Condition 2"],
}
events_fname = tmp_path / "bids6" / "sub-01_task-test_events.tsv"
events_fname.parent.mkdir()
_to_tsv(events, events_fname)
raw, event_id = _handle_events_reading(events_fname, raw)
ev_arr, ev_dict = mne.events_from_annotations(raw, event_id=event_id)
# EDIT: test below will pass after https://github.com/mne-tools/mne-bids/pull/1353#discussion_r1894777877
# In this edge case, there were no "event IDs" in the `value` column of the
# events.tsv, so we created event IDs based on the row index of those values.
# And The values of this `value` column were used as annotation descripitons.
assert {"Condition 1": 0, "Condition 2": 1} == event_id == ev_dict
assert raw.annotations.description.tolist() == ["Condition 1", "Condition 2"] This test creates an events TSV file like:
Where
And AFAICT, from the User's perspective:
EDIT: I updated the proposed test after my suggestions in |
On second thought...
In this case, maybe we should either
|
try: | ||
culled_vals = np.asarray(culled_vals, dtype=float) | ||
except ValueError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass | |
# If the values are not numeric, just use the row index as the event ID | |
culled_vals = np.arange(len(culled_vals)) |
@scott-huberty could you please push your test to this PR? For now without changing the behavior as in your suggestions. I would rather merge this one sooner than later as it contains two important fixes / workarounds. We can then discuss your suggested changes in a new issue if you like. |
sorry @sappelhoff Since I am not an MNE maintainer I don't think that I have the push rights needed to do that..(?) |
You can open a PR into my fork+branch. |
never mind @scott-huberty, I pushed the (simplified) test. I gave some thought to your suggestion about setting the event_id to use row numbers instead of the (string) values in the values column. On one hand I think it makes sense (so that it would be possible to actually use that event_id dict in mne-python functions) but I also decided it's not crucial to get it in to this PR, because in the case where the events.tsv value column contains strings, the user won't be expecting to be able to recover an original mapping of eventID integers (because their value column wasn't written by MNE-BIDS in the first place). The other reason I didn't try to add it in here, is that I think it's not ideal to just assign eventID integers from row numbers --- I think you'd want to take (sorted?) unique values and then assign sequential integers starting from 1 probably. Otherwise the event_id mapping changes depending on things like randomized trial presentation order, meaning different subjects could get different mappings. Or, to be really safe, somehow create integers from the hashes of the keys; that would be safest (though perhaps too annoying to be worth it --- especially given that I don't expect users who are in this particular corner case to even want the event_id back). |
Thanks! |
Follow-up to #1349, see https://github.com/mne-tools/mne-bids/pull/1349/files#r1885104885
PR Description
Basically: use defensive programming when trying to dtype-convert the
value
column fromevents.tsv
. I was a bit worried that a column calledvalue
could easily exist in real datasets and not conform to the way that column gets written by MNE-BIDS, and it didn't take long for @larsoner to find one :)The bug was already caught by a downstream test in MNE-BIDS-Pipeline, so I didn't bother adding a new test here, but LMK if you want me to add a simple one.
Merge checklist
Maintainer, please confirm the following before merging.
If applicable: