Splits are fetched by BIDSPath.fpath only when extension (and/or) suffix are missing #712
Comments
Discussion in #692 is relevant I think |
My personal opinion is that we should stop doing these heuristics altogether, but… well, that's just me :) |
Couldn't agree more. Debugging this was a pain. |
@dmalt can you profile a code snippet to demo the unnatural behaviour you observed? |
@agramfort this is what I mean:
from pathlib import Path
from mne_bids import BIDSPath
root_path = Path("bids_root")
subj_path = root_path / "sub-01" / "meg"
subj_path.mkdir(exist_ok=True, parents=True)
bp_template = BIDSPath(
subject="01", root="bids_root", suffix="meg", extension=".fif"
)
print(bp_template.fpath)
# prints bids_root/sub-01/meg/sub-01_meg.fif
# which is what I want to write raw data with raw.save
# now let's emulate situation when the raw file was saved in splits:
(subj_path / "sub-01_split-01_meg.fif").touch(exist_ok=True)
(subj_path / "sub-01_split-02_meg.fif").touch(exist_ok=True)
print(bp_template.fpath)
# still prints bids_root/sub-01/meg/sub-01_meg.fif
# because both suffix and extension are set
# Ideally I would like this to pick the first split
# At the same time
bp_template = BIDSPath(subject="01", root="bids_root", suffix="meg")
print(bp_template.fpath)
# outputs
# bids_root/sub-01/meg/sub-01_split-01_meg.fif
# But this template won't work for saving data since it's missing extension
# Bottom line:
# 1) Can't use the same template for reading and writing data
# 2) Can't guess the output of fpath without looking at source code
# 3) It seems strange that I need suffix and extension to pick splits |
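For illustration, the behavior requested in the snippet above (a fully specified path resolving to its first split when the unsplit file is absent) could be sketched with pathlib alone. `first_split_or_self` is a hypothetical helper, not part of mne-bids, and it assumes zero-padded split indices so that a plain lexicographic sort puts split-01 first.

```python
import tempfile
from pathlib import Path


def first_split_or_self(fpath: Path) -> Path:
    """Return the first existing BIDS-style split of ``fpath``, else ``fpath``.

    Hypothetical helper for illustration only; not an mne-bids function.
    """
    # Nothing to resolve if the unsplit file exists or its folder is missing.
    if fpath.exists() or not fpath.parent.is_dir():
        return fpath
    if "_" not in fpath.stem:
        return fpath
    # sub-01_meg.fif -> sub-01_split-*_meg.fif
    entities, suffix = fpath.stem.rsplit("_", 1)
    pattern = f"{entities}_split-*_{suffix}{fpath.suffix}"
    # Assumes zero-padded indices, so sorting by name orders the splits.
    matches = sorted(fpath.parent.glob(pattern))
    return matches[0] if matches else fpath


# Emulate the situation from the snippet above in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    meg_dir = Path(tmp) / "sub-01" / "meg"
    meg_dir.mkdir(parents=True)
    (meg_dir / "sub-01_split-01_meg.fif").touch()
    (meg_dir / "sub-01_split-02_meg.fif").touch()
    resolved = first_split_or_self(meg_dir / "sub-01_meg.fif")
    print(resolved.name)  # sub-01_split-01_meg.fif
```

With a helper like this, the same fully specified template could serve for both writing and reading, because the split lookup no longer depends on leaving suffix or extension unset.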
if the files are split then bp_template.fpath is indeed I don't understand why you say you need to update your template i've had to deal with split files in a real dataset using |
No, I understand that it's possible to pick up splits of course. It's just that I had a hard time figuring out how to write the code properly for that because it's not mentioned in the docstring that |
And I do care about setting both |
When writing, we return the (new? updated?) BIDSPath. Would using this one help in your case? (I actually haven't tried this out, so just leaving this here) |
You mean with BTW, a kind of explanation about inferring the path there is exactly the thing that would be great to see in |
Ok in this case I'm now confused and need to re-read the entire thread later tonight 😅 |
I'm sorry I confused you :) Probably I'm not being clear enough, so I'll try to reformulate.
Does this make sense? |
OK, got it! Yes, this is clearly a bug! This code block: https://github.com/mne-tools/mne-bids/blob/master/mne_bids/path.py#L423 I'm very open to clarifying what fpath should be doing depending on:
@adam2392 can you comment in to clarify what you expect fpath to return in different situations? It will be the basis of an improved docstring and sorry @dmalt for the slow reaction and the time it took me to understand the issue... |
|
well fif is not part of standard beyond MEG but it's true that we use it in
the study template ....
… |
Interesting thought! |
I think the documentation could definitely be improved. Rn:
The third case is probably where it's potentially weird, and I also don't get what's going on w/ split files cuz I don't use them. So in this case, you might store data as BrainVision, and so you could theoretically leave out the |
how about we disable fpath when root is None?
it will simplify things
… |
I think we currently simply treat |
defaulting to '.' yes would simplify things
… |
Thanks for your replies, everybody! I would like to point out that
In [1]: bp = BIDSPath(root=BIDS_ROOT, subject="01")
In [2]: str(bp) # simply checking what my BIDSPath looks like
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-9-810b7d4e3b46> in <module>
----> 1 str(bp)
~/miniconda3/envs/mne_bids_latest/lib/python3.8/site-packages/mne_bids/path.py in __str__(self)
346 def __str__(self):
347 """Return the string representation of the path."""
--> 348 return str(self.fpath)
349
350 def __repr__(self):
~/miniconda3/envs/mne_bids_latest/lib/python3.8/site-packages/mne_bids/path.py in fpath(self)
463 'BIDS dataset. Please run the BIDS validator on '
464 'your data.')
--> 465 raise RuntimeError(msg)
466 else:
467 bids_fpath = matching_paths[0]
RuntimeError: Found more than one matching data file for the requested recording. Cannot proceed due to the ambiguity. This is likely a problem with your BIDS dataset. Please run the BIDS validator on your data.
Don't you think it might be reasonable to make |
^ I'm in favor of that.
|
Yes, this sounds great |
Is there a reason for such behavior?
I'm asking because I'm using BIDSPath as a kind of template tool for paths in my dataset.
For instance, I like to use the same BIDSPath for writing a file and then loading it at the later processing steps.
The problem occurs when file sizes occasionally exceed the 2 GB limit and have to be split. In this case, writing is still fine with split=None since MNE-Python supports BIDS-style splits when saving raw data. Reading, on the other hand, becomes problematic because I have to either keep track of all the files which happen to have splits and update my templates separately, which is tedious, or use template BIDSPaths with suffix or extension unset, which I didn't realize until looking at the source code and which seems a little random to guess.
mne-bids/mne_bids/path.py
Lines 414 to 446 in 74ac41a
So my question is: can splits be handled irrespective of suffix and extension?
And also, can this fetching behavior of BIDSPath.fpath (and BIDSPath.__str__ for that matter) become more transparent? At least mentioning these rules in the docstrings would make a big difference, I think.
Also maybe adding some flag like fetch_from_filesystem=True to make the behavior more explicit and controllable.
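As a rough sketch of what such a flag might look like, here is a toy path template written with the standard library only. `PathTemplate`, its fields, and the `fetch_from_filesystem` argument are all hypothetical and only illustrate the idea; this is not how BIDSPath is implemented, and the matching rule (first name-sorted candidate) is an assumption.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class PathTemplate:
    """Toy stand-in for a BIDS path template (hypothetical, not mne-bids)."""

    root: Path
    subject: str
    datatype: str = "meg"
    suffix: Optional[str] = None
    extension: Optional[str] = None

    def literal_path(self) -> Path:
        """Build the path from the entities only; never touches the disk."""
        name = f"sub-{self.subject}"
        if self.suffix:
            name += f"_{self.suffix}"
        if self.extension:
            name += self.extension
        return self.root / f"sub-{self.subject}" / self.datatype / name

    def fpath(self, fetch_from_filesystem: bool = True) -> Path:
        """Return the literal path, or the first matching file on disk."""
        literal = self.literal_path()
        if not fetch_from_filesystem or literal.exists():
            return literal
        if not literal.parent.is_dir():
            return literal
        # Same rule whether or not suffix and extension are set.
        candidates = sorted(
            p
            for p in literal.parent.glob(f"sub-{self.subject}_*")
            if (self.suffix is None or p.stem.endswith(f"_{self.suffix}"))
            and (self.extension is None or p.suffix == self.extension)
        )
        return candidates[0] if candidates else literal


with tempfile.TemporaryDirectory() as tmp:
    meg_dir = Path(tmp) / "sub-01" / "meg"
    meg_dir.mkdir(parents=True)
    (meg_dir / "sub-01_split-01_meg.fif").touch()
    (meg_dir / "sub-01_split-02_meg.fif").touch()
    tpl = PathTemplate(Path(tmp), "01", suffix="meg", extension=".fif")
    on_disk = tpl.fpath().name
    literal = tpl.fpath(fetch_from_filesystem=False).name
    print(on_disk)  # sub-01_split-01_meg.fif
    print(literal)  # sub-01_meg.fif
```

The point of the flag in this sketch is that the caller decides whether the disk is consulted, so the result of `fpath` can be predicted without reading the source.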