Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve order of recording extractor custom properties in add_electrodes() #792

Open
weiglszonja opened this issue Mar 28, 2024 · 11 comments · May be fixed by #793
Open

Preserve order of recording extractor custom properties in add_electrodes() #792

weiglszonja opened this issue Mar 28, 2024 · 11 comments · May be fixed by #793

Comments

@weiglszonja
Copy link
Contributor

weiglszonja commented Mar 28, 2024

I noticed the order of properties that are being added to the Electrodes table is not in the same order as in the recording_extractor._properties, for instance:

recording_extractor._properties

[...., 'electrode_type', 'chamber_type', 'chamber_x', 'chamber_y', 'chamber_z', 'acx_x', 'acx_y', 'acx_z']

The order of properties to add to the Electrodes table is changed at this line:

extracted_properties = set(data_to_add)

which does not preserve the order of the elements:

['electrode_type', 'acx_x', 'chamber_type', 'chamber_z', 'channel_ids', 'acx_z', 'chamber_x', 'acx_y', 'chamber_y', 'offset_to_uV', 'gain_to_uV']

I think we should preserve the order of properties to be the same the recording_extractor._properties.

Opened #793 as draft to discuss

@weiglszonja weiglszonja changed the title Preserve order of recording extractor properties in add_electrodes() Preserve order of recording extractor custom properties in add_electrodes() Mar 28, 2024
@h-mayorquin
Copy link
Collaborator

Why should we keep the order?

@CodyCBakerPhD
Copy link
Member

@h-mayorquin Because it doesn't look as good as it could in Neurosift

In Szonja's example of : ['electrode_type', 'acx_x', 'chamber_type', 'chamber_z', 'channel_ids', 'acx_z', 'chamber_x', 'acx_y', 'chamber_y', 'offset_to_uV', 'gain_to_uV'], wouldn't it look much better if the acx_ were kept together?

(BTW @weiglszonja I'd expand the acx to something more readable - I'm assuming this is acceleration?)

@h-mayorquin
Copy link
Collaborator

If that's the case maybe an easier solution is to have a predetermined order for the special ones (like location) and then write the rest of them with natsort.

I Just thinking that bounding ourselves to followg spikeinterface order will bring us problems. Plus, it is even more problematic when you consider more than one recording object.

@CodyCBakerPhD
Copy link
Member

@h-mayorquin See discussion on related PR for proposed approach: #793

@weiglszonja
Copy link
Contributor Author

(BTW @weiglszonja I'd expand the acx to something more readable - I'm assuming this is acceleration?)

I'm following the lab's notation here, acx_ are the coordinates relative to the crossing of anterior commissure (I added a description so it's clear when looking at the table).

Plus, it is even more problematic when you consider more than one recording object.

Yes, I'm not sure how to solve that one in #793 yet...

@CodyCBakerPhD
Copy link
Member

Yes, I'm not sure how to solve that one in #793 yet...

In principle the approach outlined in #793 (comment) should handle it because the metadata columns are already global across interfaces

@h-mayorquin
Copy link
Collaborator

My first intuition is that the complexity is not worth the effort. For producing better looking output I think a hard-coded rule for some properties and then sorting the rest should ease most of the ugliness.

I am strongly against:

  • Make a distinction between things that were added automatically to the recording vs custom properties: There is no clean way to do this and you will add further coupling between neuroconv and SI which I think should be kept at bay.
  • Expose the option to the users to write in the order they want: this will be hard to document and tests as the possibilities are very wide. Besides,what would be the gain? df[an_order_of_the_columns] and df[another_order_of_the_columns] gives users of the electrode table all the flexibility of presentation they need.

@CodyCBakerPhD
Copy link
Member

Did you read the PR conversation?

Using the already existing metadata["Ecephys"]["Electrodes"] list isn't adding any complexity - all those columns will already be written - and it's already a List[dict] so why not just rely on that order as a loose mechanism for granting priority when constructing the table? It's also already global. It also abstracts from the concept of what was automatically added vs. added in a custom fashion

This is also a feature that could play a role in the GUIDE, which allows even easier customization of electrode table content than we've ever had

@h-mayorquin
Copy link
Collaborator

Yes, I did, I had this proposal in mind:

Using the already existing metadata["Ecephys"]["Electrodes"] list isn't adding any complexity - all those columns will already be written - and it's already a List[dict] so why not just rely on that order as a loose mechanism for granting priority when constructing the table? It's also already global. It also abstracts from the concept of what was automatically added vs. added in a custom fashion

The complexity added would be the modifcation to make the rest of the function to actually work with that. Then documenting and testing the different scenarios (multiple recordings, recordings that match only in some channels, etc) and the error handling. You already mentioned that all the properties should be there.

Probably our difference here is that I Just don't see much value on ordering columns. The example you gave seems like a minor aesthethical thing that can be fixed with two lines (one for defining the order of the special properties) and one for sorting the rest. What would further customization be used for?

https://c2.com/xp/YouArentGonnaNeedIt.html

@CodyCBakerPhD
Copy link
Member

natsort is not magical, and does not grant latent precedence to items that might be considered 'more important' or more 'semantically related'

This can be documented with a single line statement: "The order of entries in the ["Ecephys"]["Electrodes"] list provides a precedence for column order in the final electrodes table."

More tests is a good thing, might even catch other issues with tangentially related things

https://c2.com/xp/YouArentGonnaNeedIt.html

Which is why this feature wasn't in the code base before now, but is requested now, so we're adding it now

@h-mayorquin
Copy link
Collaborator

natsort is not magical, and does not grant latent precedence to items that might be considered 'more important' or more 'semantically related'

I think it would be more fruitful if we had some concrete use cases on mind. The example that you mentioned above would be work with natsort.

I am coming from the prior that determining the order of columns is something that does not add too much. But if the code, tests and documentations are that simple I guess the implementation PR should showcase it.

How would you grant configurability to other tables? Units for example?

Just to be clear, is @weiglszonja use case the same as yours? To make the electrode_table look better?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants