Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Analysis Tree a TTree instead of a derived class #478

Open
lobis opened this issue Sep 20, 2023 · 7 comments
Open

Make Analysis Tree a TTree instead of a derived class #478

lobis opened this issue Sep 20, 2023 · 7 comments
Assignees

Comments

@lobis
Copy link
Member

lobis commented Sep 20, 2023

(Apologies if this was present already in another issue, I could not find it).

Currently the TRestAnalysisTree class is derived from TTree. This has some implications, for example you cannot read the data from the tree if you have not loaded the REST dictionaries.

The analysis tree by design contains simple branches (numbers, basic containers such as vector of numbers, etc.). If this class was not derived from TTree and and was a plain TTree instead (with the information currently stored as members of the derived class stored instead in the tree branches), the compatibility would be significantly increased.

I think I raised this concern offline in some ocasion after trying (unsuccessfully) to open REST files with uproot (scikit-hep/uproot5#936) but after speaking with @cmargalejo and @KonradAltenmueller I noticed some of their analysis processes also had to explicitly load the REST dictionaries just because of this, otherwise they could have been REST-agnostic: analyze REST data with plain ROOT macros, which would make sharing data easier.

Glancing at the code there doesn't seem to be any problem with this refactoring other than the work involved, so please if anyone has some concerns let me know before I attempt to make this change.

@lobis lobis self-assigned this Sep 20, 2023
@jgalan
Copy link
Member

jgalan commented Sep 20, 2023

What happens if we want some of the methods that have been implemented in TRestAnalysisTree?

@lobis
Copy link
Member Author

lobis commented Sep 20, 2023

What happens if we want some of the methods that have been implemented in TRestAnalysisTree?

I guess we could maintain the class TRestAnalysisTree but store the Analysis Tree as a TTree. If we want to use those methods we could dynamically cast the Analysis Tree from a TTree* into a TRestAnalysisTree and we would have access to these methods.

Doing this we would lose access to the stored members from the TRestAnalysisTree (we could stored them in the TTree as dedicated branch).

@jgalan
Copy link
Member

jgalan commented Sep 20, 2023

Perhaps we could just implement a TRestAnalysisTree::Export method that creates a raw TTree, that can be further read using uproot.

Anyhow I think the philosophy should be that we build a TRestDataSet, the tree available in that class is already a raw TTree, which is accessible using TRestDataSet::GetTree.

@lobis
Copy link
Member Author

lobis commented Sep 20, 2023

My point is that the files created by REST should be as compatible as possible, without limiting ourselves. The event tree cannot be easily made compatible by outsiders to rest because of its complex structure so it makes sense to require dictionaries (however it can be technically read by third-party programs such as uproot because it's a plain TTree).

The Analysis Tree could be 100% compatible for external programs by just making this change. I don't think we should explicitly need to export the tree into a compatible format when we can make it compatible ourselves.

I think the functionality implemented in the derived class (TRestAnalysisTree) can be implemented using dynamic casting and the TTree UserInfo functionality.

I propose something like this:

  • Create a new class to hold information currently present in the derived class (e.g. std::vector<TString> fObservableNames;). This class should inherit from TObject.
  • Store this into the UserInfo of the AnalysisTree (now a plain TTree).
  • TRestAnalysisTree class is kept. An AnalysisTree present in the root file is casted into TRestAnalysisTree and then a new method TRestAnalysisTree::LoadFromUserInfo() is invoked, which loads the data from the user info into the fields of the TRestAnalysisTree.

REST users will always interact with a TRestAnalysisTree which be accessible from the TRestRun as it is now, just that it will be stored on disk as a plain TTree.

I think this should cover all functionality while maintaning backwards compatility reasonably well, what do you think?

@jgalan
Copy link
Member

jgalan commented Sep 20, 2023

they could have been REST-agnostic: analyze REST data with plain ROOT macros, which would make sharing data easier.

We could always share data by exporting a TRestDataSet but promoting ways to avoid using the framework will contribute to degrade the framework development, in my opinion.

@jgalan
Copy link
Member

jgalan commented Sep 20, 2023

My point is that the files created by REST should be as compatible as possible, without limiting ourselves. The event tree cannot be easily made compatible by outsiders to rest because of its complex structure so it makes sense to require dictionaries (however it can be technically read by third-party programs such as uproot because it's a plain TTree).

The Analysis Tree could be 100% compatible for external programs by just making this change. I don't think we should explicitly need to export the tree into a compatible format when we can make it compatible ourselves.

I think the functionality implemented in the derived class (TRestAnalysisTree) can be implemented using dynamic casting and the TTree UserInfo functionality.

I propose something like this:

  • Create a new class to hold information currently present in the derived class (e.g. std::vector<TString> fObservableNames;). This class should inherit from TObject.
  • Store this into the UserInfo of the AnalysisTree (now a plain TTree).
  • TRestAnalysisTree class is kept. An AnalysisTree present in the root file is casted into TRestAnalysisTree and then a new method TRestAnalysisTree::LoadFromUserInfo() is invoked, which loads the data from the user info into the fields of the TRestAnalysisTree.

REST users will always interact with a TRestAnalysisTree which be accessible from the TRestRun as it is now, just that it will be stored on disk as a plain TTree.

I think this should cover all functionality while maintaning backwards compatility reasonably well, what do you think?

In principle sounds good. Just afraid of too much freedom produces divergences.

@lobis
Copy link
Member Author

lobis commented Sep 20, 2023

they could have been REST-agnostic: analyze REST data with plain ROOT macros, which would make sharing data easier.

We could always share data by exporting a TRestDataSet but promoting ways to avoid using the framework will contribute to degrade the framework development, in my opinion.

This is of course a matter of opinion but the point could also be made that this would not contribute to the degradation of the framework but instead reduce the scope of the framework which would in turn mean resources can be focused on improving in-scope functionalities.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

When branches are created from issues, their pull requests are automatically linked.

2 participants