Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

factorize interpretation #106

Open
wants to merge 60 commits into
base: main
Choose a base branch
from

Conversation

aminnj
Copy link
Member

@aminnj aminnj commented Sep 15, 2021

Split up the large interped_data into smaller chunks for non-, singly-, doubly-jagged branches in case we want to cleanly specialize further (like for Bool).

Replaces #95 .

@codecov
Copy link

codecov bot commented Sep 15, 2021

Codecov Report

Merging #106 (e608734) into master (b32d2cb) will decrease coverage by 0.24%.
The diff coverage is 96.15%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #106      +/-   ##
==========================================
- Coverage   89.84%   89.59%   -0.25%     
==========================================
  Files          11       11              
  Lines        1319     1326       +7     
==========================================
+ Hits         1185     1188       +3     
- Misses        134      138       +4     
Impacted Files Coverage Δ
src/custom.jl 93.82% <75.00%> (-4.88%) ⬇️
src/root.jl 91.11% <100.00%> (+0.12%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b32d2cb...e608734. Read the comment docs.

@tamasgal
Copy link
Member

OK thanks! I just run some random script, just to make sure everything is fine ;)

@tamasgal
Copy link
Member

Alright, it works nicely!

I am just wondering how we should deal with code repetitions in such cases:

function interped_data(rawdata, rawoffsets, ::Type{Vector{_KM3NETDAQHit}}, ::Type{Nojagg})
    UnROOT.splitup(rawdata, rawoffsets, _KM3NETDAQHit, skipbytes=10)
end
function interped_data(rawdata, rawoffsets, ::Type{Vector{_KM3NETDAQHit}}, ::Type{Offsetjagg})
    UnROOT.splitup(rawdata, rawoffsets, _KM3NETDAQHit, skipbytes=10)
end

@aminnj
Copy link
Member Author

aminnj commented Sep 15, 2021

I had originally tried to make a Union like this

-function interped_data(rawdata, rawoffsets, ::Type{Vector{_KM3NETDAQHit}}, ::Type{Nojagg})
-    UnROOT.splitup(rawdata, rawoffsets, _KM3NETDAQHit, skipbytes=10)
-end
-function interped_data(rawdata, rawoffsets, ::Type{Vector{_KM3NETDAQHit}}, ::Type{Offsetjagg})
-    UnROOT.splitup(rawdata, rawoffsets, _KM3NETDAQHit, skipbytes=10)
-end
+function interped_data(rawdata, rawoffsets, ::Type{Vector{_KM3NETDAQHit}}, ::Type{Union{Nojagg,Offsetjagg}})

but I saw tons of unit test failures like this

  ArgumentError: cannot reinterpret `UInt8` as `Vector{UnROOT._KM3NETDAQHit}`, type `Vector{UnROOT._KM3NETDAQHit}` is not a bits type
  Stacktrace:
   [1] (::Base.var"#throwbits#241")(S::Type, T::Type, U::Type)
     @ Base ./reinterpretarray.jl:16
   [2] reinterpret(#unused#::Type{Vector{UnROOT._KM3NETDAQHit}}, a::Vector{UInt8})
     @ Base ./reinterpretarray.jl:36
   [3] interped_data(rawdata::Vector{UInt8}, rawoffsets::Vector{Int32}, #unused#::Type{Vector{UnROOT._KM3NETDAQHit}}, #unused#::Type{UnROOT.Nojagg})
     @ UnROOT ~/.julia/dev/UnROOT/src/root.jl:201

so the interped_data in root.jl was called instead of the one in custom.jl, presumably because the JaggType parameter is more specific in root.jl. Any suggestions?

@tamasgal
Copy link
Member

Hmm... maybe some more fine-graded hierarchy for the JaggType could do it, but I kind of struggle to group them in a logical way in the sense that I have no idea what would be the supertype of Nojagg and OffsetJagg. It's a bit annoying that the Union loses the dispatch-battle.

@aminnj
Copy link
Member Author

aminnj commented Sep 15, 2021

We don't want this right? :)

for J in (:Nojagg,:Offsetjagg)
    T = _KM3NETDAQHit # can loop over this to get the other struct too
    @eval function UnROOT.interped_data(rawdata, rawoffsets, ::Type{Vector{$T}}, ::Type{$J})
        UnROOT.splitup(rawdata, rawoffsets, $T, skipbytes=10)
    end
 end

@tamasgal
Copy link
Member

I don't know, probably not 😆

@Moelf
Copy link
Member

Moelf commented Sep 15, 2021

maybe it's time to figure out a more sane customization interface

@aminnj
Copy link
Member Author

aminnj commented Sep 15, 2021

I'm also not opposed to keeping the main interped_data as in master (as long as constant propagation compiles the right part of the if-else chain, which is hopefully the case...)

Moelf and others added 26 commits June 23, 2022 12:06
* Use two threads

* Remove Julia 1.3

* Oh, we still support Julia 1.3, readding

* Simplify CI and fix threading in Julia 1.3
* Improve thread handling in tests

* Test threading
* fix stacktrace too long

* fix stacktrace too long try 2

* fix stacktrace too long try 3

* fix before 1.6
…e async, add remote (HTTP) source, (JuliaHEP#154)


* reduce basket reading io from 4 -> 1

* use more OffsetBuffer when appropriate

* bump compatibility

* add HTTP source support

* Mmap based local file

* async chunking

* better error

* 100% lock-free

* scitoken ready

* faster display branch

* backport upgrade
* fix windows `id -u`

* remove hacky precompile

* fix CI
bump version
* Fix error when file is not existent

* Use SystemError instead of general error

* Add test for non-existent-file opening
* Add samples for issue 165

* fix C-stype array

* fix io

Co-authored-by: Jerry Ling <[email protected]>
bump version
* normalize branch alias

* touch up docs [skipci]
* xrootd
* bump Julia requirement to >=1.6
* Add skeleton

* Add open journal PDF generator GHA

* Fix path to source file

* Set output path

* summary and statement of need

* Minor cosmetics

* Add comparison intro and uproot

* Add uproot reference

* add functionality

* add performance citation

* small additions

* for ci

* Cite pandas and numpy

* Add ROOT citation

* Cleanup and some additions

* Minor updates and additions

* Update references

* Cleanup and additions

* Update conclusions with Jerry's comments

* Fix typo

* Add NanoAOD refernce

* Add reference to CMS/NanoAOD

* Fix pandas reference

* Add spaces between words and references

* Fix typo

Co-authored-by: Jerry Ling <[email protected]>
Co-authored-by: Nick Amin <[email protected]>
@Moelf
Copy link
Member

Moelf commented Jun 23, 2022

ehh, as always I fucked up the rebase

@codecov
Copy link

codecov bot commented Jun 23, 2022

Codecov Report

Merging #106 (d9fd45a) into master (527f04b) will decrease coverage by 0.20%.
The diff coverage is 96.15%.

@@            Coverage Diff             @@
##           master     #106      +/-   ##
==========================================
- Coverage   90.15%   89.95%   -0.21%     
==========================================
  Files          11       11              
  Lines        1605     1612       +7     
==========================================
+ Hits         1447     1450       +3     
- Misses        158      162       +4     
Impacted Files Coverage Δ
src/custom.jl 93.87% <75.00%> (-4.00%) ⬇️
src/root.jl 94.40% <100.00%> (+0.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 527f04b...d9fd45a. Read the comment docs.

@Moelf Moelf requested a review from tamasgal June 23, 2022 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants