This project provides a functional style of analysis for CERN ROOT users. See the introduction in Git Pages.
The code has these basic elements:
- streamlined ROOT tree access layer
- histograms handling
- two functional container implementations, an eager and a lazy one; the latter should typically be used
- smaller utilities: weight handling, configuration, diagnostics. Examples and unit tests are also included.
Helper to generate a pure generic lambda expression. Example:
auto f = F( _*_);
// is identical to
auto f = [](auto _) { return _*_; };
*Notice: the name F is short enough to cause a naming clash. In such a case, rename it or remove it in favour of macros specific to the types you deal with.*
Like F, except it is a closure (it can access variables from the enclosing scope by reference).
double x = 88;
auto f = C( _* x);
// is identical to
auto f = [&](auto _) { return _*x; };
Like C, but without returning any value (sometimes needed when no value can be returned).
double x = 0;
auto f = S( x+=_);
// is identical to
auto f = [&](auto _) { x += _; };
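The three helpers can be pictured as thin macros over generic lambdas. The sketch below is an assumption derived only from the equivalences shown above, not the library's actual definitions; the names SKETCH_F, SKETCH_C and SKETCH_S, as well as the small wrapper functions, are hypothetical and chosen so that they do not clash with the real macros.

```cpp
// Hypothetical stand-ins for F, C and S (assumed, not the library's code).
#define SKETCH_F(CODE) [](auto _) { return CODE; }    // pure lambda, captures nothing
#define SKETCH_C(CODE) [&](auto _) { return CODE; }   // closure, captures by reference
#define SKETCH_S(CODE) [&](auto _) { CODE; }          // subroutine, returns nothing

// Squares its argument using the pure helper.
inline double square(double v) {
    auto f = SKETCH_F(_ * _);
    return f(v);
}

// Scales its argument by a local variable captured by reference.
inline double scale(double v, double factor) {
    auto f = SKETCH_C(_ * factor);
    return f(v);
}

// Adds every argument to a running total through a side effect.
inline double sum3(double a, double b, double c) {
    double total = 0;
    auto f = SKETCH_S(total += _);
    f(a);
    f(b);
    f(c);
    return total;
}
```

An expression containing a top-level comma would be split into several macro arguments, which is one more reason to fall back to a hand-written lambda for anything non-trivial.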
Prints the content to std::cout (if it is printable). Example:
data.inspect(PRINT).filter(...).inspect(PRINT).map(...)...
An extension of std::pair. Contains an additional field, third.
The containers are named "vectors", but in fact they can wrap other stl containers too.
Produces an eager container.
Produces lazy containers. lazy_view does not cause a copy.
std::vector<MyData> d = ...; // data available in scope
auto dview = lazy_view(d); // this is the best option in this case
auto oneel = one_own(7); // produces a container with a single element
Exposes elements after transforming them.
data.map( F(_*_) ) // produces squared elements
Exposes only the elements that satisfy the predicate.
data.filter(F(_<0)).count();
Takes a subroutine (a function that returns nothing) and evaluates it on every element. Typically used for side effects, for example printing.
data.foreach(S( std::cout << _ << " "));
Like foreach, but evaluates the subroutine lazily.
Returns true if the container is empty.
Returns the number of elements in the container. For a lazy container this may require traversing the container (because of intermediate filtering).
Returns the number of elements satisfying the predicate.
const size_t n_elements_above0 = data.count(F( _>0 ));
// a slightly more efficient version of
data.filter(F(_>0)).size();
Returns true if there is at least one element satisfying the predicate.
const bool any_elements_above0 = data.contains(F( _>0 ));
// a slightly more efficient version of
not data.filter(F(_>0)).empty();
Returns true if all elements satisfy the predicate.
Notice: false is always returned for an empty container.
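The empty-container rule is worth spelling out because it differs from std::all_of, which returns true for an empty range. Below is a minimal sketch of the documented convention on a plain std::vector; the helper name all_or_false_if_empty is hypothetical and the code does not use the library's containers.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper mirroring the documented rule: an empty input yields false.
template <typename T, typename Pred>
bool all_or_false_if_empty(const std::vector<T>& data, Pred pred) {
    if (data.empty()) return false;                       // documented special case
    return std::all_of(data.begin(), data.end(), pred);   // usual "for all" check
}
```

Code that relies on the mathematical convention (a condition over an empty set holds) therefore needs an explicit emptiness check before calling the container method.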
Produces a container of elements sorted in ascending order by a property extracted by the provided function.
// assume data is a collection of doubles
data.sort() // the doubles in ascending order
data.sort( F( -_)) // sort in descending order
Notice: for a lazy container the sorting involves making a temporary lightweight copy of references.
It may be a good idea to use cache/stage after sorting.
Notice: the name of this method was changed from "sorted" to be consistent with other names (i.e. reverse, not reversed).
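Sorting by an extracted key can be illustrated with the standard library alone. The sketch below is not the library's implementation: sorted_by is a hypothetical helper working on a std::vector copy, and the use of std::stable_sort is an assumption made here for predictability.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns a copy sorted in ascending order of key(element).
template <typename T, typename Key>
std::vector<T> sorted_by(std::vector<T> data, Key key) {
    std::stable_sort(data.begin(), data.end(),
                     [&](const T& a, const T& b) { return key(a) < key(b); });
    return data;
}
```

Passing a key that negates the value, as in the descending example above, reverses the order without needing a separate comparator.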
Takes elements from the first n elements of the container, stepping through them by stride.
// say the data contains letters A, B, C, D, E, F, ...
data.take(6,2) // results in A C E
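The semantics suggested by the example (look at the first n elements, keep every stride-th one) can be sketched as follows. This is an interpretation inferred from the A C E result, not the library's code; take_strided is a hypothetical name, and the guard against a zero stride is an addition of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: keeps every stride-th element among the first n elements.
template <typename T>
std::vector<T> take_strided(const std::vector<T>& data, std::size_t n, std::size_t stride = 1) {
    std::vector<T> out;
    if (stride == 0) return out;                          // sketch-only guard against an endless loop
    const std::size_t limit = std::min(n, data.size());   // never read past the end
    for (std::size_t i = 0; i < limit; i += stride) out.push_back(data[i]);
    return out;
}
```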
Similar to take, but skips the first n elements.
Similar to take/skip, but takes or skips the leading elements that satisfy the predicate.
For a lazy container, makes an intermediate container. More on the [Project page](https://tboldagh.github.io/FunRootAna/).
Notice: for an eager container this is a no-op.
Produces a container with the elements in reverse order.
Converts the contained objects to pointers or references. If the objects are already of the desired type, a compilation error is emitted.
Returns a single-element (or empty) container with the element that is extreme according to the value returned by the keyextractor function.
// assume data is: -7, -1, 2, 2, 6, -3
data.max() // is a single-element container with 6
data.min( F(std::abs(_))) // is a single-element container with -1
Produces a container that is the concatenation of the two operands.
data1.chain(data2) // just the concatenation of the two
data1.chain(data1) // repeated container (no actual copies are involved)
data1.chain(data2).chain(data3).chain(data4) // concatenation of 4 containers
Chaining is allowed for containers of objects from the same hierarchy, e.g.
class Base { public: virtual int x() const = 0; };
class DerivedA : public Base {...};
class DerivedB : public Base {...};
// assume container a has objects of type DerivedA, and b objects of type DerivedB
a.chain<Base>(b).filter(F(_.x() > 0)).map(F(_.x())) >> HIST1...
// allows for processing the data from both containers using interface Base
Produces pairs of elements from operands.
Notice: the containers can be of different lengths. Pairs are generated until the shorter one is exhausted.
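The truncation to the shorter operand can be sketched eagerly on std::vector. The helper zip_shortest is hypothetical and, unlike a lazy container, it copies the elements into the resulting pairs.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: pairs elements position by position, stopping at the shorter input.
template <typename A, typename B>
std::vector<std::pair<A, B>> zip_shortest(const std::vector<A>& a, const std::vector<B>& b) {
    std::vector<std::pair<A, B>> out;
    const std::size_t n = std::min(a.size(), b.size());
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) out.emplace_back(a[i], b[i]);
    return out;
}
```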
Compares containers element by element using the comparator and returns true if all comparisons are positive. Internally uses zip, so the same limitation applies.
// assume data1 is A B C D E F
// and data2 is alec, ben, charles
data1.is_same(data2, F(std::tolower(_.first) == _.second[0])) // returns true because the lowercased letters of data1 match the first letters of the names in data2
The version without the comparator compares elements with their == operator.
Groups elements of the container into sub-containers of size n.
// assume data is A B C D E F
data.group(3) // is A B C and then D E F
data.group(3, 1) // is A B then B C, then C D ...
Forms a container of all pairs that can be formed from the two operands.
// assume data1 is A B C
// and data2 is 1 2
data1.cartesian(data2) // is A1, A2, B1, B2, C1, C2
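The ordering in the example (A1, A2, B1, ...) corresponds to an outer loop over the first operand and an inner loop over the second. A minimal eager sketch follows; cartesian_pairs is a hypothetical helper, not the library's method.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: every (a, b) pair, with the first operand varying slowest.
template <typename A, typename B>
std::vector<std::pair<A, B>> cartesian_pairs(const std::vector<A>& first,
                                             const std::vector<B>& second) {
    std::vector<std::pair<A, B>> out;
    out.reserve(first.size() * second.size());
    for (const A& a : first)
        for (const B& b : second)
            out.emplace_back(a, b);
    return out;
}
```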
Sums the elements in the container. If the transform is provided, it is identical to mapping and then summing.
Standard reduction operation.
// assume the data is 1 2 3 4 5
data.accumulate( []( auto total, auto el){ return total*el; }, 1) // is 1*2*3*4*5
Notice: this is a very versatile operation and can in fact be used to implement almost all (non-lazy) operations of the container.
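The same reduction can be written with std::accumulate, which also illustrates how another operation (counting) reduces to it. The functions product and count_positive are hypothetical helpers on std::vector; note that std::accumulate takes the initial value before the binary function, the reverse of the call shown above.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Product of all elements: the reduction from the example above.
inline long product(const std::vector<long>& data) {
    return std::accumulate(data.begin(), data.end(), 1L,
                           [](long total, long el) { return total * el; });
}

// Counting expressed as a reduction: add 1 for each element above zero.
inline std::size_t count_positive(const std::vector<long>& data) {
    return std::accumulate(data.begin(), data.end(), std::size_t{0},
                           [](std::size_t total, long el) { return total + (el > 0 ? 1 : 0); });
}
```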
Calculates statistical properties (count/mean/variance). The transform is like applying a map before collecting the stats.
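A single pass collecting count, mean and variance after a transform might look like the sketch below. The names StatSketch and collect_stat are hypothetical, and the variance is assumed here to be the population variance (mean of squares minus squared mean); the library may use a different estimator.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type for the sketch.
struct StatSketch {
    std::size_t count;
    double mean;
    double variance;  // population variance, an assumption of this sketch
};

// Applies transform to every element and accumulates the running sums.
template <typename T, typename Transform>
StatSketch collect_stat(const std::vector<T>& data, Transform transform) {
    StatSketch s{0, 0.0, 0.0};
    double sum = 0.0;
    double sum2 = 0.0;
    for (const T& el : data) {
        const double v = transform(el);
        sum += v;
        sum2 += v * v;
        ++s.count;
    }
    if (s.count == 0) return s;  // nothing collected, leave zeros
    s.mean = sum / s.count;
    s.variance = sum2 / s.count - s.mean * s.mean;
    return s;
}
```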
Returns the element at an index. This operation can involve traversing the container and should in general be avoided. The result is returned in a std::optional that needs to be checked for content, i.e. it is perfectly correct to ask for an element beyond the size of the container; the result will be an empty optional.
Identical to element_at(0). Handy with min/max.
Returns the first element satisfying the predicate. There may be none, so an optional is returned.
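The optional-returning access pattern can be sketched with C++17 std::optional on a std::vector. The helpers element_at_sketch and first_of_sketch are hypothetical and return copies; whether the library returns copies or references is not assumed here.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: the element at index, or an empty optional when out of range.
template <typename T>
std::optional<T> element_at_sketch(const std::vector<T>& data, std::size_t index) {
    if (index >= data.size()) return std::nullopt;
    return data[index];
}

// Hypothetical helper: the first element satisfying pred, or an empty optional.
template <typename T, typename Pred>
std::optional<T> first_of_sketch(const std::vector<T>& data, Pred pred) {
    for (const T& el : data) {
        if (pred(el)) return el;
    }
    return std::nullopt;
}
```

Callers are expected to test the result (for example with has_value() or value_or()) before using it.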
Inserts data into standard containers.
Generates an infinite sequence of double precision values:
c, c*r, c*r^2, c*r^3, ...
Generates an infinite sequence of values:
c, c+i, c+i+i, c+i+i+i, ...
Notice: the type of c and i can be any type that allows addition, e.g. strings, integers, doubles, complex numbers etc.
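Because only addition is required, the same generator works for numbers and for strings. The sketch below materializes the first n terms eagerly; arithmetic_prefix is a hypothetical helper, whereas the stream described above is infinite and lazy.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: the first n terms of c, c+i, c+i+i, ...
template <typename T>
std::vector<T> arithmetic_prefix(T c, const T& i, std::size_t n) {
    std::vector<T> out;
    out.reserve(n);
    for (std::size_t k = 0; k < n; ++k) {
        out.push_back(c);
        c = c + i;  // only operator+ is required of T
    }
    return out;
}
```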
Like the arithmetic stream but with an increment of 1.
Random integers drawn using the standard C random function.
Finite stream of numbers begin, begin+stride, begin+stride*2, begin+stride*3, ... continuing while the value is not greater than end.
The HistHelper class should be inherited by the analysis code in order to profit from the functionality.
The public API of the histogramming consists of:
Function saving histograms in the file.
Create (one-time)/register respective histograms.
Fills the histogram on the right of >> with the data from the left of >>.
Accepted on the left side:
The ROOT tree reading is provided by the ROOTTreeAccess class, which offers a handy API for iteration over events and access to the tree branches in two forms:
E.g. for a construct like
for (ROOTTreeAccess event(t); event; ++event)
the ++ operator and the conversion to bool are provided. The range of events iterated over can be limited by passing additional arguments to the constructor.
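The loop protocol above needs only a pre-increment operator and a conversion to bool. The sketch below shows that protocol on a plain counter without any ROOT dependency; EventCursor, its constructor arguments (total, first, last) and count_events are hypothetical and do not reflect the actual ROOTTreeAccess signature.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>

// Hypothetical cursor over event indices in the range [first, min(total, last)).
class EventCursor {
public:
    explicit EventCursor(std::size_t total, std::size_t first = 0,
                         std::size_t last = std::numeric_limits<std::size_t>::max())
        : m_current(first), m_end(std::min(total, last)) {}

    // True while there is still an event to process.
    explicit operator bool() const { return m_current < m_end; }

    // Moves to the next event.
    EventCursor& operator++() {
        ++m_current;
        return *this;
    }

    std::size_t current() const { return m_current; }

private:
    std::size_t m_current;
    std::size_t m_end;
};

// Runs the same kind of loop as in the text and counts the iterations.
inline std::size_t count_events(EventCursor event) {
    std::size_t n = 0;
    for (; event; ++event) ++n;
    return n;
}
```

The conversion is marked explicit, which still works in a for-loop condition because conditions are contextually converted to bool.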
An additional wrapper, TreeView, is provided for that. It has an API that is identical to the one explained above for Lazy/Eager vectors. The tree is considered an infinite container and thus operations like sorting or reversing are unavailable.
Branch data can be accessed via getters:
returns the branch content (works for PODs and std::vector of PODs)
makes the data available via a copy-less functional lazy container
The Conf class offers two sources of configuration: a config file formatted as follows:
settingA=valueA
#settingB=valueB
...
or environment variables that are set before running the analysis program:
export settingA=valueA
In the code the configuration can be accessed by instantiating a Conf object with the file name (or without it, to read the environment).
There are two methods of the Conf class:
Returns true if there is a setting of the given name.
Returns the setting value, or the default if the setting is absent.
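To make the two sources concrete, here is a minimal sketch of a reader for the key=value format with # comments and an environment-only mode. ConfSketch and its has/get names are hypothetical stand-ins, it returns strings only, and it reads from any std::istream rather than opening a file by name; none of this is the Conf class itself.

```cpp
#include <cstdlib>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical configuration reader, not the library's Conf class.
class ConfSketch {
public:
    // Without a stream only environment variables are consulted.
    ConfSketch() : m_useEnvironment(true) {}

    // With a stream, lines of the form key=value are read; lines starting with '#' are skipped.
    explicit ConfSketch(std::istream& in) : m_useEnvironment(false) {
        std::string line;
        while (std::getline(in, line)) {
            if (line.empty() || line[0] == '#') continue;
            const std::string::size_type eq = line.find('=');
            if (eq == std::string::npos) continue;  // not a setting line
            m_settings[line.substr(0, eq)] = line.substr(eq + 1);
        }
    }

    // True if a setting of the given name exists in the active source.
    bool has(const std::string& name) const {
        if (m_useEnvironment) return std::getenv(name.c_str()) != nullptr;
        return m_settings.count(name) > 0;
    }

    // The setting value, or fallback when the setting is absent.
    std::string get(const std::string& name, const std::string& fallback) const {
        if (m_useEnvironment) {
            const char* value = std::getenv(name.c_str());
            return value ? std::string(value) : fallback;
        }
        const auto it = m_settings.find(name);
        return it != m_settings.end() ? it->second : fallback;
    }

private:
    bool m_useEnvironment;
    std::map<std::string, std::string> m_settings;
};
```

With the example file above, settingA would be found while the commented-out settingB would fall back to the supplied default.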
A global weight that can be manipulated in RAII style with the helper functions UPDATE_MULT_WEIGHT and UPDATE_ABS_WEIGHT.
Example:
Weight::set(0.5);
WEIGHT; // get the absolute weight
{
UPDATE_ABS_WEIGHT(0.9);// update the weight locally
}
// weight restored
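The restore-on-scope-exit behaviour is the classic RAII guard pattern. The sketch below shows one way it could work; the sketch namespace, global_weight and ScopedWeight are hypothetical names, and the assumption that the multiplicative variant scales the current weight is inferred from the macro name only.

```cpp
namespace sketch {

// Hypothetical global weight storage.
inline double& global_weight() {
    static double weight = 1.0;
    return weight;
}

// Guard that changes the weight on construction and restores it on destruction.
class ScopedWeight {
public:
    enum class Mode { Absolute, Multiply };

    ScopedWeight(double value, Mode mode) : m_saved(global_weight()) {
        global_weight() = (mode == Mode::Absolute) ? value : m_saved * value;
    }
    ~ScopedWeight() { global_weight() = m_saved; }  // restore the previous weight

    // Copying a guard would restore the weight twice, so it is disabled.
    ScopedWeight(const ScopedWeight&) = delete;
    ScopedWeight& operator=(const ScopedWeight&) = delete;

private:
    double m_saved;
};

}  // namespace sketch
```

Because the restore happens in the destructor, the previous weight also comes back when the scope is left through an early return or an exception.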
It is frequent that we need to select a value depending on several conditions. If we want that value to be const, the code is somewhat awkward to write as a pile of ternary expressions. The Selector offers a more legible alternative.
const double value = option(x < 0, 0.7)
.option( 0 < x and x < 1, 0.9)
.option(1.2).select();
const std::string descr = option(x<0, "less than zero").option(x>0, "more than zero").select();
// in this case (the final option call without the boolean is missing) select will raise an exception if x is 0
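A chain of this kind can be built from a small class that remembers the first matching value. The sketch below uses C++17 and makes several assumptions: SelectorSketch, option_sketch and pick are hypothetical names, the first true condition is assumed to win, and the exception type std::runtime_error is a choice of this sketch.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical selector: keeps the value of the first option whose condition is true.
template <typename T>
class SelectorSketch {
public:
    SelectorSketch& option(bool condition, const T& value) {
        if (!m_choice && condition) m_choice = value;  // first match wins
        return *this;
    }

    // Unconditional fallback, used as the last link of the chain.
    SelectorSketch& option(const T& value) { return option(true, value); }

    // Returns the selected value or throws when nothing matched.
    T select() const {
        if (!m_choice) throw std::runtime_error("no option matched");
        return *m_choice;
    }

private:
    std::optional<T> m_choice;
};

// Starts a chain, deducing the value type from the first option.
template <typename T>
SelectorSketch<T> option_sketch(bool condition, const T& value) {
    SelectorSketch<T> selector;
    selector.option(condition, value);
    return selector;
}

// The numeric example from the text rewritten with the sketch.
inline double pick(double x) {
    return option_sketch(x < 0, 0.7)
        .option(0 < x && x < 1, 0.9)
        .option(1.2)
        .select();
}
```

The whole chain is a single expression, so its result can initialize a const variable directly.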
- HIST cleanup - remove unnecessary hashing entirely