
Make RF response more homogeneous vs. zenith #1317

Open
moralejo opened this issue Nov 22, 2024 · 4 comments
@moralejo (Collaborator)
We know that at high zeniths (say, above ~50 deg) the output of the random forests depends strongly on zenith (because image parameters change quickly with zenith).

Since the RFs are trained on MC with a discrete grid of pointings, when they are applied to data the distributions of reconstructed quantities (e.g. gammaness or energy) show sudden jumps whenever the telescope pointing crosses the midpoint in zenith between two training nodes (sin_azimuth is also part of the training, but luckily its effect is negligible compared to that of zenith). In the past, for the analysis of some high-zenith observations, we dealt with this problem by simply using a different training sample with smaller steps in zenith.

A possible general solution (that would allow us to use our existing high-zenith "coarse-grid" training MC) could be the following:

  • for any given event, compute the RF output with the standard set of parameters
  • compute it again with the same parameters, except for zenith, which is set to a value just on the other side of the closest zenith boundary between training nodes
  • the two outputs then correspond to the zeniths of the two relevant training nodes; interpolate them linearly in cos(zenith) to the actual zenith of the event

This would get rid of the jumps and result in a more accurate reconstruction between the training nodes.

All we need is to know the pointings used in the training (we could save them together with the RFs). Then replace all calls to `predict` with calls to a new function that calls `predict` twice and does the interpolation.
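The scheme above could be sketched roughly as follows. This is a minimal illustration, not lstchain's actual interface: the function name, the generic `predict` callable, and the zenith-column layout are all assumptions.

```python
import numpy as np

def predict_interp_coszd(predict, features, zd_col, node_zeniths_deg):
    """Call `predict` twice, with the zenith feature (column `zd_col` of
    `features`, in degrees) set to the two training-node zeniths that
    bracket each event, then interpolate the two outputs linearly in
    cos(zenith) to the event's actual zenith.
    If an event's zenith coincides with a training node, the result
    reduces to the plain prediction at that node."""
    nodes = np.sort(np.asarray(node_zeniths_deg, dtype=float))
    zd = np.clip(features[:, zd_col], nodes[0], nodes[-1])
    # index of the upper bracketing node for each event
    hi_idx = np.clip(np.searchsorted(nodes, zd), 1, len(nodes) - 1)
    lo, hi = nodes[hi_idx - 1], nodes[hi_idx]
    # evaluate the RF with the zenith feature replaced by each node zenith
    X_lo, X_hi = features.copy(), features.copy()
    X_lo[:, zd_col], X_hi[:, zd_col] = lo, hi
    p_lo, p_hi = predict(X_lo), predict(X_hi)
    # linear interpolation weight in cos(zenith): 0 at lo, 1 at hi
    c, c_lo, c_hi = (np.cos(np.radians(a)) for a in (zd, lo, hi))
    w = (c - c_lo) / (c_hi - c_lo)
    return p_lo + w * (p_hi - p_lo)
```

For an RF whose output varies linearly in cos(zenith), this reproduces the direct evaluation exactly; for a step-like response it replaces the jumps by a smooth interpolation between nodes.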

Note that besides real data, this will also affect the MC test nodes, which correspond to different pointings than the training ones. Hence I think this change would reduce the systematic errors of the instrument response functions (e.g. eff. area) at high zeniths.

moralejo self-assigned this Nov 22, 2024
@moralejo (Collaborator, Author)

I tested the cosZD interpolation for energy reconstruction, and it seems to work very well (see below).
The implementation is rather simple, and it increases the time spent in RF prediction by a factor of a bit more than 2 (two calls to predict instead of one), which is not too bad; besides, we recently reduced the size of the RFs, which makes them faster.
This only has to be done in the application stage of the RFs. During training, the RFs are applied to the same training nodes (on different events than those used in training), so there is no pointing mismatch to worry about.

[image: cosZD interpolation test results for energy reconstruction]

@vuillaut (Member)

vuillaut commented Nov 26, 2024

Hi
Another approach could be to combine linear regression with a random forest.
The linear regression would first capture the global trend and smooth things out; the random forest then makes a finer prediction on the residuals, so the zenith steps would already be smoothed.

Something like this (sketch; `X`, `y`, `X_test` are the training/test feature matrices and targets):

```python
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

linear_reg = LinearRegression().fit(X, y)
residuals = y - linear_reg.predict(X)  # RF is trained on the residuals
random_forest = RandomForestRegressor().fit(X, residuals)

preds = linear_reg.predict(X_test) + random_forest.predict(X_test)
```

This would be integrated in the training and inference functions and could be completely transparent to users.

@moralejo (Collaborator, Author)

I made an implementation of the proposed RF-prediction interpolation approach in #1320.

@vuillaut it seems simpler to me than the linear_reg + RF approach. For example, in what I implemented there is no need to change anything in the training part (which itself involves the use of RF predictions), because if the pointing of the event on which the RF is applied matches one of the training nodes' pointings (which is the case during training), the interpolation is unnecessary.

@moralejo (Collaborator, Author)

Besides, this allows us to use the already-produced RFs (though that is a minor advantage).
