
Make RF response more homogeneous vs. zenith #1317

Open
moralejo opened this issue Nov 22, 2024 · 4 comments
@moralejo (Collaborator)
We know that at high zeniths (say, above ~50 deg) the output of the random forests depends strongly on zenith (because image parameters change quickly with zenith).

Since the RFs are trained on MC with a discrete grid of pointings, when they are applied to data the distributions of reconstructed quantities (e.g. gammaness or energy) show sudden jumps whenever the telescope pointing crosses the midpoint in zenith between two training nodes (sin_azimuth is also part of the training, but luckily its effect is negligible compared to that of zenith). In the past, for the analysis of some high-zenith observations, we dealt with this problem by simply using a different training sample with smaller steps in zenith.

A possible general solution (that would allow us to use our existing high-zenith "coarse-grid" training MC) could be the following:

  • for any given event, compute the RF output with the standard set of parameters
  • compute it again with the same parameters, except for zenith, which is set to a value just on the other side of the closest zenith boundary between training nodes
  • the two outputs then correspond to the zeniths of the two relevant training nodes; interpolate them linearly in cos(zenith) to the actual zenith of the event

This would get rid of the jumps and result in a more accurate reconstruction between the training nodes.

All we need is to know the pointings used in the training (we could save them together with the RFs). Then replace all calls to `predict` with calls to a new function that calls `predict` twice and does the interpolation.
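The scheme above could be sketched roughly as follows. This is a minimal illustration, not lstchain's actual interface: the function name, the generic `predict` callable, and the zenith-column layout are all assumptions.

```python
import numpy as np

def predict_interp_coszd(predict, features, zd_col, node_zeniths_deg):
    """Call `predict` twice, with the zenith feature (column `zd_col` of
    `features`, in degrees) set to the two training-node zeniths that
    bracket each event, then interpolate the two outputs linearly in
    cos(zenith) to the event's actual zenith.
    If an event's zenith coincides with a training node, the result
    reduces to the plain prediction at that node."""
    nodes = np.sort(np.asarray(node_zeniths_deg, dtype=float))
    zd = np.clip(features[:, zd_col], nodes[0], nodes[-1])
    # index of the upper bracketing node for each event
    hi_idx = np.clip(np.searchsorted(nodes, zd), 1, len(nodes) - 1)
    lo, hi = nodes[hi_idx - 1], nodes[hi_idx]
    # evaluate the RF with the zenith feature replaced by each node zenith
    X_lo, X_hi = features.copy(), features.copy()
    X_lo[:, zd_col], X_hi[:, zd_col] = lo, hi
    p_lo, p_hi = predict(X_lo), predict(X_hi)
    # linear interpolation weight in cos(zenith): 0 at lo, 1 at hi
    c, c_lo, c_hi = (np.cos(np.radians(a)) for a in (zd, lo, hi))
    w = (c - c_lo) / (c_hi - c_lo)
    return p_lo + w * (p_hi - p_lo)
```

For an RF whose output varies linearly in cos(zenith), this reproduces the direct evaluation exactly; for a step-like response it replaces the jumps by a smooth interpolation between nodes.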

Note that besides real data, this will also affect the MC test nodes, which correspond to different pointings than the training ones. Hence I think this change would reduce the systematic errors of the instrument response functions (e.g. eff. area) at high zeniths.

moralejo self-assigned this Nov 22, 2024
@moralejo (Collaborator, Author)

I tested the cosZD interpolation for energy reconstruction, and it seems to work very well (see below).
The implementation is rather simple, and it increases the time spent in RF prediction by a factor of a bit more than 2 (two calls to predict instead of one), which is not too bad; besides, we recently reduced the size of the RFs, which makes them faster.
This only has to be done in the application stage of the RFs. During training, the RFs are applied to the same training nodes (on different events than those used in training), so there is no pointing mismatch to worry about.

[image: cosZD interpolation test results for energy reconstruction]

@vuillaut (Member)

vuillaut commented Nov 26, 2024

Hi
Another approach could be to combine linear regression with a random forest.
The linear regression would first capture the global trend and smooth things out; the random forest then makes a finer prediction on the residuals, so the zenith steps would already be smoothed.

Something like this (sketch; `X`, `y`, `X_test` are the training/test feature matrices and targets):

```python
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

linear_reg = LinearRegression().fit(X, y)
residuals = y - linear_reg.predict(X)  # RF is trained on the residuals
random_forest = RandomForestRegressor().fit(X, residuals)

preds = linear_reg.predict(X_test) + random_forest.predict(X_test)
```

This would be integrated in the training and inference functions and could be completely transparent to users.

@moralejo (Collaborator, Author)

I made an implementation of the proposed RF-prediction interpolation approach in #1320.

@vuillaut it seems simpler to me than the linear_reg + RF approach. For example, in what I implemented there is no need to change anything in the training part (which itself involves the use of RF predictions), because if the pointing of the event on which the RF is applied matches one of the training nodes' pointings (which is the case during training), the interpolation is unnecessary.

@moralejo (Collaborator, Author)

Besides, this allows us to use the already-produced RFs (though that is a minor advantage).
