
Adam 64 #50

Open · wants to merge 15 commits into main

Conversation

ntellis (Member) commented Dec 15, 2022

No description provided.

ntellis requested a review from moeyensj, December 15, 2022 19:26
moeyensj (Member) left a comment

I have an overall question on the scope. Do we want to eventually be able to do something like the following?

from precovery.main import precover
results = precover(..., algorithm="brute-force")

I personally think it would be useful to have the precovery algorithm options mapped to a common interface like that. How feasible is it to take what you have here and enable something like that? If we do have a generalized interface, it becomes very easy to do a brute-force vs. nelson-force (we need a name) comparison, which is what we will want to do.
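To make that concrete, here is a sketch of the kind of dispatch I mean (all names are hypothetical placeholders, not the current API):

from typing import Callable, Dict

def precover_nelson(*args, **kwargs):
    # Placeholder for the existing indexed search.
    raise NotImplementedError

def precover_brute_force(*args, **kwargs):
    # Placeholder for this PR's brute-force search.
    raise NotImplementedError

PRECOVERY_ALGORITHMS: Dict[str, Callable] = {
    "nelson": precover_nelson,
    "brute-force": precover_brute_force,
}

def precover(*args, algorithm: str = "nelson", **kwargs):
    # Dispatch to the requested search implementation.
    try:
        search = PRECOVERY_ALGORITHMS[algorithm]
    except KeyError:
        raise ValueError(f"unknown precovery algorithm: {algorithm!r}")
    return search(*args, **kwargs)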

By extension, that would also mean enabling frame candidates with the brute-force code and probably re-organizing the code a bit to enable that level of generalization. This might put us in the realm of a higher-level design discussion. What do you think?

In the meantime, I think it's become even more important to get the unit test we built last week cleaned up, so I will do that today and submit a PR. Ideally, before we merge this work, we will want to run that test against the brute-force search as well.

I'm pretty excited about this. I'm looking forward to being able to delete large chunks of code in THOR, and this is a huge step in that direction.

return chunk_size


def _initWorker():
Member

As the original author: this function should be deleted and any reference to it destroyed and purged. :)

I originally added this function so I could keyboard-interrupt long-running multiprocessed jobs. What you can do instead is launch a pool, catch the KeyboardInterrupt exception, and then terminate the pool. Ignoring signals is too dangerous to be worth the convenience here, so we should remove this and pretend it never existed.
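A minimal sketch of that pattern (the helper name is illustrative):

import multiprocessing as mp

def map_with_interrupt(func, chunks, num_workers):
    # Workers keep their default signal handling; the parent catches
    # KeyboardInterrupt and tears the pool down instead.
    with mp.Pool(num_workers) as pool:
        try:
            return pool.map(func, chunks)
        except KeyboardInterrupt:
            pool.terminate()
            raise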

Member Author

I left some of this in from the THOR implementation but will pull this part out. The reasoning for why it was there makes total sense.

@@ -7,3 +7,13 @@ def radec_to_healpixel(
ra: npt.NDArray[np.float64], dec: npt.NDArray[np.float64], nside: int
) -> npt.NDArray[np.int64]:
return hp.ang2pix(nside, ra, dec, nest=True, lonlat=True).astype(np.int64)


def radec_to_healpixel_with_neighbors(
Member

Nice! This is a very useful addition.

Contributor

@ntellis Can we make this an optional include_neighbors keyword argument to radec_to_healpixel?

return attributions


def attributeObservations(
Member

I'd suggest renaming this to attribute_observations. THOR inherited LSST's use of camelCase, but ideally let's match the rest of the precovery code and go with underscores. The new version of THOR is headed that direction as well.

cc @akoumjian: function naming going forward might be a good style-guide addition for us to adopt at some point.

Member Author

Agreed on the casing. I'm in favor of sticking to snake_case for Python and camelCase for JS. Good to know the new THOR is moving that way.

Contributor

Yes, and though the point is rather immaterial, snake_case is the way to go.

__all__ = ["calcResiduals", "calcSimpleResiduals", "calcProbabilisticResiduals"]


def calcResiduals(
Member

Same renaming here. Let's go with underscores to make it all consistent.

return


def _checkParallel(num_jobs, parallel_backend):
Member

_checkParallel is another THOR artifact that is best dropped, I think. The use case here isn't substantial enough to merit keeping this function in its current form, since it would be another thing we have to unit test.

Instead, I'd assume we are going to be using Python's multiprocessing moving forward, remove the ray component, and then move the calculation of the number of cores to use into the attribute_observations function, or write a new function altogether that does the same.

I know we also do a "number of cores to use" calculation in the multiprocessed indexing code, so there could be some use for a generalized utility function: if num_workers=None it defaults to using all available threads, and if a user passes an int it uses that number of threads.
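For example, a sketch of such a utility (the name is illustrative):

import multiprocessing as mp
from typing import Optional

def resolve_num_workers(num_workers: Optional[int] = None) -> int:
    # None means "use every available core"; otherwise pass the
    # requested count through, floored at one worker.
    if num_workers is None:
        return mp.cpu_count()
    return max(1, int(num_workers))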

ntellis (Member Author), Dec 16, 2022

Good point. For this implementation in particular we can just stick with multiprocessing. I had taken the references to the other backends out of attribute_observations but realize they were left in this function. I will reorganize as you suggest.

This does differ from nelson-force: we really want single machines with many cores because of the constraint of pulling data out of the indexed db. It doesn't scale horizontally as well for that reason, unless the work is broken up in time.

ephemeris = pd.concat(ephemeris_dfs, ignore_index=True)
unique_frames = pd.concat(frame_dfs, ignore_index=True).drop_duplicates()

# This filter is very inefficient - there is surely a better way to do this
Member

To make this more efficient I'd delegate the filtering to SQL altogether and add another convenience function. This PR already adds some very useful convenience functions to the FrameIndex class; one that returns all frames with a specific healpixel, mjd, and obscode would be another useful addition.

Actually, I'd imagine that a function like this is necessary, especially in the context of the HEALPix down/upsampling that Tanay's been working on. We will want to be able to query for all frames that correspond to one HEALPixel value at a given mjd and obscode so that we can pull those frames in and then check them for matches. In the context of resampling nside, you could imagine adding an nside_search parameter to this convenience function: if nside_search is not equal to the nside used at indexing time, you resample pixels and return the correct frames for the given nside_search. Of course, this is out of scope for this PR, but having part of this function already included might be super useful.

Please feel free to push back here given you have more knowledge of what Tanay's code looks like.
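For concreteness, a hypothetical FrameIndex method along those lines (a sketch assuming SQLAlchemy Core; the self.frames table and self.dbconn attributes, and the healpixel/obscode/mjd columns, are assumptions, not the actual schema):

import sqlalchemy as sq

def frames_by_healpixel(self, healpixel: int, obscode: str, mjd: float):
    # Let the database do the filtering instead of pandas.
    stmt = sq.select(self.frames).where(
        self.frames.c.healpixel == int(healpixel),
        self.frames.c.obscode == obscode,
        self.frames.c.mjd == mjd,
    )
    return self.dbconn.execute(stmt).fetchall()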

Member Author

I think your sense here is correct. I was reluctant to add too much bloat to the FrameIndex class in the form of functions convenient only for brute-force, but I agree this is a case where we could get utility out of that function outside of brute-force as well.

I'll see what could meet the requirements of both the HEALPix downsampling and this, and rewrite.

Contributor

Seconded on pushing to SQL. Let the database handle filtering heuristics.

[
    pd.DataFrame.from_dict(
        {
            "mjd_utc": [x[0] for y in range(9)],
Member

What is range(9) doing here exactly? Might be worth adding a comment since it's a hard-coded list length.

Member Author

Good catch, this is unclear. A healpixel plus its neighbors is a set of 9 healpixels: the pixel itself plus its 8 surrounding neighbors.
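For reference, the set of nine comes about like this (a healpy sketch, not the exact code in this PR; hp.get_all_neighbours returns the 8 neighboring pixels, with -1 marking a missing neighbor):

import healpy as hp
import numpy as np

def pixel_with_neighbors(ra, dec, nside):
    # Pixel containing each (ra, dec), NESTED scheme, lon/lat in degrees.
    pixel = np.atleast_1d(hp.ang2pix(nside, ra, dec, nest=True, lonlat=True))
    # The 8 surrounding pixels for each input position.
    neighbors = hp.get_all_neighbours(nside, pixel, nest=True)
    # Stack into 9 healpixels (self + 8 neighbors) per position.
    return np.vstack([pixel, neighbors])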

"""

# Find all the mjd we need to propagate to
all_frame_mjd = precovery_db.frames.idx.unique_frame_times()
Member

You've probably already thought about this, but you can solve your comment on line 54 by changing this function to return tuples (or similar) of obscode and the mjds for that obscode. Then you can remove the obscode kwarg altogether.
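For illustration, the return shape could look something like this (a sketch; the function name and the frame attributes are hypothetical):

from collections import defaultdict

def unique_frame_times_by_obscode(frames):
    # Group the unique frame midpoint times per observatory code,
    # e.g. {"I41": [59000.1, 59000.2], ...}.
    times = defaultdict(set)
    for frame in frames:
        times[frame.obscode].add(frame.mjd)
    return {obscode: sorted(mjds) for obscode, mjds in times.items()}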

Member Author

Yep - this is in my local branch now; will push a fix.

This does break down a little with per-obs MJDs, though.

Member

Ah yes, excellent point. We might be able to get around that later on by loading in the frames and then adding a check: if frame.mjd != obs.mjd, propagate to the correct epoch. I'm not sure how we would generalize it so that nelson-force and brute-force use the same logic and code.
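Schematically, something like this sketch (propagate_to and the attribute names are placeholders, not existing code):

def ephemeris_for_observation(orbit, frame, obs, frame_ephemeris, propagate_to):
    # Reuse the frame-midpoint ephemeris when the epochs agree;
    # otherwise re-propagate the orbit to the observation's own epoch.
    if obs.mjd != frame.mjd:
        return propagate_to(orbit, obs.mjd, frame.obscode)
    return frame_ephemeris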

ntellis (Member Author) commented Dec 16, 2022

Thanks for the detailed review @moeyensj. Will incorporate and update.

I think you're absolutely right that we will ultimately want to be able to call precovery with a method as you describe, probably written in such a way as to default to "nelson-force" so as not to break anything current.

One limitation that makes just swapping out the mode difficult is that the brute-force method is a huge memory hog. Doing the propagation in advance and pulling only intersecting and neighboring frames helps. It's also not nearly as bad at higher nside (pulling fewer observations per ephemeris), but with many orbits over 7 years, the algorithm as written could still demand loading some tens of GB of observations into memory. Just something to be aware of. This could be avoided by setting a maximum memory usage and batch processing, but that isn't written in as of now; an optimization for the future.
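The batching itself would be simple; a sketch (the batch size and the commented call are illustrative):

def batched(orbits, batch_size):
    # Yield fixed-size slices so only one batch's ephemerides and
    # observations need to be resident in memory at a time.
    for i in range(0, len(orbits), batch_size):
        yield orbits[i : i + batch_size]

# for batch in batched(orbits, batch_size=100):
#     results.extend(attribute_observations(batch, ...))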

Now, we may not be running this algorithm at huge scale repeatedly - I think mostly as a consistency check for the nelson method - so this may not be an issue.


attribution_dfs = []

# prepare ephemeris and observation dictionaries
ephemeris, observations = get_observations_and_ephemerides_for_orbits(
Contributor

I may be completely misunderstanding this, so I apologize if I'm off.
Can this function be broken up into separate functions? The process seems to be:

  1. orbits -> ephemerides
  2. ephemerides -> all unique intersecting frames + neighbors
  3. frames -> get all observations

The reason being that I believe most of these steps are re-usable and consistent, and may simply require parameters to adjust results for brute-force vs. nelson-force, etc. Here is what I am thinking this looks like, alternatively:

frame_times = get_frame_times_in_range(mjd_start, mjd_end)
ephemerides = ephemerides_from_orbits(orbits, epochs, obscode=obscode)
frames_to_search = intersecting_frames(ephemerides, neighbors=True)  # default is False
observations = precovery_db.extract_observations_by_frames(frames_to_search)

I think each of these steps is usable in multiple cases, especially with optional parameters.
This makes each individual piece more testable and gives back simpler return values.

for frame in frames:
    inc_arr = np.empty((0, 5), float)
    obs_ids = np.empty((0, 1), object)
    for obs in self.frames.iterate_observations(frame):
Contributor

Can you summarize what's going on with this loop?

ntellis (Member Author) commented Jan 27, 2023

The brute-force algorithm has now been refactored to return PrecoveryCandidates (or a subclass with the probabilistic extras included).

However, it now breaks on per-obs MJDs, as prophesied. As written, the algorithm:

  • propagates to each unique frame midpoint time to find intersecting frames and their neighbors,
  • extracts observations for those frames,
  • chunks and supplies ephemerides and orbits to workers to compute the distances.

This breaks down because the calculated ephemerides no longer match the observation MJDs. Another change is needed: propagate from the frame midpoint to the mjd of each observation. This implementation is tending towards nelson-force as time goes on.

Comment on lines +214 to +222
for ephemeris_visit, observations_visit in zip(
    ephemeris_visits, observations_visits
):
    assert len(ephemeris_visit["mjd_utc"].unique()) == 1
    assert len(observations_visit["mjd_utc"].unique()) == 1
    assert (
        observations_visit["mjd_utc"].unique()[0]
        == ephemeris_visit["mjd_utc"].unique()[0]
    )
Member Author

This is where a major refactor takes place in the case of per-obs MJDs. I'm not sure the BallTree implementation makes sense in a context where each obs has its own coords_predicted_latlon; there's not much of a tree in that case.
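For context, the visit-level matching is roughly this shape (a sketch assuming scikit-learn's BallTree with the haversine metric over [lat, lon] pairs; names illustrative):

import numpy as np
from sklearn.neighbors import BallTree

def observations_within(obs_latlon_deg, predicted_latlon_deg, tolerance_deg):
    # One tree over a visit's observed positions, queried with each
    # predicted position; returns per-prediction index arrays of matches.
    tree = BallTree(np.deg2rad(obs_latlon_deg), metric="haversine")
    return tree.query_radius(
        np.deg2rad(predicted_latlon_deg), r=np.deg2rad(tolerance_deg)
    )

With per-obs epochs each prediction pairs with a single observation, so this degenerates to pairwise distances.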
