Add support for running curriculum #17

Open
bruno-f-cruz opened this issue Nov 21, 2024 · 1 comment

bruno-f-cruz commented Nov 21, 2024

General strategy

The launcher will depend on aind-behavior-curriculum and must be able to deserialize a recommendation and ask for a new curriculum suggestion at the end of the session.

Since each curriculum will likely require a different set of dependencies and move at a much different rate than the task acquisition code, we must decouple the two environments.

Finally, we must establish a contract between the launcher and the external application.

Implementation

  1. Each Curriculum will have its own repository;
  2. Each repository will have a bootstrappable pyproject.toml that can be easily used to generate new environments;
  3. Each repository will be responsible for creating metrics, outputting a suggestion, and pushing the new session to SLIMS;
  4. The code in a repository will follow a template structure with the following interface:
  • The launcher will call an entry point from the curriculum, by default "curriculum -run --path=path_to_session --[opt]=value"
  • The sole input to the curriculum shall be the data directory that resulted from the session.
  • The curriculum will return (ideally via stdout) a serialized version of the new suggestion and of the calculated metrics, to be appended to the aind-data-schema. If the process errors, we shall raise an exception and skip the curriculum step on the launcher side.
  • Users are allowed to create new assets in the data directory.
  5. Curricula will be added to the task repository (e.g. Aind.Behavior.VrForaging) via submodules. In order to update a curriculum, users must make a PR to the repository and have it reviewed. This step ensures that:
  • All code is properly version controlled (via the hash of the submodule)
  • Available curricula are easy to find (iterate over the available submodules)
  • Behavior at the rig is predictable
  6. Once control is returned to the launcher, the watchdog will be triggered, along with downstream pipelines. Following "Description of a full ecossystem pipeline deployed in AIND" (aind-behavior-curriculum#43), curriculum-derived data will be appended to the session.json generated at the rig and will eventually make its way to docdb.
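
To make the launcher side of the contract in point 4 concrete, here is a minimal sketch of how the launcher could call the curriculum and read its result. It assumes the default entry point quoted above and a JSON object on stdout; the function name `run_curriculum`, the timeout, and the choice to return `None` when the step is skipped are all hypothetical and not part of any existing package.

```python
import json
import subprocess
import sys
from pathlib import Path
from typing import Optional, Sequence


def run_curriculum(
    session_path: Path,
    command: Sequence[str] = ("curriculum", "-run"),  # default entry point from the proposal
    timeout_s: float = 600.0,  # assumption: the proposal does not specify a timeout
) -> Optional[dict]:
    """Call the curriculum entry point and parse its stdout as a JSON object.

    Returns None when the curriculum step has to be skipped.
    """
    # The session data directory is the sole input to the curriculum.
    args = [*command, f"--path={session_path}"]
    try:
        proc = subprocess.run(
            args, capture_output=True, text=True, timeout=timeout_s, check=True
        )
        payload = json.loads(proc.stdout)
        if not isinstance(payload, dict):
            raise ValueError("curriculum output is not a JSON object")
        return payload
    except (OSError, subprocess.SubprocessError, ValueError) as exc:
        # Covers a missing executable, a non-zero exit code, a timeout,
        # and malformed output: log it and skip the curriculum step.
        print(f"Curriculum step skipped: {exc}", file=sys.stderr)
        return None
```

Because the curriculum lives in its own environment, `command` would in practice point at that environment's executable. The launcher only ever sees the serialized output, so it never needs to import the curriculum's dependencies.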

Outstanding questions

What module interfaces with SLIMS?

The launcher (this repo) would be easier to maintain, as it would offer a centralized structure. This should be OK since the only information required for the upload is already available to the launcher at runtime (i.e. the animal ID). Moreover, it would ensure that whatever is serialized can also be deserialized by the same launcher (versions would match).
Having it at the level of the curriculum package would allow users to have custom logic and control over where the data goes. It is not clear why this would be useful at this point, though.

The solution is likely to add a new DataBase model where a concrete implementation is SlimsDataBase.
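
As a rough illustration of that idea, the sketch below defines an abstract `DataBase` with a `SlimsDataBase` subclass. Only those two names come from this issue. The `push`/`pull` methods, their signatures, and the `InMemoryDataBase` stand-in are assumptions made for illustration, and the SLIMS calls are deliberately left unimplemented.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DataBase(ABC):
    """Stores serialized curriculum suggestions, keyed by animal ID."""

    @abstractmethod
    def push(self, animal_id: str, suggestion: str) -> None:
        """Upload the serialized suggestion for the next session."""

    @abstractmethod
    def pull(self, animal_id: str) -> Optional[str]:
        """Return the latest serialized suggestion, or None if there is none."""


class InMemoryDataBase(DataBase):
    """Hypothetical stand-in, useful for testing the launcher offline."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def push(self, animal_id: str, suggestion: str) -> None:
        self._store[animal_id] = suggestion

    def pull(self, animal_id: str) -> Optional[str]:
        return self._store.get(animal_id)


class SlimsDataBase(DataBase):
    """Placeholder for the SLIMS-backed implementation (client calls not shown)."""

    def push(self, animal_id: str, suggestion: str) -> None:
        raise NotImplementedError("SLIMS upload is not sketched here")

    def pull(self, animal_id: str) -> Optional[str]:
        raise NotImplementedError("SLIMS download is not sketched here")
```

With this shape the launcher depends only on `DataBase`, so a curriculum-specific destination could be swapped in later without changing the launcher logic.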

bruno-f-cruz commented Dec 14, 2024

Something that came to mind is to invert the dependency logic by submodule-ing the task repo to the curriculum instead of the other way around.

This has a few advantages:

  • A single curriculum could submodule multiple tasks
  • The curriculum would effectively "decide" the version of the task to run by updating the git submodule file

Some disadvantages include:

  • Each task gets spun up via its task-specific launcher. This creates a decoupling layer that must be maintained. This can be easily solved if all task launchers respect a common interface, which can be ensured if they all use CLABE.

  • The choice to have the metrics serialized into the aind-data-schema injects a rather awkward coupling that may prevent this solution. The data-mapper needs to be defined within the task, but the curriculum, by virtue of being able to be run on top of different tasks, is not accessible at aind-data-schema session time.

  • The curriculum would need a way to launch the experiment-specific launcher and also communicate with it at runtime. This would be necessary for handling the curriculum itself, as the task-specific launcher would need to call the curriculum logic to create metrics and upload the suggestion. If metrics were not an issue, one could in theory defer the suggestion until the launcher finishes. This would also require moving the watchdog outside the task-specific launcher.

  • These last two points would require quite a large refactor of the code base, as well as a number of features that do not exist yet. For now, let's implement the first solution, but it may be worth coming back to this after we have a better sense of what curricula look like, as well as the interface with SLIMS. The idea I am converging on is that we will need some sort of "Manager for Tasks" that interfaces with SLIMS, checks what is scheduled for an animal, and calls a task-specific launcher via its submodule.
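
To give the "Manager for Tasks" idea some shape, here is a very rough sketch. Every name in it (`Scheduler`, `scheduled_task`, `TaskManager`, the launcher signature) is hypothetical. It only illustrates the flow described above: look up what is scheduled for the animal, then dispatch to the matching task-specific launcher.

```python
from typing import Callable, Dict, Optional, Protocol


class Scheduler(Protocol):
    """Hypothetical view of SLIMS: which task is scheduled for an animal?"""

    def scheduled_task(self, animal_id: str) -> Optional[str]: ...


# Assumption: a launcher takes an animal ID and returns a process-style exit code.
Launcher = Callable[[str], int]


class TaskManager:
    """Looks up the scheduled task and calls its task-specific launcher."""

    def __init__(self, scheduler: Scheduler, launchers: Dict[str, Launcher]) -> None:
        self._scheduler = scheduler
        self._launchers = launchers

    def run(self, animal_id: str) -> int:
        task = self._scheduler.scheduled_task(animal_id)
        if task is None:
            raise LookupError(f"No task scheduled for animal {animal_id!r}")
        if task not in self._launchers:
            raise KeyError(f"No launcher registered for task {task!r}")
        return self._launchers[task](animal_id)
```

In a real setup each entry in `launchers` would likely wrap a call into the corresponding task submodule, which is where the common launcher interface mentioned above would matter.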
