Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utilities for generating monty_config and dataset_args for an arbitrary number of LMs. #94

Draft
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

scottcanoe
Copy link
Contributor

@scottcanoe scottcanoe commented Dec 6, 2024

This adds a few functions to config_args.py and make_dataset_configs.py that allow users to create multi-LM experiment configs without writing and using custom config data classes.

Added to config_args.py

  • make_multi_lm_monty_config: The top-level utility function. Builds a MontyConfig object given some the number of LMs and Monty/SM/LM classes with their associated arguments.
  • make_multi_lm_connectivity: generates sm_to_agent_dict, sm_to_lm_matrix, lm_to_lm_matrix, and lm_to_lm_vote_matrix for "typical" multi-LM setups (i.e., 1 agent, 1:1 SM -> LM connectivity, all-to-all voting, and no hierarchy). Used by make_multi_lm_monty_config and may be useful if you write your MontyConfig generator.

Added to make_dataset_configs.py

  • make_multi_lm_habitat_dataset_args: The top-level utility function. Builds a MultiLMMountHabitatDatasetArgs object given the number of SMs and optionally some mount parameters (such as sensor positions, zooms, etc.).
  • make_multi_lm_mount_config: Builds a dictionary of mount parameters (positions, resolutions, etc.).
  • make_multi_lm_sensor_positions: Generates sensor positions. See my comment for more info.

Here's a usage example:

# Set up arguments for `make_multi_lm_monty_config`.
monty_config_args = [
    MontyForGraphMatching,  # monty_class
    DisplacementGraphLM,  # learning_module_class
    dict(k=5, match_attribute="displacement"),  # learning_module_args
    HabitatDistantPatchSM,  # sensor_module_class
    dict(features=["pose_vectors, "on_object, ...]),  # sensor_module_args
    NaiveScanPolicy,  # motor_system_class
    make_naive_scan_policy_config(step_size=5),  # motor_system_args
    dict(num_exploratory_steps=500),  # monty_args
]

pretrain_dist_agent_16lm = dict(
    experiment_class=MontySupervisedObjectPretrainingExperiment,
    experiment_args=ExperimentArgs(
        n_train_epochs=len(TRAIN_ROTATIONS),
        do_eval=False,
    ),
    logging_config=PretrainLoggingConfig(run_name="dist_agent_16lm"),
    monty_config=make_multi_lm_monty_config(16, *monty_config_args),  # <----------
    # Set up environment and agent.
    dataset_class=ED.EnvironmentDataset,
    dataset_args=make_multi_lm_habitat_dataset_args(16),  # <----------
    # Set up training dataloader.
    train_dataloader_class=ED.InformedEnvironmentDataLoader,
    train_dataloader_args=EnvironmentDataloaderPerObjectArgs(
        object_names=SHUFFLED_YCB_OBJECTS,
        object_init_sampler=PredefinedObjectInitializer(rotations=TRAIN_ROTATIONS),
    ),
)

I'm currently using a pattern like this in dmc_pretraining_experiments.py and dmc_eval_experiments.py (though I haven't submitted a monty_lab PR yet). It made it easy to add 4, 8, and 16 LM experiments and removed the need for a bunch of hand-made dataclasses (farewell toTenLMMountHabitatDatasetArgs, EnvInitArgsTenLMMount, and TenLMMountConfig).

Testing: I re-wrote DMC experiments to use these config generators. I then randist_agent_5lm and dist_agent_5lm_randrot_noise, and the results were identical. I've also pretrained models with 2, 4, 5, 8, 9, 10, and 16 LMs, and the object models look good.

16LMs

@scottcanoe
Copy link
Contributor Author

scottcanoe commented Dec 6, 2024

A note on generating sensor module positions: I've implemented two choices for determining how to place sensor modules. The function make_multi_lm_sensor_positions has the parameters n_sms (number of sensor modules), delta (grid spacing, default is 1 cm), and order_by. order_by can be either "distance" or "spiral".

  • "spiral" is pretty straightforward. It starts at the center of a grid and spirals outward.
  • "distance" also spirals, but it prioritizes sensor module positions that are closest to the center.

For example, here are arrangements generated for 5 sensor modules:
n_sms_5_distance

Numbers indicate sensor module ID, and black squares don't have sensor modules ("view_finder" just indicates the last generated position which is always the same as sensor module 0).

Here's another for 16 sensor modules:
n_sms_16_distance

Though the "distance" method is a bit more jagged or aesthetically weird, I think it's the better choice since, in principal, it should reduce the occurrence of off-object observations for peripheral sensor modules. So I let "distance" by the default argument for make_multi_lm_sensor_positions.

I should note that you can always supply your own positions to make_multi_lm_habitat_dataset_args. For example, if you wanted to use the "spiral" method or shorter spacing between sensor modules, you can generate those with make_multi_lm_sensor_positions or create them manually and pass those positions into make_multi_lm_habitat_dataset_args.

@scottcanoe scottcanoe marked this pull request as ready for review December 6, 2024 22:40
@scottcanoe scottcanoe added the enhancement New feature or request label Dec 6, 2024
Copy link
Contributor

@vkakerbeck vkakerbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this looks very useful! I left a few comments, mostly around variable and function names.
On a more general note, it would be nice if you can include the usage example from your PR description either in the documentation or in the docstring. In the usage example it would be good to specify the monty_config_args as a dictionary with keys instead of a list where it matters in which order they are specified and it is a lot less verbose (hence why you probably added comments)

src/tbp/monty/frameworks/config_utils/config_args.py Outdated Show resolved Hide resolved
src/tbp/monty/frameworks/config_utils/config_args.py Outdated Show resolved Hide resolved
src/tbp/monty/frameworks/config_utils/config_args.py Outdated Show resolved Hide resolved
@tristanls tristanls added the triaged This issue or pull request was triaged label Dec 9, 2024
@scottcanoe
Copy link
Contributor Author

I'm currently scrubbing references to learning modules in my make_dataset_configs.py functions in favor of multi_sensor..., but it occurs to me that the existing code uses multi-LM language. For example, MultiLMMountConfig and FiveLMMountConfig. Maybe we want to settle on the language and change these names as well.

Changed `make_mult_lm_monty_config` to take (required) keyword arguments. Also changed it to accept a `connectivity_func` that produces connectivity matrices.
@scottcanoe
Copy link
Contributor Author

I'm ready for another round of review. Here's a brief summary of changes:

  • Some minor changes to variable names within functions to make them more descriptive.
  • I've replaced language related to learning modules and sensor modules with multi-sensor language in make_dataset_configs.py. This includes function names, parameters, and documentation. This is meant to prepare for future development where we may have more complex setups, but as I mentioned in a previous comment, this does clash somewhat with the multi-LM naming conventions used elsewhere in make_dataset_configs.py.
  • I've removed the required keyword-only pattern in make_multi_sensor_mount_config's signature.
  • In config_args.py, make_multi_lm_connectivity has been renamed to make_multi_lm_flat_dense_connectivity.
  • I've added a connectivity_func parameter in make_lm_monty_config to support the generation of non-flat, dense connectivity types.
  • I've also added the keyword-only pattern in make_multi_lm_monty_config to support function use with (unpacked) dictionaries instead of lists. However, I don't think many of the arguments have obvious default values, so I have made most of them required keyword arguments.

I haven't added a usage example to a docstring yet. I'm undecided as to where that should go (either make_multi_sensor_habitat_dataset_args, make_multi_lm_monty_config, or both?).

In general, I'm finding it a bit tricky to find the balance between a more generic approach that can support future use cases and functions that are useful now. For example, some functions (e.g., make_sensor_positions_on_grid) may be useful down the road in a multi-agent setting. However, pretty much every other function has certain (documented) constraints and assumptions baked in, such as 1:1 sensor -> sensor module relationships, 1:1 sensor module -> learning module relationships, a single agent, and the inclusion of a view finder.

I expect there will be some more back and forth about further changes.

Copy link
Contributor

@nielsleadholm nielsleadholm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice PR, thanks for the super clear documentation. Just added a few comments.

Re. the names of the functions, I agree it's a hard balance to strike - personally I'm ok with how they are now, i.e. avoiding making the names super specific, but then also very long-winded.

src/tbp/monty/frameworks/config_utils/config_args.py Outdated Show resolved Hide resolved
src/tbp/monty/frameworks/config_utils/config_args.py Outdated Show resolved Hide resolved
src/tbp/monty/frameworks/config_utils/config_args.py Outdated Show resolved Hide resolved
src/tbp/monty/frameworks/config_utils/config_args.py Outdated Show resolved Hide resolved
Copy link
Contributor

@vkakerbeck vkakerbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the updates! Looks very nice :) Happy to merge after you go through Niels comments.

Added optional `view_finder_config` parameter to `make_multi_lm_monty_config`.
Set `semantics` default to `False` except for the last entry which must be `True` for the view finder.
@scottcanoe scottcanoe marked this pull request as draft December 12, 2024 22:04
@scottcanoe
Copy link
Contributor Author

Update: defaulting to False values in the semantics mount config caused unexpected problems. I played around with a few things, such as

  • Changing use_semantic_sensor=False in transforms and anywhere else relevant since those will be the defaults soon.
  • Setting the semantics value for the view finder to always be true (i.e., semantics[-1] = True).
  • Different combinations of values for semantics and use_semantic_sensor.

One of two errors was thrown, depending on the changes I made. Since these problems tie into our planned changes to use_semantic_sensor defaults, I thought it best to downgrade this PR until those changes are complete. It should be a bit easier to tackle these errors when I have a better idea of how the semantic sensor logic works, and it's probably best to test this code against the new defaults anyways.

@scottcanoe scottcanoe removed the triaged This issue or pull request was triaged label Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants