flepimop patch Abilities And Documentation #423

Merged
merged 13 commits into from
Dec 11, 2024
41 changes: 6 additions & 35 deletions documentation/gitbook/how-to-run/multi-configs.md
@@ -20,58 +20,29 @@ You should see an assortment of yml files as a result of that `ls` command.

## Usage

If you run
If you run:

```bash
flepimop simulate config_sample_2pop.yml
```

you'll get a basic foward simulation of this example model. However, you might also note there are several `*_part.yml` files, corresponding to partial configs. You can `simulate` using the combination of multiple configs with, for example:
You'll get a basic forward simulation of this example model. However, you might also note there are several `*_part.yml` files, corresponding to partial configs. You can `simulate` using the combination of multiple configs with, for example:

```bash
flepimop simulate config_sample_2pop.yml config_sample_2pop_outcomes_part.yml
```

if want to see what the combined configuration is, you can use the `patch` command:
While simulate can run your patched configuration, we also suggest you check your configuration file using the patch command:

```bash
flepimop patch config_sample_2pop.yml config_sample_2pop_outcomes_part.yml
flepimop patch config_sample_2pop.yml config_sample_2pop_outcomes_part.yml > config_new.yml
cat config_new.yml
```

You may provide an arbitrary number of separate configuration files to combine to create a complete configuration.

## Caveats

At this time, only `simulate` supports multiple configuration files. Also, the patching operation is fairly crude: configuration options override previous ones completely, though with a warning. The files provided from left to right are from lowest priority (i.e. for the first file, only options specified in no other files are used) to highest priority (i.e. for the last file, its options override any other specification).
At this time, only simulate directly supports multiple configuration files, and our current patching capabilities only allow for the addition of new sections as given in our tutorials. This is helpful for building models piece-by-piece from a simple compartmental forward simulation, to including outcome probabilities, and finally, adding modifier sections. If multiple configuration files specify the same higher level configuration chunks (e.g., seir, outcomes), this will yield an error.

We are expanding coverage of this capability to other flepimop actions, e.g. inference, and are exploring options for smarter patching.
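The new caveat (overlapping top-level sections are an error rather than a silent override) can be illustrated with plain dictionaries. This is only a minimal sketch: `merge_top_level` is a hypothetical helper written for this example, not part of gempyor, and it uses dicts where the real code uses confuse configuration objects. The error wording mirrors the message added in `shared_cli.py` in this PR.

```python
from typing import Any


def merge_top_level(named_configs: list[tuple[str, dict[str, Any]]]) -> dict[str, Any]:
    """Combine configs, refusing files that repeat an existing top-level key.

    Hypothetical helper for illustration only; not a gempyor function.
    """
    merged: dict[str, Any] = {}
    for name, data in named_configs:
        # Any top-level key already seen in an earlier file is a conflict.
        overlap = sorted(set(data) & set(merged))
        if overlap:
            raise ValueError(
                "Configuration files contain overlapping keys, "
                f"{', '.join(overlap)}, introduced by {name}."
            )
        merged.update(data)
    # Record where the combined configuration came from.
    merged["config_src"] = [name for name, _ in named_configs]
    return merged


# Made-up stand-ins for a base config and a partial config.
base = {"name": "sample_2pop", "seir": {"parameters": {}}}
outcomes_part = {"outcomes": {"method": "delayframe"}}

combined = merge_top_level([("base.yml", base), ("outcomes_part.yml", outcomes_part)])
print(sorted(combined))

# Two files that both define `seir` are rejected instead of silently overridden.
try:
    merge_top_level([("base.yml", base), ("other.yml", {"seir": {}})])
except ValueError as exc:
    print(exc)
```

Because conflicts raise instead of overriding, the order of the files no longer changes which section wins; it only affects which file is named in the error.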

However, currently there are pitfalls like

```yaml
# config1
seir_modifiers:
  scenarios: ["one", "two"]
  one:
    # ...
  two:
    # ...
```

```yaml
# config2
seir_modifiers:
  scenarios: ["one", "three"]
  one:
    # ...
  three:
    # ...
```

Then you might expect

```bash
flepimop simulate config1.yml config2.yml
```

...to override seir scenario one and add scenario three, but what actually happens is that the entire seir_modifiers from config1 is overriden by config2. Specifying the configuration files in the reverse order would lead to a different outcome (the config1 seir_modifiers overrides config2 settings). If you're doing complex combinations of configuration files, you should use `flepimop patch ...` to ensure you're getting what you expect.
97 changes: 90 additions & 7 deletions flepimop/gempyor_pkg/src/gempyor/cli.py
@@ -1,4 +1,5 @@
from click import pass_context, Context
import click
import yaml

from .shared_cli import (
config_files_argument,
@@ -7,7 +8,7 @@
cli,
mock_context,
)
from .utils import config
from .utils import _dump_formatted_yaml, config

# register the commands from the other modules
from . import compartments, simulate
@@ -20,12 +21,94 @@


# add some basic commands to the CLI
@cli.command(params=[config_files_argument] + list(config_file_options.values()))
@pass_context
def patch(ctx: Context = mock_context, **kwargs) -> None:
"""Merge configuration files"""
@cli.command(
params=[config_files_argument] + list(config_file_options.values()),
context_settings=dict(help_option_names=["-h", "--help"]),
)
@click.pass_context
def patch(ctx: click.Context = mock_context, **kwargs) -> None:
"""Merge configuration files

This command will merge multiple config files together by overriding the top level
keys in config files. The order of the config files is important, as the last file
Contributor:

We should frame the example in the way we actually expect people to use the tool: `flepimop patch config1.yml config2.yml > confignew.yml`. Showing the output is still useful, so maybe do the piping, followed by a `cat`?

Contributor Author:

Ah, good point, I've amended the last commit to reflect this.

has the highest priority and the first has the lowest.

A brief example of the command is shown below using the sample config files from the
`examples/tutorials` directory. The command will merge the two files together and
print the resulting configuration to the console.

\b
```bash
$ flepimop patch config_sample_2pop_modifiers_part.yml config_sample_2pop_outcomes_part.yml > config_sample_2pop_patched.yml
$ cat config_sample_2pop_patched.yml
write_csv: false
stoch_traj_flag: false
jobs: 14
write_parquet: true
first_sim_index: 1
config_src: [config_sample_2pop_modifiers_part.yml, config_sample_2pop_outcomes_part.yml]
seir_modifiers:
    scenarios: [Ro_lockdown, Ro_all]
    modifiers:
        Ro_lockdown:
            method: SinglePeriodModifier
            parameter: Ro
            period_start_date: 2020-03-15
            period_end_date: 2020-05-01
            subpop: all
            value: 0.4
        Ro_relax:
            method: SinglePeriodModifier
            parameter: Ro
            period_start_date: 2020-05-01
            period_end_date: 2020-08-31
            subpop: all
            value: 0.8
        Ro_all:
            method: StackedModifier
            modifiers: [Ro_lockdown, Ro_relax]
outcome_modifiers:
    scenarios: [test_limits]
    modifiers:
        test_limits:
            method: SinglePeriodModifier
            parameter: incidCase::probability
            subpop: all
            period_start_date: 2020-02-01
            period_end_date: 2020-06-01
            value: 0.5
outcomes:
    method: delayframe
    outcomes:
        incidCase:
            source:
                incidence:
                    infection_stage: I
            probability:
                value: 0.5
            delay:
                value: 5
        incidHosp:
            source:
                incidence:
                    infection_stage: I
            probability:
                value: 0.05
            delay:
                value: 7
            duration:
                value: 10
                name: currHosp
        incidDeath:
            source: incidHosp
            probability:
                value: 0.2
            delay:
                value: 14
```
"""
parse_config_files(config, ctx, **kwargs)
print(config.dump())
print(_dump_formatted_yaml(config))


if __name__ == "__main__":
25 changes: 18 additions & 7 deletions flepimop/gempyor_pkg/src/gempyor/shared_cli.py
@@ -222,7 +222,9 @@ def _parse_option(param: click.Parameter, value: Any) -> Any:
config_src = []
if len(found_configs) != 1:
if not found_configs:
raise ValueError(f"No config files provided.")
click.echo("No configuration provided! See help for required usage:\n")
click.echo(ctx.get_help())
ctx.exit()
else:
error_dict = {k: kwargs[k] for k in found_configs}
raise ValueError(
@@ -237,21 +239,30 @@ def _parse_option(param: click.Parameter, value: Any) -> Any:
)
config_src = _parse_option(config_validator, kwargs[config_key])
cfg.clear()
cfg_data = {}
for config_file in config_src:
tmp = confuse.Configuration("tmp")
tmp.set_file(config_file)
if intersect := set(tmp.keys()) & set(cfg.keys()):
warnings.warn(f"Configuration files contain overlapping keys: {intersect}.")
cfg.set_file(config_file)
if intersect := set(tmp.keys()) & set(cfg_data.keys()):
intersect = ", ".join(sorted(list(intersect)))
raise ValueError(
"Configuration files contain overlapping keys, "
f"{intersect}, introduced by {config_file}."
)
for k in tmp.keys():
cfg_data[k] = tmp[k].get()
cfg.set(cfg_data)
cfg["config_src"] = [str(k) for k in config_src]

# deal with the scenario overrides
scen_args = {k for k in parsed_args if k.endswith("scenarios") and kwargs.get(k)}
for option in scen_args:
scen_args = {k for k in parsed_args if k.endswith("_scenarios")}
for option in {s for s in scen_args if kwargs.get(s)}:
key = option.replace("_scenarios", "")
value = _parse_option(config_file_options[option], kwargs[option])
if cfg[key].exists():
cfg[key]["scenarios"] = as_list(value)
cfg[key]["scenarios"] = (
list(value) if isinstance(value, tuple) else as_list(value)
)
else:
raise ValueError(
f"Specified {option} when no {key} in configuration file(s): {config_src}"
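The scenario-override branch in the hunk above can be sketched without confuse or click. This is an illustration under stated assumptions: `apply_scenario_overrides` is a hypothetical name, the config is a plain dict, and the list normalization is written inline instead of calling gempyor's `as_list`. It shows the three behaviors visible in the diff: empty options are skipped, tuples become lists, and an override for a missing section raises.

```python
from typing import Any


def apply_scenario_overrides(
    cfg: dict[str, Any], options: dict[str, Any]
) -> dict[str, Any]:
    """Apply `*_scenarios` options onto matching sections of `cfg` in place.

    Hypothetical helper for illustration only; not a gempyor function.
    """
    for option, value in options.items():
        # Only non-empty options named like `<section>_scenarios` are overrides.
        if not option.endswith("_scenarios") or not value:
            continue
        key = option.removesuffix("_scenarios")
        if key not in cfg:
            raise ValueError(
                f"Specified {option} when no {key} in configuration file(s)."
            )
        # Multi-value CLI options typically arrive as tuples; store a list.
        if isinstance(value, (tuple, list)):
            cfg[key]["scenarios"] = list(value)
        else:
            cfg[key]["scenarios"] = [value]
    return cfg


# Made-up example: restrict a config to a single seir modifier scenario.
example = {"seir_modifiers": {"scenarios": ["Ro_lockdown", "Ro_all"]}}
apply_scenario_overrides(
    example,
    {"seir_modifiers_scenarios": ("Ro_all",), "outcome_modifiers_scenarios": ()},
)
print(example["seir_modifiers"]["scenarios"])
```

Converting tuples to lists before storing them keeps the dumped configuration a plain YAML sequence rather than a Python-specific tuple representation.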
4 changes: 3 additions & 1 deletion flepimop/gempyor_pkg/src/gempyor/simulate.py
@@ -300,7 +300,9 @@ def simulate(


@cli.command(
name="simulate", params=[config_files_argument] + list(config_file_options.values())
name="simulate",
params=[config_files_argument] + list(config_file_options.values()),
context_settings=dict(help_option_names=["-h", "--help"]),
)
@pass_context
def _click_simulate(ctx: Context, **kwargs) -> int:
72 changes: 72 additions & 0 deletions flepimop/gempyor_pkg/src/gempyor/utils.py
@@ -17,6 +17,7 @@
import scipy.ndimage
import scipy.stats
import sympy.parsing.sympy_parser
import yaml

from . import file_paths

@@ -1039,3 +1040,74 @@ def move_file_at_local(name_map: dict[str, str]) -> None:
for src, dst in name_map.items():
os.path.makedirs(os.path.dirname(dst), exist_ok=True)
shutil.copy(src, dst)


def _dump_formatted_yaml(cfg: confuse.Configuration) -> str:
    """
    Dump confuse configuration to a formatted YAML string.

    Args:
        cfg: The confuse configuration object.

    Returns:
        A formatted YAML string representation of the configuration.

    Examples:
        >>> from gempyor.utils import _dump_formatted_yaml
        >>> import confuse
        >>> conf = confuse.Configuration("foobar")
        >>> data = {
        ...     "name": "Test Config",
        ...     "compartments": {
        ...         "infection_stage": ["S", "E", "I", "R"]
        ...     },
        ...     "seir": {
        ...         "parameters": {
        ...             "beta": {"value": 3.4},
        ...             "gamma": {"value": 5.6},
        ...         },
        ...         "transitions": {
        ...             "source": ["S"],
        ...             "destination": ["E"],
        ...             "rate": ["beta * gamma"],
        ...             "proportional_to": [["S"], ["I"]],
        ...             "proportion_exponent": [1, 1],
        ...         },
        ...     },
        ... }
        >>> conf.set(data)
        >>> print(_dump_formatted_yaml(conf))
        name: "Test Config"
        compartments:
            infection_stage: [S, E, I, R]
        seir:
            parameters:
                beta:
                    value: 3.4
                gamma:
                    value: 5.6
            transitions:
                source: [S]
                destination: [E]
                rate: ["beta * gamma"]
                proportional_to: [[S], [I]]
                proportion_exponent: [1, 1]
    """

    class CustomDumper(yaml.Dumper):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.add_representer(list, self._represent_list)
            self.add_representer(str, self._represent_str)

        def _represent_list(self, dumper, data):
            return dumper.represent_sequence("tag:yaml.org,2002:seq", data, flow_style=True)

        def _represent_str(self, dumper, data):
            if " " in data:
                return dumper.represent_scalar("tag:yaml.org,2002:str", data, style='"')
            return dumper.represent_scalar("tag:yaml.org,2002:str", data)

    return yaml.dump(
        yaml.safe_load(cfg.dump()), Dumper=CustomDumper, indent=4, sort_keys=False
    )