feat: A basic simulator (#197)
stufisher authored Jul 22, 2022
1 parent 9a7eb6a commit c087e76
Showing 10 changed files with 688 additions and 1 deletion.
66 changes: 66 additions & 0 deletions docs/simulator.md
@@ -0,0 +1,66 @@
# Simulator

`ispyb.simulate` creates a new DataCollection row in the ISPyB database from a simple YAML definition. It creates a data collection, related sample information, and associated shipping entities, then copies some raw data and the associated snapshots and thumbnails.

Simulate a data collection:

```bash
ispyb.simulate <beamline> <experiment>
ispyb.simulate bm23 energy_scan1
```

The simulator creates, hierarchically, a component (`Protein`), a related `BLSample` (with an intermediate `Crystal`), and optionally a `SubSample`, contained within a `Container`, `Dewar`, and `Shipment` belonging to the specified `Proposal`, if they do not already exist with the defined name. It then creates a `DataCollection` and `DataCollectionGroup` linked to the relevant `BLSample` and `BLSession`. If grid information is specified, it also creates an entry in `GridInfo`.
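The "create only if it does not already exist with the defined name" behaviour can be pictured as a get-or-create lookup. The following is a minimal in-memory sketch, not the simulator's actual code: the `registry` dict, the `get_or_create` helper, the name `Simulation`, and the order in which the tables are visited are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical in-memory stand-in for the database: {(table, name): row}
registry: dict[tuple[str, str], dict[str, Any]] = {}


def get_or_create(table: str, name: str, **columns: Any) -> tuple[dict[str, Any], bool]:
    """Return (row, created), reusing an existing row with the same table and name."""
    key = (table, name)
    if key in registry:
        return registry[key], False
    row = {"name": name, **columns}
    registry[key] = row
    return row, True


# Illustrative order only: outermost shipping entities first, then the sample
for table in ("Shipment", "Dewar", "Container", "Protein", "Crystal", "BLSample"):
    get_or_create(table, "Simulation")

# A second request for the same name reuses the existing row
row, created = get_or_create("BLSample", "Simulation")
```

Running the loop twice leaves six rows in `registry`, which is the effect the paragraph above describes for repeated simulator runs.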

## Configuration

The configuration file location is defined via the `SIMULATION_CONFIG` environment variable. An example configuration is available in `examples/simulation.yml`. The structure and requirements of this file are documented in the example.

Each entry in `experiments` represents a different data collection. The `experimentType` key maps to `DataCollectionGroup.experimentType`, so it must match one of the types available in the database. See [experimentType](https://github.com/ispyb/ispyb-database/blob/main/schema/1_tables.sql#L1518) for the full list.

## Available columns per table

The ISPyB tables are large, so the simulator exposes only a subset of their columns: those most pertinent to creating usable data collections and associated entries. The available columns are listed below for each table.

### Component (Protein)

- acronym
- name
- sequence
- density
- molecularMass
- description

### BLSample

- name

### BLSubSample

- x
- y
- x2
- y2
- type

### DataCollection

- imageContainerSubPath
- numberOfImages
- wavelength
- exposureTime
- xtalSnapshotFullPath1-4

### GridInfo

- steps_x
- steps_y
- snapshot_offsetXPixel
- snapshot_offsetYPixel
- dx_mm
- dy_mm
- pixelsPerMicronX
- pixelsPerMicronY

## Plugins

The simulator can trigger events before and after the data is copied using the `ispyb.simulator.before_datacollection` and `ispyb.simulator.after_datacollection` entry points. Each plugin is passed only the new `DataCollection.dataCollectionId`.
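A plugin can be any callable that accepts the new `dataCollectionId`. The following is a minimal sketch, assuming a hypothetical package `my_plugins` that registers its callables through setuptools; the function names, the entry point names `log_start` and `log_end`, and the `ENTRY_POINTS` variable are illustrative and not part of the simulator.

```python
import logging

logger = logging.getLogger(__name__)


def on_before(dataCollectionId: int) -> None:
    """Hypothetical plugin, called before the data is copied."""
    logger.info("Data collection %s is about to start", dataCollectionId)


def on_after(dataCollectionId: int) -> None:
    """Hypothetical plugin, called after the data is copied."""
    logger.info("Data collection %s has finished", dataCollectionId)


# Hypothetical registration, e.g. passed as setup(entry_points=ENTRY_POINTS)
# in the setup.py of the assumed `my_plugins` package
ENTRY_POINTS = {
    "ispyb.simulator.before_datacollection": ["log_start = my_plugins:on_before"],
    "ispyb.simulator.after_datacollection": ["log_end = my_plugins:on_after"],
}
```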
91 changes: 91 additions & 0 deletions examples/simulation.yml
@@ -0,0 +1,91 @@
# Whether to link or copy data
copy_method: copy

# Map each beamline to a session
sessions:
  bl: blc00001-1

# Where to copy raw data from
raw_data: /data/ispyb-test

# Where to write simulated data to, can use {beamline} placeholder
data_dir: /data/tests/{beamline}/simulation

ispyb_url: https://ispyb.esrf.fr

# Define Components (Proteins)
components:
  # an internal reference for the component
  comp1:
    # columns to populate for this component
    acronym: Component1
    sequence: SiSP
    molecularMass: 12.5

  comp2:
    acronym: Component2

# Define BLSamples
samples:
  # an internal reference for this sample
  samp1:
    # columns to populate for this sample
    name: Sample1
    # which component this sample is an instance of (one of the keys in components above)
    component: comp1

  samp2:
    name: Sample2
    component: comp2

# Define Experiments (DataCollections)
experiments:
  # a shortname for this experiment (available via cli)
  energy_scan1:
    # the experimentType, must map to a valid type in DataCollectionGroup.experimentType
    experimentType: OSC
    # data will be split into its respective imageDirectory and fileTemplate columns
    data: osc/oscillation.h5
    # which sample to link this data collection to (one of the keys in samples above)
    sample: samp1

    # columns to populate
    # xtalSnapshot thumbnails should have a trailing t
    # Fullsize image: osc/snapshot1.png
    # Thumbnail: osc/snapshot1t.png
    xtalSnapshotFullPath1: osc/snapshot1.png
    numberOfImages: 4001
    exposureTime: 1
    #energy: 8.8143
    wavelength: 1.4065
    imageContainerSubPath: 1.1/measurement

  xrf_map1:
    experimentType: Mesh
    data: mesh/mesh.h5
    sample: samp1

    xtalSnapshotFullPath1: mesh/snapshot1.png
    numberOfImages: 1600
    exposureTime: 0.03
    #energy: 2.4817
    wavelength: 4.9959

    # additionally populate GridInfo
    grid:
      steps_x: 40
      steps_y: 40
      dx_mm: 0.001
      dy_mm: 0.001
      pixelsPerMicronX: -0.44994
      pixelsPerMicronY: -0.46537
      snapshot_offsetXPixel: 682.16
      snapshot_offsetYPixel: 554

    # additionally populate BlSubSample
    subsample:
      x: 9038007
      y: 24467003
      x2: 9078007
      y2: 24507003
      type: roi
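The comments in this example require each `component` and `sample` value to be one of the keys defined earlier in the file. Below is a small sketch of how such references could be checked once the file has been parsed into a dict. `check_references` is a hypothetical helper, not part of the simulator, and the inline dict is a trimmed copy of the example with one deliberately broken entry added.

```python
from typing import Any


def check_references(config: dict[str, Any]) -> list[str]:
    """Return a list of problems; the list is empty if all references resolve."""
    problems = []
    components = config.get("components", {})
    samples = config.get("samples", {})

    # Each sample must point at a key defined under `components`
    for key, sample in samples.items():
        component = sample.get("component")
        if component not in components:
            problems.append(f"sample `{key}` refers to unknown component `{component}`")

    # Each experiment must point at a key defined under `samples`
    for key, experiment in config.get("experiments", {}).items():
        sample_ref = experiment.get("sample")
        if sample_ref not in samples:
            problems.append(f"experiment `{key}` refers to unknown sample `{sample_ref}`")

    return problems


config = {
    "components": {"comp1": {"acronym": "Component1"}},
    "samples": {"samp1": {"name": "Sample1", "component": "comp1"}},
    "experiments": {
        "energy_scan1": {"experimentType": "OSC", "sample": "samp1"},
        # Deliberately broken entry for illustration
        "broken": {"experimentType": "Mesh", "sample": "samp9"},
    },
}

problems = check_references(config)
```

Here `problems` contains a single message about the `broken` experiment, while `energy_scan1` passes because `samp1` and `comp1` are both defined.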
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -8,6 +8,7 @@ nav:
  - Routes: routes.md
  - ⧉ Tests coverage: https://app.codecov.io/gh/ispyb/py-ispyb/
  - ⧉ Endpoints documentation: https://ispyb.github.io/py-ispyb/api/
  - Simulator: simulator.md
theme:
  name: material
  features:
4 changes: 4 additions & 0 deletions pyispyb/app/extensions/database/definitions.py
@@ -14,6 +14,10 @@
        models.BLSession.visit_number,
    ).label("session")

_proposal = sqlalchemy.func.concat(
    models.Proposal.proposalCode, models.Proposal.proposalNumber
).label("proposal")


def get_blsession(session: str) -> Optional[models.BLSession]:
    return (
2 changes: 2 additions & 0 deletions pyispyb/config.py
@@ -65,6 +65,8 @@ class Settings(BaseSettings):

    cors: bool = False

    simulation_config: str = None

    class Config:
        env_file = get_env_file()

Empty file added pyispyb/simulation/__init__.py
Empty file.
79 changes: 79 additions & 0 deletions pyispyb/simulation/base.py
@@ -0,0 +1,79 @@
from abc import ABC, abstractmethod
from contextlib import contextmanager
import logging
import os
import pkg_resources
from typing import Any

import yaml

from ..config import settings
from ..app.extensions.database.session import _session

logger = logging.getLogger(__name__)


def load_config() -> dict[str, Any]:
    if not settings.simulation_config:
        raise RuntimeError("`SIMULATION_CONFIG` environment variable is not defined")

    if not os.path.exists(settings.simulation_config):
        raise AttributeError(f"Cannot find config file: `{settings.simulation_config}`")

    config = {}
    with open(settings.simulation_config, "r") as stream:
        config = yaml.safe_load(stream)

    return config


class Simulation(ABC):
    def __init__(self):
        self._config = load_config()

    @property
    def config(self) -> dict[str, Any]:
        return self._config

    @contextmanager
    def session(self):
        db_session = _session()
        try:
            yield db_session
            db_session.commit()
        except Exception as e:  # noqa
            db_session.rollback()
            raise
        finally:
            db_session.close()

    @property
    def beamlines(self) -> list[str]:
        return self.config["sessions"].keys()

    @property
    def experiment_types(self) -> list[str]:
        return self.config["experiments"].keys()

    def before_start(self, dataCollectionId: int) -> None:
        for entry in pkg_resources.iter_entry_points(
            "ispyb.simulator.before_datacollection"
        ):
            fn = entry.load()
            logger.info(f"Executing before start plugin `{entry.name}`")
            fn(dataCollectionId)

    def after_end(self, dataCollectionId: int) -> None:
        for entry in pkg_resources.iter_entry_points(
            "ispyb.simulator.after_datacollection"
        ):
            fn = entry.load()
            logger.info(f"Executing after end plugin `{entry.name}`")
            fn(dataCollectionId)

    def do_run(self, *args, **kwargs) -> None:
        self.run(*args, **kwargs)

    @abstractmethod
    def run(self, *args, **kwargs) -> None:
        pass
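The `session` context manager in this file commits on success, rolls back on any exception, and always closes the session. The same pattern is shown below in isolation. `FakeSession` and `session_scope` are hypothetical names; the fake session only records the calls made on it, so the sketch runs without a database.

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stand-in for a database session; records the calls made."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def commit(self) -> None:
        self.calls.append("commit")

    def rollback(self) -> None:
        self.calls.append("rollback")

    def close(self) -> None:
        self.calls.append("close")


@contextmanager
def session_scope(db_session: FakeSession):
    try:
        yield db_session
        db_session.commit()  # reached only if the with-block did not raise
    except Exception:
        db_session.rollback()
        raise  # re-raise so the caller still sees the failure
    finally:
        db_session.close()  # runs on both paths


ok = FakeSession()
with session_scope(ok):
    pass  # work succeeds

failed = FakeSession()
try:
    with session_scope(failed):
        raise ValueError("simulated failure")
except ValueError:
    pass
```

After this runs, `ok.calls` is `["commit", "close"]` and `failed.calls` is `["rollback", "close"]`.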
51 changes: 51 additions & 0 deletions pyispyb/simulation/cli.py
@@ -0,0 +1,51 @@
import argparse
import logging

from .datacollection import SimulateDataCollection


logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def run() -> None:
    try:
        sdc = SimulateDataCollection()
    except AttributeError as e:
        exit(f"Simulation Error: {e}")

    parser = argparse.ArgumentParser(description="ISPyB simulation tool")
    parser.add_argument(
        "beamline", help="Beamline to run simulation against", choices=sdc.beamlines
    )

    parser.add_argument(
        "experiment", help="Experiment to simulate", choices=sdc.experiment_types
    )

    parser.add_argument(
        "--delay",
        default=5,
        type=int,
        dest="delay",
        help="Delay between plugin start and end events",
    )
    parser.add_argument(
        "--debug",
        action="store_true",
        help="Enable debug output",
    )

    args = parser.parse_args()

    root = logging.getLogger()
    root.setLevel(level=logging.DEBUG if args.debug else logging.INFO)

    try:
        sdc.do_run(args.beamline, args.experiment, delay=args.delay)
    except Exception as e:
        if args.debug:
            logger.exception("Simulation Error")
            print(e)
        else:
            print(f"Simulation Error: {e}")
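Because the `choices` for both positional arguments come from the configuration, argparse rejects any beamline or experiment that is not defined there. The following self-contained sketch uses a hypothetical inline `config` dict in place of `SimulateDataCollection`:

```python
import argparse

# Hypothetical config, standing in for the parsed simulation YAML
config = {
    "sessions": {"bl": "blc00001-1"},
    "experiments": {"energy_scan1": {}, "xrf_map1": {}},
}

parser = argparse.ArgumentParser(description="ISPyB simulation tool")
parser.add_argument("beamline", choices=list(config["sessions"]))
parser.add_argument("experiment", choices=list(config["experiments"]))
parser.add_argument("--delay", default=5, type=int)

# A valid combination parses normally
args = parser.parse_args(["bl", "xrf_map1", "--delay", "2"])

# An unknown experiment makes argparse print a usage error and raise SystemExit
try:
    parser.parse_args(["bl", "unknown"])
    rejected = False
except SystemExit:
    rejected = True
```

Deriving `choices` from the config keeps the command line and the YAML file in step: adding an experiment to the file makes it available to the CLI without code changes.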