-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #2 from thehyve/first-version
First version of the CDM
- Loading branch information
Showing
68 changed files
with
11,499 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<!--- | ||
Thank you for your soon-to-be pull request. Before you submit this, please | ||
double check to make sure that you've added an entry to CHANGELOG.md. | ||
--> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
name: Ruff | ||
on: [push, pull_request] | ||
jobs: | ||
ruff: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- name: Set up Python | ||
uses: actions/setup-python@v4 | ||
- name: Install Poetry | ||
uses: snok/install-poetry@v1 | ||
- name: Install nox | ||
run: pip install nox | ||
- name: Run ruff linter via nox session | ||
run: nox -s lint |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# This workflow will install Python dependencies, run tests with a range of Python versions | ||
|
||
name: tests | ||
|
||
on: [push, pull_request] | ||
|
||
jobs: | ||
tests: | ||
runs-on: ubuntu-latest | ||
|
||
strategy: | ||
matrix: | ||
python-version: [ | ||
'3.8', | ||
'3.9', | ||
'3.10', | ||
'3.11', | ||
'3.12', | ||
] | ||
|
||
steps: | ||
- uses: actions/checkout@v4 | ||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v4 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
- name: Install Poetry | ||
uses: snok/install-poetry@v1 | ||
- name: Install package and dependencies | ||
run: poetry install | ||
- name: Test with pytest | ||
run: poetry run pytest -vs |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# See https://pre-commit.com for more information | ||
# See https://pre-commit.com/hooks.html for more hooks | ||
repos: | ||
- repo: https://github.com/pre-commit/pre-commit-hooks | ||
rev: v4.5.0 | ||
hooks: | ||
- id: trailing-whitespace | ||
- id: end-of-file-fixer | ||
- id: check-yaml | ||
- id: check-added-large-files | ||
- repo: https://github.com/astral-sh/ruff-pre-commit | ||
# Ruff version. | ||
rev: v0.3.2 | ||
hooks: | ||
# Run the linter. | ||
- id: ruff | ||
args: [ --fix ] | ||
# Run the formatter. | ||
- id: ruff-format |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Changelog | ||
|
||
## v0.1.0 | ||
First release. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# omop-cdm | ||
|
||
omop-cdm is a Python package that contains SQLAlchemy declarative table definitions of several | ||
versions of the [OHDSI OMOP CDM](https://ohdsi.github.io/CommonDataModel/). | ||
|
||
## Installation | ||
|
||
omop-cdm requires Python >= 3.8. | ||
|
||
Install from PyPI: | ||
```shell | ||
pip install omop-cdm | ||
``` | ||
|
||
## Usage | ||
|
||
See [User documentation](docs/README.md) | ||
|
||
## Supported databases | ||
The omop-cdm table definitions are tested to be compatible with PostgreSQL. | ||
|
||
Though not officially supported, omop-cdm doesn't use postgres-specific features | ||
of SQLAlchemy, so it can likely be used for other database types as well. | ||
|
||
## CDM versions | ||
omop-cdm contains table defintions for the following CDM versions: | ||
- CDM 5.4 | ||
- CDM 5.3.1 | ||
- CDM 6.0.0 ([not recommended](https://ohdsi.github.io/CommonDataModel/cdm60.html#NOTE_ABOUT_CDM_v60)) | ||
|
||
## Development | ||
|
||
### Setup steps | ||
|
||
- Make sure [Poetry](https://python-poetry.org/docs/#installation) is installed. | ||
- Install the project and dependencies via `poetry install`. | ||
- Set up the pre-commit hook scripts via `poetry run pre-commit install`. | ||
|
||
### Nox sessions | ||
|
||
Several developer actions (e.g. run tests, code format, lint) are available | ||
via [nox](https://nox.thea.codes/en/stable/) sessions. | ||
For a complete list, run: | ||
```shell | ||
nox --list | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,201 @@ | ||
# Usage | ||
|
||
omop-cdm provides SQLAlchemy declarative table definitions that can be used | ||
to interact with an OMOP CDM via Python. This can be to more easily create all tables in | ||
your database, but also to use them for data manipulation | ||
(see [SQLAlchemy data manipulation](https://docs.sqlalchemy.org/en/20/tutorial/orm_data_manipulation.html)). | ||
|
||
## Preparation | ||
|
||
Before creating the OMOP CDM tables, the following must be available: | ||
|
||
- Target database | ||
- Schema (or schemas) in which to create the tables | ||
|
||
## Schemas | ||
|
||
omop-cdm uses placeholders for the schema names, which are intended to be | ||
replaced by the desired schema names. This can be done via the `schema_translate_map` | ||
property of SQLAlchemy. By default, the vocabulary tables have a different schema | ||
than the other tables, but both placeholders can be mapped to the same target | ||
schema, to make sure all tables are in a single schema (recommended). | ||
|
||
This example shows how to set a single schema "cdm54" via the SQLAlchemy engine: | ||
```python | ||
from omop_cdm.constants import CDM_SCHEMA, VOCAB_SCHEMA | ||
from sqlalchemy import create_engine | ||
|
||
|
||
OMOP_SCHEMA = "cdm54" | ||
|
||
schema_map = { | ||
CDM_SCHEMA: OMOP_SCHEMA, | ||
VOCAB_SCHEMA: OMOP_SCHEMA, | ||
} | ||
|
||
engine = create_engine("postgresql://postgres@localhost/mydb") | ||
engine = engine.execution_options(schema_translate_map=schema_map) | ||
``` | ||
|
||
## Regular vs Dynamic CDM | ||
|
||
The CDM table definitions of a particular CDM release can be imported in two different | ||
ways. | ||
|
||
### Regular | ||
To use a standard set of CDM tables, you can import the module containing the | ||
definitions as follows: | ||
```python | ||
from omop_cdm.regular import cdm54 | ||
``` | ||
In this regular fashion, all tables are already bound to a SQLAlchemy | ||
`DeclarativeBase` class. This approach leaves fewer options for CDM modification, | ||
but is slightly simpler to use. | ||
|
||
E.g. once the engine is defined, creating all the tables in a database can be done | ||
by simply running the following: | ||
|
||
```python | ||
with engine.begin() as conn: | ||
cdm54.Base.metadata.create_all(bind=conn) | ||
``` | ||
|
||
### Dynamic | ||
|
||
Alternatively, you can use the dynamic CDM definitions. Here tables have not been bound to | ||
a SQLAlchemy `DeclarativeBase` class yet, but are defined as regular classes, which can | ||
then be used as [mixins](https://docs.sqlalchemy.org/en/20/orm/declarative_mixins.html). | ||
|
||
To use these, the classes must first be bound to a `Base`: | ||
```python | ||
from omop_cdm.dynamic import cdm54 | ||
from sqlalchemy.orm import DeclarativeBase | ||
|
||
|
||
class Base(DeclarativeBase): | ||
pass | ||
|
||
|
||
class Person(cdm54.BasePersonCdm54, Base): | ||
pass | ||
|
||
# Etc. for all other tables | ||
``` | ||
This approach allows for the greatest customization possibilities. | ||
|
||
Additionally, the dynamic version includes two additional tables which can optionally | ||
be added to your CDM. These are the `StemTable` (intermediate table for mapping purposes) | ||
and the `StcmVersion` table (to allow versioning of STCM source vocabulary mappings). | ||
|
||
## Legacy tables | ||
|
||
Apart from the standard CDM tables, the following legacy table classes can be | ||
added to your CDM by binding them to your `DeclarativeBase`: | ||
|
||
- AttributeDefinition | ||
- Cohort | ||
- CohortAttribute | ||
- CohortDefinition | ||
|
||
These tables have been part of the standard CDM in previous releases, but have since been moved | ||
to the results schema or removed entirely. | ||
|
||
They can be imported from `omop_cdm.dynamic.legacy`. | ||
|
||
## Customizing the CDM | ||
|
||
> **_NOTE:_** When adding or replacing columns, you can determine the position of the column in the table by setting | ||
a ``sort_order`` value inside ``mapped_column()``. The default CDM table columns have a ``sort_order`` | ||
value pre-assigned with increments of 100 (column1 = 100, column2 = 200, etc.). This only works for tables | ||
> imported from the dynamic module. | ||
### New columns | ||
If desired, you can add custom fields to existing CDM tables. | ||
For dynamic tables, define the table class as normal, but add additional `mapped_column` fields with the | ||
data types of choice. E.g. to add an integer column to the person table: | ||
|
||
```python | ||
class Person(BasePersonCdm54, Base): | ||
favorite_number: Mapped[Optional[int]] = mapped_column(Integer) | ||
``` | ||
|
||
For tables from the regular modules, use the following syntax: | ||
```python | ||
Person.favorite_number = Column(Integer, nullable=True) | ||
``` | ||
|
||
|
||
### Replace columns | ||
|
||
> **_NOTE:_** Replacing columns is only possible with dynamic table definitions. | ||
Although not recommended, you can replace existing CDM table columns with a different type of column. | ||
E.g. removing the character limit of the `ethnicity_source_value` column in the person table, by replacing | ||
it with a `Text` data type, can be done as follows: | ||
|
||
```python | ||
from typing import Optional | ||
|
||
from omop_cdm.dynamic import cdm54 as cdm | ||
from sqlalchemy import Text | ||
from sqlalchemy.orm import Mapped, mapped_column | ||
|
||
|
||
class Person(cdm.BasePersonCdm54, Base): | ||
ethnicity_source_value: Mapped[Optional[str]] = mapped_column(Text) | ||
``` | ||
|
||
It's important to note that replacing columns can only be done if the replacement | ||
doesn't break relationships with other tables. | ||
For example, replacing the `Integer` `person_id` column with `BigInteger` in the person table | ||
is possible. Replacing it with a column of type ``Text`` is not, as it breaks FK relationships | ||
that other CDM tables have with this field. | ||
|
||
### Replace whole table | ||
|
||
> **_NOTE:_** Replacing tables is only possible with dynamic table definitions. | ||
Instead of adding or replacing individual columns, it's also possible to replace an entire table. | ||
To do that, remove the inherited table base class from your table class and define all fields yourself. | ||
E.g. for the person table: | ||
|
||
```python | ||
from omop_cdm.constants import CDM_SCHEMA | ||
|
||
class Person(Base): | ||
__tablename__ = "person" | ||
__table_args__ = {"schema": CDM_SCHEMA} | ||
# Define all columns here | ||
``` | ||
|
||
Just like with modifying individual columns, this will only work if no violations occur in relationships | ||
with other CDM tables. | ||
|
||
Add new tables | ||
-------------- | ||
In addition to the default CDM tables, you can also add your own custom tables to the model. | ||
|
||
By adding these tables to the same `DeclarativeBase` as the regular tables, they will become part | ||
of the ORM. For example: | ||
|
||
```python | ||
from omop_cdm.constants import CDM_SCHEMA | ||
from sqlalchemy import ForeignKey, Boolean | ||
from sqlalchemy.orm import Mapped, mapped_column, relationship | ||
|
||
|
||
class Narcissus(Base): | ||
__tablename__ = 'narcissus' | ||
__table_args__ = {'schema': CDM_SCHEMA} | ||
|
||
person_id: Mapped[int] = mapped_column(ForeignKey('cdm_schema.person.person_id')) | ||
loved_by_narcissus: Mapped[bool] = mapped_column(Boolean, default=False) | ||
|
||
person: Mapped["Person"] = relationship("Person") | ||
``` | ||
|
||
### Schema name | ||
When adding your own tables, it's a good practice to specify a schema name via ``__table_args__`` | ||
(see example above). If the schema will always be the same, there is no harm in hard coding the name. | ||
Otherwise, it's better to provide the schema placeholder name, and let the runtime schema name be | ||
determined by your `schema_translate_map`. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
import nox # type: ignore | ||
|
||
nox.options.sessions = [ | ||
"tests", | ||
"lint", | ||
] | ||
|
||
python = [ | ||
"3.8", | ||
"3.9", | ||
"3.10", | ||
"3.11", | ||
"3.12", | ||
] | ||
|
||
|
||
@nox.session(python=python) | ||
def tests(session: nox.Session): | ||
"""Run pytest + code coverage.""" | ||
session.run("poetry", "install", external=True) | ||
session.run("pytest", "--cov-report", "term-missing", "--cov=src") | ||
|
||
|
||
@nox.session(reuse_venv=True, name="format") | ||
def format_all(session: nox.Session): | ||
"""Format codebase with ruff.""" | ||
session.run("poetry", "install", "--only", "dev", external=True) | ||
session.run("ruff", "format") | ||
# format imports according to isort via ruff check | ||
session.run("ruff", "check", "--select", "I", "--fix") | ||
|
||
|
||
@nox.session(reuse_venv=True) | ||
def lint(session: nox.Session): | ||
"""Run ruff linter.""" | ||
session.run("poetry", "install", "--only", "dev", external=True) | ||
# Run the ruff linter | ||
session.run("ruff", "check") | ||
# Check if any code requires formatting via ruff format | ||
session.run("ruff", "format", "--diff") |
Oops, something went wrong.