240506 SP24 Course PR #293

Open
wants to merge 40 commits into
base: master
40 commits
da38a55
Create text2mol_dataset.py
darinz Apr 17, 2024
2cd0b2f
Added Drug Interaction dataset
bjones325 Apr 30, 2024
963eda4
Add files via upload
tamimuiuc May 1, 2024
fa36fe5
Add files via upload
tamimuiuc May 1, 2024
5975db6
Add files via upload
tamimuiuc May 1, 2024
e3308eb
Add files via upload
tamimuiuc May 1, 2024
9bf99d7
Add files via upload
tamimuiuc May 1, 2024
afa77b5
Add heart failure prediction task.
johnbiggan May 2, 2024
fd45fe6
Update heart_failure_prediction.py
johnbiggan May 2, 2024
31b69fe
gpm_dataset.py created for GPM dataset
mayank-sharma-16 May 2, 2024
bdccdf4
added output example and included gpm in __init__
mayank-sharma-16 May 2, 2024
16d29c9
some tidying
mayank-sharma-16 May 3, 2024
311d9d1
mimic4_umse for DL4H
May 5, 2024
3627b85
add drive dataset
ixrfish May 6, 2024
2a9e26a
revert changes to tuab and tuev
ixrfish May 6, 2024
551eb8a
newlines
ixrfish May 6, 2024
b013531
added condition recommendation task
May 8, 2024
9be3827
Merge pull request #281 from darinz/cs598dlh_text2mol_dataset
ycq091044 May 8, 2024
4fa081a
Merge pull request #284 from bjones325/cs598_drug_interaction_dataset
ycq091044 May 8, 2024
38eca66
Merge pull request #285 from tamimuiuc/develop
ycq091044 May 8, 2024
2f8a64a
rm some unused files
ycq091044 May 9, 2024
2a4d97e
Merge pull request #286 from johnbiggan/jbiggan-hf_task
ycq091044 May 9, 2024
82e6cc9
minor
ycq091044 May 9, 2024
f257d71
add hf prediction task; need to check correctness
ycq091044 May 9, 2024
c3cb243
Merge branch '240506-sp24-course-pr' into master
ycq091044 May 9, 2024
698fcf2
Merge pull request #288 from mayank-sharma-16/master
ycq091044 May 9, 2024
a85cf28
change default path to pyhealth cache path
ycq091044 May 9, 2024
c4060b1
Merge pull request #289 from jhnwu3/CS598/mimic4_umse
ycq091044 May 9, 2024
9e7e6b0
add previous generative model notebooks
ycq091044 May 9, 2024
2ddd2b3
Merge branch '240506-sp24-course-pr' into irisf.drive_dataset
ycq091044 May 9, 2024
8b95b3b
Merge pull request #290 from ixrfish/irisf.drive_dataset
ycq091044 May 9, 2024
a91110e
Adding ACDC dataset CS598 dhl extra credit from Sherry Li.
May 12, 2024
3d8500e
Update comment.
May 12, 2024
4c83c8a
Merge remote-tracking branch 'upstream/240506-sp24-course-pr' into cs…
May 12, 2024
0e85349
Merge with upstream
May 12, 2024
d38036c
Merge with upstream
May 12, 2024
ffa56e7
Merge with upstream
May 12, 2024
7fa1db0
Merge with upstream.
May 12, 2024
d8fa943
Merge pull request #294 from sherrylinice/cs-598-dhl-sherry-extra-cre…
ycq091044 May 12, 2024
9becc2a
Merge pull request #292 from san2ds/master
ycq091044 May 12, 2024
17 changes: 17 additions & 0 deletions docs/api/datasets/pyhealth.datasets.GPMDataset.rst
@@ -0,0 +1,17 @@
pyhealth.datasets.GPMDataset
===================================

This class defines a PyTorch Dataset for processing proteomics data
for multiple species from the Global Proteome Machine Database (GPMDB) at:
https://gpmdb.thegpm.org/thegpm-cgi/peptides_by_species.pl

.. autoclass:: pyhealth.datasets.GPMDataset
    :members:
    :undoc-members:
    :show-inheritance:






2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -22,5 +22,5 @@ bokeh==3.0.1
gspread==5.6.2
google-cloud-storage==2.6.0
oauth2client==4.1.3
-Jinja2==3.1.4
+Jinja2==3.1.3
flask==2.2.5
642 changes: 642 additions & 0 deletions examples/chextXray_image_generation_diffusion.ipynb

Large diffs are not rendered by default.

46 changes: 46 additions & 0 deletions examples/hf_mimic3_rnn.py
@@ -0,0 +1,46 @@
from pyhealth.datasets import MIMIC3Dataset
from pyhealth.datasets import split_by_patient, get_dataloader
from pyhealth.models import RNN
from pyhealth.tasks import hf_prediction_mimic3_fn
from pyhealth.trainer import Trainer

# STEP 1: load data
base_dataset = MIMIC3Dataset(
    root="/srv/local/data/physionet.org/files/mimiciii/1.4",
    tables=["DIAGNOSES_ICD", "PROCEDURES_ICD", "PRESCRIPTIONS"],
    code_mapping={"ICD9CM": "CCSCM"},
    dev=True,
    refresh_cache=False,
)
base_dataset.stat()

# STEP 2: set task
sample_dataset = base_dataset.set_task(hf_prediction_mimic3_fn)
sample_dataset.stat()

train_dataset, val_dataset, test_dataset = split_by_patient(
    sample_dataset, [0.8, 0.1, 0.1]
)
train_dataloader = get_dataloader(train_dataset, batch_size=32, shuffle=True)
val_dataloader = get_dataloader(val_dataset, batch_size=32, shuffle=False)
test_dataloader = get_dataloader(test_dataset, batch_size=32, shuffle=False)

# STEP 3: define model
model = RNN(
    dataset=sample_dataset,
    feature_keys=["conditions", "procedures", "drugs"],
    label_key="label",
    mode="binary",
)

# STEP 4: define trainer
trainer = Trainer(model=model)
trainer.train(
    train_dataloader=train_dataloader,
    val_dataloader=val_dataloader,
    epochs=50,
    monitor="roc_auc",
)

# STEP 5: evaluate
trainer.evaluate(test_dataloader)
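
Note on STEP 2 of examples/hf_mimic3_rnn.py: the `split_by_patient` call with ratios `[0.8, 0.1, 0.1]` is presumably meant to keep every sample from a given patient in a single partition, so that no patient appears in both training and evaluation data. The sketch below illustrates that idea in plain Python. It is not PyHealth's implementation; the helper name `split_ids_by_patient`, the `patient_id` key, and the toy data are assumptions made only for illustration.

```python
import random
from collections import defaultdict


def split_ids_by_patient(samples, ratios, seed=0):
    """Hypothetical helper: split sample indices so that each patient's
    samples all land in exactly one of the train/val/test partitions."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    # Group sample indices by the patient they belong to.
    by_patient = defaultdict(list)
    for idx, sample in enumerate(samples):
        by_patient[sample["patient_id"]].append(idx)

    # Shuffle patients (not samples) with a fixed seed for reproducibility.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Cut the shuffled patient list at the cumulative ratio boundaries.
    n = len(patients)
    train_end = round(n * ratios[0])
    val_end = round(n * (ratios[0] + ratios[1]))
    groups = (patients[:train_end], patients[train_end:val_end], patients[val_end:])

    # Expand each patient group back into its sample indices.
    return tuple([i for p in group for i in by_patient[p]] for group in groups)


# Toy data: 10 patients with 3 visits each (made-up values).
samples = [{"patient_id": f"p{p}", "visit": v} for p in range(10) for v in range(3)]
train_idx, val_idx, test_idx = split_ids_by_patient(samples, [0.8, 0.1, 0.1])
```

Because the shuffle operates on patients and not on individual samples, the partition sizes follow the ratios only approximately when patients have different numbers of visits.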