Enhance Medication Value Sets for More Realistic Data Generation Description #69

Open
jenniferjiangkells opened this issue Sep 30, 2024 · 4 comments
Labels
Component: Data Generator Issue/PR that addresses DataGenerator methods hacktoberfest Issues suitable for hacktoberfest Help Wanted: Domain Knowledge 🎓 Issues that require outside contribution of health-specific domain knowledge non-code-contribution Issue/PR that are non-code contributions

Comments

@jenniferjiangkells
Member

Description

Improve the MedicationRequestMedication value set to generate more realistic and comprehensive medication data. The value set currently holds SNOMED CT codes in Virtual Therapeutic Moiety (medicinal product) form.

The current implementation provides a basic list of common medications (generated by ChatGPT), but it has not been verified and may lack the depth and variety needed to simulate real-world data.

from dataclasses import dataclass, field
from typing import Dict, List

# ValueSet, SimpleCodeSystem and CodeExtension are defined elsewhere in the project.


@dataclass
class MedicationRequestMedication(ValueSet):
    system: SimpleCodeSystem = SimpleCodeSystem.snomedct
    extension: CodeExtension = CodeExtension.uk
    value_set: List[Dict] = field(
        default_factory=lambda: [
            {"code": "774656009", "display": "Aspirin"},
            {"code": "773455007", "display": "Atorvastatin"},
            {"code": "776713006", "display": "Metformin"},
            {"code": "776550005", "display": "Lisinopril"},
            {"code": "776526008", "display": "Levothyroxine"},
            {"code": "774557006", "display": "Amlodipine"},
            {"code": "777537002", "display": "Simvastatin"},
            # NOTE: same code as Simvastatin above; one of the two needs correcting.
            {"code": "777537002", "display": "Omeprazole"},
            {"code": "776577001", "display": "Losartan"},
            {"code": "777483005", "display": "Salbutamol"},
            {"code": "776060008", "display": "Gabapentin"},
            {"code": "776226009", "display": "Hydrochlorothiazide"},
            {"code": "776052007", "display": "Furosemide"},
            {"code": "776770001", "display": "Metoprolol"},
            {"code": "777059008", "display": "Pantoprazole"},
            {"code": "777310000", "display": "Prednisone"},
            {"code": "776824002", "display": "Montelukast"},
            {"code": "776016008", "display": "Fluticasone"},
            {"code": "774586009", "display": "Amoxicillin"},
            {"code": "777521008", "display": "Sertraline"},
            {"code": "777990007", "display": "Zolpidem"},
            {"code": "777816004", "display": "Tramadol"},
            {"code": "777947006", "display": "Warfarin"},
            {"code": "777027001", "display": "Oxycodone"},
            {"code": "777673003", "display": "Tamsulosin"},
        ]
    )
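
Since the list above has not been verified, a simple automated check can catch one class of error before any clinical review: codes that appear more than once. The sketch below is only an illustration; `find_duplicate_codes` is a hypothetical helper, not part of the project, and it assumes entries are plain dicts with `code` and `display` keys as in the value set above.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_codes(value_set: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Return each code that appears more than once, with its display names."""
    displays_by_code = defaultdict(list)
    for entry in value_set:
        displays_by_code[entry["code"]].append(entry["display"])
    # Keep only codes that were seen on more than one entry.
    return {code: names for code, names in displays_by_code.items() if len(names) > 1}


# Three entries copied from the value set above.
sample = [
    {"code": "777537002", "display": "Simvastatin"},
    {"code": "777537002", "display": "Omeprazole"},
    {"code": "776577001", "display": "Losartan"},
]
duplicates = find_duplicate_codes(sample)
print(duplicates)  # {'777537002': ['Simvastatin', 'Omeprazole']}
```

A check like this only flags structural problems. Whether a code actually matches its display name still has to be confirmed against a terminology browser.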

Context

Realistic medication data is crucial for:

  1. Testing clinical decision support systems
  2. Simulating diverse patient populations
  3. Ensuring our generated data covers a wide range of medical scenarios
  4. Improving the overall quality and usefulness of our synthetic healthcare data
  5. Making the generated data more valuable for testing and development purposes

Possible Implementation

  1. Expand the current list of medications to include a broader range of drugs across various therapeutic categories.
  2. Add additional attributes to each medication entry, such as:
    • Dosage forms (e.g., tablet, capsule, injection)
    • Typical dosage strengths
    • Route of administration
  3. Include less common medications to represent more specialised treatments.
  4. Implement a weighting system to reflect the relative frequency of prescription for each medication.
  5. Add other code systems or extension systems. Currently all codes are from the SNOMED CT UK edition, verified with the SNOMED CT browser: https://termbrowser.nhs.uk/

Example of an enhanced medication entry:

{
    "code": "774656009",
    "display": "Aspirin",
    "dosage_forms": ["tablet", "capsule"],
    "strengths": ["81 mg", "325 mg"],
    "route": "oral",
    "frequency_weight": 0.8
}
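
One possible way to use the `frequency_weight` field proposed above is weighted random selection. The following is a minimal sketch under stated assumptions: `sample_medication` is a hypothetical helper, the weight given to the second entry is purely illustrative rather than real prescribing data, and entries without a `frequency_weight` are assumed to default to 1.0.

```python
import random
from typing import Any, Dict, List, Optional


def sample_medication(
    entries: List[Dict[str, Any]], rng: Optional[random.Random] = None
) -> Dict[str, Any]:
    """Pick one entry, with probability proportional to its frequency_weight."""
    rng = rng or random.Random()
    # Assumption: a missing weight means "no preference", so it defaults to 1.0.
    weights = [entry.get("frequency_weight", 1.0) for entry in entries]
    return rng.choices(entries, weights=weights, k=1)[0]


entries = [
    {"code": "774656009", "display": "Aspirin", "frequency_weight": 0.8},
    # Illustrative weight only, not derived from real prescribing data.
    {"code": "777947006", "display": "Warfarin", "frequency_weight": 0.2},
]

rng = random.Random(42)  # seeded so that runs are reproducible
picked = sample_medication(entries, rng)
print(picked["display"])
```

Because `random.choices` accepts relative weights, the values do not need to sum to 1. Contributors could therefore record raw prescription counts instead of normalised frequencies.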
@jenniferjiangkells jenniferjiangkells added good first issue Good for newcomers Help Wanted: Domain Knowledge 🎓 Issues that require outside contribution of health-specific domain knowledge Component: Data Generator Issue/PR that addresses DataGenerator methods hacktoberfest Issues suitable for hacktoberfest labels Sep 30, 2024
@Aryanil-codes

Aryanil-codes commented Oct 4, 2024

Can I get assigned this issue?
And do I need to be a certified doctor to work on this? Or is it all right if I can google well enough and have a basic understanding?
Thanks

@deevyanshoo

Is this already assigned? I can work on this; I have prior knowledge of the healthcare domain.

@jenniferjiangkells
Member Author

Hi @deevyanshoo @Aryanil-codes - first of all, thanks to you both for your interest in contributing! ⭐ This issue is more about the curation of clinical knowledge and is better suited to people with domain knowledge. However, we could really do with some help on improving the structure and configuration of the data classes as well, so I will open these as separate issues and tag you both in them if you're still interested in working on this! Just comment on the issues to let us know that you've started working on them. 😄

@jenniferjiangkells jenniferjiangkells added non-code-contribution Issue/PR that are non-code contributions and removed good first issue Good for newcomers labels Oct 5, 2024