Identifying Entities in Healthcare Data Using Natural Language Processing

Problem Statement

BeHealthy has a web platform that lets doctors list their services and manage patient interactions, and lets patients book appointments with doctors and order medicines online. On the platform, doctors can organise appointments, track past medical records and provide e-prescriptions. Companies like BeHealthy that provide medical services, prescriptions and online consultations therefore generate large volumes of data every day.

Datasets

Four datasets are provided for processing:

train_sent
test_sent
train_label
test_label
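This README does not describe the file layout. The sketch below assumes each file holds one token (or one label) per line, with a blank line separating sentences; if the actual files differ, the parsing needs to change. The function names `lines_to_sentences` and `read_sentences` are hypothetical helpers, not taken from the notebook.

```python
from typing import Iterable, List


def lines_to_sentences(lines: Iterable[str]) -> List[str]:
    """Group one-token-per-line input into space-joined sentences.

    Assumption: a blank line marks the end of a sentence.
    """
    sentences: List[str] = []
    current: List[str] = []
    for raw in lines:
        token = raw.strip()
        if token:
            current.append(token)
        elif current:
            # Blank line: close the sentence collected so far.
            sentences.append(" ".join(current))
            current = []
    if current:
        # The file may not end with a blank line.
        sentences.append(" ".join(current))
    return sentences


def read_sentences(path: str) -> List[str]:
    """Read a file such as train_sent or train_label into sentences."""
    with open(path, encoding="utf-8") as handle:
        return lines_to_sentences(handle)
```

Under the same assumption, calling `read_sentences("train_sent")` and `read_sentences("train_label")` should return two lists of equal length, where each label line has as many labels as the matching sentence has tokens.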

Actions:

First, you need to convert the data into sentence format. This step has to be done for the 'train_sent' and 'train_label' datasets and for the corresponding test datasets.
After that, you need to define the features to build the CRF model.
Then, you need to apply these features in each sentence of the train and the test dataset to get the feature values.
Once the features are computed, you need to define the target variable and then build the CRF model.
Then, you need to evaluate the model on the test dataset.
After that, you need to create a dictionary in which diseases are keys and treatments are values.
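The feature, model and evaluation steps above could be sketched roughly as follows. This is an illustration only: the feature set, the hyperparameters (`c1`, `c2`, `max_iterations`) and the helper names are assumptions, not the choices made in the notebook. The sklearn-crfsuite imports sit inside the functions so that the feature code can be tried without the package installed.

```python
from typing import Dict, List


def word_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Features for the token at position i (illustrative feature set)."""
    word = tokens[i]
    features: Dict[str, object] = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.suffix3": word[-3:],
        "word.isupper": word.isupper(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
    }
    if i > 0:
        features["prev.lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        features["next.lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True  # end of sentence
    return features


def sentence_features(sentence: str) -> List[Dict[str, object]]:
    """Apply word_features to every token of a space-separated sentence."""
    tokens = sentence.split()
    return [word_features(tokens, i) for i in range(len(tokens))]


def sentence_labels(label_line: str) -> List[str]:
    """Target variable: one label per token."""
    return label_line.split()


def train_crf(X_train, y_train):
    """Fit a CRF; the hyperparameter values here are assumed defaults."""
    import sklearn_crfsuite

    crf = sklearn_crfsuite.CRF(
        algorithm="lbfgs",
        c1=0.1,
        c2=0.1,
        max_iterations=100,
        all_possible_transitions=True,
    )
    crf.fit(X_train, y_train)
    return crf


def evaluate_crf(crf, X_test, y_test) -> float:
    """Weighted F1 over all tokens of the test set."""
    from sklearn_crfsuite import metrics

    y_pred = crf.predict(X_test)
    return metrics.flat_f1_score(y_test, y_pred, average="weighted")
```

With sentences and label lines already loaded, `X_train = [sentence_features(s) for s in train_sentences]` and `y_train = [sentence_labels(l) for l in train_label_lines]` would give the inputs expected by `train_crf`.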

There are eight major tasks we have to perform. They are as follows:

Data preprocessing
Concept identification
Defining the features for CRF
Getting the features for words and sentences
Defining input and target variables
Building the model
Evaluating the model
Identifying the diseases and predicted treatment using a custom NER
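For the last task, one possible way to turn predicted labels into a disease-to-treatment dictionary is sketched below. It assumes the label scheme uses `D` for disease tokens, `T` for treatment tokens and `O` for everything else, which this README does not state, and it pairs every disease with every treatment found in the same sentence. Both helper names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def extract_entities(tokens: List[str], labels: List[str], target: str) -> List[str]:
    """Join runs of consecutive tokens labelled `target` into phrases."""
    phrases: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == target:
            current.append(token)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


def build_disease_treatment_map(
    sentences: List[str], predicted_labels: List[List[str]]
) -> Dict[str, List[str]]:
    """Map each disease phrase to the treatments seen in the same sentence.

    Assumed labels: "D" = disease, "T" = treatment, "O" = other.
    Diseases that never co-occur with a treatment are left out.
    """
    mapping: Dict[str, List[str]] = defaultdict(list)
    for sentence, labels in zip(sentences, predicted_labels):
        tokens = sentence.split()
        diseases = extract_entities(tokens, labels, "D")
        treatments = extract_entities(tokens, labels, "T")
        for disease in diseases:
            for treatment in treatments:
                if treatment not in mapping[disease]:
                    mapping[disease].append(treatment)
    return dict(mapping)
```

Here `predicted_labels` would typically be the output of the CRF model's `predict` call on the test features, aligned one-to-one with `sentences`.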

Python Packages Used

NumPy - version 1.23.5
pandas - version 1.5.3
Matplotlib - version 3.7.1
seaborn - version 0.12.2
scikit-learn - version 1.2.2
sklearn-crfsuite - version 0.3.6
spaCy - version 3.7.4

Contributor

Created by Anupam Maiti - feel free to contact me!

dynamicanupam/Custom_Entity_detection_in_Healthcare_data
