---
title: "Readme"
author: "Quynh Tran"
date: "1/21/2022"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background
Precision medicine for cancer treatment relies on accurate pathological diagnosis. The number of known tumor classes has increased rapidly, so relying on traditional histopathologic classification alone has become infeasible. To reduce variability and validation costs and to improve standardization of the diagnostic process, molecular classification methods that use supervised machine learning models trained on DNA methylation data have been developed. These methods require large labeled training data sets to reach clinically acceptable classification accuracy. Although unlabeled epigenetic data are abundant across multiple databases, labeling pathology data for machine learning models is time-consuming and resource-intensive. Semi-supervised learning (SSL) approaches have been used to maximize the utility of labeled and unlabeled data in classification tasks and have been applied effectively in genomics. However, SSL methods have not yet been explored with epigenetic data, nor have they been shown to benefit central nervous system (CNS) tumor classification.
## Objectives
* This study explores the application of semi-supervised machine learning to methylation data to objectively label unlabeled samples from brain tumor patients.
* The study then demonstrates the utility of the SSL-labeled data in improving the accuracy of CNS tumor classification by adding these data to the training data sets of supervised classifiers.
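
The two objectives above can be illustrated with a generic self-training (pseudo-labeling) loop. The sketch below is only an illustration in base R and is not the method used in the paper: the simulated data, the nearest-centroid classifier, the function names `predict_centroid` and `self_train`, and the `sharpness` and `threshold` values are all hypothetical choices made for this example.

```r
# Toy self-training (pseudo-labeling) sketch in base R.
# Everything here is hypothetical: simulated data, a nearest-centroid
# classifier, and arbitrary sharpness/threshold values.
set.seed(1)

# Simulate methylation-like values: 2 classes, 5 features per sample.
make_class <- function(n, mean) {
  matrix(rnorm(n * 5, mean = mean, sd = 0.1), nrow = n)
}
x_labeled   <- rbind(make_class(5, 0.3), make_class(5, 0.7))
y_labeled   <- factor(rep(c("A", "B"), each = 5))
x_unlabeled <- rbind(make_class(20, 0.3), make_class(20, 0.7))

# Nearest-centroid classifier that returns a label and a confidence score.
predict_centroid <- function(x_train, y_train, x_new, sharpness = 10) {
  classes <- levels(y_train)
  d <- matrix(0, nrow = nrow(x_new), ncol = length(classes))
  for (k in seq_along(classes)) {
    centroid <- colMeans(x_train[y_train == classes[k], , drop = FALSE])
    # Euclidean distance from every new sample to this class centroid.
    d[, k] <- sqrt(rowSums(sweep(x_new, 2, centroid)^2))
  }
  w <- exp(-sharpness * d)   # closer centroid -> larger weight
  prob <- w / rowSums(w)     # normalize weights to pseudo-probabilities
  list(label = factor(classes[max.col(prob)], levels = classes),
       confidence = apply(prob, 1, max))
}

# Repeatedly move confidently predicted samples into the labeled set.
self_train <- function(x_lab, y_lab, x_unl, threshold = 0.9, max_iter = 10) {
  for (i in seq_len(max_iter)) {
    if (nrow(x_unl) == 0) break
    pred <- predict_centroid(x_lab, y_lab, x_unl)
    keep <- pred$confidence >= threshold
    if (!any(keep)) break
    x_lab <- rbind(x_lab, x_unl[keep, , drop = FALSE])
    y_lab <- factor(c(as.character(y_lab), as.character(pred$label[keep])),
                    levels = levels(y_lab))
    x_unl <- x_unl[!keep, , drop = FALSE]
  }
  list(x = x_lab, y = y_lab, n_unlabeled_left = nrow(x_unl))
}

augmented <- self_train(x_labeled, y_labeled, x_unlabeled)
table(augmented$y)
```

The augmented set (`augmented$x`, `augmented$y`) would then serve as the training data for a supervised classifier. The confidence threshold controls the trade-off between adding more pseudo-labeled samples and admitting mislabeled ones.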
## Paper for citation
This work has been published in BMC Bioinformatics and is available [here](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-022-04764-1).