Introduction to Computer Programming for Biomedical Data Science

BMI 6018/MDCRC 6521

Instructors

Prerequisite

A basic programming course, such as Codecademy’s Python course

Course Description

This course will provide students in the biological and medical domains with the foundational programming skills needed to write computer programs that manage and analyze data drawn from clinical, biological, and public health domains. Working with the Python programming language, students will learn how to write procedural and object-oriented programs. Mathematical principles relevant to biomedical data science will be reviewed through programming examples and problems. Students will also develop competency in software version control with git and in working within Linux environments.

Text Books

Programming

There is no required textbook for this course. However, I will make a PDF copy of Allen Downey's Think Python, version 2 available through Canvas. The book is released under a Creative Commons license, so if you are interested you can clone, edit, and build your own copy of it. For relevant modules, I will provide references to readings in Downey's book.

In addition, a number of very useful books are available online through the University of Utah's subscription to Safari Technical Books Online.

Mathematics

As part of this class, we will review some foundational mathematics as it relates to computational issues in biomedical informatics. Some of the mathematics books that I will be drawing materials from include:

Learning Objectives

Upon completing this course students will be able to:

  1. Use basic mathematical principles (e.g., set theory, first-order logic, calculus, linear algebra, probability, and graph theory) to motivate and inform computational problems in biology and healthcare.

  2. Follow software engineering principles such as version control, documentation, and testing while developing biomedical software.

  3. Develop biomedical software applications in the Python programming language.

  4. Develop pipelines for manipulating, analyzing, and visualizing biomedical data.

Evaluation Methods

  1. Quizzes: 15% of grade
  2. Class Participation: 15% of grade
    1. I will not take attendance, but I do expect reasonable attendance and participation in in-class activities, peer review of assignments, etc.
  3. Homework Assignments: 40%
  4. Term Project: 30%

No tests will be given.

Teaching and Learning Methods

This class will follow a “flipped classroom” paradigm. Students will be expected to watch prepared videos and read relevant online materials before the start of class. Class time will be spent answering questions raised by the online materials, clarifying topics, and participating in individual and group hands-on activities. Students are encouraged to work in groups.

Writing

While this course does not include a formal written component, students are required to write documentation for all their code following standard Python conventions.
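As a rough illustration of what documented code might look like, here is a minimal sketch. The function `body_mass_index` and its parameters are hypothetical and are not taken from the course materials. The docstring follows the general PEP 257 conventions, and the `Args`/`Returns`/`Raises` layout is one common style (an assumption here; the course may prefer another).

```python
def body_mass_index(weight_kg, height_m):
    """Return the body mass index (BMI) in kg/m^2.

    Args:
        weight_kg: Body weight in kilograms.
        height_m: Height in meters; must be greater than zero.

    Returns:
        The BMI as a float.

    Raises:
        ValueError: If height_m is not positive.
    """
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


# The docstring is what help() displays for the function.
help(body_mass_index)
```

With a docstring in place, tools such as `help()` and most editors can display the description without the reader opening the source file.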

Analytical

The analytical component of this course includes a review of foundational mathematical concepts that are important to biomedical informatics.

Course Schedule

  1. Crash course

    1. Working in Linux
    2. Using git
    3. Quick overview of Python
  2. Computational Environments and software engineering

    1. Principles
      1. Working in the Linux shell
      2. Using Jupyter notebooks
      3. Using git for version control
      4. Terminal editors
      5. Documentation
    2. Application:
      1. Working within the course framework
  3. Review of sublanguages (mathematical, computational, medical) and their symbols/notation

    1. Principles

      1. Common mathematical, computational, and medical symbols (and their meaning)
    2. Application: Pseudo-code and program design

  4. Mathematical and Computational Concepts of Numbers

    1. Principles
      1. Integer, rational, real, and complex numbers in mathematics
      2. Integer, rational, real, and complex numbers in Python
    2. Application: representing biomedical data numerically
  5. Code Blocks in Python

    1. Principles
      1. If/Else If/Else Blocks
      2. Repetition with For and While Loops
    2. Applications
  6. Collections

    1. Principles
      1. Set theory
      2. Strings, lists, tuples, dictionaries, and sets in Python
    2. Application:
      1. Using Python collections to represent laboratory test data
      2. Using sets to analyze biomedical texts
      3. Counting kmers
      4. Dictionaries and ICD-9 codes (MIMIC-II)
  7. Functions in Mathematics and Computing

    1. Principles
      1. Mathematical description of functions
      2. Mutable and immutable function arguments
      3. Functions as arguments to functions
      4. Functions for code-reuse
      5. Recursion
      6. Exceptions
    2. Application
      1. Writing functions to find prime numbers
      2. Computing greatest common divisors
      3. Identifying kmers
  8. Advanced Code Blocks in Python

    1. Principles
      1. Modules
      2. Packages
  9. Calculus and numeric approximations of derivatives

    1. Principles
      1. Meaning of derivatives
      2. Symbolic differentiation with Sympy
      3. Working with Numpy arrays
        1. Slicing
        2. Vectorized operations
      4. Numerical derivatives of Numpy arrays
      5. Approximation
      6. Optimization
    2. Applications:
      1. Drug delivery timing
      2. QRS identification in ECG signals
  10. Pandas for Data Wrangling and visualization

    1. Principles
      1. Reading tabular data
      2. Numeric representation standards (locale library)
      3. Working with missing data
      4. Representing dates and times
    2. Applications
      1. Air quality and temporal data
      2. Car accidents and spatial data
      3. Reading lab data
  11. Working with Data files

    1. Principles
      1. Reading and writing data from disk with Python
      2. Data serialization with Pickle
    2. Applications
      1. Parsing radiology report files
      2. Parsing common bioinformatics file formats FASTA, FASTQ
  12. Object oriented programming and probability

    1. Principles

      1. Encapsulation
      2. Polymorphism
      3. Inheritance
      4. Basic principles of counting
      5. Random values
    2. Application

      1. Modeling RGB$\alpha$
      2. Simulating populations of patients
  13. Basic Text Processing with Python

    1. Principles
      1. Tokenization
      2. Regular expressions
    2. Application
      1. Text de-identification
      2. Extracting gene and protein data
  14. Linear Algebra and Text Processing

    1. Principles

      1. Vectors
      2. Word vectors
      3. Dot products
      4. Vector norms
    2. Application

      1. Cosine similarity of text documents
      2. Representing sparse vectors with dictionaries
  15. Networks, ontologies and graph theory

    1. Principles
      1. Edges and nodes
      2. Directed graphs
      3. Graph traversal
      4. Shortest paths
    2. Applications
      1. Reasoning with Ontologies
      2. Analyzing Twitter networks
      3. Analyzing collaboration networks with Pubmed data
  16. Visualization of Biomedical Data

    1. Principles

      1. Creating graphs with Matplotlib
      2. Creating graphs with Holoviews
    2. Applications:

      1. Visualizing heart sounds
      2. Visualizing MIMIC-II data
  17. Networks and Probability

    1. Principles
      1. Multigraphs
      2. Basic principles of probability
    2. Applications
      1. Disease transmission in Utah public schools
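
To give a sense of the kind of exercise the schedule lists under Collections ("Counting kmers"), here is a minimal sketch. It is not taken from the course materials: the function name `count_kmers` and the example sequence are hypothetical, and `collections.Counter` is just one convenient way to tally substrings.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in sequence.

    Returns a Counter mapping each k-mer to its number of occurrences.
    A sequence shorter than k yields an empty Counter.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Slide a window of width k across the sequence, one position at a time.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Hypothetical example sequence, for illustration only.
counts = count_kmers("ATGATGC", 3)
print(counts.most_common(1))  # [('ATG', 2)]
```

A plain dictionary with `dict.get(kmer, 0) + 1` would work equally well; `Counter` simply removes the bookkeeping.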