How to get started with computational chemistry research. Directed to new people in our lab, may be useful in general.

geem-lab/compchem_start

 
 

Computational chemistry - how to get started

This document collects resources for people new to computational quantum chemistry. It is mainly aimed at people starting in our lab.

This list is still quite rough; I'm constantly updating it :)

List of abbreviations:

  • DFT - Density functional theory
  • QC - Quantum chemistry
  • QM - Quantum mechanics
  • MD - Molecular dynamics

Books and lecture notes

Quantum chemistry

Computational chemistry

General

DFT

Other

Working with remote clusters

  • All the heavy computational work is done on remote computers
  • All of these computers run Linux

CSC

  • The calculations require a lot of resources
  • CSC provides computational resources for us
    • Taito supercluster
    • Sisu supercomputer
  • A lot of chemistry software is preinstalled
  • Access to several databases
    • Cambridge Structural Database System
      • X-ray structures of molecules, useful toolkits for searching structures
      • Web interface WebCSD
        • Search by DOI etc.
        • The .cif files can be opened with Avogadro, Chemcraft, ...

Resource allocation

  • Clusters have many users, so a workload manager is used to allocate resources
  • Define the resources you need
    • Number of cores, memory, time, ...
  • The SLURM workload manager is used in our environments
    • Batch job file
      • Used to define the calculation
        • Resources and software needed
    • CSC's tutorial
    • Here's another tutorial
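
As a rough illustration of what a batch job file contains, the sketch below builds one from Python. The `#SBATCH` options are standard SLURM options, but the job name, the resource values, and the command line are made-up placeholders; the right values depend on the cluster, so check CSC's tutorial before using anything like this.

```python
def make_batch_script(job_name, n_cores, mem_per_core_mb, hours, command):
    """Return the text of a minimal SLURM batch job file.

    All arguments are illustrative; real jobs usually also need
    cluster-specific lines (partition, module loads, scratch setup).
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={n_cores}",               # number of cores
        f"#SBATCH --mem-per-cpu={mem_per_core_mb}",  # memory per core, in MB
        f"#SBATCH --time={hours:02d}:00:00",         # wall-clock limit, HH:MM:SS
        "",
        command,                                     # the actual calculation
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: the command line below is a placeholder, not a tested setup.
script = make_batch_script(
    job_name="water_opt",
    n_cores=4,
    mem_per_core_mb=2000,
    hours=12,
    command="orca water.inp > water.out",
)
print(script)
```

The resulting text would be saved to a file and submitted with `sbatch`, after which the workload manager decides when and where the job runs.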

Used methods

Quantum chemistry

$H\Psi = E\Psi$

  • Basis sets
  • BO approximation
  • HF
  • Electron correlation
  • DFT
    • Functionals
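
To connect the first items in the list above: in practice the orbitals are expanded in a finite basis set, which turns the Hartree-Fock problem into a matrix equation (the Roothaan-Hall equations). A short sketch in common textbook notation, which is an assumption here since this document does not define its own symbols:

$\phi_i = \sum_{\mu} C_{\mu i}\,\chi_{\mu}$

$\mathbf{F}\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\varepsilon}$

Here $\chi_{\mu}$ are the basis functions, $\mathbf{C}$ holds the expansion coefficients, $\mathbf{F}$ is the Fock matrix, $\mathbf{S}$ is the overlap matrix of the basis functions, and $\boldsymbol{\varepsilon}$ is the diagonal matrix of orbital energies. Because $\mathbf{F}$ itself depends on $\mathbf{C}$, the equation is solved iteratively until self-consistency.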

Workflow

  1. Build the molecule with some visual tool (see below)
  2. Create an input file
     • Each QC program has its own input format
  3. Send the job to the queue
  4. Check the results
     • Errors are usually due to
       • a faulty input file
       • not enough resources, such as time or memory
       • convergence problems
         • several ways to proceed, see Jensen's book
  5. Collect and analyze the results
     • Spreadsheets, Python, ...
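
The input-file and result-checking steps above can be sketched in Python, here using ORCA's input format as the example. The keyword line (`B3LYP def2-SVP Opt`) and the water geometry are illustrative choices only, not a recommendation, and the function names are hypothetical helpers. The termination check relies on the "ORCA TERMINATED NORMALLY" line that ORCA prints at the end of a successful run; other programs use different markers.

```python
def make_orca_input(symbols, coords, charge=0, multiplicity=1,
                    keywords="B3LYP def2-SVP Opt"):
    """Return ORCA input text for a geometry in Cartesian coordinates (angstrom)."""
    lines = [f"! {keywords}", "", f"* xyz {charge} {multiplicity}"]
    for symbol, (x, y, z) in zip(symbols, coords):
        lines.append(f"{symbol:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("*")  # closes the coordinate block
    return "\n".join(lines) + "\n"


def terminated_normally(output_text):
    """Check whether the text of an ORCA output file reports a normal finish."""
    return "ORCA TERMINATED NORMALLY" in output_text


# Approximate water geometry, typed in by hand for illustration only.
water_symbols = ["O", "H", "H"]
water_coords = [
    (0.000, 0.000, 0.117),
    (0.000, 0.757, -0.469),
    (0.000, -0.757, -0.469),
]

orca_input = make_orca_input(water_symbols, water_coords)
print(orca_input)
```

In a real workflow the geometry would come from the molecule builder rather than being typed in, and `terminated_normally` would be called on the contents of each output file before any numbers are collected from it.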

Molecular dynamics

  • Classical simulations

    • The potential is defined as a sum of bond, angle, dihedral, van der Waals, and electrostatic terms
      • Parameters come from experiments and QC calculations
    • No quantum mechanical effects
      • Bond breaking, polarization, ... are not captured
  • Not very useful for our work, since we mostly look at small molecules
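
To make the "sum of terms" idea concrete, here is a minimal sketch of the individual energy terms in their most common textbook forms (harmonic bonds and angles, periodic dihedrals, Lennard-Jones for van der Waals, Coulomb for electrostatics). Real force fields differ in functional forms and conventions, the function names are hypothetical, and the parameter values in the example are placeholders. Units follow the GROMACS convention (nm, kJ/mol, elementary charges), which is an assumption here.

```python
import math

# Electric conversion factor in kJ mol^-1 nm e^-2, as given in the GROMACS manual.
COULOMB_CONSTANT = 138.935458


def bond_energy(r, r0, k):
    """Harmonic bond stretch: 0.5 * k * (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2


def angle_energy(theta, theta0, k):
    """Harmonic angle bend: 0.5 * k * (theta - theta0)^2, angles in radians."""
    return 0.5 * k * (theta - theta0) ** 2


def dihedral_energy(phi, k, n, phi0=0.0):
    """Periodic dihedral: k * (1 + cos(n * phi - phi0)), angles in radians."""
    return k * (1.0 + math.cos(n * phi - phi0))


def lennard_jones(r, epsilon, sigma):
    """Van der Waals term: 4 * eps * ((sigma / r)^12 - (sigma / r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def coulomb(r, q1, q2):
    """Electrostatic term between two point charges."""
    return COULOMB_CONSTANT * q1 * q2 / r


# Placeholder parameters: the Lennard-Jones minimum lies at r = 2^(1/6) * sigma.
sigma, epsilon = 0.3, 0.65
r_min = 2 ** (1 / 6) * sigma
print(lennard_jones(r_min, epsilon, sigma))
```

The total potential energy is the sum of these terms over all bonds, angles, dihedrals, and non-bonded atom pairs. Since the bond term is a fixed harmonic well, a bond can stretch but never break.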

Software

Quantum chemistry

  • ORCA

    • On CSC
    • Active community, good forum
    • ORCA input library is amazing. Easy to get started with this!
    • Free for scientists! :)
    • Good for spectroscopy
    • DLPNO-CCSD(T) method!
  • Turbomole

    • On CSC
    • Fast!
    • No typical input files
      • Jobs are set up using the interactive define script
        • Not very user friendly, so it has to be automated with scripts when running large-scale calculations
  • Gaussian

  • NWChem

    • On CSC
    • Open-source! :)
    • Good for LARGE problems, scales to thousands of processors

Molecular dynamics

  • GROMACS
    • The manual is a good intro to MD!

Visualization and

  • Avogadro

    • Open source, quite powerful
    • Best tool for building molecules, UFF force field for preoptimization
    • Conformation search
    • Visualization of orbitals, vibrations, Bader analysis, ...
  • VMD

    • Excellent and scriptable program, heavily used in MD field
  • Chemcraft

    • Good for many manipulations, visualization
    • Not free, but has a 150-day free trial period

Wavefunction analysis

  • Multiwfn
    • Excellent tool for wave function analysis
    • Pretty much everything is implemented
    • Takes e.g. a Molden file as input
  • NCIPlot
    • Visual analysis of weak interactions

Scripting and working with data

Python

  • Best choice in my opinion!
    • I use it for everything: scientific computing, workflow automation, web development, deep learning, home automation...
    • One of the most popular languages
      • Excellent resources available!
    • Slow for number crunching
      • Can be used as a front end for C and Fortran code through libraries
  • So many resources!
  • Jupyter Notebooks: very nice graphical user interface
    • Each notebook consists of cells
      • May contain code, Markdown text, LaTeX
      • Can be run separately
    • Inline plotting
    • Can be shared
  • See Johansson's Scientific Python lectures: an excellent intro to Python, notebooks, and the scientific stack

Excellent data science stack, useful libraries:

  • NumPy

    • linear algebra calculations and much more
  • SciPy

    • special functions, integrals, a lot of numerical tools
  • Matplotlib

    • plotting, excellent for visualizations
  • Pandas

    • "excel", data structures, good for data manipulation, analysis
  • Scikit-learn

    • Machine learning, regression, statistics
  • Keras, TensorFlow

    • Deep learning, cool stuff

Bash tools

  • sed
  • awk
  • gnuplot
  • "Information technologies for chemistry"

Software

  • RDKit
    • Very nice open-source library, python interface
  • CDK
    • Apparently quite good, in Java

Data structures for molecules

Databases

  • Relational databases (MySQL, SQLite, PostgreSQL, ...)
  • Non-relational (MongoDB, ...)
  • Pandas is quite good in my opinion
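
As a small example of the relational option, the sketch below stores calculation results in SQLite using Python's built-in `sqlite3` module. The table layout, the molecule and method names, and the energies are all made-up placeholders, not real results.

```python
import sqlite3

# In-memory database for the example; pass a file path to keep the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE calculations (
        id INTEGER PRIMARY KEY,
        molecule TEXT NOT NULL,
        method TEXT NOT NULL,
        energy_hartree REAL
    )
    """
)

# Placeholder rows; the energies are invented for illustration.
rows = [
    ("mol_a", "method_1", -1.0),
    ("mol_a", "method_2", -1.5),
    ("mol_b", "method_1", -2.0),
]
conn.executemany(
    "INSERT INTO calculations (molecule, method, energy_hartree) VALUES (?, ?, ?)",
    rows,
)
conn.commit()

# Lowest energy found for each molecule, across all methods.
lowest = conn.execute(
    "SELECT molecule, MIN(energy_hartree) FROM calculations "
    "GROUP BY molecule ORDER BY molecule"
).fetchall()
print(lowest)
```

A table like this can also be pulled into Pandas with `pandas.read_sql_query` when the analysis is easier to do on a DataFrame.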
