This repository has been archived by the owner on Aug 10, 2022. It is now read-only.

Protospacer mismatch

nyoungb2 edited this page Sep 20, 2013 · 1 revision
  • Determine the sequence conservation between each spacer and its protospacer. This can help identify CRISPR "seed" sequences.

  • Spacer BLAST searches must be completed before running this workflow.

  • The workflow consists of making a table of spacer-protospacer mismatches, which can then be plotted using R or Excel.

Getting mismatches for spacer-protospacers

$ CLdb_spacerBlastProtoMismatch.pl -d CLdb.sqlite > all_proto_mismatch.txt
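Before plotting, it can help to confirm that the table contains the columns the plotting step relies on. The following is a minimal sketch, not part of CLdb: the column names (bin, mismatch, subtype) are inferred from the R script on this page rather than from the tool's documentation, the table is assumed to be tab-delimited with a header row, and check_mismatch_columns is a hypothetical helper name.

```shell
# Sketch: confirm the mismatch table has the columns used for plotting.
# ASSUMPTION: column names (bin, mismatch, subtype) are inferred from the
# R script on this page; the file is tab-delimited with a header row.
# check_mismatch_columns is a hypothetical helper, not a CLdb command.
check_mismatch_columns() {
    head -n 1 "$1" | awk -F'\t' '
        # record every column name found in the header line
        { for (i = 1; i <= NF; i++) seen[$i] = 1 }
        END {
            n = split("bin mismatch subtype", want, " ")
            missing = 0
            for (j = 1; j <= n; j++)
                if (!(want[j] in seen)) {
                    print "missing column: " want[j]
                    missing = 1
                }
            # exit status 0 if all columns are present, 1 otherwise
            exit missing
        }'
}

# Example call:
# check_mismatch_columns all_proto_mismatch.txt && echo "columns OK"
```

If a column is reported as missing, adjust the names in the helper (and in the plotting script) to match the header your version of the tool actually writes.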

Getting mismatches for subtype I-B spacer-protospacers

$ CLdb_spacerBlastProtoMismatch.pl -d CLdb.sqlite -sub I-B > I-B_proto_mismatch.txt
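To produce one table per subtype, the command above can be wrapped in a loop. This is a minimal sketch that uses only the -d and -sub options shown above. The helper name proto_mismatch_by_subtype is hypothetical, and any subtype other than I-B in the example call is a placeholder to be replaced with subtypes present in your database.

```shell
# Sketch: write one mismatch table per subtype.
# Uses only the -d and -sub options shown on this page.
# proto_mismatch_by_subtype is a hypothetical helper, not a CLdb command.
proto_mismatch_by_subtype() {
    db="$1"      # first argument: path to the CLdb SQLite database
    shift        # remaining arguments: subtype names
    for sub in "$@"; do
        CLdb_spacerBlastProtoMismatch.pl -d "$db" -sub "$sub" \
            > "${sub}_proto_mismatch.txt"
    done
}

# Example call (subtypes other than I-B are placeholders):
# proto_mismatch_by_subtype CLdb.sqlite I-A I-B I-C
```

Each output file follows the naming used above (for example I-B_proto_mismatch.txt), so the R script can be pointed at any of them by changing the file name passed to read.delim.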

Plotting using R

This example script plots the output of CLdb_spacerBlastProtoMismatch.pl:

rm(list=ls())
library(ggplot2)
setwd("./PATH/TO/FILE/")

# read the mismatch table and order the bins numerically
# (as.character() guards against 'bin' having been read as a factor)
tbl <- read.delim("all_proto_mismatch.txt")
bin_num <- as.numeric(as.character(tbl$bin))
tbl$bin <- factor(bin_num, levels=sort(unique(bin_num)))

# plotting with ggplot
ggplot(tbl, aes(bin, mismatch)) +
	geom_bar(stat="identity") +
	facet_grid(subtype ~ .) +
	theme(
		axis.text.x = element_text(angle=90)
		)