retrieving sgRNAs from library and oligo ordering

Experiment: CRISPR Last edited: May 9, 2023 6:48 PM Owner: Bercin Cenik Type: csv, python, shell Works?: No

Notes and debugging

add notes about debugging here.

Usage

Make an excel sheet with the following columns:

#1: geneID

#2: 20bp seed sequence

#3: ”CACCG” for seeds not beginning with a “G”, “CACC” for seeds beginning with a “G”

#4: merge of columns 3&2

#5: “AAAC”

#6: reverse complement of seed sequence

#7: ”C” for seeds not beginning with a “G”, blank for seeds beginning with a “G”

#8: merge of columns 5&6&7

Next, populate these columns.

Column#2: Get your sgRNAs from the Brunello library. Use my “grep sgRNA from list” protocol for this. For example:

cat broadgpp-brunello-library-corrected.txt | grep -iw 'ZC3H13\|KIAA1429\|METTL3\|METTL14\|ELP1\|FOS\|USP49\|SNRNP25\|OGT\|SRF\|AKIRIN2\|MAPK1\|DDX49\|RIOK2' > methylation.sgRNAs.txt

This is assuming you are in a directory that contains the library (”broadgpp-brunello-library-corrected.txt”). The names of genes need to be separated by “|”, this will pipe the output to the subsequent gene, first escaping the pipe character.

The text file will look like this:

AKIRIN2_41635_CTGGTGAAGCTGGTCCACTG	CTGGTGAAGCTGGTCCACTG	AKIRIN2
AKIRIN2_41636_GCCGCAGAAGTATCTCCGAA	GCCGCAGAAGTATCTCCGAA	AKIRIN2
AKIRIN2_41637_GGGGACGCCGGGCTCAACAG	GGGGACGCCGGGCTCAACAG	AKIRIN2
AKIRIN2_41638_TGTGCATCAGAAGTACAACA	TGTGCATCAGAAGTACAACA	AKIRIN2

#The third column of these include the seed sequence, 
#which will be pasted into the new excel sheet’s second column (#2: 20bp seed sequence)

Column#3: This is the 5’ flank of the duplex oligo. If the sgRNA seed sequence begins with a G, type “CACC” here. If it doesn’t, you will need to add a G manually to create a GC-clamp to increase the efficiency of the oligo. For seeds not beginning with a G, type “CACCG” here.
Column#4: the 5’ oligo, merge the 3th and 2nd columns by using & on excel, in that order.
Column#5: 3’ flank. This is always a “CAAA”, but since we are writing 5’ to 3’, type “AAAC” in this column for every entry.
Column #6: ”revcomp of sgRNA”. These are the reverse complements of the seed sequence. Briefly:
```
original.            ACGTACGT
reverse.             TGCATGCA
complement.          TGCATGCA
reverse complement.  ACGTACGT
```
Run the “## for getting reverse complements” script shown below. This assumes you have a list of sgRNA seed sequences ("sgRNA_list.csv”) in the directory you are working in, and outputs a reverse complement of those (”rc_sgRNA_list.csv”). Paste the contents of rc_sgRNA_list.csv into column#6 (”revcomp of sgRNA”).
Column #7: clamp. This is a “C” if the seed does not begin with a G, blank if it does (since we are manually adding the GC if it’s absent, etc).
Column #8: the 3’ oligo, merge the 5th, 6th and 7th columns with the & operator on excel.

The final sheet will look like this:

"gene","sgRNA sequence","5'flank","5'oligo for ordering","3'flank","revcomp of sgRNA","clamp","3'oligo for ordering"
AKIRIN2_sg1,CTGGTGAAGCTGGTCCACTG,CACCG,CACCGCTGGTGAAGCTGGTCCACTG,AAAC,CAGTGGACCAGCTTCACCAG,C,AAACCAGTGGACCAGCTTCACCAGC
AKIRIN2_sg2,GCCGCAGAAGTATCTCCGAA,CACC,CACCGCCGCAGAAGTATCTCCGAA,AAAC,TTCGGAGATACTTCTGCGGC,,AAACTTCGGAGATACTTCTGCGGC
AKIRIN2_sg3,GGGGACGCCGGGCTCAACAG,CACC,CACCGGGGACGCCGGGCTCAACAG,AAAC,CTGTTGAGCCCGGCGTCCCC,,AAACCTGTTGAGCCCGGCGTCCCC
AKIRIN2_sg4,TGTGCATCAGAAGTACAACA,CACCG,CACCGTGTGCATCAGAAGTACAACA,AAAC,TGTTGTACTTCTGATGCACA,C,AAACTGTTGTACTTCTGATGCACAC

The 4th and 8th columns are the sequences of the oligos which will be ordered.

Run the script “##for reformatting the sgRNA oligos to an order-ready layout” below. This assumes there is an input.csv in your working directory that has the above 8-column format for all guides, and outputs the sequences in tandem (Forward followed by Reverse for each oligo). The list should look like this:

AKIRIN2_sg1_F	CACCGCTGGTGAAGCTGGTCCACTG
AKIRIN2_sg1_R	AAACCAGTGGACCAGCTTCACCAGC
AKIRIN2_sg2_F	CACCGCCGCAGAAGTATCTCCGAA
AKIRIN2_sg2_R	AAACTTCGGAGATACTTCTGCGGC
AKIRIN2_sg3_F	CACCGGGGACGCCGGGCTCAACAG
AKIRIN2_sg3_R	AAACCTGTTGAGCCCGGCGTCCCC
AKIRIN2_sg4_F	CACCGTGTGCATCAGAAGTACAACA
AKIRIN2_sg4_R	AAACTGTTGTACTTCTGATGCACAC

Order oligos from IDT (25nm, STD).
Important note: make sure non-target controls are also in your list of oligos to order.

Scripts

## for getting reverse complements

import csv

def reverse_complement(seq):
    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    seq = seq.lstrip('\ufeff')  # remove UTF-8 BOM character
    rev_seq = seq[::-1]
    rev_comp_seq = ''.join([complement[base] for base in rev_seq])
    return rev_comp_seq

with open('sgRNA_list.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    sgRNA_list = [row[0] for row in reader]

rc_sgRNA_list = [reverse_complement(sgRNA) for sgRNA in sgRNA_list]

with open('rc_sgRNA_list.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for sgRNA in rc_sgRNA_list:
        writer.writerow([sgRNA])

## for only getting reverse sequences:

import csv

def reverse_sequence(seq):
    return seq[::-1]

# Read sgRNA sequences from input CSV file
with open('three_five.csv', 'r') as f:
    reader = csv.reader(f)
    sgRNA_list = [row[0] for row in reader]

# Compute reverse of each sequence
reverse_sgRNA_list = [reverse_sequence(sgRNA) for sgRNA in sgRNA_list]

# Write reversed sequences to output CSV file
with open('reverse_sgRNAs.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for sgRNA in reverse_sgRNA_list:
        writer.writerow([sgRNA])

## for getting reverse complement of the nontarget controls:

def reverse_complement(seq):
    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    rev_seq = seq[::-1]
    rev_comp_seq = ''.join([complement[base] for base in rev_seq])
    return rev_comp_seq
sgRNA_list = ['AAAAAGCTTCCGCCTGATGG',
              'AAAACAGGACGATGTGCGGC',
              'AAAACATCGACCGAAAGCGT',
              'AAAATAGCAGTAAACTCAAC']

rc_sgRNA_list = [reverse_complement(sgRNA) for sgRNA in sgRNA_list]

print(rc_sgRNA_list)

##for reformatting the sgRNA oligos to an order-ready layout
import csv

with open('input.csv') as infile, open('output2.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile, delimiter='\t')
    for row in reader:
        for i in range(0, len(row), 2):
            writer.writerow([row[i], row[i+1]])

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

retrieving sgRNAs from library and oligo ordering 05dbf5dca63e4680868d007c6680f5b0.md

retrieving sgRNAs from library and oligo ordering 05dbf5dca63e4680868d007c6680f5b0.md

retrieving sgRNAs from library and oligo ordering

Notes and debugging

Usage

Scripts

Files

retrieving sgRNAs from library and oligo ordering 05dbf5dca63e4680868d007c6680f5b0.md

Latest commit

History

retrieving sgRNAs from library and oligo ordering 05dbf5dca63e4680868d007c6680f5b0.md

File metadata and controls

retrieving sgRNAs from library and oligo ordering

Notes and debugging

Usage

Scripts