DeepPrime-Off
How to Use DeepPrime-Off
To use DeepPrime-Off, you need to prepare your pegRNAs and their feature information in a specific format as input. DeepPrime-Off also relies on Cas-OFFinder to find off-target candidates for each pegRNA spacer; the Cas-OFFinder result file is passed to setup through the cas_offinder_result argument.
import pandas as pd
from genet.predict import DeepPrimeOff
dp_off = DeepPrimeOff()
dp_off.setup(
    features=df_features,
    cas_offinder_result='OFFinder_output.txt',
    ref_genome='Homo sapiens',
    download_fasta=True,
    custom_genome=None,
)
ID | Spacer | RT-PBS | PBS_len | RTT_len | RT-PBS_len | Edit_pos | Edit_len | RHA_len | Target | ... | deltaTm_Tm4-Tm2 | GC_count_PBS | GC_count_RTT | GC_count_RT-PBS | GC_contents_PBS | GC_contents_RTT | GC_contents_RT-PBS | MFE_RT-PBS-polyT | MFE_Spacer | DeepSpCas9_score |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
48_ABL1_ex4_pos6C_A_rank3 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACGTAGATCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 27 | 1 | 13 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | ... | -345.645 | 7 | 20 | 27 | 70 | 50 | 54 | -10.7 | -5.9 | 42.90589 |
66_ABL1_ex4_pos8C_A_rank3 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACGTATAGCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 29 | 1 | 11 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | ... | -345.645 | 7 | 20 | 27 | 70 | 50 | 54 | -14.3 | -5.9 | 42.90589 |
69_ABL1_ex4_pos8C_G_rank3 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACGTACAGCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 29 | 1 | 11 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | ... | -345.645 | 7 | 21 | 28 | 70 | 52.5 | 56 | -14.3 | -5.9 | 42.90589 |
72_ABL1_ex4_pos8C_T_rank3 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACGTAAAGCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 29 | 1 | 11 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | ... | -345.645 | 7 | 20 | 27 | 70 | 50 | 54 | -14.3 | -5.9 | 42.90589 |
96_ABL1_ex4_pos11C_G_rank3 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACCTAGAGCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 32 | 1 | 8 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | ... | -345.645 | 7 | 21 | 28 | 70 | 52.5 | 56 | -15.5 | -5.9 | 42.90589 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
11476_ABL1_ex9_pos100A_C_rank1 | CAGGAATCCAGTATCTCAGA | ATGGGTACGTTACCGTCTGAGATACTGG | 10 | 18 | 28 | 10 | 1 | 8 | GTTCCAGGAATCCAGTATCTCAGACGGTAAAGTACCCATCCCGGGG... | ... | -483.811 | 5 | 9 | 14 | 50 | 50 | 50 | -3.3 | -1.3 | 55.85136 |
11479_ABL1_ex9_pos100A_G_rank1 | CAGGAATCCAGTATCTCAGA | GGGTACCTTACCGTCTGAGATACTGG | 10 | 16 | 26 | 10 | 1 | 6 | GTTCCAGGAATCCAGTATCTCAGACGGTAAAGTACCCATCCCGGGG... | ... | -320.805 | 5 | 9 | 14 | 50 | 56.25 | 53.84615 | -3.6 | -1.3 | 55.85136 |
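
Because many pegRNAs share a spacer, Cas-OFFinder only needs to be run once per unique spacer. The sketch below shows one way to write a Cas-OFFinder input file from the Spacer column of a features table like the one above. It assumes the classic Cas-OFFinder input layout (genome directory on the first line, search pattern on the second, then one query and mismatch limit per line). The helper name `build_cas_offinder_input`, the genome path, the NGG PAM, and the mismatch limit of 3 are illustrative assumptions, not part of GenET, and the search settings DeepPrime-Off expects should be checked against the GenET documentation.

```python
import pandas as pd


def build_cas_offinder_input(features, genome_dir, pam="NGG", max_mismatches=3):
    """Return Cas-OFFinder input text for the unique spacers in `features`.

    Hypothetical helper (not part of GenET). Assumes the classic layout:
      line 1: genome directory
      line 2: search pattern (N * spacer length, followed by the PAM)
      line 3+: query (spacer + N * PAM length) and the mismatch limit
    """
    spacers = features["Spacer"].drop_duplicates().tolist()
    if not spacers:
        raise ValueError("features contains no spacers")
    spacer_len = len(spacers[0])
    if any(len(s) != spacer_len for s in spacers):
        raise ValueError("all spacers must have the same length")

    lines = [genome_dir, "N" * spacer_len + pam]
    lines += [f"{s}{'N' * len(pam)} {max_mismatches}" for s in spacers]
    return "\n".join(lines) + "\n"


# Toy stand-in for df_features: only the columns this sketch needs,
# with IDs and spacers copied from the table above.
toy_features = pd.DataFrame(
    {
        "ID": [
            "48_ABL1_ex4_pos6C_A_rank3",
            "66_ABL1_ex4_pos8C_A_rank3",
            "11476_ABL1_ex9_pos100A_C_rank1",
        ],
        "Spacer": [
            "TGCCTGTCTCTGTGGGCTGA",
            "TGCCTGTCTCTGTGGGCTGA",
            "CAGGAATCCAGTATCTCAGA",
        ],
    }
)

# "/path/to/genome_dir" is a placeholder, not a real location.
example_text = build_cas_offinder_input(toy_features, "/path/to/genome_dir")
print(example_text)
```

Writing `example_text` to a file and running Cas-OFFinder on it would produce a result file that could then be supplied as cas_offinder_result.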
The predict function can be executed after setup is completed. The features DataFrame, internally created by the DeepPrimeOff object during setup, is used as input for the DeepPrime-Off model. Therefore, if the setup is not completed properly, the model will not be able to find the required input and will raise an error.
ID | DeepPrime-Off_score | Spacer | RT-PBS | PBS_len | RTT_len | RT-PBS_len | Edit_pos | Edit_len | RHA_len | Target | Off-target | Off-context | Location | Position | Strand | MM_num |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
48_ABL1_ex4_pos6C_A_rank3 | 0 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACGTAGATCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 27 | 1 | 13 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | TtCCTGTCTgTGTGGGCTGATGG | TTATTTCCTGTCTGTGTGGGCTGATGGTCCTTCAATCATTGAAGTC... | 1 dna:chromosome chromosome:GRCh38:1:1:2489564... | 1.66E+08 | + | 2 |
66_ABL1_ex4_pos8C_A_rank3 | 0 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACGTATAGCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 29 | 1 | 11 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | TtCCTGTCTgTGTGGGCTGATGG | TTATTTCCTGTCTGTGTGGGCTGATGGTCCTTCAATCATTGAAGTC... | 1 dna:chromosome chromosome:GRCh38:1:1:2489564... | 1.66E+08 | + | 2 |
69_ABL1_ex4_pos8C_G_rank3 | 0 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACGTACAGCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 29 | 1 | 11 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | TtCCTGTCTgTGTGGGCTGATGG | TTATTTCCTGTCTGTGTGGGCTGATGGTCCTTCAATCATTGAAGTC... | 1 dna:chromosome chromosome:GRCh38:1:1:2489564... | 1.66E+08 | + | 2 |
72_ABL1_ex4_pos8C_T_rank3 | 0 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACGTAAAGCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 29 | 1 | 11 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | TtCCTGTCTgTGTGGGCTGATGG | TTATTTCCTGTCTGTGTGGGCTGATGGTCCTTCAATCATTGAAGTC... | 1 dna:chromosome chromosome:GRCh38:1:1:2489564... | 1.66E+08 | + | 2 |
96_ABL1_ex4_pos11C_G_rank3 | 0 | TGCCTGTCTCTGTGGGCTGA | GAGGAGACCTAGAGCTGAAGGAAACAGGGAACAGCCTTCAGCCCAC... | 10 | 40 | 50 | 32 | 1 | 8 | AGCTTGCCTGTCTCTGTGGGCTGAAGGCTGTTCCCTGTTTCCTTCA... | TtCCTGTCTgTGTGGGCTGATGG | TTATTTCCTGTCTGTGTGGGCTGATGGTCCTTCAATCATTGAAGTC... | 1 dna:chromosome chromosome:GRCh38:1:1:2489564... | 1.66E+08 | + | 2 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
11476_ABL1_ex9_pos100A_C_rank1 | 0 | CAGGAATCCAGTATCTCAGA | ATGGGTACGTTACCGTCTGAGATACTGG | 10 | 18 | 28 | 10 | 1 | 8 | GTTCCAGGAATCCAGTATCTCAGACGGTAAAGTACCCATCCCGGGG... | gAGGAgcCCAGTATCTCAGATGG | AGTGAGAGGAGCCCAGTATCTCAGATGGAAATGCAGAAATCACCTG... | Y dna:chromosome chromosome:GRCh38:Y:2781480:5... | 26339822 | - | 3 |
11479_ABL1_ex9_pos100A_G_rank1 | 0 | CAGGAATCCAGTATCTCAGA | GGGTACCTTACCGTCTGAGATACTGG | 10 | 16 | 26 | 10 | 1 | 6 | GTTCCAGGAATCCAGTATCTCAGACGGTAAAGTACCCATCCCGGGG... | gAGGAgcCCAGTATCTCAGATGG | AGTGAGAGGAGCCCAGTATCTCAGATGGAAATGCAGAAATCACCTG... | Y dna:chromosome chromosome:GRCh38:Y:2781480:5... | 26339822 | - | 3 |
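
In the Off-target column, lowercase letters mark the bases that differ from the spacer, and MM_num gives their count. As a quick sanity check on a result table, the sketch below recomputes the mismatch count in two ways for the two off-target sites shown above. The function names are illustrative and not part of GenET, and the check assumes the spacer aligns to the start of the Off-target sequence, with the PAM at its 3' end.

```python
def count_mismatches(spacer, off_target):
    """Count positions where the off-target protospacer differs from the spacer.

    Only the first len(spacer) bases of off_target are compared, so the
    trailing PAM is ignored. The comparison is case-insensitive.
    """
    protospacer = off_target[: len(spacer)]
    return sum(s.upper() != o.upper() for s, o in zip(spacer, protospacer))


def count_lowercase(off_target):
    """Count lowercase bases, which mark mismatches in the Off-target column."""
    return sum(ch.islower() for ch in off_target)


# (Spacer, Off-target, MM_num) copied from the rows of the table above.
rows = [
    ("TGCCTGTCTCTGTGGGCTGA", "TtCCTGTCTgTGTGGGCTGATGG", 2),
    ("CAGGAATCCAGTATCTCAGA", "gAGGAgcCCAGTATCTCAGATGG", 3),
]

for spacer, off_target, mm_num in rows:
    by_alignment = count_mismatches(spacer, off_target)
    by_case = count_lowercase(off_target)
    print(spacer, off_target, by_alignment, by_case, mm_num)
```

For both rows the two counts agree with the MM_num value reported in the table.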