Picro-DB

Figure 1: Schematic diagram of the complete procedure of genome sequencing analysis

Description
(A) Schematic diagram of the novel strategy used to join contigs into scaffolds using unique stretches of contigs. Initially, PacBio long reads (LRs) were corrected using both self-correction and hybrid correction. LRs from both strategies were used in the Canu assembler for primary assembly. The primary assembly was checked for novel and known repetitive content, and this content was used to mask the assembly. On the basis of unique regions and repeat families, a unique identification (UID) was generated for each region. On the basis of these UIDs, scaffolds were generated with the help of a local similarity search tool (BLAST). The assembly was further improved by mapping Illumina paired-end (PE) reads. (B) Workflow of the annotation pipeline followed for all the major components of the genome, including repeats, transcription factors, genes, miRNAs, TFBS, phylogenetic analysis, miRNA targeting information, and non-coding RNA annotation.
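The UID step in panel (A) can be illustrated with a minimal sketch. It is not the pipeline's actual code: the function names, the `UID000001` label format, and the `min_len` threshold are all hypothetical, and it assumes repeats were masked either as lowercase bases (soft masking) or as `N` (hard masking), so that only uppercase A/C/G/T runs count as unique regions.

```python
import re

def unique_regions(masked_seq, min_len=5):
    """Return (start, end) spans of unmasked bases in a masked sequence.

    Assumption: masked repeats appear as lowercase letters or 'N';
    min_len is an illustrative threshold, not a value from the pipeline.
    """
    return [(m.start(), m.end())
            for m in re.finditer(r"[ACGT]+", masked_seq)
            if m.end() - m.start() >= min_len]

def assign_uids(contigs, min_len=5):
    """Give each unique region a hypothetical UID label.

    contigs: dict mapping contig name -> masked sequence.
    Returns a dict mapping UID -> (contig name, start, end).
    """
    uids = {}
    counter = 0
    for name, seq in contigs.items():
        for start, end in unique_regions(seq, min_len):
            counter += 1
            uids[f"UID{counter:06d}"] = (name, start, end)
    return uids
```

In a workflow like the one described, the sequences behind these UIDs would then be compared between contigs with BLAST to find the contigs that can be joined into a scaffold; that step is not shown here.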