Changes between Initial Version and Version 1 of BIOS_ReferenceFiles/MetaExonAnnotation_documentation


Timestamp:
Sep 19, 2016 4:48:45 PM (8 years ago)
Author:
jamverlouw

= Meta-exon annotation =

To create the meta-exon annotation, the following steps were taken: [[BR]]
1. The exon annotation was downloaded from Ensembl Biomart v.71. The file contained the following columns: [[BR]]chromosome, exon start, exon end, Ensembl exon id, Ensembl gene id, gene name, strand.[[BR]]
2. All additional contigs (GL*, LRG*, etc.) were removed so that only the ordinary chromosomes (1-22, X, Y, MT) remained. This was done with the custom script cutStrangeChr.py (see attachment). [[BR]]
3. The Biomart file was converted to BED format and sorted by chromosome and start coordinate.[[BR]]
4. Exons were merged using the mergeBed tool from the BEDTools suite.[[BR]]
5. The resulting file was converted to GTF format by the custom script mergedBed_to_gtf.py (see attachment), which retains the strand information.[[BR]]
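
Steps 2 and 3 can be sketched in Python as follows. This is only an illustration, not the attached cutStrangeChr.py: the function name `biomart_row_to_bed`, the constant `STANDARD_CHROMS`, and the example identifiers are hypothetical. The column order is taken from step 1, and the BED start is assumed to be the Biomart start minus one because BED uses 0-based starts.

```python
# Hypothetical sketch of steps 2-3; NOT the attached cutStrangeChr.py.
STANDARD_CHROMS = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def biomart_row_to_bed(row):
    """Convert one Biomart row to a BED6 tuple, or return None for extra contigs.

    Assumed column order (from step 1): chromosome, exon start, exon end,
    exon id, gene id, gene name, strand.
    """
    chrom, start, end, exon_id, gene_id, gene_name, strand = row
    if chrom not in STANDARD_CHROMS:
        return None  # drop GL*, LRG* and other additional contigs
    bed_strand = "-" if strand == "-1" else "+"
    name = ":".join((exon_id, gene_id, gene_name))
    # BED starts are 0-based, Biomart starts are 1-based.
    return (chrom, int(start) - 1, int(end), name, ".", bed_strand)


# Made-up example rows (identifiers are placeholders, not real Ensembl ids).
rows = [
    ("1", "100", "200", "EXON1", "GENE1", "geneA", "1"),
    ("GL000192.1", "100", "200", "EXON2", "GENE2", "geneB", "-1"),
]
bed_rows = [b for b in map(biomart_row_to_bed, rows) if b is not None]
```

Only the first row survives the filter; its start is shifted from 100 to 99 and its strand `1` becomes `+`.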

The final commands to generate the meta-exon annotation were the following:[[BR]][[BR]]
{{{./cutStrangeChr.py biomart_export.txt | awk 'BEGIN {FS="\t"}; {OFS="\t"}; {if ($7 == "-1") $7 = "-"; else $7 = "+"}; {print $1, $2 - 1, $3, $4 ":" $5 ":" $6, ".", $7}' | sort -k1,1n -k2,2n | mergeBed -nms -d -1 -i stdin > biomart_export.merged.tmp}}}[[BR]][[BR]]
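
The merging performed by `mergeBed -nms -d -1` in the command above can be illustrated with a small Python sketch. It rests on two assumptions about that BEDTools version: that `-d -1` merges only intervals overlapping by at least one base (book-ended intervals stay separate), and that `-nms` joins the names of the merged features with semicolons. The function `merge_overlapping` is hypothetical and is not part of BEDTools.

```python
from collections import defaultdict


def merge_overlapping(bed_rows):
    """Merge overlapping BED intervals per chromosome, ignoring strand.

    Returns (chrom, start, end, names) tuples; names of merged features
    are joined with ';' (assumed to mirror mergeBed -nms).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, name, *_ in bed_rows:
        by_chrom[chrom].append((start, end, name))

    merged = []
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        cur_start, cur_end, names = intervals[0][0], intervals[0][1], [intervals[0][2]]
        for start, end, name in intervals[1:]:
            if start < cur_end:  # at least 1 bp overlap in half-open coordinates
                cur_end = max(cur_end, end)
                names.append(name)
            else:  # disjoint or merely book-ended: start a new meta-exon
                merged.append((chrom, cur_start, cur_end, ";".join(names)))
                cur_start, cur_end, names = start, end, [name]
        merged.append((chrom, cur_start, cur_end, ";".join(names)))
    return merged


# Made-up intervals: the first two overlap, the third is book-ended.
example = [
    ("1", 99, 200, "EXON1:GENE1:geneA", ".", "+"),
    ("1", 150, 300, "EXON2:GENE1:geneA", ".", "+"),
    ("1", 300, 400, "EXON3:GENE1:geneA", ".", "+"),
]
meta_exons = merge_overlapping(example)
```

Note that the merged rows carry no strand column, which is why the next command needs the original Biomart export to restore it.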
{{{./mergedBed_to_gtf.py biomart_export.merged.tmp biomart_export.txt | sort -k1,1n -k4,4n > meta-exons_v71_cut_sorted_18-04-14.gtf}}}
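
As a rough idea of what the BED-to-GTF conversion involves, the sketch below builds one GTF line from a merged row. It is not the attached mergedBed_to_gtf.py, whose exact behaviour is not documented here. The function name, the `source` value, the attribute layout, and the use of `.` when merged exons lie on different strands are all assumptions made for illustration.

```python
def merged_to_gtf_line(merged_row, strand_by_exon, source="meta_exon"):
    """Build one GTF line from a (chrom, start, end, names) merged BED row.

    strand_by_exon maps exon id -> '+' or '-', e.g. built from the original
    Biomart export. Attribute layout and source are hypothetical.
    """
    chrom, start, end, names = merged_row
    # Each name is assumed to look like "exon_id:gene_id:gene_name".
    parts = [n.split(":", 2) for n in names.split(";")]
    exon_ids = [p[0] for p in parts]
    gene_ids = sorted({p[1] for p in parts})
    strands = {strand_by_exon[e] for e in exon_ids}
    # Assumption: use '.' if the merged exons do not share a single strand.
    strand = strands.pop() if len(strands) == 1 else "."
    attributes = 'gene_id "{}"; exon_ids "{}";'.format(
        ",".join(gene_ids), ",".join(exon_ids)
    )
    # GTF starts are 1-based, so the 0-based BED start is shifted back.
    fields = (chrom, source, "exon", str(start + 1), str(end), ".", strand, ".", attributes)
    return "\t".join(fields)


# Made-up example: two exons of one gene merged into a single meta-exon.
strand_lookup = {"EXON1": "+", "EXON2": "+"}
gtf_line = merged_to_gtf_line(
    ("1", 99, 300, "EXON1:GENE1:geneA;EXON2:GENE1:geneA"), strand_lookup
)
```

The resulting line has nine tab-separated fields, with start 100 and end 300 in 1-based inclusive coordinates.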