validate subpackage
Sub-package Documentation
The validation sub-package pulls supporting reads from the BAM file and re-calls events based on that evidence, reporting them in a standard notation.
Types of Output Files
Several intermediate output files are written for the user. They can be used to “drill down” further into individual events, and they help developers debug when adding new features.
| expected name/suffix | file type/format | content |
|---|---|---|
|  |  | raw evidence |
|  |  | aligned contigs |
|  |  | evidence collection window regions |
|  |  | validated event positions |
|  | text/tabbed | failed events |
|  | text/tabbed | validated events |
|  |  | assembled contigs |
|  |  | results from blatting contigs |
|  |  | igv batch file |
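The text/tabbed outputs can be inspected with a few lines of Python. The sketch below is illustrative only: it assumes the first line of the file is a header row, and the column names (`event_id`, `event_type`) and the in-memory sample are made up for the example rather than taken from the real output.

```python
import csv
import io


def read_tabbed(handle):
    """Return the rows of a tab-delimited file as dictionaries keyed by the header line."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Stand-in for an open output file; the columns are hypothetical
example = io.StringIO("event_id\tevent_type\nev1\tdeletion\nev2\tinversion\n")
rows = read_tabbed(example)
types = [row["event_type"] for row in rows]
print(types)
```

For a real file, pass an open file handle (for example `open(path, newline="")`) in place of the `io.StringIO` object.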
Algorithm Overview
- (For each breakpoint pair)
  - Calculate the window/region to read from the bam and collect evidence
  - Store evidence (flanking read pair, half-mapped read, spanning read, split read, compatible flanking pairs) which matches the expected event type and position
  - Assemble a contig from the collected reads. See theory - assembling contigs
- Generate a fasta file containing all the contig sequences
- Align contigs to the reference genome (currently blat is used to perform this step)
- Make the final event calls. Each level of calls consumes all supporting reads so they are not re-used in subsequent levels of calls.
  - (For each breakpoint pair)
    - call by contig
    - call by spanning read
    - call by split read
    - call by flanking read pair. See theory - calling breakpoints by flanking evidence
- Output new calls, evidence, contigs, etc
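The way each level of calls consumes its supporting reads can be sketched as below. This is a minimal illustration under stated assumptions, not the package's implementation: the `Call` class, the `call_events` function, the `min_support` threshold, and the representation of evidence as sets of read names are all hypothetical.

```python
from dataclasses import dataclass, field

# Call levels in the order given in the overview
CALL_LEVELS = ["contig", "spanning read", "split read", "flanking read pair"]


@dataclass
class Call:
    method: str
    reads: set = field(default_factory=set)


def call_events(evidence_by_level, min_support=2):
    """Make tiered calls for one breakpoint pair.

    evidence_by_level maps a level name to the set of read names supporting a
    call at that level. Reads used by one level are removed from consideration
    at every later level. min_support is an assumed threshold for illustration.
    """
    consumed = set()
    calls = []
    for level in CALL_LEVELS:
        available = evidence_by_level.get(level, set()) - consumed
        if len(available) >= min_support:
            calls.append(Call(level, available))
            consumed |= available
    return calls


evidence = {
    "contig": {"r1", "r2", "r3"},
    "split read": {"r2", "r3", "r4", "r5"},
    "flanking read pair": {"r5", "r6"},
}
calls = call_events(evidence)
summary = [(c.method, sorted(c.reads)) for c in calls]
print(summary)
```

In this example the contig call takes `r1` to `r3`, so the split read call is left with only `r4` and `r5`, and the flanking read pair level no longer has enough unused reads to make a call.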
modules