Multiple hits
There is an option called -g/--max-multihits <int>
, it instructs TopHat to allow up to this many alignments to the reference for a given read, and suppresses all alignments for reads with more than this many alignments.
Calculate MAPQ
This is the formule Tophat used for MAPQ calculation. i is the number of hits, m is the MAPQ score. This formule is found in tophat source code tophat_reports.cpp.
int mapQ=255; if (grade.num_alignments > 1) { double err_prob = 1 - (1.0 / grade.num_alignments); mapQ = (int)(-10.0 * log(err_prob) / log(10.0)); } # grad.num_alignments is the number of equally best hits
for (i in 1:40) { + e = 1-(1/i) + m = round(-10*log(e)/log(10)) + print(c(i,m)) + } [1] 1 Inf [1] 2 3 [1] 3 2 [1] 4 1 [1] 5 1 [1] 6 1 [1] 7 1 [1] 8 1 [1] 9 1 [1] 10 0 [1] 11 0 [1] 12 0 [1] 13 0 [1] 14 0 [1] 15 0 [1] 16 0 [1] 17 0 [1] 18 0 [1] 19 0 [1] 20 0 [1] 21 0 [1] 22 0 [1] 23 0 [1] 24 0 [1] 25 0 [1] 26 0 [1] 27 0 [1] 28 0 [1] 29 0 [1] 30 0 [1] 31 0 [1] 32 0 [1] 33 0 [1] 34 0 [1] 35 0 [1] 36 0 [1] 37 0 [1] 38 0 [1] 39 0 [1] 40 0
so: 255 = unique mapping
3 = maps to 2 locations in the target
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = maps to 10 or more locations.
But in reality, tophat report 1 for 3-location mapping.
reference: http://user.list.galaxyproject.org/about-Mapping-Quality-td4366680.html
Hide Comments