TopHat

Multiple hits

The -g/--max-multihits <int> option instructs TopHat to allow up to this many alignments to the reference for a given read, and to suppress all alignments for reads with more than this many alignments.
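The "all or nothing" behaviour described above can be sketched as follows. This is only an illustration of the rule as worded here, not TopHat's actual implementation; the `Alignment` struct and the `apply_max_multihits` function are hypothetical names made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one alignment of a read; not a TopHat type.
struct Alignment {
    std::string ref_name;
    long position;
};

// Keep a read's alignments only if there are at most max_multihits of them.
// A read with more hits than the limit gets no alignments reported at all,
// rather than being cut down to the first max_multihits hits.
std::vector<Alignment> apply_max_multihits(const std::vector<Alignment>& hits,
                                           std::size_t max_multihits) {
    if (hits.size() > max_multihits) {
        return {};
    }
    return hits;
}
```

Under this sketch, a read with 3 hits is kept in full when the limit is 3 and dropped entirely when the limit is 2.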

Calculate MAPQ

This is the formula TopHat uses to calculate MAPQ, taken from tophat_reports.cpp in the TopHat source code. In the R loop that follows it, i is the number of hits and m is the MAPQ score.

    int mapQ = 255;
    if (grade.num_alignments > 1) {
        // grade.num_alignments is the number of equally good best hits
        double err_prob = 1 - (1.0 / grade.num_alignments);
        mapQ = (int)(-10.0 * log(err_prob) / log(10.0));
    }

    # The same formula evaluated in R for 1 to 40 hits
    for (i in 1:40) {
        e = 1 - (1/i)
        m = round(-10 * log(e) / log(10))
        print(c(i, m))
    }
    [1]   1 Inf
    [1] 2 3
    [1] 3 2
    [1] 4 1
    [1] 5 1
    [1] 6 1
    [1] 7 1
    [1] 8 1
    [1] 9 1
    [1] 10  0
    [1] 11  0
    [1] 12  0
    [1] 13  0
    [1] 14  0
    [1] 15  0
    [1] 16  0
    [1] 17  0
    [1] 18  0
    [1] 19  0
    [1] 20  0
    [1] 21  0
    [1] 22  0
    [1] 23  0
    [1] 24  0
    [1] 25  0
    [1] 26  0
    [1] 27  0
    [1] 28  0
    [1] 29  0
    [1] 30  0
    [1] 31  0
    [1] 32  0
    [1] 33  0
    [1] 34  0
    [1] 35  0
    [1] 36  0
    [1] 37  0
    [1] 38  0
    [1] 39  0
    [1] 40  0

So, with the rounding used in the R loop:

255 = unique mapping

3 = maps to 2 locations in the target

2 = maps to 3 locations

1 = maps to 4-9 locations

0 = maps to 10 or more locations

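In reality, however, TopHat reports 1 for a read that maps to 3 locations. The reason is that the C++ code casts the result to int, which truncates 1.76 to 1, whereas the R loop uses round(), which gives 2. With truncation, the formula yields 3 for 2 locations, 1 for 3-4 locations, and 0 for 5 or more locations.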
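The gap between the R table and what TopHat reports can be checked with a small sketch. The two function names, `mapq_truncated` and `mapq_rounded`, are hypothetical and not part of TopHat: the first mirrors the (int) cast in the snippet from tophat_reports.cpp, and the second mirrors R's round().

```cpp
#include <cmath>

// Mirrors the quoted TopHat snippet: the cast to int truncates toward zero.
int mapq_truncated(int num_alignments) {
    int mapq = 255;  // a single best hit keeps the default of 255
    if (num_alignments > 1) {
        double err_prob = 1.0 - (1.0 / num_alignments);
        mapq = static_cast<int>(-10.0 * std::log(err_prob) / std::log(10.0));
    }
    return mapq;
}

// Same formula, but rounded to the nearest integer as in the R loop.
// For a single hit the R loop prints Inf; here 255 is returned instead,
// to match the C++ default.
int mapq_rounded(int num_alignments) {
    if (num_alignments <= 1) {
        return 255;
    }
    double err_prob = 1.0 - (1.0 / num_alignments);
    return static_cast<int>(std::lround(-10.0 * std::log10(err_prob)));
}
```

For 3 hits, mapq_truncated returns 1 while mapq_rounded returns 2. For 5 hits, they return 0 and 1 respectively.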

Reference: http://user.list.galaxyproject.org/about-Mapping-Quality-td4366680.html
