Presentation on theme: "Sequence comparison: Multiple testing correction"— Presentation transcript:

1 Sequence comparison: Multiple testing correction
Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble

2 One-minute response – what people liked
Practice problems are getting more complex, but it’s helpful.
Examples were very helpful.
Really good practice questions today. x2
Problem sets were good – just challenging enough.
Overall, the first part of the lecture makes sense, and I normally get involved by the 2nd part.
Thank you for the long practice time at the end.
Good step-by-step explaining of statistical analysis portion.
I appreciate how available you and Lindsay are during the exercises to answer questions.
The examples for p-values were clear and easy to understand.
Python section was really useful.
I appreciate the modifications you have made in response to requests.
I appreciate examples using both DNA and proteins.
I enjoyed learning stats for a new type of problem.
I liked the slide that specified whether a file or string was input/output.

3 One-minute response – pacing
Pacing was good. x4
I liked the pacing for the Python part of the class.
Great pace and contents covered in both parts of the class.
Good pacing, good information.
Good lecture and pacing.
Good pacing for the problem-solving part of the class.
It seems like we went through the programming a bit fast today. x3
Lecture part was at a good pace; Python part seemed a bit fast.
Speed of the lecture was great until we started programming and you kept talking.
Wish we had more time to think about the problems.

4 One-minute response Your two examples of Smith-Waterman score distributions on slide 24 look suspiciously symmetric.

5 One-minute response
Theory part:
I think it would be helpful to go over the distribution stuff a bit more. Like walking through making the graph and calculating a p-value.
Would rather spend whole class today on statistical/theoretical and whole next class on Python.
First part was a little too basic, so it was more confusing trying to link the concepts with the basic statistics.
Python part:
It would help to know a little more about how Python is actually thinking about each command to improve flexibility and debugging.
It would be helpful to go over the sample problems at the beginning of the next class.
I would like to be able to read data tables from .csv files.
I am confused by the use of readline to get another line besides the first.
I liked the examples but took a little extra time to get the I/O commands due to the need to create the extra test files.

6 The show so far … Most statistical tests compare observed data to the expected result according to the null hypothesis. Sequence similarity scores follow an extreme value distribution, which is characterized by a longer right tail than a normal distribution. The p-value associated with a score is the area under the curve to the right of that score. The p-value for score x for an extreme value distribution with parameters μ and λ is p = 1 − exp(−e^(−λ(x − μ))).
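A minimal Python sketch of that calculation (the function name and the μ and λ values below are made-up placeholders for illustration, not fitted parameters):

    import math

    def evd_pvalue(x, mu, lam):
        # P-value under an extreme value distribution with location mu and
        # scale lam: 1 - exp(-e^(-lam * (x - mu)))
        return 1.0 - math.exp(-math.exp(-lam * (x - mu)))

    # Higher scores give smaller p-values.
    print(evd_pvalue(25, mu=20.0, lam=0.3))   # ~0.20
    print(evd_pvalue(40, mu=20.0, lam=0.3))   # ~0.0025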

7 What p-value is significant?

8 What p-value is significant?
The most common thresholds are 0.01 and 0.05. A threshold of 0.05 means you are 95% sure that the result is significant. Is 95% enough? It depends upon the cost associated with making a mistake. Examples of costs: Doing expensive wet lab validation. Making clinical treatment decisions. Misleading the scientific community. Most sequence analysis uses more stringent thresholds because the p-values are not very accurate.

9 Two types of errors In statistical testing, there are two types of errors: False positive (aka “Type I error”): incorrectly rejecting the null hypothesis. I.e., saying that two sequences are homologous when they are not. False negative (aka “Type II error”): incorrectly accepting the null hypothesis. I.e., saying that two sequences are not homologous when they are. We use p-values to control the false positive rate.

10 Multiple testing Say that you perform a statistical test with a 0.05 threshold, but you repeat the test on twenty different observations. Assume that all of the observations are explainable by the null hypothesis. What is the chance that at least one of the observations will receive a p-value less than 0.05?

11 Multiple testing Say that you perform a statistical test with a 0.05 threshold, but you repeat the test on twenty different observations. Assuming that all of the observations are explainable by the null hypothesis, what is the chance that at least one of the observations will receive a p-value less than 0.05?
Pr(making a mistake) = 0.05
Pr(not making a mistake) = 0.95
Pr(not making any mistake) = 0.95^20 = 0.358
Pr(making at least one mistake) = 1 − 0.358 = 0.642
There is a 64.2% chance of making at least one mistake.
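The same arithmetic in a couple of lines of Python (this product formula assumes the twenty tests are independent):

    n_tests = 20
    alpha = 0.05

    p_no_mistake = (1 - alpha) ** n_tests      # 0.95**20, about 0.358
    p_at_least_one = 1 - p_no_mistake          # about 0.642
    print(p_no_mistake, p_at_least_one)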

12 Family-wise error rate
For a single test, the Type I error rate is the probability of incorrectly rejecting the null hypothesis. For multiple tests, the family-wise error rate is the probability of incorrectly rejecting at least one null hypothesis.

13 Bonferroni correction
Divide the desired p-value threshold by the number of tests performed. For the previous example, 0.05 / 20 = 0.0025.
Pr(making a mistake) = 0.0025
Pr(not making a mistake) = 0.9975
Pr(not making any mistake) = 0.9975^20 = 0.951
Pr(making at least one mistake) = 1 − 0.951 = 0.049
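A sketch of the corrected calculation in Python (the product step again assumes independent tests, even though the Bonferroni bound itself does not require independence):

    n_tests = 20
    alpha = 0.05

    bonferroni_threshold = alpha / n_tests                  # 0.0025
    fwer = 1 - (1 - bonferroni_threshold) ** n_tests        # about 0.049, below 0.05
    print(bonferroni_threshold, fwer)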

14 Proof: Bonferroni adjustment controls the family-wise error rate
Let m = number of hypotheses, m0 = number of true null hypotheses, α = desired control, and pi = the ith p-value. Reject hypothesis i when pi ≤ α/m. Then

FWER = Pr(incorrectly rejecting at least one null hypothesis)
     = Pr( ∪i { pi ≤ α/m } )       (union over the m0 true null hypotheses)
     ≤ Σi Pr( pi ≤ α/m )           (Boole’s inequality)
     ≤ Σi α/m                      (definition of p-value)
     = m0 · α/m ≤ α                (definition of m and m0)

Note: Bonferroni adjustment does not require that the tests be independent.
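A small simulation sketch (not from the slides) that illustrates the bound: with every null hypothesis true, the fraction of trials in which at least one of the m p-values falls below α/m stays at or below α. The simulation draws independent uniform p-values purely to keep the code short; the bound itself holds regardless of dependence.

    import random

    m, alpha, n_trials = 20, 0.05, 100_000
    threshold = alpha / m

    # Under the null hypothesis, each p-value is uniform on [0, 1].
    trials_with_false_rejection = sum(
        any(random.random() <= threshold for _ in range(m))
        for _ in range(n_trials)
    )
    print(trials_with_false_rejection / n_trials)   # estimated FWER, about 0.049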

15 Database searching Say that you search the non-redundant protein database at NCBI, containing roughly one million sequences. What p-value threshold should you use?

16 Database searching Say that you search the non-redundant protein database at NCBI, containing roughly one million sequences. What p-value threshold should you use? Say that you want to use a conservative p-value of 0.001. Recall that you would observe such a p-value by chance approximately once in every 1000 trials against a random database. A Bonferroni correction would suggest using a p-value threshold of 0.001 / 1,000,000 = 0.000000001 = 10⁻⁹.

17 E-values A p-value is the probability of making a mistake.
The E-value is the expected number of times that the given score would appear in a random database of the given size. One simple way to compute the E-value is to multiply the p-value by the size of the database. Thus, for a p-value of 0.001 and a database of 1,000,000 sequences, the corresponding E-value is 0.001 × 1,000,000 = 1,000. BLAST actually calculates E-values in a more complex way.
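That simple conversion as a one-function Python sketch (the function name is illustrative only; as noted above, BLAST’s real E-value calculation is more involved):

    def simple_evalue(p_value, database_size):
        # Naive E-value: expected number of matches at least this good
        # in a random database of the given size.
        return p_value * database_size

    print(simple_evalue(0.001, 1_000_000))   # 1000.0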

21 Summary Confidence thresholds of 0.01 and 0.05 are arbitrary.
In practice, the threshold should be selected based on the costs associated with false positive and false negative conclusions. The Bonferroni correction divides the desired threshold by the number of tests performed. The E-value is analogous to the Bonferroni adjustment, except that it multiplies the p-value by the number of tests performed rather than dividing the threshold by it.

