Title
Testing Hypotheses On Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate For Simulated Data, And How To Modify Them
Abstract
To check whether a new algorithm is better, researchers use traditional statistical techniques for hypothesis testing. In particular, when the results are inconclusive, they run more and more simulations (n_2 > n_1, n_3 > n_2, ..., n_m > n_{m-1}) until the results become conclusive. In this paper, we point out that these results may be misleading. Indeed, in the traditional approach, we select a statistic and then choose a threshold for which the probability of this statistic "accidentally" exceeding this threshold is smaller than, say, 1%. It is very easy to run additional simulations with ever-larger n. The probability of error is still 1% for each n_i, but the probability that we reach an erroneous conclusion for at least one of the values n_i increases as m increases. In this paper, we design new statistical techniques oriented towards experiments on simulated data, techniques that guarantee that the error stays under, say, 1% no matter how many experiments we run.
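The multiple-look effect described in the abstract is easy to reproduce numerically. Below is a minimal Monte Carlo sketch in Python; it is not the technique proposed in the paper, but an illustration under stated assumptions. Under the null hypothesis it re-tests a growing simulated sample at each of m looks, first with the naive fixed 1% threshold and then with an assumed alpha-spending rule (level alpha / 2^i at look i, so the per-look levels sum to less than alpha). All parameter names and values (M, BATCH, TRIALS, the 2^{-i} spending schedule) are illustrative assumptions.

import random
from statistics import NormalDist, fmean, stdev

ALPHA = 0.01    # desired overall (family-wise) error probability
M = 10          # number of looks: each look adds more simulations
BATCH = 50      # simulated observations added per look
TRIALS = 5000   # Monte Carlo repetitions under the null hypothesis

def z_stat(sample):
    # Standardized sample mean; under the null the true mean is 0.
    n = len(sample)
    return fmean(sample) / (stdev(sample) / n ** 0.5)

# Naive rule: the same 1% threshold at every look.
naive_cut = NormalDist().inv_cdf(1 - ALPHA)
# Assumed alpha-spending rule: level ALPHA / 2**i at look i, so the
# per-look levels sum to less than ALPHA for any number of looks.
spend_cuts = [NormalDist().inv_cdf(1 - ALPHA / 2 ** i) for i in range(1, M + 1)]

naive_errors = 0
spend_errors = 0
for _ in range(TRIALS):
    sample = []
    naive_hit = spend_hit = False
    for i in range(M):
        sample += [random.gauss(0.0, 1.0) for _ in range(BATCH)]
        z = z_stat(sample)
        naive_hit = naive_hit or z > naive_cut
        spend_hit = spend_hit or z > spend_cuts[i]
    naive_errors += naive_hit
    spend_errors += spend_hit

print("naive 1% at every look :", naive_errors / TRIALS)  # noticeably above 0.01
print("alpha-spending variant :", spend_errors / TRIALS)  # stays below 0.01

By the union bound, the spending variant keeps the chance of any false "conclusive" result below 1% no matter how many looks are taken, which is the kind of guarantee the abstract calls for; the paper itself develops techniques tailored specifically to simulated data.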
Year
2006
DOI
10.20965/jaciii.2006.p0260
Venue
Journal of Advanced Computational Intelligence and Intelligent Informatics
Keywords
hypothesis testing, simulated data
Field
Data dredging, Alternative hypothesis, Population, Statistic, Computer science, Testing hypotheses suggested by the data, Artificial intelligence, Statistics, Exploratory data analysis, Standard deviation, Machine learning, Statistical hypothesis testing
DocType
Journal
Volume
10
Issue
3
ISSN
1343-0130
Citations
0
PageRank
0.34
References
7
Authors
3
Name               Order  Citations  PageRank
Richard Aló            1         11      1.93
Vladik Kreinovich      2       1091    281.07
Scott A. Starks        3         61     12.76