"A model, like a novel, may resonate with nature, but it is not a ‘real’ thing. Like a novel, a model may be convincing - it may ‘ring true’ if it is consistent with our experience of the natural world. But just as we may wonder how much the characters in a novel are drawn from real life and how much is artifice, we might ask the same of a model: How much is based on observation and measurement of accessible phenomena, how much is convenience? Fundamentally, the reason for modeling is a lack of full access, either in time or space, to the phenomena of interest." (Kenneth Belitz, Science, Vol. 263, 1994)
"Hypothesis is a tool which can cause trouble if not used properly. We must be ready to abandon our hypothesis as soon as it is shown to be inconsistent with the facts." (William I B Beveridge, "The Art of Scientific Investigation", 1950)
"A good estimator will be unbiased and will converge more and more closely (in the long run) on the true value as the sample size increases. Such estimators are known as consistent. But consistency is not all we can ask of an estimator. In estimating the central tendency of a distribution, we are not confined to using the arithmetic mean; we might just as well use the median. Given a choice of possible estimators, all consistent in the sense just defined, we can see whether there is anything which recommends the choice of one rather than another. The thing which at once suggests itself is the sampling variance of the different estimators, since an estimator with a small sampling variance will be less likely to differ from the true value by a large amount than an estimator whose sampling variance is large." (Michael J Moroney, "Facts from Figures", 1951)
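Moroney's comparison can be sketched numerically. A minimal simulation (standard library only, with illustrative sample sizes) draws repeated samples from a normal distribution and measures how much the mean and the median vary as estimators of its centre; both are consistent, but the mean's sampling variance is the smaller of the two:

```python
import random
import statistics

random.seed(42)

def sampling_variance(estimator, n_samples=2000, sample_size=50):
    """Draw many samples and measure how much the estimator's value varies."""
    estimates = [
        estimator([random.gauss(0, 1) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]
    return statistics.pvariance(estimates)

var_mean = sampling_variance(statistics.mean)
var_median = sampling_variance(statistics.median)

print(f"variance of the mean:   {var_mean:.4f}")
print(f"variance of the median: {var_median:.4f}")
# For normal data the mean's sampling variance is smaller
# (in theory by a factor of roughly pi/2).
```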
"Consistency and completeness can also be characterized in terms of models: a theory T is consistent if and only if it has at least one model; it is complete if and only if every sentence of T which is satisfied in one model is also satisfied in any other model of T. Two theories T1 and T2 are said to be compatible if they have a common consistent extension; this is equivalent to saying that the union of T1 and T2 is consistent." (Alfred Tarski et al, "Undecidable Theories", 1953)
"[I]n probability theory we are faced with situations in which our intuition or some physical experiments we have carried out suggest certain results. Intuition and experience lead us to an assignment of probabilities to events. As far as the mathematics is concerned, any assignment of probabilities will do, subject to the rules of mathematical consistency." (Robert Ash, "Basic probability theory", 1970)
"Information that is only partially structured (and therefore contains some 'noise') is fuzzy, inconsistent, and indistinct. Such imperfect information may be regarded as having merit only if it represents an intermediate step in structuring the information into a final meaningful form. If the partially structured information remains in fuzzy form, it will create a state of dissatisfaction in the mind of the originator and certainly in the mind of the recipient. The natural desire is to continue structuring until clarity, simplicity, precision, and definitiveness are obtained." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)
"A single observation that is inconsistent with some generalization points to the falsehood of the generalization, and thereby 'points to itself'." (Ian Hacking, "The Emergence Of Probability", 1975)
"In any particular application, graphical or other informal analysis may show that consistency or inconsistency with H0 is so clear cut that explicit calculation of p is unnecessary." (David R Cox, "The role of significance tests", Scandinavian Journal of Statistics 4, 1977)
"A mathematical model is any complete and consistent set of mathematical equations which are designed to correspond to some other entity, its prototype. The prototype may be a physical, biological, social, psychological or conceptual entity, perhaps even another mathematical model." (Rutherford Aris, "Mathematical Modelling", 1978)
"When evaluating a model, at least two broad standards are relevant. One is whether the model is consistent with the data. The other is whether the model is consistent with the ‘real world.’" (Kenneth Bollen, "Structural Equations with Latent Variables", 1989)
"The term chaos is used in a specific sense where it is an inherently random pattern of behaviour generated by fixed inputs into deterministic (that is fixed) rules (relationships). The rules take the form of non-linear feedback loops. Although the specific path followed by the behaviour so generated is random and hence unpredictable in the long-term, it always has an underlying pattern to it, a 'hidden' pattern, a global pattern or rhythm. That pattern is self-similarity, that is a constant degree of variation, consistent variability, regular irregularity, or more precisely, a constant fractal dimension. Chaos is therefore order (a pattern) within disorder (random behaviour)." (Ralph D Stacey, "The Chaos Frontier: Creative Strategic Control for Business", 1991)
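Stacey's "random pattern from fixed rules" can be illustrated with the logistic map, a standard one-line example of deterministic chaos (a sketch with arbitrary starting values, not anything specific to Stacey's text). A fixed rule generates long-term-unpredictable behaviour, yet every value stays within the same global pattern:

```python
# The logistic map x -> r*x*(1-x) with r = 4 is a fixed, deterministic rule
# whose output nevertheless looks random.
def logistic_orbit(x0, r=4.0, steps=100):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_orbit(0.300000)
b = logistic_orbit(0.300001)  # a tiny change in the input ...

# ... produces a completely different trajectory (long-term unpredictability):
print(max(abs(x - y) for x, y in zip(a, b)))
# ... while both orbits remain inside the same underlying pattern, [0, 1]:
print(all(0.0 <= x <= 1.0 for x in a + b))
```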
"When looking at the end result of any statistical analysis, one must be very cautious not to over interpret the data. Care must be taken to know the size of the sample, and to be certain the method for gathering information is consistent with other samples gathered. […] No one should ever base conclusions without knowing the size of the sample and how random a sample it was. But all too often such data is not mentioned when the statistics are given - perhaps it is overlooked or even intentionally omitted." (Theoni Pappas, "More Joy of Mathematics: Exploring mathematical insights & concepts", 1991)
"Data are generally collected as a basis for action. However, unless potential signals are separated from probable noise, the actions taken may be totally inconsistent with the data. Thus, the proper use of data requires that you have simple and effective methods of analysis which will properly separate potential signals from probable noise." (Donald J Wheeler, "Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)
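One "simple and effective method" Wheeler has in mind is the XmR (individuals) chart, whose natural process limits (mean plus or minus 2.66 times the average moving range) separate routine variation, i.e. probable noise, from potential signals. A minimal sketch with made-up measurements:

```python
def natural_process_limits(data):
    """Natural process limits for an individuals (XmR) chart."""
    mean = sum(data) / len(data)
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    avg_mr = sum(moving_ranges) / len(moving_ranges)
    return mean - 2.66 * avg_mr, mean + 2.66 * avg_mr

# Hypothetical measurements: mostly routine variation plus one upset.
data = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 14.5, 10.1, 9.7, 10.0]
lo, hi = natural_process_limits(data)
signals = [x for x in data if not lo <= x <= hi]
print(signals)
# -> [14.5]
```

Values inside the limits are treated as noise and left alone; only the value outside them is a potential signal worth acting on.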
"Every messy dataset is messy in its own way - it’s easy to define the characteristics of a clean dataset (rows are observations, columns are variables, columns contain values of consistent types). If you start to look at real life data you’ll see every way you can imagine data being messy (and many that you can’t)!" (Hadley Wickham, "R-help mailing list", 2008)
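Wickham's definition of a clean dataset can be shown in a few lines (a made-up example, plain Python rather than R): the messy table below stores a variable, year, in its column headers, and melting those columns into rows restores one observation per row with consistently typed values.

```python
# Messy: the year variable lives in the column headers.
messy = [
    {"country": "A", "1999": 745, "2000": 2666},
    {"country": "B", "1999": 37737, "2000": 80488},
]

# Tidy: rows are observations, columns are variables, values have
# consistent types.
tidy = [
    {"country": row["country"], "year": int(year), "cases": row[year]}
    for row in messy
    for year in ("1999", "2000")
]
print(tidy[0])
# -> {'country': 'A', 'year': 1999, 'cases': 745}
```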
"It is the consistency of the information that matters for a good story, not its completeness. Indeed, you will often find that knowing little makes it easier to fit everything you know into a coherent pattern." (Daniel Kahneman, "Thinking, Fast and Slow", 2011)
"Having NUMBERSENSE means: (•) Not taking published data at face value; (•) Knowing which questions to ask; (•) Having a nose for doctored statistics. [...] NUMBERSENSE is that bit of skepticism, urge to probe, and desire to verify. It’s having the truffle hog’s nose to hunt the delicacies. Developing NUMBERSENSE takes training and patience. It is essential to know a few basic statistical concepts. Understanding the nature of means, medians, and percentile ranks is important. Breaking down ratios into components facilitates clear thinking. Ratios can also be interpreted as weighted averages, with those weights arranged by rules of inclusion and exclusion. Missing data must be carefully vetted, especially when they are substituted with statistical estimates. Blatant fraud, while difficult to detect, is often exposed by inconsistency." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)
"Accuracy and coherence are related concepts pertaining to data quality. Accuracy refers to the comprehensiveness or extent of missing data, performance of error edits, and other quality assurance strategies. Coherence is the degree to which data item value and meaning are consistent over time and are comparable to similar variables from other routinely used data sources." (Aileen Rothbard, "Quality Issues in the Use of Administrative Data Records", 2015)
"The dialectical interplay of experiment and theory is a key driving force of modern science. Experimental data do only have meaning in the light of a particular model or at least a theoretical background. Reversely theoretical considerations may be logically consistent as well as intellectually elegant: Without experimental evidence they are a mere exercise of thought no matter how difficult they are. Data analysis is a connector between experiment and theory: Its techniques advise possibilities of model extraction as well as model testing with experimental data."
"A good estimator has to be more than just consistent. It also should be one whose variance is less than that of any other estimator. This property is called minimum variance. This means that if we run the experiment several times, the 'answers' we get will be closer to one another than 'answers' based on some other estimator." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)
"Estimators are functions of the observed values that can be used to estimate specific parameters. Good estimators are those that are consistent and have minimum variance. These properties are guaranteed if the estimator maximizes the likelihood of the observations." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)
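Salsburg's claim can be checked in one concrete case (a sketch with assumed normal data and known variance, searching a grid of candidate values): the value of the mean parameter that maximizes the likelihood of the observations is, as the theory says, the sample mean.

```python
import random
import statistics

random.seed(1)
data = [random.gauss(5.0, 1.0) for _ in range(100)]

def log_likelihood(mu):
    """Normal log-likelihood in mu, up to an additive constant."""
    return sum(-0.5 * (x - mu) ** 2 for x in data)

# Grid search over candidate values of mu from 4.00 to 6.00.
candidates = [mu / 100 for mu in range(400, 601)]
best = max(candidates, key=log_likelihood)
print(best, round(statistics.mean(data), 2))
```

The maximizing candidate agrees with the sample mean to the resolution of the grid.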
"There are other problems with Big Data. In any large data set, there are bound to be inconsistencies, misclassifications, missing data - in other words, errors, blunders, and possibly lies. These problems with individual items occur in any data set, but they are often hidden in a large mass of numbers even when these numbers are generated out of computer interactions." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)
"[...] a hypothesis test tells us whether the observed data are consistent with the null hypothesis, and a confidence interval tells us which hypotheses are consistent with the data." (William C Blackwelder)
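Blackwelder's duality can be demonstrated directly (a sketch using the normal approximation and hypothetical data): a 95% confidence interval for the mean collects exactly those null values that a two-sided z-test at the 5% level would not reject.

```python
import random
import statistics

random.seed(7)
data = [random.gauss(10.0, 2.0) for _ in range(200)]

n = len(data)
mean = statistics.mean(data)
se = statistics.stdev(data) / n ** 0.5
ci = (mean - 1.96 * se, mean + 1.96 * se)  # 95% confidence interval

def rejects(mu0):
    """Two-sided z-test of H0: mu = mu0 at the 5% level."""
    return abs(mean - mu0) / se > 1.96

# Every mu0 inside the interval is consistent with the data;
# every mu0 outside it is rejected.
grid = [mu0 / 100 for mu0 in range(900, 1100)]
print(all(rejects(m) == (m < ci[0] or m > ci[1]) for m in grid))
# -> True
```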