17 July 2021

Science: On Correlation (Quotes)

"It had appeared from observation, and it was fully confirmed by this theory, that such a thing existed as an 'Index of Correlation', that is to say, a fraction, now commonly written T, that connects with close approximation every value of the deviation on the part of the subject, with the average of all the associated deviations of the Relative [...]" (Francis Galton, "Memories of My Life", 1908)

"One of the main duties of science is the correlation of phenomena, apparently disconnected and even contradictory." (Frederick Soddy, "The Interpretation of Radium and the Structure of the Atom", 1909)

"To speak of the cause of an event is therefore misleading. Any set of antecedents from which the event can theoretically be inferred by means of correlations might be called a cause of the event. But to speak of the cause is to imply a uniqueness [...]." (Bertrand Russell, "Mysticism and Logic: And Other Essays", 1910)

"'Causation' has been popularly used to express the condition of association, when applied to natural phenomena. There is no philosophical basis for giving it a wider meaning than partial or absolute association. In no case has it been proved that there is an inherent necessity in the laws of nature. Causation is correlation. [...] perfect correlation, when based upon sufficient experience, is causation in the scientific sense." (Henry E. Niles, "Correlation, Causation and Wright's Theory of 'Path Coefficients'", Genetics, 1922)

"The futile elaboration of innumerable measures of correlation, and the evasion of the real difficulties of sampling problems under cover of a contempt for small samples, were obviously beginning to make its pretensions ridiculous. These procedures were not only ill-aimed, but for all their elaboration, not sufficiently accurate." (Sir Ronald A Fisher, "Statistical Methods for Research Workers", 1925)

"Graphic methods are very commonly used in business correlation problems. On the whole, carefully handled and skillfully interpreted graphs have certain advantages over mathematical methods of determining correlation in the usual business problems. The elements of judgment and special knowledge of conditions can be more easily introduced in studying correlation graphically. Mathematical correlation is often much too rigid for the data at hand." (John R Riggleman & Ira N Frisbee, "Business Statistics", 1938)

"[…] statistical literacy. That is, the ability to read diagrams and maps; a 'consumer' understanding of common statistical terms, as average, percent, dispersion, correlation, and index number."  (Douglas Scates, "Statistics: The Mathematics for Social Problems", 1943)

"Another thing to watch out for is a conclusion in which a correlation has been inferred to continue beyond the data with which it has been demonstrated." (Darell Huff, "How to Lie with Statistics", 1954)

"Keep in mind that a correlation may be real and based on real cause and effect, and still be almost worthless in determining action in any single case." (Darell Huff, "How to Lie with Statistics", 1954)

"When you find somebody - usually an interested party - making a fuss about a correlation, look first of all to see if it is not one of this type, produced by the stream of events, the trend of the times." (Darell Huff, "How to Lie with Statistics", 1954)

"There is no correlation between the cause and the effect. The events reveal only an aleatory determination, connected not so much with the imperfection of our knowledge as with the structure of the human world." (Raymond Aron, "The Opium of the Intellectuals", 1955)

"The well-known virtue of the experimental method is that it brings situational variables under tight control. It thus permits rigorous tests of hypotheses and confidential statements about causation. The correlational method, for its part, can study what man has not learned to control. Nature has been experimenting since the beginning of time, with a boldness and complexity far beyond the resources of science. The correlator’s mission is to observe and organize the data of nature’s experiments." (Lee J Cronbach, "The Two Disciplines of Scientific Psychology", The American Psychologist Vol. 12, 1957)

"It has been said that data collection is like garbage collection: before you collect it you should have in mind what you are going to do with it." (Russell Fox & Max Gorbuny, "The Science of Science: Methods of Interpreting Physical Phenomena", 1964)

"Today we preach that science is not science unless it is quantitative. We substitute correlation for causal studies, and physical equations for organic reasoning. Measurements and equations are supposed to sharpen thinking, but [...] they more often tend to make the thinking non-causal and fuzzy." (John R Platt, "Strong Inference", Science Vol. 146 (3641), 1964)

"If we gather more and more data and establish more and more associations, however, we will not finally find that we know something. We will simply end up having more and more data and larger sets of correlations." (Kenneth N Waltz, "Theory of International Politics Source: Theory of International Politics", 1979)

"The invalid assumption that correlation implies cause is probably among the two or three most serious and common errors of human reasoning." (Stephen J Gould, "The Mismeasure of Man", 1980)

"Correlation and causation are two quite different words, and the innumerate are more prone to mistake them than most." (John A Paulos, "Innumeracy: Mathematical Illiteracy and its Consequences", 1988)

"Nature normally hates power laws. In ordinary systems all quantities follow bell curves, and correlations decay rapidly, obeying exponential laws. But all that changes if the system is forced to undergo a phase transition. Then power laws emerge-nature's unmistakable sign that chaos is departing in favor of order. The theory of phase transitions told us loud and clear that the road from disorder to order is maintained by the powerful forces of self-organization and is paved by power laws. It told us that power laws are not just another way of characterizing a system's behavior. They are the patent signatures of self-organization in complex systems." (Albert-László Barabási, "Linked: How Everything Is Connected to Everything Else and What It Means for Business, Science, and Everyday Life", 2002)

"If you flip a coin three times and it lands on heads each time, it's probably chance. If you flip it a hundred times and it lands on heads each time, you can be pretty sure the coin has heads on both sides. That's the concept behind statistical significance - it's the odds that the correlation (or other finding) is real, that it isn't just random chance." (T Colin Campbell, "The China Study", 2004)

"Nonetheless, the basic principles regarding correlations between variables are not that diffcult to understand. We must look for patterns that reveal potential relationships and for evidence that variables are actually related. But when we do spot those relationships, we should not jump to conclusions about causality. Instead, we need to weigh the strength of the relationship and the plausibility of our theory, and we must always try to discount the possibility of spuriousness." (Joel Best, "More Damned Lies and Statistics: How numbers confuse public issues", 2004)

"Given the important role that correlation plays in structural equation modeling, we need to understand the factors that affect establishing relationships among multivariable data points. The key factors are the level of measurement, restriction of range in data values (variability, skewness, kurtosis), missing data, nonlinearity, outliers, correction for attenuation, and issues related to sampling variation, confidence intervals, effect size, significance, sample size, and power." (Randall E Schumacker & Richard G Lomax, "A Beginner’s Guide to Structural Equation Modeling" 3rd Ed., 2010)

"[...] if you want to show change through time, use a time-series chart; if you need to compare, use a bar chart; or to display correlation, use a scatter-plot - because some of these rules make good common sense." (Alberto Cairo, "The Functional Art", 2011)

"Economists should study financial markets as they actually operate, not as they assume them to operate - observing the way in which information is actually processed, observing the serial correlations, bonanzas, and sudden stops, not assuming these away as noise around the edges of efficient and rational markets." (Adair Turner, "Economics after the Crisis: Objectives and means", 2012)

"Without precise predictability, control is impotent and almost meaningless. In other words, the lesser the predictability, the harder the entity or system is to control, and vice versa. If our universe actually operated on linear causality, with no surprises, uncertainty, or abrupt changes, all future events would be absolutely predictable in a sort of waveless orderliness." (Lawrence K Samuels, "Defense of Chaos", 2013)

"The problem of complexity is at the heart of mankind’s inability to predict future events with any accuracy. Complexity science has demonstrated that the more factors found within a complex system, the more chances of unpredictable behavior. And without predictability, any meaningful control is nearly impossible. Obviously, this means that you cannot control what you cannot predict. The ability ever to predict long-term events is a pipedream. Mankind has little to do with changing climate; complexity does." (Lawrence K Samuels, "The Real Science Behind Changing Climate", LewRockwell.com, August 1, 2014)

"The correlational technique known as multiple regression is used frequently in medical and social science research. This technique essentially correlates many independent (or predictor) variables simultaneously with a given dependent variable (outcome or output). It asks, 'Net of the effects of all the other variables, what is the effect of variable A on the dependent variable?' Despite its popularity, the technique is inherently weak and often yields misleading results. The problem is due to self-selection. If we don’t assign cases to a particular treatment, the cases may differ in any number of ways that could be causing them to differ along some dimension related to the dependent variable. We can know that the answer given by a multiple regression analysis is wrong because randomized control experiments, frequently referred to as the gold standard of research techniques, may give answers that are quite different from those obtained by multiple regression analysis." (Richard E Nisbett, "Mindware: Tools for Smart Thinking", 2015)

"The theory behind multiple regression analysis is that if you control for everything that is related to the independent variable and the dependent variable by pulling their correlations out of the mix, you can get at the true causal relation between the predictor variable and the outcome variable. That’s the theory. In practice, many things prevent this ideal case from being the norm." (Richard E Nisbett, "Mindware: Tools for Smart Thinking", 2015)

"Correlation is not equivalent to cause for one major reason. Correlation is well defined in terms of a mathematical formula. Cause is not well defined." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)

"The degree to which one variable can be predicted from another can be calculated as the correlation between them. The square of the correlation (R^2) is the proportion of the variance of one that can be 'explained' by knowledge of the other." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)

"Correlation quantifies the relationship between features. The purpose of correlation analysis is to understand the dependencies between features, so that observed effects can be explained or desired effects can be achieved." (Thomas A Runkler, "Data Analytics: Models and Algorithms for Intelligent Data Analysis" 3rd Ed., 2020)

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...