19 May 2012

Knowledge Representation: On Data (Quotes)

"Mathematicians obtain the solution of a problem by the mere arrangement of data, and by reducing their reasoning to such simple steps, to conclusions so very obvious, as never to lose sight of the evidence which guides them." (Antoine Lavoisier, "Elements of Chemistry In a New Systematic Order", 1790)

"Before anything can be reasoned upon to a conclusion, certain facts, principles, or data, to reason from, must be established, admitted, or denied." (Thomas Paine, "Rights of Man", 1791)

"The modern age has a false sense of superiority because of the great mass of data at its disposal. But the valid criterion of distinction is rather the extent to which man knows how to form and master the material at his command." (Johann Wolfgang von Goethe, "On Theory of Color", 1810)

"The errors which arise from the absence of facts are far more numerous and more durable than those which result from unsound reasoning respecting true data." (Charles Babbage, "On the Economy of Machinery and Manufactures", 1832)

"In every branch of knowledge the progress is proportional to the amount of facts on which to build, and therefore to the facility of obtaining data." (James C Maxwell, [letter to Lewis Campbell] 1851)

"It usually happens in scientific progress, that when a great fact is at length discovered, it approves itself at once to all competent judges. It furnishes a solution to so many problems, and harmonizes with so many other facts, - that all the other data as it were crystallize at once about it." (Edward Everett, "The Uses of Astronomy", [An Oration Delivered at Albany] 1856)

"The ignoring of data is, in fact, the easiest and most popular mode of obtaining unity in one's thought." (William James, "The Sentiment of Rationality", Mind Vol. 4, 1879)

"It is a capital mistake to theorise before one has data." (Arthur C Doyle, "The Adventures of Sherlock Holmes", 1892)

"Physical research by experimental methods is both a broadening and a narrowing field. There are many gaps yet to be filled, data to be accumulated, measurements to be made with great precision, but the limits within which we must work are becoming, at the same time, more and more defined." (Elihu Thomson, "Annual Report of the Board of Regents of the Smithsonian Institution", 1899)

"The data with which any scientific inquiry has to do are trivialities in some other bearing than that one in which they are of account." (Thorstein Veblen, "The Place of Science in Modern Civilisation and Other Essays", 1906)

"The first step in beginning the scientific study of a problem is to collect the data, which are or ought to be 'facts'." (John A Thomson, "Introduction to Science", 1911)

"The man of science, by virtue of his training, is alone capable of realising the difficulties - often enormous - of obtaining accurate data upon which just judgment may be based." (Sir Richard Gregory, "Discovery; or, The Spirit and Service of Science", 1918)

"Philosophy, like science, consists of theories or insights arrived at as a result of systemic reflection or reasoning in regard to the data of experience. It involves, therefore, the analysis of experience and the synthesis of the results of analysis into a comprehensive or unitary conception. Philosophy seeks a totality and harmony of reasoned insight into the nature and meaning of all the principal aspects of reality." (Joseph A Leighton, "The Field of Philosophy: An outline of lectures on introduction to philosophy," 1919)

"A 'poor evaluation' of the probability of anything may reflect ignorance of relevant data which 'ought' to be known." (Clarence I Lewis, "Mind and the World-Order: Outline of a Theory of Knowledge", 1924)

"No human mind is capable of grasping in its entirety the meaning of any considerable quantity of numerical data." (Frank Yates & Ronald Fisher, "Statistical Methods for Research Workers", 1925)

"Take the situation of a scientist solving a problem, where he has certain data, which call for certain responses. Some of this set of data call for his applying such and such a law, while others call for another law." (George H Mead, "Mind, Self, and Society", 1934)

"The laws of science are the permanent contributions to knowledge - the individual pieces that are fitted together in an attempt to form a picture of the physical universe in action. As the pieces fall into place, we often catch glimpses of emerging patterns, called theories; they set us searching for the missing pieces that will fill in the gaps and complete the patterns. These theories, these provisional interpretations of the data in hand, are mere working hypotheses, and they are treated with scant respect until they can be tested by new pieces of the puzzle." (Edwin P Hubble, "Experiment and Experience", [Commencement Address, California Institute of Technology] 1938)

"Not even the most subtle and skilled analysis can overcome completely the unreliability of basic data." (Roy D G Allen, "Statistics for Economists", 1951)

"When evaluating the reliability and generality of data, it is often important to know the aims of the experimenter. When evaluating the importance of experimental results, however, science has a trick of disregarding the experimenter's rationale and finding a more appropriate context for the data than the one he proposed." (Murray Sidman, "Tactics of Scientific Research", 1960)

"Philosophers of science have repeatedly demonstrated that more than one theoretical construction can always be placed upon a given collection of data." (Thomas Kuhn, "The Structure of Scientific Revolutions", 1962) 

"We must include in any language with which we hope to describe complex data-processing situations the capability for describing data." (Grace Hopper, "Management and the Computer of the Future", 1962)

"Modern science is characterized by its ever-increasing specialization, necessitated by the enormous amount of data, the complexity of techniques and of theoretical structures within every field. Thus science is split into innumerable disciplines continually generating new subdisciplines. In consequence, the physicist, the biologist, the psychologist and the social scientist are, so to speak, encapsulated in their private universes, and it is difficult to get word from one cocoon to the other." (Ludwig von Bertalanffy, "General System Theory", 1968)

"At root what is needed for scientific inquiry is just receptivity to data, skill in reasoning, and yearning for truth. Admittedly, ingenuity can help too." (Willard V O Quine, "The Web of Belief", 1970)

"Statistical methods of analysis are intended to aid the interpretation of data that are subject to appreciable haphazard variability." (David V. Hinkley & David Cox, "Theoretical Statistics", 1974)

"In a way, science might be described as paranoid thinking applied to Nature: we are looking for natural conspiracies, for connections among apparently disparate data." (Carl Sagan, "The Dragons of Eden", 1977)

"If we gather more and more data and establish more and more associations, however, we will not finally find that we know something. We will simply end up having more and more data and larger sets of correlations." (Kenneth N Waltz, "Theory of International Politics", 1979)

"There is a tendency to mistake data for wisdom, just as there has always been a tendency to confuse logic with values, intelligence with insight. Unobstructed access to facts can produce unlimited good only if it is matched by the desire and ability to find out what they mean and where they lead." (Norman Cousins, "Human Options: An Autobiographical Notebook", 1981)

"Data in isolation are meaningless, a collection of numbers. Only in context of a theory do they assume significance […]" (George Greenstein, "Frozen Star", 1983)

"Data is raw. It simply exists and has no significance beyond its existence (in and of itself). It can exist in any form, usable or not. It does not have meaning of itself. In computer parlance, a spreadsheet generally starts out by holding data." (Russell L Ackoff, "Towards a Systems Theory of Organization", 1985)

"Information is data that has been given meaning by way of relational connection. This 'meaning' can be useful, but does not have to be. In computer parlance, a relational database makes information from the data stored within it." (Russell L Ackoff, "Towards a Systems Theory of Organization", 1985)

"Intuition becomes increasingly valuable in the new information society precisely because there is so much data." (John Naisbitt, "Re-Inventing the Corporation", 1985)

"The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data." (John Tukey, "Sunset Salvo", The American Statistician Vol. 40 (1), 1986)

"Intuition is the art, peculiar to the human mind, of working out the correct answer from data that is, in itself, incomplete or even, perhaps, misleading." (Isaac Asimov, "Forward the Foundation", 1993)

"Now that knowledge is taking the place of capital as the driving force in organizations worldwide, it is all too easy to confuse data with knowledge and information technology with information." (Peter Drucker, "Managing in a Time of Great Change", 1995) 

"Paradigms are the most general - rather like a philosophical or ideological framework. Theories are more specific, based on the paradigm and designed to describe what happens in one of the many realms of events encompassed by the paradigm. Models are even more specific providing the mechanisms by which events occur in a particular part of the theory's realm. Of all three, models are most affected by empirical data - models come and go, theories only give way when evidence is overwhelmingly against them and paradigms stay put until a radically better idea comes along." (Lee R Beach, "The Psychology of Decision Making: People in Organizations", 1997)

"Data is discrimination between physical states of things (black, white, etc.) that may convey or not convey information to an agent. Whether it does so or not depends on the agent's prior stock of knowledge." (Max Boisot, "Knowledge Assets", 1998)

"The unit of coding is the most basic segment, or element, of the raw data or information that can be assessed in a meaningful way regarding the phenomenon." (Richard Boyatzis, "Transforming qualitative information", 1998)

"While hard data may inform the intellect, it is largely soft data that generates wisdom." (Henry Mintzberg, "Strategy Safari: A Guided Tour Through The Wilds of Strategic Management", 1998)

"The more data we have, the more likely we are to drown in it." (Nassim N Taleb, "Fooled by Randomness", 2001)

"Data is a fact of life. As time goes by, we collect more and more data, making our original reason for collecting the data harder to accomplish. We don't collect data just to waste time or keep busy; we collect data so that we can gain knowledge, which can be used to improve the efficiency of our organization, improve profit margins, and on and on. The problem is that as we collect more data, it becomes harder for us to use the data to derive this knowledge. We are being suffocated by this raw data, yet we need to find a way to use it." (Seth Paul et al. "Preparing and Mining Data with Microsoft SQL Server 2000 and Analysis", 2002)

"Good communication is not just data transfer. You need to show people something that addresses their anxieties, that accepts their anger, that is credible in a very gut-level sense, and that evokes faith in the vision." (John Kotter, "The Heart of Change: Real-Life Stories of How People Change Their Organizations", 2002) 

"Thought, without the data on which to structure that thought, leads nowhere." (Victor J Stenger, "Has Science Found God?: The Latest Results in the Search for Purpose in the Universe", 2003)

"The best scientists aren't the ones who know the most data; they're the ones who know what they're looking for." (Noam Chomsky, [Guardian] 2005)

"Perception requires imagination because the data people encounter in their lives are never complete and always equivocal. [...] We also use our imagination and take shortcuts to fill gaps in patterns of nonvisual data. As with visual input, we draw conclusions and make judgments based on uncertain and incomplete information, and we conclude, when we are done analyzing the patterns, that our picture is clear and accurate. But is it?" (Leonard Mlodinow, "The Drunkard’s Walk: How Randomness Rules Our Lives", 2008)

"Finding patterns is easy in any kind of data-rich environment; that's what mediocre gamblers do. The key is in determining whether the patterns represent signal or noise." (Nate Silver, "The Signal and the Noise: Why So Many Predictions Fail - but Some Don't", 2012)

"The inherent nature of complexity is to doubt certainty and any pretense to finite and flawless data. Put another way, under uncertainty principles, any attempt by political systems to 'impose order' has an equal chance to instead 'impose disorder'." (Lawrence K Samuels, "Defense of Chaos: The Chaology of Politics, Economics and Human Action", 2013)

"The value of having numbers - data - is that they aren't subject to someone else's interpretation. They are just the numbers. You can decide what they mean for you." (Emily Oster, "Expecting Better", 2013)

"A study that leaves out data is waving a big red flag. A decision to include or exclude data sometimes makes all the difference in the world. This decision should be based on the relevance and quality of the data, not on whether the data support or undermine a conclusion that is expected or desired." (Gary Smith, "Standard Deviations", 2014)

"Another way to secure statistical significance is to use the data to discover a theory. Statistical tests assume that the researcher starts with a theory, collects data to test the theory, and reports the results - whether statistically significant or not. Many people work in the other direction, scrutinizing the data until they find a pattern and then making up a theory that fits the pattern." (Gary Smith, "Standard Deviations", 2014)

"Data clusters are everywhere, even in random data. Someone who looks for an explanation will inevitably find one, but a theory that fits a data cluster is not persuasive evidence. The found explanation needs to make sense and it needs to be tested with uncontaminated data." (Gary Smith, "Standard Deviations", 2014)

"Data without theory can fuel a speculative stock market bubble or create the illusion of a bubble where there is none. How do we tell the difference between a real bubble and a false alarm? You know the answer: we need a theory. Data are not enough. […] Data without theory is alluring, but misleading." (Gary Smith, "Standard Deviations", 2014)

"If somebody ransacks data to find a pattern, we still need a theory that makes sense. On the other hand, a theory is just a theory until it is tested with persuasive data." (Gary Smith, "Standard Deviations", 2014)

"Self-selection bias occurs when people choose to be in the data - for example, when people choose to go to college, marry, or have children. […] Self-selection bias is pervasive in 'observational data', where we collect data by observing what people do. Because these people chose to do what they are doing, their choices may reflect who they are. This self-selection bias could be avoided with a controlled experiment in which people are randomly assigned to groups and told what to do." (Gary Smith, "Standard Deviations", 2014)

"These practices - selective reporting and data pillaging - are known as data grubbing. The discovery of statistical significance by data grubbing shows little other than the researcher’s endurance. We cannot tell whether a data grubbing marathon demonstrates the validity of a useful theory or the perseverance of a determined researcher until independent tests confirm or refute the finding. But more often than not, the tests stop there. After all, you won’t become a star by confirming other people’s research, so why not spend your time discovering new theories? The data-grubbed theory consequently sits out there, untested and unchallenged." (Gary Smith, "Standard Deviations", 2014)

"We naturally draw conclusions from what we see […]. We should also think about what we do not see […]. The unseen data may be just as important, or even more important, than the seen data. To avoid survivor bias, start in the past and look forward." (Gary Smith, "Standard Deviations", 2014)

"Any knowledge incapable of being revised with advances in data and human thinking does not deserve the name of knowledge." (Jerry Coyne, "Faith Versus Fact", 2015)

"The term data, unlike the related terms facts and evidence, does not connote truth. Data is descriptive, but data can be erroneous. We tend to distinguish data from information. Data is a primitive or atomic state (as in ‘raw data’). It becomes information only when it is presented in context, in a way that informs. This progression from data to information is not the only direction in which the relationship flows, however; information can also be broken down into pieces, stripped of context, and stored as data. This is the case with most of the data that’s stored in computer systems. Data that’s collected and stored directly by machines, such as sensors, becomes information only when it’s reconnected to its context." (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)
