Discussion
1. Entropy
By “entropy” as it relates to information theory, the Foundation adopts Hubert P. Yockey’s distinction between Maxwell-Boltzmann-Gibbs entropy, Shannon probability-distribution entropy, and Kolmogorov-Solomonoff-Chaitin sequence/algorithmic complexity. (See Yockey, H.P. (1992) Information Theory and Molecular Biology, Cambridge University Press, sections 2.2.2 and 2.4.1-2.4.6. See also Yockey, H.P. (1974) “An application of information theory to the Central Dogma and the sequence hypothesis,” Journal of Theoretical Biology, 46, 369-406; Yockey, H.P. (1981) “Self Organization, Origin of Life Scenarios, and Information Theory,” Journal of Theoretical Biology, 91, 13-31; and Yockey, H.P. (2000) “Origin of life on earth and Shannon’s theory of communication,” Computers & Chemistry, 24, 1, 105-123.) Yockey argues that there is no “balancing act” between algorithmic informational entropy and Maxwell-Boltzmann-Gibbs-type entropy. The two are not on the same see-saw; their probability spaces are not isomorphic. Information theory lacks the integral of motion present in thermodynamics and statistical mechanics, and no code links the two “alphabets” of stochastic ensembles. Kolmogorov-Solomonoff-Chaitin complexity does not reside in the domain of the stochastic ensembles of statistical mechanics. Despite endless confusion and attempts in the literature to merge the two, they have no relation.
“Highly ordered” is paradoxically opposite from “complex” in algorithmic-based information theory. The emergent property of “instructions,” “organization,” and the “message” of “messenger biomolecules” is simply not addressed in Maxwell-Boltzmann-Gibbs equations of heat equilibration and energy flux between compartments. Surprisingly, the essence of genetic “prescriptive information” and “instructions” is not addressed by current “information theory” either. Shannon information theory concerns itself primarily with data transmission, reception, and noise-reduction processing without regard for the essence of the “message” itself.
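The paradox that “highly ordered” is the opposite of “complex” in algorithmic terms can be illustrated with a short sketch. This is a Python illustration only, using zlib compression as a crude stand-in for Kolmogorov-Solomonoff-Chaitin complexity; the strings are hypothetical:

```python
import math
import random
import zlib
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Order-0 Shannon entropy in bits per symbol (symbol frequencies only)."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

random.seed(0)
periodic = "AB" * 500                                        # highly ordered
shuffled = "".join(random.sample(periodic, len(periodic)))   # same composition, scrambled

# Identical symbol frequencies, so identical Shannon entropy (1 bit/symbol):
print(shannon_entropy(periodic), shannon_entropy(shuffled))

# Compressed length is a crude stand-in for algorithmic complexity;
# the ordered string compresses far more than the scrambled one:
print(len(zlib.compress(periodic.encode())), len(zlib.compress(shuffled.encode())))
```

The periodic and scrambled strings share one probability distribution, hence one Shannon entropy, yet differ enormously in compressibility. Note that neither measure says anything about whether either string prescribes a function.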
The Foundation questions whether “order,” physical “complexity,” or “shared entropy” are synonymous with “prescriptive information,” “instructions,” or “organization.” Christoph Adami emphasizes that information is always “about something, and cannot be defined without reference to what it is information about.” It is “correlation entropy” that is “shared” or “mutual.” Thus, says Adami, “Entropy can never be a measure of complexity. Measuring correlations within a sequence, like Kolmogorov and Chaitin (and Lempel-Ziv, and many others), is not going to reveal how that sequence is correlated to the environment within which it is to be interpreted. Information is entropy ‘shared with the world,’ and the amount of information a sequence shares with its world represents its complexity.” (Personal communication; see also PNAS, April 25, 2000, 97, #9, 4463-4468).
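Adami’s point that information must be “about something” can be sketched with a toy mutual-information calculation. In this hypothetical Python illustration, two sequences have identical (maximal) entropy, but only one is correlated with a stand-in “environment”:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired samples."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

env       = [0, 1] * 50          # stand-in "environment" states (hypothetical)
matched   = list(env)            # a sequence perfectly correlated with env
unrelated = [0] * 50 + [1] * 50  # same 50/50 composition, zero correlation

print(mutual_information(env, matched))    # 1.0 bit: "about" the environment
print(mutual_information(env, unrelated))  # 0.0 bits: entropy without information
```

Both sequences carry one bit of entropy per symbol, yet only the correlated one “shares entropy with the world”; entropy alone cannot distinguish them.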
Differences of perspective among information theorists are often definitional. “Complexity” and “shared entropy” (shared uncertainty between sender and receiver) have unfortunately often been used synonymously with “prescriptive information (instruction).” But are they synonymous? Mere complexity and shared entropy seem to lack the specification and orchestrational functionality inherent in the genetic “instruction” system of translation.
The confusion between algorithmic instruction and Maxwell-Boltzmann-Gibbs entropy may have been introduced through the thought experiment imagining Maxwell’s Demon – a being exercising intelligent choice over the opening and closing of a trap door between compartments. Statistical mechanics has no empirical justification for the introduction of purposeful control over the trap door.
Solar energy itself has never been observed to produce prescriptive information (instruction/organization). Photons are used by existing instructional mechanisms which capture, transduce, store, and utilize energy for work. Fiber optics is used by human intelligence to transmit meaningful prescriptive information (instruction) and message. But raw energy itself must not be confused with functional prescriptive information/instructions. The latter is a form of algorithmic programming. Successions of certain decision-node switch settings determine whether a genetic “program” will “work” to accomplish its task.
2. Is life autonomous?
Some argue that life exhibits the characteristics of autonomy. Life is orchestrated by its prescriptive information (instruction) content and the inherited systems within which it finds itself. It is directed, passively, by its inherited genome. Because of this, others argue that even a prokaryote’s autonomy is apparent rather than real. Cells are fully dependent upon “recipe” and control mechanisms delivered from a prior and external source. Cells also remain dependent upon their environment, especially for energy. But without the transducing mechanisms instructed by its genome, energy would only accelerate a cell’s demise.
3. Does life display Negentropy?
- It is mathematically impossible for entropy to be a negative quantity. (Yockey, 1992, Information Theory and Molecular Biology, Cambridge University Press, p 84)
- Organisms do not violate the Second Law of Thermodynamics any more than any other physical entity in open systems far from equilibrium. Their existing genetic instructions, command and control mechanisms, and machinery simply allow them to process incoming nutrients and energy in full accord with the Second Law. The problem lies in the derivation of the functional biological information that makes both of these processes possible. By what mechanism did initial instructions arise sufficient to produce such highly conceptual metabolic and replicative systems? This is the quest of The Gene Emergence Project, and the object of The Origin-of-Life Prize offer.
4. Specified aperiodic complexity
Many investigators, such as Leslie Orgel (Origins of Life, 1973, New York, John Wiley, pp 189-190), have long regarded genetic information as having an additional component besides mere complexity. Matrices of prescriptive information retention are not only extremely improbable ensembles, but ensembles specified in a way that yields biofunction. “Specified complexity” means that only a certain few out of a large sample space of potential or real ensembles will produce metabolic function. Specified complexity instructs and integrates biochemical pathways into homeostatic metabolism.
5. Algorithmic instruction
Life is an integration of many algorithmic processes that give rise to biofunction and overall homeostatic metabolism. Algorithms are processes or procedures that give rise to useful function. Algorithms are not merely linear sequences of symbols. They do something. Each symbol represents a choice from among symbol options. Each choice is critical to the determination of eventual function. There is an organizational property, a certain element of seeming conceptuality, to the biological information/instructions that produce the citric acid cycle, for example. This aspect raises bioinformation to a more instructive, orchestrational, and recipe level than mere physical order, complexity, probabilistic uncertainty, or “mutual uncertainty” between two sets.
Algorithmic compression schemes are valuable in defining plain complexity. Such algorithms address internal sequence compressibility only. But these types of algorithms tell us nothing about whether the sequence instructs any function external to itself. The degree of compressibility is not critical to defining prescriptive information (instruction). The functionality the sequence produces is.
Algorithms are usually schema of successive decision-node “choices” that lead somewhere useful. The sequence of choices accomplishes some useful task. Each decision node represents a fork in the road; one false turn, and the potential function at the end of the sequence can be lost. Dendrograms of these decision-node choice options give rise to many orders of magnitude of potential terminal tree branches, and only a few of these branches usually yield sophisticated function. A biopolymer represents this sequence of decision-node symbol “choices.” Even when homologous protein sequences arising from genetic drift are considered, sometimes the equivalent of only one branch out of 2^30 (roughly 10^9) branches in sequence space “works” as the needed enzyme. Life-origin scenarios must provide explanation and mechanism for how such unlikely algorithmic strings came together at the same place and at the same time to produce not only the local function of each individual algorithmic program, but the integration of many hundreds of such strings into homeostatic metabolism.
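The scale of the search problem just described can be sketched numerically. In this minimal Python illustration, the 2^30 functional fraction comes from the text above, while the 100-residue length and 20-letter amino acid alphabet are a hypothetical example:

```python
from math import log10

# Illustrative numbers: the 1-in-2^30 fraction is cited in the text above;
# the 100-residue protein over a 20-letter alphabet is a hypothetical example.
residues = 100
alphabet = 20
space = alphabet ** residues                       # total sequence space, 20^100
print(f"sequence space ~ 10^{log10(space):.0f}")   # ~10^130

branches = 2 ** 30                                 # decision-node branches cited above
print(f"one working branch in ~{branches:.3e} (about 10^9)")
```

Even the comparatively generous 1-in-2^30 fraction operates inside a sequence space of roughly 10^130 alternatives for a single modest protein, before any integration of multiple such strings is considered.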
6. The source of genetic information in nature
No theory of genetic information is complete without a model of mechanism for the source of such prescriptive information within Nature. It is not sufficient for a submission to the Prize to limit discussion of prescriptive information (instruction) theory to its replication, transmission, modification, or matrix of information retention. All submissions must address the source of the prescriptive information through non-supernaturalistic natural processes. Which of the four known forces of physics, or what combination of these forces, produced prescriptive, functional information, and how? What is the empirical evidence for this kind of prescriptive information (instruction) spontaneously arising within Nature?
7. Genetic code
In all known phenomenological life, the genetic code manifests:
- the conveyance of a functional coded message, using a sign system, to distant sites through an information channel to energy-consuming decoding receivers (ribosomal “machines”),
- symbolic, indirect representation of that message from one alphabet into another (e.g., codons of nitrogen-base “language” being translated into the end product of physical amino acid sequence “language”),
- prespecification of extremely unlikely and complex future events (see Dembski in the suggested readings below), suggesting “apparent intent,” “apparent planning,” or “apparent purpose” (or, as Richard Dawkins describes it, “apparent design”),
- instructions capable of effecting and affecting many individual manufacturing processes, and of mediating the cooperation of all of those diverse processes toward the one organismal and seemingly “conceptual” end of being and staying alive,
- the ability of that information (instruction) not only to give the directions or orders of what should be done, but to bring those orders to pass in the form of actual physical molecules, products, and life processes,
- the seemingly “irreducible complexity” argued by Michael Behe (see the suggested readings below), and
- the initial writing of this prescriptive information by nature, not just the modification of existing genetic instruction through mutation.
8. Scaffolding models
a. “Scaffolding” models of prelife (e.g., silicate/clay matrix models) constitute acceptable submissions as long as detailed and plausible hypothetical mechanisms, along with empirical correlation, are provided linking such inanimate crystal matrix models to “the arch” of current carbon-chemistry life. Such models would have to explain how random defects in crystal layers became arranged into functional genetic prescriptive information (instruction). The issue is not the medium or matrix that retains the information. The issue is the source of prescriptive information (instruction) itself in any medium or matrix of Nature.
b. “Scaffolding” models would also have to explain how inanimate crystals or the ions/radicals adsorbed onto them progressively acquired the nine minimum functions and capabilities of living organisms listed under the provided definition of “life.”
9. Biochemical correlation
a. The hypothetical mechanism must demonstrate correspondence with “the real world” of biochemistry.
b. The submission must provide adequate empirical support strongly suggesting that such a hypothetical scenario can take place naturally in a prebiotic environment. Simulation of abiogenesis must be independent of the factor of human intelligence that is so often subconsciously incorporated into computer hardware/software experimental design and simulation.
c. Thermodynamic realities must be clearly addressed, including specific discussion of any supposed pockets of momentary exception to the Second Law of increasing Maxwell-Boltzmann-Gibbs entropy. The Foundation’s view is that Prigogine’s dissipative structures, and life itself, operate within the constraints of the 2nd Law.
Maxwell-Boltzmann-Gibbs entropy must not be confused with statistical Shannon entropy or with Kolmogorov-Solomonoff-Chaitin-Yockey “complexity.” The latter two are nonphysical, abstract, mathematical constructs. All physical matrices of prescriptive information retention, however, are subject to Maxwell-Boltzmann-Gibbs entropy: they manifest a tendency toward deterioration in both closed and open systems. Repair mechanisms for these messenger biomolecules therefore demand all the more initial instructional integrity. Prescriptive information would have been necessary in any primordial life form’s genome to correct for continuous noise corruption of its functional protogenes. Deterioration of an existing recipe in a physical matrix is far more probable than the spontaneous writing of new, conceptually complex metabolic algorithms. Building-block synthesis, for instance, would have required something like a reductive citric acid cycle, and there are no simple algorithms for integrating such a multistep, highly directional pathway.
d. Empirical support does not have to be original research, but can be gleaned from existing scientific literature. Previously published empirical support must be described in detail and well referenced within the applicant’s published research paper, explaining exactly how those controlled observations demonstrate empirical correlation with the applicant’s theory.
10. “Design” anthropomorphisms
It is easy for us to attribute “apparent instruction” or “apparent design” to projections of human intelligence onto the data. Based on current knowledge of molecular biology, however, any imaginable protocell or protobiont must have manifested such “concept” to come to life long before any humans appeared on the scene to project or anthropomorphize anything. The Foundation believes that the words “order” and “complexity” are grossly inadequate euphemisms for the clearly observable prescriptive information, genetic instructions, biomessage, and functional biochemical pathways inherent in the simplest free-living organisms. The Foundation further believes that these empirical properties of genetic instruction and life can be investigated scientifically.
The simplest known living organisms are replete with empirical evidence of organizational unity and coherence which directs future biochemical events toward undeniable ends and purposes. Prokaryotes such as Archaea exhibit the integration of multiple biological systems into extraordinary organismic cooperation. The Foundation interprets such observable genetic instructions to be undeniable empirical evidence of objectively existent “concept” independent of human mentation. The “unreasonable effectiveness of mathematics” in physics offers further evidence. Such conceptual capacity predated human intelligence in any evolutionary paradigm. A hypothetical mechanism is therefore needed for two aspects of life origin:
- how seemingly unintelligent natural processes could have written such highly prescriptive recipe/message linguistic-like code, and
- how such an indirect system could have arisen which effects (brings into existence) so many hundreds of integrated and far-removed phenotypic processes and products in the simplest organisms.
11. Appeals to unknown laws
Appealing to unknown “laws” as the source of biological instruction constitutes a “category error” of logic theory. “Laws” do not cause anything. They are merely human generalizations, mental constructions, and mathematical descriptions of existing forces and mass-energy relationships. Even “chance” is a probabilistic rational construct. Neither chance nor “laws” cause effects. Unknown laws, therefore, cannot provide a mechanism for prescriptive information (instruction) genesis. Appealing to unknown laws constitutes a “naturalism of the gaps,” corresponding to supernaturalists’ appealing to a “God of the gaps” for scientific explanation. Neither is acceptable in naturalistic science.
12. Infinity issues
Appeals to multiple or “parallel” cosmoses or to an infinite number of cosmic “Big Bang/Crunch” oscillations as essential elements of proposed mechanisms are not acceptable in submissions due to a lack of empirical correlation and testability. Such beliefs are without hard physical evidence and must therefore be considered unfalsifiable, currently outside the methodology of scientific investigation to confirm or disprove, and therefore more mathematically theoretical and metaphysical than scientific in nature. Recent cosmological evidence also suggests insufficient mass for gravity to reverse continuing cosmic expansion. The best cosmological evidence thus far suggests the cosmos is finite rather than infinite in age.
13. Computerization
a. Computerized models must be free of subtle, inherent teleological design flaws which become incorporated into the model itself. The insidious role of human intelligence in both hardware and software must be acknowledged, addressed, and somehow divorced from hypothetical models themselves.
b. Models based on conditional probabilities must justify empirically why the environment would have selected for each plateau along the way. For example, why would the environment have favored and preserved the intermediary steps in many metabolic pathways of archaea when no useful product is produced until the last step? Many of these multistep, indirect manufacturing pathways constitute all-or-none processes. Such biochemical pathways have no phenotypic “plateaus” in physical biochemical reality to support theoretical arguments of selectable incremental function. Yet the random occurrence of the full pathway as a whole is statistically prohibitive in a trillion billion years, let alone in the mere 15-billion-year age of our cosmos.
c. Other factors limit the number of statistical trials available for an exploding cosmic egg to give rise randomly to such sophisticated pathways in 15 billion years. The finite number of nucleons available in the cosmos to react with one another (10^80?) is one example. We can no longer appeal to an infinity of particles, space, or time as a means of overcoming the statistical prohibitiveness within the only cosmos with which we have empirical experience. Abundant data, mathematical proofs, and our best theories all suggest that our cosmos is finite, not infinite. We have no scientific knowledge of any other cosmos, let alone an infinite number of imaginary cosmoses.
d. Parallel computer models must similarly have direct empirical correlation with naturally occurring environmental, chemical, biochemical, and molecular biological scenarios. “Directed evolution” experiments must not incorporate the artificial selection of investigators into their experimental design. “Directed evolution” is a self-contradictory phrase. Evolution by definition is never directed. Evolution in fact has no goal or purpose.
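The resource-counting argument in item (c) above can be sketched as a back-of-envelope calculation. In this hypothetical Python illustration, the 10^80 nucleon count is taken from the text, while the event rate and cosmic age are assumed round numbers chosen only to show the shape of the calculation:

```python
from math import log10

# Back-of-envelope "probabilistic resources" sketch. Only the 10^80 nucleon
# count comes from the text; the age and event rate are hypothetical
# round numbers for illustration.
particles      = 10 ** 80   # nucleons in the cosmos (from the text)
age_seconds    = 10 ** 18   # ~15 billion years, rounded up (assumption)
events_per_sec = 10 ** 45   # generous per-particle event rate (assumption)

max_trials = particles * age_seconds * events_per_sec
print(f"upper bound on trials ~ 10^{log10(max_trials):.0f}")

# A target requiring, say, 1 chance in 10^200 (hypothetical) would exhaust
# these resources:
target_odds = 10 ** 200
print(max_trials < target_odds)   # True
```

However generous the assumed rates, the product of finite particles, finite time, and finite event frequency yields a finite trial budget, which is the point item (c) presses against appeals to infinity.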