Incomplete Nature - Part 13

A transmission affected by thermodynamic perturbations that make it less than perfectly reliable will introduce an additional level of uncertainty to contend with, but one that decreases information capacity. An increase in the Boltzmann entropy of the physical medium that constitutes the signal carrier corresponds to a decrease in the correlation between sent and received signals. Although this does not decrease the signal entropy, it reduces the amount of uncertainty that can be removed by a given signal, and thus reduces the information capacity.

This identifies two contributors to the entropy of a signal: one associated with the probability of a given signal being sent, and the other associated with the possibility of a given signal being corrupted. This complementary relationship is a hint that the physical and informational uses of the concept of entropy are more than merely analogous. By exploring the relationship between Shannon entropy and Boltzmann entropy, we can shed light on the reason why change in Shannon entropy is critical to information. But the connection is subtle, and its relationship to the way that a signal conveys its information "content" is even subtler.
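These two contributions can be made explicit in Shannon's standard notation (a conventional restatement, not the book's own formalism). The entropy of the source measures uncertainty about which signal will be sent, while the "equivocation" introduced by a noisy channel measures the residual uncertainty about the sent signal that remains once a signal is received:

$$H(X) = -\sum_i p(x_i)\,\log_2 p(x_i), \qquad I(X;Y) = H(X) - H(X \mid Y).$$

Corruption of the physical carrier increases the equivocation H(X|Y) and so reduces the conveyed information I(X;Y), even though the entropy H(X) of the signal set is unchanged.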

INFORMATION AND REFERENCE.

Warren Weaver, in a commentary article that accompanied the book presentation of Shannon's original paper, noted that using the term information to describe the measure of the unpredictability reduced by a given signal is an atypical use of the term.6 This is because Shannon's notion of information is agnostic with respect to what a signal is or could be about. This agnosticism has led to considerable confusion outside of the technical literature, because it is almost antithetical to the standard colloquial use of the term. Shannon was interested in measuring information for engineering purposes. So he concentrated exclusively on the properties of transmission processes and communication media, and ignored what we normally take to be information, that is, what something tells us about something else that is not present in the signal medium itself.

This was not merely an arbitrary simplification, however. It was necessary because the same sign or signal can be given any number of different interpretations. Dirt on a boot can provide information on anything from personal hygiene to the sort of geological terrain a person recently visited. The properties of some medium that give it the potential to convey information don't determine what it is about; they merely make reference possible. So, in order to provide a finite measure of the information potential of a given signal or channel, Shannon had to ignore any particular interpretation process, and stop the analysis prior to including any consideration of what a sign or signal might be about. What is conveyed is not merely a function of this reduction of Shannon entropy.

This is where the relation between Shannon and Boltzmann entropy turns out to be more than merely analogical. A fuller conception of information requires that these properties be considered with respect to two different levels of analysis of the same phenomenon: the formal characteristics of the signal and the material-energetic characteristics of the signal. Consider what we know about Boltzmann entropy. If, within the boundaries of a physical system such as a chamber filled with a gas, a reduction of entropy is observed, one can be pretty certain that something not in that chamber is causing this reduction of entropy. Being in an improbable state, or observing a non-spontaneous change toward such a state, is evidence of extrinsic perturbation: work imposed from outside the system.

Despite their abstract character, information transmission and interpretation are physical processes involving material or energetic substrates that constitute the transmission channel, storage medium, sign vehicles, and so on. But physical processes are subject to the laws of thermodynamics. So, in the case of Shannon entropy, no information is provided if there is no reduction in the uncertainty of a signal. But reduction of the Shannon entropy of a given physical medium is necessarily also a reduction of its Boltzmann entropy. This can only occur due to the imposition of outside constraints on the sign/signal medium because a reduction of Boltzmann entropy does not tend to occur spontaneously. When it does occur, it is evidence of an external influence.

Openness to external modification is obvious in the case of a person selecting a signal to transmit, but it is also the case in more subtle conditions. Consider, for example, a random hiss of radio signals received by a radio antenna pointed toward the heavens. A normally distributed radio signal represents high informational entropy, the expected tendency in an unconstrained context, which for example might be the result of random circuit noise. If this tendency were to be altered away from this distribution in any way, it would indicate that some extrinsic non-random factor was affecting the signal. The change could be due to an astronomical object emitting a specific signal. But what if instead of a specific identifiable signal, there is just noise when there shouldn't be any?

Just such an event did occur in 1965 when two Bell Labs scientists, Arno Penzias and Robert Wilson, aimed a sensitive microwave antenna skyward, away from any local radio source, and discovered that they were still recording the "hiss" of microwave noise no matter which way they pointed the antenna. They were receiving a more or less random microwave signal from everywhere at once! The obvious initial assumption was that it probably indicated a problem with the signal detection circuit, not any specific signal. Indeed, they even suspected that it might be the result of pigeons roosting in the antenna. Because it exhibited high Shannon entropy and its locus of origin also had high Shannon entropy (i.e., equal probability from any location), they assumed that it couldn't be a signal originating from any object in space.

Only after they eliminated all potential local sources of noise did they consider the more exotic possibility: that it did not originate in the receiving system itself, but rather from outside, everywhere in the cosmos at once. If the signal had consisted only of a very narrow band of frequencies, had exhibited a specific oscillatory pattern, or had been received only from certain directions, it would have provided Shannon information, because compared with the signals that could have been recorded, this would have stood out. They only considered it to be information about something external, and not mere noise, when they had compared it to many more conceivable options, thus effectively increasing the entropy (uncertainty) of potential sources that could be eliminated. As they eliminated each local factor as a possible source of noise, they eliminated this uncertainty. In the end, what was considered the least reasonable explanation was the correct one: it was emanating from empty space, the cosmic background. The signal was eventually interpreted as the deeply red-shifted heat of the Big Bang.

In simple terms, the brute fact of these deviations from expectation, both initially, then with respect to possible sources of local noise, and much later with respect to alternative cosmological theories, was what made this hiss information about something. Moreover, the form of this deviation from expectation (its statistical uniformity and lack of intrinsic asymmetries) provided the basis for the two scientists' interpretation. Heat is random motion. The form of this deviation from what they expected (that the signal would be irregularly distributed and correlated with specific objects) ultimately provided the clue to its interpretation. In other words, this became relevant for a second kind of entropy reduction: an analytic reduction in the variety of possible sources. By comparing the form of the received signal with what could have been its form, they were able to eliminate many possible causes of both intrinsic and extrinsic physical influence until only one seemed plausible.

This demonstrates that something other than merely the reduction of the Shannon entropy of a signal is relevant to understanding how that signal conveys evidence of another absent phenomenon. Information is made available when the state of some physical system differs from what would be expected. And what this information can be about depends on this expectation. If there is no deviation from expectation, there is no information. This would be the case for complete physical isolation of the signal medium from outside influence. The difference is that an isolated signal source cannot be about anything. What information can be about depends on the nature of the reduction process and the constraints exhibited by the received signal. How a given medium can be modified by interaction with extrinsic factors, or how it can be manipulated, is what is most relevant for determining what information can be conveyed by it.

Not only is the Shannon entropy of an information-bearing process important to its capacity; its physical dynamics with respect to the physical context in which it is embedded are important to determining what it can be about. But in comparison to the Shannon entropy of the signal, the physical constraints on the forms that a change in signal entropy can take contribute another independent potential for entropy reduction: a reduction of the entropy of possible referents. In other words, once we begin considering the potential entropy of the class of things or events that a given medium can convey, the physical characteristics of the medium, not merely its range of potential states, become important. This reduces the Shannon entropy of the range of possible phenomena that a given change in that medium can be about. In other words, what we might now call referential information is a second-order form of information, over and above Shannon information (Figures 12.1, 12.2).

A first hint of the relationship between information as form and as a sign of something else is exemplified by the role that pattern plays in the analysis. In Shannon's terms, pattern is redundancy. From the sender's point of view, any redundancy (defined as predictability) of the signal has the effect of reducing the amount of information that can be sent. In other words, redundancy introduces a constraint on channel capacity. Consider how much less information could be packed into this page if I could only use two characters (like the 0s and 1s in a computer). Having twenty-six letters, cases, punctuation marks, and different-length character strings (words), separated by spaces (also characters), decreases the repetition, increases the variety of what character can follow what, and thereby decreases redundancy. Nevertheless, there is sufficient redundancy in the possible letter combinations to make it possible to fairly easily discover typos (using redundancy for error correction will be discussed below).
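As a rough illustration (the numbers are mine, not from the text): a two-character code can carry at most log2 2 = 1 bit per character, whereas an alphabet of, say, 64 distinct characters could carry up to log2 64 = 6 bits per character. The statistical constraints of English then give some of that maximum back as redundancy:

$$H_{\max} = \log_2 N, \qquad \text{redundancy} = 1 - \frac{H}{H_{\max}},$$

where H is the actual per-character entropy once uneven letter frequencies and combinatorial restrictions are taken into account.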

FIGURE 12.1: Depiction of the logic that Claude Shannon used to define and measure potential information-conveying capacity ("Shannon information").

FIGURE 12.2: Depiction of the way that Shannon information depends on the susceptibility of a medium to modification by work. It derives its potential to convey information about something extrinsic to that medium by virtue of the constraints imposed on that medium by whatever is responsible for this work.

Less information can get transmitted if some transmissions are predictable from previous ones, or if there are simply fewer alternatives to choose from. But the story is slightly more complicated when we add reference. From the receiver's point of view, there must be some redundancy with what is already known for the information conveyed by a signal to even be assessed. In other words, the context of the communication must already be redundantly structured. Both sender and receiver must share the set of options that constitute information.

Shannon realized that the introduction of redundancy is also necessary to compensate for any unreliability of a given communication medium. If the reliability of a given sign or signal is questionable, this introduces an additional source of unpredictability that does not contribute to the intended information: noise. But just as redundancy reduces the unpredictability of signals, it can also reduce the unreliability of the medium conveying information. Introducing expected redundancy into a message being transmitted makes it possible to distinguish these two sources of Shannon entropy (the variety of possible signals that could have been generated versus the possible errors that could have arisen in the process). In the simplest case, this is accomplished by resending a signal multiple times. Because noise is, by definition, not constrained by the same factors as is the selection of the signal, each insertion of a noise-derived signal error will be uncorrelated with any other, but independent transmissions of multiple identical signals will be correlated with one another by definition. In this way, noisy components of a signal or a received message can be detected and replaced.
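A minimal sketch of this repetition strategy (illustrative only; the channel model and parameters are my own assumptions, not anything specified in the text): each bit is sent several times, noise flips copies independently, and a majority vote over the repeated copies recovers the intended signal because the errors are uncorrelated while the repetitions are not.

```python
import random

def send_with_repetition(bits, repeats=5, error_rate=0.1, seed=0):
    """Transmit each bit `repeats` times over a noisy channel and
    recover it by majority vote. Purely illustrative parameters."""
    rng = random.Random(seed)

    def noisy(bit):
        # Noise flips each transmitted copy independently of the others.
        return bit ^ 1 if rng.random() < error_rate else bit

    received = []
    for bit in bits:
        copies = [noisy(bit) for _ in range(repeats)]
        # The repeated copies are correlated (all derive from the same bit);
        # the errors are not, so the majority recovers the original.
        received.append(1 if sum(copies) > repeats // 2 else 0)
    return received

message = [1, 0, 1, 1, 0, 0, 1]
print(send_with_repetition(message))  # usually reproduces the message exactly
```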

Error-reducing redundancy can be introduced by means other than by signal retransmission. English utilizes only a fraction of possible letter combinations, with very asymmetric probabilities and combinatorial options. Grammar and syntax further limit what is an appropriate and inappropriate word string. Last but not least, the distinction between sense and nonsense limits what words and phrases are likely to occur in the same context. This internal redundancy of written English makes typos relatively easy to identify and correct.

Redundancy as a means of information rectification is also relevant to assessing the reliability of second-order (i.e., referential) information. For example, when one hears multiple independent reports by observers of the same event, the redundancy in the content of these various accounts can serve an analogous function. Even though there will be uncorrelated details from one account to another, the redundancies between independent reports make these more redundant aspects more reliable (so long as they are truly independent, i.e., don't reflect the influence of common biases). So, whereas redundancy decreases information capacity, it is also what makes it possible to distinguish information from noise, both in terms of the signal and in terms of what it conveys.

IT TAKES WORK.

What a signal medium can indicate is dependent on the possibility of physical relation with some relevant features of its physical context, and the possibility that this can result in a change of its Shannon entropy. Since reduction in the Shannon entropy of a physical medium is also often a reduction of its physical entropy, such a change is evidence that work was done to modify the signal medium. Recall that work in some form is required to change something from its spontaneous state or tendency to a non-spontaneous state. So, for any physical medium, a contragrade change in its state is an indication that work has been performed. Moreover, a contragrade change can only be produced by extrinsic perturbation; the medium must be an open system in some respect. The reference conveyed by a reduction in Shannon entropy is therefore a function of the ways that the medium is susceptible to outside interference.

Though the external factors that alter a system's entropy (variety of possible states) are not intrinsic features of that medium, the signal constraint is an intrinsic feature. Referential information is in this sense inferred from the form of the constraints embodied in the relationship between unconstrained possibility and received signal. In this way, Shannon information, which is assessed in terms of this constraint, embodies a trace of the work that produced it.

But openness to the possibility of an extrinsic influence involves specific physical susceptibilities, which also constrain what kind of work is able to modify it, thus constraining the domain of potential extrinsic phenomena that can be thereby indicated. Because of this explicit physical constraint, even the absence of any change of signal entropy can provide referential information. No change in entropy is itself one of the possible states. This means that even unconstrained fluctuations may still be able to convey referential information. This is information about the fact that, of the possible influences on signal constraint, none were present. It is the mere possibility of exhibiting constraints due to extrinsic influence that is the basis of a given medium's informative power.

This possibility demonstrates that reference is more than just the consequence of physical work to change a signal medium. Although such a relation to work is fundamental, referential information can be conveyed both by the effect of work and by evidence that no work has been done.7 This is why no news can be news that something anticipated has not yet occurred, as in the examples of messages conveyed by absence discussed above. The informative power of absence is one of the clearest indications that Shannon information and referential information are not equivalent. This is again because the constraint is both intrinsic and yet not located in the signal medium; it is rather a relationship between what is and what could have been its state at any given moment. A constraint is not an intrinsic property but a relational property, even if just in relation to what is possible. Only when a physical system exhibits a reduction of entropy compared to some prior state, or more probable state, is extrinsic influence indicated. This ubiquitous tendency can, conversely, be the background against which an unaltered or unconstrained feature can provide information about what hasn't occurred. If the sign medium exhibits no constraint, or hasn't diverged from some stable state, it can be inferred that there has been no extrinsic influence even though one could have been present. The relationship of present to absent forms of a sign medium embodies the openness of that medium to extrinsic intervention, whether or not any interaction has occurred. Importantly, this also means that the possibility of change due to work, not its actual effect, is the feature upon which reference depends. It is what allows absence itself, absence of change, or being in a highly probable state, to be informative.

Consider a typo in a manuscript. It can be thought of as a reduction of referential information because it reflects a lapse in the constraint imposed by the language that is necessary to convey the intended message. Yet it is also information about the proficiency of the typist, information that might be useful to a prospective employer. Or consider a technician diagnosing the nature of a video hardware problem by observing the way the image has become distorted. What is signal and what is noise is not intrinsic to the sign medium, because this is a determination with respect to reference. In both cases, the deviation from a predicted or expected state is taken to refer to an otherwise unobserved cause. Similarly, a sign that doesn't exhibit the effects of extrinsic influence (for example, a burglar alarm set to detect motion) can equally well provide information that a possible event (a break-in) did not occur.

In all these cases, the referential capacity of the informational vehicle is dependent on physical work that has, or could have, altered the state of some medium open to extrinsic modification. This tells us that the link between Shannon entropy and Boltzmann entropy is not mere analogy or formal parallelism. More important, it demonstrates a precise link to the concept of work. Gregory Bateson's description of information as "a difference that makes a difference" is actually a quite elegant description of work. Now we can see why this must be so. The capacity to reflect the effect of work is the basis of reference.

TAMING THE DEMON.

This should not surprise us. The existence of an intimate relationship between information and work has been recognized since almost the beginning of the science of thermodynamics. In 1867, James Clerk Maxwell initially explored the relationship in terms of a thought experiment. He imagined a microscopic observer (described as a demon), who could assess the velocity of individual gas molecules on either side of a divided container and could control a door between them to allow only faster- (or slower-) than-average molecules through one way or the other. Using this information about each molecule, the demon would thus be able to decrease the entropy of the system in contradiction to the second law of thermodynamics. This would normally only be possible by doing thermodynamic work to drive the entire system away from thermodynamic equilibrium; and once in this far-from-equilibrium state, this thermodynamic gradient itself could be harnessed to do further work. So it appears on the face of things that the demon's information about molecular velocity is allowing the system to progressively reverse the second law, with only the small amount of work that determines the state of the door. This seems consistent with our intuition that information and entropy have opposite signs, in the sense that a decrease in entropy of the system increases the predictability of molecular velocities on either side of this divide. For this reason, information is sometimes described as "negentropy,"8 and has been equated with the orderliness of a system.
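A toy numerical sketch of the demon's sorting (my own construction, not from the text): molecules are assigned random energies, and a "door rule" lets only faster-than-average molecules into one half of the container, producing a gradient that no unattended partition would produce. As the analyses discussed below make clear, the catch is that acquiring and acting on each "faster than average" judgment itself costs work.

```python
import random

def demon_sort(n_molecules=10_000, seed=1):
    """Toy Maxwell's demon: partition molecules by energy relative to the mean.
    Illustrative only; real analyses must also count the work of measurement."""
    rng = random.Random(seed)
    energies = [rng.gauss(0, 1) ** 2 for _ in range(n_molecules)]  # crude 'kinetic energies'
    mean_energy = sum(energies) / len(energies)

    hot_side, cold_side = [], []
    for e in energies:
        # The demon's 'information': is this molecule more energetic than average?
        (hot_side if e > mean_energy else cold_side).append(e)

    avg = lambda xs: sum(xs) / len(xs)
    return avg(hot_side), avg(cold_side)

hot, cold = demon_sort()
print(f"hot side ~{hot:.2f}, cold side ~{cold:.2f}")  # a gradient created by sorting
```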

Maxwell's demon does not have to be a tiny homunculus. The same process could conceivably be embodied in a mechanical device able to link differences of detected molecular velocity and the correlated operation of the door. In the century that followed Maxwell's presentation of this thought experiment, many sophisticated analyses probed the question of whether the information gleaned by such an apparatus would in fact be able to cheat the second law of thermodynamics. As many analyses were subsequently to show, the mechanisms able to gather such information and use it to open the pass-through would inevitably require more work than the potential gained by creating this increase of heat gradient, and would therefore produce a net increase in total entropy of the system. The increase in entropy would inevitably exceed the reduction of entropy produced by the demon's efforts.

Although purely theoretical, this analysis validated the assumptions of thermodynamics, and also made it possible to measure the amount of Shannon information required to produce a given amount of Boltzmann entropy decrease. So, although the demon's activity in transforming differences of molecular velocity into differences of local entropy doesn't ultimately violate the second law, it provides a model system for exploring the relationship between Shannon information and work.
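The standard quantitative form of this exchange rate (the Szilard-Landauer result, stated here for reference rather than derived in the text) ties one bit of information to a definite minimum thermodynamic cost:

$$\Delta S \ge k_B \ln 2 \ \text{per bit erased}, \qquad W_{\min} = k_B T \ln 2.$$

Each bit the demon acquires, and must eventually reset, dissipates at least this much work as heat, which is why the sorting cannot yield a net decrease in total entropy.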

But if we replace the demon with an equivalent mechanical apparatus, does it make sense to say that it is using information about velocity to effect this change in entropy, or is it merely a mechanistic linkage between some physical change that registers velocity and whatever physical process opens the door? Although as external observers we can interpret a signal whose changes correlate with molecular velocity as representing information about that property, there is nothing about the mechanism linking this signal state to the state of the door that makes it more than just a physical consequence of interacting with the signal.

What enables the observer to interpret that the signal is about velocity is the independent availability of a means for relating differences of molecular velocity to corresponding differences of signal state. The differential activation of the door mechanism is merely a function of the physical linkage of the signal-detection and door-operation mechanisms. Molecular velocity is otherwise irrelevant, and a correlation between signal value and molecular velocity is not in any way necessary to the structure or operation of the door-opening mechanism. A correlation with molecular velocity is, however, critical to how one might design such a mechanism with this Maxwellian outcome in mind. And it is cryptically implied in the conception of an observing demon. Unlike its mechanical substitute, the demon must respond because of this correlation in order to interpret the signal to be about molecular velocity. The designer must ensure that this correlation exists, while the demon must assume that it exists, or at least be acting with respect to this correlation and not merely with respect to the signal. Thus the demon, like an outside observer, must already have information about this habit of correlation in order to interpret the signal as indicating this missing correlate. In other words, an independent source of information about this correlation is a precondition for the signal to be about velocity, but the signal-contingent door-opening mechanism has no independent access to this additional information.

More important, correlation is not a singular physical interaction, but rather a regularity of physical interactions. A mechanism that opens the door in response to a given signal value is not responding to this regularity but only to a singular physical influence. In this respect, there is an additional criterion besides being susceptible to extrinsic modification that constitutes the referential value of an informing medium: this modifiability must have a general character.

The analysis so far has exposed a common feature of both the logic of information theory (Shannon) and the logic of thermodynamic theory (Boltzmann). This not only helps explain the analogical use of the entropy concept in each, it also explains why it is necessary to link these approaches into a common theory to begin to define the referential function of information. Both of these formal commonalities and the basis for their unification into a theory of reference depend on physical openness. In the case of classic information theory, the improbability of receiving a given sign or signal with respect to the background expectation of its receipt compared to other options defines the measure of potential information. In the case of classic thermodynamics, the improbability of being in some far-from-equilibrium state is a measure of its potential to do work, and also a measure of work that was necessarily performed to shift it into this state. Inversely, being in a most probable state provides no information about any extrinsic influence, and indeed suggests that to the extent that this medium is sensitive to external perturbation, none was present that could have left a trace.

The linkage between these two theories hinges on the materiality of communication (e.g., the constitution of its sign and/or signal medium). So, in a paradoxical sense, the absent content that is the hallmark of information is a function of the necessary physicality of information processes.

13.

SIGNIFICANCE.

The first surprise is that it takes constraints on the release of energy to perform work, but it takes work to create constraints. The second surprise is that constraints are information and information is constraint.

-STUART KAUFFMAN, PERSONAL COMMUNICATION.

ABOUTNESS MATTERS.

As we have seen, nearly every physical interaction in the universe can be described in terms of Shannon information, and any relationship involving physical malleability, whether exemplified or not, can be interpreted as information about something else. This has led some writers to suggest that the universe is made of information, not matter. But this tells us little more than that the universe is a manifold of physical differences and that most are the result of prior work. Of course, not every physical difference is interpreted, or even can be interpreted, though all may at some point contribute to future physical changes. Interpretation is ultimately a physical process, but one with a quite distinctive kind of causal organization. So, although almost every physical difference in the history of the universe can potentially be interpreted to provide information about any number of other linked physical occurrences, the unimaginably vast majority of these go uninterpreted, and so cannot be said to be information about anything. Without interpretation, a physical difference is just a physical difference, and calling these ubiquitous differences "information" furthermore runs the risk of collapsing the distinction between information, matter, and energy, and ultimately eliminating the entire realm of ententional phenomena from consideration.

Although any physical difference can become significant and provide information about something else, interpretation requires that certain very restricted forms of physical processes must be produced. The organization of these processes distinguishes interpretation from mere physical cause and effect. Consider again Gregory Bateson's aphorism: "a difference that makes a difference." Its meaning turns on the ambiguity between two senses of to "make a difference." The more literal meaning is to cause something to change from what otherwise would have occurred. This is effectively a claim about performing work (in any of the senses that we have described). In idiomatic English, however, it also means to be of value (either positive or negative) to some recipient or to serve some purpose. I interpret Bateson's point to be that both meanings are relevant. Taken together, then, these two meanings describe work that is initiated in order to effect a change that will serve some end. Some favored consequence must be promoted, or some unwanted consequence must be impeded, by the work that has been performed in response to the property of the sign medium that is taken as information.

This is why an interpretive process is more than a mere causal process. It organizes work in response to the state of a sign medium and with respect to some normative consequence: a general type of consequence that is in some way valued over others. This characterization of creating/making a difference in both senses suggests that the sort of work that Bateson has in mind is not merely thermodynamic work. To "make a difference" in the normative sense of this phrase is (in the terms we have developed) to support some teleodynamic process. It must contribute to the potential to initiate, support, or inhibit teleodynamic work, because only teleodynamic processes can have normative consequences. So to explain the basis of an interpretation process is to trace the way that teleodynamic work transforms mere physical work into semiotic relationships, and back again.

As our dynamical analysis has shown, teleodynamic work emerges from and depends on both morphodynamic and thermodynamic work. Consequently, these lower forms of work must also be involved in any process of interpretation. If this is the case, then an interpreting process must depend on extrinsic energetic and material resources and must also involve far-from-equilibrium self-organizing processes. In this roundabout way, like the dynamical processes that characterize organisms, the interpretation of something as information involves a form of recursive organization whereby the interpretation of something as information indirectly reinforces the capacity to do this again.

BEYOND CYBERNETICS.

Perhaps the first hint that the 2,000-year-old mystery of interpretation might be susceptible to a physical explanation rather than remaining forever metaphysical can be attributed to the development of a formal theory of regulation and control, which can rightfully be said to have initiated the information age. This major step forward in defining the relationship between information and its physical consequences was provided by the development of cybernetic theory in the 1950s and 60s. The term cybernetic was coined by its most important theoretician, Norbert Wiener, and comes from the same Greek root as the word "government," referring to steering or controlling. Within cybernetic theory, for the first time it became possible to specify how information (in the Shannonian sense) could have definite physical consequences and could contribute to the attractor dynamics constituting teleonomic behaviors.

In chapter 4, we were introduced to the concept of teleonomic behavior and the simple mechanistic exemplar of negative feedback regulation: the thermostatic control circuit. This model system not only demonstrates the fundamental principles of this paradigm, and the way it conceives of the linkage between information as a physical difference and a potential physical consequence; it also provides a critical clue to its own inadequacy.

A thermostat regulates the temperature in a room by virtue of the way that a switch controlling a heating device is turned on or off by the effects of that temperature. The way these changes correlate with the state of the switch and the functioning of the heating device creates a deviation-minimizing pattern of behavior. It's a process whereby one difference sets in motion a chain of difference-making processes that ultimately "make a difference" in keeping conditions within a desired range for some purpose. Thus a difference in the surrounding temperature produces a difference in the state of the switch, which produces a difference in the operation of the heater, which produces a difference in the temperature of the room, which produces a difference in the state of the switch, and so forth. At each stage, work is done on a later component in the circuit with respect to a change in some feature of the previous component, resulting in a circularity of causal influences. Thus it is often argued that each subsequent step along the chain of events in this cycle "interprets" the information provided by the previous step, and that information is being passed around this causal circuit. But in what sense do these terms apply? Are they merely metaphoric?
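The causal circuit can be sketched in a few lines of Python (a toy model with invented constants, not a claim about any actual device): on each pass around the loop, the temperature difference changes the switch, the switch changes the heater, and the heater changes the temperature.

```python
def thermostat_step(temp, heater_on, setpoint=20.0, hysteresis=0.5,
                    heat_rate=1.5, loss_rate=0.1, outside=10.0):
    """One pass around the feedback circuit. All constants are illustrative."""
    # The bimetallic coil / switch: a difference in temperature
    # produces a difference in the switch state.
    if temp < setpoint - hysteresis:
        heater_on = True
    elif temp > setpoint + hysteresis:
        heater_on = False

    # The heater and the room: a difference in switch state
    # produces a difference in temperature.
    temp += (heat_rate if heater_on else 0.0) - loss_rate * (temp - outside)
    return temp, heater_on

temp, heater_on = 15.0, False
for _ in range(200):
    temp, heater_on = thermostat_step(temp, heater_on)
print(round(temp, 1))  # hovers near the setpoint: a deviation-minimizing attractor
```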

We can dissect this problem by dissecting the circuit itself. A classic mechanical-electrical thermostat design involves a mercury switch attached to a coiled bimetallic strip, which expands when warmed, thus tipping the switch one way, and contracts when cooled, tipping the switch the other way (cf. Figure 4.1). The angle of the switch determines whether the circuit is completed or interrupted. But let's consider one of these steps in isolation. Is the coiling and uncoiling of a bimetallic strip information about temperature? It certainly could be used as such to an observer who understood this relationship and was bringing this knowledge to bear in considering the relationship. But what if this change of states goes unnoticed? Physically, there is no difference. The change in state of the coiled strip and of the room temperature will occur irrespective of either being observed. Like the wax impression of a signet ring, it is merely a physical phenomenon that could be interpreted as information about something in particular. Of course, it could also be interpreted as information about many other things. For example, its behavior could be interpreted to be information about the differential responsiveness of the two metals. Or it could be mistakenly interpreted as magic or some intrinsic tendency to grow and shrink at random. Is being incorporated into a thermostatic circuit sufficient to justify describing the coiling behavior as "information" about temperature to the circuit? Or is this too still only one of many possible things it could provide information about? What makes it information about anything rather than just a simple physical influence? Clearly, it is the process of interpretation that matters, not merely this physical tendency, and that is an entirely separate causal process.

Consider, in contrast, a single-cell organism responding to a change in temperature by changing its chemical metabolism. Additionally, assume that some molecular process within the organism, which is the equivalent of a simple thermostatic device, accomplishes this change. In many respects, it is more like a thermostat installed by a human user to maintain room temperature than a feedback process that might occur spontaneously in inorganic nature. This is because both the molecular regulator of the cell and the engineered thermostat embody constraints that are useful to some superordinate system for which they are at the same time both supportive and supported components. In a thermostat, it is the desired attractor dynamics (desired by its human users), and not any one specific material or energetic configuration, that determines its design. In organisms, such convergent behaviors were likely to have been favored by natural selection to buffer any undesirable changes of internal temperature. Indeed, in both living and engineered regulators, there can be many different ways that a given attractor dynamics is achieved. Moreover, analogously functioning living mechanisms often arise via parallel or convergent evolution from quite different precursors.

This drives home the point that it is this pattern of behavior that determines the existence of both the evolved and engineered regulatory systems, not the sharing of any similar material const.i.tution or a common accidental origin. In contrast, Old Faithful was formed by a singular geological accident, and the regularity of its deviation-minimizing hydrothermal behavior had nothing to do with its initial formation. Nor does its feedback logic play any significant role in how long this behavior will persist. If the geology changes or the source of water is depleted, the process will simply cease.

We are thus warranted in using the term information to describe the physical changes that get propagated from component to component in a designed or evolved feedback circuit only because the resultant attractor dynamics itself played the determinate role in generating the architecture of this mechanism. In such cases, we also recognize that its physical composition and component dynamical operations are replaceable so long as this attractor-governed behavior is reliably achieved. In contrast, it is also why, in the case of Old Faithful or any other accidentally occurring non-living feedback process, it feels strange to use information terminology to describe their dynamics, except in a metaphoric or merely Shannonian sense. Although they too may exhibit a tendency to converge toward, or resist deviation away from, a specific attractor state, the causal histories and future persistence of these processes lack this crucial attribute. Indeed, a designed or evolved feedback mechanism and an accidentally occurring analogue might even be mechanistically identical, and we still would need to make this distinction.

WORKING IT OUT.

As Shannon's analysis showed, information is embodied in constraints, and, as we have additionally shown, what these constraints can be about is a function of the work that ultimately was responsible for producing them (or could have produced them, even if they are never generated), either directly or indirectly. But as Stuart Kauffman points out in the epigraph at the beginning of this chapter, not only does it take work to produce constraints, it takes constraints to produce work. So one way in which the referential content of information can indirectly influence the physical world is if the constraints embodied in the informing medium can become the basis for specifying further work. And differences of constraint can determine differences in effect.

This capacity for one form of work to produce the constraints that organize another, independent form of work is the source of the amplifying power of information. It affords a means to couple otherwise unrelated contragrade processes into highly complex and indirect chains. And because of the complementary roles of constraint and energy-gradient reduction, it also provides the means for using the depletion of a small energy gradient to create constraints that are able to organize the depletion of a much larger energy gradient. In this way, information can serve as the bridge linking the properties of otherwise quite separate and unrelated material and energetic systems. As a result, chains of otherwise non-interacting contragrade processes can be linked. Work done with the aid of one energy gradient can generate constraints in a signaling medium, which can in turn be used to channel work utilizing another quite different energy gradient to create constraints in yet some other medium, and so forth. By repeating such transfers step by step from medium to medium, process to process, causal linkages between phenomena that otherwise would be astronomically unlikely to occur spontaneously can be brought into existence. This is why information, whether embodied in biological processes, engineered devices, or theoretical speculations, has so radically altered the causal fabric of the world we live in. It expands the dimensions of what Kauffman has called the "adjacent possible" in almost unlimited ways, making almost any conceivable causal linkage possible (at least on a human scale).

In this respect, we can describe interpretation as the incorporation of some extrinsically available constraint to help organize work to produce other constraints that in turn help to organize additional work which promotes the maintenance of this reciprocal linkage between forms of work and constraint. So, unlike a thermostat, where the locus of interpretive activity is extrinsic to the cycle of physical interactions, an interpretive process is characterized by an entanglement between the dynamics of its responsiveness to an extrinsic constraint and the dynamics that maintains the intrinsic constraints that enable this responsiveness. Information is in this way indirectly about the conditions of its own interpretation, as well as about something else relevant to these conditions. Interpreting some constraint as being about something else is thus a projection about possibility in two ways: it is a prediction that the source of the constraint exists; and also that it is causally relevant to the preservation of this projective capacity. But a given constraint is information to an interpretive process regardless of whether these projected relationships are realized. What determines that a given constraint is information is that the interpretive process is organized so that this constraint is correlated with the generation of work that would preserve the possibility of this process recurring under some (usually most) of the conditions that could have produced this constraint.

For this reason, interpretation is also always in some sense normative and the relationship of aboutness it projects is intrinsically fallible. The dynamical process of interpretation requires the expenditure of work, and in this sense the system embodying it is at risk of self-degradation if this process fails to generate an outcome that replenishes this capacity, both with respect to the constraints and the energy gradient that are required. But the constraint that serves as the sign of this extrinsic feature is a general formal property of the medium that embodies it, and so it cannot be a guarantee of any particular specific physical referent existing. So, although persistence of the interpretive capacity is partly conditional on this specificity, that correlation may not always hold.

The interpretive capacity is thus a capacity to generate a specific form of work in response to particular forms of system-extrinsic constraints in such a way that this generates intrinsic constraints that are likely to maintain or improve this capacity. But, as we have seen, only morphodynamic processes spontaneously generate intrinsic constraints, and this requires the maintenance of far-from-equilibrium conditions. And only teleodynamic systems (composed of reciprocal morphodynamic processes) are capable of preserving and reproducing the constraints that make this preservation possible. So a system capable of interpreting some extrinsic constraint as information relevant to this capability is necessarily a system dependent on being reliably correlated in space and time with supportive non-equilibrium environmental conditions. Maintaining reliable access to these conditions, which by their nature are likely to be variable and transient, will thus be aided by being differentially responsive to constraints that tend to be correlated with this variability.

Non-living cybernetic mechanisms exhibit forms of recursive dynamical organization that generate attractor-mediated behavior, but their organization is not reflexively dependent on and generated by this dynamics. This means that there is no general property conveyed by each component dynamical transition from one state of the mechanism to the next; there is only a specific dynamical consequence.

As Gregory Bateson emphatically argued, confusing information processes with energetic processes was one of the most problematic tendencies of twentieth-century science. Information and energy are distinct and in many respects should be treated as though they occupy independent causal realms. Nevertheless, they are in fact warp and weft of a single causal fabric. But unless we can both clearly distinguish between them and demonstrate their interdependence, the realms they exemplify will remain isolated.

INTERPRETATION.

For engineering purposes, Shannon's analysis could not extend further than an assessment of the information-carrying capacity of a signal medium, and the uncertainty that is reduced by receipt of a given signal. Including referential considerations would have introduced an infinite term into the quantification: an undecidable factor. What is undecidable is where to stop. There are innumerable points along a prior causal history culminating in the modification of the sign/signal medium in question, and any of these could be taken to be the relevant reference. The process we call interpretation is what determines which is the relevant one. It must "pick" one factor in the trail of causes and effects leading up to the constraint reflected in the signal medium. As everyday experience makes clear, what is significant and what is not depends on the context of interpretation. In different contexts and for different interpreters, the same sign or signal may thus be taken to be about very different things. The capacity to follow the trace of influences that culminated in this particular signal modification in order to identify one that is relevant is in this way entirely dependent on the complexity of the interpreting system, its intrinsic information-carrying/producing capacity, and its involvement with this same causal chain.

Although the physical embodiment of a communication medium provides the concrete basis for reference, its physical embeddedness also opens the door to an open-ended lineage of potentially linked influences. To gain a sense of the openness of the interpretive possibilities, consider the problem faced by a detective at a crime scene. There are many physical traces left by the interactions involved in the critical event: doors may have been opened, furniture displaced, vases knocked over, muddy footprints left on a rug, fingerprints on the doorknob, filaments of clothing, hair, and skin cells left behind during a struggle, and so on. One complex event is reflected in these signs. But for each trace, there may or may not be a causal link to this particular event of interest. Each will also have a causal history that includes many other influences. The causal history reflected in the physical trace taken as a sign is not necessarily relevant to any single event, and which of the events in this history might be determined to be of pragmatic relevance can be different for different interpretive purposes and differently accessible to the interpretive tools that are available.

This yields another stricture on the information interpretation process. The causal history contributing to the constraints imposed on a given medium limits, but does not specify, what its information can be about. The point in this causal chain that is the referent must be determined by and with respect to another information process. All that is guaranteed by a potential reduction of the Shannon entropy of a signal is a possible definite linkage to something else. But this is an open-ended set of possibilities, only limited by processes that spontaneously obliterate certain physical traces or that block certain physical influences. Shannon information is a function of the potential variety of signal states, but referential entropy is additionally a function of the potential variety of factors that could have contributed to that state. So what must an interpretive process include in order to reduce this vast potential entropy of possible referents?

In the late nineteenth-century world of the fictional detective Sherlock Holmes, there were far fewer means available to interpret the physical traces left behind at a crime scene. Even so, to the extent that Holmes had a detailed understanding of the physical processes involved in producing each trace, he could use this information to extrapolate backwards many steps from effect to cause. This capacity has been greatly augmented by modern scientific instruments that, for example, can determine the chemical constitution of traces of mud, the manufacturer of the fibers of different fabrics, the DNA sequence information in a strand of hair, and so on. With this expansion of analytic means, there has come an increase in the amount of information which can be extracted from the same traces that the fictional Holmes might have encountered. These traces contain no more physical differences than they would have in the late nineteenth century; it is simply that more of these have become interpretable, and to a greater causal depth. This enhancement of interpretive capacity is due to an effective increase in the interpretable Shannon entropy. But exactly how does this expansion of analytic tools effectively increase the Shannon entropy of a given physical trace?

Although from an engineer's perspective, every possible independent physical state of a system must be figured into the assessment of its potential Shannon entropy, this is an idealization. What matters are the distinguishable states. The distinguishable states are determined with respect to an interpretive process that itself must also be understood as a signal production process with its own potential Shannon entropy. In other words, one information source can only be interpreted with respect to another information production process. The maximum information that can be conveyed is consequently the lesser of the Shannon entropies of the two processes. If the receiving/interpreting system is physically simpler and less able to a.s.sume alternative states than the sign medium being considered, or the relative probabilities of its states are more uneven (i.e., more constrained), or the coupling between the two is insensitive to certain causal interactions, then the interpretable entropy will be less than the potential entropy of the source. This, for example, happens with the translation of DNA sequence information into protein structure information. Since there are sixty-four possible nucleotide triplets (codons) to code for twenty amino acids, only a fraction of the possible codon entropy is interpretable as amino acid information.2 One consequence of this is that scientists using DNA sequencing devices have more information to work with than does the cell that it comes from.
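The arithmetic behind this example (a standard back-of-the-envelope calculation, not figures given in the text) makes the asymmetry concrete:

$$H_{\text{codon}} = \log_2 64 = 6 \text{ bits per codon}, \qquad H_{\text{amino acid}} \le \log_2 20 \approx 4.32 \text{ bits},$$

so at most roughly 4.3 of the 6 bits per codon are interpretable by the cell's translation machinery; the remainder, carried by synonymous codons, is visible to a sequencing instrument but not to the cell.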

This limitation suggests two interesting analogies to the thermodynamic constraints affecting work that were implicit in Shannon's analysis. First, the combined interpretable Shannon entropy of a chain of systems (e.g., different media) through which information is transferred can be no greater than that of the channel/signal production device with the lowest entropy value. Each coupling of system-to-system will tend to introduce a reduction of the interpretable entropy of the signal, thus reducing the difference between the initial potential and final received signal entropy. And second, information capacity tends to be lost in transfer from medium to medium if there is noise or if the interpreting system is of lower entropy (at least it cannot be increased), and with it the specificity of the causal history that it can be about. Since its possible reference is negatively embodied in the form of constraints, what a sign or signal can be about tends to degrade in specificity spontaneously with transmission or interpretation. This latter tendency parallels a familiar thermodynamic tendency which guarantees that there is inevitably some loss in the capacity to do further work in any mechanical process. This is effectively the informational analogy to the impossibility of a perpetual motion machine: interpretive possibility can only decrease with each transfer of constraints from one medium to another.
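This is essentially the data-processing inequality of information theory (stated here in its standard modern form, which the text itself does not use): for any chain in which Z depends on X only by way of Y,

$$I(X;Z) \le \min\{\, I(X;Y),\ I(Y;Z) \,\}.$$

No stage of re-transmission or re-interpretation can recover reference that an earlier, lower-entropy stage has already discarded.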

This also means that, irrespective of the amount of Shannon information that can be embodied in a particular substrate, what it can and cannot be about also depends on the specific details of the medium's modifiability and its capacity to modify other systems. We create instruments (signal receivers) whose states are affected by the physical state of some process that we wish to monitor, and use the resulting changes of the instrument to extract information about that phenomenon, by virtue of its special sensitivities to its physical context. The information it provides is thus limited by the instrument's material properties, which is why the creation of new kinds of scientific instruments can produce more information about the same objects. The expansion of reference that this provides is implicit in the Shannon-Boltzmann logic. So, while the material limits of our media are a constant source of loss in human information transmission processes, they are not necessarily a serious limitation in the interpretation of natural information sources, such as in scientific investigations. In nature, there is always more Boltzmann entropy embodied in an object or event treated as a sign than current interpretive means can ever capture.

NOISE VERSUS ERROR.

One of the clearest indications that information is not just order is provided by the fact that information can be in error. A signal can be corrupted, its reference can be mistaken, and the news it conveys can be irrelevant. These three normative (i.e., evaluative) assessments are also hierarchically dependent upon one another.

A normative consideration requires comparison. This isn't surprising since it too involves an interpretation process, and whatever information results is a function of possibilities eliminated. Shannon demonstrated that unreliability in a communication process can be overcome by introducing a specified degree of redundancy into the signal, enabling an interpreter to utilize the correlations among similar components to distinguish signal from noise. For any given degree of noise (signal error) below 100 percent, there is some level of redundant transmission and redundancy checking that can distinguish signal from noise. This is because the only means for assessing accuracy of transmission irrespective of content is self-consistency. If a communication medium includes some degree of intrinsic redundancy, such as involving only English sentences, then errors such as typos are often easy to detect and correct irrespective of the content. Because this process is content-independent, it is even possible to detect errors in encrypted messages before they are decoded. Errors in transmission or encoding that result from sources such as typing errors, transmission errors, or receiving errors will be uncorrelated with each other in each separate transmission, while the specific message-carrying features will be highly correlated from one replica to another.
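Shannon's noisy-channel result can be stated compactly for the simplest case (the binary symmetric channel; a standard textbook form rather than anything derived here): if each bit is flipped with probability p < 1/2, the capacity per transmitted bit is

$$C = 1 - H_2(p), \qquad H_2(p) = -p\log_2 p - (1-p)\log_2(1-p),$$

and for any transmission rate below C there exists a redundancy scheme that makes the probability of uncorrected error arbitrarily small.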

This logic is not just restricted to human communication. It is even used by cells in cleaning up potentially noisy genetic information, irrespective of its function. This is possible because the genetic code is redundant, such that nucleotides on either side of the double helix molecule must exactly complement one another or the two sides can't fully re-anneal after being separated during decoding. Thus a mechanism able to detect non-complementarity of base pairing can, irrespective of any functional consequence, be evolved to make functional repairs, so long as the damage is not too extensive.
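A minimal sketch of that kind of content-independent check (illustrative only; no claim about how actual repair enzymes work): recompute the complement of one strand and flag any position where the two strands fail to pair, regardless of what the sequence codes for.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def mismatched_positions(strand, partner):
    """Return positions where the partner strand fails to complement the strand.
    Content-independent: nothing here depends on what the sequence encodes."""
    return [i for i, (a, b) in enumerate(zip(strand, partner))
            if COMPLEMENT.get(a) != b]

strand  = "ATGCGTAC"
partner = "TACGCTAG"   # two positions corrupted (hypothetical example)
print(mismatched_positions(strand, partner))  # -> [5, 6]
```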

There is a related higher-order logic involved in checking the accuracy of representation. Besides the obvious utility of being able to determine the accuracy of information about something, this issue has important philosophical significance as well. The assessment of referential error has been a non-trivial problem for correspondence and mapping theories of reference since at least the writings of the philosopher David Hume in 1739-40. This is because a correspondence is a correspondence, irrespective of whether it is involved in a representational relationship or is of any significance for any interpretive process. In some degree or other, it is possible to find some correspondence relation between almost any two facts. What matters is the determination of a specific correspondence, and this requires a means for distinguishing accurate correspondence relationships and ignoring spurious ones. The solution to this problem has a logic that is analogous to correcting for signal noise in Shannon's theory.