The Singularity Is Near: When Humans Transcend Biology - Part 4

Part 4

RAY: Why do you suppose it turned out that way?

GEORGE 2048: Because you left one thing out of your equation. Although people realized that stock values would increase rapidly, that same realization also increased the discount rate (the rate at which we need to discount values in the future when considering their present value). Think about it. If we know that stocks are going to increase significantly in a future period, then we'd like to have the stocks now so that we can realize those future gains. So the perception of increased future equity values also increases the discount rate. And that cancels out the expectation of higher future values.

MOLLY 2104: Uh, George, that was not quite right either. What you say makes logical sense, but the psychological reality is that the heightened perception of increased future values did have a greater positive impact on stock prices than increases in the discount rate had a negative effect. So the general acceptance of exponential growth in both the price-performance of technology and the rate of economic activity did provide an upward draft for the equities market, but not the tripling that you spoke about, Ray, due to the effect that George was describing.

MOLLY 2004: Okay, I'm sorry I asked. I think I'll just hold on to the few shares I've got and not worry about it.

RAY: What have you invested in?

MOLLY 2004: Let's see, there's this new natural language-based search-engine company that hopes to take on Google. And I've also invested in a fuel-cell company. Also, a company building sensors that can travel in the bloodstream.

RAY: Sounds like a pretty high-risk, high-tech portfolio.

MOLLY 2004: I wouldn't call it a portfolio. I'm just dabbling with the technologies you're talking about.

RAY: Okay, but keep in mind that while the trends predicted by the law of accelerating returns are remarkably smooth, that doesn't mean we can readily predict which competitors will prevail.

MOLLY 2004: Right, that's why I'm spreading my bets.

CHAPTER THREE.

Achieving the Computational Capacity of the Human Brain

As I discuss in Engines of Creation, if you can build genuine AI, there are reasons to believe that you can build things like neurons that are a million times faster. That leads to the conclusion that you can make systems that think a million times faster than a person. With AI, these systems could do engineering design. Combining this with the capability of a system to build something that is better than it, you have the possibility for a very abrupt transition. This situation may be more difficult to deal with even than nanotechnology, but it is much more difficult to think about it constructively at this point. Thus, it hasn't been the focus of things that I discuss, although I periodically point to it and say: "That's important too."-ERIC DREXLER, 1989

The Sixth Paradigm of Computing Technology: Three-Dimensional Molecular Computing and Emerging Computational Technologies

In the April 19, 1965, issue of Electronics, Gordon Moore wrote, "The future of integrated electronics is the future of electronics itself. The advantages of integration will bring about a proliferation of electronics, pushing this science into many new areas."1 With those modest words, Moore ushered in a revolution that is still gaining momentum. To give his readers some idea of how profound this new science would be, Moore predicted that "by 1975, economics may dictate squeezing as many as 65,000 components on a single silicon chip." Imagine that.

Moore's article described the repeated annual doubling of the number of transistors (used for computational elements, or gates) that could be fitted onto an integrated circuit. His 1965 "Moore's Law" prediction was criticized at the time because his logarithmic chart of the number of components on a chip had only five reference points (from 1959 through 1965), so projecting this nascent trend all the way out to 1975 was seen as premature. Moore's initial estimate was incorrect, and he revised it downward a decade later. But the basic idea-the exponential growth of the price-performance of electronics based on shrinking the size of transistors on an integrated circuit-was both valid and prescient.2 Today, we talk about billions of components rather than thousands. In the most advanced chips of 2004, logic gates are only fifty nanometers wide, already well within the realm of nanotechnology (which deals with measurements of one hundred nanometers or less). The demise of Moore's Law has been predicted on a regular basis, but the end of this remarkable paradigm keeps getting pushed out in time. Paolo Gargini, Intel Fellow, director of Intel technology strategy, and chairman of the influential International Technology Roadmap for Semiconductors (ITRS), recently stated, "We see that for at least the next 15 to 20 years, we can continue staying on Moore's Law. In fact, ... nanotechnology offers many new knobs we can turn to continue improving the number of components on a die."3

The acceleration of computation has transformed everything from social and economic relations to political institutions, as I will demonstrate throughout this book. But Moore did not point out in his papers that the strategy of shrinking feature sizes was not, in fact, the first paradigm to bring exponential growth to computation and communication. It was the fifth, and already, we can see the outlines of the next: computing at the molecular level and in three dimensions.
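The arithmetic behind Moore's extrapolation is worth making concrete. The sketch below assumes strict annual doubling and a 1965 starting point of 64 components per chip; both are illustrative round figures chosen for this example, not numbers taken from Moore's article.

```python
# Illustrative sketch of Moore's 1965 extrapolation: components per
# chip under strict annual doubling. The starting count of 64 is an
# assumed round figure for illustration.
def components(year, base_year=1965, base_count=64):
    """Components per chip after (year - base_year) annual doublings."""
    return base_count * 2 ** (year - base_year)

for year in (1965, 1970, 1975):
    print(year, components(year))  # 1975 gives 65,536, close to Moore's "65,000"
```

Ten doublings multiply the count by 1,024, which is how a chip of dozens of components in 1965 becomes one of tens of thousands by 1975.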
Even though we have more than a decade left of the fifth paradigm, there has already been compelling progress in all of the enabling technologies required for the sixth paradigm. In the next section, I provide an analysis of the amount of computation and memory required to achieve human levels of intelligence and why we can be confident that these levels will be achieved in inexpensive computers within two decades. Even these very powerful computers will be far from optimal, and in the last section of this chapter I'll review the limits of computation according to the laws of physics as we understand them today. This will bring us to computers circa the late twenty-first century.

The Bridge to 3-D Molecular Computing. Intermediate steps are already under way: new technologies that will lead to the sixth paradigm of molecular three-dimensional computing include nanotubes and nanotube circuitry, molecular computing, self-assembly in nanotube circuits, biological systems emulating circuit assembly, computing with DNA, spintronics (computing with the spin of electrons), computing with light, and quantum computing. Many of these independent technologies can be integrated into computational systems that will eventually approach the theoretical maximum capacity of matter and energy to perform computation and will far outpace the computational capacities of a human brain.

One approach is to build three-dimensional circuits using "conventional" silicon lithography. Matrix Semiconductor is already selling memory chips that contain vertically stacked planes of transistors rather than one flat layer.4 Since a single 3-D chip can hold more memory, overall product size is reduced, so Matrix is initially targeting portable electronics, where it aims to compete with flash memory (used in cell phones and digital cameras because it does not lose information when the power is turned off). The stacked circuitry also reduces the overall cost per bit. Another approach comes from one of Matrix's competitors, Fujio Masuoka, a former Toshiba engineer who invented flash memory. Masuoka claims that his novel memory design, which looks like a cylinder, reduces the size and cost-per-bit of memory by a factor of ten compared to flat chips.5 Working prototypes of three-dimensional silicon chips have also been demonstrated at Rensselaer Polytechnic Institute's Center for Gigascale Integration and at the MIT Media Lab.

Tokyo's Nippon Telegraph and Telephone Corporation (NTT) has demonstrated a dramatic 3-D technology using electron-beam lithography, which can create arbitrary three-dimensional structures with feature sizes (such as transistors) as small as ten nanometers.6 NTT demonstrated the technology by creating a high-resolution model of the Earth sixty microns in size with ten-nanometer features. NTT says the technology is applicable to nanofabrication of electronic devices such as semiconductors, as well as creating nanoscale mechanical systems.

Nanotubes Are Still the Best Bet. In The Age of Spiritual Machines, I cited nanotubes-using molecules organized in three dimensions to store memory bits and to act as logic gates-as the most likely technology to usher in the era of three-dimensional molecular computing. Nanotubes, first synthesized in 1991, are tubes made up of a hexagonal network of carbon atoms that have been rolled up to make a seamless cylinder.7 Nanotubes are very small: single-wall nanotubes are only one nanometer in diameter, so they can achieve high densities.

They are also potentially very fast. Peter Burke and his colleagues at the University of California at Irvine recently demonstrated nanotube circuits operating at 2.5 gigahertz (GHz). However, in Nano Letters, a peer-reviewed journal of the American Chemical Society, Burke says the theoretical speed limit for these nanotube transistors "should be terahertz (1 THz = 1,000 GHz), which is about 1,000 times faster than modern computer speeds."8 One cubic inch of nanotube circuitry, once fully developed, would be up to one hundred million times more powerful than the human brain.9

Nanotube circuitry was controversial when I discussed it in 1999, but there has been dramatic progress in the technology over the past six years. Two major strides were made in 2001. A nanotube-based transistor (with dimensions of one by twenty nanometers), operating at room temperature and using only a single electron to switch between on and off states, was reported in the July 6, 2001, issue of Science.10 Around the same time, IBM also demonstrated an integrated circuit with one thousand nanotube-based transistors.11 More recently, we have seen the first working models of nanotube-based circuitry. In January 2004 researchers at the University of California at Berkeley and Stanford University created an integrated memory circuit based on nanotubes.12 One of the challenges in using this technology is that some nanotubes are conductive (that is, simply transmit electricity), while others act like semiconductors (that is, are capable of switching and able to implement logic gates). The difference in capability is based on subtle structural features. Until recently, sorting them out required manual operations, which would not be practical for building large-scale circuits. The Berkeley and Stanford scientists addressed this issue by developing a fully automated method of sorting and discarding the nonsemiconductor nanotubes.

Lining up nanotubes is another challenge with nanotube circuits, since they tend to grow in every direction. In 2001 IBM scientists demonstrated that nanotube transistors could be grown in bulk, similar to silicon transistors. They used a process called "constructive destruction," which destroys defective nanotubes right on the wafer instead of sorting them out manually. Thomas Theis, director of physical sciences at IBM's Thomas J. Watson Research Center, said at the time, "We believe that IBM has now passed a major milestone on the road toward molecular-scale chips....If we are ultimately successful, then carbon nanotubes will enable us to indefinitely maintain Moore's Law in terms of density, because there is very little doubt in my mind that these can be made smaller than any future silicon transistor."13 In May 2003 Nantero, a small company in Woburn, Massachusetts, cofounded by Harvard University researcher Thomas Rueckes, took the process a step further when it demonstrated a single-chip wafer with ten billion nanotube junctions, all aligned in the proper direction. The Nantero technology involves using standard lithography equipment to automatically remove the nanotubes that are incorrectly aligned. Nantero's use of standard equipment has excited industry observers because the technology would not require expensive new fabrication machines. The Nantero design provides random access as well as nonvolatility (data is retained when the power is off), meaning that it could potentially replace all of the primary forms of memory: RAM, flash, and disk.

Computing with Molecules. In addition to nanotubes, major progress has been made in recent years in computing with just one or a few molecules. The idea of computing with molecules was first suggested in the early 1970s by IBM's Avi Aviram and Northwestern University's Mark A. Ratner.14 At that time, we did not have the enabling technologies, which required concurrent advances in electronics, physics, chemistry, and even the reverse engineering of biological processes for the idea to gain traction.

In 2002 scientists at the University of Wisconsin and University of Basel created an "atomic memory drive" that uses atoms to emulate a hard drive. A single silicon atom could be added or removed from a block of twenty others using a scanning tunneling microscope. Using this process, researchers believe, the system could be used to store millions of times more data on a disk of comparable size-a density of about 250 terabits of data per square inch-although the demonstration involved only a small number of bits.15 The one-terahertz speed predicted by Peter Burke for molecular circuits looks increasingly accurate, given the nanoscale transistor created by scientists at the University of Illinois at Urbana-Champaign. It runs at a frequency of 604 gigahertz (more than half a terahertz).16 One type of molecule that researchers have found to have desirable properties for computing is called a "rotaxane," which can switch states by changing the energy level of a ringlike structure contained within the molecule. Rotaxane memory and electronic switching devices have been demonstrated, and they show the potential of storing one hundred gigabits (10^11 bits) per square inch. The potential would be even greater if organized in three dimensions.

Self-Assembly. Self-assembly of nanoscale circuits is another key enabling technique for effective nanoelectronics. Self-assembly allows improperly formed components to be discarded automatically and makes it possible for the potentially trillions of circuit components to organize themselves, rather than be painstakingly assembled in a top-down process. It would enable large-scale circuits to be created in test tubes rather than in multibillion-dollar factories, using chemistry rather than lithography, according to UCLA scientists.17 Purdue University researchers have already demonstrated self-organizing nanotube structures, using the same principle that causes DNA strands to link together in stable structures.18 Harvard University scientists took a key step forward in June 2004 when they demonstrated another self-organizing method that can be used on a large scale.19 The technique starts with photolithography to create an etched array of interconnects (connections between computational elements). A large number of nanowire field-effect transistors (a common form of transistors) and nanoscale interconnects are then deposited on the array. These then connect themselves in the correct pattern.

In 2004 researchers at the University of Southern California and NASA's Ames Research Center demonstrated a method that self-organizes extremely dense circuits in a chemical solution.20 The technique creates nanowires spontaneously and then causes nanoscale memory cells, each able to hold three bits of data, to self-assemble onto the wires. The technology has a storage capacity of 258 gigabits of data per square inch (which researchers claim could be increased tenfold), compared to 6.5 gigabits on a flash memory card. Also in 2003 IBM demonstrated a working memory device using polymers that self-assemble into twenty-nanometer-wide hexagonal structures.21 It's also important that nanocircuits be self-configuring. The large number of circuit components and their inherent fragility (due to their small size) make it inevitable that some portions of a circuit will not function correctly. It will not be economically feasible to discard an entire circuit simply because a small number of transistors out of a trillion are nonfunctioning. To address this concern, future circuits will continuously monitor their own performance and route information around sections that are unreliable, in the same manner that information on the Internet is routed around nonfunctioning nodes. IBM has been particularly active in this area of research and has already developed microprocessor designs that automatically diagnose problems and reconfigure chip resources accordingly.22

Emulating Biology. The idea of building electronic or mechanical systems that are self-replicating and self-organizing is inspired by biology, which relies on these properties. Research published in the Proceedings of the National Academy of Sciences described the construction of self-replicating nanowires based on prions, which are self-replicating proteins. (As detailed in chapter 4, one form of prion appears to play a role in human memory, whereas another form is believed to be responsible for variant Creutzfeldt-Jakob disease, the human form of mad-cow disease.)23 The team involved in the project used prions as a model because of their natural strength. Because prions do not normally conduct electricity, however, the scientists created a genetically modified version containing a thin layer of gold, which conducts electricity with low resistance. MIT biology professor Susan Lindquist, who headed the study, commented, "Most of the people working on nanocircuits are trying to build them using 'top-down' fabrication techniques. We thought we'd try a 'bottom-up' approach, and let molecular self-assembly do the hard work for us."

The ultimate self-replicating molecule from biology is, of course, DNA. Duke University researchers created molecular building blocks called "tiles" out of self-assembling DNA molecules.24 They were able to control the structure of the resulting assembly, creating "nanogrids." This technique automatically attaches protein molecules to each nanogrid's cell, which could be used to perform computing operations. They also demonstrated a chemical process that coated the DNA nanoribbons with silver to create nanowires. Commenting on the article in the September 26, 2003, issue of the journal Science, lead researcher Hao Yan said, "To use DNA self-assembly to template protein molecules or other molecules has been sought for years, and this is the first time it has been demonstrated so clearly."25

Computing with DNA. DNA is nature's own nanoengineered computer, and its ability to store information and conduct logical manipulations at the molecular level has already been exploited in specialized "DNA computers." A DNA computer is essentially a test tube filled with water containing trillions of DNA molecules, with each molecule acting as a computer.

The goal of the computation is to solve a problem, with the solution expressed as a sequence of symbols. (For example, the sequence of symbols could represent a mathematical proof or just the digits of a number.) Here's how a DNA computer works. A small strand of DNA is created, using a unique code for each symbol. Each such strand is replicated trillions of times using a process called "polymerase chain reaction" (PCR). These pools of DNA are then put into a test tube. Because DNA has an affinity to link strands together, long strands form automatically, with sequences of the strands representing the different symbols, each of them a possible solution to the problem. Since there will be many trillions of such strands, there are multiple strands for each possible answer (that is, each possible sequence of symbols).

The next step of the process is to test all of the strands simultaneously. This is done by using specially designed enzymes that destroy strands that do not meet certain criteria. The enzymes are applied to the test tube sequentially, and by designing a precise series of enzymes the procedure will eventually obliterate all the incorrect strands, leaving only the ones with the correct answer. (For a more complete description of the process, see this note:26) The key to the power of DNA computing is that it allows for testing each of the trillions of strands simultaneously. In 2003 Israeli scientists led by Ehud Shapiro at the Weizmann Institute of Science combined DNA with adenosine triphosphate (ATP), the natural fuel for biological systems such as the human body.27 With this method, each of the DNA molecules was able to perform computations as well as provide its own energy. The Weizmann scientists demonstrated a configuration consisting of two spoonfuls of this liquid supercomputing system, which contained thirty million billion molecular computers and performed a total of 660 trillion calculations per second (6.6 x 10^14 cps). The energy consumption of these computers is extremely low, only fifty millionths of a watt for all thirty million billion computers.
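The generate-and-filter logic described above can be sketched as a toy simulation: candidate symbol strings stand in for the trillions of DNA strands, and each "enzyme" pass destroys the strands that violate one criterion. The encoding (4-bit strings) and the three criteria are invented for illustration; a real DNA computer encodes symbols chemically and tests all strands in parallel rather than in a loop.

```python
from itertools import product

# Toy model of a DNA computation: every candidate "strand" (here a
# 4-bit string) is generated up front, mimicking the pool of trillions
# of molecules; then a sequence of "enzyme" passes removes the strands
# failing one constraint each. Constraints are invented for illustration.
strands = [''.join(bits) for bits in product('01', repeat=4)]

enzymes = [
    lambda s: s[0] == '1',        # destroy strands not starting with 1
    lambda s: s.count('1') == 2,  # destroy strands without exactly two 1s
    lambda s: s[3] == '0',        # destroy strands not ending with 0
]

for keep in enzymes:              # applied sequentially, as in the text
    strands = [s for s in strands if keep(s)]

print(strands)  # the surviving strands encode the answers
```

The survivors are exactly the strings satisfying all three criteria, just as the final test tube contains only strands spelling out correct answers.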

There's a limitation, however, to DNA computing: each of the many trillions of computers has to perform the same operation at the same time (although on different data), so that the device is a "single instruction multiple data" (SIMD) architecture. While there are important classes of problems that are amenable to a SIMD system (for example, processing every pixel in an image for image enhancement or compression, and solving combinatorial-logic problems), it is not possible to program them for general-purpose algorithms, in which each computer is able to execute whatever operation is needed for its particular mission. (Note that the research projects at Purdue University and Duke University, described earlier, that use self-assembling DNA strands to create three-dimensional structures are different from the DNA computing described here. Those research projects have the potential to create arbitrary configurations that are not limited to SIMD computing.)
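The generate-and-filter strategy behind DNA computing can be sketched in conventional code. The toy problem and all names below are illustrative inventions, not from the text: every candidate bit string exists in the pool at once, and each "enzyme" is a test applied to the whole pool, destroying the strands that fail it.

```python
from itertools import product

# Toy problem: find all 4-bit strings in which no two adjacent bits are both 1.
# In a DNA computer, trillions of strands encoding every candidate would exist
# simultaneously; here we simply enumerate them.
candidates = [''.join(bits) for bits in product('01', repeat=4)]

# Each "enzyme" tests one adjacent pair, SIMD-style, across the whole pool;
# strands failing the test are destroyed (filtered out).
enzymes = [lambda s, i=i: not (s[i] == '1' and s[i + 1] == '1')
           for i in range(3)]

pool = candidates
for enzyme in enzymes:
    pool = [s for s in pool if enzyme(s)]

print(len(candidates), len(pool))  # 16 candidates reduced to 8 solutions
```

Note that, as the text says, every surviving strand is tested by the same operation at each step; no strand can branch off and run its own program, which is exactly the SIMD restriction.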

Computing with Spin. In addition to their negative electrical charge, electrons have another property that can be exploited for memory and computation: spin. According to quantum mechanics, electrons spin on an axis, similar to the way the Earth rotates on its axis. This concept is theoretical, because an electron is considered to occupy a point in space, so it is difficult to imagine a point with no size that nonetheless spins. However, when an electrical charge moves, it causes a magnetic field, which is real and measurable. An electron can spin in one of two directions, described as "up" and "down," so this property can be exploited for logic switching or to encode a bit of memory.

The exciting property of spintronics is that no energy is required to change an electron's spin state. Stanford University physics professor Shoucheng Zhang and University of Tokyo professor Naoto Nagaosa put it this way: "We have discovered the equivalent of a new 'Ohm's Law' [the electronics law that states that current in a wire equals voltage divided by resistance]....[It] says that the spin of the electron can be transported without any loss of energy, or dissipation. Furthermore, this effect occurs at room temperature in materials already widely used in the semiconductor industry, such as gallium arsenide. That's important because it could enable a new generation of computing devices."28 The potential, then, is to achieve the efficiencies of superconducting (that is, moving information at or close to the speed of light without any loss of information) at room temperature. It also allows multiple properties of each electron to be used for computing, thereby increasing the potential for memory and computational density.

One form of spintronics is already familiar to computer users: magnetoresistance (a change in electrical resistance caused by a magnetic field) is used to store data on magnetic hard drives. An exciting new form of nonvolatile memory based on spintronics called MRAM (magnetic random-access memory) is expected to enter the market within a few years. Like hard drives, MRAM memory retains its data without power but uses no moving parts and will have speeds and rewritability comparable to conventional RAM.

MRAM stores information in ferromagnetic metallic alloys, which are suitable for data storage but not for the logical operations of a microprocessor. The holy grail of spintronics would be to achieve practical spintronics effects in a semiconductor, which would enable us to use the technology both for memory and for logic. Today's chip manufacturing is based on silicon, which does not have the requisite magnetic properties. In March 2004 an international group of scientists reported that by doping a blend of silicon and iron with cobalt, the new material was able to display the magnetic properties needed for spintronics while still maintaining the crystalline structure silicon requires as a semiconductor.29 An important role for spintronics in the future of computer memory is clear, and it is likely to contribute to logic systems as well. The spin of an electron is a quantum property (subject to the laws of quantum mechanics), so perhaps the most important application of spintronics will be in quantum computing systems, using the spin of quantum-entangled electrons to represent qubits, which I discuss below.

Spin has also been used to store information in the nucleus of atoms, using the complex interaction of their protons' magnetic moments. Scientists at the University of Oklahoma also demonstrated a "molecular photography" technique for storing 1,024 bits of information in a single liquid-crystal molecule comprising nineteen hydrogen atoms.30

Computing with Light. Another approach to SIMD computing is to use multiple beams of laser light in which information is encoded in each stream of photons. Optical components can then be used to perform logical and arithmetic functions on the encoded information streams. For example, a system developed by Lenslet, a small Israeli company, uses 256 lasers and can perform eight trillion calculations per second by performing the same calculation on each of the 256 streams of data.31 The system can be used for applications such as performing data compression on 256 video channels.

SIMD technologies such as DNA computers and optical computers will have important specialized roles to play in the future of computation. The replication of certain aspects of the functionality of the human brain, such as processing sensory data, can use SIMD architectures. For other brain regions, such as those dealing with learning and reasoning, general-purpose computing with its "multiple instruction multiple data" (MIMD) architectures will be required. For high-performance MIMD computing, we will need to apply the three-dimensional molecular-computing paradigms described above.

Quantum Computing. Quantum computing is an even more radical form of SIMD parallel processing, but one that is in a much earlier stage of development compared to the other new technologies we have discussed. A quantum computer contains a series of qubits, which essentially are zero and one at the same time. The qubit is based on the fundamental ambiguity inherent in quantum mechanics. In a quantum computer, the qubits are represented by a quantum property of particles-for example, the spin state of individual electrons. When the qubits are in an "entangled" state, each one is simultaneously in both states. In a process called "quantum decoherence" the ambiguity of each qubit is resolved, leaving an unambiguous sequence of ones and zeroes. If the quantum computer is set up in the right way, that decohered sequence will represent the solution to a problem. Essentially, only the correct sequence survives the process of decoherence.

As with the DNA computer described above, a key to successful quantum computing is a careful statement of the problem, including a precise way to test possible answers. The quantum computer effectively tests every possible combination of values for the qubits. So a quantum computer with one thousand qubits would test 2^1,000 (a number approximately equal to one followed by 301 zeroes) potential solutions simultaneously.
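The size of that search space is easy to check directly with arbitrary-precision integer arithmetic:

```python
# The search space of a thousand-qubit quantum computer: 2^1,000 candidates.
n = 2 ** 1000
print(len(str(n)))  # 302 decimal digits, i.e., roughly 1 followed by 301 zeroes
```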

A thousand-bit quantum computer would vastly outperform any conceivable DNA computer, or for that matter any conceivable nonquantum computer. There are two limitations to the process, however. The first is that, like the DNA and optical computers discussed above, only a special set of problems is amenable to being presented to a quantum computer. In essence, we need to be able to test each possible answer in a simple way.

The classic example of a practical use for quantum computing is in factoring very large numbers (finding which smaller numbers, when multiplied together, result in the large number). Factoring numbers with more than 512 bits is currently not achievable on a digital computer, even a massively parallel one.32 Interesting classes of problems amenable to quantum computing include breaking encryption codes (which rely on factoring large numbers). The other problem is that the computational power of a quantum computer depends on the number of entangled qubits, and the state of the art is currently limited to around ten bits. A ten-bit quantum computer is not very useful, since 2^10 is only 1,024. In a conventional computer, it is a straightforward process to combine memory bits and logic gates. We cannot, however, create a twenty-qubit quantum computer simply by combining two ten-qubit machines. All of the qubits have to be quantum-entangled together, and that has proved to be challenging.

A key question is: how difficult is it to add each additional qubit? The computational power of a quantum computer grows exponentially with each added qubit, but if it turns out that adding each additional qubit makes the engineering task exponentially more difficult, we will not be gaining any leverage. (That is, the computational power of a quantum computer will be only linearly proportional to the engineering difficulty.) In general, proposed methods for adding qubits make the resulting systems significantly more delicate and susceptible to premature decoherence.

There are proposals to increase significantly the number of qubits, although these have not yet been proved in practice. For example, Stephan Gulde and his colleagues at the University of Innsbruck have built a quantum computer using a single atom of calcium that has the potential to simultaneously encode dozens of qubits-possibly up to one hundred-using different quantum properties within the atom.33 The ultimate role of quantum computing remains unresolved. But even if a quantum computer with hundreds of entangled qubits proves feasible, it will remain a special-purpose device, although one with remarkable capabilities that cannot be emulated in any other way.

When I suggested in The Age of Spiritual Machines that molecular computing would be the sixth major computing paradigm, the idea was still controversial. There has been so much progress in the past five years that there has been a sea change in attitude among experts, and this is now a mainstream view. We already have proofs of concept for all of the major requirements for three-dimensional molecular computing: single-molecule transistors, memory cells based on atoms, nanowires, and methods to self-assemble and self-diagnose the trillions (potentially trillions of trillions) of components.

Contemporary electronics proceeds from the design of detailed chip layouts to photolithography to the manufacturing of chips in large, centralized factories. Nanocircuits are more likely to be created in small chemistry flasks, a development that will be another important step in the decentralization of our industrial infrastructure and will maintain the law of accelerating returns through this century and beyond.

The Computational Capacity of the Human Brain

It may seem rash to expect fully intelligent machines in a few decades, when the computers have barely matched insect mentality in a half-century of development. Indeed, for that reason, many long-time artificial intelligence researchers scoff at the suggestion, and offer a few centuries as a more believable period. But there are very good reasons why things will go much faster in the next fifty years than they have in the last fifty. . . . Since 1990, the power available to individual AI and robotics programs has doubled yearly, to 30 MIPS by 1994 and 500 MIPS by 1998. Seeds long ago alleged barren are suddenly sprouting. Machines read text, recognize speech, even translate languages. Robots drive cross-country, crawl across Mars, and trundle down office corridors. In 1996 a theorem-proving program called EQP running five weeks on a 50 MIPS computer at Argonne National Laboratory found a proof of a Boolean algebra conjecture by Herbert Robbins that had eluded mathematicians for sixty years. And it is still only Spring. Wait until Summer.-HANS MORAVEC, "WHEN WILL COMPUTER HARDWARE MATCH THE HUMAN BRAIN?" 1997

What is the computational capacity of a human brain? A number of estimates have been made, based on replicating the functionality of brain regions that have been reverse engineered (that is, the methods understood) at human levels of performance. Once we have an estimate of the computational capacity for a particular region, we can extrapolate that capacity to the entire brain by considering what portion of the brain that region represents. These estimates are based on functional simulation, which replicates the overall functionality of a region rather than simulating each neuron and interneuronal connection in that region.

Although we would not want to rely on any single calculation, we find that various assessments of different regions of the brain all provide reasonably close estimates for the entire brain. The following are order-of-magnitude estimates, meaning that we are attempting to determine the appropriate figures to the closest multiple of ten. The fact that different ways of making the same estimate provide similar answers corroborates the approach and indicates that the estimates are in an appropriate range.

The prediction that the Singularity-an expansion of human intelligence by a factor of trillions through merger with its nonbiological form-will occur within the next several decades does not depend on the precision of these calculations. Even if our estimate of the amount of computation required to simulate the human brain was too optimistic (that is, too low) by a factor of even one thousand (which I believe is unlikely), that would delay the Singularity by only about eight years.34 A factor of one million would mean a delay of only about fifteen years, and a factor of one billion would be a delay of about twenty-one years.35

Hans Moravec, legendary roboticist at Carnegie Mellon University, has analyzed the transformations performed by the neural image-processing circuitry contained in the retina.36 The retina is about two centimeters wide and a half millimeter thick. Most of the retina's depth is devoted to capturing an image, but one fifth of it is devoted to image processing, which includes distinguishing dark and light, and detecting motion in about one million small regions of the image.

The retina, according to Moravec's analysis, performs ten million of these edge and motion detections each second. Based on his several decades of experience in creating robotic vision systems, he estimates that the execution of about one hundred computer instructions is required to re-create each such detection at human levels of performance, meaning that replicating the image-processing functionality of this portion of the retina requires 1,000 MIPS. The human brain is about 75,000 times heavier than the 0.02 grams of neurons in this portion of the retina, resulting in an estimate of about 10^14 (100 trillion) instructions per second for the entire brain.37

Another estimate comes from the work of Lloyd Watts and his colleagues on creating functional simulations of regions of the human auditory system, which I discuss further in chapter 4.38
One of the functions of the software Watts has developed is a task called "stream separation," which is used in teleconferencing and other applications to achieve telepresence (the localization of each participant in a remote audio teleconference). To accomplish this, Watts explains, means "precisely measuring the time delay between sound sensors that are separated in space and that both receive the sound." The process involves pitch analysis, spatial position, and speech cues, including language-specific cues. "One of the important cues used by humans for localizing the position of a sound source is the Interaural Time Difference (ITD), that is, the difference in time of arrival of sounds at the two ears."39 Watts's own group has created functionally equivalent re-creations of these brain regions derived from reverse engineering. He estimates that 10^11 cps are required to achieve human-level localization of sounds. The auditory cortex regions responsible for this processing comprise at least 0.1 percent of the brain's neurons. So we again arrive at a ballpark estimate of around 10^14 cps (10^11 cps × 10^3).
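Both extrapolations reduce to a couple of lines of arithmetic. The figures below are the ones stated in the text; the rounding at the end is order-of-magnitude, as the text notes.

```python
# Moravec's retina estimate
detections_per_sec = 1e7          # ten million edge/motion detections per second
instructions_per_detection = 100  # to re-create one detection in software
retina_ips = detections_per_sec * instructions_per_detection  # 1e9 = 1,000 MIPS
brain_ips = retina_ips * 75_000   # the brain is ~75,000x this retinal portion
print(f"{brain_ips:.1e}")         # 7.5e+13, order of magnitude 10^14

# Watts's auditory estimate: the localization regions hold ~0.1 percent of the
# brain's neurons, so scale the regional figure up by a factor of 1,000.
localization_cps = 1e11
brain_cps = localization_cps * 1000
print(f"{brain_cps:.0e}")         # 1e+14
```

That two unrelated regions (vision and audition) extrapolate to the same 10^14 figure is the corroboration the text is pointing at.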

Yet another estimate comes from a simulation at the University of Texas that represents the functionality of a cerebellum region containing 10^4 neurons; this required about 10^8 cps, or about 10^4 cps per neuron. Extrapolating this over an estimated 10^11 neurons results in a figure of about 10^15 cps for the entire brain.

We will discuss the state of human-brain reverse engineering later, but it is clear that we can emulate the functionality of brain regions with less computation than would be required to simulate the precise nonlinear operation of each neuron and all of the neural components (that is, all of the complex interactions that take place inside each neuron). We come to the same conclusion when we attempt to simulate the functionality of organs in the body. For example, implantable devices are being tested that simulate the functionality of the human pancreas in regulating insulin levels.40 These devices work by measuring glucose levels in the blood and releasing insulin in a controlled fashion to keep the levels in an appropriate range. While they follow a method similar to that of a biological pancreas, they do not, however, attempt to simulate each pancreatic islet cell, and there would be no reason to do so.

These estimates all result in comparable orders of magnitude (10^14 to 10^15 cps). Given the early stage of human-brain reverse engineering, I will use a more conservative figure of 10^16 cps for our subsequent discussions.

Functional simulation of the brain is sufficient to re-create human powers of pattern recognition, intellect, and emotional intelligence. On the other hand, if we want to "upload" a particular person's personality (that is, capture all of his or her knowledge, skills, and personality, a concept I will explore in greater detail at the end of chapter 4), then we may need to simulate neural processes at the level of individual neurons and portions of neurons, such as the soma (cell body), axon (output connection), dendrites (trees of incoming connections), and synapses (regions connecting axons and dendrites). For this, we need to look at detailed models of individual neurons. The "fan out" (number of interneuronal connections) per neuron is estimated at 10^3. With an estimated 10^11 neurons, that's about 10^14 connections. With a reset time of five milliseconds, that comes to about 10^16 synaptic transactions per second.

Neuron-model simulations indicate the need for about 10^3 calculations per synaptic transaction to capture the nonlinearities (complex interactions) in the dendrites and other neuron regions, resulting in an overall estimate of about 10^19 cps for simulating the human brain at this level.41 We can therefore consider this an upper bound, but 10^14 to 10^16 cps to achieve functional equivalence of all brain regions is likely to be sufficient.
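The chain of estimates for the neuron-level (upload) scenario can be verified step by step. The only translation below is expressing the five-millisecond reset time as 200 resets per second:

```python
neurons = 1e11                # estimated neurons in the human brain
fan_out = 1e3                 # interneuronal connections per neuron
connections = neurons * fan_out              # 1e14 connections
resets_per_second = 200       # one reset every five milliseconds
transactions = connections * resets_per_second   # 2e16 synaptic transactions/sec
calcs_per_transaction = 1e3   # to capture dendritic nonlinearities
total = transactions * calcs_per_transaction     # ~10^19 cps upper bound
print(f"{transactions:.0e}", f"{total:.0e}")     # 2e+16 2e+19
```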

IBM's Blue Gene/L supercomputer, now being built and scheduled to be completed around the time of the publication of this book, is projected to provide 360 trillion calculations per second (3.6 × 10^14 cps).42 This figure is already greater than the lower estimates described above. Blue Gene/L will also have around one hundred terabytes (about 10^15 bits) of main storage, more than our memory estimate for functional emulation of the human brain (see below). In line with my earlier predictions, supercomputers will achieve my more conservative estimate of 10^16 cps for functional human-brain emulation by early in the next decade (see the "Supercomputer Power" figure on p. 71).

Accelerating the Availability of Human-Level Personal Computing. Personal computers today provide more than 10^9 cps. According to the projections in the "Exponential Growth of Computing" chart (p. 70), we will achieve 10^16 cps by 2025. However, there are several ways this timeline can be accelerated. Rather than using general-purpose processors, one can use application-specific integrated circuits (ASICs) to provide greater price-performance for very repetitive calculations. Such circuits already provide extremely high computational throughput for the repetitive calculations used in generating moving images in video games. ASICs can increase price-performance a thousandfold, cutting about eight years off the 2025 date. The varied programs that a simulation of the human brain will comprise will also include a great deal of repetition and thus will be amenable to ASIC implementation. The cerebellum, for example, repeats a basic wiring pattern billions of times.
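The "eight years off" figure follows from doubling arithmetic: a thousandfold gain is about ten doublings of price-performance (2^10 = 1,024). The doubling period used below is an assumed average of 0.8 years, chosen to match the text's eight-year figure; the charts the text refers to imply a doubling interval somewhat under a year and shrinking over time.

```python
import math

gain = 1000
doublings = math.log2(gain)       # ~9.97 doublings for a thousandfold gain
doubling_period_years = 0.8       # assumed average period (see lead-in above)
print(round(doublings * doubling_period_years, 1))  # ~8.0 years saved
```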

We will also be able to amplify the power of personal computers by harvesting the unused computation power of devices on the Internet. New communication paradigms such as "mesh" computing contemplate treating every device in the network as a node rather than just a "spoke."43 In other words, instead of devices (such as personal computers and PDAs) merely sending information to and from nodes, each device will act as a node itself, sending information to and receiving information from every other device. That will create very robust, self-organizing communication networks. It will also make it easier for computers and other devices to tap unused CPU cycles of the devices in their region of the mesh.

Currently at least 99 percent, if not 99.9 percent, of the computational capacity of all the computers on the Internet lies unused. Effectively harnessing this computation can provide another factor of 10^2 or 10^3 in increased price-performance. For these reasons, it is reasonable to expect human brain capacity, at least in terms of hardware computational capacity, for one thousand dollars by around 2020.

Yet another approach to accelerate the availability of human-level computation in a personal computer is to use transistors in their native "analog" mode. Many of the processes in the human brain are analog, not digital. Although we can emulate analog processes to any desired degree of accuracy with digital computation, we lose several orders of magnitude of efficiency in doing so. A single transistor can multiply two values represented as analog levels; doing so with digital circuits requires thousands of transistors. California Institute of Technology's Carver Mead has been pioneering this concept.44 One disadvantage of Mead's approach is that the engineering design time required for such native analog computing is lengthy, so most researchers developing software to emulate regions of the brain usually prefer the rapid turnaround of software simulations.

Human Memory Capacity. How does computational capacity compare to human memory capacity? It turns out that we arrive at similar time-frame estimates if we look at human memory requirements. The number of "chunks" of knowledge mastered by an expert in a domain is approximately 10^5 for a variety of domains. These chunks represent patterns (such as faces) as well as specific knowledge. For example, a world-class chess master is estimated to have mastered about 100,000 board positions. Shakespeare used 29,000 words but close to 100,000 meanings of those words. Development of expert systems in medicine indicates that humans can master about 100,000 concepts in a domain. If we estimate that this "professional" knowledge represents as little as 1 percent of the overall pattern and knowledge store of a human, we arrive at an estimate of 10^7 chunks.

Based on my own experience in designing systems that can store similar chunks of knowledge in either rule-based expert systems or self-organizing pattern-recognition systems, a reasonable estimate is about 10^6 bits per chunk (pattern or item of knowledge), for a total capacity of 10^13 (10 trillion) bits for a human's functional memory.
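A quick check of the memory estimate, using the figures stated in the text (1 percent expressed as a factor of 100):

```python
expert_chunks = 1e5        # chunks mastered by an expert in a single domain
total_chunks = expert_chunks * 100   # expert knowledge is ~1% of the total store
bits_per_chunk = 1e6
total_bits = total_chunks * bits_per_chunk   # 1e13 bits = 10 trillion
print(f"{total_bits:.0e}")                   # 1e+13
```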

According to the projections from the ITRS road map (see RAM chart on p. 57), we wi