A Short History of EBooks - Part 11

Murray was among the first authors to add a website to his books - a practice that many would soon adopt: "If a book can be web-extended (living partly in cyberspace), then an author can easily update and correct it, whereas otherwise the author would have to wait a long time for the next edition, if indeed a next edition ever came out. (...) I do not know if I will publish books on the web - as opposed to publishing paper books. Probably that will happen when books become multimedia. (I currently am helping develop multimedia learning materials, and it is a form of teaching that I like a lot - blending text, movies, audio, graphics, and - when possible - interactivity)."

He added in August 1999: "In addition to 'web-extending' books, we are now web-extending our multimedia (CD-ROM) products - to update and enrich them."

He wrote in October 2000: "Our company - EDVantage Software - has become an internet company instead of a multimedia (CD-ROM) company. We deliver educational material online to students and teachers."

= The internet as a novel "character"

Alain Bron lives in Paris, France. He is a consultant in information systems and a writer. The internet is one of the "characters" of his second novel, "Sanguine sur toile" (Sanguine on the web), available in print from Editions du Choucas in 1999, and in PDF format from Editions 00h00 in 2000.

Alain wrote in November 1999: "In French, 'toile' means the web as well as the canvas of a painting, and 'sanguine' is the red chalk of a drawing as well as one of the adjectives derived from blood ('sang' in French). But would a love of colors justify a murder? 'Sanguine sur toile' is the strange story of an internet surfer caught up in an upheaval inside his own computer, which is being remotely operated by a very mysterious person whose only aim is revenge. I wanted to take the reader into the worlds of painting and enterprise, which intermingle, escaping and meeting up again in the dazzle of software. The reader is invited to try to untangle for himself the threads twisted by passion alone. To penetrate the mystery, he will have to answer many questions. Even with the world at his fingertips, isn't the internet surfer the loneliest person in the world? In view of the competition, what is the greatest degree of violence possible in an enterprise these days? Does painting tend to reflect the world or does it create another one? I also wanted to show that images are not that peaceful. You can use them to take action, even to kill."

What part does the internet play in his novel? "The internet is a character in itself. Instead of being described in its technical complexity, it is depicted as a character that can be either threatening, kind or amusing. Remember the computer screen has a dual role - displaying as well as concealing. This ambivalence is the theme throughout. In such a game, the big winner is of course the one who knows how to free himself from the machine's grip and put humanism and intelligence before everything else."

= The web and its hyperlinks

Like many artists, Jean-Paul began exploring how hyperlinks could take his writing in new directions. He switched from being a print author to being a hypermedia author, and created "Cotres furtifs" (Furtive Cutters) as a website "telling stories in 3D". He enjoyed the freedom given by online (self-)publishing, and wrote in August 1999: "The internet allows me to do without intermediaries, such as record companies, publishers and distributors. Most of all, it allows me to crystallize what I have in my head: the print medium (desktop publishing, in fact) only allows me to partly do that."

He also insisted on the growing interaction between digital literature and technology. "The future of cyber-literature, techno-literature, digital literature or whatever you want to call it, is set by the technology itself. It is now impossible for an author to handle all by himself the words and their movement and sound. A decade ago, you could know well each of Director, Photoshop or Cubase (to cite just the better known software), using the first version of each. That is not possible any more. Now we have to know how to delegate, find more solid financial partners than Gallimard, and look in the direction of Hachette-Matra, Warner, the Pentagon and Hollywood. At best, the status of multimedia director (?) will be the one of video director, film director, manager of the product. He is the one who receives the golden palms at Cannes, but who would never have been able to earn them just on his own. As twin sister (not a clone) of the cinematograph, cyber-literature (video + the link) will be an industry, with a few isolated craftsmen on the outer edge (and therefore with below-zero copyright)."

Jean-Paul added in June 2004: "Surfing the web is like radiating in all directions (I am interested in something and I click on all the links on a home page) or like jumping around (from one click to another, as the links appear). You can do this in the written media, of course. But the difference is striking. So the internet changed how I write. You don't write the same way for a website as you do for a script or a play.

In fact, it is not the internet which changed how I write, it is the first Mac that I discovered through the self-learning of HyperCard. I still remember how astonished I was during the month when I was learning about buttons, links, surfing by analogies, objects or images. The idea that a simple click on one area of the screen allowed me to open a range of piles of cards, and each card could offer new buttons and each button opened on to a new range, etc. In brief, the learning of everything on the web that today seems really banal, for me it was a revelation (it seems Steve Jobs and his team had the same shock when they discovered the ancestor of the Mac in the laboratories of Rank Xerox). Since then I write directly on the screen: I use the print medium only occasionally, to fix up a text, or to give somebody who is allergic to the screen a kind of photograph, something instantaneous, something approximate.

It is only an approximation, because print forces us to have a linear relationship: the text is developing page after page (most of the time), whereas the technique of links allows another relationship to the time and space of imagination. And, for me, it is above all the opportunity to put into practice this reading/writing 'cycle', whereas leafing through a book gives only an idea - which is vague because the book is not conceived for that."

2005: GOOGLE GETS INTERESTED IN EBOOKS

= [Overview]

The beta version of Google Print went live in May 2005. In October 2004, Google launched the first part of Google Print as a project aimed at publishers, for internet users to be able to see excerpts from their books and order them online. In December 2004, Google launched the second part of Google Print as a project intended for libraries, to build up a world digital library by digitizing the collections of main partner libraries. In August 2005, Google Print was stopped until further notice because of lawsuits filed by associations of authors and publishers for copyright infringement. The program resumed in August 2006 under the new name of Google Books.

Google Books has offered books digitized in the participating libraries (Harvard, Stanford, Michigan, Oxford, California, Virginia, Wisconsin-Madison, Complutense of Madrid and New York Public Library), with either the full text for public domain books or excerpts for copyrighted books. Google settled a lawsuit with associations of authors and publishers in October 2008, with an agreement to be signed in 2009.

= Google Print

In October 2004, Google launched the first part of Google Print as a project aimed at publishers, for internet users to be able to see excerpts from their books and order them online. In December 2004, Google launched the second part of Google Print as a project intended for libraries, to build up a digital library of 15 million books by digitizing the collections of main partner libraries, beginning with the universities of Michigan (7 million books), Harvard, Stanford and Oxford, and the New York Public Library. The planned cost in 2004 was an average of US $10 per book, and a total budget of $150 to $200 million for ten years. The beta version of Google Print went live in May 2005. In August 2005, Google Print was stopped until further notice because of lawsuits filed by associations of authors and publishers for copyright infringement.

= Google Books

The program resumed in August 2006 under the new name of Google Books. Google Books has offered books digitized by Google in the participating libraries, which now included Harvard, Stanford, Michigan, Oxford, California, Virginia, Wisconsin-Madison, Complutense of Madrid and the New York Public Library, with the full text for public domain books and excerpts for copyrighted books. According to media reports, Google was scanning 3,000 books a day.

The inclusion of copyrighted works in Google Books was widely criticized by authors and publishers worldwide. In the U.S., lawsuits were filed by the Authors Guild and the Association of American Publishers (AAP) for alleged copyright infringement.

The assumption was that the full scanning and digitizing of copyrighted books infringed copyright laws, even if only snippets were made freely available. Google replied this was "fair use", referring to short excerpts from copyrighted books that could be lawfully quoted in another book or website, as long as the source (author, title, publisher) was mentioned.

After three years of conflict, Google reached a settlement with the associations of authors and publishers in October 2008, with an agreement to be signed in 2009.

As of December 2008, Google had 24 library partners, including a Swiss one (University Library of Lausanne), a French one (Lyon Municipal Library), a Belgian one (Ghent University Library), a German one (Bavarian State Library), two Spanish ones (National Library of Catalonia and University Complutense of Madrid) and a Japanese one (Keio University Library). The U.S. and British partner libraries were, in alphabetical order: Columbia University, Committee on Institutional Cooperation (CIC), Cornell University Library, Harvard University, New York Public Library, Oxford University, Princeton University, Stanford University, University of California, University of Michigan, University of Texas at Austin, University of Virginia and University of Wisconsin-Madison.

2006: TOWARDS A WORLD PUBLIC DIGITAL LIBRARY

= [Overview]

Conceived by the Internet Archive to offer a universal public digital library, the Open Content Alliance (OCA) was launched in October 2005 as a group of cultural, technology, nonprofit and governmental organizations willing to build a permanent archive of multilingual digitized text and multimedia content.

The project took off in 2006, with the digitization of public domain books around the world. Unlike Google Books, the Open Content Alliance (OCA) has made them searchable through any web search engine, and has not scanned copyrighted books, except when the copyright holder has expressly given permission. The first contributors to the OCA were the University of California, the University of Toronto, the European Archive, the National Archives in the United Kingdom, O'Reilly Media and the Prelinger Archives. The digitized collections are freely available in the Text Archive section of the Internet Archive. In December 2008, one million ebooks were posted under OCA principles by the Internet Archive.

= [In Depth]

The Internet Archive and Yahoo! conceived the Open Content Alliance (OCA) in early 2005 to offer broad public access to world culture. The OCA also wanted to address the shortcomings of the Google Books project, namely its copyright problems and its availability from one search engine only. The OCA was launched with the goal of digitizing only public domain books and making them searchable and downloadable through any search engine.

What exactly is the Internet Archive? Founded in April 1996 by Brewster Kahle, the Internet Archive is a non-profit organization that has built an "internet library" to offer permanent access to historical collections in digital format for researchers, historians and scholars. An archive of the web is stored every two months or so. In late 1999, the Internet Archive started to include more collections of archived webpages on specific topics. It also became an online digital library of text, audio, software, image and video content. In October 2001, with 30 billion stored webpages, the Internet Archive launched the Wayback Machine, for users to be able to surf the archive of the web by date. In 2004, there were 300 terabytes of data, with a growth of 12 terabytes per month.

There were 65 billion pages (from 50 million websites) in 2006 and 85 billion pages in 2008. The Internet Archive now defines itself as "a nonprofit digital library dedicated to providing universal access to human knowledge."

In October 2005, the Internet Archive launched the Open Content Alliance (OCA) with other contributors as a collective effort for "building a digital archive of global content for universal access" (subtitle of the OCA home page) that would be a permanent repository of multilingual text and multimedia content.

As explained on its website in 2007, the OCA "is a collaborative effort of a group of cultural, technology, nonprofit, and governmental organizations from around the world that helps build a permanent archive of multilingual digitized text and multimedia material. An archive of contributed material is available on the Internet Archive website and through Yahoo! and other search engines and sites. The OCA encourages access to and reuse of collections in the archive, while respecting the content owners and contributors."

The project aims at digitizing public domain books around the world and making them searchable through any web search engine and downloadable for free. Unlike Google Books, the OCA scans and digitizes only public domain books, except when the copyright holder has expressly given permission. The first contributors to the OCA were the University of California, the University of Toronto, the European Archive, the National Archives in the United Kingdom, O'Reilly Media and the Prelinger Archives. The digitized collections are freely available in the Text Archive section of the Internet Archive. 100,000 ebooks were publicly available in December 2006 (with 12,000 new ebooks added per month), 200,000 ebooks in May 2007, and one million ebooks in December 2008.

Microsoft has been one of the partners of the OCA, while also developing its own project. The beta version of Live Search Books was released in December 2006, with keyword search for non-copyrighted books digitized by Microsoft in partner libraries. The British Library and the libraries of the universities of California and Toronto were the first ones to join in, followed in January 2007 by the New York Public Library and Cornell University. Books were offered in full-text view and could be downloaded as PDF files. In May 2007, Microsoft announced agreements with several publishers, including Cambridge University Press and McGraw-Hill, for their books to be available in Live Search Books. After digitizing 750,000 books and indexing 80 million journal articles, Microsoft ended the Live Search Books program in May 2008, to focus on other activities, and closed the website. These books are available in the OCA collections of the Internet Archive.

A main issue for digital libraries is the lack of proofreading of digitized books, a step that ensures better accuracy of the text, without any loss from the print version. The only digital library proofreading its books has been Project Gutenberg, with 28,000 high-quality ebooks available in January 2009. Good OCR (Optical Character Recognition) software run on image files - obtained from scanning print pages - is said to ensure 99% accuracy. While proofreading is essential to Project Gutenberg, whose goal is to reach 99.99% accuracy for its ebooks - above the 99.95% accuracy set as the standard for the Library of Congress -, this step is skipped by the Internet Archive, the OCA, Google and many others. Some R&D teams are working on better-quality OCR technology, which means that digital libraries would have to go back to the original image files to provide higher-quality books in the future, if they want to provide digital versions without any loss from the print version.
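To give a sense of what these percentages mean in practice, here is a minimal sketch in Python that converts each accuracy level into an expected number of wrong characters. The three accuracy figures come from the paragraph above. Everything else is an assumption made only for illustration: that accuracy is measured per character, that a page holds 2,000 characters, and that a book has 300 pages.

```python
# Accuracy levels quoted in the text above.
ACCURACY_LEVELS = {
    "OCR only": 0.99,
    "Library of Congress standard": 0.9995,
    "Project Gutenberg goal": 0.9999,
}

# Hypothetical book dimensions, chosen only for illustration.
CHARS_PER_PAGE = 2_000
PAGES_PER_BOOK = 300


def expected_errors(accuracy, characters):
    """Expected number of wrong characters, assuming per-character accuracy."""
    return round((1 - accuracy) * characters, 2)


def error_report(chars_per_page=CHARS_PER_PAGE, pages=PAGES_PER_BOOK):
    """Map each accuracy level to (errors per page, errors per book)."""
    report = {}
    for label, accuracy in ACCURACY_LEVELS.items():
        per_page = expected_errors(accuracy, chars_per_page)
        per_book = expected_errors(accuracy, chars_per_page * pages)
        report[label] = (per_page, per_book)
    return report


if __name__ == "__main__":
    for label, (per_page, per_book) in error_report().items():
        print(f"{label}: about {per_page} errors per page, {per_book} per book")
```

Under these assumptions, 99% accuracy leaves about 20 errors on every page, whereas 99.99% leaves about one error every five pages, which suggests why proofreading matters so much for a library aiming at a faithful text.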

2007: WE READ ON VARIOUS ELECTRONIC DEVICES

= [Overview]

Amazon.com launched its own reading device, the Kindle, in November 2007. In the mid-1990s, people read on their desktop computers before reading on their laptops. The Palm Pilot was launched in March 1996 as the first PDA, and people began reading on PDAs. 23 million Palm Pilots were sold between 1996 and 2002. Its main competitors were the Pocket PC (launched by Microsoft in April 2000) and the PDAs of Hewlett-Packard, Sony, Handspring, Toshiba and Casio. People also began reading on the first smartphones launched by Nokia or Sony Ericsson. Some companies launched dedicated reading devices like the Rocket eBook, the SoftBook Reader, the Gemstar eBook and the Cybook, all models that didn't last long. Better reading devices then emerged, like the Cybook (new version) in 2004, the Sony Reader in 2006 and the Kindle in 2007. LCD screens were replaced by screens using the E Ink technology. The next step should be an ultra-thin flexible display called electronic paper (epaper), launched in 2001 by E Ink, Plastic Logic and others.

= First reading devices

How about a book-sized electronic reader that could store many books at once? From 1998 onwards, some pioneer companies began working on dedicated reading devices, and launched the Rocket eBook (created by NuvoMedia), the EveryBook (created by EveryBook), the SoftBook Reader (created by SoftBook Press), and the Millennium eBook (created by Librius.com).