A Short History of EBooks - Part 6


= Digital libraries

# A definition

What exactly is a digital library? The Universal Library Project, hosted by Carnegie Mellon University, defined it in 1998 as "a digital library of digital documents, artifacts, and records. The advantage of having library material available in digital form is threefold: (1) the content occupies less space and can be replicated and made secure electronically; (2) the content can be made immediately available over the internet to anyone, anywhere; and (3) search for content can be automated.

The promise of the digital library is the promise of great cost reductions while providing great increases in archive availability and accessibility. (...) There are literally thousands of digital library initiatives of a great many varieties going on in the world today. Digital libraries are being formed of scholarly works, archives of historical figures and events, corporate and governmental records, museum collections and religious collections. Some take the form of scanning and putting documents to the World Wide Web. Still other digital libraries are formed of digitizing paintings, films and music. Work even exists in 3D reconstructive digitization that permits a digital deconstruction, storage, transmission, and reconstruction of solid object."

From the mid-1990s onward, libraries studied how to store enormous amounts of data and make them available on the internet through a reliable search engine. Library 2000 was a project run between 1995 and 1998 by the MIT Laboratory for Computer Science (MIT: Massachusetts Institute of Technology) to explore the implications of large-scale online storage, using the digital library of the future as an example. It developed a prototype using the technology and system configurations expected to be economically feasible in 2000.

Another project was the Digital Library Initiative, supported by grants from NSF (National Science Foundation), DARPA (Defense Advanced Research Projects Agency) and NASA (National Aeronautics and Space Administration). As mentioned on its website in 1998: "The Initiative's focus is to dramatically advance the means to collect, store, and organize information in digital forms, and make it available for searching, retrieval, and processing via communication networks - all in user-friendly ways."

The British Library was a pioneer in Europe. Brian Lang, chief executive of the library, explained on its website in 1998: "We do not envisage an exclusively digital library. We are aware that some people feel that digital materials will predominate in libraries of the future. Others anticipate that the impact will be slight. In the context of the British Library, printed books, manuscripts, maps, music, sound recordings and all the other existing materials in the collection will always retain their central importance, and we are committed to continuing to provide, and to improve, access to these in our reading rooms.

The importance of digital materials will, however, increase. We recognize that network infrastructure is at present most strongly developed in the higher education sector, but there are signs that similar facilities will also be available elsewhere, particularly in the industrial and commercial sector, and for public libraries. Our vision of network access encompasses all these."

The Digital Library Programme was expected to begin in 1999.

"The development of the Digital Library will enable the British Library to embrace the digital information age. Digital technology will be used to preserve and extend the Library's unparalleled collection. Access to the collection will become boundless with users from all over the world, at any time, having simple, fast access to digitized materials using computer networks, particularly the internet."

Another pioneer in Europe was the French National Library (BnF: Bibliotheque nationale de France). The BnF launched its digital library Gallica in October 1997 as an experimental project to offer digitized texts and images from print collections relating to French history, life and culture. When interviewed by Jerome Strazzulla in the daily Le Figaro of June 3, 1998, Jean-Pierre Angremy, president of BnF, stated: "We cannot, we will not be able to digitize everything. In the long term, a digital library will only be one element of the whole library."

The first step of the program, a major collection of 19th-century French texts and images, was available online one year later.

# Some projects

In Germany, the Bielefeld University Library (Bibliothek der Universitat Bielefeld) began posting online versions of German rare prints in 1996. Michael Behrens, in charge of the digital library project, wrote in September 1998: "To some here, 'digital library' seems to be everything that, even remotely, has to do with the internet. The library started its own web server some time in summer 1995. (...) Before that, it had been offering most of its services via Telnet, which wasn't used much by patrons, although in theory they could have accessed a lot of material from home. But in those days almost nobody really had internet access at home... We started digitizing rare prints from our own library, and some rare prints which were sent in via library loan, in November 1996. (...)

In that first phase of our attempts at digitization, starting November 1996 and ending June 1997, 38 rare prints were scanned as image files and made available via the web. During the same time, there were also a few digital materials prepared as accompanying material for lectures held at the university (image files as excerpts from printed works). These are, for copyright reasons, not available outside of campus. The next step, which is just being completed, is the digitization of the Berlinische Monatsschrift, a German periodical from the Enlightenment, comprising 58 volumes, and 2,574 articles on 30,626 pages. A somewhat bigger digitization project of German periodicals from the 18th and early 19th century is planned.

The size will be about 1,000,000 pages. These periodicals will be not just from the holdings of this library, but the project would be coordinated here, and some of the technical would be done here, also."

Other digital libraries were created from scratch, with no backing from a traditional library. They were "only" digital. This was the case for Athena in Switzerland, and Projetto Manuzio in Italy.

Athena was founded in 1994 by Pierre Perroud, a Swiss teacher, and hosted on the website of the University of Geneva. Athena was created as a multilingual digital library specializing in philosophy, science, literature, history and economics, either by digitizing documents or by providing links to existing etexts. The Helvetia section provided documents about Switzerland. Geneva being the main city in French-speaking Switzerland, Athena also focused on putting French texts online. A specific page offered an extensive selection of other digital libraries worldwide, with relevant links.

Projetto Manuzio was launched by Liber Liber as a free digital library for texts in Italian. Liber Liber is an Italian cultural association aimed at the promotion of any kind of artistic and intellectual expression. It wanted to link humanities and science by using computer technology in humanities. Projetto Manuzio was named after the famous 16th-century Venetian publisher who improved the printing techniques invented by Gutenberg.

As stated on its website in 1998, Projetto Manuzio wanted "to make a noble idea real: the idea of making culture available to everybody. How? By making books, graduation theses, articles, tales or any other document which could be digitized in a computer available all over the world, at any minute and free of charge. Via modem, or using floppy disks (in this case, by adding the cost of a blank disk and postal fees), it is already possible to get hundreds of books. And Projetto Manuzio needs only a few people to make such a masterpiece as Dante Alighieri's Divina Commedia available to millions of people."

Some "only" digital libraries were organized around an author, for example The Marx/Engels Internet Archive (MEIA). MEIA was created in 1996 to offer a chronology of the collected works of Karl Marx and Frederick Engels, and link this chronology to the digital versions of these works "as one work after another is brought online". As explained on the website in 1998: "There's no way to monetarily profit from this project. 'Tis a labor of love undertaken in the purest communitarian sense. The real 'profit' will hopefully manifest in the form of individual enlightenment through easy access to these classic works.

Besides, transcribing them is an education in itself... Let us also add that this is not a sectarian/One-Great-Truth effort.

Help from any individual or any group is welcome. We have but one slogan: 'Piping Marx & Engels into cyberspace!'"

A search engine was set up for the digital library. "As larger works come online, they will also have small search pages made for them alone - for instance, Capital will have a search page for that work alone."

The Biographical Archive gave access to biographies of Marx and Engels, as well as short biographies and photographs of their family members and friends. The Photo Gallery gathered photos of the Marx and Engels clan from 1839 to 1894, and their dwellings from 1818 to 1895, with "many more to come". The section "Others" included a list of works from all Marxist writers, for example James Connolly, Daniel DeLeon and Hal Draper, as well as a short biography. The Non-English Archive listed the works of Marx and Engels freely available online in other languages (Danish, French, German, Greek, Italian, Japanese, Polish, Portuguese, Spanish and Swedish). It seems that the project was later renamed the Marxists Internet Archive.

= Library treasures go online

Libraries began digitizing their treasures, and putting the digital versions on the web for the world to enjoy. The British Library was a pioneer in this field. One of the first digitized treasures was Beowulf, the earliest known narrative poem in English, and one of the most famous works of Anglo-Saxon poetry. The British Library holds the only known manuscript of Beowulf, dated circa 1000. The poem itself is much older than the manuscript - some historians believe it might have been written circa 750. The manuscript was badly damaged by fire in 1731. 18th-century transcripts mentioned hundreds of words and characters which were then visible along the charred edges, and subsequently crumbled away over the years. To halt this process, each leaf was mounted on a paper frame in 1845.

Scholarly discussions on the date of creation and provenance of the poem continue around the world, and researchers regularly require access to the manuscript. Taking Beowulf out of its display case for study not only raised conservation issues, it also made it unavailable for the many visitors who were coming to the British Library expecting to see this literary treasure on display. Digitization of the manuscript offered a solution to these problems, as well as providing new opportunities for researchers and readers worldwide.

The Electronic Beowulf Project was launched as a database of digital images of the Beowulf manuscript, as well as related manuscripts and printed texts. In 1998, the database included the fiber-optic readings of hidden characters and ultra-violet readings of erased text in the manuscript; the full electronic facsimiles of the 18th-century transcripts of the manuscript; and selections from the main 19th-century collations, editions and translations. Major additions to the database were planned for the following years, such as images of contemporary manuscripts, links to the Toronto Dictionary of Old English Project, and links to the comprehensive Anglo-Saxon bibliographies of the Old English Newsletter.

The database project was developed in partnership with two leading experts in the United States, Kevin Kiernan, from the University of Kentucky, and Paul Szarmach, from the Medieval Institute of Western Michigan University. Professor Kiernan edited the electronic archive and supervised the making of a CD-ROM with the main electronic images.

Brian Lang, chief executive of the British Library, explained on its website in 1998: "The Beowulf manuscript is a unique treasure and imposes on the Library a responsibility to scholars throughout the world. Digital photography offered for the first time the possibility of recording text concealed by early repairs, and a less expensive and safer way of recording readings under special light conditions. It also offers the prospect of using image enhancement technology to settle doubtful readings in the text. Network technology has facilitated direct collaboration with American scholars and makes it possible for scholars around the world to share in these discoveries. Curatorial and computing staff learned a great deal which will inform any future programmes of digitization and network service provision the Library may undertake, and our publishing department is considering the publication of an electronic scholarly edition of Beowulf. This work has not only advanced scholarship; it has also captured the imagination of a wider public, engaging people (through press reports and the availability over computer networks of selected images and text) in the appreciation of one of the primary artefacts of our shared cultural heritage."

Other treasures of the British Library were available online as well: "Magna Carta", the first English constitutional text, signed in 1215, with the Great Seal of King John; the "Lindisfarne Gospels", dated 698; the "Diamond Sutra", dated 868, sometimes referred to as the world's earliest printed book; the "Sforza Hours", dated 1490-1520, an outstanding Renaissance treasure; the "Codex Arundel", a notebook by Leonardo da Vinci from the late 15th or early 16th century; and the "Tyndale New Testament", the first printed version in English, printed by Peter Schoeffer in Worms.

New treasures followed. The digitized version of the Gutenberg Bible was available online in November 2000. Gutenberg printed his Bible in 1454 or 1455 in Germany, perhaps printing 180 copies, with 48 copies still extant in 2000, and three copies - two full ones and one partial one - at the British Library. The two full copies - a little different from each other - were digitized in March 2000 by Japanese experts from Keio University of Tokyo and NTT (Nippon Telegraph and Telephone Communications). The images were then processed to offer a full digital version on the web a few months later.

1999: LIBRARIANS GET DIGITAL

= [Overview]

The job of librarians, which had already changed a lot with computers, went on to change even more with the internet.

Electronic mail became commonplace for internal and external communications. Librarians could subscribe to newsletters and participate in newsgroups and discussion forums. In 1999, librarians were running intranets for their organizations, like Peter Raggett at the OECD Library, or they were running library websites, like Bruno Didier at the Pasteur Institute Library.

Computers made catalogs much easier to handle, as well as library loans and book orders. This was the case for Anissa Rachef at the French Institute in London. Librarians could type bibliographic records into a computer database that sorted book records alphabetically, with search engines for queries by author, title, year and subject. By networking computers, the internet gave a boost to union catalogs for a state, a province, a department, a country or a region, and made interlibrary loan simpler.

= Two experiences

# At the OECD

The OECD Library was among the first in Europe to set up an extensive intranet for the staff of its organization. What is OECD? "The OECD is a club of like-minded countries. It is rich, in that OECD countries produce two thirds of the world's goods and services, but it is not an exclusive club.

Essentially, membership is limited only by a country's commitment to a market economy and a pluralistic democracy. The core of original members has expanded from Europe and North America to include Japan, Australia, New Zealand, Finland, Mexico, the Czech Republic, Hungary, Poland and Korea. And there are many more contacts with the rest of the world through programmes with countries in the former Soviet bloc, Asia, Latin America - contacts which, in some cases, may lead to membership." (excerpt from its website in 1999)

The OECD Central Library serves the OECD staff to support their research work, with more than 60,000 monographs and 2,500 periodicals in early 1999, as well as microfilms and CD-ROMs, and subscriptions to databases like Dialog, Lexis-Nexis and UnCover.

Peter Raggett, deputy-head (and then head) of the Central Library, first worked in government libraries in the United Kingdom before joining the OECD in 1994. An avid internet user since 1996, Peter wrote in August 1999: "At the OECD Library we have collected together several hundred websites and have put links to them on the OECD intranet. They are sorted by subject and each site has a short annotation giving some information about it. The researcher can then see if it is possible that the site contains the desired information. This is adding value to the site references and in this way the Central Library has built up a virtual reference desk on the OECD network. As well as the annotated links, this virtual reference desk contains pages of references to articles, monographs and websites relevant to several projects currently being researched at the OECD, network access to CD-ROMs, and a monthly list of new acquisitions. The Library catalogue will soon be available for searching on the intranet. The reference staff at the OECD Library uses the internet for a good deal of their work. Often an academic working paper will be on the web and will be available for full-text downloading. We are currently investigating supplementing our subscriptions to certain of our periodicals with access to the electronic versions on the internet."

What about finding information on the internet? "The internet has provided researchers with a vast database of information.

The problem for them is to find what they are seeking. Never has the information overload been so obvious as when one tries to find information on a topic by searching the internet. When one uses a search engine like Lycos or AltaVista or a directory like Yahoo!, it soon becomes clear that it can be very difficult to find valuable sites on a given topic. These search mechanisms work well if one is searching for something very precise, such as information on a person who has an unusual name, but they produce a confusing number of references if one is searching for a topic which can be quite broad. Try and search the web for Russia AND transport to find statistics on the use of trains, planes and buses in Russia. The first references you will find are freight-forwarding firms who have business connections with Russia."

How about the future? "The internet is impinging on many peoples' lives, and information managers are the best people to help researchers around the labyrinth. The internet is just in its infancy and we are all going to be witnesses to its growth and refinement. (...) Information managers have a large role to play in searching and arranging the information on the internet. I expect that there will be an expansion in internet use for education and research. This means that libraries will have to create virtual libraries where students can follow a course offered by an institution at the other side of the world. Personally, I see myself becoming more and more a virtual librarian. My clients may not meet me face-to-face but instead will contact me by email, telephone or fax, and I will do the research and send them the results electronically."

# At the Pasteur Institute

"The Pasteur Institutes are exceptional observatories for studying infectious and parasite-borne diseases. They are wedded to the solving of practical public health problems, and hence carry out research programmes which are highly original because of the complementary nature of the investigations carried out: clinical research, epidemiological surveys and basic research work. Just a few examples from the long list of major topics of the Institutes are: malaria, tuberculosis, AIDS, yellow fever, dengue and poliomyelitis." (excerpt from the website in 1999)

Bruno Didier, librarian and webmaster of the library website, explained in August 1999: "The main aim of the Pasteur Institute Library website is to serve the Institute itself and its associated bodies. It supports applications that have become essential in such a big organization: bibliographic databases, cataloging, ordering of documents and of course access to online periodicals (presently more than 100). It is a window for our different departments, at the Institute but also elsewhere in France and abroad. It plays a big part in documentation exchanges with the institutes in the worldwide Pasteur network. I am trying to make it an interlink adapted to our needs for exploration and use of the internet. The website has existed in its present form since 1996 and its audience is steadily increasing. (...) I build and maintain the webpages and monitor them regularly. I am also responsible for training users. The web is an excellent place for training and it is included in most ongoing discussion about that."

How about the future? "Our relationship with both the information and the users is what changes. We are increasingly becoming mediators, and perhaps to a lesser extent 'curators'.

My present activity is typical of this new situation: I am working to provide quick access to information and to create effective means of communication, but I also train people to use these new tools. (...) I think the future of our job is tied to cooperation and use of common resources. It is certainly an old project, but it is really the first time we have had the means to set it up."