"What good is a 5.25 inch floppy disk containing WordStar documents, for example, if you no longer have a computer that can read the file format, let alone a drive that can read the disk?" ~ RLG
One "35mm slide is reckoned to contain 18MB of information." [A single TIFF image can range between 10 and 20 MB.] ~ Susan Jane Williams
"Digital resources give smaller libraries a chance to level the playing field." ~ Bruce Newell
"The Library of Congress receives over 2 million requests a day for digital files, compared to 2 million requests per year for items to be delivered to readers in its rooms. According to a recent report, there are now 50 million historical documents posted on the web by the National Archives alone." ~ Peter Kaufman
"There was very little trust in the print medium when it was first developed -- it was seen as unstable and subject to piracy and fraudulent copying." ~ Donald Waters
The International Coalition of Library Consortia created a useful statement on best practices in the selection and purchase of digital information.
Washington State Library [although it may not exist much longer] has a useful site devoted to planning a digital project called Digital Best Practices.
Yale University Library sponsors the LibLicense website, which contains much useful information. An interesting model license is also available from John Cox Associates.
IMLS has created a "Framework of Guidance for Building Good Digital Collections."
RLG and NPO have created "Guidelines for Digital Imaging: Guidance for Selecting Materials for Digitization."
The Cornell University Library has created a useful digital imaging tutorial on selecting items for digitization.
The Digital Library Federation maintains a database of digital library materials.
Columbia University Libraries has a policy with criteria for digital imaging.
The Open Archives Initiative [OAI] promotes interoperability standards to allow digital collections to be freely shared. The Research Libraries Group maintains a website for those working with the OAIS model.
A born-digital item is one that was first created as a digital file.
A digital object is an item created in or reformatted into a digital format. Digital collections consist of objects. An object might be a book, an artifact, or a paper. An object may be a master (preservation) copy or a use copy. If the object has been reformatted, it should be a true and complete copy of the original. Best practices for objects are available:
A digital collection is "a selected and organized set of digital materials (objects) along with the metadata that describes them and at least one interface that gives access to them." There is rapid growth in digital collections, often with digitized source material and "some fairly simple access tools." Interpretation and presentation are less well developed than the digitizing of cultural heritage materials.
A digital library is a library whose collections consist of digital objects. It is not a mixed format collection. The Digital Library Federation provides a definition:
"Organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities." Digital libraries focus on audience and understanding. A library contains resources that allow users to find needed content and place that content in perspective via linkage to other content.
An electronic library is the same thing as a digital library. A virtual library is a library that contains both local digital collections and access to distant digital collections in such a way that access is seamless and the user is unaware of which items are local and which are distant.
A hybrid library is one that provides "seamless access to integrated print, digital, local, and remote resources."
An institutional digital repository is a "set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members."
Metadata is the "sum total of what one can say about any information object at any level of aggregation." Metadata has three aspects:
Metadata provides access to digital objects and should increase access and use of digital materials.
"The first major acknowledgement of the importance of digital libraries came in a 1994 announcement that $24.4 million of U.S. federal funds would be dispersed among six universities for 'digital library' research." These projects were more focused on IT issues, problems and solutions than on developing useful collections. Some call them "computer science experiments." In late 1998, digital library funding included the need for some elements of traditional libraries, including linkage to real users and custodianship. Since then, both museums and research libraries have begun digitizing special collections and making these available via the WWW. Metadata standards have received increasing attention as we move from digital collections to digital libraries.
Digital collections offer both opportunities and threats. The major opportunity is to provide considerably increased access to a much wider variety of items. For example, the International Children's Digital Library, developed by the University of Maryland and the Internet Archive, provides access to over 200 books from 27 cultures and expects to host 10,000 books from over 100 countries. Another major opportunity is to allow items to "talk to each other." For example, a geographical reference in a historical study could link directly to a map. A reference to a musical composition could link directly to an audio file. Portable digital libraries also become possible, with individual libraries housed on an iPod-like device or a PDA for viewing or listening. Amazon.com's ability to search the text of books presents another opportunity -- to find chunks of content formerly invisible in works. Increasingly, the library is a portal and not a place.
The major threat is the increasing cost of providing this access while continuing to support needed print collections. A minor threat is the growing need for information professionals with new skills and tolerance of increased ambiguity and change.
There is "growing and persistent demand for more and more digital content."
The principles traditionally applied to collection development and management work well here. Quality, relevancy, cost, and audience remain critical.
Selecting digital items places more emphasis on functionality, technical requirements, licensing terms, and service requirements. Still, many principles here are similar to those used traditionally. For example, IMLS, in its collections principles, says:
As with principles, increasing attention has been given to best practices. Sometimes these are really principles, and other times they reflect the best procedure for doing something.
There are two major differences. With commercial items, licensing creates a variety of problems, especially since the information agency must purchase limited access for a short time period. With items free of copyright or where the copyright is held by the agency digitizing, the collection developer has assumed the role of publisher, making decisions about content and format.
Developing and managing digital collections requires considerably more time and effort, especially with a variety of licenses that vary notably in what is allowed and restricted. Users must be informed of license restrictions and the agency is responsible for ensuring that license requirements are met. More involvement with legal oversight is needed.
Historically, libraries were responsible for preserving published content. Publishers are reluctant to allow this today even when licensing prevents libraries from providing proper archival access. JSTOR is an example of a project where publishers grant perpetual rights to periodical issues so that they may be archived. To some, digital items appear and disappear with "alarming rapidity" as publishers decide to remove titles from aggregators' collections. With purchased hard copy materials, libraries handled preservation. With leased digital copies, libraries depend on external for-profit firms to preserve material so that it will be available in the future. However, many of these firms show little interest in preserving items unless they are popular and continue to sell well.
Selecting bundles of periodicals or books rather than individual items is a convenience but it may create relatively homogeneous collections and lessen access to less visible items that may be useful in the future.
Originally, the hope was that digital collections would reduce the cost of collection development. So far, this has not happened. Digital collections do dramatically reduce the need for storage space [including storing, shelving, retrieving] as well as the need for processing. However, the costs of acquiring material and maintaining the infrastructure needed for user access have been greater than expected.
Many digital items and collections may be purchased by the drink. This means that access is purchased on a pay-per-view basis for individual articles or chunks of intellectual content. The traditional approach might be called by the kitchen sink. In this case, as in a periodical subscription, you purchase all of the intellectual content even if no one uses particular items.
The most popular digital materials and collections in most information agencies are periodicals and full-text databases of periodical articles. Managing e-periodical collections is more complex and more labor-intensive because of licensing, negotiation, and the need to find the additional monies needed to pay for these collections. "Pricing of digital products is chaotic and arbitrary." Preservation and archiving remain serious problems.
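The choice between buying by the drink and by the kitchen sink can be sketched as a break-even calculation. The short Python example below is only an illustration: the function names and all of the dollar figures are hypothetical, and real license pricing is rarely this simple.

```python
def break_even_uses(subscription_cost: float, price_per_use: float) -> float:
    """Number of paid uses at which pay-per-view costs as much as a subscription."""
    if price_per_use <= 0:
        raise ValueError("price_per_use must be positive")
    return subscription_cost / price_per_use


def cheaper_model(subscription_cost: float, price_per_use: float,
                  expected_uses: int) -> str:
    """Return the cheaper purchasing model for an expected level of use."""
    pay_per_use_total = price_per_use * expected_uses
    if pay_per_use_total < subscription_cost:
        return "by the drink"
    return "subscription"


# Hypothetical figures: a $1,200 annual subscription versus $15 per article.
print(break_even_uses(1200.0, 15.0))
print(cheaper_model(1200.0, 15.0, expected_uses=50))
print(cheaper_model(1200.0, 15.0, expected_uses=100))
```

Under these assumed figures, a title expected to draw fewer uses than the break-even point is cheaper by the drink, while a heavily used title is cheaper by subscription.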
Increasingly, publishers "bundle" their several periodical titles. This "big deal" provides lower cost access to the bundle, but most information agencies will not want or need all of the titles in the bundle. Because of the cost of bundles, much of this purchasing is done by cooperative purchasing groups (consortia). While this makes access more affordable, it reduces local control and makes selection and retention decisions more difficult. Collection developers no longer select and weed particular titles to shape the collection. Bundles also create problems with duplication since the same title may appear in different products leased by the library.
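The duplication problem can be checked mechanically when title lists for each leased product are available. The Python sketch below assumes each product's title list has already been loaded into a set; the product names and titles are invented for illustration.

```python
from collections import defaultdict


def find_duplicate_titles(bundles: dict) -> dict:
    """Map each title held in more than one product to the products that carry it."""
    products_by_title = defaultdict(list)
    for product, titles in bundles.items():
        for title in titles:
            products_by_title[title].append(product)
    # Keep only titles that appear in two or more products.
    return {
        title: sorted(products)
        for title, products in products_by_title.items()
        if len(products) > 1
    }


# Hypothetical bundles leased by one library.
bundles = {
    "Aggregator A": {"Journal of Examples", "Library Studies Review"},
    "Publisher B Big Deal": {"Library Studies Review", "Archive Quarterly"},
}
print(find_duplicate_titles(bundles))
```

A report like this shows where the library is paying twice for the same title, although it says nothing about differences in coverage dates or completeness between the two copies.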
The quality of digitized periodicals may vary. As publishers remove titles from aggregators, full-text collections become less complete. Incomplete content is frequently a problem, as is the quality of graphics.
There is no single source for digital periodicals and collections. Interfaces and searching procedures vary from one source to another, which complicates user instruction and use.
Because many information agencies have funding problems, publishers and vendors increasingly offer use-based pricing [by the drink] in contrast to pricing based upon the number of potential users.
Intellectual and physical access has improved with citations that are tied to the appropriate full text so that a click takes the reader to the cited source. SFX is perhaps the best known tool to accomplish this. Such service requires expanded collections and agreements among publishers and vendors.
Digital books have been notably successful in the reference area both as stand-alone items such as encyclopedias and as collections. OCLC's netLibrary is the best known example of a service providing library users with access to books of some academic interest. E-books have been slow to take off and are still in a preliminary stage.
The shift to digital resources is well established and is "proceeding inexorably." Critics note that library directors have already lost control of their collections even though they may not realize it. Note, however, that different disciplines are adapting differently to digital products, so the rate of change varies. The regular, sometimes dramatic increases in subscription costs and some lack of flexibility by publishers are creating another wave of periodical cancellations.
Open-Access archives are being established, sometimes with commercial publisher cooperation, for the deposit of research papers. While recently published items may not be available, at least there is public access for items one or more years after publication and the possibility of preservation.
There are two notable reasons for creating a digital collection:
There is some question about the viability of digital preservation, especially with the need for refreshing. However, some research institutions are creating "preservation quality" digital masters. Here, the information agency is most likely to digitize all or part of a collection, usually a special one.
Extending the reach of the collection includes both increasing access for distant as well as local users and making the collection itself more useful by adding interpretation and showing relationships between various items. A collection integrates content that otherwise might be somewhat scattered.
Although part of extending the reach of an existing analog collection, image building is clearly an important aspect of this reason. Digitization associated with an exhibit or an anniversary is often more tightly related to image and visibility enhancement.
Ideally, the selection of items to be included in a collection or the selection of a collection for digitization should involve thoughtful evaluation of:
Without this evaluation, collections are digitized with the "field of dreams" development philosophy, i.e. "build it and they will come."
Donald Waters has identified three cost barriers to digitization:
Nearly all digitized collections are special ones with particular interest in primary source materials. There is little interest in creating such collections based on general collections. Obviously, copyright inhibits digitization, as does the size of these collections. This is also the area where existing publishers are most likely to create collections and make them available.
Collections of items with images or visual appeal are much more likely to be digitized than those that are textual. Visual materials are more attractive and more interesting. They also do not require OCR or text encoding.
"There are no absolute rules for creating good collections, objects or metadata. ... The key to a successful project is not to follow any particular path, but to think strategically and make wise choices." Here, the information professional needs to think like a publisher. That is to say, the collection developer must consider:
Typically, about one-third of the costs are related to digital conversion, one-third for cataloging and descriptive metadata, and one-third for administration, quality control, and the like.
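The rule of thumb above can be turned into a rough planning calculation. The Python sketch below simply applies the one-third split; the function names and the budget figures are hypothetical, and actual projects will depart from an even split.

```python
def split_digitization_budget(total: float) -> dict:
    """Divide a project budget into the three roughly equal cost areas."""
    share = total / 3
    return {
        "digital conversion": share,
        "cataloging and descriptive metadata": share,
        "administration and quality control": share,
    }


def estimate_total_from_conversion(conversion_cost: float) -> float:
    """Estimate the whole project cost when only the conversion quote is known."""
    return conversion_cost * 3


# Hypothetical figures: a $90,000 budget, and a separate $10,000 scanning quote.
print(split_digitization_budget(90000.0))
print(estimate_total_from_conversion(10000.0))
```

The second function reflects the practical point of the rule: a vendor quote for scanning alone is likely to cover only about a third of what the project will cost.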
Another consideration for the "publisher" is whether the collection should be part of a cooperative initiative or an independent one. Cooperative projects, especially if they involve well-known and experienced information agencies, are most likely to be successful.
When creating digital collections, the collection developer or collection manager is responsible for:
Ownership of intellectual property is a continuing problem. Permission must be obtained and legal advice followed if there is any question about material that may not be in the public domain or owned by the information agency.
Best practices are solidifying, so it is easier than before to identify and follow particular steps in creating a digital collection. Note that best practice is found both in not-for-profit agencies and for-profit firms since the digitization may be outsourced rather than being done in-house.
Preservation or "future-proofing" will vary according to several variables:
The key notion for preservation of digital objects is the digital repository. Such a repository is a permanent home for digital objects and collections. The best-known model for creating and maintaining such a repository is the Open Archival Information System Reference Model [OAIS]. OAIS is a conceptual framework for preserving digital information. This model begins with a digital object and data about that object. This package is sent to the repository, which creates an archival information package [AIP]. The package is then stored so that information about the object may be retrieved as well as the object itself. The retrieved package, called the dissemination information package [DIP], will contain the object and whatever software is needed to view or use the object. A well-received handbook on the Preservation Management of Digital Materials is most helpful.
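The flow of packages through an OAIS-style repository can be sketched in a few lines of Python. This is a toy illustration, not an implementation of the reference model: the class and field names are invented, the incoming package is labeled with the OAIS term "submission information package" (which the description above does not name), the SHA-256 checksum is an added assumption standing in for integrity checking, and the "viewer hint" is only a stand-in for the software needed to use the object.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class SubmissionPackage:
    """The digital object plus data about it, as sent to the repository."""
    object_id: str
    content: bytes
    metadata: dict


@dataclass
class ArchivalPackage:
    """What the repository stores (AIP); the checksum is an assumed addition."""
    object_id: str
    content: bytes
    metadata: dict
    checksum: str


@dataclass
class DisseminationPackage:
    """What a user retrieves (DIP): the object plus a hint on how to view it."""
    object_id: str
    content: bytes
    metadata: dict
    viewer_hint: str


class Repository:
    def __init__(self) -> None:
        self._store = {}

    def ingest(self, sip: SubmissionPackage) -> ArchivalPackage:
        """Create and store an archival package from a submitted package."""
        checksum = hashlib.sha256(sip.content).hexdigest()
        aip = ArchivalPackage(sip.object_id, sip.content, dict(sip.metadata), checksum)
        self._store[sip.object_id] = aip
        return aip

    def disseminate(self, object_id: str) -> DisseminationPackage:
        """Verify the stored object and hand back a dissemination package."""
        aip = self._store[object_id]
        if hashlib.sha256(aip.content).hexdigest() != aip.checksum:
            raise ValueError("stored object no longer matches its checksum")
        viewer_hint = aip.metadata.get("format", "unknown")
        return DisseminationPackage(aip.object_id, aip.content, aip.metadata, viewer_hint)


# Hypothetical object: a scanned slide stored as a TIFF.
repo = Repository()
repo.ingest(SubmissionPackage("slide-001", b"tiff bytes", {"format": "image/tiff"}))
dip = repo.disseminate("slide-001")
print(dip.object_id, dip.viewer_hint)
```

The point of the sketch is the separation of roles: what is submitted, what is stored, and what is delivered are three distinct packages, so the repository can change storage formats or delivery tools without altering the original submission.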
What are the major differences between purchasing intellectual content and renting it? Focus especially on costs.
Are most libraries likely to hold items worthy of digitization consideration?
With digitization, "collection development" takes on a new meaning. What might be involved in "developing a digital collection"?
Preservation is a major issue with digital collections. What initiatives might be taken to ensure that digital materials are still usable twenty-five years from now?
