IS 590s: Issues and Trends Affecting STM Information Provision


Meeting 9: Gray Literature



"Isn't the Web just a huge pile of gray literature?" ~ Andrew K. Pace

"A major benefit of reading gray literature, which seems counter-intuitive, is that it is more likely to report studies with non-significant results than peer-reviewed literature."


Definition and value

Define

Although gray [Europeans prefer "grey"] literature has existed since the beginning of scholarly communication in some form or another, the term itself became popular in the 1970s. Professional recognition of the importance of gray literature and problems with intellectual and physical access is relatively recent.

Historians of science note that for many years scientific communication was gray literature reporting on research in progress via letters, notebooks, and the like. Often, this was done in a personal and informal manner. With the growth of industrialization and the creation of industrial research laboratories, research documentation and results began to appear in house periodicals such as the Bell Laboratories Record.

Gray literature is not limited by format and may include any format that is accessible to users. "White" literature is the conventionally published literature found in periodicals and books issued by those in the publishing business. Sometimes, the same content appears in both as with the "gray" pre-print and the "white" peer-reviewed periodical article.

Gray literature is usually not "scholarly" because it is not peer-reviewed; however, it is often specifically focused on research issues and is of scholarly interest.

How would you define "scholarly literature"?

"Gray literature is the body of reports, studies, surveys, workshop proceedings, etc. often produced by local government agencies, private organizations and educational facilities, which have not been [peer] reviewed and published in journals or other standard publications and thus are not widely available for study."

Gray literature is literature not issued or distributed by commercial publishers; that is, publishing is not the primary function of the issuing organization.

Does this definition seem reasonable?

While gray literature appears in many formats and for many subjects, reports [sometimes called technical reports] are the most popular form. Other forms frequently mentioned include:

Can you connect each of these categories to some aspect of scientific and technical research and development?

Gray [also Grey] literature is also called "fugitive," "documentary," "non-conventional," and "ephemeral" because it is relatively invisible. In many libraries in the past, gray literature was placed in the vertical file. For example, free travel literature by tourist agencies and transportation firms would be placed in a file folder under "New Mexico: travel" or some such. Science fair project pamphlets would be similarly filed.

Why would libraries not catalog, classify, and add subject headings to gray literature items and then put them on the shelf with other items in the same classification?

Value

Gray literature is valuable for several reasons:

"These documents often contain valuable and unique information that is not found elsewhere. The result is that a large pool of scientific and economic information is seldom accessed by the research community [Pacific Island Gray Literature Project]." Further, these reports are not integrated into the known body of scientific literature. Thus, valuable data and results remain largely unknown.

Which of the rationales for collecting and providing meaningful access to grey literature is most persuasive?

Barriers

Gray literature is often issued in limited quantities because it is intended for a small audience. Because it is often published for an internal and/or a narrowly specialized external audience, bibliographic control is weak or non-existent. Thus, much of the gray literature remains invisible to those who would find it useful. Similarly, acquisition is problematic, although the appearance of many more reports on the web promises dramatically increased visibility. Libraries and other information agencies holding reports often do not provide full intellectual access. Instead, reports are often filed under corporate author and as a numbered series with no access to a particular item. Thus, gray literature is often valuable but not valued because users are unaware of its value.

Relatively few gray literature items are cited in published research articles, perhaps five to seven percent of the total number of citations. That may be because few items are found and used or because of the notion that these are informal and need not be cited. Conference proceedings, newsletters, and government publications/documents are gray literature categories more likely to be found in libraries and cited in STM publications.

In the past, many information professionals considered gray literature to be on the margins. However, the ease of creating digital content and making it available via the web suggests that in the future gray literature will be the dominant form of literature. This may be less likely in science than in humanities/fine arts and the social sciences because of the emphasis placed on peer review and the traditional publications. Note too that digital web-based gray literature is globally accessible at low cost and is quickly produced and distributed. Quality remains the main concern [but peer-review need not be limited to commercial distribution] and infoglut creates the need for some sort of filtering. Preservation is a secondary concern. Needless to say, the publishing community is not too enthusiastic about the rapid growth of gray literature.

Will the web and Google solve the access problems associated with gray literature and make it visible, used, and less ephemeral?

GreyNet is the Grey Literature Network Service, which provides access via its archives to reports and articles [not free] on issues associated with GreyLit and hosts an annual conference. There is also a peer-reviewed periodical, the International Journal on Grey Literature.

Corporate authors

Most of the STM gray literature has both corporate and individual authors. With externally funded research, the intellectual property rights may rest with the funding agency. In a corporate environment, the scientists do "work for hire." Much of the gray literature of interest to academics and other researchers is produced by research institutes and centers. For example, the Scripps Research Institute is actively involved in cancer research and publishes a notable annual scientific report. Most researchers will be familiar with the institutes and centers in their research area, but may need help finding research content in adjacent disciplines.

Agencies issuing or responsible for gray literature include:

Often, the key to finding gray literature is the ability to identify research organizations likely to focus on a particular topic.

Technical reports

Taken broadly, gray literature covers a very wide range of content, since it includes any item that is not issued by a commercial publisher or found in the traditional content distribution system. However, for many information professionals, gray literature and reports are the same.

For those in STM, the technical report [a formal report that describes research or other significant developments in a field of the applied sciences and gives details of the investigation and results of a scientific problem] is nearly synonymous with gray literature. "Technical reports describe the progress or results of scientific or technical research and development. These include national or international reports by university departments, institutes, private industry, or government agencies and laboratories." For the U.S. government, technical reports are associated with a variety of initiatives, but military and defense needs created "big science," and each project created a multitude of reports. Sometimes, security needs prevented reports from public circulation.

Often, technical reports are aimed at a very specific internal audience and perhaps a selective and quite specialized external audience. Technical reports are associated with both internally funded and externally funded research and development, especially since the reports are often produced to satisfy accountability requirements.

What do you see as the major difference between a technical report and a peer-reviewed periodical article if both were done reporting the same research?

Technical reports typically have both corporate authors [the agency where the work for hire was done] and individual authors [the PI and others primarily responsible]. In general, government funded research reports are more easily found than corporate funded research reports although that may be less true as more government funded research is done by contractors who may claim rights to the intellectual property.

Since technical reports are often not captured by the usual bibliographic control tools, they are usually housed/arranged by corporate author and then the technical report number since most reports are part of a continuing series. Unhappily, different agencies have different numbering systems. Some item numbers are long and not at all intuitive. Still, common elements include:

Here is an example of the elements in a technical report record found via OSTI; the record itself is held by DTIC.

........................................................................

Accession Number: ADA445773
Full Text (pdf) Availability: Handle / proxy Url: http://handle.dtic.mil/100.2/ADA445773
Citation Status: ACTIVE
Title: Railroad Generalship: Foundations of Civil War Strategy
Fields and Groups : 130600 - SURFACE TRANSPORTATION AND EQUIPMENT
Corporate Author: ARMY COMMAND AND GENERAL STAFF COLL FORT LEAVENWORTH KS COMBAT STUDIES INST
Personal Author(s) : Gabel, Christopher R.
Report Date: 1997
Media Count: 33 Pages(s)
Organization Type: A - ARMY
Report Number(s): XACSI (XACSI)
Monitor Acronym(s): XA (XA)
Monitor Series: CSI (CSI)
Descriptors: *RAIL TRANSPORTATION, MILITARY OPERATIONS, MILITARY HISTORY, BATTLEFIELDS, CIVIL WAR(UNITED STATES), ARMY PERSONNEL, MILITARY STRATEGY
Abstract: Since the dawn of history, military strategy has been dominated by the inexorable calculus of logistics: distance, time, transport capacity, and consumption. For thousands of years, every army that waged war relied upon the muscles of its men and animals to carry it across the countryside. It is sobering to consider that, up until 1830, every soldier that ever went into battle got there on his own feet or by the efforts of an animal. Every weapon, every round of ammunition, every pound of food eaten by an army, every tent peg, and every bandage reached the battlefield by muscle power. The only exceptions were those resources transported by water and those extracted from the countryside. Ironically, the armies with the largest contingents of draft animals for their supply trains also faced the most difficult logistical challenges: each of the animals pulling a supply wagon had to eat too, which meant that even more wagons and animals were needed to carry food for the animals hauling supplies for the fighting troops. Naturally, one then needed animals to carry fodder for the animals carrying fodder. This pattern of diminishing returns compounded dramatically the farther an army got from its supply base. Typically, food for animals constituted more than half of an army's supply requirement. Under the best of circumstances, an army relying exclusively on muscle-power transport could carry a maximum of about ten days' worth of supplies. No wonder that armies of the preindustrial age were so often hungry, ragged, and exhausted, spending far more time scouring the countryside for food than they did fighting the enemy.
Distribution Limitation(s): 01 - APPROVED FOR PUBLIC RELEASE
Source Code: 414305
Document Location: DTIC
Geopolitical Code: 2002
Distribution Statement: Approved for public release; distribution is unlimited.
Citation Created: 10 MAY 2006

.........................................................................
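
A record like the one above is essentially a list of labeled fields, which makes it easy to process by machine. The following is only a minimal sketch, assuming each field sits on its own line in "Label: value" form; the function name parse_record and the shortened RECORD sample are hypothetical and used for illustration only, not taken from any OSTI or DTIC tool.

```python
# Hypothetical sketch: turn a "Label: value" report record into a dictionary.
# RECORD reuses a few fields from the example record; parse_record is an
# illustrative name, not part of any agency's software.
RECORD = """\
Accession Number: ADA445773
Title: Railroad Generalship: Foundations of Civil War Strategy
Corporate Author: ARMY COMMAND AND GENERAL STAFF COLL FORT LEAVENWORTH KS COMBAT STUDIES INST
Personal Author(s) : Gabel, Christopher R.
Report Date: 1997
Report Number(s): XACSI (XACSI)
"""


def parse_record(text):
    """Split each line at its first colon into a label and a value."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip separator or blank lines
        label, value = line.split(":", 1)  # a title may contain more colons
        fields[label.strip()] = value.strip()
    return fields


record = parse_record(RECORD)

# Reports are usually arranged by corporate author, then report number.
shelf_key = (record["Corporate Author"], record["Report Number(s)"])
print(record["Title"])
print(shelf_key)
```

Splitting only at the first colon keeps a title with its own colon intact, and the pair of corporate author and report number mirrors the filing order described earlier.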

Does this example seem very different from the elements you would find in a library catalog?

As you become more familiar with particular corporate authors, you will learn that their technical reports series have a unique code. For example:

Databases

Because databases are now nearly all digital, it is relatively easy to provide access to them via the web. Most gray literature produced in the past few years is born digital and placed in a database so that individual items can easily be placed on a server and accessed via a website. Depending on whether or not the database content is available to Google or other firms with searchbots, this literature may be easily available to those who know enough to be able to find it.

Some databases remain part of the "deep web" [The deep Web is the part of the Internet that is inaccessible to conventional search engines, and consequently, to most users. According to researcher Marcus P. Zillman of DeepWebResearch.info, as of January 2006, the deep Web contained somewhere in the vicinity of 900 billion pages of information. In contrast, Google, the largest search engine, had indexed just 25 billion pages. Deep Web content includes information in private databases that are accessible over the Internet but not intended to be crawled by search engines] and are thus hidden from all but internal or local users. At one time, PDF files on the web were largely invisible, but Google substantially improved its ability to identify and provide access to PDF documents/publications, including providing an HTML alternative to many.
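
To put the two page counts quoted above side by side, here is a small arithmetic sketch. It uses only the January 2006 estimates given in the text; the variable names are illustrative, and the share calculation assumes, purely for illustration, that the deep web pages and the pages Google indexed do not overlap.

```python
# Estimates quoted in the text (January 2006); variable names are illustrative.
DEEP_WEB_PAGES = 900_000_000_000       # about 900 billion pages
GOOGLE_INDEXED_PAGES = 25_000_000_000  # about 25 billion pages

# How many times larger the deep web estimate is than Google's index.
ratio = DEEP_WEB_PAGES / GOOGLE_INDEXED_PAGES

# Assumption: the two sets are disjoint, so the total is their sum.
indexed_share = GOOGLE_INDEXED_PAGES / (DEEP_WEB_PAGES + GOOGLE_INDEXED_PAGES)

print(f"Deep web estimate is {ratio:.0f} times the size of Google's index")
print(f"Indexed share under the no-overlap assumption: {indexed_share:.1%}")
```

On these figures, a searcher relying on Google alone would have seen only a few percent of what was online, which is why report collections held in unindexed databases were so easy to miss.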

Why is the deep web problematic for technical reports?

Some argue that search engines will increasingly take the place of traditional databases. For example, science.gov might replace a wide variety of subject and topic specific databases just as some suggest that with a good desktop search engine individual users need no longer worry about placing documents in particular folders. Google mail is a good example with its tagging and very few folders.

Does this seem reasonable?

Although heavily focused on energy, OSTI [the DOE Office of Scientific and Technical Information] is a, if not the, major provider of access to U.S. federal government ST reports via its major databases. Anyone providing ST information should spend time on the OSTI website becoming quite familiar with the services and collections provided, including these databases:

Websites

There are a growing number of websites that provide access to technical reports issued in the U.S. There are also equivalent websites for other countries and continents. Here is a short list of examples that you need to be familiar with.

Besides these, searching "technical information center" in science.gov will yield more specific report collections. As more academic and corporate information centers develop institutional repositories, more gray literature will appear on institutional websites, often those associated with research centers or institutes. Knowing the institutional affiliation of the primary author will often lead to documentation.

Google Scholar is also useful for finding grey literature.

Bibliographic control

Common barriers to intellectual and physical access include:

Each of the major STM agencies in the U.S. federal government hosts substantial and reasonably comprehensive report databases. At the same time, a growing number of reports are available as PDF files on agency or department websites. Still, the National Technical Information Service [NTIS] is charged with capturing and preserving access to U.S. government technical reports from its collection of more than two million items. As a cost-recovery operation, NTIS needs to charge for its products and services since it receives minimal funding from Congress. Optical scanning of technical reports began in 1997. Publications of less than five pages are free while the rest begin at a fee of $8.95 each. Besides downloadable PDF files, items may also be delivered on CD at a lower price. Note that the Government Printing Office typically distributed relatively few STM reports, so the depository library program has done poorly in this area. Although less demanded, NTIS is also charged with capturing and preserving all audio-video products created by federal agencies. Data files are also an important component of their collections. Cambridge Scientific Abstracts includes NTIS publications from 1964.

What is a cost-recovery model of information provision?


Last major revision: March 2007.
