Evaluating Collections


Focus:

Efficiency, Effectiveness, Equity
Standards
Activity Measures
Performance Measures
Check listing
Cost Measures
Expert Inspection
Outcome/Benefit Measures
Problems


Quotes

"The pertinent question for managers is not how to do things right but how to find the right things to do." ~ Peter Drucker

"For many librarians, project directors, staff, and board members, 'evaluation' is thought of as an unfortunate requirement that takes time and money away from your services and activities you really want to do. Evaluation often is negatively associated with 'accountability' and 'monitoring' because its impetus comes from a requirement established by the governing or funding agent." ~ Susan G. Hanson

"Traditionally, the size and variety of its collections were the main measures of the excellence of an academic library. Because scarcity was the problem -- it was a question of acquiring one of the few copies known to exist of any title -- the notion of too many books was inconceivable." ~ Ian Winkworth

"Librarians also need to be concerned about eyeballs. That means paying attention to the customers, especially the loyal ones. In the final analysis, while the library's service may inform, educate, or entertain its users, these are not the goals of the library, but the conditions that they enhance. The goal is not even to produce quality service, per se. The library's basic goal is to produce satisfied and loyal customers who will keep coming back." ~ Ellen Altman

"Information is a ubiquitous commodity in the digital age, and librarians no longer have a singular claim on it." ~ Joanne Gard Marshall

"Since the economic downturn, many state libraries, public libraries, and even university libraries have had their budgets slashed dramatically. ... Now, some cuts are inevitable for any public agency when state revenue goes down and the budget has to be balanced, and libraries are less critical than some other government services, but it's rare for any agency to be eliminated. I was horrified, but not surprised. The fact that it's happening to libraries is the clearest sign yet that the work librarians do is invisible to the people who make funding decisions. I believe we have failed to demonstrate the value we are returning for public funds."

"I suspect we've gotten complacent, assuming that of course any town/state/corporation/university has to have a library. We've assumed that doing a good job was enough to win public support. We've believed so completely that libraries are the heart of our communities that we haven't recognized how many people don't share that belief and don't much want to spend good money supporting ANY public institution."

"Instead of assuming automatic public support, I think we need to earn it. Not just be quietly doing good work, but by showing how our work improves the lives of our communities and our users." ~ Marylaine Block

Evaluation Type

Performance measurement informs those who fund collections of the value received for their money.

Needs assessment focuses on what users need, the degree to which those needs are met, and what should be done to better meet needs in the future. Community analysis is a form of needs assessment.

Formative evaluation focuses on particular programs, their effectiveness, efficiency, and how they might be improved. Evaluating a reader guidance program would be an example of formative evaluation.

Outcome and impact evaluation focuses on what happens to users as a result of exposure to a service or collection. For example, children using a well developed collection of books and periodicals create better assignment responses.

Summative evaluation focuses on accountability and decision-making about a program based on its quality and impact. Evaluating a leased popular fiction collection as part of a decision on whether or not to continue it would be an example of summative evaluation.

Mission, Goals and Objectives

Accountability requires agreed upon measures thoroughly based on the mission, goals, and objectives of the parent institution. The objectives of the information agency must be clearly linked with the objectives of the parent institution and as concrete as the objectives of that institution. Thus,

The major failing of collection evaluation has been that it rarely relates collection use specifically to the objectives and outputs of the parent institution. We routinely collect data on inputs and outputs not directly linked to institutional success.

The Information Audit

Although rarely encountered outside of special libraries and information centers, the information audit would be a much larger evaluation of all the information resources available in an organization. Here, all information assets anywhere in the organization are identified and measured. All types of information would be included:

The purpose of the information audit is to:

Metrics are simply countable measures of information agency performance, often focusing on interactions between the agency and customers. For example, customer retention is a key metric for most businesses.

Evaluation of Individual Items

While the emphasis here is on the evaluation of a "collection" of items, evaluation may also involve particular items within the collection. In the original selection decision, evaluation was probably done by the reviewer rather than the collection developer. In evaluating individual items, we use the same criteria used by reviewers.

For example, we ask descriptive questions:

We also ask evaluative questions:

On the basis of the answers to these questions, we determine the value of a particular item within the collection. In "reselecting" an item, we are especially concerned with past use, the degree to which the content is obsolete, and the degree to which the content and the container are unique. Although not always reliable, past use remains the best indicator of future use.

One could also use these same variables to measure the collection. We could characterize a collection as an individual item by looking at:

False Assumption

Some information professionals assume that the value of a collection is "self-evident." In my experience, this is a false and dangerous assumption. Many community members are not aware of the value of the collection. Some of those who fund the information agency may not be aware of the value of the collection. The collection manager is responsible for sharing the good news on a regular basis: the collection is good, used, and useful. The evaluation process should provide the collection manager with the evidence needed to support the good news. The value of a collection must be demonstrated again and again.

Terms

While "evaluation" is still heavily used, "assessment" is popular too and may be a better term in some environments.

"Utility" or "quality" or "adequacy" or "goodness" in regard to any collection requires consideration of audience (good for whom) and purpose (good for what). Without a clear sense of audience and purpose, collection evaluation is not likely to be successful.

Efficiency of collection development and management is not often measured. However, in an age of increasing demand for accountability, it seems reasonable that collection managers evaluate their efficiency. Efficiency is process oriented. It looks specifically at how inputs are used to create a particular output. In colloquial terms, efficient people do it "right." This means that a task is done quickly, inexpensively, and without error.

Effectiveness is most important and should be measured frequently. Effectiveness is result oriented. It looks specifically at the difference the collection has made for community members. Effectiveness could be measured via wants met, needs met, or some combination of both. Collections are effective when they increase institutional success.

"Cost-effectiveness" measures the cost of developing and managing the "right" collection. Often, this is done by relating the overall cost of an item to the number of uses.

"Cost-benefit" measures the cost of developing and managing the collection against measurable benefits. To what degree does the collection justify its existence?

Equity is simple fairness or the degree to which the agency allocates resources for collection development so that diverse audiences get their "fair share." It is often most difficult for professionals to agree on what the fair share might be. This could be a political rather than a professional decision.

Extensiveness is the amount of collection provided in relation to the size of the community served. Normally, extensive collections are larger collections with more items per user.

Economy is the degree to which collections are developed at low or reasonable cost and used frequently enough so that cost per use is attractive and persuasive.

Inputs are the resources made available so that the collection exists and may be used. Money for purchasing items is particularly important, but other inputs are related to space and staff. The actual cost of an individual item only partially represents its true cost. For example, opportunity cost [purchasing one item means that several other items cannot be purchased] is a continual problem.

Outputs represent activities which are possible because the collection is available. Reading, viewing, and listening are immediate outputs. A published paper in a well-regarded periodical may also be an output, as are reference services.

Three Different Approaches

Collection Centered Measures

Traditionally, collection-centered measures have been most popular. These measures focus on the quality of the collection (usually by matching holdings against "best" lists) and the completeness of the collection (usually matched against a list of issued items). The key assumption here is that larger collections are better collections so that size may be a proxy for goodness. This is not an intuitive assumption and some are uncomfortable with it.

User-Centered Measures

Less popular, but receiving more attention today are the customer-centered measures. These measures focus on the degree to which users find the collection to be useful. Survey research methods, interviews and questionnaires as well as focus groups, are used to query users and potential users. In particular, we must be concerned with the degree to which services and collections are important and make a difference for particular users.

User centered

We need to be careful with the now popular notion that the value of information is directly related to the amount that users will pay for it since most of our users pay only indirectly for collections and services.

Benchmarking

"The ongoing activity of comparing one's own process, product, or service against the best known similar activity, so that challenging but attainable goals can be set and a realistic course of action implemented to efficiently become and remain best of the best in a reasonable time."

Benchmarking is the process of modeling excellence. An organization with an excellent program or activity is studied to determine the attributes of that excellence. Procedures are then copied in the hope that similar results will follow. One may use benchmarks established by other units in the same organization, by peers (or competitors), by a group (industry for example), or by the best regardless of group.

Benchmarking is common in IT units, but is rarely encountered in libraries and makes only a modest appearance in the library literature. The focus is on identifying the best performers for a particular activity, capturing that performance, using it to create a "benchmark," and then comparing local performance to the benchmark. For example, if we knew which academic research library did the best job with selecting periodicals, we could compare our performance with theirs. "After all, you simply can't know how well your company is doing unless you have something you can compare it with." When the two performances are nearly equal, we would claim excellence. It seems likely that peer-centered measures would focus more on efficiency than effectiveness, but both should be possible. Since benchmarks are supposed to be linked to customer satisfaction, this method should also measure effectiveness.

Typical Collection Development Objectives

Completeness is the process of developing a comprehensive collection. This has been a major goal of research collections until recently when it became clear that such collections were not attainable. Completeness may be measured via the use of comprehensive lists or the holdings of known comprehensive collections.

Quality is the attribute associated with selecting only the very best items from among those that are available. These are the items most likely to have a significant impact on research and scholarship. Quality is often measured by comparing holdings against a "best list." Customer views on quality may also be used.

Impact or return on investment is the degree to which the collection helps the institution to be more successful.

Availability is the degree to which desired items are available when requested. Availability of hard copy items is often a function of the number of duplicate copies or of ensuring that items may be used only within the building. Availability may be measured through interaction with users or testing using lists of popular or well regarded items.

Document exposure is the amount of time that users spend reading, viewing, or listening to items in the collection. Often, this is a function of popularity or visibility. In school or college, document exposure may be a function of teacher coercion.

Value is the cost of an item divided by the number of uses. This can be misleading since an item may be of considerable value to one user while retaining a high cost per use. Turn-over, or the degree to which items leave the shelf or are used, is often used as part of defining value.

Measurement

One of the reasons that collection-centered evaluation has been popular is that it is relatively easy to measure the number of titles or volumes. It is more difficult to measure user satisfaction. Inevitably, collection evaluation involves some statistics. However, statistics must be interpreted and often there is some confusion about the meaning of a particular finding. We must be careful not to select collection attributes simply because they are easy to measure.

Typically, metrics must meet these conditions. See how well they match up with collection use data.

Relationships

Evaluation's success depends on several variables:

Why Evaluate?

There are several reasons to evaluate a collection. The two most likely reasons are that the parent institution has program accreditation with a collection evaluation component or that there is a space problem requiring some weeding. Evaluation may also be a response to a complaint about collection gaps or weaknesses. Evaluation may be part of an accountability initiative that requires evidence that collection development funds were spent wisely. Finally, the collection developer may evaluate to ensure that selection decisions were appropriate (feedback). Regardless of rationale, the well-managed collection will be evaluated on a regular basis. Normally, one or a few segments would be evaluated at a time.

Evaluation Categories

While there are many evaluation measures, most tend to group into a few obvious categories:

  1. Activity measures
  2. Cost measures
  3. Market penetration/Expenditure measures
  4. Benefit measures

To some degree, these categories ascend in importance so that the benefit measures are of most value. They are also the most difficult.

Activity Measures

Activity measures fall into two large families:

  1. Measures associated with developing and managing the collection and
  2. Measures associated with using the collection.

Collection Development and Management

There are a large number of activities associated with developing and managing collections from scanning review sources and interviewing users to binding periodicals and backing up digital files. Each of these activities might be measured. Traditionally, however, the focus has been on the number of items added to the collection and the size of the collection. Sometimes collection size is related to subjects, date issued, formats, or audiences. Formulas may be used to indicate how large a certain academic library collection should be based on such variables as number of minors, majors, MS students, Ph.D students, faculty, and the mean price of a new item. Since volumes added or total collection size is relatively easy to count, it is easy to compare collections and identify the best one (the largest).

The number of items added to a collection or part of a collection is a notable activity. A collection which does not grow is soon obsolete, especially in the sciences, technologies, and most of the social sciences. Additions to the collection related to the number of eligible items that might have been added is a telling measure of the degree to which a collection is being kept up to date.

Check listing

Check listing is comparing the content of a "best" list to what is held in the collection. Lists may be selective or comprehensive and may come from a wide variety of sources. Lists may stand alone or may be assembled in chunks from here and there. Here are some example lists:

Some check listing is clearly a collection measure. For example, check listing a best list or a comprehensive list is collection-centered. Such a check answers questions about collection goodness and collection completeness.
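A list check of this kind amounts to a simple set comparison. The sketch below is illustrative only: it assumes that list entries and holdings can be matched on a shared identifier (ISBNs here), and the function name `check_list` and the sample identifiers are hypothetical.

```python
def check_list(best_list, holdings):
    """Compare a "best" list against holdings.

    Returns the share of listed items that are held and the sorted gaps
    (listed items not held). Matching on an identifier such as an ISBN is
    an assumption; any consistent key would do.
    """
    listed = set(best_list)
    if not listed:
        raise ValueError("the list to check must not be empty")
    held = listed & set(holdings)
    gaps = sorted(listed - held)
    return len(held) / len(listed), gaps


# Hypothetical identifiers, for illustration only.
best_list = ["isbn-001", "isbn-002", "isbn-003", "isbn-004"]
holdings = ["isbn-002", "isbn-004", "isbn-999"]

share_held, gaps = check_list(best_list, holdings)
print(f"{share_held:.0%} of the list is held; gaps: {gaps}")
```

The share held speaks to goodness or completeness, depending on the kind of list; the gaps are the candidates for ordering.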

List checking is relatively easy to do. Items not held (gaps) may be checked for order. It is relatively easy to compare holdings between institutions. While lists are readily available for some topics, they are difficult to find or create for others. Note that to be valid a list must be:

Bibliometrics

Bibliometrics is the checking of scholarly references, often from periodical articles. It is a sophisticated version of list checking. It is based on these assumptions:

  1. Items used are cited
  2. Items cited are used
  3. Scholar conducted a comprehensive literature search and examined all or nearly all of the relevant literature
  4. Past use is a good predictor of future use

Bibliometric measures attempt to learn more about which items were most useful for scholarship. For example, we can learn about the use of foreign language items, older items, items published abroad, and the like. By looking at citation linkages, we can discover the relationships between research and discovery. Eugene Garfield and the Institute for Scientific Information (ISI) have played a major role in making bibliometrics visible in the United States. Citation measures may be used to measure the impact of a scholar's work or the utility of an expensive periodical.

Objectives & Standards

Objectives may be internal or external. Internal objectives are created locally. Useful objectives are quite specific, such as users will be able to retrieve 70 percent of the information that they seek immediately, 90 percent within one week, and 98 percent within one month, and allow for data collection and analysis. If created by the local institution or information agency, the example above would be an internal objective. However, if created by an external agency such as an accrediting agency, this would be an external objective. Standards represent particularly important external objectives.
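An objective stated this specifically can be checked directly against a log of requests. The following is a minimal sketch, assuming that each request is logged with the number of days it took to retrieve the information (0 for immediately, `None` if never retrieved) and that "one month" is read as 30 days; the function name and the sample log are hypothetical.

```python
def check_objective(days_to_retrieve, targets):
    """Check a tiered retrieval objective against logged requests.

    days_to_retrieve: one entry per request, days until the information was
        retrieved (0 = immediately), or None if it was never retrieved.
    targets: list of (max_days, required_share) pairs.
    Returns a list of (max_days, actual_share, met) tuples.
    """
    total = len(days_to_retrieve)
    if total == 0:
        raise ValueError("no requests were logged")
    results = []
    for max_days, required in targets:
        within = sum(1 for d in days_to_retrieve if d is not None and d <= max_days)
        share = within / total
        results.append((max_days, share, share >= required))
    return results


# 70 percent immediately, 90 percent within a week, 98 percent within a month.
TARGETS = [(0, 0.70), (7, 0.90), (30, 0.98)]

# Hypothetical log of 20 requests, for illustration only.
log = [0] * 15 + [3, 5, 6] + [20] + [None]

for max_days, share, met in check_objective(log, TARGETS):
    print(f"within {max_days} days: {share:.0%} ({'met' if met else 'not met'})")
```

With this sample log the first two targets are met and the third is not, which points to the slowest requests as the place to look.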

Standards, regardless of whether they be qualitative or quantitative, are closely linked to collection size since larger collections are usually seen as better than smaller ones. Standards are often not well linked to the mission, goals, and objectives of the parent organization unless those focus on size. Standards may be voluntary, in which case they are really more "moral suasion" than anything else, or they may be required, in which case they do make a difference. Accreditation of many educational institutions and some medical ones requires that supporting collections meet certain standards. Collection developers/managers in these institutions need to be most familiar with the current set of standards so they can prepare long before the arrival of the site visit team. Often, it is necessary to educate parent institution officials on these standards and how local collections are likely to perform. Where collections are not likely to perform well, senior administrators need to be told and the collection manager needs to be able to specify particular changes needed to ensure compliance.

Standards are based on certain assumptions:

Each of these assumptions may be debated and not all are easy to defend. Regardless, if standards are important to the parent institution, they should be important to you.

Traditionally, standards have been quantitative. For example,

Such standards are easy to check, but outsiders may wonder why 30,000 items rather than 29,000 or 31,000. Some administrators may consider the standard to be a maximum and lose interest in the collection when it reaches the target.

Qualitative standards are designed to be more flexible, but may not be reliable because they require interpretation. I suspect that most of those who apply a qualitative standard translate it into some quantitative equivalent. Otherwise, what would one make of collections that need to "adequately" support the curriculum, or contain "appropriate" coverage of "major" topics, or that simply need to be of "sufficient" size? Be careful and check with peers if faced with qualitative standards so that you know what to expect. It is your responsibility to find out. Do not assume that someone will tell you all that you need to know.

User Activity Measures

The key question is what constitutes "use." Traditionally, many collections have measured use by what goes out the door. This ignores the value of "in-house" use. It is also possible that items may leave the collection and not be used. They may also leave the collection and be used by several individuals. Use, however, is of little importance. What is important is that an item was found "useful." Few collection managers measure utility in any meaningful way.

Typical user activities might include:

Use may be related to subject, format, and demographics associated with the user. The time of use (day and season too) may also be of interest. Turn-over, or how many times an item leaves the shelf during a year, is another way to measure collection use. Some public libraries have a turnover target of 7 per year. Automated circulation systems or server transaction logs can identify use by class or individual item. They can also map some user characteristics to the item used. Expenditures may also be contrasted with use to see if the classes receiving the most money are those which are most heavily used.
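Turnover is easy to compute once circulation counts are available by class. A minimal sketch, assuming turnover is taken as yearly circulations divided by items held in the class; the class names and figures are hypothetical.

```python
def turnover_rate(circulations, items_held):
    """Average number of times an item left the shelf during the period."""
    if items_held <= 0:
        raise ValueError("items_held must be positive")
    return circulations / items_held


TARGET = 7  # the yearly turnover target some public libraries use

# Hypothetical yearly figures per class: (circulations, items held).
classes = {
    "fiction": (42000, 5000),
    "history": (3000, 2000),
}

for name, (circulations, items_held) in classes.items():
    rate = turnover_rate(circulations, items_held)
    status = "meets target" if rate >= TARGET else "below target"
    print(f"{name}: turnover {rate:.1f} ({status})")
```

A class well below the target is not necessarily a poor one; it may serve a small audience with few alternatives. The figure is a prompt for a closer look.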

Performance Measures

Performance measures the essential user activity: the ability to leave the collection with exactly what she wanted. Performance may be measured in terms of the number of desired items retrieved or how long it took to retrieve a certain number or percentage of items. The former is most often used in libraries. Performance measures begin with a list of desired items. The list may come from a user interview or may be generated by the information professional from bibliographies and the like. The list is then matched against a holdings list to discover the number of items in the collection. Since user failure at the index/catalog is often a problem, it is better for agency staff to search for holdings. The number held is then matched against the number of items that are available for use (on the shelf in a library collection). Performance equals the number (percentage) of desired items available for use. To use a formula, P = H x A, where H is the percentage of desired items held and A is the percentage of held items available. Decisions will need to be made as to when to allow substitute items to count and how often to check for availability since availability may fluctuate from day to day.
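A worked example may make the formula concrete. The sketch below assumes the three counts come from a single availability check of one list of desired items; the function name and the figures are hypothetical.

```python
def performance(desired, held, available):
    """Return (H, A, P) for a list of desired items.

    H = share of desired items that are held
    A = share of held items that are available for use
    P = H * A = share of desired items available for use
    """
    if desired <= 0:
        raise ValueError("desired must be positive")
    if not 0 <= available <= held <= desired:
        raise ValueError("expected 0 <= available <= held <= desired")
    h = held / desired
    a = available / held if held else 0.0
    return h, a, h * a


# Hypothetical check: 200 desired items, 160 held, 120 on the shelf.
h, a, p = performance(desired=200, held=160, available=120)
print(f"H = {h:.0%}, A = {a:.0%}, P = {p:.0%}")
```

Separating H from A shows where the failure lies: a low H is a selection problem, while a low A suggests a need for duplicate copies or shorter loan periods.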

Since most users really don't care about holdings and local collections, the better measure of performance may be how long it takes to deliver a set percentage of desired items to the user. Some thought will need to be given to what an acceptable percentage might be. Agency staff then measure whether the chosen percentage of requested items (say, 75 percent) is delivered within the chosen time (say, 96 hours). Here, performance may be divided into access time, or how long it takes to identify the needed item, and delivery time, or how long before the user has the item requested. Different information needs might have different time values. For example, some might be 10 minutes or less while others might be more than 24 hours but less than a week.

Note that both of these performance measures assume that users begin with a known item. This is likely true in a research environment, but in many others users find what they want by browsing. One public library study found that fewer than 20 percent of the users found what they checked out by using the catalog. Using agency staff, it may be possible to convert a fuzzy want into something specific enough to search for and then create a performance measure.
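The time-based measure can be sketched the same way. This example assumes each requested item is logged with the hours from request to delivery (`None` if never delivered) and uses the 75 percent and 96 hour figures as the locally chosen targets; the function name and the sample log are hypothetical.

```python
import math


def hours_to_deliver(delivery_hours, target_share):
    """Hours elapsed until `target_share` of requested items were delivered.

    delivery_hours: one entry per requested item, hours from request to
        delivery, or None if the item was never delivered.
    Returns None when the target share was never reached.
    """
    if not delivery_hours:
        raise ValueError("no requests to measure")
    needed = math.ceil(target_share * len(delivery_hours))
    if needed == 0:
        return 0
    delivered = sorted(h for h in delivery_hours if h is not None)
    if len(delivered) < needed:
        return None
    return delivered[needed - 1]


TARGET_SHARE = 0.75  # decided upon locally
TARGET_HOURS = 96    # decided upon locally

# Hypothetical log of eight requests, for illustration only.
requests = [2, 5, 30, 48, 70, 100, 120, None]

elapsed = hours_to_deliver(requests, TARGET_SHARE)
met = elapsed is not None and elapsed <= TARGET_HOURS
print(f"{TARGET_SHARE:.0%} delivered after {elapsed} hours; target met: {met}")
```

In this sample the sixth of eight items arrives at 100 hours, so the 96 hour target is narrowly missed.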

Expert Inspection

Expert inspection tends to be collection-centered unless the subject expert is a local one. Subject experts are identified and arrangements are made for them to visit the local collection and evaluate it. The expert may speak with local users or may go directly to the collection. Usually, the focus is on collection goodness (do you hold standard, expected works?) and completeness (how comprehensive is it?). The resulting evaluation is only as good as the expert. Finding an expert with the broad knowledge needed to evaluate a good-sized collection segment may be a problem. Outside research environments, it may be difficult to find an expert or to know if an expert is really an expert. The asset of this method is that the subject expert is likely to know what is good and what might be rare (off to special collections), to have needed foreign language skills, and the like. The expert should be objective and is removed from any local special pleading. She may bring new ideas and perspectives to developing/managing the collection. Liabilities may include:

Experts may consider several collection attributes, including:

Cost Measures

We can also measure the cost of providing informational and entertainment items via locally developed or distant collections. Measures can be fairly gross as in parent agency expenditures on the information agency or information agency expenditures on collections. Expenditures per capita, per user, per subject, or per format may be more informative. We can compare expenditures by subject, format, or audience to use in these same categories to see if the most resources are given to the most popular items.

The best measure is to capture the cost per use of items in the collection. This requires the ability to capture all costs associated with selecting the item and making it available for use (including related space and staff costs). Obviously, the more that an item is used, the lower the per use cost. One of the persuasive rationales for publicly-funded collections is that we can make useful items available for use much less expensively than if individuals or small groups had to buy these items. Here, processing costs really make a difference. A file may be available without cost, but it may require some time, effort, and expense to place it on a server so that it may be used.
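Cost per use is simple arithmetic once the full cost has been captured. A minimal sketch, assuming total cost can be split into purchase, processing, and a share of space and staff costs; the function name, the cost categories, and the figures are hypothetical.

```python
def cost_per_use(purchase, processing, space_and_staff, uses):
    """Total cost of making an item available, divided by recorded uses.

    Returns None when the item has not been used, since no per-use cost
    can be stated in that case.
    """
    total = purchase + processing + space_and_staff
    if uses <= 0:
        return None
    return total / uses


# Hypothetical items, for illustration only.
book = cost_per_use(purchase=60.0, processing=15.0, space_and_staff=25.0, uses=8)
free_file = cost_per_use(purchase=0.0, processing=40.0, space_and_staff=20.0, uses=300)

print(f"book: {book:.2f} per use")
print(f"free file on server: {free_file:.2f} per use")
```

The second item shows the point about processing costs: a file obtained without charge still carries a cost per use, though heavy use makes it very small.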

Although rarely done, we could also attempt to measure non-monetary user costs associated with collection use. In the literature, the effective price paid for information or recreational material includes the price paid by users in time and effort to identify, retrieve, and use material. For example, time spent in getting to the collection, identifying an item, retrieving an item, and checking it out. Frustration and loss of face associated with collection use are also examples of costs. In some cases, these "invisible costs" are very important.

Market Penetration Measures

Market penetration is the degree to which those eligible to use the collection actually use it. The greater the use, the greater the market penetration. For example, if only 25 percent of undergraduate students check material out of the university library, then market penetration is low. There is some question about the degree to which not using a collection is a reflection on the state of the collection and the information agency. Some argue that no matter what is said or done, only a limited percentage of eligible users will use any collection (unless of course you are a student in an elementary school where you are required to check out one book from the collection every other week).

Your service rate [here the number of items in the collection that are used divided by the population served] is a good start. You might also compare change in the numbers of the population served with change in the number of items used. This is another approach toward measuring market penetration. Once you know your service rate, you should be able to place cost per use within a persuasive perspective. This is especially appropriate when examining particular topics within the collection.
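Both measures reduce to simple ratios. The sketch below assumes that counts of eligible users, active users, and items used are available for the same period; the function names and figures are hypothetical.

```python
def market_penetration(active_users, eligible_users):
    """Share of those eligible to use the collection who actually use it."""
    if eligible_users <= 0:
        raise ValueError("eligible_users must be positive")
    return active_users / eligible_users


def service_rate(items_used, population_served):
    """Number of items used divided by the population served."""
    if population_served <= 0:
        raise ValueError("population_served must be positive")
    return items_used / population_served


# Hypothetical campus figures, for illustration only.
penetration = market_penetration(active_users=2500, eligible_users=10000)
rate = service_rate(items_used=18000, population_served=10000)

print(f"market penetration: {penetration:.0%}")
print(f"service rate: {rate:.1f} items used per person served")
```

Computing the same two figures for successive years gives the comparison of change in population with change in use.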

Outcome and Benefit Measures

An outcome is a measurable product resulting from collection use. Outcomes may be positive or negative. However, few collection developers consider negative outcomes. A benefit is a positive outcome resulting from collection use. For example, by using a collection of position announcements on the web I learn of a position which I take. Here the benefit is the position and all of the good that comes from gainful employment and contributions to society.

Benefits may be organizational or individual. For example, collection use may benefit the organization by allowing a staff member to do something better, quicker, or perhaps not do something after literature use showed that was not necessary. Here, benefit may be specifically linked to ROI or return on investment. Collections should provide measurable return on investment. Collection use may also benefit individuals while not having an immediate, tangible impact on the organization. Individual or "self" benefits may certainly benefit the employing organization in the longer run.

We need to be able to answer the question "what is it that people can do because the collection is available and used" or "what would people NOT be able to do if this collection was not available"? We may then consider asserting that without the collection community members would be unable to do certain useful activities. There will always be some doubt as to the importance of the collection use in the final outcome, but it should certainly play a visible role in the ability to do something. It is frequently argued that publicly funded collections improve the quality of life in the community. If that is the case, we need to be able to provide links between use and positive outcomes. For example, what specific positive outcomes result from a professionally developed collection in the school library media center? [Some studies have found that schools with librarians have better skills test scores than those without.] Will test scores increase? Will more students be accepted in college? Is use associated with higher GPA?

Collections may benefit individuals, organizations, and society. The thoughtful collection developer will look for telling examples of each to include in the annual report or use persuasively at budget time.

Considerations and Problems

Barriers

There are several problems associated with evaluating collections. A major one is that too often the evaluation has not focused on specific user wants and needs. Too little attention has been given to why collections are not used or are under used. In particular, barriers to use should receive more attention. Often, we fail to promote the collection or make it visible. If there is "a reader for every book," then we need to let users know that their book is available.

Knowledgeable Users

An obvious problem with user-oriented evaluation is whether users know enough to make useful comments. Often, users don't know enough to realize what the collection should be. Sometimes, their recollection of recent use experiences is not reliable. Some users experience the collection only via proxies and may voice opinion without first-hand knowledge. Many users may not feel comfortable in making critical comments, especially to the person responsible for developing the collection.

Audience

Audience is always important when an evaluation is prepared. You need to know who will read it and what they want. Are there specific questions to be answered?

Is there a desirable length or format? When do they want the report? Don't begin an evaluation until you have a clear sense of what "they" want.

Available Resources

Resources are important too. A quality evaluation usually requires a fair amount of resources. Resources include time and support. Will you have staff support or do you do it by yourself? Is professional literature available? Will you have access to needed software/hardware? Will you have access to research help with sampling and other decisions?

How Often?

Frequency of collection evaluation will vary according to the rationale for evaluation. Accreditation usually comes every five or six years so that is when the supporting collections would be evaluated. Some parent organizations may require evaluation of all units at certain intervals.

Change?

An evaluation should result in action. Collections should change in some visible way because of the findings, conclusions, and recommendations associated with the evaluation. There is little point to collection evaluation unless it makes a difference.

Automated Evaluation

OCLC, especially now that it includes WLN, offers automated collection assessment/analysis services. Using OCLC records, the service can provide libraries with a detailed analysis of holdings, including publication date and duplication. With permissions, the holdings of one collection may be compared to those of another. Collections may be compared to the Books for College Libraries best list, books reviewed in Booklist and books in Outstanding Academic Books. In the future, more types of analysis are likely to be available.

Book vendors also provide low cost evaluation services, often based on the age of the items in the collection. Follett Library Services and Sagebrush Corporation's BenchMARC are two good examples.


Discussion

One

Select an information provider of your choice. Discuss specific steps you might take to make the value of collections more visible to the community.

Two

Select an information provider of your choice. Select three to five activities associated with developing and managing collections. Discuss how you might measure efficiency and effectiveness for these activities.

Three

Intrigued by the notion of benchmarking, you would like to identify best performers in collection development/management activities so you can compare your performance with theirs. How would you go about this activity?

Four

Which aspects of developing and managing collections lend themselves to benchmarking? Why?

Five

You are interested in activity measures. Prepare a list of as many collection use activities as you can think of. Be prepared to defend inclusion on the list.

Six

Select an information provider of your choice. Prepare a list of reasonably specific benefits associated with collection use. These are to be used in a promotional/publicity campaign.


