In the last edition of this text, we opened this chapter with a reference to a statement by Sheila Intner and Elizabeth Futas that the 1990s was the decade of evaluation. Whether it was the decade of evaluation/assessment may be debatable; however, an annotated bibliography on the topic by Thomas Nisonger suggests it was a decade of very high interest in the subject. Nisonger started updating his prior bibliography on academic library collection evaluation but found that the volume of literature on the topic for all types of libraries published in the 1990s was so large it warranted a separate work. His update covers the years 1992 to 2002 and includes over 600 entries!
If the 1990s was not the decade of evaluation, what is not debatable is that evaluation, assessment, outcomes, and accountability were words and activities that many librarians were, and are, addressing. Certainly, accreditation bodies were, and are, demanding evidence that institutional expenditures were more than just expenditures; that is, they wanted to know the result of spending the funds. Long statistical lists of books added, journal subscriptions started, and items circulated were no longer acceptable. Accreditation visiting teams started asking questions like, "So you added x thousands of books, have x hundreds of subscriptions, and your users check out an average of x items per month. Have you any evidence that this has made any difference and helped meet the institutional mission and goals?" They also asked, "What evidence of quality do you have, and how do you and your institution define that concept?"
Answering such questions is difficult unless you have systematically collected a body of data with such questions in mind. What this means is that some of the measures of evaluation we have employed in the past, such as collection size, will not be acceptable in some circumstances when outcome and accountability are at issue. Certainly, the questions asked go beyond just our collections, both physical and virtual, but carefully evaluating our collections will help us address the broader issues (such as, Is the library worth the money expended on it?).
What are the strengths of the collection? How effectively have we spent our collection development moneys? How useful are the collections to the service community? How do our collections compare to those of our peers? These are but a few of the questions one may answer by conducting a collection evaluation assessment project. Evaluation completes the collection development cycle and brings one back to needs assessment activities. Although the term evaluation has several definitions, there is a common element in all of them related to placing a value or worth on an object or activity. Collection evaluation involves both objects and activities, as well as quantitative and qualitative values.
Hundreds of people have written about collection evaluation: Stone, Clapp-Jordan, Evans, Bonn, Lancaster, Mosher, McGrath, Broadus, and Hall, to name a few. Though the basics remain unchanged, the application of the basics has become more sophisticated over the years. Computers make it possible to handle more data, as well as a wider variety of data. Bibliographic and numeric databases can provide valuable data that in the past would have been exceedingly difficult, if not impossible, to obtain (e.g., see Metz's older, but still interesting, Landscape of Literatures). Bibliographic utilities, such as OCLC, and products for assessing and comparing collections are more widespread today than in the past. Despite the assistance of technology and increasingly sophisticated systems of evaluation, as Betty Rosenberg, a longtime teacher of collection development, repeatedly stated, the best tool for collection evaluation is an intelligent, cultured, experienced selection officer with a sense of humor and a thick skin. Because there are so many subjective and qualitative elements involved in collection development, Rosenberg's statement is easy to understand and appreciate. Although this chapter will not help one develop the personal characteristics she identified as important, it does outline the basic methods available for conducting an evaluation project and provides a few examples.
Before undertaking any evaluation, the library must carefully define the project's purposes and goals. One definition of evaluation is a judgment as to the value of X, based on a comparison, implicit or explicit, with some known value, Y. If the unknown and the (presumably) known values involve abstract concepts that do not lend themselves to quantitative measurement, there are bound to be differences of opinion regarding the value. There are many criteria for determining the value of a book or of an entire collection: economic, moral, religious, aesthetic, intellectual, educational, political, and social, for example. The value of an item or a collection fluctuates depending on which yardstick one employs. Combining several measures is effective as long as there is agreement as to their relative weight. So many subjective factors come into play in the evaluation process that one must work through the issues before starting. One important benefit of having the goals defined and the criteria for the values established ahead of time is that interpretation of the results is much easier. It may also help to minimize differences of opinion about the results.
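Where agreement on relative weights exists, combining several measures reduces to simple arithmetic. The following is a minimal, purely hypothetical sketch; the criteria, scores, and weights are invented for illustration, and a real project would define its own before the evaluation begins.

```python
# Hypothetical sketch: combining per-criterion value judgments into one
# weighted score. All criteria, scores, and weights below are invented.

def weighted_score(scores, weights):
    """Combine per-criterion scores (0-1 scale) using agreed-upon weights."""
    if set(scores) != set(weights):
        raise ValueError("every criterion needs both a score and a weight")
    total_weight = sum(weights.values())
    return sum(scores[c] * weights[c] for c in scores) / total_weight

# Invented example: three of the value yardsticks named above.
scores = {"educational": 0.8, "intellectual": 0.6, "aesthetic": 0.4}
weights = {"educational": 3, "intellectual": 2, "aesthetic": 1}

print(round(weighted_score(scores, weights), 2))  # prints 0.67
```

The design point the chapter makes holds here: the arithmetic is trivial, but the result is only meaningful if everyone involved accepted the weights before seeing the outcome.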
Libraries and information centers, like other organizations, want to know how they compare with similar organizations. Comparative data can be useful, but they can also be misleading. Like all other aspects of evaluation, comparative data present significant problems of definition and interpretation. What, for example, does library A gain by comparing itself with library B, except, perhaps, an inferiority complex or a delusion as to its own status? Without question, some libraries are better than others, and comparisons may well be important in discovering why this is so. Two key issues in interpreting comparisons are that (1) one assumes a close approximation of needs among the comparative groups and (2) one assumes the existence of standards or norms that approximate optimum conditions. Neither assumption has a solid basis in reality. If the library or its parent organization is considering starting a new service or program, comparative data from libraries already supporting similar services can provide valuable planning information. Though comparisons are interesting and even helpful in some respects, one should be cautious in interpreting the significance of the findings.
Robert Downs, an important historical figure in the field of evaluation, noted:
From the internal point of view, the survey, if properly done, gives one an opportunity to stand off and get an objective look at the library, to see its strengths, its weaknesses, the directions in which it has been developing, how it compares with other similar libraries, how well the collection is adapted to its clientele, and provides a basis for future planning.
Downs believed that, in addition to their internal value, surveys are an essential step in preparing for library cooperative acquisitions projects and resource sharing.
Organizations conduct evaluations for several reasons, including
to develop an intelligent, realistic acquisitions program based on a thorough knowledge of the existing collection;
to justify demands for increased funding or for particular subject allocations; and
to increase the staff's familiarity with the collection.
It is possible to divide collection evaluation purposes into two broad categories: internal reasons and external reasons. The following lists provide a variety of questions or purposes for each category.
Collection development needs
What is the true scope of the collections (i.e., what is the subject coverage)?
What is the depth of the collections (i.e., what amount and type of material constitute the collection)?
How does the service community use the collection (i.e., what are the circulation and use within the library)?
What is the collection's monetary value? (This must be known for insurance and capital assessment reasons.)
What are the strong areas of the collection (in quantitative and qualitative terms)?
What are the weak areas of the collection (in quantitative and qualitative terms)?
What problems exist in the collection policy and program?
What changes should be made in the existing program?
How well are collection development officers carrying out their duties?
Provide data for possible cooperative collection development programs.
Provide data for deselection (weeding) projects.
Provide data for formal cancellation projects.
Provide data to determine the need for a full inventory.
Assist in determining allocations needed to strengthen weak areas.
Assist in determining allocations needed to maintain areas of strength.
Assist in determining allocations needed for retrospective collection development.
Assist in determining overall allocations.
Local institutional needs
Is the library's performance marginal, adequate, or above average?
Is the budget request for materials reasonable?
Does the budget provide the appropriate level of support?
Is the library comparable to others serving similar communities?
Are there alternatives to space expansion (e.g., weeding)?
Is the collection outdated?
Is there sufficient coordination in the collection program (i.e., does the library really need all those separate collections)?
Is the level of duplication appropriate?
Is the cost/benefit ratio reasonable?
Provide data for accreditation groups.
Provide data for funding agencies.
Provide data for various networks, consortia, and other cooperative programs.
Provide data to donors about collection needs.
Having undertaken numerous evaluation projects, as staff members and consultants, the authors have found that these reasons inevitably surface in one form or another. Not all the reasons apply to every type of information environment, but most have wide applicability.
After the library or evaluators establish the purposes for carrying out the evaluation, the next step is determining the most effective methods of evaluation. A number of techniques are available, and the choice depends, in part, upon the purpose and depth of the evaluation process. George Bonn's article "Evaluation of the Collection" lists five general approaches to evaluation:
Compiling statistics on holdings.
Checking standard lists-catalogs and bibliographies.
Obtaining opinions from regular users.
Examining the collection directly.
Applying standards [which involves the use of various methods mentioned earlier], listing the library's document delivery capability, and noting the relative use of a particular group.
Most of the methods developed in the recent past draw on statistical techniques. Some of the standards and guidelines of professional associations and accrediting agencies employ statistical approaches and formulas that give evaluators some quantitative indicators of what is adequate. Standards, checklists, catalogs, and bibliographies are other tools of the evaluator.
ALA's Guide to the Evaluation of Library Collections divides assessment methods into collection-centered measures and use-centered measures. Within each category are a number of specific evaluative methods. The Guide summarizes the major techniques currently used to evaluate information collections. These methods focus on print resources, but there are elements that one can also employ in the evaluation of electronic resources.
checking lists, bibliographies, and catalogs;
comparative use statistics;
analysis of ILL statistics;
in-house use studies;
simulated use studies; and
document delivery tests.
Each method has its advantages and disadvantages. Often it is best to employ several methods that will counterbalance one another's weaknesses. We will touch on each of these; however, the use-centered methods all share the same broad characteristics as circulation studies. Each is valuable in its own way to evaluate some aspect of use, but due to space limitations, we will not address all of the variations. One should also consult items listed in the Notes and Further Readings sections before planning an evaluation project.
Collection-Centered Methods

List Checking
The checklist method is an old standby for evaluators. It can serve a variety of purposes. Used alone or in combination with other techniques, usually with the goal of producing a numerically based statement such as "We (or they) have X percent of the books on this list," it provides objective data. Consultants frequently check holdings against standard bibliographies (or suggest that the library do it) and report the results. Checklists allow the evaluator to compare the library's holdings against one or more standard lists of materials for a subject area (Business Journals of the United States), for a type of library (Books for College Libraries), or for a class of user (Best Books for Junior High Readers).
When asked to assess a collection, we use checklists as part of the process, if appropriate lists are available. Whenever possible, we ask a random sample of subject experts at the institution to identify one or two bibliographies or basic materials lists in their specialty that they believe would be reasonable to use in evaluating the collection. The responses, or lack of responses, provide information about each respondent's knowledge of publications in her or his field and indicate the degree of interest in the library collection. When appropriate, we also use accreditation checklists, if there is doubt about the collection's adequacy.
As collections increase in size, there is less concern with checking standard bibliographies. However, it is worthwhile for selectors to occasionally take time to review some of the best-of-the-year lists published by various associations. Such reviews will help selectors spot titles missed during the year and can serve as a check against personal biases playing too great a role in the selection process. Selectors quickly identify items not selected and can take whatever steps are necessary to remedy the situation. Often such lists appear in selection aids; it takes little extra time to review the list to conduct a minievaluation.
Self-surveys by the library staff frequently make use of checklist methods. M. Llewellyn Raney conducted the first checklist self-survey, at least the first reported in the literature, in 1933 for the University of Chicago libraries. This survey used 300 bibliographies to check the entire collection for the purpose of determining future needs. There is little question that this pioneering effort demonstrated the value of using checklists to thoroughly examine the total collection.
Obviously, one can use a variety of checklists in any situation. The major factor determining how many lists to employ is the amount of time available for the project. Today's OPACs make list-checking a faster process, but it still requires a substantial amount of time. Many evaluators have their favorite standard lists, but there is a growing use of highly specialized lists in an effort to check collection depth as well as breadth. Most evaluators advocate using serials and media checklists in addition to book checklists. Large research libraries (academic, public, or special) seldom use the basic lists; instead, they rely on subject bibliographies and specially compiled lists. One of the quality control checks supported by RLG/ARL in the past was a conspectus project using a specially prepared checklist technique (see chapter 3 for a discussion of the conspectus). Specially prepared bibliographies are probably the best checklist method. However, preparing such lists takes additional time, and many libraries are unwilling to commit that much staff effort to the evaluation process.
Using any checklist requires certain assumptions; one is that the selected list reflects goals and purposes that are similar to those of the checking institution. Normally, unless an examination of a collection is thorough, the checklist method merely samples the list. Thus, the data are only as good as the sampling method employed.
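When checking an entire list is impractical, a random sample of its titles can estimate the proportion held. The sketch below is hypothetical (the title list and holdings are invented); it simply illustrates that a sound sample yields an estimate with a quantifiable margin of error, which is what "the data are only as good as the sampling method" means in practice.

```python
# Hypothetical sketch: estimating the proportion of a long checklist a
# library holds by checking only a random sample of titles.
import random

def estimate_holdings(checklist, is_held, sample_size, seed=0):
    """Check a random sample of checklist titles; return the estimated
    proportion held and a rough 95% margin of error."""
    rng = random.Random(seed)
    sample = rng.sample(checklist, sample_size)
    p = sum(1 for title in sample if is_held(title)) / sample_size
    # normal-approximation margin of error for a proportion
    margin = 1.96 * (p * (1 - p) / sample_size) ** 0.5
    return p, margin

# Invented data: a 2,000-title list of which roughly a third is held.
checklist = [f"Title {i:04d}" for i in range(2000)]
held_titles = {t for i, t in enumerate(checklist) if i % 3 == 0}

p, margin = estimate_holdings(checklist, held_titles.__contains__, 200)
print(f"estimated {p:.0%} held, +/- {margin:.1%}")
```

A sample of 200 titles here gives a margin of error of a few percentage points, usually adequate for the kind of broad statement ("we hold roughly a third of this list") an evaluation report needs.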
The shortcomings of the checklist technique for evaluation are many, and eight criticisms appear repeatedly:
Title selection was for specific, not general, use.
Almost all lists are selective and omit many worthwhile titles.
Many titles have little relevance for a specific library's community.
Lists may be out-of-date.
A library may own many titles that are not on the checklist but that are as good as the titles on the checklist.
Interlibrary loan service carries no weight in the evaluation.
Checklists approve titles; there is no penalty for having poor titles.
Checklists fail to take into account special materials that may be important to a particular library.
To answer these criticisms, the checklist would have to be all things to all libraries. All too often, there is little understanding that not all works are of equal value or equally useful to a specific library. Though some older books continue to be respected for many years, an out-of-date checklist is of little use in evaluating a current collection.
Obviously, the time involved in effectively checking lists is a concern. Spotty or limited checking does little good, but most libraries are unable or unwilling to check an entire list. Checklist results show the percentage of books from the list that is in the collection. This may sound fine, but there is no standard proportion of a list a library should have. How should one interpret the fact that the library holds 53 percent of some list? Is it reasonable or necessary to have every item? Comparing one library's holdings with another's on the basis of percentage of titles listed is of little value, unless the two libraries have almost identical service populations. In a sense, the use of a checklist assumes some correlation between the percentage of listed books held by a library and the percentage of desirable books in the library's collection. This assumption may or may not be warranted. Equally questionable is the assumption that listed books not held necessarily constitute desiderata and that the proportion of items held to items needed (as represented on the list) constitutes an effective measure of a library's adequacy.
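The arithmetic of a checklist comparison is trivial; as the paragraph above notes, the interpretation is what is hard. A minimal sketch, with invented titles (a real comparison would match on control numbers such as ISBNs or OCLC numbers rather than raw title strings):

```python
# Hypothetical sketch: percentage of a checklist held, plus the "gap"
# titles on the list but not in the collection. All titles are invented.

def percent_held(checklist, holdings):
    """Return (percentage of checklist held, sorted list of gap titles)."""
    held = set(checklist) & set(holdings)
    gaps = sorted(set(checklist) - set(holdings))
    return 100 * len(held) / len(checklist), gaps

checklist = ["Title A", "Title B", "Title C", "Title D"]
holdings = {"Title A", "Title C", "Title E"}

pct, gaps = percent_held(checklist, holdings)
print(f"{pct:.0f}% of the list held; gaps: {gaps}")
# prints: 50% of the list held; gaps: ['Title B', 'Title D']
```

Note that "Title E", held but not listed, contributes nothing to the score, which is exactly the criticism that a library may own many worthwhile titles the checklist omits.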
This lengthy discussion of the shortcomings of the checklist method should serve more as a warning than a prohibition. There are benefits from using this method in evaluation. Many librarians believe that checking lists helps to reveal gaps and weaknesses in a collection; that the lists provide handy selection guides if the library wishes to use them for this purpose; and that the revelation of gaps and weaknesses may lead to reconsideration of selection methods and policies. Often, nonlibrary administrators respond more quickly and favorably to information about gaps in a collection when the evaluators identify the gaps by using standard lists than when they use other means of identifying the weaknesses.
Expert Opinion

As its name implies, this method depends on personal expertise for making the assessment. What are the impressionistic techniques used by experts? Some evaluators suggest examining a collection in terms of the library's policies and purposes and preparing a report based on impressions of how well the collection meets those goals. The process may involve reviewing the entire collection using the shelf list; it may cover only a single subject area; or, as is frequently the case, it may involve conducting shelf examinations of various subject areas. Normally, the concern is with estimating qualities like the depth of the collection, its usefulness in relation to the curriculum or research, and deficiencies and strengths in the collections.
Very rarely is this technique used alone. It occurs most frequently during accreditation visits, when an accreditation team member walks into the stacks, looks around, and comes out with a sense of the value of the collection. No consultant who regularly uses this technique limits it to shelf reading. Rather, consultants prefer to collect impressions from the service community. Though each person's view is valid only for the individual's areas of interest, in combination, individuals' views should provide an overall sense of the service community's views. (This approach falls into the category of user satisfaction.) Users make judgments about the collection each time they look for something. They will have an opinion even after one brief visit. Thus, the approach is important, if for no other reason than that it provides the evaluator with a sense of what the users think about the collection. Further, it encourages user involvement in the evaluation process.
The evaluation draws on information compiled from various sources--personal examination of the shelves, qualitative measures, and the impressions of the service community. Subject specialists give their impressions of the strengths and weaknesses of a collection. Sometimes, the evaluator employs questionnaires and interviews to collect the data from many people. Less frequently, specialists' impressions may constitute the entire evaluation. Library staff member opinions about the collection add another perspective to the assessment; often, these views differ sharply from those of the users and those of an outsider.
Because many large public libraries employ subject specialists, most special libraries have in-depth subject specialists available, and school libraries can draw on teachers for subject expertise, this method is viable in any library environment.
The major weakness of the impressionistic technique is that it is overwhelmingly subjective. Obviously, the opinions of those who use the collection regularly and the views of subject specialists are important. Impressions may be most useful as part of an evaluation when used in connection with other methods of examining a collection, but their value depends on the objectives of the individual evaluation project, and their significance depends on their interpretation.
Comparative Use Statistics
Comparisons among institutions can offer useful, if sometimes limited, data for evaluation. The limitations arise due to institutional differences in objectives, programs, and service populations. For instance, a junior college with only a liberal arts program requires one type of library, whereas a community college with both a liberal arts curriculum and strong vocational programs requires a much larger collection. Comparing the first library to the second would be like comparing apples and oranges. There simply is no basis for comparison and no point in it unless one can effectively isolate the liberal arts components.
Comparing libraries is difficult because of the way some libraries generate statistics about their collections and service usage. On paper, two libraries may appear similar, yet their book collections may differ widely. Some years ago, Eli Oboler documented this problem:
One library, without any footnote explanation, suddenly increased from less than twenty-five thousand volumes added during 1961-62 to more than three times that number while the amount shown for books and other library materials only increased approximately 50 percent. Upon inquiry the librarian of this institution stated that, "from storage in one attic we removed forty thousand items, some of which have been catalogued, but in the main we are as yet unsure of the number which will be added. The addition of a large number of volumes also included about one-fourth public documents, state and federal, and almost fifty thousand volumes in microtext."
No one suggests that it is possible to determine the adequacy of a library's collection solely in quantitative terms. Number of volumes is a poor measure of the growth of the library's collection in relation to the programs and services it provides. However, when standards fail to provide quantitative guidelines, budgeting and fiscal officers, who cannot avoid quantitative bases for their decisions, adopt measures that seem to have the virtue of simplicity but are essentially irrelevant to the library's function. Therefore, it is necessary to develop quantitative approaches for evaluating collections that are useful in official decision making and that retain the virtue of simplicity while being relevant to the library's programs and services.
Some useful comparative evaluation tools have been developed as a result of technology and the growth of bibliographic utilities. Two widely used products were produced by AMIGOS and WLN. The AMIGOS product employed data from OCLC, and the WLN product used its own bibliographic database. Due to the merger of OCLC and WLN, only the product that WLN developed is currently available. Such products allow one to compare one collection against one or more other collections in terms of number of titles in a classification range. One could, with either product, identify "gap" titles, items not in the collection but in the collection(s) of the other libraries.
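The comparison such products perform can be illustrated with a small, purely hypothetical sketch: count titles per classification range for the local collection and a peer group, then flag titles the peers hold but the local library does not. The records and the crude class-extraction function are invented; real products work from full MARC records and proper call-number parsing.

```python
# Hypothetical sketch of a classification-range comparison and "gap"
# title report. Records and parsing are invented for illustration.
from collections import Counter

local = {"HF5415.1 Marketing Basics", "Z665 Library Science Intro"}
peers = {"HF5415.1 Marketing Basics", "HF5415.5 Consumer Behavior",
         "Z665 Library Science Intro", "Z678 Library Administration"}

def lc_class(record):
    """Crude stand-in for extracting LC class letters from a call number."""
    call_number = record.split()[0]
    return "".join(ch for ch in call_number if ch.isalpha())

local_by_class = Counter(lc_class(r) for r in local)
peer_by_class = Counter(lc_class(r) for r in peers)
gap_titles = sorted(peers - local)  # held by peers, not locally

print("local:", dict(local_by_class))
print("peers:", dict(peer_by_class))
print("gap titles:", gap_titles)
```

Per-class counts show where the local collection is thin relative to the peer group, and the gap list gives selectors concrete candidate titles, which is essentially what the ACAS-style reports provide at scale.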
The product that still exists is OCLC/WLN's "Automated Collection Analysis Service" or ACAS (