Vol. 13 No. 1, March 2008



Managing and evaluating digital repositories


Alesia Zuccala
Rathenau Institute, Science System Assessment Unit, Postbus 95366, 2509 CJ Den Haag, Netherlands

Charles Oppenheim and Rajveen Dhiensa
Department of Information Science, Loughborough University, Loughborough, Leicestershire, LE11 3TU, United Kingdom


Abstract
Introduction. We examine the role of the digital repository manager, discuss the future of repository management and evaluation and suggest that library and information science schools develop new repository management curricula.
Method. Face-to-face interviews were carried out with managers of five different types of repositories and a Web-based survey was carried out with users. The LexiURL Web link evaluation software provided a 'webometric' basis for investigating potential users online.
Results. Few managers had received any formal training. The repositories were relatively new and web statistics had been used by the managers to monitor their success. The LexiURL analysis indicated that the networks associated with the repository sites were predictable and made sense to managers because expected co-links and known links appeared in the network diagrams. Users of the repositories discovered them through friends and colleagues.
Conclusion. Digital repositories require ongoing evaluation to determine their quality and new directions for growth. A LexiURL analysis could be carried out by managers every four to six months and used as a complement to transaction log file analyses. Repository managers will need formal training in the future and we suggest a set of modules that would be suitable for a specialist programme.


Introduction

A significant amount of digital repository research and development activity is taking place in the United Kingdom, much of which is associated with the Joint Information Systems Committee's (JISC) Focus on Access to Institutional Repositories (FAIR) programme. In 2005, JISC initiated another call for repository projects with the intention of:

bringing together people and practices from across various domains (research, learning, information services, institutional policy, management and administration, records management and so on) to ensure the maximum degree of coordination in the development of digital repositories, in terms of their technical and social (including business) aspects. (Joint Information Systems Committee 2006).

In addition to the JISC's call for projects, a number of technical architecture, metadata standards, copyright and interoperability issues have been identified as critical to the development, management and sustainability of digital repositories (Day 2003; Gadd et al. 2003a, b, c; Guy et al. 2004; McLean and Lynch 2004; Medeiros 2003). Academics and other professionals increasingly discuss these issues online (Andrew 2006; Harnad 2006a) to share the latest information concerning practical and technical challenges.

Here, we focus on the unique role of the digital repository manager and investigate the evolution of this role in light of current research and practice. Qualitative and quantitative data taken from a JISC digital repository project entitled 'User Needs and Potential Users of Digital Repositories: An Integrated Analysis' are used to enhance our discussion and give support to the idea that new programmes will soon be needed to help train the growing numbers of professionals engaged in repository management.

Defining repository and the role of the repository manager

Central to the role of a repository manager is the purpose or primary attributes of the repository. Specialised repositories are being developed for different purposes, for example, e-prints repositories, e-learning repositories, data repositories, e-thesis repositories and subject-based repositories; therefore, a useful definition of a repository needs to encompass all types. Crow (2002), Gibbons (2004) and Heery and Anderson (2005) offer very similar definitions. Crow (2002: 16) emphasizes the value of open access and creating a digital repository 'with few if any barriers to access'. Gibbons explains that the common feature of digital repositories is that they 'contain digital content', adding:

The range of different types of digital content can be vast, including text, audio, video, images, learning objects and datasets. The material may be born digital or of a physical medium that has been digitized, such as scanned images. (Gibbons 2004: 6)

Heery and Anderson (2005: 2) specify that content is deposited in a repository, whether by a creator or third party and that the repository architecture manages content as well as metadata and offers a minimum set of basic services, for example, put, get, search and access control. Moreover, a repository must be sustainable, trusted, well supported and well managed.

The role of a chief repository manager should be to recognise and define the raison d'être of the repository so that depositors, users and members of the public will be familiar with its existence and purpose. Once these users know about the repository, its benefits must be advertised; hence, the manager needs to understand the importance of establishing a promotional programme. The repository manager should also have a clear sense of what determines the success or failure of a growing repository, including long-term financing, institutional support for document contribution mandates, and time to encourage individuals to contribute materials. He or she needs to be well educated on the technical aspects of the repository's construction, including its underlying software, standards adopted for metadata and standards for interoperability. The manager must think about what the repository can do for its contributors and create an appropriate evaluation programme using informetrics, bibliometrics, log file analyses, or webometric analyses when needed. The manager will have to keep up to date with current information science research, think about how to implement user-assistance programmes and make sure that deposited, accessed and used materials do not infringe copyright. Clearly, the development of a repository requires a great deal of work; hence, without a critical support team of information and computing specialists, the digital repository is not likely to be successful.

Some background

The role and core competencies of library professionals have been the subject of recent debate (Chan 2006; Mullins and Linehan 2005; Sargeant and Harrison 2004) and similar attention has been paid to the role of digital technologies (Choi and Rasmussen 2006; Hastings and Tennant 1996; Spink and Cool 1999; Perry 2005). When an individual chooses the professional role of librarian we trust and understand that s/he has a core background of specialised training. Repository managers, by contrast, have not necessarily had specialised training, or training in library or information science. The work of a librarian can be measured against the theoretical underpinnings and standards of librarianship, whereas repository management is evolving from a new vision: a new scholarly communication movement based on the philosophy and standards of open access (see Jacobs 2006).

The term open access has been given a variety of definitions and its meaning is still evolving; however, following the Budapest Open Access Initiative meeting, a definition was produced:

First, open access works are freely available. Second, they are 'online', which would typically mean that they are digital documents available on the Internet. Third, they are scholarly works... Fourth, the authors of these works are not paid for their efforts. Fifth, as most but not all authors of peer-reviewed journal articles are not paid and such works are scholarly, these articles are identified as the primary type of open access material. Sixth, there are an extraordinary number of permitted uses for open access materials; users can copy and distribute open access works without constraint. Seventh, there are two key open access strategies: self-archiving and open access journals. (Bailey 2006:15)

Self-archiving is one strategy, which Harnad (2003; 2006b) describes as the 'green route to open access'. When an author provides 'limitless free “eprints” of electronic versions of their own final drafts on their own institutional Websites for all potential users Web-wide who cannot afford the journal version', he or she is said to be 'self-archiving' (Harnad 2006b: 1). Evidence has been produced to show that open archiving of papers results in an increase in citations (Brody and Harnad 2004; Brody et al. 2006; Hajjem et al. 2005; Kurz et al. 2004; Moed 2006). The reasons for this increase are uncertain, but there seems to be a consensus that an advantage exists. Self-archiving should continue if open access advocates convince scholars that this citation advantage is one of the most important rewards associated with their participation.

In the absence of a core training programme for repository managers, repository development work is now falling into the hands of reference librarians (Chan et al. 2006), although Koehler (2006: 19) notes that the 'process of organizing OA materials can complicate the functions of library technical services' as well. 'In addition to questioning where articles must reside, one must decide who is responsible for migrating and archiving the works and who will resolve the links'. Genoni (2004: 300) also writes about the need for librarians to 'approach the task of content development in repositories by applying some of the procedures and skills associated with collection management'.

When academic reference librarians are asked to initiate repository projects it is because they are engaged in public service (e.g., a liaison to academic faculty) and situated in institutions that need to build a repository. The library administration at the University of New Mexico, Health Sciences Library and Informatics Center is one example:

Faced with a decision about which unit of the library should take on the responsibility of planning for and implementing the institutional repository, the library chose the Reference and User Support Service unit... based on the library's view that the web is a public service rather than a collection…. The Electronic Services Development Librarian position had been created and that position included the [library's] web site as a major component. (Philips et al. 2005: 3)

and, further:

Placing responsibility for the institutional repository in the Reference and User Support Services unit was a logical outgrowth of the philosophy, the organizational structure and the personal interests and skills of the incumbents in the positions. (Philips et al. 2005: 3)

Scholars who emphasize the 'changing roles of reference librarians' are aware of the fact that 'libraries have moved beyond a custodial role to contribute actively to the evolving scholarly communication process' (Crow 2002; cited in Chan et al. 2005: 270). Academic librarians, in the traditional sense, take on the duty of keeping faculty and students at a college or university informed about recent acquisitions, including what is new and available in a particular discipline and including digital resources. They also provide bibliographic instruction (e.g., effective online search skills) as part of their information literacy programmes.

When the Hong Kong University of Science and Technology Library first created its institutional repository, a dramatic shift occurred in terms of what was expected of their academic library professionals. All were

...engaged in all stages of its development: the definition of goals and scope, evaluation of system and content, forming strategies and procedures, interpreting publishers' policies, contacting and servicing faculty members, acquisition of content and promotional efforts. (Chan et al. 2005: 271)

Chan et al. (2005: 271) admit that 'the learning curve for [the staff was] steep'. Certain individuals 'juggled multiple roles' and 'some of these roles [were] extensions of existing ones; others [were] brand new'.

The staff at this Library took the opportunity to learn about repository management as they progressed and in many respects it was a trial and error process. For instance, the reference librarians e-mailed all faculty members and invited them to submit papers to the new repository, but 'the response was pathetic' (Chan et al. 2005: 275). They resorted to the new job of scanning all departmental homepages and those of individual faculty members to see how many had posted full-text publications on the Web (89 out of 450). In the end, permission was obtained to post 150 documents, but the reference librarians had to take up an advocacy role, which required them to 'check individual publisher's policies or negotiate for self-archiving rights' (Chan et al. 2005: 277).

Advocacy work for open access to a university's research output does not constitute traditional academic reference work; thus, in this case, it is not clear how effort put into the development of the new repository affected the normal reference services. If the reference librarian's role is evolving and changing, through involvement in repositories, to what degree should repository management become an important part of an information science school's curriculum and when should this curriculum become part of the agenda? The answer rests upon the degree to which the first digital repositories are successful.

Jones et al. (2006: 17) indicate that the institutional repository is 'a strong and important new idea' for academic organizations because its 'appeal lies in the idea of “groundedness”; institutions are themselves the ground from which emerge outputs of research – ideas, proposals, hypotheses, experiments, data and reported results'. Conversely, the authors note that,

...it is not yet clear whether institutional repositories will take root and flourish... The concept of institutionality is an increasingly fragile one when we consider digital content and digital libraries and we, therefore, must ask whether we should be developing institutional repositories at all. (Jones et al. 2006: 17)

Jenkins and Breakstone (2005) provide some interesting ideas regarding repository promotional work and suggest that librarians avoid library jargon when promoting a new repository, since it is better to use terms that are more readily understood and have meaning for the target audience (a similar suggestion was made by Gibbons (2004)). At the University of Oregon, Scholars' Bank was the chosen term. Likewise, Ohio State University decided to focus on creating a Research Bank or Knowledge Bank for their academic community (Rogers 2003). Jenkins and Breakstone (2005: 317-318) also direct librarians and repository developers 'to position the repository as complementary to traditional publishing'. Whilst this idea of complementarity sounds positive, thought has to be given to what academics previously needed and expected from the traditional publishing industry and how this has changed with the development of repositories. Jones et al. remind us that in

...pre-digital times, when researchers wrote up their results for publication, they would have been posted to a publisher – the only agent with the technology to present the finished paper in pleasing form and to reproduce it... In the digital age, the presentation and reproduction function do not require the intermediation of a publisher. (Jones et al. 2006: 18)

If any intermediary work is to be carried out in the digital repository age, it should be thought of in terms of workflow and administration. To properly manage a repository, all persons associated with its development and maintenance must be prepared,

...to examine how [to] structure the administrative tasks so as to produce individual modules, or workflow steps, which then allow for a standardised treatment of the relevant elements of the system. (Jones et al. 2006: 86)

For an institutional repository, there is a predefined list of workflow areas with specific tasks that need administering:

A significant portion of the digital repository literature demonstrates a justified concern with copyright laws and other aspects of intellectual property rights, such as moral rights and database rights (Gadd et al. 2003a, b, c; Gladney 1999). Within this area of responsibility, repository managers are advised to 'examine the needs of each of the main stakeholder groups involved in the creation and dissemination of [scholarly works, materials, or data]' (Jones et al. 2006: 140). Stakeholder groups can include authors, institutions, funding bodies, publishers, users, libraries and members of the general public and each will have their own priorities. An author's priority, for example, is to have other individuals access, make use of and cite their work, for scholarship and learning; thus s/he is likely to be concerned with just certain aspects of copyright (i.e., that his or her name should be associated with the work and the work should not be amended or exploited commercially, without permission). At least one member of a repository management team will have to discuss the individual elements required for 'a comprehensive deposit and end-users licence agreement, including a depositor's declaration, the repository's rights and responsibilities and [material] re-use terms and conditions' (Jones et al. 2006: 148).

Case studies pertaining to repository management are growing and with the dawn of a new repository era it is useful to draw attention to Ray's (2001: 4) note that 'case studies of library work are not prominent in the literature on librarianship'. Why then are case studies so important to repository work? In Ray's (2001: 4) view 'there is a growing interest in the future role of librarians, but it typically views the production of new roles as linked to technology'. Repository development work is transforming the technology and culture of scholarly communication; hence case studies are needed to help information professionals bear witness to this gradual process.

Pinfield et al. (2002), Ashworth et al. (2004) and Hey (2004) each write about what it was like to set up institutional e-prints repositories at the Universities of Edinburgh, Nottingham, Glasgow and Southampton in the UK. Pinfield et al.'s study explains how the project management team tried to make it,

...as easy as possible [for researchers] to contribute. At the beginning [the project team] allowed researchers at the university to e-mail papers to an archive administrator [thus emphasizing that] the library would do the work. The team felt that the academics [did] not want additional bureaucratic burdens nor did they want to learn new IT skills. (Pinfield et al. 2002: 8)

In the US, Rogers's (2003: 127) paper indicates that 'while defining the scope of [Ohio State University's] Knowledge Bank, the Planning Committee considered steps other institutions [were] taking to manage their digital content'. In Australia, Kennan and Wilson encourage repository managers to learn from research and practice in Information Systems, i.e., work associated with the phrase 'requirements uncertainty'. The creation of a repository can be an incremental process or a results-driven process, meaning that 'other institutional intellectual capital and additional functionality could be added as organizational change and learning takes place, or as more resources become available' (Kennan and Wilson 2006: 11).

The tasks associated with developing and managing a repository are becoming increasingly clear now that resources are available to help new repository managers adjust to their roles. Soon, the future of repositories and their success will be left to those who know not only how to develop them, but also how to evaluate them. Are digital repositories fulfilling their primary objectives? How are these objectives evolving over time and how can we be sure that they are meeting the needs of users? In the next section of this paper, we discuss the findings of a Joint Information Systems Committee-funded project carried out in 2005 and 2006, which was designed to evaluate five different types of digital resources across the United Kingdom from a management perspective, a user perspective and a Web-based perspective using a new link analysis software tool, LexiURL.

The Joint Information Systems Committee study

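The JISC-funded user needs study was initiated in September 2005, shortly after the implementation of the 2005 Digital Repositories Programme. The following public repositories, including one digital library, were selected for evaluation: the National electronic Library for Health (a digital library), CogPrints (a subject specialty repository), e-Prints Soton (a university-based e-prints repository), the UK Data Archive (a data repository) and Jorum (an e-learning repository).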

From a management perspective, our research goal was to acquire an in-depth understanding of the creation rationale for each repository, the collaborative work associated with the resource's construction, the managers' strategies for identifying users and promoting the resource and their current approach to using Web statistics for user assessment. In the second phase, we employed an online questionnaire to learn more about the needs and perceptions of the current users; that is, the factors motivating them to use (or not use) the resource and their general usage experiences. With the introduction of LexiURL, a new Web link evaluation program, the third aim of the study was to provide repository managers with a Webometric plan for investigating potential users or uncovering hidden user communities, so that they might work towards building stronger links (i.e., Web and real-world links) between themselves and other relevant organizations or activities, at national and international levels.

Repository types and management practices

The managers who agreed to meet with us for interviews took an average of one and a half hours to respond to a set of questions listed in a structured interview schedule (see Appendix). The questions for each interview session (five sessions in total) were the same; however, short discussions occurred during our meetings when it was valuable to elaborate upon specific points. Some of the managers met with us on an individual basis and others came to us in teams of two or three people. All remarks in quotation marks that follow have been copied from the interview transcripts.

Since the selected repositories, including one digital library, were different in type, it was interesting to evaluate them from a comparative perspective. By choosing to evaluate the National electronic Library for Health (NeLH), our aim was to determine if certain aspects of digital repository management could be learned from current practices in digital librarianship. During the period in which we carried out our management interviews we first learned about the rationale behind each resource's construction.

National electronic Library for Health

The National electronic Library for Health was created in 1998 because of,

...a realisation that clinicians, doctors, nurses, speech therapists, dieticians and all kind of therapeutic professionals, needed access to information quickly (Service manager, NeLH).

According to the service manager, health care professionals across the UK sometimes find it difficult to achieve quick and easy access to medical information when they need it. Normally this is related to the fact that the library of a hospital or medical centre is located in a separate wing and is not always a convenient place to get to in order to do an information search. Often, medical professionals are also called to work outside a traditional medical setting and thereby find themselves in a position where it is too time consuming to get to a medical research library. When using a Web-based digital library, a health care practitioner will only need to be in a place where he or she has access to the Internet; hence, the service was made Web-based to:

...provide clinicians with access to the best current evidence on conditions and treatments to improve patient care (Service manager, NeLH).

At the time it became available it was not 'aimed at the public' but 'it has a sister service called the NHS Direct that is'. Both were developed 'more or less at the same time' and 'there [has been] a lot of cross usage … something like 10-15%'.

CogPrints

CogPrints, the subject specialty repository, was created in 1997 for the cognitive science research community, because of the success of the Los Alamos physics e-prints arXiv. At our interview with the manager, we learned that there was a background interest 'in demonstrating that [subject specialty repositories] were not just for physicists' and that they could 'work for other disciplines'. The CogPrints manager was convinced that if the new subject specialty repository grew to be successful, it would show that 'archives with self-archive papers [were] not just a special quirk of physics'.

e-Prints Soton

e-Prints Soton was created in 2002/2003, at the same time as 'the JISC FAIR programme was initiated' (and was funded by the same programme) and shortly after the ECS database was created at the Electronics and Computer Science department. The development of e-Prints Soton was closely associated with 'the issue of push and pull of the open access movement' (Service manager A, Soton). This university-based repository team felt that it was a 'natural progression in the publishing debate as a whole' and that the creation of e-Prints Soton:

...would enable the university to organize its institutional research output in a way that would allow better analysis of where the research is going (Service manager B, Soton).

UK Data Archive

The UK Data Archive was created in 1967/1968, because:

...the UK research council thought it would be a good idea [to create] a one-stop shop [for researchers] so that … rather than having to individually go to the data providers, mainly government departments and commercial data providers, [they] would be able to go to a central location and obtain all their data (Service manager, Data Archive).

The notion was to get one,

...organization brokering access agreements and licensing arrangements and copyright arrangements rather than individuals having to do that on a one to one basis (Service manager, Data Archive).

Research councils have been major sources of financial support for:

...data collection exercises; therefore in order to maximise secondary use of the data, [sponsored researchers have been] required to offer data to the archive (Service manager, Data Archive).

Essentially, 'the data archive' was first created as an 'archive of investments, made by the research councils themselves'. According to the manager, this repository 'has become more important over the last few years … because of its change in status'. Not only has it gradually become a digitized resource (since 1999), it is also:

...a legal place to deposit; the only digital repository in the country that has legal place as status, [which] means that members of the public can come to [it to] acquire digital materials (Service manager, Data Archive).

Jorum

The Jorum e-learning repository was created in 2005/2006 and funded by JISC to host:

...content created for the [higher and further education] community [as well as] to stimulate a community of users for teaching resources (Service manager A, Jorum).

Outside the United Kingdom 'other teaching and learning repositories' have been created,

...but none that were doing quite the same as Jorum. MERLOT is… another international repository, [which is] essentially a library catalogue system where people can come and search for content but the content isn't contained within the repository. This is not the case with Jorum, since… it houses metadata records that describe the content that can be found elsewhere, but it can also be held in the repository itself (Service manager A, Jorum).

Collaborative work

When we asked our interviewees to provide a brief explanation of who was or is currently involved in their project, all confirmed the importance of collaboration or teamwork. CogPrints, for example, was:

...created by an Electronics and Computer Science PhD student at Southampton University. The second version, post Open Access Initiative (OAI), was rewritten by another PhD student to make CogPrints OAI compliant. The third version [was] taken over by another PhD student at Southampton (Service manager, CogPrints).

and since then CogPrints has been the project of,

a very savvy population of computer scientists and pretty good hardware resources (Service manager, CogPrints).

When it was time for the University of Southampton to develop another, much larger e-prints archive, a project team was formed by the Southampton Oceanography Centre library, which included the Centre, the School of Electronics and Computer Science and Information Systems Services. The members of this library project team found that 'it was [much more of a] collaborative effort within [different parts] of the institution'. We were told that:

...after a period of initial development... backing from the Deputy Vice-Chancellor for Research was secured [because] there was a further 'buy-in' from the institution regarding [the repository's value for] the UK Research Assessment Exercise. It was possible to see how a repository would help manage the Research Assessment process and aid the management of information more generally in relation to research (Service manager A, Soton).

Identifying and understanding users

To populate a digital repository with useful materials, a professional development team needs to identify and sufficiently understand the needs of their service's primary users. At the National electronic Library for Health, a number of user groups were identified from the pilot work, when the management team

...had panels of people reviewing material and coming to some sort of consensus [regarding the material's value]. The user groups consisted of a wide range of people, including doctors, nurses, various other allied health professionals and library and information workers as well (Service manager, NeLH).

All of the users have now become key partners because they act as advocates on [the National electronic Library for Health's] behalf by getting people to use the library and they also do a lot of training… in literature searching, for example and use of database (Service manager, NeLH).

Through these user groups the management team has identified a key quality resource that the clinicians feel they need to access quickly. Hitting the Headlines, for example,

...is a review of the coverage of health issues in the press. Two or three times a week, a story is picked up from the press and examined, conclusions are then drawn as to the validity or otherwise of the newspaper reporting. Clinicians find this very useful as patients pick up on these stories from the TV or newspapers and often clinicians are not aware of what the position is. So it helps clinicians to help patients (Service manager, NeLH).

At e-Prints Soton, the development team,

...wanted to capture the whole output of the University, but saw that [they] needed to start in a specific area. Research was [their] key focus, [and this included] conference papers, posters, project reports and all the different things that research encompasses (Service manager A, Soton).

The researcher was the primary user that they had in mind; therefore one management interviewee said: 'what the researcher thinks is important is what goes into the repository' (Service manager C, Soton).

Another interviewee added:

When we talk about users we mean… people who are depositing the work, such as authors. Users were wanted from a spread of areas across the university but we started with [those] we knew were interested. Other groups were targeted which would set good examples for the rest of the university. For example, Education was targeted as they would not have a database that they could regularly deposit material into, so they would be encouraged to self archive via the repository and hence show other faculties that it is a good idea to be proud of their research and have it made visible (Service manager B, Soton).

Promotional work

After a repository is created, people are expected to become users; however, new users will not necessarily recognize a service's value unless it is sufficiently publicised. At the UK Data Archive leaflets are available to the general public concerning all branches of its service:

Publicity takes on a variety of forms, including the distribution of hard copy documents, electronic documents and specialist documents aimed at specialist audiences. The publicity materials inform users and potential users – e.g., the Archive's annual report – but they also serve another purpose of showing sponsors what [the management team] is actually doing (Service manager, Data Archive).

Users of the Data Archive are invited to register and provide contact details so that they can access all materials. A newsletter is available in hardcopy and as a .pdf version on the Web. The Archive has '20,000 registered users, but [approximately] only 4000 have asked for a hard copy' (Service manager, Data Archive). Mailing lists are also used to inform users of new releases of data and a lot of promotional material such as paper brochures are produced and distributed at workshops and conferences.

The Jorum e-learning repository launched its resource for public use in two stages. First, new depositors and contributors were given an opportunity to become familiar with the repository (in November 2005), then, shortly after,

...the user service, [which allows] people to download content, went live in January 2006. The two separate services were staggered slightly to allow some content to build up (Service manager A, Jorum).

Throughout the two launches, articles were written, newsletters were produced and mailing lists were targeted. The Jorum e-learning repository is also promoted at events, some on invitation; others organized by the management team.

We promote the service to e-learning, ILT people, learning resource staff in institutions. We do this in a variety of ways in attempt to get at end users, so we promote to the right people in the right places to encourage uptake (Service manager B, Jorum).

With respect to user training,

...a train-the-trainer approach is used, whereby training and outreach events are held all over the country to give an overview of what Jorum is and showcase some of the [deposited] materials. These are typically half day events, for intermediaries who will in turn pass the information on to end users. The intermediaries are provided with the resources to deliver sessions to users at their institutions (Service manager B, Jorum).

Measuring success

Since many different assessment programmes and tools may be used to measure the success of a new digital resource, one of our objectives was to ask the library/repository managers if and how they had been obtaining actionable information from the Web the better to understand users.

The National electronic Library for Health service manager demonstrated a high degree of awareness regarding his users:

We know that [they] in many cases are overworked, exceptionally busy and have a number of competing priorities; therefore our strategy is really to try and sell NeLH to them, by telling them what's in it for them. The key messages are that we're always available, that we're easy to find and you can find the information within a few minutes of going onto our site (Service manager, NeLH).

Asked if usage statistics were collected for assessment purposes:

Yes, we do, on a monthly basis. The statistical software used to track users is called WebTrends® and it enables information such as what are the most visited pages, the average time spent on the site, entry and exit pages, so it enables, to a certain extent, the mapping of a user's journey through the site (Service manager, NeLH).

We asked if the service traces where users come from and he said:

Yes, Google was one of the highest entry points to [the National electronic Library for Health Website] (Service manager, NeLH).

Regular use of the service,

...breaks down something like 40% General Practitioners (GP's), 30-35% nurses and 15% professions allied to medicine. The remainder is students and the general public. The students are from a variety of related areas, such as life sciences (Service manager, NeLH).

We asked if the management team had come across any benefits to current or new users:

Yes, we have and do. Success stories are a key part of NeLH's public relations. For example, we make a point of publicizing the fact that someone saw something on our site that directly benefited or contributed to patient care. Some individuals have said: I've changed, or improved my practice through something I've read. Testimonials of this nature demonstrate that people are finding the National electronic Library for Health useful. Some people volunteer this information through the feedback facilities available on the site. Positive feedback is received on what people found on the site, for example: I was able to do this, because I found this. Other information is sought by asking clients in the user community and the library community and they relay the feedback that users have given (Service manager, NeLH).

The e-Prints Soton management team spoke about collecting some usage statistics for a user assessment, but admitted that this has not been a major part of their focus yet.

Yes, we have done a little bit of this, [but] our main focus has been to work on the [development of] the repository. We are very conscious about the fact that we need to see and show the vice chancellor some good statistics. At the moment statistics are modest but they will be much more sophisticated and will tie in with other statistics for other repositories around (Service manager A, Soton).

When asked, 'Who have you identified recently as the main users of E-Prints Soton?' the reply from another interviewee was:

Academic users [i.e., faculty] within the University use it for their own reasons, whether it is to create a bibliography or see what other people are doing. There are also users from outside the university, internationally. We know this because we get e-mails from all over the world, particularly in nursing (Service manager C, Soton).

The CogPrints manager mentioned that this repository's user base was located worldwide and that a majority could be identified as 'almost certainly academics'. He also stated that the subject-based repository was 'not the kind that the layman would be particularly interested in'. In terms of collecting actionable information from the Web, the management team at CogPrints has implemented an online system for collecting Web statistics, but the manager indicated that provider satisfaction was more relevant than user satisfaction:

The relevant question is how do you get the 85% of the non-providers to be providers, so that they can get the enhanced impact. CogPrints should not be looked at ... in fact, open access [to published articles] should not even be looked at from a user standpoint; it should be looked at from a provider standpoint (Service manager, CogPrints).

The Jorum e-learning management team said that the collection of usage statistics was 'one of the things that [they were] currently looking at'. One of the interviewees stated:

...currently we are collecting statistics on who is logging on to the service, the number of downloads, etc. (Service manager A, Jorum).

This respondent was able to tell us that they were up to 140 registered higher education and further education institutional members.

LexiURL link analysis

LexiURL is free software designed to retrieve link data from search engines, like Yahoo!, Google, or AltaVista and calculate summary statistics for lists of links or URLs. Its output is a series of standard reports that convey information about page URLs, sites and Web domains linking to a main site of interest. Although LexiURL is a flexible, generic program, many of its functions are useful for a digital repository link analysis.

Before each of the management interviews, a Web link analysis report was prepared and presented to the managers at the meetings for discussion of the implications of the data. All link data were organized in a uniform format that explained how the links could be examined or manipulated for evaluation purposes, or visited on the Web for further insight. The information given to the managers included a list of the page URLs linking to their repository, a list of all second and top level domains and a co-link network map. Figures 1 and 3 show two co-link map examples: one created for the e-Prints Soton management team (October 20, 2005) and another created for the Jorum team (April 3, 2006). Distances between the nodal points (Websites) represent a kind of similarity-based relationship of 'co-linkedness' on the Web. Websites are co-linked 'when two pages both have inlinks from a third page' (Thelwall 2004: 5). Lines leading to the site of interest represent directed inward links and line thickness indicates the link frequency.
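The co-link definition quoted above can be illustrated with a minimal Python sketch. This is not LexiURL's implementation, which the paper does not describe; the function name colink_counts and the example link pairs are hypothetical and invented purely for illustration. The sketch assumes that link data are available as (source page, linked site) pairs.

```python
from collections import Counter, defaultdict


def colink_counts(links, target):
    """Count how often each site is co-linked with `target`.

    `links` is an iterable of (source_page, linked_site) pairs. Two sites
    count as co-linked once for every source page that links to both.
    """
    # Group the sites that each source page links to.
    outlinks = defaultdict(set)
    for source, site in links:
        outlinks[source].add(site)

    counts = Counter()
    for sites in outlinks.values():
        if target in sites:
            # Every other site on this page is co-linked with the target.
            counts.update(sites - {target})
    return counts


# Hypothetical link data, invented for illustration only.
example_links = [
    ("pageA", "repository.example"),
    ("pageA", "site-one.example"),
    ("pageB", "repository.example"),
    ("pageB", "site-one.example"),
    ("pageB", "site-two.example"),
    ("pageC", "site-two.example"),
]

example_counts = colink_counts(example_links, "repository.example")
print(example_counts.most_common())
```

In a co-link map of the kind shown in the figures, higher counts would correspond to sites drawn closer to the site of interest.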

Our research interest in the maps was to give the repository managers a method of visualising the Web network in which their service was situated, at the time of the study. The e-Prints Soton site was situated within an academic co-link environment, as expected, but many of the co-linked sites were not directly linked to e-Prints Soton. Jorum's co-linked sites were either university sites, or sites related to e-learning (e.g., the MERLOT e-learning resource).

Figures 2 and 4, following each co-link map, graph the number of different sites in second or top level domains that contain at least one page linking to the e-Prints Soton Website and one page linking to the Jorum Website. The responses obtained from the managers concerning these data were positive, given the fact that we were introducing a Web analysis technique that they had not seen before.
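A count of this kind could be produced from a list of linking page URLs roughly as follows. This is a sketch under stated assumptions rather than the procedure LexiURL uses: each distinct hostname is treated as one site, the set SECOND_LEVEL_CCTLDS is a short illustrative list of country codes for which the second-level label (e.g., ac.uk) is kept, and all URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Assumption: country codes under which the second-level label (e.g. "ac.uk")
# is the domain of interest. This short set is illustrative, not complete.
SECOND_LEVEL_CCTLDS = {"uk", "au", "nz"}


def domain_of(host):
    """Return the second-level domain for listed country codes, else the TLD."""
    labels = host.lower().split(".")
    if labels[-1] in SECOND_LEVEL_CCTLDS and len(labels) >= 3:
        return ".".join(labels[-2:])
    return labels[-1]


def sites_per_domain(page_urls):
    """Count distinct hostnames (a simplification of 'sites') per domain."""
    sites = defaultdict(set)
    for url in page_urls:
        host = urlparse(url).hostname
        if host:
            sites[domain_of(host)].add(host)
    return {domain: len(hosts) for domain, hosts in sites.items()}


# Hypothetical linking pages, invented for illustration only.
example_pages = [
    "http://www.uni-a.ac.uk/links.html",
    "http://www.uni-a.ac.uk/other.html",  # same site, counted once
    "http://lib.uni-b.ac.uk/eresources",
    "http://www.uni-c.edu.au/databases",
    "http://blog.example.org/post",
    "http://www.example.com/",
]

example_domain_counts = sites_per_domain(example_pages)
print(example_domain_counts)
```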

A link analysis using LexiURL should ideally be carried out for each of the repositories approximately every four to six months. Over this period a manager may be able to detect changes in the co-link maps representing the resource's online network or Web community. New links might appear and a regular review of their context (i.e., where they are situated on a Web page and what type of organization is creating the link) would give managers an opportunity to think about places where new users might be surfing the Web and address the needs of potential user groups. From our initial analyses, we discovered that a growing proportion of links to the UK resources were coming from international Websites (e.g., the University of Queensland, Australia, directed a link to the e-Prints Soton site, listing it as a key resource on their Databases for Social Sciences page). As a result of this information, we are certain that managers will want to see that these links are preserved and will want to know if such links are being followed as access points to their resource.
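A periodic review of this kind amounts to comparing two snapshots of the list of linking pages. The following Python sketch shows one possible way to do so; it is not a LexiURL feature. The function names, the example URLs and the use of a ".uk" hostname suffix as a rough proxy for "UK site" are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def compare_snapshots(earlier, later):
    """Return links that are new, lost, or kept between two snapshots."""
    earlier, later = set(earlier), set(later)
    return {
        "new": sorted(later - earlier),
        "lost": sorted(earlier - later),
        "kept": sorted(earlier & later),
    }


def international_share(urls, home_suffix=".uk"):
    """Share of linking pages whose hostname does not end in `home_suffix`.

    Assumption: the hostname suffix is only a rough proxy for location.
    """
    hosts = [urlparse(u).hostname or "" for u in urls]
    if not hosts:
        return 0.0
    abroad = [h for h in hosts if not h.endswith(home_suffix)]
    return len(abroad) / len(hosts)


# Hypothetical snapshots taken about six months apart.
first_snapshot = [
    "http://www.uni-a.ac.uk/links.html",
    "http://lib.uni-b.ac.uk/eresources",
    "http://www.example.org/resources",
]
second_snapshot = [
    "http://www.uni-a.ac.uk/links.html",
    "http://www.example.org/resources",
    "http://www.uni-c.edu.au/databases",
    "http://www.college-d.ac.uk/library",
]

changes = compare_snapshots(first_snapshot, second_snapshot)
print(changes["new"])
print(international_share(first_snapshot), international_share(second_snapshot))
```

The "new" list gives the pages whose context a manager might review, and the "lost" list flags links that have not been preserved.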

LexiURL can also easily be used as a supplement to a log file analysis. Log files provide information about daily user activities on the Web, either in terms of the search engines used and phrases/words users type to carry out a search, or the Web URLs (links) that are being followed. A LexiURL analysis is a complement to log file data because it extracts lists of links from the Web (using Yahoo!) that exist 'in the wild', which can be compared to log file (followed) URLs. Furthermore, we recommend that managers consider using LexiURL to perform comparative link analyses with 'competitor' sites or other international repositories similar in scope and purpose. If more links or different types of links are found to be directed to the site of another similar resource, then perhaps these links represent previously unrecognized users, or areas for further outreach and cooperation.
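The comparison between links found 'in the wild' and links actually followed can be sketched with simple set operations. The sketch below assumes that both lists have already been extracted (one from a link analysis report, one from the referrer field of a server log); it does not parse any particular log format, and the function names and URLs are hypothetical.

```python
from urllib.parse import urlparse


def normalise(url):
    """Reduce a URL to host + path so trivial differences do not hide matches."""
    parts = urlparse(url.strip())
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def compare_links(wild_urls, referrer_urls):
    """Compare pages that link to the repository with pages users came from."""
    wild = {normalise(u) for u in wild_urls}
    followed = {normalise(u) for u in referrer_urls}
    return {
        "linked_and_followed": sorted(wild & followed),
        "linked_not_followed": sorted(wild - followed),
        "followed_not_found": sorted(followed - wild),
    }


# Hypothetical data: links reported by a search engine, and log referrers.
wild_links = [
    "http://www.uni-a.ac.uk/links.html",
    "http://lib.uni-b.ac.uk/eresources/",
    "http://www.example.org/resources",
]
log_referrers = [
    "http://uni-a.ac.uk/links.html",
    "http://search.example.com/results",
]

link_report = compare_links(wild_links, log_referrers)
for category, urls in link_report.items():
    print(category, urls)
```

Links in the "linked_not_followed" group exist on the Web but bring no recorded visits, which may suggest where further publicity is needed.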

Figure 1: Top 49 sites co-linked with University of Southampton e-Prints, including directed links.

Figure 2: Second or top level domains with at least one page linking to the University of Southampton e-Prints.

Figure 3: Top 49 sites co-linked with Jorum, including directed links.

Figure 4: Second or top level domains with at least one page linking to Jorum

Repository users

The results of our user survey provided us with current information concerning the perceptions some users have of the repositories, what they want or need from them and how they approach them on the Web. Our survey was carried out on the Internet using a Web-based questionnaire. To obtain participants we compiled a set of relevant mailing lists on the Internet (e.g., mailing lists for health care professionals, lecturers, educators, researchers, etc. who would likely be interested in the repository's content) and sent out announcements regarding our questionnaire through the lists. We wrote to some of the school heads at the University of Southampton and asked if they would agree to circulate an announcement regarding our questionnaire and some of the repository managers were helpful in encouraging people to complete our survey.

Figure 5 shows the total number of survey respondents corresponding to each service. 54% of the respondents were female; 44% were male (2% of the individuals surveyed did not respond to the gender question). As expected, the majority of individuals who completed the survey were between the ages of 25 and 65 (92%) (2% did not respond to this question). 82% of our survey respondents were residents of the UK.

Figure 5: Total number of survey respondents corresponding to each service.

Sixteen percent of respondents were residents of other countries, for instance, the United States, Australia, Canada, New Zealand, Israel, China, India, Thailand, Haiti, Iran, Ecuador, Bolivia, Brazil, Mexico, Uruguay, Trinidad and Tobago and parts of Europe (i.e., Italy, France, Germany, Turkey, Hungary, Finland). Most of the foreign survey respondents were associated with CogPrints, but this was expected since this subject repository has a greater international focus than the other resources evaluated here. Most of the respondents were librarians or information professionals or academic staff and researchers. However, some managers (i.e., IT or project managers) and many nurses, teachers, students and physicians and public health care practitioners also completed the survey.

Figures 6, 7, 8, and 9 and Table 1 below present the results obtained concerning the questions shown in captions.

Figure 6: How did you first learn about the existence of the library/repository?

Figure 7: Are you a user of [digital library/repository name]?

Figure 8: What is your usual Web access point to the digital library/repository?

When we compared all Web access point answers to the digital library/repository to the frequency of use responses, we observed that:


Table 1: Web access point and frequency of use
Rows: frequency of use. Columns: Web access point to the digital library or repository.

| Frequency of use | Bookmarked (83) | Type URL to reach site (50) | Follow link from another page (42) | Personal homepage link (20) | Search for it by name on Web (18) | Other (e.g., e-mail link / Athena portal / Desktop icon) (8) |
| --- | --- | --- | --- | --- | --- | --- |
| Everyday | 16 | 2 | 3 | 3 | 0 | 0 |
| 2-3 times a week | 14 | 12 | 6 | 8 | 1 | 1 |
| Once a week | 10 | 5 | 5 | 2 | 3 | 1 |
| Approximately every 2 weeks | 16 | 4 | 6 | 0 | 3 | 1 |
| Once a month | 4 | 5 | 6 | 2 | 2 | 0 |
| A few times a year | 23 | 22 | 16 | 5 | 9 | 5 |
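The counts in Table 1 can be turned into column totals and within-row shares with a few lines of Python. The sketch below uses only the figures reported in the table; the shortened column labels and the function names are introduced here for illustration.

```python
# Column order follows Table 1; labels are shortened for readability.
ACCESS_POINTS = [
    "Bookmarked",
    "Type URL",
    "Follow link",
    "Personal homepage link",
    "Search by name",
    "Other",
]

# Counts copied from Table 1 (frequency of use by Web access point).
TABLE_1 = {
    "Everyday": [16, 2, 3, 3, 0, 0],
    "2-3 times a week": [14, 12, 6, 8, 1, 1],
    "Once a week": [10, 5, 5, 2, 3, 1],
    "Approximately every 2 weeks": [16, 4, 6, 0, 3, 1],
    "Once a month": [4, 5, 6, 2, 2, 0],
    "A few times a year": [23, 22, 16, 5, 9, 5],
}


def column_totals(table):
    """Sum each access-point column across all frequency rows."""
    return [sum(column) for column in zip(*table.values())]


def row_shares(table):
    """Express each row as shares of that row's total."""
    shares = {}
    for label, counts in table.items():
        total = sum(counts)
        shares[label] = [c / total if total else 0.0 for c in counts]
    return shares


totals = column_totals(TABLE_1)
shares = row_shares(TABLE_1)
print(dict(zip(ACCESS_POINTS, totals)))
print(round(shares["Everyday"][0], 3), round(shares["A few times a year"][0], 3))
```

The column totals reproduce the figures given in parentheses in the table header (83, 50, 42, 20, 18 and 8). The row shares show, for example, that 16 of the 24 everyday users reported a bookmark as their usual access point, compared with 23 of the 80 respondents who use a service a few times a year.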

Figure 9: The material on the digital library or repository is usually relevant to what I need.

Two additional sections of the user survey were created to encourage users to indicate what type of information they would like to see available at each of the online resources and state also what type of benefits they had experienced when using materials from the repository or library site. One user of the National electronic Library for Health stated that s/he was able to obtain information that is clinically relevant much faster than previously. Another user wrote about the personal benefits of using the health library online:

I accidentally stumbled on to some useful information concerning a condition I suffer from myself. This information was completely new to me and started me on the road to finding some more, which has offered me another treatment option and improved my own health (User C).

A user of CogPrints said that s/he was, '...getting better in [his/her] work and feeling more comfortable about being in touch with great resources for free' (User F). One unexpected benefit related to us by an e-Prints Soton user was that after depositing materials on the site s/he had '...received contact from other researchers with similar interests' (User D). Also, one UK Data Archive user said that, 'the availability of data led to a new funded stream in his/her research programme' (User G). And, finally, we were told by one Jorum user that the existence of this e-learning repository (even at its earlier stages) had 'increased the stimulation of teaching staff and their motivation' (User E).

Amongst the individuals who completed our surveys (excluding the National electronic Library for Health) 50% identified themselves as service users and 50% as non-users. Ten percent of non-users commented that they were still learning about the services and 22% claimed to be interested in using them in the future. Only 8% stated that they did not want to use them at all.

Some of the survey respondents who indicated that they were not users of the services studied, said that they were users of other types of digital libraries and/or repositories; hence, when we asked, 'For what purpose did you use another digital library or repository?' the responses were as follows:


Management implications

Digital repositories are not static and require ongoing evaluation to determine their quality and to identify new directions for growth. Management teams of well-established and well-used repositories may need to become knowledgeable about collecting Web link statistics, download statistics or citation statistics in the future for a variety of analytic purposes, so that interested parties will have an adequate measure of a repository's success.

Repository uses can be as varied as the users themselves; hence it is important for managers to communicate regularly with users (e.g., through an open forum) in order to share information and obtain feedback. Repository management teams who set up and maintain registration databases, listservs, or interactive newsgroups for users are engaging in an important management practice.

Although the development and management of a digital library differs from the process of creating and managing a digital repository, there are times when repository managers can and should learn from the work of their digital library colleagues. Because we included the National electronic Library for Health in this study, we obtained a valuable point of reference for how repository managers might understand users. The National electronic Library for Health management team spent a lot of time researching the needs of users; this proved to be good practice, particularly in terms of users promoting the digital library. Repository managers are focused on how to develop their repositories and are intent on encouraging individuals to deposit, but over time they will have to focus more on understanding long-term user needs. A user-based focus will become especially important for managers of e-learning repositories because of the value that e-learning objects are expected to have for lecturers and students in higher education.

Based on the survey information generated from non-users, repository managers should not assume that non-use of their resource is due to an ignorance of or lack of familiarity with digital resources. Potential users could be using other types of digital libraries and repositories; therefore, it is good management practice to try to find out more about what is attracting them to other repositories (possibly online competitors) and to develop publicity programmes that will bring people up to date on what makes their resource especially valuable.

Digital repository managers may need to give more consideration to the importance of personal information sharing among friends and work colleagues (Rosen 2000). A significant number of individuals surveyed for this project indicated that they had learned about the services studied through a friend or colleague. Initial evidence was also found to suggest that repository use can contribute to collegial networking. For example: 'I have received contact from other researchers with similar interests' (User D).

Personal Website links to online digital resources are normally not plentiful (e.g., Beaulieu 2005); however, based on this project's user survey, we discovered that persons who frequently use the studied services sometimes do have a directed link from their personal Website. A regular LexiURL link analysis should give the manager new insight into the number of personal pages linking to their resource over time, including some of the growing number of Weblogs. What is the relationship between the source of the link and the link target? Does the source simply acknowledge the digital resource or provide descriptive information concerning parts that they appreciate, recommend to others, or have consulted to great benefit?

Training implications

Earlier we indicated that the basic requirements to run a library or digital library successfully are covered to a greater or lesser extent by traditional library and information science schools' curricula, but none, to our knowledge, focuses on the particular needs and requirements of repository managers. It may be, of course, that some programmes on digital libraries include repositories as a type of digital resource. Mezick and Koenig's (2008) recent review of information science education draws attention to new programmes in knowledge management, information architecture and digital libraries, but makes no mention of other emerging areas such as social informatics (see Kling 1999) or institutional repository management. Both areas are closely related because managers clearly need to recognize the social context in which new repositories are developed before they can understand how they will influence the ways that people look for and use information.

With the increase in repository activity, there is little doubt that management training will be needed. Surveys of repository managers demonstrate that many felt their way when first starting, often making mistakes through ignorance of what was possible or desirable (Dhiensa 2006). During the interview phase of our study we asked the managers if they had received any training before setting up their repositories. We found that most had not participated in any formal training, or that it was carried out in-house.

For a new curriculum in repository management, materials could be drawn from existing curricula, but much of this information would need to focus on issues specific to repositories. In-house teaching could be supported by contributions from repository managers, for instance, as guest speakers invited to give presentations and share practical insights. The issues that need to be taught apply internationally; hence there is no reason why such a programme could not be provided to a world-wide audience using e-learning methods. The major components of a new curriculum might be (in thematic order):

The changing electronic publishing environment
Repositories
Management issues
Librarianship
Technical tools
Legal issues

Core reading materials for a new repository management programme should include the books written by Jones et al. (2006), Jacobs (2006) and Cockburn (2001); however, most of the supporting literature will be journal articles and Web sites.

Conclusion

Many of the management issues that repository managers are facing are novel and the techniques available to assist them with long-term evaluations are either in their infancy, like LexiURL, or not well known. We have demonstrated the results of one fairly general strategy that can be applied to different repository types, including digital libraries, but because this evaluation has come at an early stage in the repository era, further evaluative research will be needed in the future. This research shows that an overall evaluation process should, at the very least, consider the repository management team's and the users' perspectives and should apply some type of objective measure to determine how these interacting factors are contributing to the repository's success. Most of the literature on repository management demonstrates a concern for institutional repositories and the effect that they will have on research outputs or research assessments; thus further research will be needed to determine how other types of repositories, e.g., learning object repositories, contribute to higher education and what kind of effect they are having on teaching and learning.

In sum, we believe there is a strong case for library and information science schools to develop programmes, or at minimum, specialist modules, to assist the ever increasing numbers of people who wish to train as repository managers. Since our project was limited to repositories in the UK and was an exploratory study, it will become increasingly important to find out how digital repository managers everywhere are learning their trade, keeping up with rapid information technology developments and coping with their training needs. New research, including market research, is needed to establish the best methods of providing such training. Might it, for example, be provided by library schools, computer science departments, professional associations, or commercial training providers? Also, how should it be delivered: by means of short courses, distance learning, or e-learning packages? With the rapid development and growing importance of repositories, these are issues that should not be left to chance.

Acknowledgements

We wish to thank the anonymous referees for their helpful comments. Funding for this research was provided to us in the United Kingdom by the Joint Information Systems Committee's Digital Repositories Programme.

References


Appendix: Management Interview Schedule

Rationale for Creating the Repository
Development of the Repository
Identification of Users and Publicizing the Repository
Benefits of the Digital Repository
Web Link Analysis

How to cite this paper

Zuccala, A., Oppenheim, C. & Dhiensa, R. (2008). "Managing and evaluating digital repositories" Information Research, 13(1) paper 333. [Available 21 November, 2007 at http://InformationR.net/ir/13-1/paper333.html]

© the authors, 2008.
Last updated: 20 November, 2007