Re: Sun & Xia (2007) Assessment of Self-Archiving in Institutional Repositories

From: Steve Hitchcock <sh94r_at_ecs.soton.ac.uk>
Date: Mon, 5 Mar 2007 15:01:45 +0000

A survey soon to be published by the Preserv project sheds some further
light on one of the issues raised by Sun & Xia, that is the method of
deposit in repositories.

This is slightly out of context from the whole survey, which was broadly
seeking to identify preservation policy and activity, but here is the
relevant section from our paper. The results are based on what repository
managers told us, rather than on analysis of actual repositories.

How content is deposited in repositories

Before asking about content in the repositories, we need to know how it
arrives there because this determines how much information, or metadata,
about the content can be acquired at the point of deposit or ingest, and
how reliable that information might be. The original target for IRs was
open access content, copies of published research papers self-archived by
their authors, a low-cost way of building content. Alternatively deposit
might be mediated by someone other than the author. For example, some
senior or prolific authors might delegate deposit to secretaries or PAs.
Where legacy papers are being added to a repository on a large scale
departments might organise batch deposit from existing databases. Or for
repositories where it proved hard to generate content through
self-archiving, deposit might be mediated by repository staff or designated
editors to reduce the time required by authors. Each of these routes
affects the deposit process, notably the quality of metadata, with
repository-mediated deposit potentially offering the highest quality
metadata, but at a higher cost to the repository.

In the questionnaire respondents were asked to estimate the use of each
type of deposit.

1 How is new material deposited in your repository: by author
self-archiving; mediated deposit by an agent on behalf of the author (e.g.
a personal assistant); mediated deposit by repository staff? If more than
one method is used, can you indicate rough proportions, or which method
predominates?

Some responses gave percentage estimates, while others indicated the
predominant method, so the results are presented in two ways in Table 1.

Table 1: Method of deposit in repositories ranked by content volume (based
on estimates from repository managers)

Dominant method                  Repositories
Repository-mediated              8
Self-archiving                   5
Author-agent-mediated            2
Inconclusive                     5

By percentage (all repository types*)
Repository-mediated              50%
Self-archiving                   35%
Author-agent-mediated            17%
* 15 repositories provided figures that were used in calculating this result

By percentage (subject repositories)
Repository-mediated              38%
Self-archiving                   43%
Author-agent-mediated            19%

In percentage terms we find the same order and similar proportions among
all surveyed repositories as for the dominant method, but greater use of
self-archiving for the four subject repositories.
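
As a rough illustration of how the two views in Table 1 can be derived from
the same kind of responses, here is a minimal sketch. It is not the method
used for the survey: the response values are hypothetical, and the names
(responses, dominant_method, summarise) and the tie rule for "inconclusive"
are assumptions made for the example.

```python
from collections import Counter
from statistics import mean

METHODS = ("repository", "self", "agent")

# Hypothetical responses, NOT the survey data: each dict gives one
# repository manager's estimated share (%) of deposits per method.
responses = [
    {"repository": 70, "self": 20, "agent": 10},
    {"repository": 10, "self": 80, "agent": 10},
    {"repository": 40, "self": 40, "agent": 20},
]

def dominant_method(estimate):
    """Return the method with the largest share, or 'inconclusive' on a tie."""
    ranked = sorted(estimate.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "inconclusive"  # assumed rule: no single method predominates
    return ranked[0][0]

def summarise(responses):
    """Count dominant methods and average each method's share across responses."""
    dominant = Counter(dominant_method(r) for r in responses)
    averages = {m: round(mean(r.get(m, 0) for r in responses)) for m in METHODS}
    return dominant, averages

dominant, averages = summarise(responses)
print(dominant)
print(averages)
```

Because each average is rounded separately, the rounded shares need not sum
to exactly 100, which is worth bearing in mind when reading any table of
averaged percentages.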

One repository did not give a usable indication for this table, but summed
up the range of options available to repositories: "We are shifting towards
more self-deposit: 4 schools are fully self-deposit; 3 use nominated
depositors; the rest export data from school databases. Repository staff
will not deposit but will do QA."

Context: This was a closed survey targeted at selected repositories from
the Registry of Open Access Repositories (ROAR) using two criteria:

1 The largest repositories by volume of content, i.e. those with most
content to preserve (obtained by using the ROAR 'Sort by Total OAI Records'
filter button)
2 From 1, those with a 'Preserv profile' (an exemplar preservation service
http://trac.eprints.org/projects/iar/wiki/Profile)

Steve Hitchcock
Preserv Project Manager
IAM Group, School of Electronics and Computer Science
University of Southampton, SO17 1BJ, UK
Email: sh94r_at_ecs.soton.ac.uk
Tel: +44 (0)23 8059 7698 Fax: +44 (0)23 8059 2865
http://preserv.eprints.org/

At 22:22 04/03/2007, Leslie Carr wrote:
>Assessment of Self-Archiving in Institutional Repositories:
>Depositorship and Full-Text Availability
>Jingfeng Xia and Li Sun
>Serials Review 2007; 33:14-21.
>
>Mark Twain allegedly responded to an untimely obituary with the
>comment "reports of my death are greatly exaggerated". This article
>posts an obituary for self archiving after evaluating a small number
>of repositories against two criteria (depositor identity and full
>text percentage); my response is that Self Archiving is making real
>gains in a successful but frustratingly long and drawn-out process!
>
>The article highlights the difficulty of understanding repositories
>"due to time constraints" without (a) casting one's net widely enough
>to insulate from national differences, (b) undergoing an in-depth
>analysis of data and (c) attempting an intimate understanding of the
>processes that are manifest in each repository. A much fuller
>analysis will be soon forthcoming from the European DRIVER project,
>which has devoted resources to these factors.
>
>
>Comments about Southampton:
>
>>The success of the Soton database is primarily because
>>Southampton is the inventor of the EPrints application
>>and the home of an enthusiastic self-archiving
>>advocate - Stevan Harnad. For many years, the university
>>has made tremendous endeavors to encourage
>>self-archiving among its faculty.
>
>The second sentence is true; the first sentence is not. The
>establishment of the Southampton repository is due almost entirely to
>the library staff at Southampton, and principally the efforts of
>Pauline Simpson and Jessie Hey, two of the librarians. Mark Brown,
>the chief librarian, and Wendy White, the repository manager, have
>become responsible for the repository's current success as it has
>moved from pilot to production status. Stevan, myself and EPrints
>have had relatively little to do with the institutional repository,
>dealing mainly with the school repository for (their own) School of
>Electronics and Computer Science. I sit on the University's
>repository steering committee, but have no operational commitment to
>the repository which also has its own independent support staff.
>
>[Incidentally, the ECS school repository is about half the size of
>the current institutional repository, has a full text deposit rate
>of around 70% and consists of almost entirely author deposits. More
>of this later.]
>
>
>>It is worth noting that author self-archiving is not the
>>major way of contribution to the accumulation of the
>>content in Southampton's Soton repository. The findings
>>reveal that the majority of existing documents are not
>>deposited by authors. In other words, for most documents,
>>the name appeared in the "Deposited By" field of
>>a document is not found in the authors of the document.
>
>Since its beginning, the University repository has offered "mediated
>deposit", i.e. librarian assistance for users depositing their items.
>This "mediation" may be light-touch (editorial corrections) or more
>substantial (with the researcher indicating the existence of a new
>paper, and the librarians filling in all the details). A spectrum of
>assistance has been in operation, although the development of the
>service is moving the responsibility from the library staff to the
>schools, if not completely to the faculty themselves. In tandem with
>this natural development, the University management is shortly to
>require all researchers to deposit their research in the repository.
>
>So I am not entirely sure whose name appears in the "Deposited By"
>field - I will investigate and report back!
>
>At the moment the repository is still dealing with a backlog of
>several years' research. Each school is managing its backlog in a
>different way - some are relying on individual researchers to enter
>their data, some have an individual, some are using their own
>(legacy) databases.
>
>Many of the deposits will therefore not have a single individual who
>is named as the depositor, but instead it may be the school liaison,
>or the library staff who are named.
>
>The picture is further complicated by the national Research
>Assessment Exercise, which requires every researcher to have their
>best papers from the last 6 years made available in the repository.
>The level of detail required is quite excessive (accurate months of
>publication, journal ISSNs and DOIs) but ironically, full texts are
>specifically not required. This has meant that although the
>repository is gaining in size, it is suffering from a lack of full
>texts (postprint PDFs) which will be addressed after the RAE
>collection period has finished later this year. The RAE processing
>led to a period of excessive sustained deposit, which peaked at 300
>items per day in the early summer of 2006.
>
>The current state of play is that the repository is receiving 30-50
>spontaneous daily self-deposits, independently of the backlog and RAE
>processing.
>
>>Table 3 shows that lack of full text is obvious in IRs in
>>the European institutions except for the University of
>>Trento, based in Italy.
>
>Although I can't comment on non-Soton repositories in general, I can
>say that a study that had included any of the Dutch repositories
>would have found a much higher rate of full text deposit. As I have
>already stated, full texts are much more common in the ECS repository
>at Southampton (about 70% compliance with the school mandate) and are
>expected to be so in the institutional repository (a) after the
>debilitating effects of the UK RAE and (b) after the adoption of the
>University's new policy of "requirement" rather than "support". I'm
>afraid that it's all a matter of MANDATE - the difference between a
>desirable but ultimately optional task and one that you will be
>expected to accomplish.
>
>
>>Self-archiving as a revolutionary way of publishing has
>>been a myth for a long time.
>It's been an actuality in many fields for many years - Physics,
>Economics, Computer Science. The institutional response to Self
>Archiving has been slow to develop - in a sustainable, supported and
>scalable way. And this is probably to be expected (though
>frustrating) when one considers how many years the issues can take to
>be discussed, agreed, adopted, tested, piloted and migrated through
>the whole of the University committee, management and faculty
>structures!
>
>--
>Les
Received on Mon Mar 05 2007 - 16:00:02 GMT
