A bibliometric study of scholarly articles published by library and information science authors about open access
Jennifer Grandbois and Jamshid Beheshti
School of Information Studies, McGill University, Montreal, Canada
Advances in information and communication technologies over the past two decades have significantly changed the way that information is disseminated and accessed, impacting scholarly communications. Printed journals have become less popular in academic libraries, both due to space limitations and patrons' demands, being replaced with digital publications which can be accessed from many countries across the world. In addition, recent budget cuts in most libraries, as well as the increasing cost of scholarly journals, the prices of which 'have risen faster than the average inflation rate' (Drott, 2006, p.82), have compelled many institutions to contemplate reducing their expensive subscriptions to scholarly publications.
Within this context of limited funds, librarians must find a way to provide the services and materials their patrons desire in the most cost-effective manner. Most information-seekers in this digital age expect free and easily accessible electronic materials (Baich, 2012; Sidorko and Yang, 2009), to the extent that 'in some situations information seekers will readily sacrifice content for convenience' (Connaway, Dickey and Radford, 2011, p. 27). Connaway et al.'s study found convenience to be 'the primary criteria used for making choices' (2011, p. 27), where convenience includes the format of the resource (print or digital), the quality of the content and how much time accessing and using the resource requires. As such, one solution to providing resources that patrons desire without excessive costs to the library involves open access resources, which are freely available online.
Although there are other definitions of open access, in this paper we use the definition presented by the Budapest Open Access Initiative (BOAI). Authors deserve 'control over the integrity of their work and the right to be properly acknowledged and cited,' but this literature should otherwise be freely and easily available on the Internet for anyone to 'read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose' (BOAI, 2012, para. 8).
Accordingly the only barrier to accessing peer-reviewed academic literature should be that of having access to the Internet itself (BOAI, 2012).
There are two methods for authors to make their academic literature available in open access venues: publish their articles directly into an open access journal (gold open access) or self-archive their preprint, postprint or PDF, depending on the copyright regulations of the journal in which they are publishing their work, into an open access repository or on their personal Website (green open access) (BOAI, 2012, para.7). Thus to some extent the choice of how an article is made available is in the hands of the authors themselves, although they may be limited by a lack of appropriate journals to publish in or the funds to publish directly in open access journals. To counter these barriers some institutions have open access policies and resources for authors and some research funding agencies now require a digital copy of their funded authors' manuscripts to be submitted into an open access repository upon acceptance to a journal (Zuccala, 2009, p. 35-36).
The benefits of open access extend beyond the money saved by libraries. Antelman's study found that 'across a variety of disciplines, open-access articles have a greater research impact than articles that are not freely available' (2004, p. 379), which is confirmed by other studies that regarded the open access citation advantage (Davis, 2009; Norris, Oppenheim and Rowland, 2008; Xia and Nakanishi, 2012). Similar research analysed impact factor trends of open access journals over several years showing an increased trajectory of open access publishing and demonstrating that open access journals are comparable to more traditional journals (Gumpenberger, Ovalle-Perandones and Gorriaz, 2013; Mukherjee, 2007; Mukherjee, 2009a; Mukherjee, 2009b; Xia, 2012; Yuan and Hua, 2010). De Bellis lists several additional benefits, such as faster publication and dissemination, easier access for researchers and fulfilling the ethical human 'right to know' (2009, p. 299). Evidently there are many reasons to make academic articles available through open access.
Since library and information science authors should be well aware of these issues, it stands to reason that they would want to support open access publishing in every way possible, including adopting this method of publication for their own academic works. Despite this, previous studies have found that only between 27% (Way, 2010, p. 306) and 51% (Xia, Wilhoite and Myers, 2011, p. 796) of publications by library and information science authors are available through open access options. However, these studies did not look specifically at articles about open access, which would likely have a higher percentage of open access availability, since their authors are more aware of this issue, especially if their articles endorse this method of publication.
This study investigates this issue by gathering information about articles about open access written by library and information science authors, the journals they were published in, and their authors, from 2003 to 2011, to determine:
- What proportion of these articles endorse open access as well as what proportion is available as open access resources (directly or through self-archiving)?
- What are the characteristics of these articles, journals and authors?
- What correlations are there between article, journal and author attributes, and the availability of their publications?
- What are the open access publication longitudinal trends?
Although many studies about open access have been published over the last decade, this literature review targets those which are relevant to the present research and groups them by their main foci for analysis.
Discipline and subject analysis
Bibliometric studies about open access within the library and information science context presuppose differences between various disciplines and attempt to understand open access within this specific context (Kousha and Thelwall, 2006; Mukherjee, 2007; Mukherjee, 2009a; Mukherjee, 2009b; Way, 2010; Xia et al., 2011; Xia, 2012; Yuan and Hua, 2010). Some of these studies were more focused on article and journal analysis, which will be examined in more detail below. Only two of these studies consider the overall rate of article open access availability, which is one of the main focuses of this research (Way, 2010; Xia et al., 2011). To date there is only one study about open access as the subject of sampled articles. Although this study does gather data for multiple years, it is not limited to the library and information science discipline and it focuses on citation analysis, which is not directly relevant (Duzyol, Taskin and Tonta, 2010).
Way sampled articles from the top twenty library and information science journals (including open access journals) and found only 27% of articles available in open access options for 2007 (2010, p. 304-306). Way also emphasized the discrepancy between the high percentage of journals that allow self-archiving and the small number of authors who actually archive their works (2010, p. 306-307). This presents a fundamental problem because 'if professionals in [library and information science] are unwilling to archive their works in repositories, it should not be surprising that repositories face difficulties in recruiting content' (Way, 2010, p. 306-307). Considering that 'a large number of institutional repositories have been initiated and operated by academic libraries' (Xia et al., 2011, p. 799) and that librarian positions are being created specifically for scholarly communications, publishing and digital repositories at university libraries, it is clear that if library and information science authors are not following open access practices themselves there may be a limit to how far the open access movement will advance.
Xia et al. were more concerned with library and information science author self-archiving practices, so open access journals were excluded from analysis, and despite this they reported a more promising 51% open access availability rate for 2006 (Xia et al., 2011, p. 794-796). They were also interested in correlations between author status and other variables, finding publication patterns between librarian and faculty authors to be similar in terms of the percentage of open access publications. Being the first study to compare author status with open access availability in this discipline, Xia et al. were surprised by this finding since 'librarians in general are more familiar with [open access] practice, and they understand better than teaching faculty the importance of free and quick information sharing' (Xia et al., 2011, p. 799). Evidently, understanding the open access publication patterns of library and information science authors is integral to the progression of the open access movement; however, since each of these studies only gathered data for a single one-year period and their methodologies differ, it is difficult to estimate the overall trends. Therefore the present research fills a gap by collecting data on library and information science articles about open access for multiple years to investigate the trends in the rate of open access availability.
A few studies focused on the attributes of the articles themselves from a multidisciplinary approach (Antelman, 2004) or within the library and information science field (Kousha and Thelwall, 2006). In addition to using citation analysis to determine the impact of open access articles across four disciplines, Antelman also tabulated the proportion of preprints and postprints for each selected discipline. The results showed that open access articles are cited between 45% (philosophy) and 91% (mathematics) more often than paid access articles, and also that self-archiving patterns do differ depending on the discipline (Antelman, 2004, p. 376-378). This research contributes to a better understanding of the benefits of open access and demonstrates that the open access culture of each discipline must be studied directly in order to be understood.
Kousha and Thelwall (2006) were interested in the 'motivations for URL citations to open access library and information science articles'. They found that the largest share of citations to these open access articles were for formal scholarly purposes (43%) and another 18% were for informal scholarly purposes such as course readings and bibliographies (Kousha and Thelwall, 2006, p. 510), indicating that availability of open access contributes to scholarly communications. Thus article analysis is another integral component to understanding open access publications and the present research gathered information on the self-archived formats of articles (preprint, postprint, PDF) amongst other attributes, but within the library and information science discipline.
The impact factor and growth of open access journals have been analysed to determine if these journals are comparable to more traditional methods of academic publications. Whether from the multidisciplinary perspective (Gumpenberger et al., 2013) or from the library and information science viewpoint (Mukherjee, 2007; Mukherjee, 2009a; Mukherjee, 2009b; Xia, 2012; Yuan and Hua, 2010) these studies have shown that open access journals are a viable method of publication, that they are comparable to paid access journals, including in the sense that some journals have more impact than others, and that they have had a steady growth over five to ten year periods of data analysis. This means that the gold open access route can be effective, but since it takes time for these journals to establish themselves and they tend to be newer than most paid access or hybrid journals, it will take time before there are sufficient open access journal options within each discipline.
As mentioned in the introduction, the alternative open access route is through self-archiving, thus when an open access journal is not available authors can choose a traditional journal that allows them to archive their works in an open access format. Laakso found that 81% of the sampled articles were published in journals that allow some form of self-archiving (Laakso, 2014, p. 491), making it feasible to avoid publishing articles where self-archiving is not permitted. However, this assumes that authors are aware of green open access methods and publisher policies, which is not always the case (Kim, 2010, p. 1917).
What has not received much attention prior to this study but is integral to open access is hybrid journals. Many journals that began as paid-access only journals now either have an option for authors to pay for their articles to be available via open access, or in some cases newer issues will be paid access but older issues will be available in an open access archive. Thus the present research addresses this phenomenon by gathering information about each journal included in the study.
Author and country analysis
Since the choice of whether or not an article is made available through an open access option is primarily the author's prerogative, aside from when their affiliation mandate or funding agency policy requires self-archiving of the article, studies that focus on author analysis are also integral to the open access movement. In addition to article availability, Xia et al. (2011, p. 800) also compared first author status with other article attributes and found that faculty were more likely to 'publish longer articles, have more references, and collaborate more often than librarians'. The differences in the publication patterns of different authors indicate that studies and services should take these differences into consideration in order to be more accurate and beneficial. The research at hand is also concerned with the publication patterns of library and information science authors, but with a narrower focus on articles about open access and including articles that were directly published in open access journals.
Country analysis is grouped with author analysis since it is based on the country of the authors who published articles, rather than the country of the journal or publishing body. There are few studies that focus on country analysis and so far there does not appear to be any within the library and information science context. More broadly though, Sotudeh and Horri (2009, p. 21) determined that 'a relatively broad range of countries participate in [open access] movement whether by universally sharing and maintaining their [open access journals], or by submitting their scientific outputs'. This is promising since international support is required for the open access movement to succeed and studies such as this one shed light on which countries are flourishing in terms of open access practices as opposed to countries which may need more information and support.
The above studies lay an integral foundation for the current study; however, no research conducted to date has taken into consideration all aspects under investigation within this context. Considering the substantial difference (24 percentage points) in the findings of the two studies that tabulated the proportion of library and information science publications available as open access resources (Way, 2010; Xia et al., 2011) and given that each of these studies only collected data for single-year periods, there is a clear need for further investigation. More information is needed to determine whether or not the open access movement is gaining momentum and the level of support required to promote open access. This study aims to fill some gaps in the research and gain a more comprehensive understanding of the longitudinal trends in publications by library and information science authors about open access.
This quantitative study gathered multiple points of data concerning scholarly articles about open access written by library and information science authors in order to better understand their open access publication patterns. Similar studies selected top-ranked library and information science journals from directories in order to perform their searches within this discipline (Way, 2010; Xia et al., 2011). However, for the purposes of the current study, searching a subject database was deemed more appropriate, which is the method utilized by Duzyol et al. (2010). Library and Information Science Abstracts was used as the primary database in the field, since it indexes open access, paid access and hybrid journals. This search was conducted between January and March 2014.
The search in Library and Information Science Abstracts required that the phrase 'open access' appear in the title of the record and was restricted to English-language, peer-reviewed articles published in scholarly journals within the specified year range. Articles found by this method that fit the search criteria were then searched in Google Scholar by full title in quotation marks to determine if paid access articles were available in an open access format (Mukherjee, 2009a; Norris et al., 2008; Way, 2010; Xia et al., 2011), as well as which self-archive format was used (preprint, postprint, PDF). Journal titles from these articles were searched in the SHERPA/RoMEO database or, if not available in this database, on the Websites of the journals themselves, to determine information about journal types and self-archiving policies (Laakso, 2014). Table 1 indicates the type of data gathered from each article that fit the criteria as well as how it was found.
|Type of data||Values gathered||How it was found|
|Bibliographic information||Title, author, year of publication, journal, number of pages, number of references||Most information was included in the bibliographic record of each article. Unique references were counted manually when not included. Article title and author name were used to retrieve further information.|
|Article type||Research, case study, review, other||Article was reviewed when type of publication was not specified. Research included a method section, case studies were self-identified, reviews regarded other pertinent literature and research, other included editorials and conceptual papers.|
|Article availability||Open access, paid access, both||Determined by searching Google Scholar, followed by Google if not found, without university proxy.|
|Article self-archive format||Preprint, postprint, PDF, not applicable||Determined by reviewing the archived article file when found. Not applicable was automatically applied to directly open access articles.|
|Endorse OA||No, yes, uncertain||Determined by reviewing the article. Uncertain meant that both benefits and disadvantages of open access were listed with a neutral tone.|
|Journal type||Open access, paid access, hybrid||Determined by searching for each journal in the SHERPA/RoMEO database or conducting a Web search for journals not found in this database.|
|Journal self-archive policy||Preprint, postprint, PDF, unknown||Same method as journal type.|
|Author information||Number of authors, status, country||A search of the Web was conducted if author status or country was not listed within the paper itself.|
Criteria and categorization
Since this research is focused on the publication patterns of library and information science authors, any paper that did not have at least one author that fit this criterion was discarded. While conducting a search of the Web about the author it was determined whether they were an academic, librarian or other professional with a Master's degree in Library and Information Science or an equivalent degree. Academics included professors, lecturers and researchers, whereas librarians and other professionals included individuals with the librarian title even if they had faculty status within their organization, or professionals working within a related industry that held a library or information science degree. In some cases when researching the author it was discovered that the individual was both an academic and a librarian so a new category was created for this group (both).
Time was another limiting factor. Originally it was intended to gather information about articles published between 2001, the first year the Budapest Open Access Initiative conference took place, and 2011. However, there were no articles fitting the search criteria published in 2002 or earlier, so only relevant articles from 2003 until 2011 were included in this study. The cut-off of 2011 is important for validity, since authors may have an embargo of up to twenty-four months between when their article is published in a paid access or hybrid journal and when they can self-archive their article (Laakso, 2014, p. 482). As such, including more recent articles might skew the results.
Book and article reviews, as well as articles or communiqués of less than two pages, were not included in this study, since the interest was in original and substantive publications about open access. In total, 203 articles out of a possible 327 publications were deemed to fit the criteria for the present research.
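The inclusion rules described in this section can be summarized as a simple filter. The following Python sketch is illustrative only: the record fields (`year`, `pages`, `is_book_or_article_review`, `has_lis_author`) and the sample records are hypothetical and do not correspond to the variables or data used in the study's Excel or SPSS files.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical bibliographic record; all field names are assumptions."""
    year: int
    pages: int
    is_book_or_article_review: bool
    has_lis_author: bool  # at least one library and information science author


def fits_criteria(record: Record) -> bool:
    """Apply the inclusion rules stated in the text to one record."""
    return (
        2003 <= record.year <= 2011              # years retained in the study
        and record.pages >= 2                    # items under two pages excluded
        and not record.is_book_or_article_review
        and record.has_lis_author
    )


# Made-up records, for illustration only.
sample = [
    Record(year=2009, pages=12, is_book_or_article_review=False, has_lis_author=True),
    Record(year=2012, pages=10, is_book_or_article_review=False, has_lis_author=True),
    Record(year=2007, pages=1, is_book_or_article_review=False, has_lis_author=True),
    Record(year=2005, pages=8, is_book_or_article_review=True, has_lis_author=True),
]
included = [record for record in sample if fits_criteria(record)]
```

In the study itself these decisions were made by reviewing each record and, where needed, searching the Web for author information, so a filter of this kind would only formalize the final step.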
Excel and SPSS were used to derive descriptive statistics, determine if any correlations were present in the gathered data and produce trend analysis. Initially the interest was to investigate the correlations between article availability and all other variables; however, no significant correlation was found for this relationship.
Endorsement and availability
Of the 203 articles gathered for this study, 190 (94%) appeared to endorse open access, with only one article that obviously did not endorse it, and 12 (6%) articles were ambivalent. Combining two methods of open access publishing, 122 (60%) of the articles were made available via open access. More specifically, 81 (40%) articles were paid access only, 74 (36%) were directly published in an open access journal and 48 (24%) of the originally paid access articles were self-archived (Figure 1).
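As a check on the arithmetic, the availability percentages can be recomputed from the counts reported above. This is a minimal Python sketch; the variable names are chosen for illustration, and only the counts come from the text.

```python
# Article counts as reported in the text.
counts = {
    "paid access only": 81,
    "open access journal": 74,
    "self-archived": 48,
}

total = sum(counts.values())

# Share of each availability category, rounded to whole percentages.
percentages = {label: round(100 * n / total) for label, n in counts.items()}

# Combined open access: direct publication plus self-archiving.
open_access_total = counts["open access journal"] + counts["self-archived"]
open_access_percentage = round(100 * open_access_total / total)
```

The three categories sum to the 203 articles in the sample, and the combined open access count matches the 122 articles reported.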
When regarding article availability per publication year, aside from 2007, direct publication in open access journals has always been below that of strictly paid-access articles (Table 2). However, when self-archived articles are included in the total open access article count, only 2004 and 2005 have more paid-access articles. The three highest years for combined open access are 2009 (27 articles or 13%), 2007 (22 articles or 11%) and 2011 (20 articles or 10%), whereas for the percentage of open access availability the highest years are 2003 (100%), 2007 (85%), 2008 and 2009 (60% each).
The goodness-of-fit tests for cumulative distributions show linear trends for paid access, open access and self-archived articles (Figure 2). As Table 3 indicates, the coefficient of determination (R²) for all the distributions is highest for the linear equation, suggesting a steady and relatively slow pace of growth for publications on the topic of open access. Among these distributions, the self-archive category has the slowest growth rate, while the paid access category has the highest growth rate. However, when comparing the cumulative open access total, including self-archived articles, with that of paid access articles, the linear growth is comparable until 2006, after which open access takes the lead.
|Distribution||Linear (y = a x + b): R²||Linear equation||Power (y = a x^b): R²||Exponential (y = a e^(bx)): R²|
|Paid access*||0.9702||y = 10.595 x - 21230||0.9236||0.9234|
|Open access*||0.9836||y = 10.429 x - 20901||0.8813||0.8810|
|Self-archived||0.9705||y = 6 x - 12020||0.9395||0.9393|
|* For the purposes of calculation, eight data points were used|
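The comparison of linear, power and exponential fits can be illustrated with a short Python sketch. It is not the procedure used in the study, which relied on Excel and SPSS. The cumulative counts below are hypothetical, since the yearly figures are not reproduced here, and computing R² for the power and exponential models on the log-transformed scale is an assumption about how such fits are commonly reported.

```python
import math


def r_squared(xs, ys):
    """R² of an ordinary least-squares straight line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # For a single predictor, R² equals the squared correlation coefficient.
    return sxy ** 2 / (sxx * syy)


# HYPOTHETICAL cumulative article counts; these are not the study's data.
years = [2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011]
cumulative = [3, 10, 22, 30, 44, 58, 67, 81]

log_years = [math.log(x) for x in years]
log_counts = [math.log(y) for y in cumulative]

fit_quality = {
    "linear": r_squared(years, cumulative),       # y = a x + b
    "power": r_squared(log_years, log_counts),    # ln y = ln a + b ln x
    "exponential": r_squared(years, log_counts),  # ln y = ln a + b x
}
best_model = max(fit_quality, key=fit_quality.get)
```

Because ln x is almost a straight-line function of x over a range as narrow as 2004 to 2011, the power and exponential fits behave nearly identically in this sketch.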
As shown in Table 2, there has been an increase in publications about open access over the years and the three most recent years were the most productive, but the number of publications fluctuates depending on the year. Overall, 2009 was the most productive with 45 (22%) articles, followed by 2011 with 34 (17%) and 2010 with 27 (13%).
Figure 3 displays the percentages of articles by type. The majority of articles published about open access were original research, especially if the categories for case studies (39 or 19%) and research (90 or 44%) are combined (129 or 64%). Reviews were the second most common type (55 or 27%), with other types of articles being the least common (19 or 9%).
Only forty-eight of the 129 initially paid access articles were archived (37% self-archive rate): of these, the majority were archived as postprints (25 or 52%), with similar numbers of preprints (12 or 25%) and PDFs (11 or 23%). It was also found that eight articles (17%) were archived in formats not expressly allowed by their publisher's self-archive policy, although at least one of these did receive permission to do so.
Of the 203 articles, fifty-one (25%) were published in open access journals, two (1%) in strictly paid access journals and 150 (74%) in hybrid journals. Most articles published in hybrid journals were nonetheless strictly paid access (80 or 53%), forty-seven (31%) were self-archived and twenty-three (15%) were open access available from the journal itself. Of these twenty-three directly open access articles in hybrid journals, thirteen involved the authors paying for their article's open access availability, seven were archived by the journal and the policy for the remaining three could not be determined.
From the data gathered, there were a total of sixty-four unique journals of which sixteen (25%) were completely open access, two (3%) were strictly paid-access and forty-six (72%) were hybrid journals. All open access journals allowed authors to retain the copyright to their articles, whereas one of the paid access journals (School Libraries Worldwide) allowed joint copyright, and the copyright for the remaining journals was transferred to the publisher upon acceptance. Of the forty-six hybrid journals, forty allowed authors to choose to pay for open access; three allowed open access to archived issues (American Archivist, Law Library Journal, Learned Publishing); two published open access articles under other arrangements (the policy of the Journal of Scholarly Publishing is unclear, whereas the Malaysian Journal of Library and Information Science requires authors to transfer their copyright to the journal for publication); and the South African Journal of Library and Information Science became open access in 2009, although not all materials from before that year are freely available.
Aside from three journals (5%) whose self-archive policy could not be determined, all journals in this study allowed some form of archiving. The majority of journals allowed the postprint of the article to be archived (40 or 63%), followed by the PDF (20 or 31%), and just one journal allowed only the preprint to be archived (Journal of Scholarly Publishing). Of the twenty journals that allowed the final copy of the article to be self-archived (PDF), sixteen were open access journals, three were hybrid journals (American Archivist, Libri, South African Journal of Library and Information Science), and one was a paid access journal (School Libraries Worldwide).
The publications gathered for this study had between one and five authors per article. There is a negative correlation between the total number of authors per article and the frequency of articles. The majority of articles had only one author (117 or 58%), followed by two authors (63 or 31%), three authors (18 or 9%), four authors (4 or 2%), and only one article was published with five authors.
The status of each author who published an article was combined so that if all authors were academics the author status was designated academic, and if all authors were professionals the author status was coded as professionals. However, if there was at least one academic and one professional, or if one author had been categorized as both, the author status was classified as both. Based on this classification, the majority of combined article author statuses were professionals (105 or 52%), followed by academics (56 or 28%), and the remainder were both (42 or 21%).
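The classification rule described above can be written as a small function. This Python sketch is illustrative: the function name and the status labels (`"academic"`, `"professional"`, `"both"`) are hypothetical and do not reflect how the data were coded in Excel or SPSS.

```python
def combine_author_status(statuses):
    """Return the combined status for one article.

    `statuses` holds one label per author: "academic", "professional",
    or "both" (an individual who is both an academic and a librarian).
    """
    if not statuses:
        raise ValueError("an article needs at least one author")
    unique = set(statuses)
    if unique == {"academic"}:
        return "academic"
    if unique == {"professional"}:
        return "professional"
    # Any mix of academics and professionals, or any author coded as
    # "both", places the article in the combined category.
    return "both"


examples = {
    "two academics": combine_author_status(["academic", "academic"]),
    "one professional": combine_author_status(["professional"]),
    "mixed team": combine_author_status(["academic", "professional"]),
    "dual-role author": combine_author_status(["professional", "both"]),
}
```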
Although there was no statistically significant correlation between combined author status and article availability, the results show that professionals published the largest number of articles about open access in open access journals (45 or 22%), followed by academics (16 or 8%) and articles published by both types of authors (13 or 6%) as shown in Figure 4.
As shown in Table 4, of the 48 self-archived articles, professionals archived 21 articles (44%), academics archived 14 articles (29%) and both author types combined archived 13 articles (27%). A difference was found in the percentage of self-archived preprints, of which professionals accounted for 75%.
When self-archived articles are included in the total open access count (Figure 5), although all author statuses published more open access articles than paid access articles, professionals have the highest open access contribution (33%), followed by academics (15%) and both author types combined (13%). However, for the open access rate within each status, professionals are again in the lead (63%), closely followed by both (62%) and academics (54%).
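The two sets of percentages above use different denominators, which the following Python sketch makes explicit. The counts are taken from the text; the variable names are chosen for illustration.

```python
# Counts per combined author status, as reported in the text.
articles_per_status = {"professionals": 105, "academics": 56, "both": 42}
direct_open_access = {"professionals": 45, "academics": 16, "both": 13}
self_archived = {"professionals": 21, "academics": 14, "both": 13}

total_articles = sum(articles_per_status.values())

open_access = {
    status: direct_open_access[status] + self_archived[status]
    for status in articles_per_status
}

# Contribution: open access articles as a share of ALL articles in the sample.
contribution = {
    status: round(100 * n / total_articles) for status, n in open_access.items()
}

# Within-status rate: open access articles as a share of that status's own articles.
within_status_rate = {
    status: round(100 * n / articles_per_status[status])
    for status, n in open_access.items()
}
```

The first measure reflects how many articles each group wrote as well as how often it chose open access, whereas the second compares the groups on an equal footing.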
Some significant relationships were found between combined author status and article attributes. Author status compared with article type showed a highly significant relationship (χ² = 42.323, df = 6, p < 0.001). As shown in Figure 6, professionals published the vast majority of case studies (31 or 79%) and reviews (40 or 73%), whereas academics published the largest share of research articles (39 or 42%). Other article types had an even distribution.
Using one-way analysis of variance (ANOVA), significant results were found for combined author status compared to the length of articles (F = 5.827, df = 2, 200, p < 0.003) and number of unique references (F = 7.132, df = 2, 200, p < 0.001). As demonstrated in Figure 7, the mean number of pages for academics is 13.68, versus 10.10 for professionals and 11.57 for articles written by both author types. Figure 8 shows a similar pattern, with the mean number of unique references being 30.32 for academics, 18.70 for professionals and 24.26 for articles written by both author types.
A marginally significant relationship (at p < 0.1) between combined author status and the number of authors per article was found (χ² = 8.384, df = 4, p = 0.078). Professionals published the majority of articles written by one author (66 or 56%) and two authors (32 or 51%), whereas articles published by three or more authors were more likely to be written by both author types (9 or 39%), as shown in Figure 9.
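The reported p-values can be checked directly from the chi-square statistics and their degrees of freedom. The Python sketch below is not the SPSS procedure used in the study. It relies on a closed-form expression for the upper tail of the chi-square distribution that holds only when the degrees of freedom are even, which is the case for both tests reported here; the function name is hypothetical.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail p-value of the chi-square distribution for even df.

    For df = 2k, P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = statistic / 2
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


# Statistics as reported in the text.
p_status_by_type = chi_square_p_value(42.323, 6)
p_status_by_author_count = chi_square_p_value(8.384, 4)
```

Under these assumptions the first value falls far below 0.001, and the second is approximately 0.078, in line with the figure reported for author status and number of authors.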
As with author status, the summary of author country per article was determined by classifying an article that has multiple authors from the same country as that country, or authors from more than one country per article under multiple countries. Based on this classification the top six countries for publications about open access are the USA (91 or 45%), the UK (19 or 9%), India (17 or 8%), Canada (10 or 5%), Iran (8 or 4%) and Germany (5 or 3%).
Although the USA is in the lead in terms of overall publications about open access, it does not have the highest rate of making articles available as an open access resource. The USA published forty-three of a total of ninety-one articles in open access venues (47%), compared broadly to other countries which published seventy-eight of a total of 112 articles as open access resources (70%).
Figure 10 demonstrates the frequency of open access versus paid access articles in individual countries aside from the USA. Countries with a 100% open access publishing rate are China, Denmark, France, Germany, Greece, Iran, Japan, Netherlands, Nigeria, Norway, Peru, Philippines, South Africa and Spain. Countries with a 50% or higher open access publishing rate in descending order are India (88%), Canada (70%), Australia (67%), Korea, Malaysia and Slovenia (50%).
Endorsement and availability
Overall, this study found a higher open access percentage (60%) than previous studies which focused on publications by library and information science authors. This is 33 percentage points higher than Way's findings (2010) and 9 percentage points higher than those of Xia et al. (2011). It might be argued that some of this difference is due to the inclusion of other article types since Xia et al. (2011) only included research papers, but if only research articles are considered in the present study, the open access rate is even higher (66%), whereas non-research articles are reported at 56% (Table 5).
[Table 5. Columns: Availability (Open access, Paid access, Total); % open access]
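When comparing open access rates across studies, a difference between two percentages can be expressed either in percentage points or as a relative change, and the two readings differ considerably. The sketch below uses the overall rates quoted in this section (60% here, 27% for Way and 51% for Xia et al.); the function names are illustrative.

```python
def point_difference(current: int, previous: int) -> int:
    """Difference between two rates in percentage points."""
    return current - previous


def relative_change(current: int, previous: int) -> float:
    """Change from previous to current as a percentage of previous."""
    if previous == 0:
        raise ValueError("previous rate must be non-zero")
    return 100.0 * (current - previous) / previous


current_rate = 60  # overall open access rate in the present study
earlier = {"Way (2010)": 27, "Xia et al. (2011)": 51}

for study, rate in earlier.items():
    points = point_difference(current_rate, rate)
    relative = relative_change(current_rate, rate)
    print(f"{study}: +{points} points, +{relative:.0f}% relative")
```

The differences of 33 and 9 discussed in this section correspond to the percentage-point reading.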
Another way to consider this is based on the years that data were collected, since Xia et al. (2011) collected data for 2006 (p. 794) and Way (2010, p. 304) collected data for 2007. In this case the current study has very similar findings to those of Xia et al. (2011), 53% compared to their 51%, whereas the difference between this study and Way's (2010) is increased, 85% compared to 27%, despite Way having 14 of the 20 journals examined in common with the present study. This discrepancy is likely in part due to the present study's focus on articles about open access, but is also affected by Way's methodology, in which data were collected within a year of the original article publication (2010, p. 304). As mentioned above, this might not allow sufficient time for authors to be able to self-archive their works due to journal embargo periods, which can be up to two years. The current study allowed a little over two years between the latest article publication year and data collection, and obtained similar results to Xia et al. (2011), who waited a little over three years.
Nonetheless, a higher open access rate was expected for the present research since the articles included in this study were specifically about open access: 94% appeared to endorse open access within the article itself and 98% were published in journals that allowed the author to self-archive. Since statistically significant relationships between availability and other variables were not determined in this study, further research needs to be conducted in order to understand why library and information science authors choose to publish their articles in open access or paid access journals and whether or not to self-archive their articles.
It was expected that the rate of open access publications would increase steadily over the years after the Budapest Open Access Initiative began, due to raised awareness and greater establishment of the benefits of open access. The cumulative distribution for availability shows a steady increase of combined open access through linear growth. Whether in the future the rate of growth of publications on open access remains linear, becomes exponential or diminishes remains to be seen.
Although there is some fluctuation, overall the number of articles about open access is increasing; however, there was a spike in 2009 that was not matched by any other year in this study (Table 2). Due to the delay between when an event occurs and when articles would be published about it, the increase in articles is perhaps related to the first Open Access Day 'held on October 14, 2008' and the related announcements 'beginning on August 4, 2008' (Curran, 2009, p. 34). The increased awareness brought the open access movement to the forefront briefly, but then the rate of publications reverted to the steady increase seen in most other years.
The self-archive rate (37%) was lower than expected since 98% of these articles were published in journals that allowed the authors to archive their works. This indicates that other factors affect the author's choice to archive articles. Kim (2010, p. 1909) argues that 'concerns about copyright, extra time and effort, technical ability, and age' are all factors that negatively affect author self-archive rates and that one solution to this problem is providing the right services.
For the articles that were archived, the library and information science field appears to have a preference for postprints (52%) compared to preprints (25%) and PDFs (23%). Antelman (2004, p. 378) compared the proportion of preprints and postprints in four disciplines, which ranged from predominantly preprints in mathematics (close to 90%) to predominantly postprints in engineering (close to 80%). Evidently the pattern of archiving articles varies greatly depending on the discipline. Kim (2010, p. 1919) found this to be true from the qualitative perspective as well, stating that 'a self-archiving culture significantly affects the extent that professors self-archive their research'. However, since Kim only surveyed professors in broad subject areas, the closest category being social science, it would be ideal to conduct a survey on library and information science authors specifically with consideration for the differences between author status.
Another interesting finding involves authors archiving their works in a format that was not generally allowed by their publisher (17% of self-archived articles). Permission can sometimes be granted for this practice by the publisher and it was clear that at least one of those eight articles did have such permission. Nonetheless, this points to one of the issues found in Kim's (2010, p. 1917) study, that the 'majority of interviewees expressed concerns and uncertainty about copyright issues involved in self-archiving'. Thus not being aware of what rights authors have, not being sure what format is allowed based on what journal they published in and not being aware where to find this information are problems that need to be addressed when creating services that promote open access publication generally and self-archiving specifically.
As mentioned in the results section, the majority of unique journals in this study are hybrids that have both open access and paid access elements to them (72%), followed by strictly open access journals (25%) and a minority of strictly paid-access journals (3%). It is very interesting that traditional journals are becoming a rarity, which indicates a change in culture even though the author paid for open access in only 9% of the articles published in hybrid journals. This low rate is likely because 'for each accepted article the author is charged a fee of from $300 to over $3,000 depending on the journal' (Drott, 2006, p. 94). Although it is also argued that 'almost every open access journal has a policy of permitting authors to request that fees be waived' (Drott, 2006, p. 95), it is unclear whether this is also true of hybrid journals, which is another issue that needs to be studied further.
Like traditional journals, not all open access journals have the same impact (Mukherjee, 2009b), thus authors should research the journals they choose to publish in. Although publishing directly into open access journals is ideal in the sense that authors retain their copyright, if a high quality open access journal is not available then subscription journals may be the better choice. When that is the case authors should at least ensure enough rights are retained to archive their articles. This is certainly feasible in library and information science since 98% of these articles were published in journals that allow self-archiving, with the remaining 2% having policies that could not be determined. This is 17 percentage points higher than the level shown in Laakso's (2014) results from a multidisciplinary approach. Some of this discrepancy might be due to the smaller sample size and the inclusion of open access journals in the current study; however, when open access journals are excluded, 97% of the remaining articles could still be self-archived.
Many comparisons can be made between the findings of this study and those of Xia et al. (2011) in terms of author status correlations. This study confirms that there is no statistically significant correlation between article availability and author status, and that academics/faculty are significantly more likely to publish longer articles and have more references per article. In terms of author status compared with the total number of authors per article, academics/faculty were only somewhat more likely to collaborate (41% of articles had more than one author) than professionals/librarians (37%).
However, some differences between these studies should be noted. In the study by Xia et al. (2011) the open access percentages for professionals/librarians (52%) were much closer to that of academics/faculty (51%) than they were in the present study, in which the rate was 63% for professionals/librarians and 54% for academics/faculty. Also, the total publications of faculty made up 69% (Xia et al., 2011, p. 796), whereas in this study the majority of articles were written by professionals (52%). This could be because Xia et al. (2011) only considered the status of the first author rather than combined author status and because they excluded all non-research papers from their study, since faculty were found to be 68% more likely to publish research than other article types in this study.
Considering this study focused on articles about open access, it is less surprising that there were no differences found between author status and article availability, since all authors in this study are aware of the open access movement to some extent. The findings of other correlations with author status are fairly reasonable since academics are often expected to conduct more theoretical or conceptual research as part of their employment, and resources are allotted for this, which can account for longer articles, more references per article and being more likely to publish research articles.
Another interesting finding is that although the USA accounted for the largest share of articles about open access (45%), it did not have the highest rate of open access availability at 47%, compared to every other country combined at 70%. Author surveys should be conducted to understand this gap between theory and practice in the USA, and, if resources allow, surveys of stratified samples from countries with different open access rates.
One of the limitations of this study is the accuracy of author status classification. Xia et al. (2011) suggested that this classification could be inaccurate when not included within the paper or bibliographic record since a Web search usually reflects the author's current status, rather than that at the time of publication. In some cases, a more in-depth Web search may increase accuracy; however, limited information may be available about an author's previous employment.
Another constraint of the study may be the limited sample size. Data could be collected on all articles written by library and information science authors to increase the sample size. It may be also feasible to conduct a similar study across multiple sources and disciplines for comparative analysis and more generalizable results.
This study served to develop a greater understanding of the characteristics of scholarly publications about open access by library and information science authors over multiple years. Additionally, by comparing the methods of this study with previous studies it was confirmed that at least two years are required between when an article is published and when data should be collected to reflect an accurate rate of open access availability.
Since the vast majority of these articles were published in journals that allow author self-archiving (98%), the low rate of self-archiving (37%) is indicative of a problem that requires further research in order to properly promote and support the practice. This study found a clear gap between theory (94% of articles endorsed open access in writing) and practice (60% of articles were published in an open access venue), which could be better understood by conducting a qualitative study that determines the motivations and barriers in authors' choices in terms of whether or not they publish directly into open access journals and whether or not they self-archive their articles. As Björk (2004, Conclusions, para. 1) articulates:
Trying to get researchers to support the move towards open access, which most agree would be good for the advancement of science in principle, is like trying to get people to behave in a more ecological way. While most people recognise the need to save energy and recycle waste it takes much more than just awareness to get them to change their habits on a large scale. It takes a combination of measures of many different kinds.
Since the financial circumstances of libraries are unlikely to improve, a viable solution needs to be implemented consistently in order to achieve positive results. This solution will involve continuous iterations of research to determine the state of open access practice, what barriers are preventing further growth, and assessment of the benefits and limitations of current promotions and services. Considering the majority of publishers allow some form of author archiving, it is very important to address the low rate of self-archiving through further research and the creation of services for authors. The steady linear growth of publications on open access in library and information science is encouraging and it is hoped that with more research and more publications there will be a growing awareness among academics and professionals about the issues and challenges of open access.
We would like to thank the editors and reviewers of Information Research as well as all the proponents of the open access movement.
About the authors
Jennifer Grandbois completed her Bachelor of Arts (Honours English Literature) at Concordia University followed by her Master of Library and Information Studies (Librarianship Stream) at McGill University. Her research interests include assessment of library services and resources, information literacy and information-seeking behaviour, as well as scholarly communications and the open access movement. She can be contacted at: firstname.lastname@example.org.
Jamshid Beheshti has taught at the School of Information Studies at McGill University for more than twenty-five years. He has been the principal investigator and co-investigator on more than a dozen Social Sciences and Humanities Research Council of Canada grants, and has published widely in many international journals, including JASIS&T, Information Processing & Management, and Education for Information. He has recently co-edited two books, The information behavior of a new generation: children and teens in the 21st Century (with Andrew Large, 2013), and New directions in children's and adolescents' information behavior research (with Dania Bilal, 2014).