The role of trust and authority in the citation behaviour of researchers
Clare Thornley, Anthony Watkinson, David Nicholas,
CIBER Research Ltd, Newbury, UK.
University of Tennessee, College of Communication and Information, 1345 Circle Park Drive, Knoxville, TN 37996-0341, USA, 865 974 7911
Hamid R. Jamali, Eti Herman,
CIBER Research Ltd, Newbury, UK
Suzie Allard, Kenneth J. Levine and Carol Tenopir,
University of Tennessee, College of Communication and Information, 1345 Circle Park Drive, Knoxville, TN 37996-0341, USA, 865 974 7911
The question of what citations to a research publication mean and whether their frequency should be used to judge the quality and usefulness of that research is a controversial issue within the academic and scholarly environment (Smith, 1981; MacRoberts and MacRoberts, 1989; Bornmann and Daniel, 2008; Tressler and Anderson, 2012) and this has become more so with the increase in research evaluation metrics in higher education. Is it really the case that a citation to a research publication, the act of connecting text statements through reference to the broader literature (Greenberg, 2009), indicates that it is a high quality and/or particularly useful contribution? Can it then be concluded that the more times something is cited the more useful and higher quality it is?
There are many potential complexities to these questions. What exactly do we mean by quality and usefulness (Knight and Burn, 2005; Wang and Strong, 1996) and how do they relate to what we mean by impact (Bornmann, 2013; Costas, van Leeuwen and Bordons, 2010; Wouters and Costas, 2012)? What about the important differences between disciplines in how researchers use and cite work (Harzing, 2010; Lancho-Barrantes, Guerrero-Bote and Moya-Anegón, 2010; Hellqvist, 2010), which make it difficult, for example, to compare the significance of the different citation scores of a physics and a history department? One way of shedding light on these questions is to ask researchers about their motivations and reasons for citing the information sources used in their publications. This paper discusses the results of one such study.
The study was one element of a larger study, the Alfred P. Sloan Foundation funded 'Trust and authority in scholarly communications in the light of the digital transition' (Nicholas et al., 2014), which examined whether changing technologies for scholarly communication have influenced how researchers judge the trustworthiness and authority of sources which they use, disseminate and cite. We conducted interviews with eighty-seven researchers, going through one of their most recent publications with each of them and asking why they had cited a selection of its sources and what had motivated them to use those sources. Forty-five researchers were from the UK and forty-two researchers were from the USA. Their disciplines included the physical sciences, biological/life sciences and the social sciences. The main question posed about citation behaviour was: how important is the author's perceived trust in the authoritative nature of the publication that they cite?
We were investigating the extent to which citing a piece of work can be seen as an indicator that a researcher regards it as an authoritative and important contribution. As such this research is an exploration of citation behaviour in the context of trust and thus differs from most previous work on citation behaviour (Bornmann and Daniel, 2008), which has tended to focus on motivation in the broader sense. We do not assume that the question 'why did you cite this reference?' is exactly the same as 'why did you trust this reference?'; rather, we are exploring how they may be related. This paper explores this relationship in detail by investigating how researchers had come to make the decision to cite and what factors were seen as important, with a special focus on trust.
Scope of study
The context of the wider study was whether researchers' perceptions, practices and means of judging trust and authority have changed as a result of the digital transition, and if so, in what ways. Given the problematic nature of some of the information it has on offer, the Web-centred digital environment is a complicated one in which to make quality and reliability judgements (Casadevall and Fang, 2012; Fisher, Laurí and Chengular-Smith, 2012). The Web has greatly increased the number of people who are able to publish information and the size of their potential audience. The complexity and uncertainty thus characterising the digital world makes the mechanism of trust increasingly important. This is because trust, the willingness of a party to be vulnerable to the actions of another party, is a very effective mechanism for reducing the complexity of human conduct, particularly in risky and uncertain situations, as it enables people to come to decisions without first considering every possible eventuality (Corritore, Kracher and Wiedenbeck, 2003; Grabner-Kräuter and Kaluscha, 2003).
Trustworthiness, therefore, assumes a crucial role in our increasingly complex use of information. This has particularly far reaching implications for scholars, who have to base their investigations on accurate information and see to it that their contributions to knowledge are communicated reliably (Merton, 1973). The digital transition and increasing political pressures to closely assess scholarly productivity, with their detrimental effects on the value and dependability of some of the knowledge produced and communicated (Bauerlein, Gad-el-Hak, Grody, McKelvey and Stantley, 2010; Fisher et al., 2012), seem to necessitate that researchers approach their information sources, channels and metrics more carefully than ever.
In terms of citation behaviour, the digital transition has also broadened the range and increased the availability of research publications which could potentially be cited. Changes in technology have facilitated changes in practice in how research information is both produced and made available. New methods of analysing documents include, for example, Webometrics (Thelwall, 2010), which uses links to and between pages to map impact and subject relationships. This emerged in the 1990s, but its usefulness as a reliable tool is still debated. These changes have created an increase in the number of documents that can potentially be cited, new ways of finding them and new ways of measuring their citations. This can make it harder for researchers to comprehensively check all the information sources they use and thus potentially increases the importance of other approaches to establishing, or at least strongly suggesting, which sources should be trusted. Availability from formal publishing channels has also been increased by remote electronic access to journals which were previously print only, and by the rise of digital-only journals, some of which are open access.
Technology enables researchers to share their own research by, for example, putting papers up on their own Website or a social media site. It has reduced the limits of space in traditional publishing channels, so images and data can be published as well as, or alongside, academic papers. The nature of researcher-researcher interaction has been changed by technologies such as e-mail and, more recently, the development of social media. The growth of social media includes, amongst other developments, self-publishing through blogs, networking sites such as Academia.edu and ResearchGate, online forums and micro-blogging through Twitter. These all provide new and instant ways for researchers to disseminate their work and also new ways of finding out about other people's research.
This study was conducted through interviews with researchers in the physical sciences, biological/life sciences and social sciences as these disciplines were the focus of the funding body, which limited the remit to these subjects. This provides a reasonably representative selection of academic disciplines for a qualitative study focusing on motivations and the decision making process of individuals, rather than seeking to draw statistically significant generalizations about differences between different disciplines. Citation behaviour is central to decisions about trust and authority, as the decision to formally cite a publication in a researcher's work would normally be seen as an indication that it is a trusted and authoritative source; it has been read by the researcher and seen to have at least some relevance and useful input into the researcher's own work. We were testing and exploring this assumption.
What is new about this study?
This study provides new insights into the methods, theory and technological context of citation behaviour. It is different from most previous studies (Bornmann and Daniel, 2008; Harwood, 2009) in that it used a critical incident semi-structured interview technique to elicit responses rather than asking participants to choose from a pre-defined list of reasons for citing. We termed these critical incident interviews as we were aiming to explore the reasoning behind the citation being made or why that particular cited reference passed or succeeded in being included as a reference. What exactly was the decision making process involved in deciding whether to include or exclude a particular reference? The open nature of the interviews added a depth of understanding to citation motivations that can be missed when participants are just asked to choose a reason from a pre-defined list, especially as the categorization of motivations may not exactly match those of the interviewees.
A potential weakness of the open nature of the interviews compared with an anonymous questionnaire is that authors may be less likely to admit to citing for strategic reasons when their answers are being observed. There is no guarantee that this did not influence their answers. We found, however, that for most authors their reasons were complex and multi-faceted, rather than either strategic or purely academic, as discussed in more detail later. As such, authors did not need to admit to strategic citing but openly discussed how political factors could have some impact in some cases, although this was rarely, if ever, the sole reason. We asked authors to consider to what extent they thought each citation was authoritative and what factors had influenced this decision. This allowed a discussion of the complex intellectual and social factors that came into play in what was often a very considered process.
The quality of the data allowed us to make more nuanced judgements about whether the reasons for citing supported or refuted current theories of citation behaviour on the respective roles of normative issues (Merton, 1973), to do with quality of content, and constructionist issues (Gilbert, 1977), which focus on social and political influences. In particular, we learnt that the relationship between content and context as a motivation for citing is complex and has close links to the related debate on the relative role of content and context in meaning within information retrieval.
The technological context in which citations now occur has changed significantly since many of the existing citation behaviour studies were carried out. Scholarly communication, both in terms of formal publishing and more informal methods of information sharing, has been changed by developments in technology. Citation behaviour takes place in the broader context of scholarly communication, so is likely to be affected by these wider changes. We investigate whether these changes in technology, known as the digital transition, have altered how researchers go about deciding what to cite.
There is a considerable body of research examining student approaches to establishing the credibility of information sources, particularly in the digital context; for example, the recent study of how students evaluate Wikipedia (Rowley and Johnson, 2013). In contrast, despite the extensive body of work on citation motivation of researchers, there has not been much focus on credibility and trust issues within this topic. Techniques and approaches to establishing trust and credibility have been found to be important in how students use sources and we investigate whether this can be extended to how researchers use sources in terms of the role of trust in deciding what to cite.
Theories of citation behaviour
What current theoretical frameworks do we have to provide a context for the question of what citations might mean? There are two main theoretical frameworks with competing perspectives on the relative role of intellectual content and the social and political power context in terms of what drives people to cite (Bornmann and Daniel, 2008; Nicolaisen, 2007). The normative theory of citing behaviour (Merton, 1973, 1988) claims that a citation is an acknowledgement of the intellectual influence of the cited work. As such it is generally appropriate to use citation counts as a method of evaluating research, as each citation can be seen as an endorsement by one's peers.
The alternative view is the social constructionist perspective (Gilbert, 1977), which demotes the importance of intellectual content as a motivation for citing and emphasizes the importance of the social context in which researchers work. They may cite work to create a certain impression or to try and persuade their peers of certain viewpoints. In this case, citing can be seen as a tool of rhetoric rather than a certain acknowledgement of intellectual value. Researchers sometimes cite documents not because they think they have made an important intellectual contribution, but because they think the citation will make the researcher's argument more convincing. Thus, rather than simply indicating an acknowledgement of intellectual content, a citation should be seen as action strongly influenced by the social and power context of its author.
If the normative theory of citation is correct, citation counts, or bibliometrics, would seem to be a broadly fair tool for measuring the quality of research. This is clearly contingent on the assumption that reliable data was gathered (Garfield, 1986; Thornley, McLoughlin, Johnson and Smeaton, 2011) and that disciplinary differences were taken into account. If the social constructionist theory of citation is correct, then citation measures are not a fair evaluation system. Empirical studies which examined the evidence for and against these theories, as discussed by Bornmann and Daniel (2008), suggest there is strong evidence that, in most cases, the normative theory is a better fit with the data. People tend to cite mainly because they are using and acknowledging the intellectual content of what they cite. This is also supported by studies which show a correlation between high citation counts and other measures of esteem such as the Nobel Prize (Brooks, 1985; Garfield, 1986).
There is some work which aims to synthesize these two views, or at least acknowledge that they may both show some aspects of citation in practice. Cronin's (1984) work, 'The citation process', embeds citation within the social context of science. Small's (1978) work provides some interesting insights into the role of context or scientific culture in how citations are made. He investigates how some highly cited documents in science appear to take on a cultural significance which means they almost must be cited in order for scientists to locate themselves in the correct tradition. His work used the method of textual analysis of citing documents and our qualitative, author-based method adds to this by collecting more data on how authors experience the citation decision-making process. More recently the inadequacy of the two major theories has been discussed by Camacho-Miñano and Núñez-Nickel (2009) through a comprehensive review of the literature and our work builds on this by providing empirical data.
Theories of meaning
The discussion of the relative importance of content or context (which is sometimes also known as use or social context) in analysing the meaning and significance of documents is also a central problem for information retrieval (Blair, 1990; Thornley and Gibb, 2009). It is important to note that the original motivation behind the creation of citation indexes was not research evaluation but improved information retrieval (Garfield, 1955). The premise was that, if you found one document that was relevant, then looking at documents that had cited that document would lead you to other relevant documents. This is the case even if their content may not be similar enough to the original document for them both to be retrieved by the same search. This insight that connections between documents can be as important as the content of documents in establishing relevance has also been exploited by more recent search technologies such as Google (Brin and Page, 1998). Relevance, itself a much studied phenomenon (Borlund, 2003), is not the same thing as a document's content simply being similar to the user's search terms. It is possible for two documents to be relevant as a response to the same query, even if the two documents have very different content.
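The retrieval premise behind a citation index can be illustrated with a small sketch. This is not a description of how any particular citation index is implemented; the function names and the toy reference lists are hypothetical and chosen only for illustration. The idea is to invert each paper's reference list so that, starting from one relevant document, the papers that cite it can be looked up regardless of how similar their content is.

```python
from collections import defaultdict


def build_citation_index(reference_lists):
    """Invert 'paper -> works it cites' into 'work -> papers that cite it'."""
    cited_by = defaultdict(set)
    for citing_paper, references in reference_lists.items():
        for cited_work in references:
            cited_by[cited_work].add(citing_paper)
    return cited_by


def related_via_citation(seed, reference_lists):
    """Return, in sorted order, the papers that cite the seed document."""
    cited_by = build_citation_index(reference_lists)
    return sorted(cited_by.get(seed, set()))


# Hypothetical toy data: each key is a paper, each value its reference list.
reference_lists = {
    "paper_A": ["seed", "other_1"],
    "paper_B": ["seed"],
    "paper_C": ["other_1"],
}

candidates = related_via_citation("seed", reference_lists)
print(candidates)
```

No comparison of text is involved at any point: the candidate documents are found purely through the citation links, which is why they can be relevant even when a content-based search would not retrieve them together.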
Thus, in one sense, the very existence of citation indexes tells us that researchers do not use documents simply because they have similar content to their own research problem. The ways in which a document can be relevant to research are multi-faceted, which means the motivations for use are multi-faceted, which suggests that the motivations for citing are also likely to be multi-faceted. There are many different kinds of use, or 'language games' to use Wittgenstein's (1958) term: thus, citation means many different kinds of things. This comparison between theories of meaning, which discuss both content and multiple types of context, and most theories of citation, which at present seem to acknowledge the importance of only content or only context, suggests that the dichotomy between the theories of citation is likely to be too simplistic. There are some exceptions to this. Wouters (1999) discusses the role of meaning in citation analysis and suggests that developing a theory of citation must involve recognition of the relationship between the content of the text and the process by which it is created and then cited. Our work provides some qualitative data from the perspective of researchers on how these two may be connected. Citation can be understood as a type of meaning behaviour, as it involves drawing connections between content to communicate to the reader. As such it includes complex relationships between content and context which are central to the nature of meaning. Context is also part of why we trust certain information sources over others.
There are two main approaches to collecting data on citation behaviour (Bornmann and Daniel, 2008). The first method involves careful context and content analysis by a reader of the actual text (usually an academic paper) in which the references are cited to ascertain the reasons for the citations, and is known either as the context or reader-based approach. The second method, used in this study, involves directly asking the author for their reasons, and is known as the author-based approach.
The reader-based approach uses the context within the text of a citation to discover what the motivation for citing was. Is it, for example, vital background to the new study, a counter claim or an incidental mention? One problem with the approach is that it relies on a third party (not the author) to accurately surmise the meaning of the citation from the text. A recent study by Willett (2013), which compared how readers' interpretation of the meaning of citations corresponded with the authors' motivations for the same citation, found a very low overlap of interpretations. This was despite the fact that in Willett's (2013) study both authors and readers were from the same academic department and thus had a substantial level of shared expertise and subject knowledge. The conclusion from his work (Willett, 2013, p.151) is that 'reader-reasons and author-reasons for citing particular references are typically so different that it would be unwise to use the former as a proxy for the latter'.
It is, of course, possible to argue that there is no compelling reason to favour the author interpretation over that of the informed and expert reader. If a reader believes a citation has been used in a certain way that does not correspond to the original intention of the author, it is possible that the reader has noticed something about the relationship between the two texts that the author had missed. Theoretical work in literary criticism, such as the introduction of 'the intentional fallacy' (Wimsatt and Beardsley, 1954) tends to demote the author and question the value of scrutinizing their exact intentions and motivations at the time of writing. This approach may have less relevance in the field of academic writing than in literature, particularly when looking at very recent publications, but it probably is important to acknowledge that an author's interpretation is not an absolute truth. The results from Willett's study do, however, clearly show that the reader viewpoint should not be used as a proxy for the author. Whatever their relative merits regarding insights into the meaning of citations, the reader and the author viewpoint are clearly quite often not in agreement.
The second approach to citation behaviour studies focuses on the author's motivations for citing and asks the author directly, either through interview or questionnaire, what their citation motivations were. This also has its weaknesses in so far as it is more difficult to set up than the reader-based approach as it involves contacting the actual author, not just locating some informed readers. It also assumes that the author can remember their motivations for citing. The author-based approach normally uses a matrix or framework of possible motivations and asks the author to indicate which one s/he used. Vinkler (1987), for example, asked twenty scientists to categorise their motivations into eighteen motivations which Vinkler then merged into two major groups: professional motivations and connectional motivations. Professional motivations were citations used because of the content of the cited work, for example the use of a method, whilst connectional motivations were citations used because of social or possible career reasons, for example, using a well-known paper from a top journal.
Willett's (2013) study used ten authors and asked them to pick ten citations from a very recent paper. He then used a revised version of the original Harwood (2009) classification scheme which gives a total of twenty-four different reasons (grouped under eleven main types of citation). Brooks (1985) interviewed twenty-six authors, asking them to identify their citing motivations for each reference as one of seven reasons which he derived from a literature review. Case and Miller (2011) surveyed 112 bibliometricians to find out whether they cite differently from other scholars, given the possibility that their expertise might have brought about a heightened level of awareness of their own citation practices. Building on previous work (Case and Higgins, 2000; Shadish, Tolliver, Gray and Gupta, 1995) their survey instrument listed thirty-one reasons for citation.
We followed the author-based approach and asked the authors, in semi-structured critical incident interviews, their motivations for citing a selection (five) of the references in a recent paper. The references discussed were chosen, in most cases, by the interviewer to ensure that they were distributed evenly throughout the reference list. This is important as references at the beginning of a paper may serve a different purpose than those that are used to support the method section or conclusions. In a small number of interviews the author had one or two references they particularly wanted to discuss and then the interviewer would also suggest other ones to ensure an even distribution. This does pose the potential for bias, but it only happened in a small minority of cases so is unlikely to have significantly affected the results.
Our main aim was to select a representative sample of references for which the researcher could recall his or her decision-making process for deciding to cite. The question on citations was one question out of four in the interview and the others covered more general issues about trust and authority in scholarly communication and the potential changes that technology may have brought about. This paper discusses in detail the citation question within the context of citation behaviour research, whilst other papers arising from the project will examine it as part of the broader scholarly communication debate (Nicholas et al., 2014). As such the discussion on motivations for citing, which we examine in this paper, was taking place within a context of broader discussion about why, and to what extent, the researcher trusted certain sources and types of information.
We piloted our interview method with three researchers of different age groups and disciplines and found that authors could accurately remember their citation motivations for a recent paper. Responses to questions on specific citations also, in many cases, developed into a more general discussion about their motivations and reasons for citing. Thus the interviews provided data both on the specific reasons for citing particular references and on the broader areas of the meaning and use of citations. We analysed the responses to the citation questions and used them to create a list of different responses (twenty-four in total) and then counted the number of times they were given. In some cases the researcher had more than one reason for using a particular reference and, in that case, if both were strong reasons, we would count them both.
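The counting procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the coded responses are invented toy data rather than interview data from the study, the reason labels are borrowed from the categories reported in the results, and the names `coded_references` and `tally_reasons` are hypothetical. Each discussed reference carries one or more strong reasons, and every strong reason is counted, so the percentages are shares of reasons rather than shares of references.

```python
from collections import Counter

# Hypothetical coded responses: one list of strong reasons per discussed reference.
coded_references = [
    ["Author known"],
    ["Original seminal work in the field", "Journal or conference known"],
    ["Author known", "Lots of cites to paper"],
]


def tally_reasons(coded):
    """Count every strong reason, once per reference in which it was given."""
    counts = Counter()
    for reasons in coded:
        counts.update(set(reasons))
    return counts


counts = tally_reasons(coded_references)
total_reasons = sum(counts.values())

# Share of each reason among all reasons given, as a percentage.
shares = {reason: round(100 * n / total_reasons, 2) for reason, n in counts.items()}

for reason, n in counts.most_common():
    print(f"{reason}: {n} ({shares[reason]}%)")
```

Because a single reference can contribute two reasons, the total number of reasons can exceed the number of references discussed.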
As well as analysing the data on the reason for citing, we also used iterative coding to establish key themes in the broader discussion about citation and trust (Mostyn, 1985). The UK analysis was done first and guided the development of the USA matrix of responses. The US responses were almost completely consistent with the UK responses, although there were some small differences which are highlighted in the results section. The results of this study are the tables showing the frequency of citation reasons given and the more qualitative data derived from the discussion about those reasons.
Who we interviewed
The sample (Table 1) included forty-five researchers from across the UK (twenty-nine from England, three from Wales, nine from Scotland and four from Northern Ireland) and forty-two researchers from across the USA, whose disciplines ranged across the physical sciences, biological and life sciences, and social sciences. This makes a total of eighty-seven researcher interviews, which is a larger sample size than most similar studies (Bornmann and Daniel, 2008). The UK interviews were conducted between December 2012 and April 2013, and US interviews from December 2012 to July 2013. Participants were recruited through a mixture of e-mails from publishers in those disciplines and through snowballing, as some participants recruited through publishers were able to suggest other relevant participants. This was a particularly useful technique for contacting younger researchers as their limited publication output meant they were often not on publishers' contact lists. We interviewed a range of ages and specifically included some early career researchers to include younger people, brought up with Web technology, who may have different approaches to assessing the trustworthiness of information sources than older researchers.
|Characteristic||Category||UK (n)||UK (%)||USA (n)||USA (%)||All (n)||All (%)|
|Age||50 and older||20||44.4||17||40.5||37||42.5|
|Method||Face to face interview||31||68.9||4||9.5||35||40.2|
Results: what does the data tell us about why researchers cite?
|Reasons for citing||UK (n)||UK (%)||USA (n)||USA (%)||All (n)||All (%)|
|Original seminal work in the field||50||22.42||35||10.29||85||15.10|
|Journal or conference known||20||8.97||36||10.59||56||9.95|
|Known institution or research group||13||5.83||33||9.71||46||8.17|
|Researcher (author) wrote it||9||4.04||17||5.00||26||4.62|
|Known database or source||14||6.28||10||2.94||24||4.26|
|Lots of cites to paper||2||0.90||17||5.00||19||3.37|
|Useful because of informal source||5||2.24||7||2.06||12||2.13|
|Checked content, data or mathematics carefully||9||4.04||2||0.59||11||1.95|
|Convincing alternative argument||9||4.04||2||0.59||11||1.95|
|Original historical document (not digitized)||4||1.79||7||2.06||11||1.95|
|Found it cited in authoritative source||2||0.90||8||2.35||10||1.78|
|Innovative work (unique to USA)||-||0||7||2.06||7||1.24|
|Reviewer recommended it||2||0.90||7||2.06||9||1.60|
|Suggested by someone they trust (unique to UK)||6||2.69||-||0||6||1.07|
|Used to show incorrect method (unique to UK)||2||0.90||-||0||2||0.36|
The results in Table 2 initially appear to show that researchers cite work for reasons which are mainly to do with the perceived quality or authoritativeness of the work. There are also, however, a large variety of different reasons for why they may make that decision which could be seen to support both the normative and constructionist theories of citation. The reasons for citing and the decision-making process behind them portray a complex picture. Researchers have a strong preference for citing seminal publications and also those for which they know the reputation of the author or publication venue. Our interview method revealed some complex nuances behind the responses in the table and in this section we look at some of them in more detail. Note that in the UK 'suggested by someone they trust' and 'used to show incorrect method' were mentioned but did not come up in US interviews, and in the US interviews 'innovative work' was mentioned but did not come up in UK interviews.
What do 'negative' citations mean?
Garfield's (1955) original paper on the need for a scientific citation index stresses that one of the key motivations was that a researcher could check who had criticized a work before he used it himself. A citation index was meant to be a guard against the propagation of error as well as a method of improving information retrieval through allowing new connections between documents, beyond those solely based on shared subject content, to be made. In line with other studies (Bornmann and Daniel, 2008), we found that not many citations were 'negative citations' in that they criticized the work of another researcher (only 13 out of 563 reasons, 2.31%). Of these, 11 were cited to show a 'convincing alternative argument', and 2 were cited to show the use of an 'incorrect method' for the problem under investigation.
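The share of negative citations can be checked with a short calculation. The two counts come from Table 2. The overall total of 563 reasons is an assumption: it is the denominator implied by the 'All' percentages reported in Table 2 (for example, 85 reasons shown as 15.10%), not a figure taken from a totals row.

```python
# Counts of 'negative' reasons as reported in the 'All' column of Table 2.
negative_reasons = {
    "Convincing alternative argument": 11,
    "Used to show incorrect method": 2,
}

# Assumption: overall number of reasons implied by the 'All' percentages in Table 2.
TOTAL_REASONS_ASSUMED = 563

negative_total = sum(negative_reasons.values())
negative_share = round(100 * negative_total / TOTAL_REASONS_ASSUMED, 2)

print(f"{negative_total} negative reasons, {negative_share}% of all reasons")
```

Under this assumption the calculation gives 13 negative reasons and a share of 2.31%, matching the percentage reported in the text.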
Thus, it is rare that a citation is a negative endorsement and even if it appears that way it is normally a nuanced criticism rather than an outright rejection. Even if a researcher cites a work that they do not agree with, it is not normally because they think it has no academic value, but rather that they want to show an alternative viewpoint or approach. These results would seem to suggest citations are, in nearly all cases, some kind of positive endorsement of the cited work in so far as negative citations are rare. This does not, of course, mean that the reasons for the positive endorsement are necessarily completely objective, but it is rare for a citation to be a condemnation. Given the small number of negative citations in our study we were not able to draw conclusions about the role of disciplinary differences in negative citations as discussed by Hyland (1999).
Are researchers more likely to cite people they know, friends or collaborators?
'Author known' is the most frequent reason for citing, accounting for 24.16% of reasons, so the initial answer would seem to be, 'Yes'. In nearly all cases, 'author known' meant someone the researcher had actually met, as well as being familiar with their published work. Other responses that involve familiarity with, and knowledge of, the source of the cited work are: journal or conference known (9.95%); known institution or research group (8.17%); known database or source (4.26%); known Website (0.89%). Here it is important to establish what is meant by known. A central theme that came up in our interviews was the importance of social networks with other researchers in enabling them to carry out their research. This finding also resonates with Meho and Tibbo's (2003) work on information behaviour. Their work developed Ellis's (Ellis and Haugan, 1997) model in important ways by adding networking to his original framework of six generic features, which were starting, chaining, browsing, differentiating, monitoring and extracting.
Our study showed that these are not purely social networks, but are based on research and knowledge. In many cases people know the people they are citing, but this does not mean they are only citing them because they know them. They know them because they have shared or connected intellectual interests and professional activities and they have maintained the relationship partly because they respect that person's work. In the interviews if 'I know the author' was the first response, it was then followed by a discussion about why this had influenced and informed their view of their other motivations for citing. Knowing an author ranged from meeting them at a conference to long-established professional relationships. Conferences were important in this regard and one young scientist remarked 'it is only when you see a scientist defend their work from questions at a conference that you can really trust their work'.
The response 'author known' contains a complex mixture of intellectual, political and social factors. Knowledge of journals, conferences and institutions was also an acknowledgement of a 'good reputation' and an acknowledgement, on careful consideration, that this reputation was probably justified. One social scientist explained that a 'good reputation' may lead him to look at a journal but he would still check it carefully, so reputation does not stop researchers examining work critically. Tacit knowledge of quality and levels of authority is important, with one respondent from the social sciences explaining that 'everyone knows what the good journals are and you just know', noting that this does not necessarily correspond exactly with official rankings. A mathematical biologist also explained that you 'just know' that some journals have very stringent peer review systems so you are more likely to trust their content. Knowledge of both other researchers and publication sources is therefore neither a purely social/political phenomenon nor a purely intellectual one, but a complex mixture of both.
Why researchers do or do not cite classic texts
The acknowledgement that a text was a classic, seminal text was an important reason to cite it, making up 15.10% of the reasons given. The interview discussion about this reason revealed that researchers regarded omitting a citation to the originator of an idea, technique or theory relevant to their area of research, and only citing derivative sources, as something that would reduce the authority of their work. If they read research which did not refer correctly to seminal original research sources and only cited derivative sources, then this reduced their trust in the work. Thus citing seminal texts is motivated both by a desire to accurately acknowledge the intellectual source of ideas and by an awareness of how researchers will appear to others if they fail to get this right, and thus includes a complex mix of social and intellectual motivations. Seminal texts were trusted and citing them was seen as making the author part of that trusted tradition rather than a newcomer who had not done their background research. Researchers know they judge others who miss seminal texts and they do not want to be seen as that kind of researcher.
It was seen as very important to have a correct trail of the progression of ideas and there were some examples given where failure to properly read and cite original papers had led to the propagation of misinterpretations and errors throughout the subsequent literature. Researchers also adapt their use of seminal texts depending on the audience they are writing for. If they were writing for their own subject specialists, rather than a more general audience, they may be slightly less likely to cite all the relevant classic texts on the assumption that they would already be known to their readers. This, to an extent, supports the theory of obliteration by incorporation (Garfield, 1975), where seminal texts become so subsumed into tacit knowledge that they are no longer cited. This is, however, perhaps less so for classic texts that can be used across disciplines. It also suggests that it is not in fact obliteration, but rather a small reduction given that seminal texts were still the second most important reason for citing.
Has the digital transition made any difference?
As discussed earlier in the paper, technology has both changed formal academic publishing and provided new less formal means of sharing, discussing and disseminating research information. On examining the reasons given for citing, there is no clear indication that these technological shifts have had any significant direct impact. The two most popular reasons for citing, i.e., a seminal text or that the author was known, which make up 39.26% of the reasons, are not linked to the digital transition in any clear way. In fact the only time new technology is explicitly mentioned is when it is rejected, i.e., a document was cited because it was the original document and not a digitized version (1.95%). In terms of the ability to publish data the researchers we interviewed did not cite direct links to their data or those of other researchers in their publications, though discussion revealed that this was something they were considering for the future.
Open access did not come up as a reason for citing a source. Researchers assessed the authority of research published in open access journals on a pragmatic case-by-case basis: some open access journals were known to have strong peer review and some less so. As such the nature of access to the research and the role that technology did or did not play in this was much less important than the perceived quality of its peer review process. Results from our larger project of which this study was a part show that although the digital transition is affecting behaviour in the wider research process, researchers remain conservative in their publication behaviour.
Do researchers cite non-peer-reviewed sources?
The digital transition provides new ways of publishing and accessing peer-reviewed publications, but it also allows for self-publishing, such as blogs or tweets, with no formal peer review constraints. There are also the traditional media, such as newspapers, which are not peer-reviewed though they are subject to editorial control (but not normally by a subject expert). Researchers are also motivated by the desire to provide the reader with the most useful and accessible reference and, depending on the audience, they may use a more informal or less academic reference in this case (2.13%). It is rare for researchers to cite non-peer-reviewed sources of any sort and, if they do, it is normally to provide an example of an attitude in the public arena towards the subject they are studying. Examples from our study were the traditional media's coverage of a singer's death or an example of the labelling by the newspaper media of certain social groups. In a social science paper a particular Website was cited, rather than a journal, because the researcher thought that Website, unlike the journal, would lead readers to useful related information.
Researchers may use well-known blogs within their discipline or follow tweets of respected researchers to update themselves on new research, but in their own published work they will cite the publications, not the social media source. The lengthy publication process in certain fast-changing disciplines, such as those in the sciences, is a concern for some researchers, whose work relies on the latest data and research. In such cases, they may tweet or blog about some of their findings in order to disseminate their work in a more timely manner. Social media are generally used by researchers to provide a quick and focused way to communicate about work which will shortly be in a peer-reviewed formal format. Researchers seem to be conservative in terms of the formal peer-reviewed publication process, and emerging digital technologies such as social media may shift their research behaviour in subtle ways, but this is not explicitly acknowledged in what they publish. Our findings here are similar to those of other studies showing that social media tend to be ways of talking about peer-reviewed academic research, rather than a new way of publishing its content (Thelwall, Tsou, Weingart, Holmberg and Haustein, 2013).
Do researchers use how often other researchers/ information sources are cited as an indicator of quality?
Researchers have a number of ways of establishing the authority and quality of sources they are unsure of: two of these are checking how often they are cited (3.37%) and where they were cited (1.78%), making a total of 5.15% of checking methods that included citation information. We know from the interviews that the normal reason researchers were unsure of the authority of a document was because the author and the source (e.g., journal) were unknown to them or they knew the source but it was non-peer reviewed and they did not know if it was good enough to cite in an academic publication. In this case researchers did use the number of citations to the document, and whether those citations were from an authoritative and trusted source, as one method of establishing whether they should cite it. From the discussion in the interviews we know that researchers see citing authoritative references as one way of establishing their own credibility. They do not want to be seen as citing a source they have not checked carefully. In looking at researchers' behaviour in checking how often unknown sources are cited there seems to be an underlying assumption that if someone has cited a work then they have also checked it with care.
Do researchers decide by themselves what to cite?
A strong theme of the wider research project on trust in scholarly communication (Nicholas et al., 2014) was the social nature of research in terms of the entire research process, which is then seen in the extent of co-authoring in published work. Researchers do not work in isolation and are rarely solely responsible for all the citations in their published work; nearly all researchers write papers with other researchers. In general, researchers trust their co-authors and would also work closely together on choosing references, although when the co-author is from a slightly separate disciplinary area there is more trust and less discussion. Sometimes citations are suggested by editors or reviewers and in one case, one of these was found to be inaccurate. The author had not carefully checked it on the assumption that the reviewer would have. The number of authors also varies between disciplines and this can influence how trust works in citation decisions. In astronomy, for example, papers based on large international research groups will often have over 800 authors. In this case the assumption was that with so many authors reading and knowing the literature it is very unlikely that a reference will be cited that contains errors. The issue of coercive citations, where authors are pressurised to include certain works in their publications (Wilhite and Fong, 2012), did not come up as a problem in the interviews. The strongest that could be said was that in two cases researchers felt they had been given a recommendation which, on reflection, they felt was not as appropriate as they had originally thought. The researchers were embarrassed by any perceived lack of care on their part in using these citations.
Do authors always read carefully the documents they cite?
If a citation is seen as an intellectual acknowledgement, then it matters that researchers who cite a text have read it closely and carefully considered the nature of the intellectual acknowledgement. If people cite casually then this implies that citation cannot be taken that seriously as a positive endorsement. Our findings here do not correspond with the results of some other research, see for example Simkin and Roychowdhury (2003), who estimate, through examining the distribution of misprints likely to have been propagated by copying citations, that only 20% of references have been read. Within an interview format, researchers are unlikely to reveal that they have not read work they cite, although in some cases we found they were candid regarding the embarrassment of finding out they had cited in error. However, there are also methodological weaknesses in methods based on textual analysis. In the interview discussions it was clear that in nearly all cases researchers do carefully check most of the work that they cite and see this as very important. There are also some differences between disciplines. All researchers interviewed working in a discipline using mathematics checked the mathematics of any work they were considering citing and would not cite something with mathematical errors. Within social science it was noted that errors in papers can, in some cases, be more subtle and difficult to spot. This suggests that the role of trust may vary depending on disciplinary area, and a larger-scale study could yield statistically more robust results on this question.
Why do researchers cite themselves (self-citation)?
Authors stated that they cited a publication because it was their own work in 4.62% of the responses. Normally the reason given for self-citation was that it was important to show how the current publication had developed from previous work and to provide a storyline. As such the researcher is trying to show how their current research builds upon their previous research and, indeed, in some cases, it was said that avoiding citing their previous work could cause confusion to the reader. One researcher noted that she self-cites because her study was the first on a particular topic, whilst she felt others do so for publicity. Researchers are aware of the possible effect of overly self-citing and seem to use it with caution. This does not necessarily solve the question of whether self-citations should be included in researchers' citation counts. It is clearly a different kind of process, as rather than reflecting on their impact on the wider research community, it is an acknowledgement of the development of their work.
This section takes a number of key questions within bibliometrics and citation behaviour research and discusses what new insights our findings provide compared to previous studies. We discuss the following: the impact our findings have on the theoretical debate within citation studies; whether citation analysis should be used as a research evaluation tool; the relationship between theories of citation behaviour and theories of meaning; and whether the reader or author-based citation behaviour method is the most accurate.
Do researchers cite for normative or constructionist reasons?
We know from previous work (Bornmann and Daniel, 2008) that the evidence collected so far on citation behaviour is in favour of the normative approach. Our work broadly supports this conclusion in so far as accurately acknowledging authoritative and trusted intellectual content was seen as a key part of the citation process. Researchers generally cited sources they trusted and, at least from their perspective, they were comfortable that their reasons for trusting these sources were based on sound academic grounds. As we followed the author-based approach, we were able to gain insights into motivations for citing rather than the use of citing in the text as used by, for example, Frost (1979). This may explain why our findings have a greater emphasis on 'author known' than that work, as these data are impossible to get from simply analysing the citation in the text.
Our findings challenge, however, the assumption that intellectual content somehow happens in isolation from a researcher's position within a social context. The distinction between intellectual content and social context between the normative and constructionist theories is shown by our work to be more complex than previously acknowledged. Our focus on trust appears to have brought this issue to the fore, as trust could be taken to be an act of faith and a purely social phenomenon, but our data showed that trust is related to a judgement of perceived intellectual value, which is a considered process. Most normative decisions take place within a social context, but this does not mean that the decisions have no normative content whatsoever.
Examining the nature of the responses given, it is possible to divide them roughly into reasons purely to do with intellectual content (normative reasons) and reasons to do with knowledge of the social and academic context (constructionist reasons) in which the documents exist. The reasons that appear to be most to do with intellectual content are: sound method; theoretical approach; checked content, data or mathematics carefully; convincing alternative argument; original historical document (not digitised); agenda free; correct mathematics; and used to show incorrect method. Note that 'used to show incorrect method' is clearly not a positive endorsement, but it is about the intellectual content of the paper.
In total, the responses that could be grouped under normative reasons account for 19.18%, so these are in the minority, with the remaining responses (80.82%) taking into account factors outside the actual text such as: recommendations; citation levels; perceived seminal nature of the work. On initial viewing, these data would appear to support the constructionist theory of citation behaviour. This is, however, a complex area, as some reasons are hard to clearly place in one category. The most popular reasons overall were 'author known' and then 'original seminal work' with a combined percentage of 39.26%. These give the impression of constructionist reasons but, with the addition of the context surrounding the responses provided in the interviews, we know that 'author known' does not really mean in the purely social sense but was at least perceived as an indicator of the quality of their work. The complex nature of 'seminal texts', which have both intellectual content of importance but also undeniably a social and even symbolic function within disciplines, is also difficult to place.
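The aggregate figures quoted in this section can be reproduced from the individual shares reported earlier. The short Python sketch below is only an illustrative check of that arithmetic, not part of the study's method; the dictionary keys and the helper name `combined` are hypothetical labels chosen for this example, and only the percentages themselves come from the text.

```python
# Shares of citing reasons as reported in the text (percent of all reasons given).
# The labels are illustrative; only the numbers are taken from the paper.
reported = {
    "author known": 24.16,
    "seminal text": 15.10,
    "journal or conference known": 9.95,
}
normative_share = 19.18  # grouped total for intellectual-content reasons, as reported


def combined(*labels):
    """Sum the reported shares for the given reasons, rounded to two decimals."""
    return round(sum(reported[label] for label in labels), 2)


# The two most popular reasons together.
top_two = combined("author known", "seminal text")

# The same two reasons plus a known journal or conference.
top_three = combined("author known", "seminal text", "journal or conference known")

# Everything not grouped under normative reasons.
contextual_share = round(100 - normative_share, 2)

print(top_two, top_three, contextual_share)
```

Under these inputs the two most popular reasons sum to the 39.26% quoted above, and the share left after removing the normative group matches the 80.82% reported for reasons that take context outside the text into account.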
Citation behaviour theories do not seem to be falsifiable in the Popperian (Popper, 1963) sense as it is possible to interpret the data in different ways which could lend support to either theory. Our findings from the broader interview discussion and the research project as a whole show that the social and collaborative nature of the research process is very important. The dichotomy between citing for purely intellectual content reasons and citing because it will have a certain effect on the position of the researcher in the social and political context in which s/he is operating seems a slightly artificial one. The ability to contribute intellectual content is one factor in changing one's status in the social environment and one's social skills will also often affect one's ability to complete and communicate intellectual content. If the work of a certain researcher is trusted that is normally because their work has been shown over time to have value. Thus, the social context and intellectual reasons are, in many cases, inextricably linked. We found with many of the authors that, within the same paper, both reasons would be used to different extents for different citations. Thus our empirical work supports the previous theoretical contributions on the complexity of the citing process done by, for example, Cronin (1984) and Camacho-Miñano and Núñez-Nickel (2009).
Our results show that acknowledging the quality and influence of content is normally a central reason for citing, but that knowledge of individuals, journals and institutions also plays a part in what to cite. The context, such as journal or conference proceedings, provides good indicators of the reliability of the document. If the document is actually going to be cited then most researchers do also study the content. They would, however, in some (0.53%) cases trust a co-author or another trusted colleague or collaborator (1.07%) to recommend a citation and they may not 're-check' this citation closely but rely on the other person's judgement. So in these cases a trusted social connection reduces the need to closely examine content.
Researchers are also wary of citing work which they think may make them appear less authoritative to their peers, so their perceived position in the academic social system is important. If, for example, researchers did choose to cite non-peer reviewed sources such as online forums, blogs etc., they tended to check them very carefully. One approach was to revert to traditional methods of establishing the authority of these sources by checking if they had been cited by any other academic sources. Alternatively the reputation within the academic community of the non-peer reviewed source would be checked and established, if it was not already well known to the researcher. This was normally done to check that the source had enough academic authority to be cited in a research paper, but also to ensure that what appeared to be a useful resource for the reader would not, in fact, turn out to be misleading.
Three of the researchers interviewed expressed regret and some embarrassment that they had published papers with citations that, on later reflection and investigation, had turned out not to be as authoritative as they previously thought. Our results support the normative theory of citation in so far as acknowledging that intellectual influence is a large part of citation motivation, but they also show that intellectual content cannot be understood in isolation from social context. The social constructionist theory also tells us something about how the process of citation works, through its acknowledgement of social networks and academic prestige. These networks, however, would appear to be part of a complex quality appraisal process, rather than a replacement for examining intellectual content.
Should citation counts be used as a research evaluation tool?
This study does show that when a researcher cites a work it is nearly always because that work has something valuable to contribute to the researcher's own work and that the researcher regards the cited work and its author as reliable and trustworthy sources. Nearly two-fifths (39.26%) of the reasons for citing were that the author thought it was a seminal work or that they knew the author; adding a known source (journal or conference, 9.95%) brings this to nearly half (49.21%). As we had discussed with them their reasons in an interview context, we discovered that 'knowing the author' did not constitute just some kind of personal favour but rather a personal, normally long-term, knowledge and understanding of their research and character. Thus when someone cites a work it is likely they are doing so because they are familiar with the work, confident of its authority and that the work is well established within their field.
The reasons for choosing a citation are, however, still varied (with a total of twenty-four reasons given) and the purpose of a citation depends on many factors, such as the intended audience. In general a citation is a positive acknowledgement that the source is a valuable contribution to the research, although the exact level to which it was actually crucial to the paper will vary depending on the reasons for the citation. Our work also suggests, however, that the reasons for citation are multi-faceted and also seem dependent on other aspects of academic esteem, such as level of profile and the publishing of seminal contributions. High levels of citation are an indicator of trust and authority, which is at least a necessary, if not sufficient, condition for impact, but are too varied in meaning to be used as a sole measurement for quality.
What is the relationship between theories of meaning and theories of citation behaviour?
When we examine citation in detail using qualitative interview methods we gain new insights into the complexity of citation behaviour. Both content and context are important in understanding the motivations behind the decision of a researcher to cite and the complex significance and meaning that citation may have. Citing is neither a purely private mental act, as also discussed by Nicolaisen (2003), nor a purely social one. We also see the multiple different levels of the meaning of citations, which connects closely to related philosophical work on the variety of meanings in information retrieval (Blair, 1990). Applying a purely normative approach to how citations reflect the value or impact of research work is likely to lead to a simplified understanding. Social media do not seem to have threatened the status of traditional peer-reviewed academic published work, but are having some influence on the context of the wider research process by facilitating social and research networks. They do not yet seem to have had a major impact on citation behaviour, but they are providing some new ways for researchers, and also the wider public, to disseminate and discuss research outside the confines of the formal peer-reviewed academic publishing process. As such they are providing a new context for different types of content and also a new way to publicise or discuss traditional content.
Is the reader-based or the author-based method more accurate for analysing citation behaviour?
Our research clearly shows the value of directly communicating with the author about their reasons for citing work. It shows that a detailed discussion with the researcher will reveal aspects of citation motivation that may not become apparent if they are simply asked to fill out a matrix of reasons. The responses of our researchers were not, in fact, that different from those found in previous citation studies (Bornmann and Daniel, 2008; Thelwall et al., 2013) but we did gain some useful context to the exact meaning of those responses. The interview method is useful in complex cases, for example, for finding out exactly what a negative citation may mean and that it is rarely a straightforward repudiation of the work cited. The reason 'author known' could also suggest a simple use of reputation or personal contact but, in fact, researchers normally discussed how they knew the author and why they valued their work. This method was also useful in terms of finding out what the citation decision-making process was, rather than just the outcome of the decision. In our interviews we found, for example, that researchers would generally only cite a non-peer-reviewed source after careful checking. Thus a citation to a non-peer-reviewed source does not mean that a researcher perceives it in the same way as a peer-reviewed source; in fact, the difference in process shows how differently the researcher perceives it. Our approach gave us data at a level of detail which also showed that the dichotomy between constructionist and normative reasons for citing is largely misguided.
Conclusions and future research
This study provides us with some nuanced and detailed data about the role of trust and authority in why and how researchers choose to cite work in their publications. In nearly all cases they cite a work because they regard it as an authoritative and trustworthy source which provides a context or building block to their own research. Our findings here are in line with the cumulative evidence of previous studies (Bornmann and Daniel, 2008) which do, in general, show that a citation to a work is normally an indicator that it is of a certain quality and has made some contribution. However, not all citations are equal and each citation means different things and indicates different levels of influence and importance to the author's work. How authors make the decision to cite involves a complex mixture of analysing intellectual content and using trusted social and research contacts. Researchers are also aware of the impression the citation will give for the credibility and prestige of their publication and, by implication, themselves. As such the dichotomy between the normative and social constructionist theories of citation behaviour would not seem to hold in the face of the evidence collected in these interviews. The normative theory is a better description of what citation means but the social constructionist theory provides some important insights into how citation works by emphasising the role of trusted social networks in information gathering and citation. Our work also suggests that the connection between citation behaviour studies and theories of language within the context of information retrieval could usefully be further explored. Much work has been done on the respective roles of content and social context in meaning for information retrieval and this could add depth to our understanding of how citation behaviour works as it combines both the textual and social nature of the process.
This research was funded by the Alfred P. Sloan Foundation.
About the authors
Clare Thornley is a consultant with CIBER Research Ltd and a Senior Research Fellow at the Innovation Value Institute, Maynooth University. She can be contacted at firstname.lastname@example.org.
Anthony Watkinson is currently Principal Consultant at CIBER Research Ltd, an Honorary Lecturer at University College London and an Affiliate of Oxford Brookes University. He can be contacted at email@example.com
David Nicholas is Director of CIBER Research Ltd, Professor of Information Science at Northumbria University and Adjunct Professor at the University of Tennessee. He received a PhD in Information Science from City University London. He can be contacted at firstname.lastname@example.org.
Rachel Volentine is the manager of the user-experience lab at the University of Tennessee, where she graduated in 2010 with a Masters in Library and Information Science. Her Bachelor's degree is in History from Berry College in Rome, Georgia. She can be contacted at email@example.com.
Hamid R. Jamali is Principal Consultant with CIBER Research Ltd. He received his PhD in Information Science from University College London in 2008. He can be contacted at firstname.lastname@example.org.
Eti Herman is Principal Consultant with CIBER Research Ltd. She received a PhD in information science from City University London. She can be contacted at email@example.com.
Suzie Allard is an Associate Professor and Associate Director of the School of Information Science at the University of Tennessee, Knoxville. She received her PhD in Communication and M.S.L.S. from the University of Kentucky and B.A. in Economics from California State University. She can be contacted at firstname.lastname@example.org.
Kenneth J. Levine is an Associate Professor in the School of Communication Studies at the University of Tennessee, Knoxville. He holds a PhD in Communication from Michigan State University, J.D. from Case Western Reserve University, M.S. in Communication from Cleveland State University, and a B.S. in Mass Communication from Miami University (Ohio). He can be contacted at email@example.com.
Carol Tenopir is a Chancellor's Professor in the School of Information Sciences at the University of Tennessee as well as Director of Research and Director of the Center for Information and Communication Studies. She holds a PhD in Library and Information Science from the University of Illinois, M.L.S. from California State University, Fullerton, and B.A. degrees in English and History from Whittier College. She can be contacted at firstname.lastname@example.org.