In Web search we trust? Articulation of the cognitive authorities of Web searching
Information Studies, Åbo Akademi University, Åbo, Finland, and
Department of ALM, Uppsala University, Uppsala, Sweden
Studies have shown that search engines tend to be the channel of choice for information on diverse questions of work and everyday life in developed countries (Rieh 2004). In spite of their popularity and the giant leaps taken in their design and development, both search engines and searching have their limits. Not all Web resources are equally authoritative (Cronin 2001) and not all searches return equally authoritative results. Investigations of Web search behaviour have evidenced both the complexity of credibility assessments (Wathen & Burkell 2002) and tendencies towards uncritical acceptance of whatever a search engine happens to retrieve (Pan et al. 2007). Savolainen (2007) has shown how the selection of information sources is heavily influenced by assessments of media credibility and perceptions of cognitive authority (in Wilson's terms, 1983). Kuhlthau's (2004) information search process (ISP) model similarly underlined the role of internal factors in information seeking that are more closely related to the searcher's emotions and anxieties than to the results or explicit evaluative judgments. Rieh and Hilligoss (2007) also emphasise the strongly contextual nature of authority and credibility on the Web. Because of this apparent significance of heuristic rather than explicit a posteriori credibility assessments, Taraborelli (2008) has argued that credibility research needs to take a closer look at both non-reputational and explicitly reputational cues and biases of source evaluation.
Drawing on the observations of both Taraborelli (2008) and Kuhlthau (2004), the present study aims to look closer at the perceived cognitive authority and credibility of the act of searching information. The aim of this paper is to report and discuss the results of an exploratory study of the articulation of cognitive authorities in the context of Web searching. The study is based on a qualitative analysis of 805 utterances related to search engine use harvested from the Web using the Webometric Analyst software (Thelwall 2009). The findings provide new information on how and why search engines and especially searches are considered and claimed to be authoritative by searchers on the Web. The study also presents new insights into the contexts of the authority of searching, and provides a new understanding of the implications and impact of the assumptions and claims of authoritativeness in the information ecology of the Internet. The results have implications for the design of search systems by augmenting the notion of context of the relevance of results (why something is relevant or not, and what are the implications of relevance) and by suggesting means to use cognitive authority in relevance feedback.
Traditionally, in information science contexts, the notions of credibility, authority and success have been seen as aspects of perceived relevance. Su (1994) compared 20 different measures of information retrieval success and concluded that users tended to be more concerned with recall than precision. In a later study, Su (1998) found that the value of the results as a whole is the best measure of success.
For over a decade, researchers have been making a clearer distinction between the relevance and the authoritativeness of information sources and channels. Rieh (2010) sees the growth of the Internet, a massive source of information of varying quality, as a major reason for considering credibility and cognitive authority as an independent research agenda. When availability is no longer the principal concern of the information seeker, the perceived relevance and preference of particular sources, channels and sets of results is increasingly based on factors other than recall and topicality, including media credibility and cognitive authority (Savolainen 2007). The mechanisms of judging credibility can also differ between different types of sources (i.e., known authorities, independent information providers and aggregators of information sources) (Chung et al. 2012) and between different cultures (Yi et al. 2012). Savolainen and Kari (2006) have identified altogether 18 different relevance criteria in Web searching, of which specificity, topicality, familiarity, and variety were the most frequently mentioned.
The notion of cognitive authority was coined by Wilson (1983) to make a distinction between administrative authority (people or entities have authority because of their position) and cognitive authority (authority based on influence). The two premises of cognitive authority are the recognition of expertise and reputation. A person has to be an expert in a topic and that expertise has to be known (i.e., recognised) before the person can function as a cognitive authority. Cognitive authority is not, however, restricted to people. Books, journals and institutions also have cognitive authority that is based on the personal authority of the author of the text, the institutional authority of the publisher, the authoritativeness of the type of text and the intrinsic plausibility of the claims made in the text (Wilson 1983).
During the past decade, there has been a growing interest in the notion of cognitive authority, as Rieh (2010) suggests, because of the rapid growth in the use of Web-based information resources. Rieh (2002) has studied cognitive authority on the Web and developed a Model of Judgment of Information Quality and Cognitive Authority that explains how people make predictive and evaluative judgments about which Websites contain credible information. According to Wathen and Burkell's (2002) related model of credibility assessment, people make their first judgments based on surface characteristics and, if a Website seems promising, they proceed to a second, more in-depth evaluation on the message level and a third on the cognitive level. In practice, people often use mental shortcuts and rules when they judge the authoritativeness of a text (Rieh and Hilligoss 2007; Rieh 2010). The processual and indirect nature of authority judgments has also been observed by Taraborelli (2008). He argues that instead of looking at evaluative judgments as in earlier studies, it may be more productive to look at a priori judgments of authority. Credibility research has shown that, in practice, people rely in their information seeking more on simple heuristics than on complex calculated evaluations (e.g., Rieh and Hilligoss 2007).
Even if it has become customary to study the cognitive authority of information sources, the processual and contextual dependence of cognitive authority judgments may be taken to suggest that, in addition to entities like texts, journals and people, cognitive authority can also reside in activity. Hargittai et al. (2010) show how the search process, together with search context, branding, routines and social networks, plays an important role in credibility assessments. In the context of known information sources such as Wikipedia, credibility assessments also depend on the assumptions and knowledge of the forms of producing information (Francke and Sundin 2012). Jessen and Jørgensen (2012) have developed a model of aggregated credibility on the basis of these and other similar findings published in the literature. The model underlines the interplay of (external) authorities, social validation (by peers using e.g., comments or votes) and profiles (i.e., in online services, essentially a known, or perhaps rather claimed, identity). Even if the approach addresses the dynamics of credibility, the emphasis of Addelson (2003) on the active nature of cognitive authority, and on how it is implicitly and explicitly exercised in practice instead of being perceived as a static characteristic of things, warrants some further discussion. Because of its exercised nature, cognitive authority is also dependent on the event of exercise. It may be assumed that some instances of exercised cognitive authority have more cognitive authority because of the people, institutions and other influencers involved. The type and characteristics of the event, the plausibility of the results and, for instance, if known, the effort that was put into producing the results, all affect the level and emergence of cognitive authority.
Method and material
The material for the present study consists of 805 utterances related to information searching and search engine use collected from the Web. The heuristically generated phrases used in the harvesting of the utterances are listed in Table 1. The heuristics were based on an in-depth exploration of Web-based discussion forums and blogs to discern the patterns of expressing failed and successful searching. Each of the chosen expressions was tested by using a Google search (http://www.google.com) and reviewing the first ten results for their relevance to the present study, i.e., whether the utterances were related to searching or not. Different wordings (e.g., I searched in Internet) were tested and the final selection of phrases was based on the number of retrieved hits (phrases with a large number of hits were preferred) and their relevance to the topic (phrases with a low number of actual Web searching related hits were omitted) in the Google search test.
The data was collected using Webometric Analyst software (Thelwall 2009) with the Bing search engine API (Thelwall and Sud 2012) in November 2011. The software uses the API of a particular search engine to retrieve links to a set of pages containing specified phrases. The composition of the sample of retrieved links is determined by the search engine and its API. The applicability of the method for collecting data on human information behaviour has been discussed earlier by Huvila (2011b). The Web pages that contained the phrases were analysed using the constant comparative method (Glaser 1965) to discern patterns and similarities between the information seeking situations and their contexts, and to increase the validity of the analysis. The total number of hits returned by Webometric Analyst, the number of search-related utterances and the number of valid cases analysed in the present study are listed in Table 1. The validity of cases was determined by their relevance for the present study, i.e., whether the cases could be retrieved in the analysis phase (the page was still available) and whether they were related to search activity. Finally, fifty cases were dropped as obvious cases of spam (i.e., identical compliments for providing good information with a link to a link farm). The utterances were coded, and are referred to in the following, according to the information 'source' mentioned in the phrase (e.g., [G]oogle, [B]ing, [I]nternet, [W]eb) together with an index based on the original list of 3006 utterances.
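The coding scheme described above (a source letter followed by an index) can be illustrated with a minimal Python sketch. It is not part of the original analysis. The prefixes G, B, I and W are given in the text, whereas the readings of N, Y and A as net, Yahoo and Ask are assumptions inferred from the phrases in Table 1. The function names `parse_code` and `tally_sources` are hypothetical.

```python
import re
from collections import Counter

# G, B, I and W are stated in the text. N, Y and A are ASSUMPTIONS inferred
# from the Table 1 phrases (net, Yahoo, Ask), not confirmed by the source.
SOURCE_BY_PREFIX = {
    "G": "Google", "B": "Bing", "I": "Internet", "W": "Web",
    "N": "net", "Y": "Yahoo", "A": "Ask",
}

# One capital letter followed by the index in the original list of utterances.
CODE_PATTERN = re.compile(r"^([A-Z])(\d+)$")


def parse_code(code):
    """Split an utterance code such as 'G99' into (source, index)."""
    match = CODE_PATTERN.match(code)
    if match is None or match.group(1) not in SOURCE_BY_PREFIX:
        raise ValueError(f"not a recognised utterance code: {code!r}")
    return SOURCE_BY_PREFIX[match.group(1)], int(match.group(2))


def tally_sources(codes):
    """Count how many of the given codes refer to each source."""
    return Counter(parse_code(code)[0] for code in codes)


# Codes cited as examples in the results section of the paper.
example_codes = ["G99", "Y3", "W430", "G504", "I102", "N751"]
print(parse_code("G99"))
print(tally_sources(example_codes))
```

Keeping the prefix table in one place makes it easy to correct the assumed readings if the actual code book differs.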
The Association of Internet Researchers ethics guidelines (Ess and AoIR Ethics Working Committee 2002) were applied when collecting and analysing the data. Because of the personal nature of many utterances (even if they are publicly available on the Web), the examples from the research data have been chosen with special consideration for the original writers of the utterances.
Table 1: Phrases used in the harvesting of the utterances
I searched in Internet
I searched in net
I searched in the Internet
I searched in the net
I searched in the Web
I searched in Web
I searched on Ask
I searched on Bing
I searched on Google
I searched on Internet
I searched on net
I searched on the Internet
I searched on the net
I searched on the Web
I searched on Web
I searched on Yahoo
I searched using Bing
I searched using Google
I searched using Yahoo
I tried to search in Bing
I tried to search in Google
I tried to search in Yahoo
I tried to search on Google
I tried to search on Yahoo
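The harvesting phrases listed above follow a small number of stem-plus-target patterns. The following Python sketch regenerates the list from those patterns and separates phrases naming a search engine from generic Internet-, Net- or Web-related ones. The grouping into templates is an illustrative reconstruction from the list itself, not the author's documented procedure, and the names `TEMPLATES`, `build_phrases` and `names_engine` are hypothetical.

```python
# Stems and targets transcribed from the phrase list above. Grouping them
# into templates is an illustrative ASSUMPTION about how the list could be
# generated; the paper only says the phrases were heuristically generated.
TEMPLATES = {
    "I searched in": ["Internet", "net", "the Internet", "the net",
                      "the Web", "Web"],
    "I searched on": ["Ask", "Bing", "Google", "Internet", "net",
                      "the Internet", "the net", "the Web", "Web", "Yahoo"],
    "I searched using": ["Bing", "Google", "Yahoo"],
    "I tried to search in": ["Bing", "Google", "Yahoo"],
    "I tried to search on": ["Google", "Yahoo"],
}

NAMED_ENGINES = {"Ask", "Bing", "Google", "Yahoo"}


def build_phrases(templates=TEMPLATES):
    """Expand every stem with each of its targets, in listed order."""
    return [f"{stem} {target}"
            for stem, targets in templates.items()
            for target in targets]


def names_engine(phrase):
    """True if the phrase ends with the name of a specific search engine."""
    return phrase.split()[-1] in NAMED_ENGINES


phrases = build_phrases()
named = [p for p in phrases if names_engine(p)]
generic = [p for p in phrases if not names_engine(p)]
print(len(phrases), len(named), len(generic))  # prints: 24 12 12
```

The split into named and generic phrases mirrors the comparison made in the results between utterances that mention a specific search engine and those that refer to the Internet, the net or the Web in general.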
The thematic variation of using different phrases for expressing search attempts on different topics showed some distinct characteristics even if the topical variation did not seem to have a strong correlation with the phrasing of the utterances and claims of authority (cf. Huvila 2011a). Utterances containing references to cars seemed to occur more often with phrases containing a named search engine (7/172, 4%) than with generic Internet-, Net- or Web-related phrases (10/655, 1.6%). The references to the Internet with a mention of a specific search engine dominated in food, health, music and programming related utterances. In general, the topics ranged from personal questions to education (e.g., N490, I192), military aviation (e.g., G537), tractors (e.g., W180) and, for instance, cooking (e.g., W363, G489). A qualitative overview of the contexts of the utterances gives an impression of geographical variety in the origins of the utterances, with countries across the English-speaking world and a significant presence of non-native speakers. The popularity of the named search engines (Google 157 utterances, Yahoo 40, Bing 9) in the utterances is roughly similar to the published search engine use statistics.
The utterances showed varying levels of evaluation of the search results. In terms of the model of credibility assessment of Wathen and Burkell (2002), 422/805 (52%) utterances contained indicative evidence of no or at most a shallow surface level of evaluation of the results. Some searchers seemed to base their evaluation on mere a priori claims that a Google search leads to good information by default (e.g., W344, I139) while others made a judgment on the basis of "nice" pictures (Y82) or because the information "looks serious" (I224). 68/805 (8%) utterances contained message level evaluation or remarks on the necessity to evaluate the validity of the results. Searchers did, for instance, ask for comments about the validity of their findings (e.g., G99, Y3), they expressed doubts about the relevance of results (e.g., W430, G504), pondered the presence of contradictory (e.g., I102, N751) and similar information (e.g., N641, I600), or reflected upon the reviews of a particular piece of information (e.g., N7, N280). In 235/805 (29%) cases the utterances provide evidence of at least slightly deeper cognitive or empirical evaluation of the results. Searchers referred to their personal experiences of the relevance and quality of the information they had found (e.g., G84, W91), or vice versa, the similarity of their own prior experiences and the available information (e.g., N510, N707), and complimented bloggers and Web site owners on the practical helpfulness of the provided information (e.g., B4, I169).
The analysed utterances could also be categorised according to the presence of three major sources of authority (Table 2). The categories were constructed on the basis of authority claims made in direct conjunction to the phrase of utterance. Due to the nature of the empirical material, anecdotal secondary references to other authorities were recorded but omitted in the final categorisation.
|Source of authority
|Search engine use
|No explicit authority claim
First, 269/805 (33.4%) utterances contained indications of the influence of people (groups, communities and individual Internet users) as an authority. Searchers relied and made claims on the basis of the assumed topical expertise and experience of individuals (including themselves). Utterances contained requests for comments on the authority of found information (e.g., G99, N418), references to good reviews (e.g., N7, N280) and expressions of disappointment at the presence of bad information provided by reputable communities or individuals on the Internet (e.g., N34, G302). One searcher chose a travel company because of the number of 'good comments' (I206) she found. Whereas positive reputation could function as an authority, the lack of reputation was considered to be a sign of its absence: 'I wonder why you recommend the Qubz drive. I searched on the Web for information about it without success' (W17). Some of the utterances confirmed the validity of the results of a search or vice versa. For instance, searchers might have used a particular service (I206), tried the usefulness of the information in practice (I349), or they could confirm the information about a location they had visited (N586). On the other hand, the similarity to their own earlier experiences or symptoms (e.g., N510, N571) could confirm the validity of something the searchers had found on the Web. Others were asking for confirmation of something they had found on the Web (e.g., N121, I202). A qualitative reading of the utterances suggests that a large number of these questions are related to safety (e.g., of using something or travelling somewhere) and the practical reliability of, for instance, procedures and devices. Searchers were asking about the dependability of, for instance, particular medicines (in the context of health related questions) or procedures (technology related questions).
In some rare cases the authority was named, for instance, 'I searched in net and even Stephen Hawking [was] not saying much about this' (N234).
Secondly, in 132/805 (16.4%) cases the searchers made claims about the usefulness of search engine use (as an approach to becoming informed) by making inferences about the credibility of information using a series of shortcuts and a priori assumptions. The relevance ranking of results in search engines, comparison of results on multiple search engines and the presence of confirmatory (i.e., other people had had similar problems or the same information could be found in multiple sources, e.g., I180, I320, N279) or contradictory information were mentioned as shortcuts for making quality judgments on unfamiliar topics. Some searchers mentioned a particular search query (e.g., G224, G242, G246, G515) as an authoritative reference to a particular piece of information. Trust in the authority of ranking was expressed in claims that relevant results 'are' among the first ranked results, but also that no out-dated or (subjectively) irrelevant information should be found among the top results (e.g., B21, G516). "Before when i searched on Bing.com it came up first thing. Not any more. But it will eventually find it." (B3) is a strong indication of how the reliance on search engines influences information seeking and management practices. At the same time it demonstrates the perceived authoritativeness of a named search engine not only as a momentary source of unspecific useful information, but also as a persistent point of access to a particular piece of information.
A parallel aspect of the prominence of searching as an authority was the prevalence of indications of the significance of a priori assumptions. One of the most prevalent assumptions was that of the intrinsic quality of a particular search engine (e.g., G295) or the Internet as a whole as a good source of information (e.g., I535, I853, I863, N124, N156, N269, N565, G477) that is both accessible and easy to understand (e.g., G426, G476). Some searchers remarked that the Internet is a better source of information than certain individuals (e.g., mother-in-law in N718) or another type of information source (e.g., customer support, G245). In N242, the searcher remarked that the Internet is a good source of information for Internet-related matters. In B1, the searcher writes: 'Naturally, I searched on bing and goog to see if anyone had solved these issues', and in I85: 'Well the next day I searched on internet (after-all I am a netizen)'. Utterance I853 contains an even stronger expression of the validity of information found 'in the Internet': 'i found it[ ]unusual coz i searched in the internet and it says that it can only be found in the US'. Another common articulation was that if something is not found using a search engine, it does not exist, or, conversely, that a hit in a search engine is positive evidence of the veracity of a claim (e.g., N189, G383). For instance, 'I searched in the Internet right away and cried after I found that the news is true' (I884). The same contention also implies that it is strange or even suspicious if something is not found on the Internet (e.g., G407, G420). Searchers also remarked on their experiences of the superiority of particular search engines in comparison to others (e.g., B21, Google is better than Bing).
Finally, in 303/805 (37.6%) cases the searchers appeared to consider the search activity itself as an authority. Instead of placing explicit trust in their use of a Web search engine as a useful tool for retrieving information, the utterances contained descriptions of how the (implicit or explicit) effort or diversity of performing a search implies that the results have to be correct or relevant. Similarly, a poorly performed search (as perceived by the searcher) was seen as a legitimate reason to question the usefulness or accuracy of the results. The most prevalent type of expression that vested authority in the search activity was one in which searchers legitimated their questions by claiming that they had searched but failed (e.g., G392, G395, I450), sometimes regretting their poor search skills (e.g., G409), or that they were unable to find an additional piece of information (e.g., N127, G537). Searcher G260 claimed to have been searching for a piece of information 'for over 30 minutes'. In G453, the searcher described the results of an earlier search attempt and promised to provide additional information 'if I find more'. The veracity of the claims and especially the exhaustiveness of the efforts may be doubted when searchers ask simple questions that contain all necessary search terms. The positive reactions to the expressions of search effort (e.g., G532) and the popularity of the custom of legitimating a post containing an answer to a question on a discussion forum or a blog by stating that no earlier answers could be found (e.g., W12, I177) may be an indication of an assumption of the significance of a search as an authoritative act.
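The proportions reported for the three sources of authority can be re-derived from the counts given in the text. The short Python sketch below does this arithmetic. The category labels are shorthand chosen for this illustration, and the remainder is only meaningful under the assumption, not stated in the text, that each utterance was assigned to at most one category.

```python
TOTAL_UTTERANCES = 805

# Counts as reported in the text. The labels are shorthand for illustration.
AUTHORITY_COUNTS = {
    "people": 269,
    "search engine use": 132,
    "search activity": 303,
}


def percentage(count, total=TOTAL_UTTERANCES):
    """Share of the total, in per cent, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: percentage(count)
          for label, count in AUTHORITY_COUNTS.items()}

# ASSUMPTION: the categories are mutually exclusive. Only then does the
# remainder correspond to utterances outside the three categories.
remainder = TOTAL_UTTERANCES - sum(AUTHORITY_COUNTS.values())

print(shares)
print(remainder)
```

Under that assumption the three shares come to 33.4, 16.4 and 37.6 per cent, matching the figures in the text, and 101 utterances fall outside the three categories.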
Discussion and conclusions
Even if contemporary search engines provide us with certain contextual cues, the traditional premiss of a search system as a tool for factual retrieval (criticised e.g., by Marchionini 2006) has not become obsolete. The findings of this study show that it still dominates the minds of the searchers as much as it underpins the design of search systems. Even if contextuality has been acknowledged for a long time as a fundamental premiss of information seeking and credibility assessments (Rieh and Hilligoss 2007) and higher levels of information literacy (Alexandersson and Limberg 2005), the recurrence of ad hoc articulations (and consequent assumptions of their plausibility) of using the Internet as a source to check the verity of statements, acontextual references to reliability claims, the rarity of articulations of complex evaluations of search results, and the idea of 'searching' (both in terms of using a search engine and engaging oneself in the activity of searching) as a sufficient precondition of a successful retrieval of authoritative results give an impression that searchers have a tendency to conceptualise searching in terms of factual retrieval.
At the same time, however, the general diversity of articulations suggests a high degree of situational contextuality of the authorities. In this sense, the paralleling of Web searching and factual retrieval is an equally contextual matter. Simplistic evaluation criteria and reliance on ad hoc articulations are problematic only if they are unreliable in practice. Instead of being an indication of a low level of information literacy (e.g., Alexandersson and Limberg 2005), a specific assumption of the possibility of factual retrieval can also be an indication of a knowledge of the contextual adequacy and reliability of particular cognitive authorities. Similar to Hargittai et al.'s (2010) study, many utterances contain references to search context, brands (e.g., Google and Bing), routines (searching as a routine), and the social contexts of particular discussion forums or blogs, or the community of Internet users as a whole. Paraphrasing Addelson (2003), the cognitive authorities articulated by searchers are exercised rather than static entities. By their utterances, the searchers do not only refer to existing authorities. The utterances themselves put cognitive authority on searching and contribute to the evolution of general assumptions of the reliability and usefulness of the activity. In spite of their exercised nature, the categories of authority identified in the material meet the two premisory criteria of cognitive authority. The utterances contain references to various forms of expertise (both as expertise and as a broader dependability) and reputation (Wilson 1983), of which the existence of the utterances themselves is an illustrative albeit not the sole example.
In comparison to classical types of cognitive authorities such as institutions or individuals, most of the articulated authorities in the material (apart from the relatively few examples of named individuals or sources e.g., N234) reside on a higher level of abstraction and are reminiscent of the contextual rather than static relevance criteria (e.g., Savolainen and Kari 2006), types of credibility assessments (e.g., Wathen and Burkell 2002) and characteristics of cognitive authorities (e.g., Taraborelli 2008; Addelson 2003) discussed in the earlier literature. In contrast to the observations of Savolainen (2007), the analysed utterances give an impression that the authoritativeness of search engines and the Internet is not merely a question of the perceived credibility of the media. The media and their use produce cognitive authorities in the utterances that make assumptions of the existence or veracity of claims on the basis of their findability on the Internet and/or using the major search engines. The present observations do, however, correspond to those of Savolainen in the sense that the cognitive authorities identified in the present study are abstract and situated. It is doubtful whether they would be acknowledged as authorities (in a traditional sense) if the question of their authoritativeness were discussed in an interview as Savolainen did in his study. As with Savolainen's informants, even if a large majority of the analysed utterances show that searchers rely on the authoritativeness of Web searching, it is highly doubtful whether the authors of the analysed utterances would explicitly acknowledge searching as an absolute authority per se.
Besides being indicative of the presence of certain typical cognitive authorities, the utterances provide direct evidence of the ways in which cognitive authorities are articulated in everyday life contexts and how the articulations influence the seeking, creation and sharing of information on the Internet. A common utterance that a searcher was unable to find an answer or a solution, followed by the missing piece of information (e.g., W12, I177), shows how the assumed authoritativeness of a search may function as (obviously) a partial incentive to create and share new information if it seems that no earlier information is available. In some cases, the searchers made a direct reference to an absolute inference of the veracity of a statement on the basis of a successful or an unsuccessful search. Good and bad reviews but also the lack of reputation seemed to be enough to cast doubts on the plausibility of a particular claim (e.g., W84) and to provide incentives for making further inquiries. The presupposed authoritativeness of search engines as a reliable and easy to use source of information also functions as a justification to ask for help in case of a failed search (e.g., G12, A11). If, and as it seems in many contexts only if, a search fails, it is socially acceptable to post a direct question. It seems that the assumption of the necessity to search has become a 'ritual of verification' (Moss 2011). We are supposed to perform a ritualistic search before doing anything else even if the actual search were an entirely nominal effort without any practical relevance.
There are some evident limitations in the study. The material was collected from the Web and is likely to represent only a fraction of all possible expressions related to searching for information. Since the analysed utterances focus on explicit acts of searching, it is apparent that the data is less useful for analysing the impact of such general aspects of the cognitive authority of Web searching as, for instance, the impact of privacy concerns, the 'filter bubble' or commercialist aspirations of search companies. This limitation relates to the premises of the data collection method, which is based on hypothetical assumptions of the potential relevance of particular utterances in explaining certain phenomena. The analysed material per se represents an unknown sample of English-speaking users of particular types of mostly conversational Web services, but the utterances are articulations and, as such, significant expressions of the assumptions of credibility. At the same time, however, it is reasonable to expect that the individual articulations express only some of the complexity of the underlying processes of making inferences about the credibility of information. It is also reasonable to believe that some of the utterances can be more plausibly explained by other factors than as references to authority and credibility. Finally, the anecdotal nature of the evidence makes it impossible to analyse complete search processes (as in e.g., Rieh 2002) and to make advanced inferences of their contexts. In spite of their limitations, the data and the data collection method have advantages. The material is likely to represent more naturalistic and contextually realistic utterances than an interview or a survey study could produce. When searchers are not explicitly asked to reflect on their cognitive authorities, it is possible to observe how the ideas of authority function in everyday life contexts and what their impact and practical relevance in information seeking and socialising on the Internet are.
The findings have twofold implications for systems design. Besides the quality and relevance of retrieved information (and measurement of search effort), the perceived authoritativeness of the search exercise could be a significant indicator of the level of success of a query. If a searcher sees that the cognitive authority of a particular search is low, the results may be seen as less reliable and vice versa. Search systems might be designed to provide information on the potential authority of the search from the system's point of view. The fact that searchers seem to be at least rather well aware of why their searches might fail also suggests that this information might be used to increase the success of searching by providing help on the aspects of search that were perceived to fail. In this sense cognitive authority information could function as an additional form of relevance feedback. An important topic for future studies is whether it is possible to automatically identify patterns of search behaviour that are directly attributable to the perceived cognitive authority of searches, and how to operationalise such data.
The author would like to thank the anonymous reviewers for their valuable comments and suggestions on earlier drafts of this paper. The research was conducted while the author was working at the Department of ALM, Uppsala University.
About the author
Isto Huvila is a Senior Lecturer in information management at the School of Business and Economics, Department of Information Studies, Åbo Akademi University in Turku, Finland and an associate professor and research associate at the Department of ALM at Uppsala University in Sweden. His primary areas of research include information work and information management, knowledge organisation, documentation, and social and participatory information practices. He received an MA degree in cultural history at the University of Turku in 2002 and a PhD degree in information studies at Åbo Akademi University (Turku, Finland) in 2006. He can be contacted at: Isto.Huvila@abo.fi