vol. 15 no. 1, March, 2010
The Web, by its nature, allows anyone to publish without regulation, which has raised concerns about the credibility of the information found there. Although recent studies elucidate the factors affecting users' credibility judgments and practitioners have suggested guidelines to help users develop a keen eye for assessing information on the Web, this work is limited to the evaluation of individual Websites or of Websites as a medium. Less well understood in the literature are users' credibility judgments in social media such as blogs, wikis, and social question and answer sites. Characterized by user participation and openness in creating and managing content, social media add even more complexity to the already complicated task of filtering out credible information on the Web because users evaluate information given by fellow users whose expertise is hard to assess with traditional credibility cues such as an author's affiliation.
Among the variety of social media, this study is particularly concerned with social question and answer sites. A social question and answer site is a community-based Website where people ask and answer questions for one another. Since AnswerBag was first introduced in 2003, these sites have been growing rapidly in size and importance in the realm of online searching. According to Hitwise (2008), U.S. visits to such sites increased 898% between February 2006 and February 2008 and the average visit time among the top five sites increased 44% between March 2007 and March 2008. This dazzling success of social question and answer sites is largely attributed to the millions of lay people who volunteer to answer others' questions. However, these sites are criticized for precisely the same reason: anyone can post answers without a peer-review process, so the quality of specific answers varies drastically, ranging from excellent to abuse and spam (Su et al. 2007). Despite widely-held concerns about the credibility of user-generated content on the Web, little is known about how people evaluate the credibility of answers given by lay people in a social question and answer site. To fill this void, this study aims to explore users' experience of credibility judgments in a social question and answer site. Because assessing credibility is an ongoing and iterative process throughout the information-seeking process, rather than a one-time action (Rieh and Hilligoss 2008), the study examines users' evaluation of individual answers in the site along with pre-search activities and post-search verification behaviour.
The popularity of social question and answer sites has grown rapidly over the last couple of years, and thus it seems timely to study users' credibility judgments in the environment. The investigation of real users having real questions for their everyday life tasks in such a site extends previous credibility research to a novel environment and thus presents a fuller picture of Web credibility.
Credibility is conceptualized as a multidimensional construct including believability, trust, reliability, accuracy, fairness, objectivity, and other concepts (Self 1996). Although there is no universal agreement on what dimensions constitute credibility, the notion of credibility is widely accepted as believability (Tseng and Fogg 1999) with two key dimensions: trustworthiness and expertise (Hovland et al. 1953). How to sort out believable from unbelievable information is not a new problem, but the Web provides unprecedented conditions with its relative lack of centralized quality control mechanisms and source attributions, along with its complex interface features and speed of growth (Burbules 2001; Danielson 2005). Over the past decade, researchers have increasingly recognized the importance of credibility judgments of Web information, which has fueled a burgeoning research area, Web credibility, and thus, a number of studies (for a comprehensive literature review, refer to Self 1996; Wathen and Burkell 2002; Metzger et al. 2003; Rieh and Danielson 2007).
Drawing on four years of research on Web credibility by Stanford's Persuasive Technology Lab, Fogg (2003) proposed prominence-interpretation theory, which posits that a person's credibility judgment involves two stages: first, a user notices elements of a Website (prominence), and then makes a judgment about them (interpretation). If a specific element is not noticed, it cannot have any impact on the credibility judgment of the site. In other words, only the elements that are both noticed and interpreted have an impact on credibility judgment. While some previous studies focus on interpretation by presenting users with specific Website elements to see their impact on credibility judgment (e.g., Fogg et al. 2001), others focus on prominence by letting users notice certain elements as they explore the Websites. This study is in line with the latter approach as it asks users to recall the most prominent elements of an answer that led them to accept or reject the answer as credible information.
Empirical studies have uncovered a set of criteria that influence credibility judgments, which can be grouped on three levels (Rieh and Belkin 1998, 2000; Fogg et al. 2000; Rieh 2002; Eysenbach and Kohler 2002; Fogg et al. 2003; Freeman and Spyridakis 2004; Liu 2004; Liu and Huang 2005):
The criteria on the different levels interact with one another. For example, the characteristics of the source determine perceptions of credibility of information under the assumption that credible sources produce credible information. Conversely, the attributes of information are used to ascertain source credibility in the absence of knowledge of the source (Rosenthal 1971; Slater and Rouner 1996). Additionally, people transfer the credibility assessment of a media product or information channel to the information within it (Flanagin and Metzger 2008a; Hilligoss and Rieh 2008). For example, students perceive books and scholarly journal articles as more credible media than the Web and blogs. However, while people perceive certain media or channels to be non-credible, they still have good reason to use them (Hilligoss and Rieh 2008). For example, blogs may lack credibility because they are opinion-based, but they are useful for getting new ideas. Therefore, how users perceive the credibility of a social question and answer site and what they look for are likely to influence their decision to use the site and the way they evaluate information in the site.
The relative importance of each criterion varies from study to study because of the characteristics of participants, types of source, type of information, and other conditions. In Rieh's (2002) study, for example, scholars were more concerned with content and source reputation than with presentation, graphics, and functionality. In similar fashion, Hong (2006) found that undergraduates regarded message features (e.g., quotations, reference sources) as more important than Website structural attributes (e.g., domain names and site maps). The general public in the study by Fogg et al. (2003), on the other hand, evaluated the Website's appearance more frequently than any other feature. In a comparative study of undergraduates' and graduates' credibility judgments, Liu and Huang (2005) found that while undergraduates predominantly relied on an author's name, reputation and affiliation, graduate students paid more attention to information accuracy and quality. These findings confirm that experts tend to evaluate content more than other attributes (Stanford 2002).
It should be noted that credibility judgment is an ongoing process rather than a discrete evaluation event as people make three distinctive kinds of judgments: predictive judgment, evaluative judgment, and verification (Rieh and Hilligoss 2008). People make predictive judgments about a source from which information will be gathered, and upon the selection of a source, make evaluative judgments of information presented in the source. People may attempt to verify information because they are uncertain about the credibility of information when first encountered or because they find the information incorrect later after initially accepting the information. Gray et al.'s (2005) health information seekers verified information they obtained from the Web with personal sources offline. Undergraduates in Flanagin and Metzger's study (2000) also performed verification, but did so only rarely or occasionally. Furthermore, people's verbal reports do not accurately represent their actual behaviour. Often, people report they verify credibility, but their actual behaviour belies what they claim (Flanagin and Metzger 2000).
While the previous studies provide a valuable theoretical framework, which can serve as a conceptual basis for this study, social question and answer sites offer a unique venue for understanding Web credibility judgments for two reasons. First, evaluating specific answers is different from evaluating entire Websites, in that individual lay people provide information directly to questioners. Given that noting the institutional level of quality markers such as URL domain or author affiliation is a common strategy in Website evaluation (Rieh 2002), the absence of such quality markers in social question and answer sites may force users to more critically scrutinize answers or the credentials of potentially unqualified answerers. Second, still in its infancy as a medium, a social question and answer site requires users to not only possess general evaluative skills, but be acquainted with the new features of the site, which can be used as information quality markers or credential clues (e.g., an answerer's profile). Some users, especially novices, may have yet to develop effective evaluative skills specific to this environment.
The advent of social question and answer sites adds another layer of complexity to Web credibility with its unique nature. Knowing how users evaluate answers given by lay people will broaden our understanding of Web credibility in the era of user participation and content creation.
Quality of answers in a social questions and answers site
The extreme variability in the quality of individual answers in a social question and answer site has led to recent studies that seek to investigate users' evaluation of answers or to automatically identify determinants of answer quality with the purpose of improving the function of the site.
Gazan (2006) examined the influence of answerers' roles on questioners' evaluations of answers in AnswerBag. He grouped the role of answerers into two types: specialists and synthesists. Specialists provide answers based on their knowledge without referencing external sources, while synthesists provide answers using external sources without claiming any expertise. The study showed that the synthesists' answers were generally rated higher than were those of specialists by questioners. Kim and Oh (2009) identified twenty-three relevance criteria questioners use when they select the best answers through content analysis of 2,140 comments questioners left on the best answers in Yahoo! Answers. Their findings illustrate that users evaluate not only the content of answers (e.g., accuracy), but socio-emotional values (e.g., emotional support), utility (e.g., usefulness), information sources (e.g., author's expertise, external links), and other criteria for the selection of best answers. Kim and Oh (2009) and Gazan (2006) did not explicitly examine credibility, but it is obvious that their participants considered credibility when evaluating answers, as evidenced by the commonly identified criteria such as references to external links and answerer's expertise.
Adamic et al. (2008) predicted a questioner's choice of best answers based on the attributes associated with the content and answerer, drawing on 1.2 million questions and 8.5 million answers in Yahoo! Answers. Their findings showed that the length of the answer was most indicative across topic categories (lengthier answers tend to be selected as best answers). In certain topic categories, however, the number of competing answers and the history of the answerer were more likely to predict answer quality. These findings are consistent with Agichtein et al.'s (2008) study, which found that answer length is dominant over other answer features in predicting answer quality. In the same vein, to predict questioners' satisfaction with the answers presented in Yahoo! Answers, Liu et al. (2008) built a model by extracting six sets of features: question, question-answer relationship, asker user history, answerer user history, category features, and textual features. Among them, an asker's rating (satisfaction) of an answer in response to a previous question was the most salient feature for predicting satisfaction with a new answer. On the other hand, the reputation of the answerer was much less important, suggesting that the authority of the answerer might only be important for some, but not all, information needs.
Interestingly, some researchers and operators of social question and answer sites speculate that one of the reasons such sites are not regarded as reliable sources of high-quality information is the prevalence of conversational questions (e.g., Do you believe in evolution?) as opposed to informational questions (e.g., What is the difference between Burma and Myanmar?) (Harper et al. 2009). Since these two question types are asked for different purposes, people may apply different sets of criteria when evaluating answers provided for each type of question. Therefore, this study categorizes question types into conversational and informational questions and examines their influence on credibility judgments.
The purpose of this study is to investigate users' credibility judgments in a social questions and answers site. Recognizing credibility judgment as an ongoing process instead of a discrete activity, the study describes a sample of users' motivations to use a social questions and answers site, credibility judgments of answers, and post-search verification behaviour. In addition, it investigates the relationships between question type and credibility judgment. The specific research questions the study addresses are as follows:
This study is part of a larger project whose aim is to understand the information seeking and providing behaviour of questioners and answerers in a social questions and answers site. Since the project was necessarily descriptive and exploratory in nature, interviews were conducted to investigate both questioners' and answerers' experiences of the site. This study reports only on the interviews with questioners regarding their credibility judgments. E-mail, chat, and telephone interviews were held with thirty-six questioners of Yahoo! Answers and the interview transcripts were analysed using the constant-comparison method (Lincoln and Guba 1985).
Yahoo! Answers
This study selected Yahoo! Answers as a research setting because of its dominant status among social question and answer sites. As of March 2008, Yahoo! Answers was the most visited question and answer site in the U.S., accounting for 74% of all visits (Hitwise 2008). It has attracted twenty-five million users with 237 million answers in the U.S. and 135 million users with 500 million answers worldwide (McGee 2008). The astonishing scale of data and diversity of topics have made Yahoo! Answers a popular setting for recent research on such sites despite its short history (Agichtein et al. 2008).
The process of asking and obtaining answers to a question is quite simple: a user (questioner) posts a question under a relevant category from twenty-five top-level topic categories and it becomes an open question. Once the question is posted, any user (answerers) can post answers to it. Among all answers posted, the questioner can select the best answer or, alternatively, allow the community to vote for the best answer. When a best answer is chosen, either by the questioner or by the vote, the question becomes a resolved question and remains in the Discover section for browsing and searching.
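The lifecycle described above (an open question collects answers, then becomes resolved once a best answer is chosen by the questioner or by community vote) can be sketched as a small state model. This is only an illustration: the `Question` class, its method names, the "Pets" category label, and the tie-breaking rule are hypothetical and do not reflect Yahoo!'s actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Question:
    """Hypothetical model of one question; not Yahoo!'s actual data model."""
    text: str
    category: str
    answers: List[str] = field(default_factory=list)
    best_answer: Optional[str] = None

    @property
    def status(self) -> str:
        # A question stays open until a best answer has been chosen.
        return "open" if self.best_answer is None else "resolved"

    def post_answer(self, answer: str) -> None:
        # Any user may post an answer to the question.
        self.answers.append(answer)

    def select_best(self, answer: str) -> None:
        # Path 1: the questioner picks the best answer directly.
        if answer not in self.answers:
            raise ValueError("best answer must be one of the posted answers")
        self.best_answer = answer

    def resolve_by_vote(self, votes: Dict[str, int]) -> None:
        # Path 2: the community votes.
        # Assumption: ties go to the earliest posted answer.
        if not self.answers:
            raise ValueError("no answers to vote on")
        self.select_best(max(self.answers, key=lambda a: votes.get(a, 0)))


# Illustrative question and answers (invented for this sketch).
question = Question("What kind of spider is this?", "Pets")
question.post_answer("It looks like a wolf spider.")
question.post_answer("Probably a house spider.")
status_before = question.status
question.resolve_by_vote({"Probably a house spider.": 3, "It looks like a wolf spider.": 1})
status_after = question.status
```

Either path ends in the same resolved state, which is why the sketch routes the community vote through `select_best`.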
To encourage user participation and reward high quality answers, Yahoo! Answers implements a point system and, based on the points, categorizes members into different levels (Yahoo! Answers 2009). For example, an answerer gets two points for each answer s/he posts and gets ten points when the answer is selected as the best answer. When questioners choose the best answer for themselves or vote for the best answer for others they also get points. The earned points allow everyone to recognize how active and helpful a user has been in the site.
Data collection and analysis
Starting November 2008 and ending April 2009, a solicitation e-mail was sent to 750 Yahoo! Answers users individually for the project. Each week during the twenty-five week period, thirty users of Yahoo! Answers (fifteen questioners and fifteen answerers) were selected from one of twenty-five top-level topic categories in the Discover section using three criteria. The first criterion was to select those who asked a question most recently in each topic category. Because of the heavy traffic, the selected participants had usually asked questions within the last day. The second criterion was to select those whose Yahoo! e-mail addresses were public in their profiles, so that the researcher could contact them by e-mail. The third criterion excluded those who explicitly stated in profiles, questions or answers that they were under 18.
In the solicitation e-mail, the participants were given four options of interviewing: telephone, e-mail, chat, and face-to-face (for nearby participants only). Given that Yahoo! Answers users are geographically dispersed and they vary considerably in terms of Internet proficiencies and writing skills, it was an appropriate choice to provide as many interview methods as possible. By allowing people to select the interview method they felt most comfortable with, the weaknesses associated with each method were expected to be reduced to a minimum.
For the bigger project, two types of semi-structured interview questionnaires were prepared: one was for questioners and the other for answerers. The participants were asked to select the type of questionnaire(s) they would like to complete based on their questioning and answering experience in the site. As aforementioned, this study reports only on the data derived from the interviews with questioners. The semi-structured interviews for questioners included seven open-ended questions about:
A questioner's familiarity with the topic of the question, urgency of the information need, experience with the site, and demographic information (age, sex, occupation) were solicited at the end of the interview. The Critical Incident Technique was used to help the participants focus on their most recent questions and evaluation processes. This is a popular interview technique used to identify specific incidents which participants experienced personally rather than eliciting their generalized opinions on a critical issue. Since the purpose of this study is to describe how the participants assess information, the method was useful in drawing out realistic details without observing them directly.
The 750 e-mails sent yielded thirty-six interviews with questioners and forty-four interviews with answerers. A possible reason for the low participation rate is the use of Yahoo! e-mail addresses as a contact method to reach potential participants. Creating a Yahoo! e-mail account is mandatory to register for Yahoo! Answers. People create the e-mail accounts as a means to become a member of the site, but not all of them actually use the accounts. This may have caused a high undeliverable rate.
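The sampling and participation figures reported above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the ratio is interviews per e-mail sent rather than a count of unique respondents, since participants could complete both questionnaire types.

```python
# Figures taken from the text: 25 weeks, 30 users contacted per week
# (fifteen questioners and fifteen answerers).
WEEKS = 25
USERS_PER_WEEK = 30

emails_sent = WEEKS * USERS_PER_WEEK

# Completed interviews reported for each group.
questioner_interviews = 36
answerer_interviews = 44
interviews = questioner_interviews + answerer_interviews

# Interviews obtained per solicitation e-mail (not unique respondents).
rate = interviews / emails_sent

print(f"{emails_sent} e-mails, {interviews} interviews, {rate:.1%} interviews per e-mail")
```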
Among thirty-six interviews with questioners, there were 17 e-mail interviews, 10 through Internet chat (Chatmaker and Yahoo! Messenger), and 9 by telephone. Each chat or telephone interview took approximately 40 minutes to an hour and a half. During the chat and telephone interviews, some questioners pulled up their questions in the site and walked through their evaluation processes with associated answers. The chat session transcripts were automatically recorded and the telephone interviews were audio-taped and transcribed verbatim. Five follow-up interviews were conducted with the e-mail interviewees for clarification and missing data.
The data was analysed using the constant-comparison method of content analysis (Lincoln and Guba 1985). The researcher read through the transcripts and classified individual statements into categories with a simultaneous comparison of other categories. Throughout the process, themes formed inductively, guided by the interview questions and patterns emerged to provide various perspectives on central issues. To see the influence of question type on credibility judgments, the participants' questions were categorized into two groups as in Harper et al. (2009):
Conversational questions are intended to spark a discussion and do a poll on a particular issue, and therefore are not expected to have one correct answer. On the other hand, informational questions are intended to call for facts, procedures, recommendations for products and services, or sources of information. This type of question is expected to have one correct answer or appropriate recommendation.
For the credibility criteria, the researcher and a library science graduate student coded the mentions of criteria independently. Through the initial coding, the two coders developed a codebook iteratively by reaching a consensus on the analysis of shared transcripts. The codebook included a list of the criteria the participants used along with their definitions and examples. With the finalized codebook, the coders coded all of the transcripts again independently. After this round of coding, inter-coder reliability was calculated using Cohen's kappa, which was 0.77. Values between 0.61 and 0.80 are regarded as substantial agreement (Landis and Koch 1977). All disagreements were resolved through discussion to reach consensus.
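For readers unfamiliar with the statistic, the inter-coder reliability check described above can be sketched as follows. Cohen's kappa compares the observed agreement between two coders, p_o, with the agreement expected by chance from each coder's label frequencies, p_e, as kappa = (p_o - p_e) / (1 - p_e). The function name and the ten example codes below are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)

    # Observed agreement: share of items given identical labels.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: sum over labels of the product of the two
    # coders' marginal proportions for that label.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_e == 1.0:
        # Both coders used one identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes assigned to ten statements (not data from the study).
coder_1 = ["logic", "logic", "tone", "fact", "tone", "logic", "fact", "tone", "logic", "fact"]
coder_2 = ["logic", "logic", "tone", "fact", "logic", "logic", "fact", "tone", "tone", "fact"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

Because kappa discounts chance agreement, it is lower than the raw agreement rate: in this invented example the coders agree on eight of ten statements, yet kappa falls below 0.8.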
To verify the researcher's interpretations and conclusions, a member check of the results occurred with four participants, selected based on the number of times they were quoted in the study. A member check is regarded as the most critical method in establishing validity in a qualitative study (Lincoln and Guba 1985: 314). A preliminary draft of the results was sent to the four participants and they confirmed and agreed with all the findings presented.
Most participants were male (72%, n=26) (Table 1). The participants ranged widely in age from 18 to 67 (mean: 37, SD: 13.6), although over half of them were in their 20s or 30s. It was assumed that the participants would be from the United States, but at least two of them were from other countries, as revealed by the questions they asked. The participants greatly varied in their occupations including student, banker, bus driver, computer programmer, graphic designer, aerospace engineer, hair stylist, homemaker, unemployed, and more.
Regarding the experience with Yahoo! Answers, most participants (78%, n=28) had used the site for over a year as of the time of interviewing. With respect to the frequency of using the site, two-thirds of the participants (67%, n=24) reported using the site at least three to four times a week while the rest (33%, n=12) were occasional users. Considering the participants' long experience with the site and the frequency of use, a majority of the participants were experts accustomed to using various features of the site. In addition, most of the questioners were familiar with the topics of their questions (78%, n=28), but only a small number of them had urgent information needs (22%, n=8).
Motivations to use Yahoo! Answers
The analysis found that thirteen questioners (36%) asked conversational questions and twenty-three (64%) asked informational questions. The type of question the participants asked was closely tied to their motivation for using Yahoo! Answers.
For those questioners who asked conversational questions, Yahoo! Answers was a natural choice because they 'could get answers from millions of real people' (Participant 11 (P11)) and answerers were believed to have 'an ability to answer as openly as possible' (P16). While three out of the thirteen questioners did a pre-search for background information, most did not feel a need to search information prior to a discussion. Put differently, when the questioners wanted to initiate a discussion or do a poll on a specific issue, they usually went to Yahoo! Answers directly without consulting other sources. In this case, the credibility of the site was not an important consideration. Instead, the questioners sought tools that allowed them to participate in conversation, as one participant prioritized the social interaction taking place in the site over its trustworthiness:
'It's not about trustworthy at all. I like when people notice my question and answer it, and in the end make me smile, cheer me up.' (P12)
The other participants asked informational questions to look for solutions to problems at hand, to expand knowledge on a topic, or to find a fact. As opposed to those who asked conversational questions, most of this group searched information before coming to Yahoo! Answers, mainly using the Web or interpersonal sources. For example, a college student working part-time in technical support services at his college was facing difficulty in accessing his local network from one computer. After performing many Internet searches, calling the hardware manufacturer, and consulting a few of his colleagues at work, he went to Yahoo! Answers and finally found what the problem was. While this example demonstrates the use of Yahoo! Answers as the last resort when searches fail with other sources, several questioners used the site to confirm the information they gathered from other sources, as P13 wanted to check if the information that his mechanic gave to him was real and could be trusted.
One participant illustrated the type of questions that can be better addressed by a social questions and answers site than a general Web search engine. When he found a spider in his bedroom, he took a picture of it and posted a link to the picture in the site to identify what kind of spider it was:
'Search engines like Yahoo! and Google, and whatever in words, but in picture, it won't search the pictures for it. So I figured the only way would be to ask a person. So when I ask Yahoo!, I can ask the people.' (P34)
In a nutshell, the questioners came to the site because of its abilities to deal with difficult questions calling for discussion, personal advice, suggestion, or other information that cannot be easily answered by a traditional Web search engine. Moreover, previous positive experience with Yahoo! Answers raised its perceived credibility and motivated the questioners to use it as an information source. Seven questioners made comments such as:
'Whenever I ask a question on here, I get great results that I can trust.' (P6)
It is evident that the perceived credibility of the site influences one's decision to use it. It is not always the first and foremost factor, however, because questions may not require credible information. For example, the ability to compare a wide spectrum of others' opinions can come first before the credibility issue, depending on the goal of asking a question.
The next section examines the questioners' opinions on the credibility of the site in more detail.
Overall credibility of the site
With respect to the overall credibility of the site, nearly one third of the questioners (31%, n=11) gave the site a highly favourable evaluation, as P30 states:
'[the site is] trustworthy enough, since people by and large come here to have a good time, their intentions are good and they may be sarcastic but mean no harm.'
As opposed to these questioners who trusted the goodwill of the people giving them answers, ten questioners (28%) were skeptical about the competencies of Yahoo! Answers' users as information providers:
'Who are motivated by bias, a lot motivated by hate, many just exhibiting ignorance of the topic, they have this air of immaturity. A lot of people are answering questions just for the points, not because they know anything about what's being asked.' (P33)
They also reported that biased and hateful users were abundant particularly in politics, religion, and global warming categories where opinion was particularly divided. Accordingly, the nature of the subject category a questioner has been active in strongly influences his evaluation of the entire site; for example, a favourable evaluation by a participant who asked a hunting question may have resulted from experience with the less controversial topic shared among like-minded people who have the same hobby in the Hunting category.
Another group of fifteen questioners (42%) expressed more caution in judging the overall credibility of the site stating that each question should be rated individually since it all depends on what and how a person answers. If a questioner can find a serious person who is willing to take time to research and explain his answer along with a source, the information is very likely to be credible. A problem is that the chances of encountering such a competent answerer are just about 'as if you had asked someone standing next to [sic] in line somewhere in public.' (P8) Some questioners pointed out the potential limits to the utility of the site as a serious source:
'I don't think Yahoo! Answers, they don't have the capability as of yet of being a reputable scholarly source, and I would not use them as such.' (P2)
'In general, it is good for small questions, like when you plant vegetables in your garden, or the name of a movie you can't quite recall. But I would never recommend it for medical, marital, or financial advice.' (P35)
What is notable here is that most of the questioners who gave the site a negative evaluation asked conversational questions. They do not see the site as a credible place for getting questions answered accurately because people often present their opinions rather than hard facts:
'Yahoo! Answers should be more aptly named Yahoo! Opinion because normal responses only convey what an individual thinks or feels on the subject.' (P18)
On the other hand, those who gave a favourable evaluation mostly asked informational questions, advocating asking factual questions in the site:
'For factual information, I would say it's really good. I use it all the time. I would say it's about 99 percent accurate. For opinions, it depends on the category.' (P34)
To conclude, individual questioners have formed a perception of the overall credibility of the site and, according to that perception, they ask the type of question that they think is better suited to the site.
Credibility criteria for specific questions
Not all questioners evaluated the credibility of answers given to their recent questions. Four questioners (11%) did not do so because the questions were intended to elicit emotional support or to make people laugh, or because the answers did not provide substantial information to evaluate. When other questioners evaluated credibility, they did not always evaluate all answers. Especially when the number of given answers was high (the number of answers received by each participant ranged from one to thirty-seven, with two-thirds of the questions getting fewer than ten answers), the questioners focused on a small number of answers that got their attention during initial scanning and examined them more carefully.
Furthermore, the questioners did not apply the same set of criteria to every answer in an equal manner. Instead, they noticed certain salient attributes associated with each answer and made a judgment about them, which confirms Prominence-Interpretation Theory (Fogg 2003). For example, a questioner who asked about travelling in a foreign country received two answers. One answerer claimed to live in the country and gave a very grumpy answer while the other answerer said he had travelled there and linked to other sites that the questioner had heard of. The questioner effortlessly judged the second answer to be more credible because of 'the references to other sites that have good credentials' (P3). He also mentioned that the first answerer's tone of writing negatively influenced his credibility perception despite the positive self-claimed expertise and qualifications.
Table 2 lists the credibility criteria the questioners used together with the frequency of use (when a questioner applied a single criterion to multiple answers, it was counted as one). In total, twenty-two criteria were identified and they were grouped into three categories: message criteria, source criteria, and others. The questioners used each criterion either positively or negatively or both in credibility judgments. For example, the fact criterion was used both positively and negatively; a factual assertion made in the answer to a discussion question positively impacted credibility judgment while a lack of fact-based information resulted in a negative credibility judgment. Considering the limitation of the data collection method and the fact that the average number of criteria mentioned per questioner is 2.6, the participants might not have remembered all the criteria they used during the credibility judgment process. Nonetheless, this list of criteria gives insights into the most salient attributes of answers users notice and evaluate in a social questions and answers site.
|Criteria||Positive use||Negative use||Total|
|Spelling and grammar||7||0||7|
|Tone of writing||3||1||4|
|Source criteria||Answerer's attitude||1||2||3|
|Perceived expertise based on the answer||4||1||5|
|Perceived expertise based on an answerer's profile||8||0||8|
|Reference to external sources||4||0||4|
|Self-claimed expertise or qualification||4||0||4|
|Others||Ratings on the answer||2||0||2|
Of the eighty-eight instances in which credibility criteria were used, forty-eight (55%) involved message criteria. The questioners considered both content-related criteria such as accuracy and completeness and presentation-related criteria such as layout, spelling and grammar. Logic or plausibility of arguments was the most frequently used criterion in this category, followed by spelling/grammar. While seven questioners thought that using basic grammar and writing skills reflected a more intelligent and knowledgeable answer, several said that clerical errors did not bother them as long as they could understand what was written.
The questioners also evaluated source credibility twenty-eight times (32%). In the absence of institutional-level sources and author affiliation information, an answerer's profile turned out to be the most frequently consulted information about one's credentials because it provides the history of answers including the best answer rating:
'He [the answerer] had a 34% Best Answer rating after answering 2,610 questions over roughly the last three years. His predominant category of answers fell into the military genre.' (P18)
The questioners also gauged the answerer's expertise by examining the content of the answer or the answerer's self-claimed expertise:
'She sounds like she knows what she's talking about.' (P2)
'She lives in the area I asked about and studies regional history, which helps.' (P26)
Like the trustworthiness of the entire site, the self-claimed expertise drew a wide range of views from extremely cynical to unconditionally trusting. Some questioners were in-between the two ends of the spectrum, indicating that their judgments would depend on the type of expert:
'I wouldn't believe anyone who said they were a doctor or lawyer on Answers for free, no. I do believe people who say, "Used to work in a medical office or law office". I have no reason to doubt them.' (P28)
'You can generally tell by the wording of the answers with most disciplines. For instance, a doctor will never use the word "crazy".' (P22)
Furthermore, answerers who proved themselves knowledgeable and competent in a specific topic category over time earned the perception of strong credibility with the questioners. Recognition of such a known answerer influenced credibility perception positively without fail:
'The answerer has answered my questions before and has been chosen best answer.' (P10)
In addition to the perceived expertise of an answerer, a reference citation was an important clue in judging the credibility of information. References to external sources, mostly links to other Websites, are known to serve either as tools for central processing of information or as cues for peripheral processing (Freeman and Spyridakis 2004). Some questioners were engaged in peripheral processing by simply noting the presence of links to other sites and coming to the conclusion that the answer was credible. Others were engaged in central processing by following links to gain additional information to verify the answer with information from another Website. In the study, the former was coded as 'reference to external sources' and the latter as 'verifiability.' When quoted Websites did not coincide with what the answerer said, the credibility of the answer was damaged because it was not verifiable.
Two questioners looked at the ratings on answers given by the members of the community with a belief that the collective decision could be superior to the smartest individual. This criterion is notable because the questioners took advantage of the nature of the social questions and answers site by relying on fellow users' decision making.
An answerer's attitude also influenced credibility. An answerer who exhibited a sense of humour, politeness, or emotional support was regarded as a credible source. In addition to 'ratings on the answer,' this criterion shows social interaction taking place in the community.
The type of question influenced the credibility criteria the participants used. Not surprisingly, the content-related criteria such as logic and accuracy as well as verifiability were important for informational questions (Table 3). For conversational questions, accuracy of spelling and grammar was the most frequently used criterion. Although conversational questions are asked to initiate a discussion and are not expected to have one correct answer, the questioners who asked conversational questions sometimes checked spelling/grammar, logic, and other criteria to assess the credibility of back-up information that supported one's argument in the answers.
|Type of questions||Criteria||Frequency of use|
|Perceived expertise based on an answerer's profile||3|
|Perceived expertise based on an answerer's profile||5|
When asked to list up to three most important credibility criteria in general, the questioners repeated a majority of the criteria they used for their recent questions (Table 4). What is notable here, however, is that as many as fourteen questioners mentioned 'references to external sources'. Knowing that only four questioners actually used that criterion for their specific questions, it is speculated that the questioners consider reference citations critical clues, but the unavailability of citations in the answers prevented them from using them.
|Criteria||Frequency of mentions|
|Tone of writing||6|
|Source criteria||Answerer's attitude||2|
|Perceived expertise based on the answer||3|
|Perceived expertise based on an answerer's profile||6|
|Reference to external sources||14|
The criterion that the questioners think is important in general, but fail to use for their specific questions, is honesty. Honesty, which is equivalent to trustworthiness, has been treated as an essential component of credibility constructs together with expertise. A potential reason for not evaluating honesty for the specific questions could be a difficulty in assessing an anonymous answerer's willingness to answer honestly in the social questions and answers environment.
Novelty, known answerer, and ratings on answers were used for evaluating the specific questions, but not regarded as important in general. They might be secondary to major criteria such as reference to external sources.
These findings illustrate that the participants' ability to use various credibility cues is closely tied to the availability of those cues in the evaluation environment. Therefore, the actual criteria that the participants use differ from the ideal criteria that they should use generally to determine credibility.
Twelve questioners (33%) reported that they verified information with external sources. Ten of them did it when first encountering answers in the site as a part of credibility judgment by continuing searches until they were satisfied that the given answers were correct. Only two questioners verified the information later when they came to doubt the credibility of the information while using the information. The verification rate increases to 43% with informational questions (ten out of twenty-three questions) and decreases to 15% with conversational questions (two out of thirteen questions). Put in another way, those who asked informational questions tended to verify information more than those who asked conversational questions because their intent was to obtain facts. The verification rate for informational questions in the study was higher than expected compared to the rare or occasional verification behaviour reported in previous research (e.g., Flanagin and Metzger 2000).
A potential explanation for this is that the use of the self-report method in this study led the questioners to say they verified the information more than they actually did due to the social desirability effects. An alternative interpretation is that the presence of Website links embedded in the answers accelerated their verification behaviour, as evidenced by five questioners who followed up on the suggested links right away. While clicking a given link is quick and easy to perform, seven other questioners did more laborious work by using search engines or by consulting a Website they were already familiar with to cross-check consistency of the given information. For example, P7, looking into a new cat litter box, wanted people's experience with new technologies for cat litter boxes. After getting a viable answer in Yahoo! Answers, he checked out the Webpage of the product and several consumer sites that rate products to finally arrive at credibility judgment.
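The verification rates reported above can be re-derived from the counts given in the text. The following is only a minimal sketch for checking the arithmetic; the variable and function names are illustrative and are not part of the study.

```python
# Counts as reported in the text: questions whose answers were verified,
# and questions asked, by question type. Names are illustrative only.
verified = {"informational": 10, "conversational": 2}
asked = {"informational": 23, "conversational": 13}


def rate(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Verification rate per question type.
rates = {kind: rate(verified[kind], asked[kind]) for kind in asked}

# Overall rate: twelve verifying questioners out of thirty-six.
overall = rate(sum(verified.values()), sum(asked.values()))

# The twelve verifiers split into five who followed embedded links
# and seven who searched elsewhere.
followed_links, searched_elsewhere = 5, 7

print(rates, overall)
```

With so few questions in each group, a difference of one or two questioners shifts these percentages noticeably, so the rates are best read as indicative.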
More interesting findings may come from those who did not verify the information with other sources. The major reasons the questioners gave for not verifying information include:
1. 'It's a matter of opinion so I'm sure that it's the opinion of the answerer. It doesn't mean it's necessarily correct or incorrect.' (P32)
2. 'I didn't feel I had obtained any information from the site.' (P14)
3. 'If I have to answer or ask a question, it's because I have no other resource for gathering that information. I exhausted all of my other resources.' (P9)
4. 'I wouldn't check twice. I checked it once before.' (P34)
5. 'Because as I said it's a topic I'm very familiar with anyway.' (P33)
6. '[The answer] didn't provide site links.' (P26)
7. 'Only one person provided any information relevant to my question.' (P4)
8. 'I simply didn't have time.' (P13)
In (1) and (2), the questioners thought that verification was unnecessary because they were seeking opinions instead of facts, or they did not obtain substantial information to verify. Reasons (3) and (4) reveal the close relationship between pre-search activities and verification behaviour: when a questioner had searched the information in advance somewhere else and the site was used as the last resort for confirmation, he or she tended not to verify the information once again. This finding indicates that verification behaviour and its links to the broader subject of information seeking in a social questions and answers site should be examined in the context of an overall information seeking process. In (5), the questioner did not feel a need to verify the answer with other sources because he was confident in his own abilities to understand and evaluate the answer. This shows that verification can be achieved by relying exclusively on personal knowledge/experience without referring to external sources or trusted people. Reason (6) illustrates some questioners' penchant for relying on easy-to-perform verification behaviour (e.g., clicking on a link) over behaviour requiring additional action (e.g., leaving Yahoo! Answers and searching Google). This finding echoes the principle of least effort suggested by Zipf (1949), although there were some questioners who made a more aggressive effort to verify the information. In (7), since there was only one relevant answer, the questioner decided to accept the information anyway without further verification. Reason (8) shifts our attention from personal-level attributes to the contextual factors, more specifically, the influence of urgency on verification behaviour.
Taken together, verification behaviour is an interplay of many factors including context factors (e.g., time constraint, the number of answers given), questioner-level attributes (e.g., knowledge level), pre-searches, and the presence of external links in answers although it is unclear how they interact or which factors are stronger than others.
The findings of the study clearly show that credibility judgments in a social questions and answers site are better understood in a broader context of an information seeking process because they are closely connected to the selection decision for the site, pre-search activities, and post-search verification behaviour. When deciding whether to use Yahoo! Answers, the perceived credibility of the site is one aspect users consider, although it is not necessarily a determining factor. Even those who do not give great credence to the site still use it to collect first-hand accounts from other people who had a similar problem or to find information that is not easily retrieved by a traditional Web search engine. Put differently, the credibility of a social questions and answers site as a medium is transferred to individual answers in that site. However, although users know they will get anonymous answerers' opinions rather than objective facts and thus, perceive the site as non-credible, they have other good reasons besides credibility to use the site (e.g., to enjoy social interaction).
For the evaluation of the credibility of individual answers, the questioners applied a variety of criteria related to message, source, or others. A notable finding is that the questioners evaluated message credibility more frequently than source credibility. This finding contrasts with previous research showing that source characteristics are the primary criteria people use when making judgments on information quality (e.g., Rieh and Belkin 1998, 2000; Rieh 2002). The frequent use of message criteria in this study can be explained by two reasons. First, it is attributed to the high rate of the absence of source information in the answers. According to Oh et al. (2008), less than 8% of answers in Yahoo! Answers include information about a source from which the answer was derived. Consistent with Slater and Rouner's (1996) finding, the questioners used message attributes to ensure credibility in the absence of source information. Alternatively, it might be due to the high level of familiarity the questioners had with the topics of their questions. Experts who are equipped with sufficient knowledge can make informed decisions on credibility by paying more attention to message content than to other attributes (Stanford et al. 2002). Unfortunately, with a small number of the questioners who were unfamiliar with their topics, the study could not systematically uncover the relationship between the level of knowledge and credibility judgments.
Moreover, the message-related criteria identified in this study, such as accuracy, clarity, completeness, spelling/grammar, tone of writing, and layout, considerably overlap with those from earlier studies (e.g., Eysenbach and Köhler 2002; Rieh 2002; Fogg et al. 2003; Liu 2004), implying that there is a set of core message criteria used in evaluating Web information across contexts. On the other hand, 'topicality' was not found in previous research. It might have been presupposed that topicality is already met when participants assess the credibility of a Website. In Yahoo! Answers, however, answerers sometimes go off the topic of a question, so sticking to the topic is important for ensuring a credible answer.
Although message credibility was frequently evaluated, the questioners were also aware of the fact that answerers are not information specialists who are bound by the standards of professional training and ethics, and thus, source credibility should be carefully scrutinized as well. As in other credibility studies (e.g., Eysenbach and Köhler 2002; Rieh 2002; Fogg et al. 2003; Liu 2004), an author's expertise, known answerer, and links/references were important source criteria in Yahoo! Answers although the participants had to identify available cues that replace those for Websites. For example, while 'an author's publications in a subject area' was useful for assessing a scholarly author's expertise in Liu (2004), in Yahoo! Answers, 'an answerer's profile' was used to gauge an answerer's expertise in a topic category based on the number of answers posted by the answerer and the best answer rating in that category. A picture of the site owner (Eysenbach and Köhler 2002), a source's affiliation (Fogg et al. 2003; Liu 2004), and other criteria associated with a source's identity (e.g., contact information) found in previous research were not used as credibility cues in Yahoo! Answers because the identity of answerers is usually hidden in the site unless users opt to make their personal information public.
A notable source criterion, which is not found in earlier Web credibility research, is 'answerer's attitude.' As opposed to a typical credibility judgment situation where a user interacts with a Website, this criterion points out the social aspect of Yahoo! Answers where people interact with other people through the question and answering process. As more animated, poised, and good-natured speakers were judged to be higher in credibility in interpersonal communication (Metzger, Flanagin, Eyal, Lemus and McCann 2003), answerers who were humorous, polite, and provided emotional support were regarded as more credible in this study. In short, this criterion is evidence that similarities exist between interpersonal credibility and social questions and answers credibility.
Among all source criteria, 'answerer's profile' was the most frequently used criterion to gauge the expertise of an anonymous and potentially unqualified answerer. What should be mentioned here is that there is a gap between the criteria perceived important and the criteria actually used. Whilst 'answerer's profile' was most frequently used, 'reference to external sources' was perceived as most important. This contradiction is due to the frequent unavailability of references, which prevented the questioners from using the most important criterion. After all, people's ability to use various credibility cues is constrained by the availability of those cues in the evaluation environment.
Another criterion that highlights the social interaction taking place in the site is 'ratings on the answer.' The questioners drew on consensus decision-making using the nature of the collective intelligence that emerges from the collaboration in the social questions and answers site. This confirms that social media is shifting the paradigm of credibility assessment from widely shared standards among established authorities to bottom-up assessments through collective or community efforts (e.g., ratings and reputation systems) (Flanagin and Metzger 2008b).
While many source and message criteria from previous research translate to the social questions and answers site environment, website-related criteria did not apply here because the questioners evaluated the answers in the same site. For example, the visual design of the site, which is a popular criterion in earlier studies (e.g., Fogg et al. 2003), could not be used as a credibility cue to assess the credibility of each answer.
Theoretically, this study extends earlier credibility research by examining a new environment, a social questions and answers site, where users evaluate individual answers in the same site instead of evaluating individual Websites.
This study links pre-search activities, credibility judgments in the site, and post-search verification behaviour as a continuous credibility judgment process. This process is not linear because users might not go through all stages. Users can skip the pre-search stage and go directly to a social questions and answers site, or do pre-searches but skip the verification stage afterwards. In the first stage, previous experience with the site influences the perceived credibility of the site and the perceived credibility influences the type of question to ask. Subsequently, the type of question influences evaluation of individual answers. Conversely, the outcome of evaluation of individual answers reinforces or challenges the perceived credibility of the site. Since the relationships among the stages and factors influencing each stage are tentative, the credibility judgment process and the factors identified in this study need verification by additional research involving a large number of users in multiple social question and answer sites.
When it comes to specific credibility criteria, this study indicates that there is a set of relatively consistent credibility cues across Web contexts. Message criteria such as accuracy, clarity, completeness, spelling/grammar, tone of writing, and layout/organization remain important regardless of contexts. With respect to source criteria, a source's expertise, known answerer, and links/references are consistently important although people may use different cues specific to a context. Website-related criteria such as the visual design of the site and functionality are not used in a social questions and answers site because questioners evaluate individual answers in the same site. In addition to identifying a core set of credibility criteria on the Web, this study directs attention to the social and collaborative nature of communication in a social questions and answers site; an answerer's attitude shown through a communication process and the community members' ratings on an answer influence credibility judgments in the site.
In summary, as social question and answer sites are a specific type of Website, many criteria from previous Web credibility research transfer to the social questions and answers environment. Interpersonal traits identified in traditional source credibility literature apply to this environment as well due to the social interaction occurring therein.
Practically, the findings of the study have implications for improving the design of the social questions and answers site. First, the point system of the site was developed to facilitate the exchange of high quality information, but it does not do enough to deter users from posting low-quality answers (Nam et al. 2009) because it gives an incentive to answer as many questions as possible without considering the quality of information. This study suggests revising the point system to give more rewards to those who cite references because references are regarded as the most critical credibility clue. Secondly, the site can develop a search algorithm that incorporates the credibility criteria the questioners consider important. The site could rank answers containing no grammatical errors, with external references, or posted by reputable answerers higher in the search result. The data also point to the need for an algorithm capable of analysing the type of question and returning the most credible answers accordingly.
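The ranking idea suggested above could be sketched roughly as follows. This is a hypothetical illustration only, not the site's actual algorithm: the `Answer` fields, the `WEIGHTS` values, and the function names are all assumptions introduced here, and real weights would need to be estimated from data and possibly varied by question type.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    """Hypothetical record for one answer; all fields are assumptions."""
    text: str
    has_external_reference: bool
    grammar_errors: int
    best_answer_rate: float  # answerer's share of best answers, 0.0 to 1.0


# Illustrative weights only. References are weighted highest because the
# questioners named them most often as an important criterion.
WEIGHTS = {"reference": 3.0, "clean_writing": 1.0, "reputation": 2.0}


def credibility_score(answer: Answer) -> float:
    """Combine the three cues named in the text into a single score."""
    score = 0.0
    if answer.has_external_reference:
        score += WEIGHTS["reference"]
    if answer.grammar_errors == 0:
        score += WEIGHTS["clean_writing"]
    score += WEIGHTS["reputation"] * answer.best_answer_rate
    return score


def rank_answers(answers: list[Answer]) -> list[Answer]:
    """Return answers ordered from highest to lowest score."""
    return sorted(answers, key=credibility_score, reverse=True)


sourced = Answer("See the linked guide.", True, 0, 0.34)
unsourced = Answer("Clear but unsourced.", False, 0, 0.50)
careless = Answer("idk maybe", False, 2, 0.10)

for answer in rank_answers([careless, unsourced, sourced]):
    print(f"{credibility_score(answer):.2f}  {answer.text}")
```

A simple additive score is used here because it keeps each cue's contribution visible; whether such cues should be combined additively at all is an open design question that the study's data cannot settle.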
Overall, many questioners in the study exhibited critical appraisal skills at least to some extent. They share the same concern with researchers on the quality of information presented in the social questions and answers site and thus have developed strategies specific to the environment (e.g., look up an answerer's profile). However, some questioners have extremely positive or negative attitudes toward the credibility of the site, which may be barriers to making appropriate credibility judgments based on accurate analysis of individual answers. As the most widely discussed implication of credibility research is user instruction, this finding also proposes that information professionals and librarians should teach users how to effectively evaluate information given by lay information providers in a social questions and answers site and how to verify the information with more reliable sources. However, teaching users how to identify credible information is not the only way information professionals can address the problem of information credibility in a social questions and answers context. Another is to help users become competent answerers to provide credible information. Although traditional user instruction has revolved around the notion of users as information seekers, it is time to think about the idea of users as information creators and providers. The findings of the study can contribute to developing a user instruction program or guideline to help answerers understand what criteria questioners consider important when filtering out credible information and how to prepare credible answers in a social questions and answers context.
The biggest limitation of this study lies in its self-report research method. As a way to explore users' experience of credibility judgments in the novel environment, the research method produced fruitful findings, but it may have caused response bias problems as people know they should critically analyse the information they obtain online (Metzger 2007). To minimize the bias, the interviews used the critical incident technique, which allowed the participants to focus on a recent specific episode rather than generalizing their opinions. Nonetheless, social desirability effects may have come into play, especially in the participants' reports of their verification behaviour. Therefore, further research using a direct, unobtrusive method in a natural setting is required to confirm how users actually verify information obtained from a social question and answer site.
Also, with the data, it was sometimes unclear whether the questioners considered the mentioned criteria by ignoring other attributes available in the site, or because of the absence of those attributes. Content analysis of the questions and associated answers together, through a direct, unobtrusive research method, would reveal this phenomenon more clearly.
Finally, further research should delve into the influence of questioners' motivation, ability, and other individual characteristics on credibility judgment effort. Since most participants had prior knowledge about their topics and non-urgent information needs in the study, it was not possible to examine the influence of urgent need and low familiarity with a topic on credibility judgments.
Another serious limitation of this study is the small, non-random sample. The findings cannot be generalized to the general public or even to the population of Yahoo! Answers' users because the sample size is very small and there may be discrepancies between those participants who self-selected into the study and those who did not. Most of the participants turned out to be heavy users of Yahoo! Answers and had sufficient knowledge on the topics of their questions, which probably made them more sensitive to the credibility issue. On the other hand, it is possible that those who are less concerned about information credibility in a social question and answer site are less likely to participate in the study because of the lack of interest in that topic. Therefore, the findings of the study should be regarded as suggestive rather than conclusive. Future research using random sampling or a large sample of people could generalize credibility judgment behaviour identified here to a larger population.
The current study extends earlier credibility research that examined how people evaluate the credibility of Web information to a social questions and answers environment where users evaluate answers given by lay information providers.
The major findings of the study include: (1) the questioners do not always evaluate all given answers nor apply the same criteria to every answer; (2) there is a set of relatively consistent credibility cues across Web contexts; (3) the questioners rely more on message credibility than source credibility partly due to the frequent unavailability of source information; and (4) the type of question is a key factor that characterizes a credibility judgment process including pre-search and post-search verification.
Theoretically, the study has strengths in that it examined real users who asked real questions for their everyday life tasks in a novel environment and placed their credibility judgments in a broader context of an information searching process. Practically, it has implications for the design of social question and answer sites and user instruction on how to evaluate Web information. The questioners' credibility judgment behaviour identified in the study can help information professionals develop a user instruction programme to teach information seekers who want to find credible information as well as lay information providers who want to create credible answers.
The author is grateful to Leia Dickerson, graduate student, for her help with data collection and coding. The author would also like to acknowledge the anonymous reviewers for their useful comments.
Soojung Kim is an instructor in the College of Information Studies, University of Maryland. She received her PhD from the University of Maryland. She can be contacted at: email@example.com
© the author, 2010.
Last updated: 17 June, 2010