Vol. 11 No. 2, January 2007
User-oriented digital information search environments call for flexible information access interfaces that may interact with a dynamically changing searcher view in capturing a variety of media. Optimal use of conventional libraries and bibliographic databases requires a general understanding of the knowledge structure of the collection domain (Hsieh-Yee 1993; Pennanen & Vakkari 2003). Novice searchers without such understanding, however, can seek the help of librarians and intermediaries when they get lost in search processes.
Increasing numbers of digital libraries and online resources on the Internet provide potential users with opportunities to access and interact with these resources directly from offices and homes. Such trends seem to offer searchers useful information access environments for a variety of information resources. However, in such environments, novice searchers are forced to seek the information they need without the help of librarians or other intermediaries. In reality, many novice users of digital libraries do not have a general understanding of the knowledge structure of the digital collections held by these libraries. Eventually they may give up pursuing their information needs when they get lost during search processes or obtain unsatisfactory search results.
This research project seeks to find a way to overcome such limitations of existing information access interfaces developed for traditional libraries and bibliographic information services. Specifically, we explore a qualitative research method for eliciting the knowledge structure of novice searchers and patterns of its modification in their search and learn processes, and build on it a naïve ontology for time and space.
In this paper, we report findings from a series of Web searching experiments in which novice searchers' search and learn processes were observed as the initial stage of the exploration. Through these experiments, we identified a reliable method of approaching the phenomenon. We also identified several characteristics of patterns of knowledge modification as well as other search and learn behaviour of novice searchers.
This research attempts to overcome the limitation of conventional subject access systems developed for traditional libraries and information retrieval systems, from the standpoint of managing a large collection of multimedia digital documents. The focus of the study is on the elicitation of searchers' knowledge structure and tracking of its modification during their search and learn processes.
A conventional role of ontology in information science is to provide labels or keywords for indexing documents so that both indexers and searchers can use the same label to represent a concept. This convention allows a searcher to identify the documents relevant to their information needs as long as the searcher's knowledge is well structured to reflect that of domain experts. Novice searchers without such a well-structured domain knowledge may suffer from search failure, particularly when they do not have much experience in searching for information.
Belkin et al. (1982) suggested Anomalous State of Knowledge as the basis for information needs and identified five types of anomalous states by representing and comparing a series of search requisites. By comparing tactics used by novice searchers and information professionals, Hsieh-Yee (1993) found that domain knowledge affects only experienced searchers. By comparing thesaurus terms used by novices and experts in a domain, Vakkari and his colleagues concluded that only those with sufficient domain knowledge may improve search results by using thesaurus terms in query expansion (Vakkari et al. 2001; Sihvonen & Vakkari 2004; Pennanen & Vakkari 2003).
The increasing availability and accessibility of Internet resources to the general public allow the novice searcher to interact with document collections. In response to such trends, the main role of ontology seems to be shifting from a search aid to a navigation tool. This transition is reflected in recently proposed navigation tools that combine browsing and searching functions in a seamless manner. The Flamenco system allows searchers to browse a large collection of architectural images using hierarchically faceted metadata. Searchers of the system can navigate without disturbance of their thought processes (Hearst et al. 2002). JuNii Plus, an information access interface for the shared portal of the Japanese academic community, incorporates ontology-based and content-based retrieval for ranking Web documents. The system intends to provide a seamless switching between searching and browsing (Kando et al. 2006). As demonstrated in the examples above, ontology for navigation tools is expected to be naturalistic for searchers so that they can follow their own view of the world during search processes without being disturbed by the mandatory use of unfamiliar ontology intended for domain experts.
There are several different approaches to information retrieval within the field of information science. In addition, human information seeking behaviour, the domain of this study, is currently the focus of research in various fields external to information science. Such diversity inevitably introduces serious terminological confusion. To avoid such confusion, we will define key terms below as we introduce a conceptual framework.
The term information access interface describes the interface that mediates between information resources and users, and provides aids for users in accessing or using information resources stored on electronic media. The information access interface may include metadata, question answering systems, navigation systems, and virtual agents that intend to support optimal interaction between searchers and information objects.
The term ontology is used in this research to mean an explicit structure that describes patterns of associations among concepts, as typified by classification schema. Naïve ontology is a type of information access interface that allows novice searchers to refine their knowledge interactively by acquiring information chunk-by-chunk as they encounter it in digital environments. Specifically this is a kind of navigation tool that supports browsing and searching by novice searchers throughout their search and learn processes.
This study follows the principle of user warrant in developing naïve ontology in order to overcome limitations in the principle of literary warrant.
There are two basic approaches, automatic and intellectual, in developing ontological tools for conventional text-based information retrieval systems. The automatic approach is based on statistical manipulation of text messages revealing associations between terms or clusters of terms. For the intellectual approach, ontology is created and designed by humans based on various sources including knowledge structures of taxonomies and experts. Here, the bottom-up approach has been recommended as more reliable than the top-down approach. There are two principles, literary warrant and user warrant, within the bottom-up procedure of the intellectual approach. These two principles may be combined to increase the reliability of ontological tools (Lancaster 1986).
This study uses the principle of user warrant. This is because the principle of literary warrant is too limiting for the development of naïve ontology intended for novice searchers who are not experts in the domain. The principle of literary warrant requires terms and concepts to be used for the basis of the ontology to be collected from literature written by authors who are experts in the domain. Such a procedure may develop ontological tools useful for domain experts but not for novice searchers who are unfamiliar with the domain. In addition, it is suggested that the principle of literary warrant relies heavily on terminology and concepts in the text portion of documents. Thus, the principle is limiting for developing ontological tools for the ever-increasing number of multimedia documents that may not include much text data.
In order to overcome these limitations, this research expands the principle of user warrant and tries to elicit knowledge structure as well as its modification by novice searchers during their search and learn processes.
Human beings cannot function at all without the ability to categorize either in the physical world or in our social and intellectual world. Thus, understanding of how we categorize things and concepts is central to understanding how we think, learn, and behave (Lakoff 1987: 1). When we see things using visual perception, our eyes isolate the input data from all other data and see only such correspondence as is determined by the input identification (Bertin 1967: 11). Kwasnik (1992: 194) defined the notion of a view as what a person articulates as seeing at one time, that is, a span of attention, and proposed to use it as a unit of analysis in browsing.
By extending the notion of view into the cognitive world and using it as the unit of analysis, we might be able to elicit novice searchers' knowledge structure and patterns of its modification during their search and learn processes. They may be useful as a basis for developing naïve ontology to be incorporated in the design of information access interfaces intended for novice searchers.
Concepts of time and space may have more important roles in browsing and searching multimedia documents than texts. The international standard metadata profiles for multimedia documents such as Dublin Core and IEEE/Learning Object Metadata (LOM) include an element of coverage that describes the spatial or temporal characteristics of the intellectual content of the resource (see the Taskforce Website). Spatial coverage refers to a physical region using place names or coordinates. Temporal coverage refers to what the resource is about rather than when it was created or made available, and is typically specified using a named time period or date and time format. However, limited sets of controlled vocabulary are recommended: the Library of Congress Subject Headings (LCSH) and the Getty Thesaurus of Geographic Names in addition to numeric schemes for spatial qualifiers; LCSH, the Art and Architecture Thesaurus and the Lexicon of stratigraphic nomenclature for temporal qualifiers. Such controlled vocabularies may be useful for domain experts but are not readily comprehensible to novice searchers. This is because existing controlled vocabularies seem to be too complex for novice searchers who have not yet developed a comprehensive knowledge of domain structure. Thus, this research focuses on the knowledge structure of novice searchers or non-domain experts for concepts of time (temporal) and space (geographic) aspects. We chose history and geography as instances of domains in eliciting knowledge structure to develop naïve ontology.
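To make the distinction between the two kinds of coverage concrete, the following minimal sketch models a record whose spatial and temporal coverage are kept apart from its creation date. The class and field names (`Coverage`, `Record`, `date_created`, `covers`) and the sample values are illustrative assumptions; they are not an official Dublin Core or LOM serialization and were not part of the study.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Coverage:
    """What the resource is about, in space and in time (hypothetical model)."""
    spatial: List[str] = field(default_factory=list)   # place names or coordinates
    temporal: List[str] = field(default_factory=list)  # named periods or dates


@dataclass
class Record:
    title: str
    coverage: Coverage
    date_created: Optional[str] = None  # when the resource was made, not its coverage


def covers(record: Record, place: str = "", period: str = "") -> bool:
    """Return True if the record's coverage mentions the given place and period.

    An empty argument places no constraint on that facet.
    """
    place_ok = not place or place in record.coverage.spatial
    period_ok = not period or period in record.coverage.temporal
    return place_ok and period_ok


# Illustrative record: a recent photograph of a medieval monument.
photo = Record(
    title="Photograph of the Alhambra",
    coverage=Coverage(spatial=["Granada", "Spain"], temporal=["13th Century"]),
    date_created="2005-09-01",
)
```

In this sketch a search for the period "13th Century" matches the photograph, whereas its creation date does not count as temporal coverage.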
Specifically, we ask the following two research questions to guide our exploration:
We conducted a series of experiments involving college students as participants to seek an appropriate methodology for approaching the phenomena. We invited college students because we considered them to represent novice searchers in the domains of history and geography.
So far, we have conducted two experiments: one in September 2005 with three participants and the other in December 2005 with four participants. Participants were selected from third-year female college students. The experiments consisted of the following processes:
Questionnaire: Participants were requested to complete a short questionnaire that asked them the frequency of their Internet usage, favourite browsers and search engines, topics chosen at the college entrance examination, whether they were planning to get a teacher's certificate, and topics of interest in history and geography.
Pre-search Activities: For the first experiment, each participant was given two vignette scenarios prepared by the researcher. The first scenario asked the participant to conduct a Web search to prepare for teaching a middle school history class on a particular topic. The second scenario asked the participant to conduct a Web search to prepare for visiting a world heritage site. Names of people and places for the visit used in each vignette scenario were selected based on a short questionnaire survey conducted in a college library and information science class from which participants for the first experiment were recruited.
For the second experiment, we changed the procedure and invited a pair of participants to discuss their topics of interest in the domains of history and geography for about ten minutes to let them choose two topics for Web search, one for each participant, a history and a geography topic. The procedure was modified for two reasons: each participant in the first experiment was tense and nervous because she had to wear unfamiliar devices and sit alone in a lab environment, and one participant was unable to make progress in her search and learn process because of the confusion caused by encountering incomprehensible information. In both experiments, participants were asked what and how much they already knew about the topic of the search.
Recording of Search and Learn Processes: During the search and learn processes, screen shifts, mouse movements and participants' eye movements were captured using an eye tracker (EMR-NL8B). Each session took about fifteen to twenty minutes.
Post-search Interview: After each Web search, an interview was conducted where the recorded search process was displayed, and the participant reported why she made a particular move (e.g. typing, mouse click, browsing) and what she thought, felt and expected at each moment throughout the search and learn process. We applied the notion of a view in collecting data on participants' cognitive movements during the search process by showing their eye movements on the screen at the post-search interview to help them recall what they did, thought and felt at the time when they were browsing a particular part of the screen. The interviews were recorded and transcribed in detail, then analysed using ATLAS.ti following a bottom-up strategy with the constant comparative technique.
In the post-search interview, participants articulated what they did, thought, felt, and expected at each point of movement, either voluntarily or in response to the interviewer's inquiries. From their communication, we identified several characteristics of knowledge acquisition and search and learn as follows.
Participants' expressions of the concepts for time and space, as embedded in their communication, were categorized into three types.
Solely time-related concepts were expressed as dates in the Christian era, temporal relationships with some well-known incidents, and dates in relation to ages of a famous figure. Table 1 presents examples of these three types.
|Solely time-related concepts||13th Century; 1600s|
|Temporal relationships with some well-known incidents||just after Japan opened trade; He came to Japan during the seclusion era|
|Dates in relation to ages of a famous figure||the anniversary of Beethoven's death|
|* In this and other tables the expressions were translated from Japanese by the author|
Solely space-related concepts were expressed either as place names, country names, areas of a country, or distances between places. Examples are shown in Table 2.
|Place/country names||the Alhambra is located in Granada, Spain|
|Areas of a place or country||Basho travelled around the north-eastern region in Japan|
|Distance between places||The distance between the north and south ends of Israel is 500 km.|
Concepts embracing time and space were articulated as periods or eras, as exemplified in Table 3.
These expressions imply that concepts of time and space are deeply interrelated and that historical incidents and figures are kept as contemporaneous entities in participants' knowledge. This finding suggests the provision of functions that allow the searcher to browse time and space facets simultaneously or interchangeably.
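A function of this kind might be sketched as follows: each item carries a broad-to-narrow place path and a year range, and the searcher may constrain either facet, both, or neither. This is only a minimal illustration; the `Item` class, the `browse` function and the sample data (including the year ranges) are assumptions introduced here and are not drawn from the experiments.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Item:
    title: str
    place: Tuple[str, ...]  # broad-to-narrow path, e.g. ("Spain", "Granada")
    start_year: int
    end_year: int


def browse(items: Sequence[Item],
           place: Sequence[str] = (),
           years: Optional[Tuple[int, int]] = None) -> List[Item]:
    """Return the items that match the space facet AND the time facet.

    place: a broad-to-narrow prefix; () matches every place.
    years: an inclusive (first, last) range; None matches every period.
    """
    selected = []
    for item in items:
        in_place = item.place[:len(place)] == tuple(place)
        # Two year ranges overlap when each one starts before the other ends.
        in_time = years is None or (
            item.start_year <= years[1] and years[0] <= item.end_year
        )
        if in_place and in_time:
            selected.append(item)
    return selected


# Illustrative items with approximate, hypothetical year ranges.
ITEMS = [
    Item("The Alhambra", ("Spain", "Granada"), 1200, 1299),
    Item("Basho's journey", ("Japan", "North-eastern region"), 1600, 1699),
    Item("Seclusion era", ("Japan",), 1639, 1853),
]

in_japan = browse(ITEMS, place=("Japan",))                # space facet only
in_1600s = browse(ITEMS, years=(1600, 1699))              # time facet only
narrowed = browse(ITEMS, place=("Japan", "North-eastern region"),
                  years=(1600, 1699))                     # both facets
```

Because the place facet is a prefix and the time facet is a range, a searcher can begin with a broad concept such as "Japan" and narrow either dimension step by step.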
Participants' knowledge was modified during their search processes. Modification of knowledge was categorized into six types as described in Table 4 with some examples.
|Type||Definition||Example of expressions|
|Adding||Acquire novel information to increase knowledge.||
|Correcting||Clear up a misunderstanding.||
|Limiting||Narrow down the scope of the concept.||
|Relating||A concept is related to another concept.||
|Specifying||A concept is narrowed by increasing specificity.||
|Transforming||A concept is expressed in a different framework.||
Identified patterns of modification in knowledge structure provide useful suggestions on the functions to be embedded in the information access interface intended for novice searchers.
Adding: when a participant found pictures of a known place, her sense of familiarity with the place increased and she placed her knowledge in context. This finding indicates that the availability of pictures and plans of places and monuments helps novice searchers to place their knowledge in context.
Correcting: sometimes, a searcher found information that contradicted her existing knowledge and corrected it by acquiring the new information. On the other hand, a searcher who found information that contradicted her existing knowledge and tried to resolve the contradiction without success gave up the search. These findings imply that a navigation tool for browsing historical transitions among cultures, people, and religions may be helpful for novice searchers to structure and correct their existing knowledge.
Limiting: identification of irrelevant information or sites increases the focus of the search. This finding implies that a function to help specify time and place of figures and incidents may help novice searchers to put their search within a historical or geographic framework.
Relating: when new concepts were identified, they were related to known concepts. Relating multiple concepts added contexts in knowledge structure. These findings imply the need for a navigation tool that makes it easier for novice searchers to locate contemporaneous figures or incidents in order for them to develop a historical or geographic context to increase familiarity with the topic of the search.
Specifying: specificity of the time and space concepts evolved with the progress of the search. This finding suggests the provision of functions to allow novice searchers to start with a broad concept and to increase specification in both time and space.
Transforming: expressions of time-related concepts were transformed between different calendar systems, such as the Christian era and Japanese eras. This finding indicates that the provision of a navigation tool that allows switching between different calendar systems may be helpful for novice searchers.
Encountering: In addition to the six types of knowledge modification described above, encountering unexpected but somewhat familiar information during the search process increased participants' affinity with the topic of the search. This finding implies that each participant's life experiences may influence her perceived familiarity with the topic of the search, and its relevance to her.
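As an illustration of the Transforming function noted above, the following sketch converts between Christian-era years and modern Japanese era years. It is a simplified, year-level conversion: the function names are hypothetical, only eras from Meiji onwards are listed, and a changeover year is assigned wholly to the new era even though its first part belonged to the old one.

```python
from typing import Tuple

# First Gregorian year of each modern Japanese era, oldest first.
ERAS = [
    ("Meiji", 1868),
    ("Taisho", 1912),
    ("Showa", 1926),
    ("Heisei", 1989),
    ("Reiwa", 2019),
]


def to_japanese_era(year: int) -> Tuple[str, int]:
    """Convert a Christian-era year to (era name, year within that era).

    Simplification: a changeover year is reported in the NEW era only.
    """
    for name, start in reversed(ERAS):
        if year >= start:
            return name, year - start + 1
    raise ValueError(f"{year} predates the eras listed in ERAS")


def to_christian_era(era: str, era_year: int) -> int:
    """Convert (era name, year within era) back to a Christian-era year."""
    starts = dict(ERAS)
    if era not in starts or era_year < 1:
        raise ValueError(f"unknown era or invalid era year: {era} {era_year}")
    return starts[era] + era_year - 1


# The year of the experiments, expressed in both systems.
era_name, era_year = to_japanese_era(2005)
```

A navigation tool could apply such a mapping in both directions so that a searcher who thinks in one calendar system can read dates presented in the other.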
Participants' general searching behaviour tended to begin with typing one or two known keywords representing relatively broad concepts. They then browsed the search results by clicking links to the sites from the top of the list. If the contents of the first few sites were relevant to the topic of the search, they scanned the contents of each site one by one, quickly looking at basic information such as description, chronology, and glossary to acquire basic information. When they encountered seemingly useful sites, they tended to bookmark them. When they started finding similar information, they moved to the next step by typing more specific keywords obtained from the browsing. Several participants copied keywords found on the relevant site and used them in the next step. This behaviour is consistent with that described by Pennanen & Vakkari (2003). One of the participants articulated this progressive behaviour as loose at the beginning, gradually tightening, but changing keywords if too restrictive.
When participants encountered sites highly relevant to their topic of the search, they read texts in detail and looked at visual information such as pictures of interest and graphics representing correlation among relevant people and incidents. When they encountered sites of interest, such as free wallpapers, historical quizzes, and advertisements for TV programs with popular actors, they interacted with these sites intensively. When they encountered unexpected information, they articulated their surprise and tried to obtain further information for comprehension. These findings imply that duration of browsing in a particular site may indicate the site's level of perceived relevance to the searcher, as suggested by Kelly & Teevan (2003) and Joachims (2005). Here the perceived relevance is not limited to the topic but may also include situational or contextual relevance.
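If browsing duration is taken as a rough indicator of perceived relevance, it could be estimated from a log of page visits along the following lines. This is a minimal sketch under stated assumptions: the log format (seconds from session start paired with a URL), the function name `dwell_time_per_site` and the sample URLs are hypothetical, and each page is assumed to be viewed until the next page is opened. It does not reproduce the analysis performed in this study.

```python
from collections import defaultdict
from typing import Dict, List, Tuple
from urllib.parse import urlparse


def dwell_time_per_site(events: List[Tuple[float, str]],
                        session_end: float) -> Dict[str, float]:
    """Sum the seconds spent on each site (host name) during one session.

    events: (seconds_from_start, url) pairs sorted by time.
    session_end: time at which the session stopped, in the same units.
    """
    totals: Dict[str, float] = defaultdict(float)
    # Pair every visit with the start time of the following one;
    # the last visit is closed by the end of the session.
    next_times = [t for t, _ in events[1:]] + [session_end]
    for (start, url), end in zip(events, next_times):
        totals[urlparse(url).netloc] += end - start
    return dict(totals)


# Hypothetical fifteen-minute session (900 seconds).
log = [
    (0, "http://search.example/results?q=alhambra"),
    (20, "http://history.example/alhambra"),
    (200, "http://search.example/results?q=granada"),
    (230, "http://quiz.example/history"),
]
totals = dwell_time_per_site(log, session_end=900)
ranked = sorted(totals, key=totals.get, reverse=True)
```

Sites at the head of `ranked` are those on which the searcher stayed longest; as noted above, a long stay may reflect situational or contextual interest rather than topical relevance alone.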
Participants acquired or used the following knowledge on the functions of the information access interface during the search processes.
These findings indicate that searchers modify not only domain knowledge but also knowledge on a variety of functions embedded in the information access interface during their search and learn processes.
While participants were browsing the sites found during their search, they sometimes articulated their impressions.
The above comments may be useful for designing Web pages or information access interfaces for novice searchers.
At the end of the post-search interview, the interviewer asked each participant to comment on her own search processes. Participants responded as follows.
These comments imply that search processes on the Internet are opportunistic. Novice searchers sometimes plan their next step but do not always follow their plan. Their perception of time during the search may be lost because of the high level of concentration. In addition, the notion of view as expanded to the cognitive world and chosen as the unit of analysis for this study was found to be adequate for analysing search processes on the Internet; searchers' visual perception tended to be focused on a part of the screen or a view.
It should be noted that watching their own eye movements helped participants recall their search and learn processes. If so, the reliability of data elicited, and eventually the reliability of the study, should be increased. Thus, the method of using the eye tracker in capturing search and learn processes and showing the eye movement during the post-search interview may be recommended in collecting data on search and learn processes.
This exploration provided us with useful information for designing naïve ontology, while offering theoretical and methodological implications for studying human information behaviour.
Results of elicitation and analysis of novice searchers' search and learn processes gave us information for the design of naïve ontology as reported in the previous section and synthesized in Table 6.
The above implications are expected to be expanded and elaborated through the further progress of the research project.
Theoretical aspects of information seeking behaviour were identified from the study as follows.
Perceived relevance is associated with an individual searcher's life experience. Participants brought their past experience and social contexts into search and learn processes. A participant was fascinated by information encountered on interests of a friend and free wallpapers that could be downloaded for her own use. This may lead to the enhancement of the concept of situational or contextual relevance (Schamber et al. 1990), which may be measured by the time spent on a site as well as patterns of searchers' eye movements (Joachims 2005).
Searchers learn the functions of the information access interface through searching and learning. Participants' articulations indicated that they learn not only domain knowledge but also functions embedded in the information access interface. Thus, search and learn processes of novice searchers modify not only their domain knowledge but also knowledge and skills for using various functions available on the Web.
The search and learn process on the Web is opportunistic. Searchers make plans during the progress of their search and learn process, but they may forget them when they encounter interesting information. They may stray from the initial topic of the search, particularly when the goal of the search is self-generated. Many researchers also suggest viewing searchers' information retrieval and information behaviour as situated and ad hoc (Hert 1997; Xie 1997).
One of the objectives of the study, conducted as the initial stage of the research project and reported in this paper, was to find an appropriate methodology to elicit knowledge structure and patterns of its modification by novice searchers during their search and learn processes. Through the series of experiments reported above, we obtained several implications concerning methodological aspects.
First, we were convinced that the unit of analysis for browsing, suggested by Kwasnik (1992), is adequate in capturing the dynamically changing knowledge structure of searchers during their searching and learning processes. During the post-search interviews, participants reported voluntarily or in response to the researchers' inquiry what they thought, felt and had done in relation to what they saw on the screen with their eye movements. They said the eye movements shown on the screen pointed to the exact places that they were looking at and that watching their own eye movements helped them recall their search and learn processes. Thus, the research method of using eye trackers in data collection of Web search processes and showing eye movements during the post-search interviews is recommended for future studies, not only to identify what searchers are looking at rather than what data are shown on the screen but also to increase the reliability of data elicited from searchers, which eventually increases the reliability of research findings.
The use of vignette scenarios of imposed goals may lead to incomplete processes. The use of vignette scenarios introduced searching behaviour peculiar to imposed goals, while allowing participants to choose topics produced naturalistic searching behaviour. Participants in the former study design may suffer from incomprehensible information resulting from a completely unknown search topic. Participants in the latter design may stray from the initial search and learn goals set by themselves because of the much greater freedom that they have (Miwa 2000).
In an effort to find an optimal approach for developing naïve ontology for information access interfaces to better guide novice searchers of multimedia digital documents, two Web searching experiments were conducted with volunteer college students. The concept of view was used as the unit of analysis in eliciting novice searchers' knowledge structure and patterns of its modification while they undergo search and learn processes.
We obtained useful information for the design of a naïve ontology as presented in Table 6. These implications may be expanded and elaborated as the project progresses and are expected to be incorporated into the design of the naïve ontology to be developed as the output of the project.
The study results provided theoretical implications on (1) perceived relevance to be generated from life experiences of searchers, (2) modification of search skills by learning functions of the information access interface in search and learn processes, and (3) the situations of search and learn processes.
We also obtained methodological information through this study. The results demonstrated that the use of an eye tracker in capturing search and learn processes and showing eye movements to the participants during the post-search interview helped improve reliability.
What we have described above are results of two small case studies conducted at the initial stage of the research project. We will continue this research project by recruiting more participants. Through such study, we will expand and elaborate these functions. The design of an information access interface, the goal of this research project, will be attained through continuous interaction with novice searchers as the research project progresses.
This research was funded by the National Institute of Informatics joint research grant, and we wish to thank Dr Barbara Kwasnik at Syracuse University for her helpful suggestions and the notion of view she proposed as the unit of analysis for browsing, which helped us to develop the conceptual framework of this study.
© the authors, 2007.
Last updated: 16 December, 2006