Proceedings of the Tenth International Conference on Conceptions of Library and Information Science, Ljubljana, Slovenia, June 16-19, 2019
Searching for critical dimensions in data literacy
Sonja Špiranec, Denis Kos and Michael George.
Introduction. The data-turn is starting to have a significant impact on library and information sciences. In an era of abundant growth of data and supporting infrastructure, proposals for the twinning of information and data science in library and information science schools aim to create expertise which would cater to the job market in need of data-oriented specialists. While data literacy forms the main line of discourse, alternative approaches, such as the concept of critical data literacy, are also being considered.
Method. A thematic analysis of the critical data literacy discourse is undertaken to construct a comprehensive definition of critical data literacy and conceptualize its relation to the general thinking on related concepts. The articles were chosen using the PRISMA instrument for constructing samples for systematic reviews. After its eligibility criteria were applied, 30 articles remained, which were thematically analysed using the MAXQDA software.
Findings. Our interpretive analysis uncovered five major themes: ontological treatment of data; critique of the epistemological status of data; literacy rationale situated in key problem accentuations; critical pedagogic articulation; and critical data literacy as ethics.
Conclusions. We conclude that there is a growing need for terminological clarity in relation to the concepts of data and information considered in the context of the established discourse on critical data literacy.
The data turn, the data deluge, the data hype or the age of big data are metaphors used in popular and academic discourses to describe public and institutional awareness of the importance of data. The discussions around data reveal a wide set of perspectives, ranging from expectations of and faith in data to an emphasis on their risks and pitfalls. Commentators suggest that data have a disruptive effect and far-reaching consequences for how knowledge is produced, business conducted, and governance enacted (Kitchin, 2014). On the enthusiastic and celebratory end of the data-discourse-spectrum, it is emphasized that data carry the power of truth and objectivity, and the potential to solve the world’s most complex and pressing problems (Oliphant, 2012). In this context, data are understood as being pre-analytical and pre-factual, existing prior to interpretation and argument; they are the raw material from which information and knowledge are built (Kitchin and Lauriault, 2014) and, as such, are enthusiastically recognized as a market niche for new products, services and profiles of experts.
The other, more critical and more nuanced end of the data-discourse-spectrum questions the nature and epistemic qualities of big data and teases out their contradictory implications and effects for the individual and broader society. Expressions like “data fetishism” (Sharon and Zandbergen, 2017) suggest that data sets are reductive in their capacity and are creations of human design, implicated in social relations and power dynamics (Boyd and Crawford, 2012; Van Dijk, 2014), which raises interest in the study of the sociocultural, political, and economic contexts in which data are created and used (Oliphant, 2012). The Janus-faced nature of the data turn, and specifically big data, is well described in Ekbia et al. (2015), who lay out complex methodological, epistemological, aesthetical, technological, legal and ethical, political and economic dilemmas, pointing to the conclusion that data is a novelty that is, at the same time, productive and empowering and yet constraining and overbearing.
The described data-turn is starting to have a significant impact on library and information sciences. The field’s alignment with the requirements of the data turn can easily be traced to its documentalist core mission, but it can also be viewed as a response to the development of the new field of data science. In an era of abundant growth of data and supporting infrastructure, proposals for the twinning of information and data science in library and information science schools (see Wang, 2018) aim to create expertise which would cater to the job market in need of data-oriented specialists (e.g. Lyon and Brenner, 2015; Robinson and Bawden, 2017). The described tendencies suggest a possible shift in the academic marketplace and nomenclature, raising the question of whether data science will be the next generation of (library) and information science (Ma, 2013). Or, following the analysis of Hjørland (2018): after dropping the L-word from the name of library and information science schools and becoming information schools or I-schools, is it time for a new shift and a transition to D-schools?
Disputes about core concepts in library and information sciences, which have recently been dominated by data-centric perspectives, create the need and opportunity to reimagine, once more, literacy concepts in terms of data literacy. The literature defines data literacy as the ability to understand, find, read, interpret, evaluate, manage and use data (Prado and Marzal, 2013; Koltay, 2017). However, this definition does not reflect the holistic approach to literacies seen in more recent conceptualizations commonly discussed in the field, e.g. information literacy, metaliteracy, media literacy, etc. Holistic approaches to these literacies do not only imagine the professional as their key focus, but also the learner and, most importantly, the citizen. So, alternatively, and similar to the development of information literacy towards critical information literacy (Elmborg, 2006; Tewell, 2015), data literacy can also be interpreted from the standpoint of its inherent critical qualities. Overlaps in the attributes of information literacy and digital literacy are already recognized in the literature (Koltay, 2017) and highlighted in the composite term “data information literacy”, used e.g. by Carlson et al. (2011), who regard data literacy as part of information literacy and as a kind of logical development of it. As with information literacy, criticality and critical assessment are considered essential features of data literacy, whereby being critical includes giving emphasis to the version of the given dataset, the person responsible for it, and the skills of identifying the context in which data are produced and reused (Prado and Marzal, 2013).
However, just as critical information literacy is more holistically dimensioned than information literacy, providing a counterbalance to a pragmatic but nevertheless limiting perspective and incorporating, but not being limited to, instrumental and functional dimensions (Špiranec, Banek and Kos, 2016), data literacy is also being considered as a critical concept whose purpose is to promote social justice and the public good, to understand power relations and power asymmetries, and to reduce social, economic, political and other types of inequalities. If data literacy is thus considered more broadly than as a mere response to data science and the data hype, a key question remains: What does it mean to critically approach data? This question is explored by systematically reviewing and analysing the literature that contributes to the expansion of the notion of data literacy towards critical data literacy.
In this paper we report on research in which we took the proposed question as a research orientation and attempted to formulate a systematic critical review of the critical data literacy discourse based on a thematic analysis. The sample of analysed scientific articles was created using the PRISMA instrument (Moher et al., 2009) for systematic reviews. For the query (“critical data literacy” OR “critical data literacies”) we identified 99 records in different databases: Google Scholar (N=88), SCOPUS (N=8) and Web of Science (N=3). After removal of duplicates, 80 individual records remained. We found an article to be eligible for inclusion if: a) the record was a scientific article; b) the article was in English; c) the full text was available; and d) the article mentioned the queried terms at least once in the article title or body, the abstract or the keywords. After these eligibility criteria were applied, 30 articles remained, which were analysed using the MAXQDA software. In order to establish emergent themes that define the criticality of the concept of data literacy, we performed an initial inductive thematic coding of paragraphs. The initial codings were then summarized and ascribed to a derived set of themes and subthemes described in the results section of this paper. In the summarization phase we noticed that the initial codings largely referred to five discursive moments which build up the definition and the context of the debate about critical data literacy and related concepts.
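The record counts reported in the sampling procedure above can be tallied as a simple PRISMA-style flow. The following is a minimal sketch in Python; the variable names are illustrative assumptions, and the two derived figures (duplicates removed and records excluded on eligibility) are not stated in the text but follow arithmetically from the reported counts.

```python
# Counts reported in the text; variable names are illustrative assumptions.
identified_by_source = {
    "Google Scholar": 88,
    "SCOPUS": 8,
    "Web of Science": 3,
}
records_after_deduplication = 80
articles_included = 30

# Derived figures (not stated in the text, computed from the reported counts).
total_identified = sum(identified_by_source.values())
duplicates_removed = total_identified - records_after_deduplication
excluded_on_eligibility = records_after_deduplication - articles_included

print(f"Records identified: {total_identified}")
print(f"Duplicates removed: {duplicates_removed}")
print(f"Records screened: {records_after_deduplication}")
print(f"Excluded on eligibility criteria: {excluded_on_eligibility}")
print(f"Articles included in the thematic analysis: {articles_included}")
```

The sketch does not distinguish which of the four eligibility criteria excluded a given record, since the text reports only the overall number of remaining articles.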
Defining critical data literacy
The five major themes organize different aspects of the concept and include subthemes that further elaborate on each theme. They are ordered in a way which, we believe, makes it easier to grasp the understanding of the concept that we derived from the analysed literature. The five major themes are as follows:
- Ontological treatment of data
- Critique of the epistemological status of data
- Literacy rationale situated in key problem accentuations
- Critical pedagogic articulation
- Critical data literacy as ethics
Ontological treatment of data and the critique of its epistemological status
The first two themes are best understood together. The main part of understanding critical data literacy stems from the definition of the concept of data. What has been understood as ontological treatment of data can, in the scope of the analysed literature, be subsumed under three subthemes: contextuality of data, interpretativeness of data and the transparency of data.
Data are contextual. Many argue that data are inextricably linked to context (Tygel and Kirsch, 2016; Battista and Conte, 2017; Neff, Tanweer, Fiore-Gartland and Osburn, 2017; Hautea, Dasgupta and Hill, 2017; Gebre, 2018) because without taking context into consideration data become meaningless. All data are constructed in a cultural and historical setting which can show how the data came to be. However, reading context is not a straightforward endeavour. Neff, Tanweer, Fiore-Gartland and Osburn (2017, p. 6) distinguish between a representational and an interactional view of context, where the former refers to the view ‘...that context can be represented as data, or as a “stable container” within which activities unfold […]. This view imagines context as an environment with definable, describable, and encodable boundaries and characteristics that exist outside the data’. The authors critique this view by arguing that context is not something that can easily be unpacked on demand, nor is it essentially visible whenever we encounter more or less structured data, with or without context-related metadata. They contrast it with the interactional view, which sees context as entangled in the communication and local practices of those who use data. ‘To critical data scholars, context does not preexist, but is instead a local accomplishment that can shift dynamically and can never be wholly captured as data’ (ibid. p. 6). Related to this discussion, we also mention two less prevalent but equally interesting themes which refer to the socio-technical and mediatic nature of data, where coded paragraphs describe data as products of social arrangements, practices, and the use and affordances of technology (e.g. Neff, Tanweer, Fiore-Gartland and Osburn, 2017).
Additionally, data is perceived as a medium in audience research, where calls for critical data literacy focus on understanding that sharing personal data necessarily and sometimes unknowingly mediates our lives (Murru, Amaral, Brites and Seddighi, 2018).
Since data are so contextual and their roots hard to grasp, the understanding of data is always based on interpretation (Pappas, Emmelhainz and Seale, 2016; Tygel and Kirsch, 2016; Hautea, Dasgupta and Hill, 2017; Neff, Tanweer, Fiore-Gartland and Osburn, 2017). In the literature we noticed a lack of clarity in defining the interpretative nature of data. On the one hand, this is understood as data being an interpretation and, on the other, as an imperative of interpretative treatment of data. In a sense, when dealing with data we always encounter interpretations which we ourselves interpret through reading and processing, and when we create data we construct them as our own interpretations. Interpretations occur as more or less reasonable subjective estimations, as ‘patterns of inclusion or exclusion’ (Neff, Tanweer, Fiore-Gartland and Osburn, 2017, p. 6), and are ‘based on taken-for-granted norms and standards of data’ (ibid.) generated in different contexts. Communicating the significance of data involves assessing one’s own understanding and interpreting the sensibilities of the intended audiences. This realization implies the impossibility of a neutral approach to data and leads to a critical apprehension of the privileged epistemological status sometimes ascribed to them.
The third subtheme refers to the transparency of data. This theme is usually found in calls for transparency and access rights (Bhargava, Kadouaki, Bhargava, Castro and D’Ignazio, 2016; D’Ignazio and Bhargava, 2016; Tygel and Kirsch, 2016; Cox, Pinfield and Rutter, 2018; Pangrazio and Selwyn, 2018). Beyond such calls, we note that a crucial aspect of a critical study of data is the general understanding that data can always be considered with regard to their existence and the level of their visibility, organization or accessibility.
All of these dimensions are necessarily linked to the epistemological status of data where authors debating critical data literacy critique it on three levels: data as authority, data as reduction and data as ideology. While the aforementioned dimensions were presented by authors as the way in which the data is found or exists in the world, the next considered dimensions concern the way in which humans assign to them a certain kind of epistemic value.
The critique (D’Ignazio and Bhargava, 2015; Pappas, Emmelhainz and Seale, 2016; Tygel and Kirsch, 2016; Battista and Conte, 2017; D’Ignazio, 2017; El Khouri Buzato, 2017; Greenfield, 2017; Hautea, Dasgupta and Hill, 2017; Neff, Tanweer, Fiore-Gartland and Osburn, 2017; Carrington, 2018; Gebre, 2018) of such treatment of data starts from attempts to deconstruct the notion that data can be taken as an undisputed authority, or from the claim that labelling something (e.g. a set of numbers) as data does not imply that these numbers are somehow available as a datum (from Latin: a given), that is, as a fact. Treating data as an undisputed authority leads them to be ‘typically treated as a superior form of evidence’ (Battista and Conte, 2017, p. 147) and ‘as an absolute, a single ahistorical artifact that speaks from a decontextualized place of authority and is alone in providing “real answers” to social questions’ (Pappas, Emmelhainz and Seale, 2016, p. 179). The latter authors also add that in Western academic settings data interpretation is somehow reserved for experts or authorities in a certain field of inquiry. This is explained in multiple works as data fetishization, or a mystification whereby data by themselves have the power to disarm completely, since they inherently hold all the true answers regardless of how they were generated. We perceive that taking data at face value, epistemically and unproblematically, is perhaps the key illiteracy in the context of critical data literacy.
From those arguments it follows that such understandings of data lead to reductive views of the world. Battista and Conte (2017, p. 147) explain, in the context of data visualization, that ‘we live in a data-saturated moment, in which maps, infographics, and charts distil complex realities into seemingly palpable truths’. Because of this, in the examined authors’ view, a literate person has to understand how the data came to be. This is usually explained using the example of scientific research, where one has to question, recognize a lack of information about, or become informed about different parts of the scientific process: who generated the data, how and in which context they were generated, for whom, what methods were used, and what was taken into consideration (what facts, what theories, what measurements) and what was not.
Some authors (e.g. Markham, 2018) explicitly and forcefully take the stance that, because data are not neutral, they should be considered as ideology. Tygel and Kirsch (2016, p. 16) portray data as having a ‘seducing precision and objectivity’ and claim that ‘data grounded statements almost always hide ideologies and intentions about anything one wants to prove’.
The key takeaway from such a portrayal of data is that authors advocating a critical data literacy approach see data as always problematic and that the critical stance towards data comes from understanding the way in which data exist in the world and from cultivation of a non-absolutist treatment of its epistemic value. This leads us to conclude that data literacy programs that are built upon statistical and research literacy should ensure that students learn not only how to do statistical analyses and execute research methodologies but also about the philosophical underpinnings of the scientific endeavour, the social organization of the sciences and social influences on the production of scientific knowledge.
Key problem accentuations as proposed literacy rationale
The third major theme describes the state of the field in relation to what we came to understand as a search for the proposed literacy rationale. The key problem accentuations discussed in this section refer to how critical data literacy authors ground the need for this literacy in the critique of certain aspects of contemporary society. These critiques refer to five subthemes: algorithms, big data, data science, artificial intelligence and neoliberalism.
One of our findings is that the analysed literature often dedicated significant portions of its space to the definition of terms such as those mentioned here as subthemes. However, every introduction is done with an agenda of problematisation and to accentuate why a critical data literacy is needed or why a critico-pedagogical expansion of data literacy is necessary.
Here we skip the definitions and move directly to problems. When problematised (D’Ignazio and Bhargava, 2015; Bhargava, Kadouaki, Bhargava, Castro and D’Ignazio, 2016; Philip, Olivares-Pasillas and Rocha, 2016; D’Ignazio, 2017; Hautea, Dasgupta and Hill, 2017; Carrington, 2018; Cox, Pinfield and Rutter, 2018; Markham, 2018; Pangrazio and Selwyn, 2018; Ytre-Arne and Das, 2018), algorithms are portrayed as instances of technology to which society has outsourced different kinds of decision-making processes. Authors explain the formation of so-called algorithmic identities, which are created through procedures of algorithmic profiling and whose outcomes come to be treated as relevant sources of data and information for making automated decisions. These decisions become problematic when they concern the filtering of information based on criteria visible mostly to the creator or proprietor of the algorithms, and when they are used to rank individuals in order to inform decisions about access to financial and health services, housing, employment, education, etc., which is seen as potentially leading to exclusion and social divisions.
Carrington (2018, p. 71) writes that the identities that algorithms construct are pervasively influencing our lives because of ‘the gap between who we think we are and who algorithms construct us to be’. Carrington sees them as robbing us of the opportunity to construct a self-narrative on our own and as increasingly drawing us into an exploitative relationship in which more and more personal data is needed in order for an algorithm to understand us better. Connected to this is the critique of targeted marketing as one of the problems most commonly referred to in the context of unwanted or unregulated use of algorithms. The discourse frames this as a debate about relinquishing personal data for free at the doorstep of corporations, which then treat the data as their own property and sell them for profit. Personal data is exchanged for a questionable personalization of web services, information retrieval, entertainment, etc., whose purpose is to put the user in the position to create and relinquish ever more personal data, which in turn generates more profit for the service provider.
Algorithms may be problematic on the technological side of things, but authors locate what legitimizes their use in the so-called Big Data hype (Bhargava and D’Ignazio, 2015; Philip, Olivares-Pasillas and Rocha, 2016; El Khouri Buzato, 2017; D’Ignazio, 2017; Carrington, 2018; Cox, Pinfield and Rutter, 2018). The hype, sometimes also called the myth, of Big Data refers to the idea that if we connect ‘large data sets to identify patterns in order to make economic, social, technical, and legal claims’ with the available ‘computational power and algorithmic accuracy’ we will achieve a ‘higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy’ (Bhargava and D’Ignazio, 2015, p. 2). The critique of Big Data subsumes, and cannot be separated from, the critique of algorithmization. However, when referring to Big Data, authors more clearly express its consequences for society and its impact on both private and public life. Big Data is portrayed as a problematic process in which decision making of societal import is left to automated systems, based on an implicit trust in technology and market logic by governmental and corporate actors who treat data as a currency of power and a crucial source of profit. The celebrated view of the prosumer is dismissed as exploitative in the context of passive and extortionate collection of data. Several authors examine these issues especially in the context of social media and big social data (Hautea, Dasgupta and Hill, 2017; Gebre, 2018; Murru, Amaral, Brites and Seddighi, 2018; Pangrazio and Selwyn, 2018; Ytre-Arne and Das, 2018), mainly focusing on the lack of awareness about them among contemporary youth and the co-option of attention in audience research.
A professional and scientific framing of Big Data has been established in the field of data science. Even though data science has been more thoroughly considered in only two analysed articles we believe it is important to mention that these authors have noticed the need to take a critical stance towards data science as a whole as well. Perhaps the more articulated critique is that of El Khouri Buzato (2017, p. 4) who debates the scientific status of data science based on claims that data science practices somehow supersede the scientific process and the stance that its methods ‘which were previously reserved for truly "scientific purposes", are now up for grabs for anyone literate enough in these multisemiotic discursive technologies in order to seek patterns among things that are not related in culture or in the human mind’ leading to what we could call false positives or inferences not substantiated in reality. He also claims that data science:
“is but an expansion and evolution of efforts formerly applied to business analysis, not scientific endeavors. It incorporates computer science, statistics, and applied mathematics into new highly automated methods [...] for analyzing high volumes of data and ‘extracting knowledge’ from them” (ibid., p. 6).
Neff, Tanweer, Fiore-Gartland and Osburn (2017) stand rather on the side of interdisciplinary cooperation between data scientists and critical data scholars who could work together toward an ethical data science practice seeking to confront it with the critique of its ‘depoliticized’ practices.
Cox, Pinfield and Rutter (2018) uncover artificial intelligence as another important expansion of the Big Data hype and explain in great detail the consequences that the application of artificial intelligence will have for library and information sciences. Artificial intelligence was recognized as a separate subtheme because of the implicit understanding that methods of natural language processing and machine learning are at the basis of Big Data exploitation and processes of algorithmization. Thus, we think that the overarching angle of artificial intelligence needs to be more thoroughly examined in the field. Cox, Pinfield and Rutter (2018, p. 5) begin to formulate a critical outlook by arguing that ‘given the complexity of the algorithms it becomes difficult to make the process of decisions intelligible’ and that we should ask: ‘How are AI systems to be accountable and transparent if their operation cannot be understood?’. Additionally, they raise concerns about librarians’ and information professionals’ potential job insecurity.
The last subtheme we uncovered as a key problem accentuation is the underlying critique of neoliberalism. Regardless of the level of analysis, authors (Bhargava and D’Ignazio, 2015; D’Ignazio, 2017; Carrington, 2018; Cox, Pinfield and Rutter, 2018; Markham, 2018; Pangrazio and Selwyn, 2018; Ytre-Arne and Das, 2018) often point to how the current economic paradigm creates disparities and breeds inequity because of the divide between those who own and benefit from data exploitation and those who do not have data and are continually harvested, without compensation, for personal data which they themselves have created in exchange for filtered personalized services. Data is portrayed as a powerful currency, as capital which is created through free unpaid labour and gathered by means of extractive collection (see Bhargava and D’Ignazio, 2015). ‘Profits generated, as results of these labours are not used to compensate users, creating a significant disconnect between labourers and the product of their labour, and between individuals and corporations’ (Carrington, 2018, p. 69). Humans are seen as reduced to units of data who, akin to docile ants, labour for free on big social data farms while, in the context of different promises like communication, education, social engagement etc., they unwittingly transfer the ownership of data about their lives to corporations and governments, which collect them, sell them and use them to reaffirm their position and grow further in financial, social and governmental power.
Critical pedagogic articulation
The fourth theme pertains to the question of which kind of pedagogic articulation is necessary in order to convey and achieve the critical capacities needed to battle the accentuated problems. Almost exclusively, authors see solutions in critical pedagogy. Namely, what is meant here is the critical pedagogy of Paulo Freire (Bhargava and D’Ignazio, 2015; Bhargava, Kadouaki, Bhargava, Castro and D’Ignazio, 2016; Pappas, Emmelhainz and Seale, 2016; Tygel and Kirsch, 2016; D’Ignazio, 2017; El Khouri Buzato, 2017; Hautea, Dasgupta and Hill, 2017; Markham, 2018; Murru, Amaral, Brites and Seddighi, 2018), although one author also articulates a pedagogic approach based on the work of Ira Shor, a student of Paulo Freire (Berendt, 2012).
The underlying idea of a Freirean critical data literacy is ensuring that the educational process is grounded in authentic experiences and the educands’ own reality. Its goal is to uncover the context within which the individual engages with data. The process entails the phases of thematisation and systematization of experience and their problematisation, with a focus on the transformation of the educands’ reality. A good articulation of Paulo Freire’s critical pedagogy can be found in the work of Tygel and Kirsch (2016, p. 112), who explain that:
“literacy education is composed of two complementary and indivisible aspects: the technical ability of reading and writing, and the social emancipatory process of understanding and expressing oneself in the world. In data literacy, we can observe that there are technical capacities related to data manipulation, such as general computer abilities and statistical-mathematical methods, and capacities for critically analysing data, such as understanding the context in which they were generated, and the reality pictured by them.”
Even though the analysed sample is small, we noticed many different approaches, activities and methods. Here we briefly name them: data visualization (Bhargava and D’Ignazio, 2015; Bhargava, Kadouaki, Bhargava, Castro and D’Ignazio, 2016; Pappas, Emmelhainz and Seale, 2016; Philip, Olivares-Pasillas and Rocha, 2016; Battista and Conte, 2017; Gebre, 2018), data biographies (e.g. Markham, 2018), algorithm comparison (e.g. Bhargava and D’Ignazio, 2015), learning the use of data analysis software (e.g. D’Ignazio, 2017) and cultural probes (e.g. Jarke, 2018). An analysis of the critical pedagogical articulation of critical data literacy and the discussion of skills, sensitivities and understandings merit more attention, but this goal lies outside the scope of this paper.
Critical data literacy as ethics
The last thematic area defines critical data literacy as a search for a critical democratic mission and societal role. Critical data literacy is at no point portrayed as a mere set of skills, but is contextualized in the demand for active citizen participation and oversight of ethically questionable data-related practices. These two aspects indicate the extent of the already established problematic; an ethic requires a relatively equal and reasonable playing field. In this sense, authors (Tygel and Kirsch, 2016; D’Ignazio, 2017; Gebre, 2018) have emphasized the problem of access: access to data, access to information, access to technology, access to knowledge and access to services. Access to technology is becoming more important (D’Ignazio, 2017; Cox, Pinfield and Rutter, 2018) as the influence of artificial intelligence increases: either the knowledge to employ natural language processing and machine learning or ready-made curated technological solutions need to be available to the public in order to reduce the asymmetries in access to the computational power needed to understand the data-saturated reality we live in. How, and why, one shares the economic and political power generated by data is an immense question that needs more exploration. An important part of the literature (McMahon, Smith and Whiteduck, 2017; Walter and Suina, 2018) is developing in relation to the term data sovereignty, where authors debate concepts like indigenous critical data literacy and voice a critique of the colonial treatment of indigenous data. Even though data sovereignty is contextualized in the process of re-appropriation of indigenous data, we see it as an important concept for everyone, since it establishes that inclusion in all data-related processes is a necessary precondition of self-determination. The concept of sovereignty might be considered useful as a starting point when debating the unreflected practices of relinquishing personal data on social media platforms and elsewhere.
Additionally, recurring themes which refer to ethical issues like protection of privacy, surveillance, epistemic manipulation and exploitation of data-related labour remain at the forefront of critical data literacy as a practice of activism.
Data ethics as a concept, however, is rarely debated explicitly. Ethical considerations of data in the analysed literature seem to be based on a binary logic that contrasts the empowering uses of technology by the oppressed with the disenfranchising uses of technology by the powerful elites. That being said, we suggest that more complex outlooks are needed, in which perhaps the very technicization of the process of empowerment might be stifling the sought emancipation (in part due to its inherent exclusivist tendencies), and that a proposed data ethics should seek to explain the presumed value of a technological framing of life and society. This latter point indicates the larger ethical problem of the lack of a comprehensive and coherent framework within which one significant aspect would be the technological dimension of humanity.
In this paper we have attempted to give an overview of the current state of the debate surrounding the concept of critical data literacy. Our systematic literature review, based on thematic analysis, established five major themes that uncover the way in which the critical data literacy discourse is structured. The themes we identified refer to the ontological treatment of data, their epistemological status, the problems they cause in societies, and the pedagogical and ethical aspects of data. We came to understand critical data literacy as a situated educational praxis where a critical approach to data and our data realities is enacted as the problematization and transformation of oppressive and unjust conditions of life produced by exclusionary, exploitative, invasive and manipulative uses of data.
Thinking about future developments of the critical data literacy discourse, we propose that authors contributing to it should consider its relation to existing achievements in subfields of library and information science such as critical information literacy. Looking at how the critical data literacy discourse introduces Freire’s notion of critical pedagogy without an explicit connection to the existing and conceptually related literature of critical information literacy, we notice that the current state of the discourse disregards or evades the historical development of the terminological debate surrounding the definitions and relationship of data and information. This is a recurring question, similarly posed by Ma (2013), about whether the concept of information is still relevant for information sciences and whether it still makes sense to emphasize conceptual fluidity and connections between data and information (Bawden and Robinson, 2018) as proposed, for example, in Floridi’s philosophy of information. A further point concerns the very definition of data: it was not so long ago that information itself was defined as data in context (cf. Liew, 2007; Zins, 2007), while now, in the scope of the analysed literature, it seems as though data is putting on the garments information used to wear. The contextuality of data is not regarded in its informational capacity; rather, it is seen on the ontological level as a truism about data itself.
About the authors
Sonja Špiranec is a Professor at the Department of information and communication sciences, University of Zagreb, Ivana Lučića 3, 10 000 Zagreb, Croatia. She received her Ph.D. from Zagreb University and her research interests are information literacy and knowledge organization. She can be contacted at email@example.com
Denis Kos is a Ph.D. candidate and a teaching and research assistant at the Department of information and communication sciences and the Centre of excellence for integrative bioethics, University of Zagreb, Ivana Lučića 3, 10 000 Zagreb, Croatia. His research interests are information literacy and knowledge organization. He can be contacted at firstname.lastname@example.org
Michael George is an Associate Professor at St. Thomas University, Fredericton, NB, Canada. He received his Ph.D. from St. Paul University and the University of Ottawa and his research interests are bioethics and ethical foundations. He can be contacted at email@example.com
- Battista, A. & Conte, J. (2017). Teaching with data: visualization and information as a critical process. Retrieved 8 May, 2019 from https://osf.io/preprints/lissa/ams2f/download (Archived by the Internet Archive at http://bit.ly/33sKW3R)
- Bawden, D. & Robinson, L. (2018). Curating the infosphere: Luciano Floridi’s philosophy of information as the foundation for library and information science. Journal of Documentation, 74(1), 2–17.
- Berendt, B. (2012). Data mining for information literacy. In D. E. Holmes & L. C. Jain (Eds.) Data Mining: Foundations and Intelligent Paradigms (pp. 265–297). Berlin: Springer.
- Bhargava, R., Kadouaki, R., Bhargava, E., Castro, G. & D'Ignazio, C. (2016). Data murals: using the arts to build data literacy. The Journal of Community Informatics, 12(3), 197–216.
- Boyd, D. & Crawford, K. (2012). Critical questions for Big Data: provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679.
- Carlson, J., Fosmire, M., Miller, C.C., et al. (2011). Determining data information literacy needs: A study of students and research faculty. Portal: Libraries and the Academy, 11(2), 629–657.
- Carrington, V. (2018). The changing landscape of literacies: big data and algorithms. Digital Culture & Education, 10(1), 67–76.
- Cox, A.M., Pinfield, S. & Rutter, S. (2018). The intelligent library: thought leaders' views on the likely impact of artificial intelligence on academic libraries. Library Hi Tech.
- D'Ignazio, C. & Bhargava, R. (2015). Approaches to building big data literacy. Retrieved 8 May, 2019 from http://www.kanarinka.com/wp-content/uploads/2015/07/Big_Data_Literacy.pdf (Archived by the Internet Archive at http://bit.ly/31x4Rgk)
- D'Ignazio, C. & Bhargava, R. (2016). DataBasic: design principles, tools and activities for data literacy learners. The Journal of Community Informatics, 12(3), 83–107.
- D'Ignazio, C. (2017). Creative data literacy: bridging the gap between the data-haves and the data-have nots. Information Design Journal, 23(1), 6–18.
- Ekbia, H., et al. (2015). Big data, bigger dilemmas: a critical review. Journal of the Association for Information Science and Technology, 66(8), 1523–1545.
- El Khuori Buzato, M. (2017). Critical data literacies: going beyond words to challenge the illusion of a literal world. Retrieved 8 May, 2019 from https://www.researchgate.net/profile/Marcelo_Buzato/publication/328260295_Critical_Data_Literacies_going_beyond_words_to_challenge_the_illusion_of_a_literal_world/links/5bc1440c299bf1004c5e4ee9/Critical-Data-Literacies-going-beyond-words-to-challenge-the-illusion-of-a-literal-world.pdf (Archived by the Internet Archive at http://bit.ly/2ZS5fWo)
- Elmborg, J. (2006). Critical information literacy: implications for instructional practice. The Journal of Academic Librarianship, 32(2), 192–199.
- Gebre, E.H. (2018). Young adults' understanding and use of data: insights for fostering secondary school students' data literacy. Canadian Journal of Science, Mathematics and Technology Education, 18(4), 330–341.
- Greenfield, A. (2017). Practices of the minimum viable utopia. Architectural Design, 87(1), 16–25.
- Hautea, S., Dasgupta, S. & Hill, B.M. (2017). Youth perspectives on critical data literacies. In CHI 2017 Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 919–930).
- Hjørland, B. (2018). Library and information science (LIS): Part 1. Knowledge organization, 45(3), 232–254.
- Jarke, J. & Maaß, S. (2018). Probes as participatory design practice. i-com, 17(2), 99–102.
- Kitchin, R. (2014). Big data, new epistemologies and paradigm shifts. Big Data & Society, 1(1).
- Kitchin, R. & Lauriault, T. (2014). Towards critical data studies: charting and unpacking data assemblages and their work. Retrieved 8 May, 2019 from http://mural.maynoothuniversity.ie/5683/1/KitchinLauriault_CriticalDataStudies_ProgrammableCity_WorkingPaper2_SSRN-id2474112.pdf (Archived by the Internet Archive at http://bit.ly/33oNQXz)
- Koltay, T. (2017). Data literacy for researchers and data librarians. Journal of Librarianship and Information Science, 49(1), 3–14.
- Liew, A. (2007). Understanding data, information, knowledge and their inter-relationship. Journal of Knowledge Management Practice, 8(2).
- Lyon, L. & Brenner, A. (2015). Bridging the data talent gap: positioning the iSchool as an agent for change. International Journal of Digital Curation, 10(1), 111–122.
- Ma, L. (2013). Is information still relevant? Information Research: An International Electronic Journal, 18(3), n3.
- McMahon, R., Smith, T.J. & Whiteduck, T. (2017). Reclaiming geospatial data and GIS design for indigenous-led telecommunications policy advocacy: a process discussion of mapping broadband availability in remote and northern regions of Canada. Journal of Information Policy, 7(1), 423–449.
- Markham, A.N. (2018). Critical pedagogy as a response to datafication. Qualitative Inquiry, 1–7.
- Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G. & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med, 6(7), e1000097.
- Murru, M.F., Amaral, I., Brites, M.J. & Seddighi, G. (2018). Bridging the gap between micro and macro forms of engagement: three emerging trends in research on audience participation. In R. Das & B. Ytre-Arne (Eds.) The Future of Audiences (pp. 161–177). Switzerland: Palgrave MacMillan.
- Neff, G., Tanweer, A., Fiore-Gartland, B. & Osburn, L. (2017). Critique and contribute: a practice-based framework for improving critical data studies and data science. Big Data, 5(2), 1–15.
- Oliphant, T. (2012). A case for critical data studies in library and information studies. Harvard Business Review, 70.
- Pangrazio, L. & Selwyn, N. (2018). 'It's not like it's life or death or whatever': young people's understandings of social media data. Social Media + Society, 4(3), 1–9.
- Pappas, E., Emmelhainz, C. & Seale, M. (2016). Thinking through visualizations: critical data literacy using remittances. Retrieved 8 May, 2019 from https://escholarship.org/content/qt75231194/qt75231194.pdf (Archived by the Internet Archive at http://bit.ly/33uSxPi)
- Philip, T.M., Olivares-Pasillas, M.C. & Rocha, J. (2016). Becoming racially literate about data and data-literate about race: data visualizations in the classroom as a site of racial-ideological micro-contestations. Cognition and Instruction, 34(4), 361–388.
- Prado, J.C. & Marzal, M.A. (2013). Incorporating data literacy into information literacy programs: core competencies and contents. Libri, 63(2), 123–134.
- Robinson, L. & Bawden, D. (2017). The story of data: a socio-technical approach to education for the data librarian role in the CityLIS library school at City, University of London. Library Management, 38(6/7), 312–322.
- Sharon, T. & Zandbergen, D. (2017). From data fetishism to quantifying selves: self-tracking practices and the other values of data. New Media & Society, 19(11), 1695–1709.
- Spiranec, S., Banek Zorica, M., & Kos, D. (2016). Information Literacy in participatory environments: the turn towards a critical literacy perspective. Journal of Documentation, 72(2), 247–264.
- Tewell, E. (2015). A decade of critical information literacy: a review of the literature. Communications in Information Literacy, 9(1), 2.
- Tygel, A.F. & Kirsch, R. (2016). Contributions of Paulo Freire to a critical data literacy: a popular education approach. The Journal of Community Informatics, 12(3), 108–121.
- Van Dijk, J. (2014). Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology. Surveillance & Society, 12(2), 197–208.
- Walter, M. & Suina, M. (2018). Indigenous data, indigenous methodologies and indigenous data sovereignty. International Journal of Social Research Methodology, 22(3), 233–243.
- Wang, L. (2018). Twinning data science with information science in schools of library and information science. Journal of Documentation, 74(6), 1243–1257.
- Ytre-Arne, B. & Das, R. (2018). An agenda in the interest of audiences: facing the challenges of intrusive media technologies. Television & New Media, 20(2), 1–15.
- Zins, C. (2007). Conceptual approaches for defining data, information, and knowledge. Journal of the American Society for Information Science and Technology, 58(4), 479–493.