Information Research, Vol. 7 No. 4, July 2002
A presentation open to further development is given. The idea of technology, including information technology, as a human construction is taken as the basis for the themes to be developed. The possibility of constructing an information dynamic, continuous with the dynamic of capitalism, is considered. Differentiations are made between forms of semiotic labour: semantic from syntactic labour and communal from universal labour. Information retrieval systems and the departure from the labour theory of copyright are considered in relation to the forms of labour distinguished. An information dynamic is constructed. The potential and limitations of syntactic labour are considered. The analytic value of the distinctions developed is differentiated from the possible predictive power of the dynamic indicated. The Semantic Web is viewed from the perspective of these considerations.
A high degree of discursive coherence is characteristically demanded for scholarly communication, in written and published form. Discursive coherence, and the finished appearance of a printed product, can give the illusion of closure to a dialogic process. Greater informality is allowed for oral presentations. In this context, I wish to take advantage of the nature of electronic communication, capable of combining oral and written elements (Warner and Cox, 2001), and present observations aphoristically, but still permanently and with a definite rhetorical structure.
The justification for this mode of presentation lies in the developing nature of the field (consciousness is lagging behind reality). For:
Another error ... is the over early and peremptory reduction of knowledge into arts and methods; from which time commonly sciences receive small or no augmentation. ... knowledge, while it is in aphorisms and observations, it is in growth: but when it once is comprehended in exact methods, it may perchance be further polished and illustrate and accommodated for use and practice; but it increaseth no more in bulk and substance. (Bacon, 1973: 32)
The knowledge embodied in this article then remains open to growth.
The 'exact methods' referred to by Bacon are often associated with writing, most famously in Bacon's own observation 'writing [maketh] an exact man' (Bacon, 1985: 209). Exactness has also tended to attract cultural prestige. Yet reservations on the value of exactness can be found (Warner, 2001a), even, as here, for the development of sciences.
Exactness will also occur as a substantive theme of the discussion, particularly in connection with the possibilities and limitations of syntactic semiotic labour (a concept to be developed and explained).
The Semantic Web, as a proposed information system, would be subject to the conditions which influence the production and use of other systems. In particular, in terms of the distinctions to be developed, encoding universally consistent semantics into web-pages (Berners-Lee, Hendler, & Lassila, 2001) can be regarded as description labour expended at the point of production, with the aim of increasing control and reducing human labour in selection and use. More familiarly, semantic categories are to be modelled in syntactically detectable distinctions (Berners-Lee, et al., 2001).
A view of technology as a radical human construction will be taken as the basis for subsequent discussion. Classically, this view was developed by Marx, primarily, although not exclusively, with regard to industrial rather than information technologies:
Nature builds no machines, no locomotives, railways, electric telegraphs, self-acting mules etc. These are products of human industry; natural material transformed into organs of the human will over nature, or of human participation in nature. They are organs of the human brain, created by the human hand; the power of knowledge, objectified. The development of fixed capital indicates to what degree general social knowledge has become a direct force of production, and to what degree, hence, the conditions of the process of social life itself have come under control of the general intellect and been transformed in accordance with it. (Marx, 1973: 706)
Control mechanisms ('self-acting mules') and message transmission technologies ('electric telegraphs') are mentioned in this passage, but they are not its primary focus.
The identification of technology capable of performing autonomous labour with industrial technology would have been broadly true of Marx's historical period:
Only in large-scale industry has man succeeded in making the product of his past labour, labour which has already been objectified, perform gratuitous service on a large scale, like a force of nature. (Marx, 1976: 510)
Information technologies for message transmission were increasingly diffused from the mid-1860s and these are acknowledged by Marx, in a later passage which takes an inclusive view of communication:
the last fifty years have brought a revolution that is comparable only with the industrial revolution of the second half of the last century. On land the Macadamized road has been replaced by the railway, while at sea the slow and irregular sailing ship has been driven into the background by the rapid and regular steamer line; the whole earth has been girded by telegraph cables. (Marx, 1981: 164)
The industrial technologies of the 19th century, such as the steam-hammer 'that can crush a man or pat an egg-shell' (Dickens, 1946: 150), would have contained control mechanisms for variation in force, even if such mechanisms are not fully acknowledged in the classic concept of the simple machine (Minsky, 1967: 7). Primitive logic machines, such as Jevons' logic piano, were also developed in the late 19th century (Gardner, 1958).
More recently, the Marxian conception of technology as a radical human construction has been extended to information technologies, understood, rather schematically, as a form of knowledge concerned with the transformation of signals from one form or medium into another (Warner, 2000b). From this perspective, the language, including the written language, used by Marx can be seen as a cumulative creation of the 'general intellect'. Congruently with the growth of message transmission technologies, the late 19th century also witnessed the diffusion of non-verbal and abbreviated forms of writing, in logical notations, telegraphic codes, and shorthand.
The extension of a concept describing industrial technologies to include information technologies implies a continuity from industrial to information societies, with both potentially subsumed under capitalism. Familiarly, within discussions of the information society, continuities are counterposed to disjunctions with industrial and capitalist eras (Webster, 1995). A Marxian perspective can again be both novel and informative, in this context:
It is not what is made but how, and by what instruments of labour, that distinguishes different economic epochs.
The writers of history have so far paid very little attention to the development of material production, which is the basis of all social life, and therefore of all real history. But prehistoric times at any rate have been classified on the basis of the investigations of natural science, rather than so-called historical research. Prehistory has been divided, according to the materials used to make tools and weapons, into the Stone Age, the Bronze Age and the Iron Age. (Marx, 1976: 286)
Developments in the instruments of informational labour must be acknowledged, with the computer, as a universal information machine, displacing calculation and, increasingly, writing by hand, as well as special purpose information machines. Yet an underlying and underpinning continuity also exists, strikingly revealed in the theoretical development of the computer from an account of mathematical operations as the writing, erasure, and substitution of symbols. It is questionable whether modern information technologies constitute a transformation in material production rather than a significant addition (Warner, 1999a). An understanding of information as a perspective rather than as a disjunction from pre-existing forms of social organisation is, then, preferred here (Warner, 1999b).
If we acknowledge continuities (while not denying contrasts), can we detect or construct an information dynamic which is continuous with the dynamic of capitalism?
Classically, living labour is required to reawaken the dead labour embodied in machinery and thereby to confer use- and exchange-value on inert stuff (Marx, 1976: 527; Warner, 2000b). The fictional or mythic analogue to this process is supplied by Frankenstein giving life to his creation:
With an anxiety that almost amounted to agony, I collected the instruments of life around me that I might infuse a spark of being into the lifeless thing that lay at my feet. It was already one in the morning; the rain pattered dismally against the panes, and my candle was nearly burnt out, when, by the glimmer of the half-extinguished light, I saw the dull yellow eye of the creature open; it breathed hard, and a convulsive motion agitated its limbs. (Shelley, 1998: 38-39)
The awakening of dead physical or industrial labour by human action has analogies in the use of information technologies, specifically, in one interpretation of non-determinism in automata theory, where determinism is understood as the automatic transformations in the intervals between human intervention (regarded as non-determinist).
Can an information dialectic or dynamic then be constructed and detected in empirical developments?
Dialectics or dynamics have been constructed for other fields. For instance, in medicine a dialectic has been detected between the amelioration of known maladies and the consequential rise of enemies to health. The fundamental dynamic of capitalism has been seen as the substitution of dead for living labour, with the aim of decreasing the current costs of production. This process of substitution then yields benefits (teleologically, the historic role of capitalism):
a permanent tendency to increase the social productivity of labour is the main civilizing by-product of capital accumulation, the main objective service which capitalism has rendered mankind. (Mandel, 1976: 60)
In bibliography, the desire of bibliographers to bring order out of chaos is continually frustrated and fed by the urge of the authors, including bibliographers, to publish.
To construct an information dynamic, partly continuous with the fundamental dynamic of capitalism and inclusive of the dialectic between order and chaos in bibliography, some distinctions must be made between forms of human intellectual labour.
A distinction specific to intellectual labour, although it has analogues in physical labour, must be made between semantic and syntactic labour. A distinction derived from existing discussions, and there already partly applied to semiotic products, can be made between universal and communal labour.
Semantic labour is concerned with the content, meaning, or, in semiotic terms, the signified of messages. The intention of semantic labour may be the construction of further messages, for instance, a description of the original message or a dialogic response.
Syntactic labour is concerned with the form, expression, or signifier of the original message. Transformations operating on the form alone may produce further messages (classically, this would be exemplified in the logic formalised by Boole).
Both semantic and syntactic labour are expensive when directly performed by humans. Education, both formal and informal, has been regarded as constituting the production costs of intellectual labour (scholars will be acutely aware that the exchange value of intellectual labour need not be directly commensurate with its production costs and that semiotic labour can be conducted in the leisure enabled by other forms of labour). The objects of semantic labour can become objects of syntactic labour when a process is modelled or formalised, although the opening quotation from Bacon would suggest a restricted possibility of further growth. Syntactic labour need not be simple: consider the complexity of logical systems, for instance (a non-constructivist view of mathematics would admit the possibility of syntactically generating acceptable, but complex and yet unknown, statements).
A dual impulse to formalisation, and to the diffusion of formalisms, can be detected. The cultural value of exactness may motivate attempts at formalisation and positively influence their reception. The reduced labour involved in the operation of formalised processes (contrast direct multiplication with the use of logarithms) may impel both their construction and their diffusion. In their diffusion, formalisations renew the prestige of exactness and demonstrate the economic advantages of reduced labour.
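The contrast between direct multiplication and the use of logarithms can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, and `math.log10` stands in for the printed table that a human calculator would have consulted. The point is that the formalised procedure replaces a multiplication with two look-ups, one addition, and one reverse look-up.

```python
import math


def multiply_directly(a, b):
    """The laborious operation: multiply the two numbers themselves."""
    return a * b


def multiply_with_logs(a, b):
    """Multiply two positive numbers by adding their logarithms.

    Relies on the identity log10(a * b) = log10(a) + log10(b).
    math.log10 plays the part of the table look-up; raising 10 to the
    sum plays the part of the antilogarithm look-up.
    """
    return 10 ** (math.log10(a) + math.log10(b))


direct = multiply_directly(123.0, 456.0)
via_logs = multiply_with_logs(123.0, 456.0)
```

The two results agree only up to rounding, which is a small instance of the trade between reduced labour and exactness.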
The transition from oral to oral and written linguistic communication could be regarded as opening the possibility of a distinction between syntactic and semantic levels of consideration, when applied to human and social domains rather than to mathematical or, more narrowly, numerical domains.
A distinction between universal and communal labour is made by Marx:
We must distinguish here, incidentally, between universal labour and communal labour. They both play their part in the production process, and merge into one another, but they are each different as well. Universal labour is all scientific work, all discovery and invention. It is brought about partly by the cooperation of men now living, but partly also by building on earlier work. Communal labour, however, simply involves the direct cooperation of individuals. (Marx, 1981: 199)
Universal labour, understood as science, discovery, and invention, could be regarded as an aspect of the general intellect which transforms the process of social life. Communal labour is crucial to the awakening and use of universal labour, both as embodied in technologies and written texts. In the narrative of Frankenstein, universal labour would be represented by the learning used by Frankenstein and by the instruments of life, and communal labour, here mediated through a single individual, in the application of that learning and those instruments.
With regard to 'building on earlier work', disciplines are understood to differ in the extent to which they are cumulative. Disciplines marked by the extensive use of syntactic operations, most obviously mathematics, are regarded as more strictly cumulative than the human sciences, and, even more, than the texts and artefacts studied in the human sciences (consider the reduction of Shannon's seminal work in 1938 on analogies between Boolean logic and switching circuits to material for secondary education, over the subsequent 50 years).
Since the late 19th century, information technologies which can be used to perform syntactic labour have been increasingly developed. These technologies have been adopted for public domain information retrieval systems and have also influenced the development of copyright. I will consider information retrieval systems and copyright, but my further purpose is to suggest that distinctions between forms of labour, and the dynamic constructed, could inform understandings of other areas of information activity.
I wish, in this context, to confine attention to systems predominantly concerned with written language. Oral and non-verbal forms of graphic communication, which have undergone less clearly marked historically accumulated forms of coding, present different issues for retrieval system design. Most obviously, they do not necessarily offer readily distinguishable syntactic units with potential semantic significance.
Two antithetical, if not always clearly distinguished, traditions can be detected in information retrieval system design and evaluation. The idea of query transformation, understood as the automatic transformation of a query into a set of relevant records, has been dominant in information retrieval theory. A contrasting principle of selection power has been valued in ordinary discourse, librarianship, and, to some extent, in practical system design and use. Philosophical antecedents to the idea of selection power can also be found (Warner, 2000a) (consider also the etymology of intelligence, from the Latin inter-legere, to choose between). The debate between query transformation and selection power may not be resolvable within either paradigm, but, in this context, I wish to take the privilege of assuming selection power as the founding principle for system design, evaluation, and use.
Selection power may be the design principle, but selection labour could be regarded as the primary concept, from which selection power is derived. Let us assume a resistance to labour (I note here a congruence between Marx and Zipf) and that a relatively fixed quantity of selection labour is shared between system producer and searcher, with variation of the distribution between those poles.
Selection power is valued by a searcher as it reduces their selection labour (and an exhaustive serial search may not be a practical possibility). Description labour by the system producer tends to aim to increase the selection power of the searcher and reduce their selection labour (description labour is understood to include cataloguing, or document description, and classification, or subject categorization, incidentally revealing the congruence between their aims). The semantic and syntactic intellectual labour embodied in documents described is here treated as a given. The description labour of the system producer can contain elements of syntactic labour, for instance, transcription or transformation of the object-language of documents described into the metalanguage of index representations, and of semantic labour, for instance the application of thesaural terms derived from a controlled vocabulary or of cataloguing codes to the description of documents. In the 19th century, both syntactic and semantic labour might have involved continuous human intervention (consider the creation of Palmer's Index to The Times and the primarily syntactic labour of transcribing newspaper headlines as index entries); in modern practice, syntactic labour is delegated to humanly constructed technologies, and, accordingly, human intellectual labour becomes almost exclusively semantic.
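The delegation of syntactic description labour to a technology can be sketched with a small inverted index built from headlines. The headlines and the function name `build_index` are invented for illustration and are not drawn from Palmer's Index or any real source. The index is generated from word forms alone, with no human judgement about what each headline concerns.

```python
from collections import defaultdict


def build_index(headlines):
    """Map each lower-cased word form to the set of headline numbers containing it."""
    index = defaultdict(set)
    for number, headline in enumerate(headlines):
        for word in headline.lower().split():
            token = word.strip(".,;:!?'\"")   # drop surrounding punctuation
            if token:
                index[token].add(number)
    return index


# Invented headlines, for illustration only.
headlines = [
    "Bank raises interest rates",
    "River bank collapses after flood",
    "Interest in railway shares grows",
]
index = build_index(headlines)
bank_entries = index["bank"]
```

A search on `bank` retrieves both the financial and the riverside headline, since the index records the signifier and not the signified. Distinguishing them is left to the searcher, or to additional semantic description by the producer.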
Universal labour is understood as information technologies, in both their hardware and software aspects, and communal labour as the awakening or use of those technologies, including semantic record description.
A diagram may clarify these distinctions and their application to information retrieval systems (see Figure 1). The classification of systems from highly to loosely structured is tautological in that it is derived from the objects described and the framework of description, but may still be informative.
The Financial Times, in its various searchable manifestations, provides a peculiarly pure example of the distinction between syntactic and semantic labour. It is available as a web-resource without payment at the point of use, with largely syntactically generated search facilities which operate on identifiable units of the source (at: http://news.ft.com). It is also available with additional description, generated from human semantic labour (which could be syntactically assisted), from a number of vendors. For instance, the file available through Dialog labels articles by geopolitical region and product/industry names, including NAICS (North American Industry Classification System) codes. Direct payment at point of use is made for the resources which embody additional semantic labour. The continuity of such sources is market testimony to readiness to pay for additional selection power (and further evidence for the congruence of the concept with ordinary discourse understandings and everyday practice). Provision of both types of resource involves similar access to the universal labour embodied in information technologies and comparable communal labour to reinvigorate those technologies.
The costs of human labour in description can be more broadly considered. For instance, the costs of creating a catalogue record to the standards required for World Cat are in the order of US $40. The labour in description may contain syntactic elements, for instance, in transcription, but will be predominantly semantic. Costs of syntactic labour, by contrast, in storage, manipulation, and transmission of records have diminished historically, and continue to diminish, as communal human labour is transformed into universal labour. Labour invested in record description increases the selection power and reduces the selection labour of the searcher.
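The relation between description labour at production and selection labour at use can be given a toy illustration. Everything in it is an assumption made for the sketch: the records are invented, the `subject` field stands for a label assigned by human semantic labour, and counting the records a searcher must read is only one crude way of measuring selection labour.

```python
# Invented records; "subject" represents a humanly assigned description.
records = [
    {"title": "Bank raises interest rates", "subject": "finance"},
    {"title": "River bank collapses after flood", "subject": "environment"},
    {"title": "Central bank warns on inflation", "subject": "finance"},
    {"title": "Fishing from the bank at dawn", "subject": "leisure"},
]


def search_by_word(records, word):
    """Syntactic search: match the word form in the title."""
    return [r for r in records if word in r["title"].lower().split()]


def search_by_subject(records, word, subject):
    """Search that also exploits the producer's semantic description."""
    return [r for r in search_by_word(records, word) if r["subject"] == subject]


def selection_labour(candidates):
    """Toy measure: the number of records the searcher must read and judge."""
    return len(candidates)


labour_without_description = selection_labour(search_by_word(records, "bank"))
labour_with_description = selection_labour(
    search_by_subject(records, "bank", "finance")
)
```

In this sketch the searcher interested in banking reads four records when only word forms are indexed and two when subject labels are available. The labour saved at use has been expended by whoever assigned the labels.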
Returning to the overall schema embodied in the diagram, we can see that producers of information systems, from highly to loosely structured, have comparable access to universal intellectual labour, embodied in the language they use, and, specifically, in the information technologies available. Comparable, although contrasting, levels of communal labour would be required for system design and maintenance. Strikingly different levels of direct human labour are given to document description: for records in library and union catalogues, intense semantic labour is required (whose intensity could be related to the exactness required); for Internet directories, selection and description of resources, although to less exacting standards; for Internet search engines, very little, if any, additional semantic labour. The communal labour invested in the description of resources reduces the selection labour of the searcher (with both forms of labour reflecting the high costs of direct human employment).
The model can be validated, from macro- to micro-levels. At a macro-level, syntactically based systems proliferate (consider the variety of Internet search engines), while semantically enriched systems, such as World Cat, may occupy unique market positions. Simultaneously, their search facilities, products of universal labour, are converging in appearance and power. At an intermediate level, the function of library cooperatives has changed over time, moving along the horizontal axis of the diagram, from adapting universal labour to a concern with sharing the descriptive labour of cataloguing (from awakening Frankenstein's monster to distributing its limbs). At a more micro-level, the relative costs of communal and universal labour, considered in relation to market demand, form the decision framework for the conversion of historical resources from paper to electronic form (including Palmer's Index to The Times). For information retrieval systems, the communal labour invested in description at production reduces the labour required at use (proposals for coding in the semantic web could be understood as part of this dialectic). The distribution of direct human labour between producer and searcher may depend on the nature of the market for the product.
Information retrieval systems, then, can be seen to exhibit the fundamental dynamic of capitalism, the substitution of dead for living labour, although semiotic rather than physical labour. The specific, and already known, dynamic of bibliography between order and chaos is accentuated. Chaos is further enabled by the reduced costs of making information public. Possibilities for order are enhanced by the availability of delegated syntactic labour (although the limitations of such labour are becoming painfully known). The resources giving control themselves contribute to overall disorder (consider Search Engine Watch in relation to Theodore Besterman's A World Bibliography of Bibliographies and classic concerns with bibliographic proliferation).
I wish here to review the striking reversal of the labour or 'sweat of the brow' theory of copyright by the Feist judgement of 1991 and to suggest that a similar dynamic, between living and dead labour, underlies the reversal and its date of occurrence.
The classic liberal justification for intellectual property, including copyright, is given by the United States Constitution:
The Congress shall have Power
To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;
Discussions of copyright have tended to refer to the balance between the two ends of promoting science and the useful arts and protecting the property rights of authors. A more careful reading suggests that property rights were to be given to authors as a mechanism to promote the public good and not as an end in themselves (although the late 18th century marked the emergence of the author as a figure fully entitled to economic reward for their labour). Judicial interpretation and public understanding have tended to focus on the rights of authors to be rewarded for their labour. This focus became known as the labour or 'sweat of the brow' theory of copyright. The legislature, oriented towards the present and future rather than precedent, and compelled to review practices, may have been more mindful of the overall public good. Less noticed than the labour theory of copyright is the transformation of copyright in practice, at least in part, to a mechanism for protecting the labour of authors (or, with many forms of publication, the investment of publishers (Wilson, 1990)).
A potential conflict exists between public good and property rights in intellectual productions. Specifically, property rights can conflict with the freedom of expression guaranteed by the First Amendment to the United States Constitution. For example, the description or abstract of a document may approach the document described. If the document is factual, this gives property in facts, which would be a restraint on freedom of speech (Wilson, 1990).
The Feist judgement, concerned with intellectual property in telephone directories, acknowledges and reverses the labour theory of copyright:
Article I, § 8, cl. 8, of the Constitution mandates originality as a prerequisite for copyright protection. The constitutional requirement necessitates independent creation plus a modicum of creativity.
The Copyright Act of 1976 and its predecessor, the Copyright Act of 1909, leave no doubt that originality is the touchstone of copyright protection in directories and other fact-based works. The 1976 Act explains that copyright extends to 'original works of authorship,' 17 U.S.C. § 102(a), and that there can be no copyright in facts, § 102(b). A compilation is not copyrightable per se, but is copyrightable only if its facts have been 'selected, coordinated, or arranged in such a way that the resulting work as a whole constitutes an original work of authorship.' § 101 (emphasis added). Thus, the statute envisions that some ways of selecting, coordinating, and arranging data are not sufficiently original to trigger copyright protection. Even a compilation that is copyrightable receives only limited protection, for the copyright does not extend to facts contained in the compilation. § 103(b). Lower courts that adopted a 'sweat of the brow' or 'industrious collection' test -- which extended a compilation's copyright protection beyond selection and arrangement to the facts themselves -- misconstrued the 1909 Act and eschewed the fundamental axiom of copyright law that no one may copyright facts or ideas.
Rural's selection of listings -- subscribers' names, towns, and telephone numbers -- could not be more obvious and lacks the modicum of creativity necessary to transform mere selection into copyrightable expression. In fact, it is plausible to conclude that Rural did not truly 'select' to publish its subscribers' names and telephone numbers, since it was required to do so by state law. Moreover, there is nothing remotely creative about arranging names alphabetically in a white pages directory. It is an age-old practice, firmly rooted in tradition and so commonplace that it has come to be expected as a matter of course.
It may seem unfair that much of the fruit of the compiler's labor may be used by others without compensation. As Justice Brennan has correctly observed, however, this is not 'some unforeseen byproduct of a statutory scheme.' Harper & Row, 471 U.S., at 589 (dissenting opinion). It is, rather, 'the essence of copyright,' ibid., and a constitutional requirement. The primary objective of copyright is not to reward the labor of authors, but 'to promote the Progress of Science and useful Arts.' Art. I, § 8, cl. 8. Accord Twentieth Century Music Corp. v. Aiken, 422 U.S. 151, 156 (1975). To this end, copyright assures authors the right to their original expression, but encourages others to build freely upon the ideas and information conveyed by a work. (Feist, 1991)
The epistemology implied by the judgment conceives of facts existing independently of their discovery. The idea of selection, and of the degree of creativity in selection, also recurs. Most strikingly, the labour theory of copyright is reviewed, critiqued, and dismissed.
Why should this reversal of the labour theory have occurred at that time and place? The Supreme Court can override precedent and may resemble the legislature in its concern for public good and for interpreting the Constitution. The United States was being increasingly influenced by concepts of copyright held in other jurisdictions, marked by the, still minimalist, Berne Convention Implementation Act 1988 (although jurisdictions explicitly valuing dissemination above property rights, for instance the Soviet Union, would have been only indirectly influential) (Warner, 1999a). In terms of the dynamic developed here, the direct human labour involved in the selection, ordering, and presentation of data was no longer sufficiently substantial to warrant protection. The judgment regards this form of selection labour as 'an age-old practice, firmly rooted in tradition and so commonplace that it has come to be expected as a matter of course'.
A similar dynamic, principally the substitution of dead for living labour, has been detected in domains seldom viewed from a single perspective. The congruence in time, and, to some extent, in geopolitical region, between the development of the Internet and the departure from the labour theory of copyright is striking.
Can an information dynamic then be constructed (see Figure 2)?
Historically, human intellectual labour begins as semantic labour (natural signs could be regarded as analogous to objects of labour provided by nature) (Warner, 2001b). Over time, and through collective human endeavour, semantic labour can be transformed into syntactic labour, which can reduce the direct human intellectual labour required for semiotic processes. Particularly since the late 19th century, syntactic labour can be modelled and executed mechanically. In these processes of transformation, from spoken to written language, and, further, in computational modelling, a degree of exactness has to be imposed, which may reduce the vitality of the field. In the transformation to computable form, greater exactness is demanded (and this may expose as imperfect formalisations previously accepted as self-consistent). To interpret the results of these syntactic transformations, human semantic labour is required and the cycle is renewed.
Two distinct, but related, approaches could be taken to this dynamic:
First, to assert a priori that universal (machine) labour cannot be semantic in character (following Searle, 1980).
Second, to accept this, but then to suggest theoretical potentials and limitations on the transformation or modelling of semantic as syntactic labour.
The second approach promises to be more productive.
Some potentials and limitations can be suggested, connected with the self-identity of the sign and the limitations of exactness (or the exposure of the historical illusion of exactness):
Heraclitus observed that no man stepped into the same river twice. In relation to the stream of oral speech, discussions have questioned the existence of synchronic synonymy. The diachronic analogue to synonymy, replication over time, has been discovered to be difficult to establish for oral forms, considered as signals, and, for the purposes of logical translation, for the signified for oral and written forms.
The only identity required in formal logic is identity of the sign (Wittgenstein, 1981). It could be suggested that we can impose this convention of identity for certain purposes, within mathematics and logic, but not with fully publicly circulated messages. Once messages are fully in the public domain, their producers lose control over their transformation and interpretation.
Loss of control may be a source of richness. From one semiotic perspective, all tautology (and, for Wittgenstein, logic consisted of tautologies) is a refusal of life.
Recognising the potential and limitations of syntactic transformations may enhance our valuing of human intelligence and sympathies. Technology, regarded as a human construction, changes our conception of what it means to be human.
From the perspective developed here, doubt must be cast on the possibility of establishing universally consistent coding proposed for the Semantic Web. The proposal can still be assimilated to the dynamic detected, particularly to the dialectic between labour in production and in use. Further considerations would be the costs of the labour in production and the difficulty of imposing control on distributed entities.
What is the value of this analysis? Previously unrelated developments can be viewed from a common perspective, enhancing our understanding of patterns. Particularly for information retrieval, research is brought simultaneously closer to ordinary discourse understandings, everyday practice, and to the human and social sciences.
The analysis may have predictive as well as analytic value, for instance for the proliferation of syntactically based information retrieval systems (although the predictive value of analyses of human domains is complicated by the effects of analyses on the consciousness and actions of human subjects and their activities).