Distant Horizons: Digital Evidence and Literary Change
by Ted Underwood
University of Chicago Press, 2019
Cloth: 978-0-226-61266-9 | Paper: 978-0-226-61283-6 | Electronic: 978-0-226-61297-3
DOI: 10.7208/chicago/9780226612973.001.0001


Just as a traveler crossing a continent won’t sense the curvature of the earth, one lifetime of reading can’t grasp the largest patterns organizing literary history. This is the guiding premise behind Distant Horizons, which uses the scope of data newly available to us through digital libraries to tackle previously elusive questions about literature. Ted Underwood shows how digital archives and statistical tools, rather than reducing words to numbers (as is often feared), can deepen our understanding of issues that have always been central to humanistic inquiry. Without denying the usefulness of time-honored approaches like close reading, narratology, or genre studies, Underwood argues that we also need to read the larger arcs of literary change that have remained hidden from us by their sheer scale. Using both close and distant reading to trace the differentiation of genres, transformation of gender roles, and surprising persistence of aesthetic judgment, Underwood shows how digital methods can bring into focus the larger landscape of literary history and add to the beauty and complexity we value in literature.


Ted Underwood is professor of information sciences and English at the University of Illinois, Urbana-Champaign. He is also the author, most recently, of Why Literary Periods Mattered: Historical Contrast and the Prestige of English Studies.


“Distant Horizons is of compelling interest to digital humanists. But its true audience is a wider society of literary and other humanities scholars spanning across fields, periods, approaches, and levels. For this larger audience, Ted Underwood goes out of his way to make distant reading accessible, inviting, and persuasive. This innovative book is the breakout work digital humanists have been waiting for, and it is positioned to be a landmark work in literary scholarship at large.”
— Alan Liu, author of Friending the Past: The Sense of History in the Digital Age

“Distant Horizons not only proves that Ted Underwood is defining the field of cultural analytics as it emerges; it shows us why. Combining literary theory with a deep understanding of computational methods, this volume demonstrates and effectively argues that quantitative analysis is best used not to find objective truths but to explore perspectives, both historically local and theoretical. It is at once a primer for quantitative literacy and a historically sensitive exploration of gender, genre, character, and audience, putting paid once and for all to the notion that statistical methods have no place in hermeneutics.”
— Laura Mandell, author of Breaking the Book: Print Humanities in the Digital Age

"This is a substantive contribution to the debate over what Franco Moretti dubbed 'distant reading' and its place in the study of literature. Underwood engages contemporary scholarship, building and testing hypotheses based in the last 20 years of work. . . . Though technical in method, the book is engaging, and Underwood punctuates the argument with data-rich graphs and tables. The volume concludes with a healthy, skeptical consideration of the dangers of distant reading that nevertheless argues for the place of digital reading, alongside more traditional literary inquiry, as a tool for 'learning to doubt one's own perspective.'"
— Choice

“Ted Underwood’s Distant Horizons: Digital Evidence and Literary Change is an insightful account of the applications of computational methods in literary studies and a striking demonstration of the labor that goes into making them work. The book digests and draws implications from the more technical studies that Underwood has conducted during the past decade with collaborators in literature and computer science, putting them together into a coherent account of an analytical approach with which most humanists are only generally familiar. Along the way, it also offers thought-provoking accounts of periodization, genre, theme, and gender in modern English literature based on Underwood’s computational research.”
— Daniel Rosenberg, Modern Philology

"Underwood’s knowledge of his own materials and methods and his ability to explain them to uninitiated readers are truly exceptional, while he is admirably open to changing his mind and curious to try new approaches. His judicious application of computational tools to digitized corpora of modern printed texts is rightly influential, and I will continue to follow his innovations with pleasure and interest."
— Victorian Studies


DOI: 10.7208/chicago/9780226612973.003.0001
[statistical models;literary language;distant reading;novel;biography;hermeneutics;generic differentiation]
Literary scholars often understand themselves as advancing competing approaches to a limited number of known topics (authors, periods, and genres, for instance). This chapter argues that distant reading is not just another new approach to known topics, but instead expands literary scholarship by backing up to reveal genuinely new objects of study: century-spanning trends for which scholars don't yet have names. As a case in point, it examines a broad differentiating process that made English-language novels steadily less similar to nonfiction narrative from the eighteenth century through the twentieth. Parts of this process are already understood, and certain aspects of it can be described through close readings of isolated passages. But for a clear and convincing view of the whole pattern, scholars need numbers. In particular, we need statistical models. Simply measuring literature (by, say, counting words) may not provide a sufficient social foundation to interpret the patterns we discover. Statistical models can guarantee that foundation by connecting linguistic measurements to concepts that already have literary significance for an existing interpretive community.

DOI: 10.7208/chicago/9780226612973.003.0002
[genre;science fiction;detective fiction;Gothic;mystery;Edgar Allan Poe;scientific romance;genre theory;perspectival modeling;Jules Verne]
Disagreements about the history of genre are hard to resolve because contemporary readers inevitably see earlier works through a retrospective lens. We may call the novels of Mary Shelley "science fiction," for instance, but that phrase didn't exist when she wrote them, and it is difficult to be sure that the genealogical connections we perceive aren't shaped by a need to find ourselves reflected in the past. There may be no perfect solution to this dilemma, but statistical models of genre do provide a new kind of leverage on it, because a model defined only by nineteenth-century examples can be genuinely ignorant about the present, and can serve as a proxy for the literary standards of a vanished era. Using this method, the chapter examines the history of science fiction, detective fiction, and the Gothic. The generational divides predicted by Franco Moretti do not appear. Models trained only on pre-Gernsback "scientific romance" are capable of recognizing their kinship to contemporary science fiction. Detective fiction, similarly, is revealed as a tightly unified genre, with a continuity that stretches back to Edgar Allan Poe. The Gothic, on the other hand, falls apart into several different subgenres.

DOI: 10.7208/chicago/9780226612973.003.0003
[prestige;literary field;book sales;book reviewing;cultural capital;great divide;aesthetic judgment]
How durable are standards of literary judgment? Many literary histories imply that these standards can change within a few decades, as the criteria of the Enlightenment are overthrown by Romanticism, or Victorian assumptions about poetic quality are dismantled by modernism. To test these assumptions, this chapter models a boundary between prestigious works of poetry and fiction (reviewed in elite journals) and obscure works, selected randomly from a large digital library. These models of literary prestige turn out to be very stable: an entire century of judgment about poetry can be described with 79% accuracy by a single model. Moreover, a model based on 25 years of evidence can predict 25 or 50 years into the future with little loss of accuracy—even across boundaries conventionally associated with, say, a modernist poetic revolution. This evidence suggests that stability and momentum may play a larger role in literary history than we have acknowledged. The chapter closes by comparing evidence about critical prestige in fiction to evidence about book sales.

DOI: 10.7208/chicago/9780226612973.003.0004
[gender;characterization;stereotypes;authorship;masculinity in fiction;femininity in fiction;women writers]
This chapter explores the paradox that the representation of gender in fiction became more flexible while the sheer balance of attention between fictional men and women was growing more unequal. The rigidity of gendered roles can be measured by asking how easy it is to infer grammatical gender from ostensibly ungendered words used in characterization. In the nineteenth century, roles are so predictable that this inference is easy; it becomes harder as we move toward the present. But the diminishing power of stereotypes does not parallel progress toward equality of representation. On the contrary, by the middle of the twentieth century, women have lost almost half the space they occupied in nineteenth-century fiction. This tension between growing flexibility and growing inequality of representation presents literary historians with a striking paradox; a few potential explanations are considered.

DOI: 10.7208/chicago/9780226612973.003.0005
[distant reading;pleasure;digital humanities;mechanical objectivity;New Criticism;New Historicism;teaching;interdisciplinarity;big data]
Large-scale quantitative inquiry about literature is more deeply controversial than this book has so far admitted: critics claim not only that research of this kind is ineffective, but that it may be dangerous. To address that concern, this chapter explores several genuine risks of distant reading. First, preoccupation with quantitative details might lead literary scholars to forget the point of their enterprise—which is, after all, illuminating the enjoyment of poems, plays, and novels. Second, association with computers might create an illusory aura of objectivity. Third, new approaches might prove difficult to fit into the curricular framework of a literature department. The chapter suggests that the first of these dangers can be addressed with ambitious, vivid writing; it argues that the second danger is best addressed by emphasizing the intellectual foundations of distant reading and minimizing digital hype. Curricular challenges, however, may prove more difficult to surmount. In the course of exploring these genuine problems with distant reading, the chapter also dismisses a couple of spurious ones—including the presentist assumption that literature must be allied with concrete particularity, and fear-mongering about "big data."