Life Out of Sequence: A Data-Driven History of Bioinformatics
by Hallam Stevens
University of Chicago Press, 2013
Cloth: 978-0-226-08017-8 | Paper: 978-0-226-08020-8 | Electronic: 978-0-226-08034-5

ABOUT THIS BOOK

Thirty years ago, the most likely place to find a biologist was standing at a laboratory bench, peering down a microscope, surrounded by flasks of chemicals and petri dishes full of bacteria. Today, you are just as likely to find him or her in a room that looks more like an office, poring over lines of code on computer screens. The use of computers in biology has radically transformed who biologists are, what they do, and how they understand life. In Life Out of Sequence, Hallam Stevens looks inside this new landscape of digital scientific work.
           
Stevens chronicles the emergence of bioinformatics—the mode of working across and between biology, computing, mathematics, and statistics—from the 1960s to the present, seeking to understand how knowledge about life is made in and through virtual spaces. He shows how scientific data moves from living organisms into DNA sequencing machines, through software, and into databases, images, and scientific publications. What he reveals is a biology very different from the one of predigital days: a biology that includes not only biologists but also highly interdisciplinary teams of managers and workers; a biology that is more centered on DNA sequencing, but one that understands sequence in terms of dynamic cascades and highly interconnected networks. Life Out of Sequence thus offers the computational biology community welcome context for their own work while also giving the public a frontline perspective of what is going on in this rapidly changing field.

AUTHOR BIOGRAPHY

Hallam Stevens is assistant professor at Nanyang Technological University in Singapore, where he teaches classes on the history of the life sciences and the history of information technology.

REVIEWS

“What happens to biology with computerization? Hallam Stevens’s compelling ethnographic and historical narrative shows how the nature of the biological experiment has changed with the increasing use of the tools of information technology in life science and biomedicine. Life Out of Sequence traces rearrangements in the relationship between the virtual and the material as scientists work increasingly on databases rather than cells or bodies. As the book takes on the mirrored questions of the work of life and the life of work in front of the computer in the lab, the reader is brought into the world of bioinformatics, and comes to understand that this is not just a subfield of scientific activity, but a space in which the nature of knowledge production in life science is undergoing fundamental and rapid change.”
— Hannah Landecker, University of California, Los Angeles

“What is it like to do biology when the indispensable scientific instrument has become the computer, when biological objects are transformed into computer-compatible data, and when the manipulation of data replaces the manipulation of organisms and their parts? Life Out of Sequence is a vivid account of how the flow of massive amounts of data has fundamentally changed both the questions biologists ask and the answers they recognize. It is essential reading for anyone wanting to understand the world that biologists have made for themselves and that they are making for the rest of us.”
— Steven Shapin, author of The Scientific Life

“A rich and fascinating ethnographic and historical account of the transformations wrought by integrating statistical and computational methods and materials into the biological sciences. . . . The histories of biology, computing, database technology, and bioinformatic imaging all play a role in this wonderfully transdisciplinary story.”
— Carla Nappi, New Books in Science, Technology, and Society

“[A] sharp and lucid work of history and anthropology of science. . . . [Stevens’s] clear and refined prose should extend the book’s readership beyond its disciplinary audiences in the social studies of science, to welcome scientists into this reading of their field’s past and present. . . . Stevens provides a highly readable telling of how bioinformatics took shape, how it works within technological and conceptual limits that change over time, and how individual, and mostly unsung, scientists made it happen. . . . An effective and enjoyable remolding of oversimplified ‘data-to-truth’ histories of science, Life Out of Sequence draws out the reciprocal impressions made by data systems and living systems on each other—and on the sense scientists make of life.”
— Michael Fortun, Rensselaer Polytechnic Institute, Science

“Stevens presents engaging ethnographic fieldwork throughout the book. . . . An interesting read for life and computational scientists seeking a deeper understanding of the interdisciplinary connections of their domains.”
— D. Papamichail, College of New Jersey, Choice

“Readers benefit from the book’s extensive source material, as well as from dozens of interviews with scientists who have widely divergent views on bioinformatics.”
— BioScience

TABLE OF CONTENTS

- Hallam Stevens
DOI: 10.7208/chicago/9780226080345.003.0001
This chapter is available at:
    University Press Scholarship Online

- Hallam Stevens
DOI: 10.7208/chicago/9780226080345.003.0002
[Walter Goad, James Ostell, GenBank, Robert S. Ledley, operations research]
This chapter summarizes the migration of the computer from physics into biology. Although computers were used in biology before the late 1970s, it was the invention of DNA sequencing that did much to change both the direction of biological research and the relationship of biology with computing. Since the early 1980s, the amount of sequence data has continued to grow at an exponential rate. The computer came to seem a perfect tool with which to cope with the overwhelming flow of data. This transition is explored through two case studies – the first of Walter Goad, a physicist who turned his computational skills towards biology in the 1960s; and the second of James Ostell, a computationally minded PhD student in biology at Harvard in the 1980s. These examples show how the practices of computer use were imported from physics into biology and struggled to establish themselves there. These practices came to be established as a distinct sub-discipline of biology – bioinformatics – during the 1990s. (pages 13 - 42)

- Hallam Stevens
DOI: 10.7208/chicago/9780226080345.003.0003
[data-driven, hypothesis-free, statistics]
This chapter follows bioinformatic work into the present, describing its work in action. Bioinformatic practices – derived from computing practices in physics – re-orient biology towards particular sets of questions that are distinct from those of pre-informatic biology. Most importantly, these practices allow biologists to pose and answer general questions. Lab bench experimentation usually permitted biologists to study only a single gene, a single cell, or a single protein at once. The availability, scope, and shape of data enable questions focused on whole genomes, or hundreds of proteins, or on many organisms at once. This is accomplished through the use of statistical methods and approaches; large amounts of data are analyzed and understood statistically. These two elements – general questions and statistical approaches – mark a major break with non-bioinformatic or pre-informatic biology. Some of biologists’ concerns about this transformation to ‘bioinformatic’ biology are manifest in the debates about ‘hypothesis free’ or ‘data driven’ biology. (pages 43 - 70)

- Hallam Stevens
DOI: 10.7208/chicago/9780226080345.003.0004
[production, consumption, Erving Goffman, architecture, kanban, 5S, Six Sigma, lean production]
What do the spaces in which bioinformatic knowledge is produced look like? How are they arranged? How do people move around in them? What difference does this make to the knowledge that is produced? The dynamics of data exchange have driven a spatial reorganization of biological work. That is, data work demands that people and laboratories are arranged and organized in specific ways. The arrangements of walls, hallways, offices, benches and the movements amongst them are crucial in certifying and authorizing bioinformatic knowledge – the motion of data through space and between people renders it more or less valuable, more or less plausible. Because of this, spatial motion is also bound up with struggles between different kinds of work and the value of different kinds of knowledge in contemporary biology. (pages 71 - 106)

- Hallam Stevens
DOI: 10.7208/chicago/9780226080345.003.0005
[virtual, sequencing pipeline, ontologies, gene ontology, open biomedical ontologies, genomic standards consortium, distributed annotation system, Ensembl]
This chapter describes the organization of biological objects and work in virtual space. It shows how biology becomes virtual and how virtual objects are organized to produce knowledge. This flattening into data is a complex and contested process – samples do not just automatically become data. Although the notion of a sequencing ‘pipeline’ suggests linearity, this chapter describes the messy, contingent process of rendering samples into data. Likewise, producing biological ‘ontologies’ means creating standardized, common languages for speaking about biology. Such standard, computable objects are necessary for bioinformatic work to be possible. Bioinformatics standardizes and flattens biological language into data. The chapter follows the movement of bioinformatic objects in virtual space. Standard, computable formats allow knowledge to be produced by the careful movement and arrangement of data in virtual space. (pages 107 - 136)

- Hallam Stevens
DOI: 10.7208/chicago/9780226080345.003.0006
[Edgar F. Codd, Margaret O. Dayhoff, Atlas of Protein Sequence and Structure, GenBank, flat-file database, relational database, theoretical biology]
This chapter explores the role of databases in scientific knowledge-making using one prominent example: GenBank. Biological databases, organized with computers, cannot be thought of as just collections. Instead, biological databases are orderings of biological materials. They provide ways of dividing up the biological world; they are tools that biologists use and interact with. Databases store information within carefully crafted digital structures. Computer databases construct orderings of scientific knowledge: they are powerful classification schemes that make some information accessible and some relationships obvious, while making other orderings and relationships less natural and familiar. Organizing and linking sequence elements in databases can be understood as a way of representing the connections between those elements in real organisms. The database becomes a digital idealization of a living system, emphasizing particular relationships between particular objects. (pages 137 - 170)

- Hallam Stevens
DOI: 10.7208/chicago/9780226080345.003.0007
[visualization, Ensembl, Project MAC, ACeDB, genome browser, heat map, Ben Fry]
This chapter follows the data out of databases and into the visual realm. Visualizations are not produced as the end results of biological work, afterthoughts prepared for publication or for presentations, but form an integral part of how computational biologists think about their own work and communicate it to their collaborators. Arranging data into images forms a crucial part of knowledge production in contemporary biology – it is often through visualizations that data are made into knowledge. A large part of bioinformatics is about solving problems of representation. As with databases, the choice of visual representation has a determinative effect on the knowledge and objects produced through it. Computational representations of biological objects involve decisions about what features of an object are represented and how they are related to one another. When an object is represented in an image, it is poured into a carefully contrived structure. As with databases, decisions about visualization techniques and tools are decisions about how to constitute an object itself. Visualization in bioinformatics is – through and through – analysis and quantification. (pages 171 - 202)

- Hallam Stevens
DOI: 10.7208/chicago/9780226080345.003.0008
[next-generation sequencing, personal genomics, Genome Wide Association Studies, Web 3.0, Semantic Web]
Where is sequence likely to take biology in the near future? Sequence is not going away: next-generation sequencing machines are making more and more sequence and more and more data an increasingly taken-for-granted part of biology. The ways in which this increasingly massive amount of data is managed are likely to become ever more entangled with the management of data in other domains, especially with Web-based technology. Bioinformatics will become just one of many data management problems. This will have consequences not only for biological work, but also – as the results of bioinformatics are deployed in medicine – consequences for our understanding of our bodies. These computational approaches may become so ubiquitous that a ‘bioinformatics’ – as distinct from other kinds of biology – will disappear as a meaningful term of reference. (pages 203 - 224)