Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparision

by David Sankoff and Joseph Kruskal

CSLI, 2001 Paper: 978-1-57586-217-0 | eISBN: 978-1-57586-711-3 Library of Congress Classification QA292.T55 1999 Dewey Decimal Classification 515.24

ABOUT THIS BOOK | TOC

ABOUT THIS BOOK Time Warps, String Edits and Macromolecules is a young classic in computational science. The computational perspective is that of sequence processing, in particular the problem of recognizing related sequences. The book is the first, and still best compilation of papers explaining how to measure distance between sequences, and how to compute that measure effectively. This is called string distance, Levenshtein distance, or edit distance. The book contains lucid explanations of the basic techniques; well-annotated examples of applications; mathematical analysis of its computational (algorithmic) complexity; and extensive discussion of the variants needed for weighted measures, timed sequences (songs), applications to continuous data, comparison of multiple sequences and extensions to tree-structures. This theory finds applications in molecular biology, speech recognition, analysis of bird song and error correcting in computer software.

TABLE OF CONTENTS

1. An overview of sequence comparison Joseph B. Kruskal
Part I. Macromolecular Sequences: 2. Recognition of patterns in genetic sequences Bruce W. Erickson and Peter H. Sellers
3. Fast algorithms to determine RNA secondary structures containing multiple loops David Sankoff, Joseph B. Kruskal, Sylvie Mainville and Robert J. Cedergren
Part II. Time-Warping, Continuous Functions, and Speech Processing
4. The symmetric time-warping problem: from continuous to discrete Joseph B. Kruskal and Mark Liberman
5. Use of dynamic programming in a syllable-based continuous speech recognition system Melvyn J. Hunt, Matthew Lennig and Paul Mermelstein
6. Application of sequence comparison to the study of bird songs David W. Bradley and Richard A. Bradley
Part III. Variations on a Theme: Algorithms for Related Problems: 7. On the complexity of the extended string-to-string correction problem Robert A. Wagner
8. An analysis of the general tree-editing problem Andrew S. Noetzel and Stanley M. Selkow
9. Simultaneous comparison of three or more sequences related by a tree David Sankoff and Robert J. Cedergren
10. An anthology of algorithms and concepts for sequence comparison Joseph B. Kruskal and David Sankoff
11. Dissimilarity measures for clustering strings James M. Coggins
Part IV. Computational Complexity: 12. Recent results on the complexity of common-subsequence problems
13. Formal-language error correction Robert A. Wagner
14. How to computer string-edit distances quickly William J. Masek and Michael S. Paterson
Part V. Random Sequences: 15. An upper-bound technique for lengths of common subsequences Vaclav Chvatal and David Sankoff
16. Probabilistic behavior of longest-common-subsequence length Joseph Deken
17. Common subsequences and monotone subsequences David Sankoff and Sylvie Mainville
Author index
Subject index.

Time Warps, String Edits and Macromolecules is a young classic in computational science. The computational perspective is that of sequence processing, in particular the problem of recognizing related sequences. The book is the first, and still best compilation of papers explaining how to measure distance between sequences, and how to compute that measure effectively. This is called string distance, Levenshtein distance, or edit distance. The book contains lucid explanations of the basic techniques; well-annotated examples of applications; mathematical analysis of its computational (algorithmic) complexity; and extensive discussion of the variants needed for weighted measures, timed sequences (songs), applications to continuous data, comparison of multiple sequences and extensions to tree-structures. This theory finds applications in molecular biology, speech recognition, analysis of bird song and error correcting in computer software.

TABLE OF CONTENTS

1. An overview of sequence comparison Joseph B. Kruskal
Part I. Macromolecular Sequences: 2. Recognition of patterns in genetic sequences Bruce W. Erickson and Peter H. Sellers
3. Fast algorithms to determine RNA secondary structures containing multiple loops David Sankoff, Joseph B. Kruskal, Sylvie Mainville and Robert J. Cedergren
Part II. Time-Warping, Continuous Functions, and Speech Processing
4. The symmetric time-warping problem: from continuous to discrete Joseph B. Kruskal and Mark Liberman
5. Use of dynamic programming in a syllable-based continuous speech recognition system Melvyn J. Hunt, Matthew Lennig and Paul Mermelstein
6. Application of sequence comparison to the study of bird songs David W. Bradley and Richard A. Bradley
Part III. Variations on a Theme: Algorithms for Related Problems: 7. On the complexity of the extended string-to-string correction problem Robert A. Wagner
8. An analysis of the general tree-editing problem Andrew S. Noetzel and Stanley M. Selkow
9. Simultaneous comparison of three or more sequences related by a tree David Sankoff and Robert J. Cedergren
10. An anthology of algorithms and concepts for sequence comparison Joseph B. Kruskal and David Sankoff
11. Dissimilarity measures for clustering strings James M. Coggins
Part IV. Computational Complexity: 12. Recent results on the complexity of common-subsequence problems
13. Formal-language error correction Robert A. Wagner
14. How to computer string-edit distances quickly William J. Masek and Michael S. Paterson
Part V. Random Sequences: 15. An upper-bound technique for lengths of common subsequences Vaclav Chvatal and David Sankoff
16. Probabilistic behavior of longest-common-subsequence length Joseph Deken
17. Common subsequences and monotone subsequences David Sankoff and Sylvie Mainville
Author index
Subject index.