by Rens Bod
CSLI, 1998
eISBN: 978-1-57586-774-8 | Cloth: 978-1-57586-151-7 | Paper: 978-1-57586-150-0
Library of Congress Classification P138.5.B63 1998
Dewey Decimal Classification 410.21

ABOUT THIS BOOK
During the last few years, a new approach to language processing has started to emerge, known as "Data Oriented Parsing" or "DOP". This approach embodies the assumption that human language comprehension and production work with representations of concrete past language experiences rather than with abstract grammatical rules. The models that instantiate this approach therefore maintain corpora of linguistic representations of previously occurring utterances. New utterance representations are constructed by freely combining partial structures from the corpus. A probability model is then used to choose, from the collection of structures of different sizes, those that make up the most appropriate representation of an utterance.
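As a rough illustration of the idea of combining corpus fragments under a probability model, the sketch below assumes the simplest relative-frequency scheme often associated with DOP: a fragment's probability is its count divided by the count of all fragments with the same root label, a derivation's probability is the product of its fragments' probabilities, and a tree's probability is the sum over its derivations. The fragment counts, the string encoding of fragments, and the function names are all hypothetical and chosen only for this example; the book's own models may differ in detail.

```python
from collections import Counter
from math import prod

# Hypothetical toy fragment bag: (root label, fragment) -> count in the corpus.
# The counts are made up for illustration only.
fragment_counts = Counter({
    ("S", "S(NP VP)"): 2,
    ("S", "S(NP(she) VP)"): 1,
    ("S", "S(NP(she) VP(sleeps))"): 1,
    ("NP", "NP(she)"): 3,
    ("NP", "NP(he)"): 1,
    ("VP", "VP(sleeps)"): 2,
    ("VP", "VP(runs)"): 2,
})

def fragment_probability(root, fragment):
    """Relative frequency of a fragment among all fragments with the same root."""
    total = sum(c for (r, _), c in fragment_counts.items() if r == root)
    return fragment_counts[(root, fragment)] / total

def derivation_probability(derivation):
    """Product of the probabilities of the fragments combined in one derivation."""
    return prod(fragment_probability(r, f) for r, f in derivation)

def parse_probability(derivations):
    """Sum over all derivations that yield the same tree."""
    return sum(derivation_probability(d) for d in derivations)

# Three ways to build the tree for "she sleeps" from fragments of different sizes.
derivations = [
    [("S", "S(NP VP)"), ("NP", "NP(she)"), ("VP", "VP(sleeps)")],
    [("S", "S(NP(she) VP)"), ("VP", "VP(sleeps)")],
    [("S", "S(NP(she) VP(sleeps))")],
]

print(parse_probability(derivations))
```

The point of the toy example is that the same tree can be assembled from many small fragments or from a few large ones, and every such derivation contributes to the tree's overall probability.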

In this book, DOP models for several kinds of linguistic representations are developed, ranging from tree representations to compositional semantic representations, attribute-value representations, and dialogue representations. These models are studied from formal, linguistic, and computational perspectives and are tested on available language corpora. The main outcome of these tests suggests that the productive units of natural language cannot be defined in terms of a minimal set of rules (or constraints or principles), as is usually attempted in linguistic theory, but need to be defined in terms of a large, redundant set of previously experienced structures with virtually no restriction on their size and complexity. I will argue that this outcome has important consequences for linguistic theory, leading to a new notion of language competence. In particular, it means that the knowledge of a speaker/hearer should be understood not as a grammar but as a statistical ensemble of language experiences that changes slightly every time a new utterance is processed.
