LINGUIST List 33.135
Mon Jan 17 2022
Review: Cognitive Science; Computational Linguistics: McShane, Nirenburg (2021)
Editor for this issue: Billy Dickson <billyd@linguistlist.org>
Date: 10-Dec-2021
From: Myrthe Reuver <myrthe.reuver@vu.nl>
Subject: Linguistics for the Age of AI
Book announced at
https://linguistlist.org/issues/32/32-2038.html
AUTHOR: Marjorie McShane
AUTHOR: Sergei Nirenburg
TITLE: Linguistics for the Age of AI
PUBLISHER: MIT Press
YEAR: 2021
REVIEWER: Myrthe Reuver, Vrije Universiteit Amsterdam
INTRODUCTION
The book “Linguistics for the Age of AI” by Marjorie McShane and Sergei Nirenburg identifies its main goal as providing an accessible, multidisciplinary description of modelling language understanding for an artificial agent, for readers with diverse backgrounds: linguistics, cognitive science, Natural Language Processing (NLP), and Artificial Intelligence (AI). It describes the process of developing an artificial agent with human-level language ability: a Language-Endowed Intelligent Agent (LEIA). The focus is on the agent’s processing of language after speech processing, i.e. of already transcribed text. The book is not meant as a textbook, but the authors have used it as one for linguistics students, and it does contain exercises at the end of each chapter.
SUMMARY
“Setting the Stage”
The book’s introduction explains the aim to build and assess an artificial agent holistically, rather than focusing research on sub-components or tasks (as is common in NLP). The book presents the authors’ work towards this goal, spanning several decades. Its content covers many linguistic subfields (syntax, semantics, and pragmatics) as well as NLP tasks (coreference resolution, PoS tagging, and machine reasoning). The focus is on knowledge-based systems rather than machine learning: machine learning should only be used to extend knowledge bases, which is identified here as the “most promising path” (p. xvi) for intelligent agents.
CHAPTER 1 – “Our Vision of Linguistics for the Age of AI”
This chapter covers what the book considers an intelligent language-endowed agent: one able to extract meaning from utterances, process language input on several levels (syntax, semantics, discourse), continuously learn new words and concepts, and explain its decisions in natural language. The chapter then details the complexities of processing language, such as different types of ambiguity, showing that these goals are not trivial. What follows is a short history of NLP. The chapter argues for knowledge-based systems over machine learning: the knowledge-based approach is argued to be more linguistically informed, more cognitively plausible (due to rule-based reasoning), and more explainable and trustworthy than systems built with other (machine learning) approaches. The approach is “deep Natural Language Understanding (NLU) with a cognitive systems approach” (p. 37; ‘deep’ here does not refer to deep learning), comparable to Cyc (Lenat, 1995) and TRIPS (Allen et al., 2005). The theoretical framework used is Ontological Semantics (Nirenburg & Raskin, 2004), and a pre-defined ontology of 30,000 word senses is central to the agent’s processing of language.
CHAPTER 2 – A Brief Overview of Natural Language Understanding by LEIAs
This chapter introduces the five steps of language processing by the intelligent agent: (1) Pre-Semantic Analysis, (2) Basic Semantic Analysis, (3) Basic Coreference Resolution, (4) Extended Semantic Analysis, and (5) Situational Reasoning. Each of these is discussed in detail in Chapters 3 to 7. The agent’s goal is to process the input until the understanding is “actionable”: after each step, it assesses whether processing has led to a sufficient understanding for taking an action. The chapter also introduces the term “micro-theories”: precisely defined linguistic problem spaces (such as lexical disambiguation) that need to be tackled in order to have a well-functioning agent. Micro-theories identify *simple* versus *difficult* examples in these problem spaces, with theoretical grounding on why they are easy or difficult. The chapter also briefly introduces the lexicon: words are connected to word senses, an example of use, and concept properties (“red” is an attribute of “red car”).
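To make this staged, stop-when-actionable control flow concrete, here is a minimal Python sketch. The stage names follow the book, but everything else (the confidence scores, their increments, and the threshold) is my own illustrative assumption, not the LEIA implementation:

```python
def make_stage(name, gain):
    """Stand-in analysis stage: records its name and raises confidence."""
    def stage(analysis):
        updated = dict(analysis, last_stage=name)
        updated["confidence"] = min(1.0, updated["confidence"] + gain)
        return updated
    return stage

# Stage names follow the book; the confidence gains are invented.
STAGES = [
    make_stage("pre-semantic analysis", 0.5),
    make_stage("basic semantic analysis", 0.25),
    make_stage("basic coreference resolution", 0.25),
    make_stage("extended semantic analysis", 0.25),
    make_stage("situational reasoning", 0.25),
]

def process_utterance(text, threshold=0.9):
    """Run the stages in order, stopping once understanding is 'actionable'."""
    analysis = {"text": text, "confidence": 0.0}
    for stage in STAGES:
        analysis = stage(analysis)
        if analysis["confidence"] >= threshold:  # actionable: act, stop early
            break
    return analysis
```

With these invented gains, a simple utterance would be deemed actionable after basic coreference resolution, and the later, more expensive stages would be skipped — which is the core design idea the chapter describes.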
CHAPTER 3 – Pre-Semantic Analysis and Integration
This chapter describes the processing of text before the semantic analyses: tokenizing the text into separate words, part of speech (PoS) tagging of individual words (into nouns, verbs, adverbs, etc.), mapping the words to entries in the lexicon, and syntactic parsing. The chapter describes the use of parsing tools developed by external parties, including Stanford CoreNLP, and fuzzy matching of words with no clear lexicon match.
CHAPTER 4 – Basic Semantic Analysis
The Basic Semantic Analysis stage has the intelligent agent deal with linguistic phenomena present in the local dependency structure for the meaning representation. These are phenomena like modality and aspect, as well as understanding speech acts and the differences between questions and imperatives. The meaning representation can also contain relation types (e.g. “wolf” is the EXPERIENCER of “fear”). The chapter also describes the agent’s handling of metaphors and idioms: with templates of pre-defined meanings (e.g. “Person X is running out of steam” is recorded as “Person X experiences FATIGUE > .8 intensity”, p. 184). This chapter also addresses several forms of ellipsis (verb phrase, noun phrase, and event), e.g. “[..] an environment in which fruit existed, but candy didn’t __”, p. 192. One solution for ellipsis is reconstructing full sentences from pre-defined templates in the lexicon (e.g. re-adding an experiencer or agent). At this stage, unknown words receive an underspecified semantic meaning based on the earlier syntactic processing.
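The template-based idiom handling can be sketched roughly as pattern-to-frame mapping. The “running out of steam” example and its FATIGUE frame are paraphrased from the book (p. 184); the matching machinery below is my own stand-in, not the book’s system:

```python
import re

# Each template pairs a surface pattern with a meaning frame; the frame has a
# slot for the subject so the idiom's experiencer is filled in from the input.
IDIOM_TEMPLATES = [
    (re.compile(r"^(?P<subject>\w+) (is|was) running out of steam$", re.IGNORECASE),
     lambda m: {"experiencer": m.group("subject"),
                "event": "FATIGUE", "intensity": "> .8"}),
]

def analyze_idiom(sentence):
    """Return a meaning frame if a known idiom template matches, else None."""
    for pattern, build_frame in IDIOM_TEMPLATES:
        match = pattern.match(sentence.strip())
        if match:
            return build_frame(match)
    return None
```

So "Mary is running out of steam" yields a FATIGUE frame with Mary as experiencer, while a literal sentence falls through to ordinary compositional analysis.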
CHAPTER 5 – Basic Coreference Resolution
Coreference resolution is the task of linking different mentions in a text (e.g. “he”, “the bartender”, and “John”) to the same entity – the referent can also be a verb phrase or event mention. The chapter starts with an extensive introduction to coreference resolution with examples, and then dives into several challenges for an artificial agent processing language, such as ellipsis and non-referential relationships. The chapter details a step-wise approach: first using Stanford’s CoreNLP for simple coreference resolutions, and then using heuristics, templates, and rule-based solutions for complex references with low confidence scores. Examples of such solutions include sentence simplification by removing additional clauses, and checking for mismatches in properties of entities (e.g. a red versus a blue car). These solutions require no domain-specific knowledge or extensive processing power.
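The step-wise strategy the chapter describes — trust an external resolver when confidence is high, otherwise fall back to cheap heuristics such as property-mismatch checks — could be sketched as follows. All functions here are illustrative stand-ins (the "external resolver" is a toy, not CoreNLP), and the threshold is invented:

```python
def external_resolver(mention, candidates):
    """Toy stand-in for an off-the-shelf tool: returns (antecedent, confidence)."""
    for cand in candidates:
        if cand.lower() == mention.lower():   # exact match: high confidence
            return cand, 0.95
    return (candidates[-1], 0.4) if candidates else (None, 0.0)

def property_mismatch(mention, candidate, properties):
    """Rule out antecedents whose recorded properties clash (e.g. red vs. blue car)."""
    m_props = properties.get(mention, {})
    c_props = properties.get(candidate, {})
    return any(m_props.get(k) != v for k, v in c_props.items() if k in m_props)

def resolve(mention, candidates, properties, threshold=0.8):
    antecedent, confidence = external_resolver(mention, candidates)
    if confidence >= threshold:
        return antecedent
    # Low confidence: fall back to rule-based filtering of property clashes.
    viable = [c for c in candidates if not property_mismatch(mention, c, properties)]
    return viable[-1] if viable else antecedent  # prefer most recent viable mention
```

For instance, if "it" is recorded as red, the blue car is filtered out as an antecedent even when the external tool is unsure — the kind of lightweight check the chapter notes requires no domain knowledge or heavy processing.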
CHAPTER 6 – Extended Semantic Analysis
This stage of processing deals with issues that prevented a fully resolved meaning representation in earlier stages. For ambiguities, one strategy is to use real-world knowledge, such as that a “cow” who eats grass likely refers to the animal and is not meant as a derogatory term for a woman (p. 251). Incongruities can be solved with knowledge about metonymy (“the red hair did it”) and templates without ‘typical’ prepositions. Atypical use of idioms can be tackled by allowing modifiers in idiom templates. Indirect modification (e.g. in a “bloodthirsty chase”, the ANIMAL is bloodthirsty, not the chase) can be solved with semantic knowledge from the ontology. Simple underspecification issues can be solved with ontology knowledge (e.g. a flight at a nighttime hour maps to the concept “night flight”) or calculation (e.g. “3 hours later”). Fragments are processed as text strings that can carry some semantic meaning.
CHAPTER 7 – Situational Reasoning
The chapter starts by outlining that while *human* communication usually relies on awareness of situational context and the topic of conversation, an artificial agent does not always need this awareness for human-like performance. The chapter also discusses details of the agent’s cognitive architecture, called OntoAgent. Situational reasoning solves the last difficulties in speech act ambiguity (a command is more likely in one setting, a question in another). It also aids coreference resolution and disambiguation by using contextual information such as the SOCIAL ROLE of different humans and other ontological knowledge (e.g. Joe is PRESIDENT, so he is more likely to give a speech).
CHAPTER 8 – Agent Applications: The Rationale for Deep, Integrated NLU
This chapter introduces a case study for the agent: the Maryland Virtual Patient system, a simulation system for clinicians in training. The aim is an explainable system that simulates clinically and behaviorally realistic (both likely and unlikely) clinical scenarios. Physiology is modelled with an ontology of diseases, symptoms, and clinical observations, with many causal and other connections between symptoms and diseases, and an IF/ELSE decision system. The chapter covers an extensive example on GERD (gastroesophageal reflux disease) in this physiological system, and then describes how individuals can design instances of patients with different genders, weights, and personality traits. It may be possible for the agent to extract disease aspects from text, for instance by filling in templates from case studies. The chapter also discusses how virtual agents can help mitigate clinicians’ biases, and concludes that intelligent systems require complex and expensive expert knowledge.
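As a minimal illustration of what such an IF/ELSE decision system over observations looks like, here is a toy severity rule. The symptom names, threshold, and labels are invented for illustration and are not taken from the Maryland Virtual Patient system:

```python
def gerd_severity(observations):
    """Toy IF/ELSE rule mapping clinical observations to a coarse severity label."""
    heartburn = observations.get("heartburn_per_week", 0)
    if heartburn == 0:
        return "none"
    elif heartburn < 2:
        return "mild"
    elif observations.get("esophagitis", False):
        return "severe"      # visible tissue damage escalates the label
    else:
        return "moderate"
```

The real system chains many such rules through the disease ontology; the point is only that the decision logic is explicit and inspectable, which is what makes the simulation explainable.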
CHAPTER 9 – Measuring Progress
Chapter 9 cites Hobbs (2004) as work that addresses the complex evaluation needs of holistic approaches: demonstrations may “dazzle observers for the wrong reasons.” However, the task-based evaluation that is common in NLP is also deemed inadequate for measuring progress on intelligent agents. Such conventional evaluation would mean separate evaluations for different tasks (e.g. PoS tagging and disambiguation) by means of an annotated test set the system is scored on. The chapter argues this does not translate to real-world performance and usability. Instead, the chapter advocates six microtheory-focused evaluations as well as two holistic ones. Microtheory-based evaluations concern problem spaces such as nominal compounding and verb phrase ellipsis. These evaluation experiments were aimed at extensive post-hoc error analysis with linguistic (theoretical) knowledge, without a test set with high inter-annotator agreement. Holistic performance measurement meant the agent processed English sentences from step 1 to 4 (that is, all except situational reasoning). The meaning representations of these sentences were deemed satisfactory in a qualitative analysis, with “quite a number” (p. 374) of correct meaning representations.
EPILOGUE
The book concludes with how building an intelligent agent requires looking beyond individual linguistic tasks and sub-fields, and how the agent, like humans, will be set to learn new concepts and ideas continuously. It argues that a linguistic focus in NLP has been under-explored, and that this research program may excite linguists to work on language-endowed intelligent agents.
EVALUATION
i. LINGUISTIC THEORY
This book describes a research agenda and efforts towards designing an intelligent agent that is also evaluated holistically. The book provides a convincing case for how ambitious and exhaustive this goal is, with many linguistic examples illustrating the complexity of everyday language use. NLP is a field with its roots in linguistics, and this book is strong in connecting NLP problems and methods to precise linguistic problems and explanations. “Linguistics for the Age of AI” demonstrates an extensive knowledge of (theoretical) linguistics, both in syntax and semantics. I think this benefits understanding of the specific problems, and especially the errors, that are likely to occur in language processing by artificial agents.
The small problem spaces the book calls “micro-theories” are an interesting concept that could indeed lead to building computational tools and solutions for linguistic problems, as many current theories and hypotheses in the (theoretical) linguistic literature are not well-suited to building and testing computational methods. However, there is recent research focusing on linguistic knowledge in state-of-the-art neural models, with testing on linguistic phenomena such as negative polarity items (Jumelet & Hupkes, 2018; Weber et al., 2021) and syntactic agreement mechanisms (Linzen et al., 2016; Finlayson et al., 2021).
ii. DISCUSSION OF RECENT DEVELOPMENTS
This book focuses solely on symbolic AI for the design of an artificial agent. Current state-of-the-art NLP methods, such as machine learning and deep learning, are not mentioned. One such method is pre-trained contextual word embeddings from Transformers (Devlin et al., 2019), which took the computational linguistics field by storm due to better performance (e.g. fewer errors) than other approaches in tasks from coreference resolution (Joshi et al., 2019) to machine translation (Wang et al., 2019). While the book does provide arguments for why these methods are not discussed (knowledge-based approaches are found to be less data-dependent, and more realistic for an open-domain artificial agent), I find these arguments less convincing and not a good reason for omitting these well-performing approaches entirely. Deep learning and machine learning are not only currently “receiving the most buzz” (p. xvi) in the NLP field, but have also led to considerable improvements in performance on many subtasks of the artificial agent’s processing of language. Readers interested in how these latest developments work can consult the latest (online) edition of the seminal NLP textbook by Jurafsky & Martin (2021).
These methods also underpin the work on event coreference resolution by Zeng et al. (2020), which shows that machine learning techniques such as semantic embeddings help identify paraphrased event mentions. Methods based on machine learning and deep learning are also found to perform best in systematic comparisons of commercial systems for (subtasks of) dialogue understanding, such as intent classification and entity recognition (Liu et al., 2021). While qualitative analysis and error analysis are very important, such benchmarks and quantitative comparisons on the same test set allow us to see which models perform better on subtasks of NLU such as coreference resolution.
iii. APPROPRIATENESS FOR AUDIENCE
Many chapters start with a detailed explanation of the task or sub-field in question, with several sentences and linguistic examples to illustrate the challenges for the artificial agent, before diving into potential solutions. Sometimes this is needed, but the explanations are occasionally so detailed that readers from the AI and NLP fields without a strong background in (basic) theoretical linguistics may feel lost. This appears to be the opposite of the book’s intent. Linguists are also part of the book’s intended audience, but they will not get the current state of the art in NLP and AI from this book, as few works or approaches from after 2016 are mentioned.
Examples sometimes seem dissonant and inappropriate. The inspiration mentioned at the beginning of the book is the agent HAL from Kubrick’s film 2001: A Space Odyssey – but this intelligent agent turns out to be malfunctioning and kills its human team members. Another inappropriate example mentions disambiguation of “cow” as referring either to a woman or an animal (example from p. 251). While one could argue that examples are only minor aspects of a scientific work, I would argue they are crucial to its contextualization, understanding, and framing.
Additionally, one scenario the book extensively describes is an “autonomous agent” used in combat and warfare situations, a scenario with considerable social and ethical responsibility that goes undiscussed. The ethical and social responsibility of conversational AI has become central to the NLP field, for instance in Ruane et al. (2019), who mention the importance of understanding underlying values when designing a conversational system and emphasize: “Language is inherently social, cultural, contextual, and historical, which means that the design of agent dialogue necessarily reflects a particular worldview.” (p. 107). They stress the importance of taking responsibility for such a worldview and its specific harms. This book does not consider these aspects, while they are currently very prominent in the NLP field at large and in conversational AI in particular.
iv. CONCLUSION
“Linguistics for the Age of AI” provides an extensive case study of designing an artificial agent’s processing of language, grounded especially in knowledge of linguistic theory. It also attempts to bridge the gap between theoretical linguistics and language understanding for artificial agents. However, the book does not mention current developments in NLP and NLU, such as the use of machine learning and recent attention to societal, cultural, and ethical considerations, which makes it less suitable as a current overview of this field and its approaches. Its described evaluation practices also make it difficult to compare this system to similar systems. Additionally, the book contains some examples and case study scenarios that appear less thoughtful. Nonetheless, the book convincingly displays how complex the processing of dialogue is.
REFERENCES
Allen, J., Ferguson, G., Stent, A., Stoness, S. C., Swift, M., Galescu, L., ... & Campana, E. (2005). Two diverse systems built using generic components for spoken dialogue (Recent Progress on TRIPS). In Proceedings of the ACL Interactive Poster and Demonstration Sessions (pp. 85-88)
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT 2019 (pp. 4171–4186)
Finlayson, M., Mueller, A., Gehrmann, S., Shieber, S., Linzen, T., & Belinkov, Y. (2021). Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models. In Proceedings of ACL-IJCNLP 2021 (pp. 1828-1843).
Hobbs, J. R. (2004). Some Notes on Performance Evaluation for Natural Language Systems. https://www.isi.edu/~hobbs/performance-evaluation/performance-evaluation.html
Jumelet, J., & Hupkes, D. (2018). Do Language Models Understand Anything? On the Ability of LSTMs to Understand Negative Polarity Items. In Proceedings of the BlackboxNLP Workshop, co-located with EMNLP 2018 (pp. 222-231).
Joshi, M., Levy, O., Zettlemoyer, L., & Weld, D. S. (2019). BERT for Coreference Resolution: Baselines and Analysis. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 5803-5808).
Jurafsky, D., & Martin, J. H. (2021). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition.
https://web.stanford.edu/~jurafsky/slp3/
Lee, K., He, L., Lewis, M., & Zettlemoyer, L. (2017). End-to-end Neural Coreference Resolution. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 188-197).
Lenat, D. B. (1995). CYC: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11), pp. 33-38.
Liu, X., Eshghi, A., Swietojanski, P., & Rieser, V. (2021). Benchmarking natural language understanding services for building conversational agents. In Increasing Naturalness and Flexibility in Spoken Dialogue Interaction (pp. 165-183). Springer, Singapore.
Linzen, T., Dupoux, E., & Goldberg, Y. (2016). Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Transactions of the Association for Computational Linguistics, 4 (pp. 521-535).
Nirenburg, S., & Raskin, V. (2004). Ontological semantics. MIT Press.
Ruane, E., Birhane, A., & Ventresque, A. (2019). Conversational AI: Social and Ethical Considerations. In AICS (pp. 104-115).
Weber, L., Jumelet, J., Bruni, E., & Hupkes, D. (2021). Language Modelling as a Multi-Task Problem. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (pp. 2049-2060).
Wang, Q., Li, B., Xiao, T., Zhu, J., Li, C., Wong, D. F., & Chao, L. S. (2019). Learning Deep Transformer Models for Machine Translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 1810-1822).
Zeng, Y., Jin, X., Guan, S., Guo, J., & Cheng, X. (2020). Event coreference resolution with their paraphrases and argument-aware embeddings. In Proceedings of the 28th International Conference on Computational Linguistics (pp. 3084-3094).
ABOUT THE REVIEWER
Myrthe Reuver is a PhD candidate in computational linguistics at the Vrije Universiteit (VU) Amsterdam, supervised by Antske Fokkens (Vrije Universiteit Amsterdam) and Suzan Verberne (Leiden University). Her research interest is societally and scientifically responsible Natural Language Processing in complex domains, and her current research focuses on viewpoint diversity in news recommendation.
Page Updated: 17-Jan-2022