A translation of a long work by Pradeep Dasigi
Knowledge-Aware Natural Language Understanding
Abstract
Natural Language Understanding (NLU) systems need to encode human generated text (or speech) and reason over it at a deep semantic level. Any NLU system typically involves two main components: The first is an encoder, which composes words (or other basic linguistic units) within the input utterances to compute encoded representations, which are then used as features in the second component, a predictor, to reason over the encoded inputs and produce the desired output. We argue that the utterances themselves do not contain all the information needed for understanding them and identify two kinds of additional knowledge needed to fill the gaps: background knowledge and contextual knowledge. The goal of this thesis is to build end-to-end NLU systems that encode inputs along with relevant background knowledge, and reason about them in the presence of contextual knowledge.
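The two-component view described above can be illustrated with a minimal sketch. It assumes a toy bag-of-words encoder and a hand-set linear predictor; the vocabulary, labels, and weights are hypothetical and are not taken from the thesis, which uses learned neural models.

```python
from collections import Counter
from typing import Dict, List


def encode(utterance: str, vocab: List[str]) -> List[float]:
    """Encoder: compose the words of an utterance into one fixed-size vector."""
    counts = Counter(utterance.lower().split())
    return [float(counts[word]) for word in vocab]


def predict(features: List[float], weights: Dict[str, List[float]]) -> str:
    """Predictor: score each label as a linear function of the encoding."""
    scores = {
        label: sum(w * x for w, x in zip(label_weights, features))
        for label, label_weights in weights.items()
    }
    return max(scores, key=scores.get)


# Hypothetical vocabulary and hand-set weights, for illustration only.
VOCAB = ["good", "bad", "movie"]
WEIGHTS = {"positive": [1.0, -1.0, 0.0], "negative": [-1.0, 1.0, 0.0]}

encoding = encode("a good good movie", VOCAB)
label = predict(encoding, WEIGHTS)
```

Here the predictor sees only what the encoder extracted from the utterance itself, which is the gap the abstract argues additional knowledge has to fill.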
The first part of the thesis deals with encoding background knowledge. While distributional methods for encoding sentences have been used to represent the meaning of words in context, there are other aspects of semantics that are out of their reach. These are related to commonsense or real world information which is part of shared human knowledge but is not explicitly present in the input. We address this limitation by having the encoders also encode background knowledge, and present two approaches for doing so. First, we leverage explicit symbolic knowledge from WordNet to learn ontology-grounded token-level representations of words. We show that sentence encodings based on our token representations outperform those based on off-the-shelf word embeddings at predicting prepositional phrase attachment and textual entailment. Second, we look at cases where the required background knowledge cannot be stated symbolically. We model the selectional restrictions that verbs place on their semantic role fillers to deal with one such case. We use this model to encode events, and show that these representations are better at detecting anomalies in newswire texts than sentence representations produced by LSTMs.
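One way to picture an ontology-grounded, context-sensitive token representation is as a context-weighted mixture of concept vectors. The sketch below is an assumption made for illustration: the abstract only states that WordNet knowledge is used, so the softmax weighting, the two-dimensional vectors, and the concept names standing in for WordNet synsets are all hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical stand-ins for the WordNet concepts a word can refer to.
CONCEPTS: Dict[str, List[str]] = {"bank": ["financial_institution", "sloping_land"]}
CONCEPT_VECS: Dict[str, List[float]] = {
    "financial_institution": [1.0, 0.0],
    "sloping_land": [0.0, 1.0],
}


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into weights that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_embedding(word: str, context_vec: List[float]) -> List[float]:
    """Mix the word's concept vectors, weighted by similarity to the context."""
    concepts = CONCEPTS[word]
    scores = [
        sum(c * v for c, v in zip(context_vec, CONCEPT_VECS[name]))
        for name in concepts
    ]
    weights = softmax(scores)
    return [
        sum(w * CONCEPT_VECS[name][i] for w, name in zip(weights, concepts))
        for i in range(len(context_vec))
    ]


# The same word gets different embeddings in different (toy) contexts.
emb_money = token_embedding("bank", [2.0, 0.0])
emb_river = token_embedding("bank", [0.0, 2.0])
```

The point of the sketch is only that the token vector depends on both the ontology (which concepts are candidates) and the sentence (which candidate the context favors).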
The second part focuses on reasoning with contextual knowledge. We look at Question-Answering (QA) tasks where reasoning can be expressed as sequences of discrete operations (i.e., semantic parsing problems), and the answer can be obtained by executing the sequence of operations (or logical form) grounded in some context. We do not assume the availability of logical forms, and build weakly supervised semantic parsers. This training setup comes with significant challenges since it involves searching over an exponentially large space of logical forms. To deal with these challenges, we propose 1) using a grammar to constrain the output space of the semantic parser; 2) leveraging a lexical coverage measure to ensure the relevance of produced logical forms to input utterances; and 3) a novel iterative training scheme that alternates between searching for logical forms, and maximizing the likelihood of the retrieved ones, thus effectively transferring the knowledge from simpler logical forms to more complex ones. We build neural encoder-decoder models for semantic parsing that use these techniques, and show state-of-the-art results on two complex QA tasks grounded in structured contexts.
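The first two ideas in the list above, a grammar that restricts the search space and a lexical coverage measure that ranks candidates, can be sketched on a toy table. Everything here is a hypothetical illustration: the table, the two-slot grammar, the trigger lexicon, and the coverage formula are assumptions, and the sketch omits the iterative training scheme and any neural model.

```python
from itertools import product
from typing import List, Tuple

# Toy structured context: a small table (hypothetical data).
TABLE = [{"city": "Oslo", "pop": 3}, {"city": "Rome", "pop": 28}, {"city": "Lima", "pop": 97}]

# Toy grammar: a logical form is (aggregate, column); nothing else is generated.
GRAMMAR = {"aggregate": ["max", "min", "count"], "column": ["pop"]}

# Hypothetical lexicon mapping question words to the operators they suggest.
LEXICON = {"largest": "max", "smallest": "min", "many": "count"}

Form = Tuple[str, str]


def execute(form: Form) -> int:
    """Run a logical form against the table and return its denotation."""
    aggregate, column = form
    values = [row[column] for row in TABLE]
    return {"max": max, "min": min, "count": len}[aggregate](values)


def coverage(question: str, form: Form) -> float:
    """Fraction of operators triggered by the question that the form uses."""
    words = question.lower().split()
    triggered = {op for word, op in LEXICON.items() if word in words}
    if not triggered:
        return 1.0
    return len(triggered & set(form)) / len(triggered)


def search(question: str, answer: int) -> List[Form]:
    """Enumerate grammatical forms, keep those with the right answer, rank by coverage."""
    candidates = product(GRAMMAR["aggregate"], GRAMMAR["column"])
    consistent = [form for form in candidates if execute(form) == answer]
    return sorted(consistent, key=lambda form: coverage(question, form), reverse=True)


ranked = search("how many cities are there", 3)
```

For this question both `("min", "pop")` and `("count", "pop")` evaluate to 3, so the answer alone cannot tell them apart. The coverage score ranks the `count` form first because the question contains "many", which is the kind of spurious-form filtering the weak supervision setting needs.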
Overall, this thesis presents a general framework for NLU with encoding and reasoning as the two core components, and shows how additional knowledge can augment them. While the tasks presented in this thesis are hard language understanding challenges themselves, they also serve as examples to highlight the role of background and contextual knowledge in encoding and reasoning components. The models built for the tasks provide empirical evidence for the need for additional knowledge, and pointers for building effective knowledge-aware models for other NLU tasks.
Table of Contents
1 Introduction
1.1 Natural Language Understanding
1.1.1 Definition
1.1.2 Parts of an NLU system
1.2 Knowledge
1.2.1 Background Knowledge for Encoding
1.2.2 Contextual Knowledge for Reasoning
1.3 Knowledge-Aware NLU
1.3.1 Better encoding with background knowledge
1.3.2 Reasoning with contextual knowledge
1.3.3 Evaluating NLU Performance
1.4 Thesis Contributions and Outline
I Encoding with Background Knowledge
2 Related Work: Learning to Encode
2.1 Representation Learning for Lexical Semantics
2.1.1 From distributional to distributed representations
2.2 Incorporating Knowledge
2.2.1 Multi-prototype word vectors
2.2.2 Relying on symbolic knowledge
2.3 Selectional Preference
2.4 WordNet as a source for Selectional Preferences
3 Encoding Sentences with Background Knowledge from Ontologies
3.1 Introduction
3.2 Ontology-Aware Token Embeddings
3.3 WordNet-Grounded Context-Sensitive Token Embeddings
3.4 PP Attachment
3.5 Textual Entailment
3.6 Conclusion
3.6.1 Summary
3.6.2 Future Work
4 Leveraging Selectional Preferences as Background Knowledge for Encoding Events
4.1 Understanding Events
4.2 Model For Semantic Anomaly Detection
4.2.1 Training
4.3 Data
4.4 Results
4.5 Conclusion
II Reasoning with Contextual Knowledge
5 Related Work: Learning to Reason
5.1 Learning to Reason with Latent Variables
5.2 Semantic Parsing
5.3 Weak Supervision
5.3.1 Maximum marginal likelihood training
5.3.2 Reinforcement learning methods
5.3.3 Structured learning algorithms
5.3.4 Bridging objectives
6 Constrained Decoding for Semantic Parsing
6.1 Need for Grammar Constraints
6.2 Need for Entity Linking
6.3 Grammar-Constrained Neural Semantic Parser
6.3.1 Encoder
6.3.2 Decoder
6.4 Experiments with WikiTableQuestions
6.4.1 xxx
6.4.2 xxx
6.4.3 xxx
6.4.4 xxx
6.4.5 xxx
6.4.6 xxx
6.4.7 xxx
6.4.8 xxx
6.5 Conclusion
7 Training Semantic Parsers using Iterative Coverage-Guided Search
7.1 Coverage-guided search
7.2 Iterative search
7.3 Task Details
7.4 Experiments
7.5 Related Work
7.6 Conclusion
8 Conclusion
8.1 Summary
8.2 Future work in Knowledge-Aware Encoding
8.3 Future work in Knowledge-Aware Reasoning