Paper notes: Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases

Research background:

A critical challenge in constructing a natural language interface to database (NLIDB) is bridging the semantic gap between a natural language query (NLQ) and the underlying data. When translating an NLQ to SQL, this challenge arises in two specific problems: (1) keyword mapping and (2) join path inference.

Research content:

  • Our approach is to use the information in the SQL query log of a database to select more likely keyword mappings and join paths for SQL translations of NLQs.
  • We propose a system, TEMPLAR, which augments existing pipeline-based NLIDBs with SQL query log information.
  • We model the information in the SQL query log in a data structure called the Query Fragment Graph (QFG), and use this information to improve the ability of existing NLIDBs to perform keyword mapping and join path inference. The QFG stores information on query fragment occurrences in the log, as well as co-occurrence relationships between each pair of query fragments.
  • Our goal is to augment, rather than replace, NLIDBs.
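The QFG description above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the paper's implementation: the class name `QueryFragmentGraph`, the representation of a fragment as a `(clause, expression)` tuple, and the sample log are all hypothetical, and the fragments are assumed to have already been extracted from the logged SQL.

```python
from collections import Counter
from itertools import combinations


class QueryFragmentGraph:
    """Hypothetical QFG: nodes are query fragments, edges are co-occurrences."""

    def __init__(self):
        self.occurrences = Counter()    # fragment -> number of logged queries containing it
        self.cooccurrences = Counter()  # (fragment_a, fragment_b) -> number of shared queries

    def add_query(self, fragments):
        # Count each fragment once per query, in a canonical (sorted) order.
        unique = sorted(set(fragments))
        self.occurrences.update(unique)
        for a, b in combinations(unique, 2):
            self.cooccurrences[(a, b)] += 1

    def cooccurrence(self, a, b):
        # Pairs are stored with the smaller fragment first.
        key = (a, b) if a <= b else (b, a)
        return self.cooccurrences[key]


# Assumed sample log: each query is given as its pre-extracted fragments.
log = [
    [("SELECT", "publication.title"), ("FROM", "publication"),
     ("FROM", "journal"), ("WHERE", "journal.name = ?")],
    [("SELECT", "publication.title"), ("FROM", "publication"),
     ("WHERE", "publication.year > ?")],
    [("SELECT", "journal.name"), ("FROM", "journal")],
]

qfg = QueryFragmentGraph()
for query_fragments in log:
    qfg.add_query(query_fragments)

print(qfg.occurrences[("SELECT", "publication.title")])
print(qfg.cooccurrence(("FROM", "journal"), ("SELECT", "publication.title")))
```

With counts like these, a candidate keyword mapping whose fragments often appear together in the log can be ranked above one whose fragments rarely or never co-occur.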

Technical framework:

  • TEMPLAR interfaces with the NLIDB it is augmenting on two fronts: one for keyword mapping, and the other for join path inference.
  • The Keyword Mapper carries out the execution of MAPKEYWORDS, and uses a word similarity model, the query fragment graph (QFG), which stores the SQL query log information, and the database itself to retrieve candidate matches.
  • The Join Path Generator executes INFERJOINS, and it utilizes the QFG and the schema graph of the database to infer join paths.
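To illustrate how log information could steer join path inference, here is a minimal sketch. It is not the paper's INFERJOINS algorithm: the function `infer_join_path`, the weighting `1 / (1 + count)`, the simplified schema, and the join counts are all assumptions made for this example, and it only handles a path between two tables using Dijkstra's shortest-path search.

```python
import heapq


def infer_join_path(schema_edges, join_counts, source, target):
    """Return the cheapest table path, where frequently logged joins cost less."""
    graph = {}
    for a, b in schema_edges:
        count = join_counts.get(frozenset((a, b)), 0)
        weight = 1.0 / (1 + count)  # assumed weighting: more log evidence, lower cost
        graph.setdefault(a, []).append((b, weight))
        graph.setdefault(b, []).append((a, weight))

    best = {source: 0.0}
    heap = [(0.0, source, [source])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return path
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor, path + [neighbor]))
    return None  # no join path connects the two tables


# Hypothetical schema graph: two equally long routes from publication to domain.
schema_edges = [
    ("publication", "journal"),
    ("journal", "domain_journal"),
    ("domain_journal", "domain"),
    ("publication", "conference"),
    ("conference", "domain_conference"),
    ("domain_conference", "domain"),
]

# Assumed log evidence: the conference route was joined 9 times, the journal route never.
join_counts = {
    frozenset(("publication", "conference")): 9,
    frozenset(("conference", "domain_conference")): 9,
    frozenset(("domain_conference", "domain")): 9,
}

path = infer_join_path(schema_edges, join_counts, "publication", "domain")
print(path)  # ['publication', 'conference', 'domain_conference', 'domain']
```

Without the log counts both routes would tie at the same length; the counts break the tie in favor of the route that past queries actually used, which is the intuition behind using the QFG together with the schema graph.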

Experimental evaluation:

  • We enhanced two different NLIDB systems, NaLIR and Pipeline, with TEMPLAR, and executed them on our benchmarks. The augmented versions are denoted NaLIR+ and Pipeline+ respectively.
  • We tested each system by evaluating its ability to translate NLQs accurately to SQL on three benchmarks: the Microsoft Academic Search (MAS) database, and two additional databases regarding business reviews from Yelp and movie information from IMDB.