Week 2 Quiz: Natural Language Processing and Word Embeddings
1. Suppose you learn a word embedding for a vocabulary of 10000 words. Then the embedding vectors should be 10000 dimensional, so as to capture the full range of variation and meaning in those words.
【 】 True 【 】 False
Answer
False
Note: The dimension of word vectors is usually smaller than the size of the vocabulary. The most common sizes for word vectors range between 50 and 400.
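To make the size difference concrete, here is a minimal NumPy sketch. It is an illustration only: the embedding dimension of 300, the random matrix `E`, and the index `word_index` are assumptions chosen for the example, not values from the quiz. It contrasts a 10000-dimensional one-hot vector with a much shorter embedding vector, and shows that an embedding lookup selects the same row that a one-hot multiplication would.

```python
import numpy as np

vocab_size = 10_000   # vocabulary size from the question
embedding_dim = 300   # assumed value inside the typical 50-400 range

rng = np.random.default_rng(0)
# Embedding matrix E: one row of length embedding_dim per word.
# Random here for illustration; in practice these values are learned.
E = rng.normal(size=(vocab_size, embedding_dim))

word_index = 1234     # hypothetical index of some word in the vocabulary

# One-hot representation: as long as the vocabulary, with a single 1.
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1.0

# Multiplying the one-hot vector by E picks out row `word_index`,
# which is exactly what a direct embedding lookup returns.
via_matmul = one_hot @ E
via_lookup = E[word_index]

print(one_hot.shape)                        # (10000,)
print(via_lookup.shape)                     # (300,)
print(np.allclose(via_matmul, via_lookup))  # True
```

The word is represented by 300 numbers rather than 10000, which is why the statement in the question is false.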
2. What is t-SNE?
【 】 A linear transformation that allows us to solve analogies on word vectors
【 】 A non-linear dimensionality reduction technique
【 】 A supervised learning algorithm for learning word embeddings
【 】 An open-source sequence modeling library
Answer
【★】 A non-linear dimensionality reduction technique
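As a rough illustration of how t-SNE is typically applied to word vectors, the sketch below uses scikit-learn's `TSNE` to project high-dimensional vectors to 2D for plotting. The 50 random 300-dimensional vectors are a stand-in for real embeddings, and the perplexity value is an arbitrary choice for this example; neither comes from the quiz.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical data: 50 "words", each with a 300-dimensional vector.
# Real usage would load learned embeddings instead of random numbers.
n_words, embedding_dim = 50, 300
embeddings = rng.normal(size=(n_words, embedding_dim))

# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, random_state=0)
points_2d = tsne.fit_transform(embeddings)

print(points_2d.shape)  # (50, 2)
```

Because the mapping is non-linear, the 2D points are useful for visualizing which words cluster together, but relationships such as analogy directions in the original space should not be expected to survive the projection.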
3. Suppose you download a pre-trained word embedding that has been trained on a huge corpus of text. You then use this word embedding to train an RNN for a language task of recognizing if someone is happy from a short snippet of text, using a small training set.
【 】 True
【 】 False
Answer
True
Note: Then even if the word "ecstatic" does not appear in your small training set, your RNN might reasonably be expected to recognize "I’m ecstatic" as deserving a label