I didn't know much about the book until I read it. Originally I thought it was purely full of equations, formulas and other maths, but to my surprise it contains a lot of content about NLP. No wonder my teacher in the NLP Lab recommended it to us.
Ch 3 is about statistical language models. For the task of judging whether a sentence is natural, a model is used to calculate the probability of each word occurring, rather than checking whether the sentence matches the syntax, which lets computers make the judgement more easily and more accurately.
Later, a simplifying assumption put forward by Markov made the computation even easier.
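To make the idea concrete, here is a toy bigram language model — a minimal sketch, where the tiny corpus and all the counts are made up for illustration, not taken from the book:

```python
from collections import Counter

# Toy corpus; in practice the counts come from a huge real corpus.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat saw the dog".split(),
]

# Count unigrams and bigrams, padding each sentence with <s> / </s>.
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = ["<s>"] + sent + ["</s>"]
    unigrams.update(tokens[:-1])
    bigrams.update(zip(tokens[:-1], tokens[1:]))

def sentence_prob(sentence):
    """P(w1..wn) ~ product of P(wi | wi-1), the Markov assumption."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        p *= bigrams[(prev, cur)] / unigrams[prev]
    return p

# A natural word order scores higher than a scrambled one.
print(sentence_prob("the cat sat on the mat"))
print(sentence_prob("mat the on sat cat the"))
```

The natural sentence gets a nonzero probability while the scrambled one drops to zero, which is exactly the "is this sentence natural?" judgement done with probabilities instead of syntax rules.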
Ch 3 also shows the importance of models in solving NLP problems. Beyond those, there are actually plenty of models in machine learning, such as linear regression, logistic regression and SVM in supervised learning, and clustering in unsupervised learning. Machine learning models nowadays are largely based on statistical models, requiring large amounts of data and efficient algorithms. Ch 5 then introduces another model widely used in NLP: the Hidden Markov Model.
The Hidden Markov Model is said to be one of the fastest and most effective ways to solve NLP problems. According to the book, many NLP problems such as machine translation, automatic correction and speech recognition can be treated as decoding problems in a communication system, which makes them easier to solve than focusing on grammar and syntax. In simple words, the task is to recover the original signal s1, s2, s3... from the received signal o1, o2, o3... after transmission. The Hidden Markov Model can be adapted to solve these problems. Markov put forward a simplifying assumption that the probability distribution of each state in a random process depends only on its previous state.
Some content in Ch 3 is related to this assumption. Based on it we get the Markov Chain, and the Hidden Markov Model is an extension of the Markov Chain.
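As a sketch of how that decoding actually works, here is a minimal Viterbi implementation. The states, observations and probabilities below are toy values I made up (the classic weather/activity example), not an example from the book:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Find the most likely hidden state sequence s1..sn for the
    observed sequence o1..on, using the Markov assumption."""
    # V[t][s] = probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

# Toy example: guess the weather (hidden) from someone's activity (observed).
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

print(viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p))
```

The observations play the role of o1, o2, o3... and the returned states are the recovered s1, s2, s3... — the same "decoding a noisy channel" picture the book uses for speech recognition.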
Another topic I'm interested in is graphs and crawlers. I've written plenty of crawlers for fun or to scrape useful info. But previously what I mostly thought about was how to organize the data structures so the data was well stored, and I seldom considered graphs. Maybe that's because I mostly ran my crawlers on a single website. The crawler the book describes is more like a search engine. A search engine is also a crawler, a very, very big one. Crawlers for search engines traverse each hyperlink on each website exactly once using graph traversal together with a hash table. Websites are like nodes, and hyperlinks are like arcs in a graph.
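That traversal idea can be sketched in a few lines — a minimal version assuming an in-memory toy link graph (made up here) instead of real HTTP requests:

```python
from collections import deque

# Toy link graph standing in for the web: page -> outgoing hyperlinks.
links = {
    "a.com": ["b.com", "c.com"],
    "b.com": ["a.com", "d.com"],
    "c.com": ["d.com"],
    "d.com": [],
}

def crawl(start):
    """Breadth-first graph traversal; the visited set plays the role
    of the hash table that keeps each page from being fetched twice."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)          # a real crawler would fetch & index here
        for url in links.get(page, []):
            if url not in visited:  # O(1) membership check via hashing
                visited.add(url)
                queue.append(url)
    return order

print(crawl("a.com"))
```

Even though a.com and b.com link back to each other, every node is visited exactly once — that is exactly what the hash table buys the search-engine crawler.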
This term I plan to learn the Markov Model as well as other popular machine learning models in depth, not just the surface but also the derivations behind them. I want to write a rough machine translation model for multiple languages, drawing on my experience in the NLP Lab. I also want to apply graph theory to crawlers. Text matching is fun too; maybe I will use it in a search function.