Taking a turn away from first-order logic, search algorithms and planning, week 5 introduced the key issues around natural language processing (NLP) and the programming language Prolog.
The logic programming paradigm used by Prolog is something I have not learned about before. Languages such as Prolog are built on developing a set of axioms and then solving problems by querying those axioms. The engine of Prolog is a backward chaining theorem prover. The axioms in logic programming need to be Horn clauses (strictly, definite clauses): disjunctions of literals with exactly one positive literal, usually written as an implication, for example:
king(X) & greedy(X) → evil(X).
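To make the backward chaining idea concrete, here is a minimal Python sketch of how a query such as evil(john) could be reduced to the subgoals king(john) and greedy(john). The facts about john and richard are my own assumptions for illustration, and unification is reduced to binding the single variable X to a constant, so this is far simpler than what Prolog actually does.

```python
# Assumed toy knowledge base; these facts are illustrative, not from the lecture.
FACTS = {("king", "john"), ("greedy", "john"), ("king", "richard")}

# Each rule is (head predicate, [body predicates]), all over one variable X.
RULES = [("evil", ["king", "greedy"])]  # king(X) & greedy(X) -> evil(X)


def prove(predicate, constant):
    """Backward chaining: work from the goal back towards known facts."""
    if (predicate, constant) in FACTS:
        return True
    for head, body in RULES:
        # A rule applies if its head matches the goal; then every
        # literal in its body becomes a new subgoal to prove.
        if head == predicate and all(prove(sub, constant) for sub in body):
            return True
    return False


print(prove("evil", "john"))     # True
print(prove("evil", "richard"))  # False
```

The second query fails because greedy(richard) is neither a fact nor the head of any rule, so the prover runs out of options.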
In the tutorial we were able to do some basic playing with a toy implementation by Lloyd Allison:
Prolog relies very heavily on unification, a process that we were unable to re-enact correctly in the tutorial.
p(X, c(Y,Z)) <= p(X,Z)
p(1, c(1, c(2, c(3, nil)))) yes
p(2, c(1, c(2, c(3, nil)))) yes
p(3, c(1, c(2, c(3, nil)))) yes
After reading the tutorial solution, I am still not much clearer on the proofs for each of these outcomes. I will have to follow up in the lecture.
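As a way of working through it, the following Python sketch combines unification with backward chaining to reproduce the three "yes" answers. It rests on an assumption: the single rule shown above cannot succeed on its own, so I have added a base fact p(X, c(X,Y)) that I believe the tutorial program must also have contained. The term encoding (tuples for compound terms, capitalised strings for variables) and all function names are my own, and like standard Prolog the sketch does no occurs check.

```python
import itertools


def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    """Follow variable bindings in substitution s until a non-bound term."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Return a substitution extending s that makes a and b equal, or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


def rename(t, n):
    """Give a clause fresh variable names each time it is used."""
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return tuple(rename(x, n) for x in t)
    return t


RULES = [
    (("p", "X", ("c", "X", "Y")), []),                 # ASSUMED base fact: p(X, c(X,Y))
    (("p", "X", ("c", "Y", "Z")), [("p", "X", "Z")]),  # p(X, c(Y,Z)) <= p(X,Z)
]
fresh = itertools.count()


def solve(goals, s):
    """Backward chaining: yield every substitution that proves all goals."""
    if not goals:
        yield s
        return
    first, rest = goals[0], goals[1:]
    for head, body in RULES:
        n = next(fresh)
        s2 = unify(first, rename(head, n), s)
        if s2 is not None:
            yield from solve([rename(b, n) for b in body] + rest, s2)


def prove(goal):
    return next(solve([goal], {}), None) is not None


LST = ("c", 1, ("c", 2, ("c", 3, "nil")))  # c(1, c(2, c(3, nil)))
print([prove(("p", k, LST)) for k in (1, 2, 3)])  # [True, True, True]
print(prove(("p", 4, LST)))                        # False
```

Read this way, p behaves like list membership: p(2, ...) fails against the base fact because 2 does not unify with the head element 1, so the rule strips the head and retries on the tail c(2, c(3, nil)), where the base fact succeeds.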
We discussed the surface level methodologies for NLP:
- Lexical analysis
- Syntactic analysis
- Semantic analysis
- Pragmatic analysis
The focus of the lecture, however, was on the limitations of NLP: the ambiguity of words, their meaning and their context makes effective NLP very difficult. Implications were another issue for NLP that was covered at some length.
Next came some approaches for overcoming the challenges of NLP, including statistical approaches such as n-gram analysis. This veered the lecture into information retrieval, discussing the techniques used by search engines such as Google to interpret searches.
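To illustrate the n-gram idea, here is a small Python sketch of a bigram (n = 2) model that counts which word follows which and turns the counts into probabilities. The toy corpus and the function names are made up for illustration, and a real system would need a far larger corpus plus smoothing for unseen word pairs.

```python
from collections import Counter, defaultdict


def bigram_counts(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probability(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev is unseen."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


# Made-up toy corpus, purely for illustration.
corpus = "the king is greedy the king is evil the queen is wise".split()
counts = bigram_counts(corpus)

print(counts["the"].most_common(1))                  # [('king', 2)]
print(next_word_probability(counts, "the", "king"))  # 2 out of 3 occurrences
```

Even this tiny model captures a little context: after "the" it prefers "king" over "queen" simply because that pair was seen more often.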
On the topic of NLP, I wondered whether any large knowledge bases were being assembled to assist with the task. Yahoo have a cluster of computers working on this: