Into NLP 6 ~ New Link Project – Dependency Parser
Today we will talk about one of my favorite tools from the toolbox of classical NLP: The Dependency Parse. Dependencies can help us analyze the grammatical structure of a text. This is incredibly useful since we can use it to do pattern matching on the structure of a sentence, something that a normal regular expression simply cannot do. In my mind, dependencies are a natural extension of POS tagging, the topic of my last article: While POS tagging tells us what individual words are, dependencies tell us how these words relate to each other. But let’s start from the top.
What is a dependency?
Let’s say we have a sentence and want to get all adjectives describing a given noun.
So if we take the sentence “The big, old lion jumps into the cold lake.” we would get “big” and “old” for “lion” and “cold” for “lake”. Since we have learned how to use POS Tags, identifying adjectives is pretty easy: just search for the “JJ” tag and then take all the adjectives that come before the noun. Since adjectives always come before the noun this trick has been working since… time immemorial.
Okay there might be some edge cases, but let’s be honest no one wants to go through all the edge cases and produce pages of spaghetti code just to get some adjectives. And luckily you don’t have to.
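To make the problem concrete, here is a minimal sketch of that POS-only trick. The tagged sentences are written by hand (Penn Treebank style tags, my assumption of what a tagger would return), and the function name `adjectives_before` is made up for this illustration.

```python
def adjectives_before(tagged, noun_position):
    """Walk left from a noun, collecting JJ tokens and skipping commas."""
    found = []
    i = noun_position - 1
    while i >= 0 and tagged[i][1] in ("JJ", ","):
        if tagged[i][1] == "JJ":
            found.insert(0, tagged[i][0])  # keep sentence order
        i -= 1
    return found


# Hand-tagged (word, POS) pairs, not the output of a real tagger.
SIMPLE = [("The", "DT"), ("big", "JJ"), (",", ","), ("old", "JJ"), ("lion", "NN"),
          ("jumps", "VBZ"), ("into", "IN"), ("the", "DT"), ("cold", "JJ"), ("lake", "NN")]
TRICKY = [("the", "DT"), ("cold", "JJ"), ("mountain", "NN"), ("lake", "NN")]

print(adjectives_before(SIMPLE, 4))  # ['big', 'old']  (for "lion")
print(adjectives_before(SIMPLE, 9))  # ['cold']        (for "lake")
print(adjectives_before(TRICKY, 3))  # []              ("cold" is missed for "lake")
```

The heuristic handles the lion sentence, but in “the cold mountain lake” the noun “mountain” sits between the adjective and “lake”, so the walk stops early. Every such case needs yet another rule.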
The Dependency Tree
If we run a dependency parser on the sentence this is the result:
Dependencies are usually represented as a tree structure with the root being the main verb of the sentence. Each arrow represents a relationship between two words. As you can see, between each adjective-noun pair there is an “amod” (adjectival modifier) relationship. This makes finding related adjectives stupidly easy and it also deals with those silly edge cases.
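As a rough sketch of how little code this takes, assume the parse of the lion sentence is available as a list of (head, relation, dependent) edges. The edges below are written by hand with Universal Dependencies style labels and with punctuation left out; a real parser may use different labels (for example for the “into the cold lake” part), and it would identify tokens by position rather than by their text.

```python
from collections import defaultdict

# Hand-written (head, relation, dependent) edges for
# "The big, old lion jumps into the cold lake." (an assumption, not parser output).
EDGES = [
    ("jumps", "nsubj", "lion"),
    ("lion", "det", "The"),
    ("lion", "amod", "big"),
    ("lion", "amod", "old"),
    ("jumps", "obl", "lake"),
    ("lake", "case", "into"),
    ("lake", "det", "the"),
    ("lake", "amod", "cold"),
]


def adjectives_by_noun(edges):
    """Collect the dependents of every "amod" edge, grouped by head noun."""
    result = defaultdict(list)
    for head, relation, dependent in edges:
        if relation == "amod":
            result[head].append(dependent)
    return dict(result)


print(adjectives_by_noun(EDGES))  # {'lion': ['big', 'old'], 'lake': ['cold']}
```

No word-order logic is needed: the function only looks at the relation label and the head it points to.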
In my last article we tried to use POS Tags as a makeshift method of understanding the structure of a sentence. We looked at the sentence “The big lion hunts the gazelles.”
So let’s see what a dependency parse can tell us about this sentence:
First you can see the adjective modifying the noun (amod), the determiners/articles (det) connecting to the noun, and both the subject (nsubj) and the object (obj) of the action (again, there are different tag sets depending on the parser you are using). This is exactly how we interpreted the sentence from the POS Tags. Only now we don’t have to deal with all the annoying thinking-about-sentence-structure ourselves and we don’t have to make up rules for different structures; instead we can simply let the dependency parser do it for us.
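To illustrate, here is a small sketch that stores this parse the way many parsers report it: one record per token with the index of its head and the relation label. The `Token` class, the `arguments_of` helper and the hand-written parse are my own assumptions for the example, not the output or API of any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    index: int   # 1-based position in the sentence
    text: str
    head: int    # index of the governing token, 0 marks the root
    deprel: str  # relation to the head


# Hand-written parse of "The big lion hunts the gazelles." (punctuation omitted).
SENTENCE = [
    Token(1, "The", 3, "det"),
    Token(2, "big", 3, "amod"),
    Token(3, "lion", 4, "nsubj"),
    Token(4, "hunts", 0, "root"),
    Token(5, "the", 6, "det"),
    Token(6, "gazelles", 4, "obj"),
]


def arguments_of(tokens, verb_index):
    """Map relation -> word for the nsubj/obj children of one verb."""
    return {
        tok.deprel: tok.text
        for tok in tokens
        if tok.head == verb_index and tok.deprel in ("nsubj", "obj")
    }


root = next(tok for tok in SENTENCE if tok.head == 0)
print(root.text, arguments_of(SENTENCE, root.index))
# hunts {'nsubj': 'lion', 'obj': 'gazelles'}
```

Who does what to whom falls straight out of the two labels, without any hand-made rules about where subjects and objects tend to appear.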
How to use this stuff?
In the past I have written an article about using dependency trees to identify causal relationships in a text. This was done by applying simple pattern matching to the dependency parse-tree: A rule could look something like this:
[Effect] - advcl -> [CAUSE] - mark -> IN:"if"
Here we have an effect that has an adverbial clause (advcl) that is the cause, which in turn has a mark with the signaling word “if” (the “IN” in the rule is the POS tag). This rule is able to identify and parse a sentence like
“If the tool detects an error it shows a warning window.”
In this case “detects” would be the root of the cause subtree, and “shows” the root of the effect subtree. By looking at everything connected to these roots we find that the cause is “the tool detects an error” and the effect is “it shows a warning window”.
What’s nice about this approach is that it works even if the sentence is phrased differently. If we had the sentence “The tool shows a warning window, if it detects an error.” the same pattern would still work, since dependencies work regardless of word order.
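The rule can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: both parses are written by hand (punctuation omitted, labels in Universal Dependencies style), and `Token`, `subtree` and `match_if_rule` are hypothetical helpers, not the code from the earlier article.

```python
from typing import NamedTuple


class Token(NamedTuple):
    index: int   # 1-based position in the sentence
    text: str
    pos: str
    head: int    # index of the governing token, 0 marks the root
    deprel: str


def parse(rows):
    """Number hand-written (text, pos, head, deprel) rows from 1."""
    return [Token(i, *row) for i, row in enumerate(rows, start=1)]


FRONTED = parse([  # "If the tool detects an error it shows a warning window."
    ("If", "IN", 4, "mark"), ("the", "DT", 3, "det"), ("tool", "NN", 4, "nsubj"),
    ("detects", "VBZ", 8, "advcl"), ("an", "DT", 6, "det"), ("error", "NN", 4, "obj"),
    ("it", "PRP", 8, "nsubj"), ("shows", "VBZ", 0, "root"), ("a", "DT", 11, "det"),
    ("warning", "NN", 11, "compound"), ("window", "NN", 8, "obj"),
])
TRAILING = parse([  # "The tool shows a warning window, if it detects an error."
    ("The", "DT", 2, "det"), ("tool", "NN", 3, "nsubj"), ("shows", "VBZ", 0, "root"),
    ("a", "DT", 6, "det"), ("warning", "NN", 6, "compound"), ("window", "NN", 3, "obj"),
    ("if", "IN", 9, "mark"), ("it", "PRP", 9, "nsubj"), ("detects", "VBZ", 3, "advcl"),
    ("an", "DT", 11, "det"), ("error", "NN", 9, "obj"),
])


def subtree(tokens, root_index, skip=frozenset()):
    """Indices of a token and its descendants, not entering tokens in `skip`."""
    found, stack = {root_index}, [root_index]
    while stack:
        current = stack.pop()
        for tok in tokens:
            if tok.head == current and tok.index not in skip:
                found.add(tok.index)
                stack.append(tok.index)
    return found


def text_of(tokens, indices):
    return " ".join(tok.text for tok in tokens if tok.index in indices)


def match_if_rule(tokens):
    """Apply [Effect] - advcl -> [CAUSE] - mark -> IN:"if"; return (cause, effect)."""
    for cause in tokens:
        if cause.deprel != "advcl":
            continue
        marks = {t.index for t in tokens
                 if t.head == cause.index and t.deprel == "mark"
                 and t.pos == "IN" and t.text.lower() == "if"}
        if marks:
            cause_ids = subtree(tokens, cause.index, skip=marks)
            effect_ids = subtree(tokens, cause.head, skip={cause.index})
            return text_of(tokens, cause_ids), text_of(tokens, effect_ids)
    return None


print(match_if_rule(FRONTED))   # ('the tool detects an error', 'it shows a warning window')
print(match_if_rule(TRAILING))  # ('it detects an error', 'The tool shows a warning window')
```

The matcher never looks at token positions except to print the words in their original order, which is why both phrasings are handled by the same rule.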
This same principle can be applied to basically any domain: You can simply get some example sentences, run them through a dependency parser and look for the patterns that usually arise, like the simple amod example from earlier. This makes a task like “get the subject and object of this verb” very easy to solve.
You can do some really powerful NLP work with just tokenizing, POS tagging, and dependency parsing, even if you don’t have the computational resources or the training data for a deep neural network. But there are still more NLP methods to explore: So next time we will have a look at chunking.