SRL Applications part 2: SRL for (open) information extraction

May 23, 2017

This part is mainly about how SRL applies to (open) information extraction.

The most closely related work I could find among research papers and projects is the work on open information extraction from Professor Oren Etzioni’s group at the University of Washington (and later the Allen Institute for Artificial Intelligence); see references 2, 3 and 4.

Here we want to focus on:

SRL -> openIE -> openIE applications (another video link)

The above part can be further divided into:

SRL -> openIE
openIE’s applications

SRL -> openIE

openIE

Open IE systems make a single (or constant number of) pass(es) over a corpus and extract a large number of relational tuples (Arg1, Pred, Arg2) without requiring any relation-specific training data.

vs. Google Knowledge Graph

The key difference is that openIE can extract facts about anything, and you don’t have to tell the computer ahead of time which relations to look for. In other words, Google’s Knowledge Graph has a targeted set of relationships to be extracted.
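To make the contrast concrete, here is a minimal Python sketch. The Extraction type, the filter_to_schema helper, and the sample tuples are all hypothetical and made up for illustration; this is not how Google's Knowledge Graph or any real openIE system is implemented. It only shows that a targeted approach keeps relations from a predefined set, while an open approach keeps whatever relation phrase appears in the text.

```python
from typing import List, NamedTuple, Set


class Extraction(NamedTuple):
    """A relational tuple (Arg1, Pred, Arg2) as described above."""
    arg1: str
    pred: str
    arg2: str


def filter_to_schema(extractions: List[Extraction],
                     allowed_relations: Set[str]) -> List[Extraction]:
    """Hypothetical 'targeted' view: keep only relations declared ahead of time."""
    return [e for e in extractions if e.pred in allowed_relations]


# Made-up sample tuples, standing in for the output of an open extractor.
open_extractions = [
    Extraction("Seattle", "is located in", "Washington"),
    Extraction("the file", "has", "write permission"),
    Extraction("Oren Etzioni", "leads", "a research group"),
]

# A targeted schema only knows about the relations it was given.
targeted = filter_to_schema(open_extractions, {"is located in"})

print(len(open_extractions), "open tuples;", len(targeted), "match the fixed schema")
```

The open list keeps all three tuples because the relation phrase is just free text taken from the sentence; the targeted view keeps only the one relation it was told about in advance.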

SRL->openIE

Verbs and their semantically labeled arguments almost always correspond to Open IE relations and arguments, respectively (SRL computes more information than Open IE requires).
Open IE ignores part of the semantic information (such as the distinctions between the various Ai’s). This leads to the possibility that an incorrectly labeled SRL extraction converts to a correct Open IE extraction, if the arguments were correctly identified but assigned incorrect semantic roles.
SRL systems gracefully handle new verbs (i.e., verbs not in their PropBank training data) by only attempting to identify A0 (the agent) and A1 (the patient).
Extractors have difficulty in cases where the part of speech of a word is ambiguous or difficult to tag automatically. For example, the word ‘write’ causes trouble when used as a noun: in the sentence “Be sure the file has write permission.”, extractors extract <the file, write, permission>. Part-of-speech ambiguity affected about 20% of sentences.
SRL-based systems, due to their deeper processing, can better handle complex syntax and long-range dependencies. However, extractors suffer significant quality loss on complex extractions compared to binary ones. (This is the reason the second generation of openIE chose relational tuples (Arg1, Pred, Arg2).)
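The points above about dropping role labels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Argument and Frame types and the frame_to_tuples function are hypothetical names, the SRL output is hand-written rather than produced by a real labeler, and the conversion rule (take the verb as the relation, take the earliest argument in the sentence as Arg1, and pair it with each remaining argument) is a simplification, not necessarily the exact procedure used in the cited papers.

```python
from typing import List, NamedTuple, Tuple


class Argument(NamedTuple):
    role: str   # PropBank-style label such as "A0", "A1", "A2"
    text: str   # the argument span as it appears in the sentence
    start: int  # token offset of the span, used only for ordering


class Frame(NamedTuple):
    verb: str
    arguments: List[Argument]


def frame_to_tuples(frame: Frame) -> List[Tuple[str, str, str]]:
    """Turn one SRL frame into binary (Arg1, Pred, Arg2) tuples.

    Assumption: role labels are ignored entirely; only the argument
    spans and their order in the sentence are used.
    """
    ordered = sorted(frame.arguments, key=lambda a: a.start)
    if len(ordered) < 2:
        return []  # a binary tuple needs at least two arguments
    first = ordered[0]
    return [(first.text, frame.verb, other.text) for other in ordered[1:]]


# Hand-written frames for "Tolkien wrote The Hobbit": one labeled
# correctly, one with the second argument given the wrong role.
correct = Frame("wrote", [Argument("A0", "Tolkien", 0),
                          Argument("A1", "The Hobbit", 2)])
mislabeled = Frame("wrote", [Argument("A0", "Tolkien", 0),
                             Argument("A2", "The Hobbit", 2)])

print(frame_to_tuples(correct))
print(frame_to_tuples(correct) == frame_to_tuples(mislabeled))
```

Because the role field is never read, the mislabeled frame yields the same tuple as the correct one, which is the effect described above: the spans were identified correctly, so the extraction is still right even though the roles are wrong.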