Google Research Blog

The latest news from Research at Google

50,000 Lessons on How to Read: a Relation Extraction Corpus

Thursday, April 11, 2013

Posted by Dave Orr, Product Manager, Google Research

Linked terms in this post: relation extraction; Jim Henson; Jane Henson; many; beloved; characters; shows; "Who created Kermit?"; which proteins interact; hundreds of millions of entities and billions of relations; explore the world’s information; human-judged dataset; Wikipedia; Freebase MIDs; Freebase property (/education/education/institution, /m/01tdnyh, /m/07tgn).

(Update: you can find additional relations here.)

Gory Details

The corpus is available at https://code.google.com/p/relation-extraction-corpus/ as JSON. Each record has the following fields:

pred: predicate of a triple

sub: subject of a triple

obj: object of a triple

evidences: an array of evidence items for this triple, each with the following fields

url: the web page from which this evidence was obtained
snippet: short piece of text supporting the triple

judgments: an array of judgments from human annotators, each with the following fields

rater: hash code of the identity of the annotator
judgment: the annotator's judgment; it takes the value "yes" or "no"
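
The fields listed above map naturally onto a few typed records. The following is a minimal sketch of one possible Python representation, not part of the corpus release: the class names `Evidence`, `Judgment`, and `Triple` and the helper `parse_triple` are hypothetical. It assumes each record is a single JSON object and that the annotator field is spelled `rater`, as in the sample record in this post.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Evidence:
    url: str      # web page from which the evidence was obtained
    snippet: str  # short piece of text supporting the triple


@dataclass
class Judgment:
    rater: str     # hash code identifying the annotator
    judgment: str  # "yes" or "no"


@dataclass
class Triple:
    pred: str  # predicate of the triple
    sub: str   # subject of the triple
    obj: str   # object of the triple
    evidences: List[Evidence]
    judgments: List[Judgment]


def parse_triple(line: str) -> Triple:
    """Parse one JSON-encoded record into a Triple (hypothetical helper)."""
    raw = json.loads(line)
    return Triple(
        pred=raw["pred"],
        sub=raw["sub"],
        obj=raw["obj"],
        evidences=[Evidence(e["url"], e["snippet"]) for e in raw["evidences"]],
        judgments=[Judgment(j["rater"], j["judgment"]) for j in raw["judgments"]],
    )
```

Keeping the rater hash as a string rather than an integer avoids any assumption about its numeric range.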

{"pred":"/people/person/place_of_birth","sub":"/m/026_tl9","obj":"/m/02_286","evidences":[{"url":"http://en.wikipedia.org/wiki/Morris_S._Miller","snippet":"Morris Smith Miller (July 31, 1779 -- November 16, 1824) was a United States Representative from New York. Born in New York City, he graduated from Union College in Schenectady in 1798. He studied law and was admitted to the bar. Miller served as private secretary to Governor Jay, and subsequently, in 1806, commenced the practice of his profession in Utica. He was president of the village of Utica in 1808 and judge of the court of common pleas of Oneida County from 1810 until his death."}],"judgments":[{"rater":"11595942516201422884","judgment":"yes"},{"rater":"16169597761094238409","judgment":"yes"},{"rater":"1014448455121957356","judgment":"yes"},{"rater":"16651790297630307764","judgment":"yes"},{"rater":"1855142007844680025","judgment":"yes"}]}
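
Because each triple carries several independent judgments, one simple way to use a record like the one above is to collapse the votes into a single label. The sketch below assumes one JSON object per line and a plain majority rule. Both are assumptions made for illustration, not something the corpus prescribes, and `majority_label` and `label_lines` are hypothetical names.

```python
import json
from collections import Counter


def majority_label(record: dict) -> str:
    """Return "yes", "no", or "tie" based on the annotators' votes."""
    votes = Counter(j["judgment"] for j in record["judgments"])
    yes, no = votes["yes"], votes["no"]
    if yes > no:
        return "yes"
    if no > yes:
        return "no"
    return "tie"


def label_lines(lines):
    """Yield (pred, sub, obj, label) for each non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        yield record["pred"], record["sub"], record["obj"], majority_label(record)
```

For the sample record above, all five raters answered "yes", so `majority_label` would return "yes". Ties are reported explicitly rather than resolved, since how to treat them depends on the downstream task.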