CLIFF parses news articles and pulls out the people, organizations, and places they mention. A number of tools do this, so why did we create CLIFF? We’ve built on those tools to add disambiguation tailored to the ways news articles are written, and a concept of “focus” that tries to get at what place an article is really about (as opposed to all the places it mentions).
I created CLIFF with Catherine D’Ignazio, based on the CLAVIN engine from Berico Technologies. We use it to power the Media Cloud project’s geoparsing capabilities. Our main motivation was to fine-tune an engine for news articles.
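To make the idea of “focus” a little more concrete, here is a minimal sketch of one simple heuristic: treat the place mentioned most often as what the article is about. This is an illustration only, not CLIFF’s actual algorithm or API. The function name `guess_focus` and the sample data are hypothetical, and the sketch assumes the place names have already been extracted and disambiguated.

```python
from collections import Counter


def guess_focus(place_mentions):
    """Return the most frequently mentioned place(s), sorted by name.

    Hypothetical helper for illustration; not part of CLIFF.
    `place_mentions` is assumed to be a list of already-disambiguated
    place names, one entry per mention in the article.
    """
    counts = Counter(place_mentions)
    if not counts:
        return []
    top = max(counts.values())
    # Keep every place tied for the highest count, so ties are visible.
    return sorted(place for place, n in counts.items() if n == top)


# Made-up mentions from an imaginary article.
mentions = ["Kenya", "Nairobi", "Kenya", "United States", "Kenya", "Nairobi"]
print(guess_focus(mentions))  # ['Kenya']
```

The article mentions three places, but only Kenya comes back as the focus. Returning a list rather than a single value keeps ties explicit instead of silently picking one.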