Entity Extraction Puts the World’s News at a News Agency’s Fingertips


News Agencies Face a Tsunami of Data Feeds

News agencies need to monitor news from around the globe, in multiple languages, as part of their news-management process. They have editorial systems to manage the workflow, but most are still limited to keyword filters, which are coarse and inaccurate. In addition, much of the subsequent processing of news-feed content is still done manually, which doesn’t scale and is cost-prohibitive.

A Data Analytics Technology such as Entity Extraction Can Help

Today, a data analytics technology such as Entity Extraction can help automate the news-management process. Entity Extraction goes beyond rough topic identification of a news item: it analyzes the item’s content at a deeper level, and it can do so in multiple languages.

Entity Extraction finds all occurrences of named entities in unstructured text, not just the ones that are already known and can be retrieved via keywords. These include names of people, companies, governmental organizations, countries, states, cities, etc.

Entity Extraction also identifies relationships between these entities in texts, such as:

    • Relationships between organizations
    • Affiliations of people with organizations
    • Locations of organizations
    • etc.

In addition, some Entity Extraction products find a large set of relevant events such as:

    • Political changes
    • Mergers and acquisitions
    • Criminal acts
    • Cyber security incidents
    • Military conflicts
    • etc.

This type of Entity Extraction, known as Event Extraction, also identifies the participants in each event. Generally speaking, Event Extraction answers the question “Who did what to whom?” For example, a news report on an ongoing geopolitical conflict might contain the following sentence:

“Russia launched an attack on a railway yard in Kharkiv with S-300 missiles on September 28.”

Entity Extraction will process this sentence and produce something like the following output:

EVENT Ontology:  Conflict

EVENT SUB-Ontology: Attack target

ATTACKER: Russia

TARGET: a railway yard

WEAPON: S-300 missiles

PLACE: Kharkiv

DATE: September 28, 2022

This type of output can then be stored in a database and searched just like any other database record. Instead of relying on clumsy, inaccurate keyword queries in a conventional search engine, an editor can find precisely what they are interested in, without the irrelevant noise that keyword searches frequently return.
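As a rough illustration of storing and searching such a record, the sketch below loads the example event into an in-memory SQLite table and queries it by field. The table layout, the column names, and the find_attacks helper are assumptions made for this example; they are not the schema of any particular Entity Extraction product.

```python
import sqlite3

# Hypothetical record mirroring the example output above; the field names
# are assumptions chosen for this sketch.
event = {
    "event_type": "Conflict",
    "event_subtype": "Attack target",
    "attacker": "Russia",
    "target": "a railway yard",
    "weapon": "S-300 missiles",
    "place": "Kharkiv",
    "event_date": "2022-09-28",  # ISO format, so string comparison sorts by date
}

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE events (
        event_type TEXT, event_subtype TEXT, attacker TEXT,
        target TEXT, weapon TEXT, place TEXT, event_date TEXT
    )"""
)
conn.execute(
    "INSERT INTO events VALUES (:event_type, :event_subtype, :attacker, "
    ":target, :weapon, :place, :event_date)",
    event,
)


def find_attacks(conn, place, since):
    """Return conflict events at `place` on or after the ISO date `since`."""
    rows = conn.execute(
        "SELECT attacker, target, weapon, event_date FROM events "
        "WHERE event_type = 'Conflict' AND place = ? AND event_date >= ? "
        "ORDER BY event_date",
        (place, since),
    )
    return rows.fetchall()


results = find_attacks(conn, "Kharkiv", "2022-09-01")
```

Because the query filters on the PLACE and DATE fields rather than on words in the text, it would not match an article that merely mentions Kharkiv in passing.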

For example, an editor can find information such as the following:

    • All persons and organizations linked to a particular organization
    • Acquisitions of companies by other companies
    • Departures and appointments of C-level executives
    • etc.
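The first kind of lookup in the list above can be sketched as a filter over extracted relationships. The sketch assumes the relationships are available as (subject, relation, object) triples with a separate entity-type table. All names, relation labels, and the linked_parties helper are invented for illustration.

```python
# Hypothetical extracted relationships as (subject, relation, object) triples.
relations = [
    ("Jane Doe", "employed_by", "Acme Corp"),
    ("Acme Corp", "acquired", "Widget Ltd"),
    ("John Roe", "board_member_of", "Widget Ltd"),
    ("Acme Corp", "located_in", "Berlin"),
]

# Hypothetical entity types, as an extraction step might assign them.
entity_types = {
    "Jane Doe": "PERSON",
    "John Roe": "PERSON",
    "Acme Corp": "ORGANIZATION",
    "Widget Ltd": "ORGANIZATION",
    "Berlin": "LOCATION",
}


def linked_parties(triples, types, organization):
    """List (entity, relation) pairs for persons and organizations that are
    directly linked to `organization`, on either side of a relationship."""
    links = []
    for subj, rel, obj in triples:
        if organization not in (subj, obj):
            continue
        other = obj if subj == organization else subj
        if types.get(other) in {"PERSON", "ORGANIZATION"}:
            links.append((other, rel))
    return links


acme_links = linked_parties(relations, entity_types, "Acme Corp")
```

Here the location link to Berlin is filtered out by entity type, which a plain keyword search could not do.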

By plotting events over time, an editor can also:

    • Track events and participants over time (e.g., a developing political crisis)
    • Identify emerging themes (e.g., a geopolitical conflict, a new political party)
    • etc.
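Tracking events over time amounts to grouping the structured records by date. A minimal sketch, assuming each record carries an event type and a date; the sample records and the monthly_counts helper are invented for this example:

```python
from collections import Counter
from datetime import date

# Invented sample records; a real system would read these from its event store.
events = [
    {"type": "Conflict", "date": date(2022, 9, 28)},
    {"type": "Conflict", "date": date(2022, 9, 30)},
    {"type": "Political change", "date": date(2022, 10, 3)},
    {"type": "Conflict", "date": date(2022, 10, 10)},
]


def monthly_counts(records):
    """Count events per (month, event type) so trends can be plotted."""
    counts = Counter()
    for record in records:
        month = record["date"].strftime("%Y-%m")
        counts[(month, record["type"])] += 1
    return counts


timeline = monthly_counts(events)
```

The resulting counts can be passed to any charting tool. A month-over-month rise in one event type is one simple signal of an emerging theme.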

Entity Extraction does this in multiple languages, and because it outputs the results in an identical structured format, it is easy to aggregate data across languages.

Entity Extraction also supports the real-time generation of alerts for highly relevant news.
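One simple way to picture such alerting is as rules matched against each incoming structured event. In the sketch below, a rule is just a set of field values that must all match. The rule names, the field names, and the alerts_for helper are assumptions for illustration, not a description of how any specific product implements alerts.

```python
# Hypothetical alert rules: each maps a rule name to required field values.
alert_rules = {
    "kharkiv-attacks": {"event_type": "Conflict", "place": "Kharkiv"},
    "mergers": {"event_type": "Merger and acquisition"},
}


def matches(rule, event):
    """A rule matches when every field it names has the required value."""
    return all(event.get(field) == value for field, value in rule.items())


def alerts_for(event, rules):
    """Return the names of all rules triggered by one incoming event."""
    return [name for name, rule in rules.items() if matches(rule, event)]


incoming = {
    "event_type": "Conflict",
    "attacker": "Russia",
    "place": "Kharkiv",
    "event_date": "2022-09-28",
}
triggered = alerts_for(incoming, alert_rules)
```

Running alerts_for on each event as it is extracted is what would make the alerts real-time in this sketch.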

Summary

Entity Extraction is a critical tool for helping news agencies manage their news sources. It automates a previously slow, cumbersome process, allowing a news agency to analyze the content of incoming news at a fine-grained level instead of just assigning topic tags to entire documents.