OKKAM Community Portal

 
ENS-enabled Tools


The OKKAMizers Pipeline


This section describes the steps of the OKKAMization process. In particular, it looks at how the different tiers move and process data, to clarify how the OKKAM empowered tools work.

The following picture describes the three main steps of this workflow:

 

[Figure: pipeline]

Entities extraction

Each tool analyzes different kinds of information, so the gathering of data (ACQUISITION) can be carried out in various ways depending on the situation. After acquisition, the first processing step is to parse and analyze the text in order to extract entities. The system supports three different approaches:

  • Keyword based
  • Shallow linguistic
  • Semantic Analysis
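As an illustration of the simplest of these approaches, the following is a minimal sketch of keyword-based extraction. It assumes a small hand-written keyword list; the names `GAZETTEER` and `extract_keyword_entities` and the sample entries are hypothetical and are not part of the OKKAM tools.

```python
import re

# Hypothetical keyword list; entries and entity types are illustrative only.
GAZETTEER = {
    "trento": "location",
    "rome": "location",
    "galileo galilei": "person",
}


def extract_keyword_entities(text, gazetteer=GAZETTEER):
    """Return (surface form, entity type, start offset) for each keyword hit."""
    hits = []
    for keyword, entity_type in gazetteer.items():
        # Whole-word, case-insensitive match of the keyword.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.group(0), entity_type, match.start()))
    # Report hits in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[2])
```

A keyword list like this only finds entities it already knows about, which is why the semantic approach is described as the main module.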

The main module is the semantic analysis, based on the COGITO® Discover semantic capabilities, which provides Semantic Analysis for texts in English and Italian. Once acquired, data are cached and immediately analyzed using the semantic approach. Following the NLP step, the whole text is disambiguated. Disambiguation is the process of determining the meaning of one or more words in a given text when those words have several distinct meanings; it is carried out by dedicated algorithms. This process is based on the Sensigrafo®, a semantic network that contains a representation of linguistic knowledge and world knowledge. It is a directed graph consisting of tags representing concepts and arcs representing the conceptual relationships between them.
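To make the idea of disambiguation over a semantic network more concrete, here is a toy sketch. It is not the COGITO® algorithm or the Sensigrafo® data model: it uses a simple overlap heuristic over a tiny hand-made directed graph, and every concept name, relation and function name in it is invented for illustration.

```python
# Toy semantic network: each concept maps to {relation: [target concepts]}.
# All concept names and relations are invented for illustration.
NETWORK = {
    "bank#finance": {"is_a": ["institution"], "related_to": ["money", "loan", "account"]},
    "bank#river": {"is_a": ["landform"], "related_to": ["river", "water", "shore"]},
}

# Candidate senses for each ambiguous word.
SENSES = {"bank": ["bank#finance", "bank#river"]}


def neighbours(concept):
    """Collect every concept reachable through one outgoing arc."""
    return {target for targets in NETWORK[concept].values() for target in targets}


def disambiguate(word, context_words):
    """Pick the sense whose neighbours overlap most with the context words."""
    context = {w.lower() for w in context_words}
    return max(SENSES[word], key=lambda sense: len(neighbours(sense) & context))
```

For example, `disambiguate("bank", ["account", "loan"])` selects the financial sense because two of its neighbouring concepts occur in the context. On a tie, the first listed sense wins.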

OKKAMcore Interface

The Entity Matching stage provides an OKKAM ID for each entity that requires one. This process consists of the following steps:

Querying the system

A query is created for each identified named entity (e.g. a person or a location). It contains the main features that help to identify the candidate entity uniquely. The query is then sent to the OKKAM Engine.
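The query-building step can be pictured roughly as follows. This is only a sketch: the actual query format of the OKKAM Engine is not described here, so the `EntityQuery` class, its fields and the `build_queries` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EntityQuery:
    """Hypothetical query record: one per named entity found in the text."""
    name: str
    entity_type: str
    attributes: dict = field(default_factory=dict)


def build_queries(entities):
    """Build one query per entity, keeping extra features as attributes."""
    queries = []
    for entity in entities:
        # Everything except the name and type is treated as an identifying feature.
        extra = {k: v for k, v in entity.items() if k not in ("name", "type")}
        queries.append(EntityQuery(entity["name"], entity["type"], extra))
    return queries
```

The more identifying features a query carries, the better the chance of telling apart entities that share a name.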

Query processing

The OKKAM Engine analyzes the query, searches the repository and, if the named entity is present, returns its OKKAM ID. If the named entity is not in the repository, a separate task starts and creates a new persistent web identifier for it.
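The "return the existing identifier, otherwise create one" behaviour can be sketched with an in-memory stand-in for the repository. The class name `ToyRepository`, the exact-match lookup and the `urn:example:` identifier format are all assumptions; the real engine's matching and its identifier scheme are not specified in this text.

```python
import uuid


class ToyRepository:
    """In-memory stand-in for the entity repository (hypothetical)."""

    def __init__(self):
        self._ids = {}

    def resolve(self, name, entity_type):
        """Return (identifier, created) for the given entity."""
        # Naive matching on a normalised name plus type; a real engine
        # would compare many more features.
        key = (name.strip().lower(), entity_type)
        if key in self._ids:
            return self._ids[key], False
        # Not found: mint a new identifier (invented format) and store it.
        new_id = "urn:example:entity:" + uuid.uuid4().hex
        self._ids[key] = new_id
        return new_id, True
```

Calling `resolve` twice for the same entity returns the same identifier, with the `created` flag set only on the first call.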

Identifying the correct OKKAM ID

The OKKAM ID resulting from matching is a value ready to be included as a new field in the document.
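As a rough picture of what "included as a new field" could look like, the sketch below records an identifier next to an entity in a document represented as a plain dictionary. The document structure, the `okkam_ids` field name and the `attach_okkam_id` helper are hypothetical.

```python
def attach_okkam_id(document, entity_name, okkam_id):
    """Return a copy of the document with the identifier recorded for the entity."""
    annotated = dict(document)
    # Copy the existing mapping so the original document is left untouched.
    ids = dict(annotated.get("okkam_ids", {}))
    ids[entity_name] = okkam_id
    annotated["okkam_ids"] = ids
    return annotated
```

Returning a copy rather than changing the document in place keeps the original available, for instance if the markup later has to be removed.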

The use of this module is the basis for building a Web of Entities in which information about the same entity is consolidated in such a way that aggregation, integration and mashup become easier and faster.

Presentation Layer

At this stage, the system is able to perform different operations depending on the type of tool that is used. The main features are:

  • Editing a document
  • Interactive document OKKAMization
  • Batch document OKKAMization
  • Removing OKKAM markup from a document
  • Exporting OKKAM markup from a document
  • Creating an entities index
  • Creating a new entity
  • Searching for entities in the repository or in a document
Last Updated on Tuesday, 01 December 2009 11:36
 

The OKKAM empowered tools architecture


The OKKAM empowered tools architecture is a typical three-tier architecture: a client-server architecture in which the user interface, the business logic, and the data storage and data access are developed and maintained as independent modules, on separate platforms.
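The separation of tiers can be illustrated with a small sketch in which each tier only talks to the one directly below it. The class names and methods are invented for illustration and do not reflect the actual OKKAM modules, which run on separate platforms rather than in one process.

```python
class DataTier:
    """Storage and data access only."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class LogicTier:
    """Business rules; depends on the data tier, never on the interface."""

    def __init__(self, data):
        self._data = data

    def register(self, name):
        # Normalise the input before storing it.
        cleaned = name.strip()
        self._data.save(cleaned.lower(), cleaned)
        return cleaned


class PresentationTier:
    """User-facing layer; depends only on the logic tier."""

    def __init__(self, logic):
        self._logic = logic

    def handle(self, user_input):
        return "Stored entity: " + self._logic.register(user_input)
```

Because each tier is reached only through its methods, one tier can be replaced or moved to another machine without changing the others.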

 

Last Updated on Wednesday, 03 February 2010 16:45