Goals of extracting facts from text

EKG Language Processing is a tool for extracting facts from text. Here are examples of tasks that it can solve:

  • automation of service desk ticket processing: classification, detection of widespread problems and failures, identification of the subject of the request, etc.;
  • analysis of contractual documents: identification of the parties to the contract, the subject of the contract, the scope of obligations, additional conditions;
  • analysis of organizational and administrative documents, as well as standards and other regulatory documentation: search for definitions of terms, requirements, and responsibilities, for example in order to find duplications or contradictions;
  • extraction of facts from various text reports, analytical notes, publications, etc.

Solving such problems will allow businesses to:

  • save the time that employees spend reading and analyzing documents,
  • increase the availability of information (the required document or fact may not be found manually),
  • enrich the contents of corporate databases, collect more information for analysis,
  • speed up the processing of customer requests, improve the quality of organizational and administrative documents,
  • automate control over the fulfillment of the requirements of these documents.

Our product is already in commercial use and brings real benefits to organizations!

See also the Service desk automation use case description.

Our technology and advantages

Our solution is unique in that it relies on grammatical analysis of each phrase of the document, moves from the lexical level to the conceptual level, and extracts explicit facts from the text. With this method of processing, the probability of error is extremely low, while the value and accuracy of the result are high. This allows us to solve even problems that are still beyond the reach of large language models (LLMs), which only generate a probabilistic answer to a specific question, and of other NLU (Natural Language Understanding) tools, which mostly amount to fuzzy classifiers of statements. Our approach is more specific (narrower), but much better suited to business problems in which the cost of error is high. There are many examples where LLM "hallucinations" led to serious losses for businesses. Our product does not hallucinate!

The EKG LP algorithm is as follows:

  • extract pure text from the analyzed document (PDF, office formats)
  • perform grammatical analysis of each phrase in the document
  • determine whether each phrase or group of phrases belongs to the desired type of phrases: definition, requirement, obligation, error message, etc.
  • build a "semantic portrait" of the statement: a formalized structure conveying its meaning
  • replace lemmas with concepts in the "semantic portrait"
  • perform the required processing: find duplicates or contradictions, classify statements, write the extracted information to the database, generate answers to questions, etc.
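
To make the steps above more concrete, here is a minimal Python sketch of such a pipeline. It is an illustration under strong assumptions, not the actual EKG LP implementation: the input is taken to be plain text that has already been extracted, a single regular expression stands in for real grammatical analysis, only phrases of the form "The X shall/must do ..." are recognized as requirements, and the names `CONCEPTS`, `Portrait`, `build_portrait`, `to_concepts` and `find_duplicates` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical lemma-to-concept dictionary (an assumption for this sketch);
# a real system would rely on a much richer conceptual model.
CONCEPTS = {
    "supplier": "Party:Supplier",
    "vendor": "Party:Supplier",
    "customer": "Party:Customer",
    "client": "Party:Customer",
    "deliver": "Action:Deliver",
    "ship": "Action:Deliver",
    "pay": "Action:Pay",
}

# Toy stand-in for grammatical analysis: "The <subject> shall|must <verb> ...".
# After a modal verb the verb is in its base form, so it already is the lemma.
REQUIREMENT = re.compile(r"^the (\w+) (shall|must) (\w+)\b", re.IGNORECASE)


@dataclass(frozen=True)
class Portrait:
    """A heavily simplified "semantic portrait": who is obliged to do what."""
    kind: str
    subject: str
    action: str


def split_phrases(text: str) -> list[str]:
    """Split plain text into phrases after sentence-ending punctuation."""
    return [p.strip() for p in re.split(r"(?<=[.!?])\s+", text) if p.strip()]


def build_portrait(phrase: str) -> Optional[Portrait]:
    """Return a portrait if the phrase looks like a requirement, else None."""
    match = REQUIREMENT.match(phrase)
    if match is None:
        return None
    subject, _modal, verb = match.groups()
    return Portrait("requirement", subject.lower(), verb.lower())


def to_concepts(portrait: Portrait) -> Portrait:
    """Replace lemmas with concepts; unknown lemmas are kept unchanged."""
    return Portrait(
        portrait.kind,
        CONCEPTS.get(portrait.subject, portrait.subject),
        CONCEPTS.get(portrait.action, portrait.action),
    )


def find_duplicates(text: str) -> list[tuple[str, str]]:
    """Pair each requirement with an earlier one that has the same portrait."""
    seen: dict[Portrait, str] = {}
    duplicates = []
    for phrase in split_phrases(text):
        portrait = build_portrait(phrase)
        if portrait is None:
            continue  # not a phrase of the desired type
        key = to_concepts(portrait)
        if key in seen:
            duplicates.append((seen[key], phrase))
        else:
            seen[key] = phrase
    return duplicates


sample = (
    "The supplier shall deliver the goods within 10 days. "
    "The customer must pay the invoice. "
    "The vendor must ship the goods on time. "
    "This agreement is governed by law."
)
for first, second in find_duplicates(sample):
    print(f"Possible duplicate: {first!r} / {second!r}")
```

In this sketch the first and third sentences of the sample are reported as a possible duplicate: they use different words ("supplier"/"vendor", "deliver"/"ship"), but their portraits coincide once lemmas are replaced with concepts. The last sentence is skipped because it does not match the requirement pattern.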