The Enterprise Knowledge Graph (EKG) is one of the modern, relevant data architectures. The essence of an EKG is that all information, usually scattered across an organization's many IT systems and data sets, is linked into a single graph, which can be virtual or physical. The graph is a network in which each node is a virtual reflection of some business object. The links between the nodes reflect the semantic dependencies that exist between such objects.
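To make the idea of nodes and links concrete, here is a minimal sketch of a graph stored as subject-predicate-object triples. All identifiers (`customer:42`, `holds`, and so on) are hypothetical and chosen only for illustration; a production EKG would normally live in a dedicated graph store rather than in Python lists.

```python
from collections import defaultdict

# Hypothetical identifiers; each triple is (subject, predicate, object).
triples = [
    ("customer:42", "holds", "account:A1"),
    ("account:A1", "managedIn", "system:CoreBanking"),
    ("customer:42", "servedBy", "employee:7"),
]

def build_index(triples):
    """Index outgoing links by the node they start from."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index

def neighbors(index, node):
    """Return the (predicate, object) pairs leaving a node."""
    return index.get(node, [])

index = build_index(triples)
print(neighbors(index, "customer:42"))
```

Each node stands for a business object, and each predicate names the semantic dependency between two objects.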
In this article we describe international practice in using enterprise knowledge graphs to solve applied tasks of banking and financial sector organizations. An example of solving a real banking-sector problem with our company's tools is described in the article "Calculation of risk per borrower using EKG and logical inference rules".
Banks often take over other financial organizations, inheriting their data and IT systems. Each takeover is a complex task that involves ensuring the continuity of the merged institution's operations by gradually migrating its staff to unified corporate IT systems. At the initial stage of this process, when the staff of the merged organization is still working on the old software, it is necessary to extract data from it for analysis and general reporting. At the final stage, when the transition to unified IT solutions has already taken place, archived copies of legacy systems may still be useful for regulatory requirements or retrospective analysis. To avoid having to manually extract information from legacy systems every time with the help of programmers, you can combine it into an enterprise knowledge graph. A solution of this class is a "single window" of information retrieval. EKG allows you to retrieve, for example, information about a client's transactions for a certain period regardless of which systems they are stored in.
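The "single window" idea can be sketched as follows. This is only an illustration under stated assumptions: the two in-memory lists stand in for a legacy system and a unified system, and every field name (`cust`, `client_id`, `posted_on`, and so on) is invented for the example.

```python
from datetime import date

# Hypothetical extracts from two systems that use different field names.
legacy_system = [
    {"cust": "C1", "dt": date(2023, 1, 10), "amt": 100.0},
    {"cust": "C2", "dt": date(2023, 1, 12), "amt": 55.0},
]
unified_system = [
    {"client_id": "C1", "posted_on": date(2023, 2, 1), "amount": 250.0},
]

def normalize_legacy(row):
    """Map a legacy record onto the shared vocabulary."""
    return {"client": row["cust"], "date": row["dt"],
            "amount": row["amt"], "source": "legacy"}

def normalize_unified(row):
    """Map a unified-system record onto the shared vocabulary."""
    return {"client": row["client_id"], "date": row["posted_on"],
            "amount": row["amount"], "source": "unified"}

def client_transactions(client, start, end):
    """Return one client's transactions for a period, whatever system holds them."""
    rows = [normalize_legacy(r) for r in legacy_system]
    rows += [normalize_unified(r) for r in unified_system]
    selected = [r for r in rows
                if r["client"] == client and start <= r["date"] <= end]
    return sorted(selected, key=lambda r: r["date"])

result = client_transactions("C1", date(2023, 1, 1), date(2023, 12, 31))
for row in result:
    print(row["date"], row["amount"], row["source"])
```

The caller asks one question in one vocabulary; the mapping from each system's own field names is done once, in the normalization functions.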
A similar project, focusing on information about the bank's main business objects (employees, software, hardware, customers, etc.) was implemented by the European company Ontotext in one of the global banking groups. Here are examples of queries that such a solution is capable of answering:
Linking information from different systems needed to answer such questions into an enterprise knowledge graph, according to Ontotext, has made it possible:
Data Governance has been actively developed in recent years and includes such areas as controlling data quality (Data Quality) and tracing data provenance (Data Lineage). Since data is used to make decisions, its validity and credibility are very important to managers. One of the projects to implement Data Lineage tools is described on its website by the American company TopQuadrant.
The customer needed to track its data processing workflows in order to improve operational efficiency and meet the regulator's requirements. The banking group (TopQuadrant does not disclose its name, but the company's clients include such organizations as Morgan Stanley, J.P.Morgan and CapitalOne) used hundreds of information systems and data sources, with many explicit and implicit links between them. Solving the problem required building a complex model describing both the data sets themselves and the business-level entities and concepts represented in them. In other words, it was necessary to formally describe what exactly each data set contains. The model also had to include a description of the program components that transform data, their input and output parameters, and launch modes, as well as the main business processes, policies and reports that involve data.
Linking all the above information into an enterprise knowledge graph requires creating procedures for extracting and transforming data from IT systems and is quite labor-intensive. However, the result is worth it: the customer received a simple and understandable tool for tracing the relationships between business applications and data sets, analyzing the impact of changes in the IT infrastructure on the bank's activities, and facilitating the consolidation of all information into a corporate data lake. Such a tool helps reduce the number of incidents caused by unplanned consequences of changes in the IT infrastructure, improve data availability, and assess the degree of trust in the data used in decision-making.
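Impact analysis of this kind boils down to a graph traversal. The sketch below assumes a hypothetical lineage graph in which an edge means "feeds"; the system, data set, job and report names are invented and do not describe the TopQuadrant project.

```python
from collections import deque

# Hypothetical lineage edges: the key feeds every item in its list.
feeds = {
    "system:Loans": ["dataset:loan_raw"],
    "dataset:loan_raw": ["job:clean_loans"],
    "job:clean_loans": ["dataset:loan_clean"],
    "dataset:loan_clean": ["report:credit_risk", "report:regulatory"],
    "system:CRM": ["dataset:customers"],
    "dataset:customers": ["report:credit_risk"],
}

def downstream(start):
    """Breadth-first search for everything that depends on `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dependent in feeds.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

print(sorted(downstream("system:CRM")))
```

Running the same traversal over reversed edges would answer the provenance question instead: which sources a given report ultimately relies on.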
The German company Metaphacts reports on a project to create a set of master customer data (Customer 360) for an international financial organization. Knowledge graphs make it possible to describe the complex relationships among customers (both individuals and organizations), to reflect the variety of roles a customer can play in relations with the organization, and to consolidate information about the customer received from different sources.
As a tool for creating a single point of access to data, an EKG can also be used to provide information support for users in various roles. The Spanish company Gnoss has implemented such a system at BBVA bank (present in 35 countries, 70 million customers). The system provides an interface for searching the macroeconomic information of BBVA Research, the analytical division of the banking group. The user can find analytical materials using a wide range of filters and navigate through related materials. At another Spanish bank, Gnoss implemented a semantic search system for various types of artifacts, including descriptions of processes, policies, risks, organizational structures, and documents.
A similar project was implemented by Ontotext for BCA Research, the analytical division of Euromoney. This project focused on gathering information from multiple business units across the company using legacy automated systems. An important role in the project was played by the semantic text annotation functionality, by which each text document is annotated with links to key business objects based on an analysis of its content. This allows you to quickly find, for example, all documents of a certain type related to a certain asset, customer or employee.
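As a rough illustration of semantic annotation, the sketch below links text spans to graph nodes with a simple dictionary lookup. This is a deliberately naive stand-in: the products mentioned here rely on far more sophisticated text analysis, and the labels and node identifiers are hypothetical.

```python
import re

# Hypothetical mapping from surface labels to graph node identifiers.
gazetteer = {
    "acme corp": "customer:acme",
    "eur/usd": "asset:eurusd",
    "jane doe": "employee:jdoe",
}

def annotate(text):
    """Return (start, end, node) for every known label found in the text."""
    lowered = text.lower()
    found = []
    for label, node in gazetteer.items():
        for m in re.finditer(re.escape(label), lowered):
            found.append((m.start(), m.end(), node))
    return sorted(found)

doc = "Jane Doe reviewed the EUR/USD exposure of Acme Corp."
annotations = annotate(doc)
for start, end, node in annotations:
    print(doc[start:end], "->", node)
```

Once every document carries such links, finding all documents related to a given asset, customer or employee becomes a lookup by node identifier.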
Another European vendor, The Semantic Web Company (PoolParty), specializes in semantic text analysis using the EKG toolkit. Its clients include financial institutions such as Credit Suisse and The World Bank. In describing EKG use cases in banks and the financial sector, PoolParty specialists emphasize, along with the advantages we have already mentioned, that the single data model on which the graph is built can quickly adapt to changing market needs. They also note that the clearly described semantics (meaning) of graph data contrasts with the often poorly defined meaning of the data used by artificial intelligence algorithms. In addition, the results of graph data processing are always explainable and logically provable, unlike the conclusions reached by machine learning models, which are "black boxes".
Semantic text annotation algorithms developed by PoolParty have found application in the knowledge management system of a large insurance company, a system for analyzing news and markets, a system for supporting the staff of a retail bank (especially relevant from the point of view of ensuring compliance with the requirements of the regulator), and a system of personal recommendations for bank customers. Also interesting is the system of semi-automatic verification of contract texts, which identifies the main business objects mentioned in the text, the relations between them, and then applies a set of rules to check whether the essence of the contract is in line with the bank's policies.
An enterprise knowledge graph allows you to compare information about different objects and events and to find correlations and complex chains of links between them. This makes it a promising basis for anti-fraud tools. Ontotext describes the implementation of a system for detecting potentially fraudulent actions of traders.
A financial institution needs to create a system that automatically checks the actions of traders for compliance with the legislation of several markets. The regulator of each market introduces its own rules, which change over time and differ from the rules of other regulators. The task is to define a set of rules applicable to each trader's action and verify their fulfillment. At the same time, tasks such as detecting coordinated actions of several traders are non-trivial and require processing a huge amount of information in a short period of time.
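The task of selecting the rules applicable to each action and verifying them can be sketched as below. The markets, thresholds and rule identifiers are invented for the example and do not reflect any real regulation or the Ontotext implementation.

```python
# Hypothetical rules: each one belongs to a market and carries a check
# that returns True when the action complies.
rules = [
    {"id": "EU-1", "market": "EU", "check": lambda a: a["volume"] <= 1000},
    {"id": "US-1", "market": "US", "check": lambda a: a["volume"] <= 5000},
    {"id": "US-2", "market": "US",
     "check": lambda a: a["instrument"] != "restricted"},
]

def violations(action):
    """Select the rules for the action's market and list the ones it breaks."""
    applicable = [r for r in rules if r["market"] == action["market"]]
    return [r["id"] for r in applicable if not r["check"](action)]

action = {"trader": "T7", "market": "US", "volume": 2000,
          "instrument": "restricted"}
print(violations(action))
```

Because the rules are data rather than branches in the code, a regulator's change means editing or replacing a rule entry, not rewriting the checking logic.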
EKG can be built using different techniques and technologies, but the most common and practically useful is the creation of a graph whose semantics (meaning) is described by an ontology model. All the companies mentioned in this review use this approach. The use of ontologies allows not just connecting different data objects, but also accurately describing the meaning of each node and edge of the graph. There is a set of ontologies, FIBO (Financial Industry Business Ontology), which formally describes a conceptual schema for graph representation of financial information. This set is supported by the international consortium EDM Council, whose task is to disseminate best practices of data governance in the financial industry. In the described implementation, Ontotext uses this set of ontologies to represent information about traders' actions, thus eliminating problems that may be associated with semantic ambiguity of the source data.
Another major advantage of ontologies over other EKG construction methods is that it is easy to formally describe in terms of ontology the rules applicable to the data included in the graph. There are software components that allow such rules to be automatically applied to the data. The rules themselves are designed by analysts of the subject area and are taken out of the program code, which significantly increases the flexibility of the solution: when new types of entities and new business rules appear, it is enough to make changes in the ontology model. In the described project such rules are used to detect fraudulent actions. It is also convenient to form search queries in terms of ontology. Ontotext in its press release gives the following examples of queries:
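To illustrate rules that live outside the program code, here is a minimal forward-chaining sketch over triples. It is not the rule engine of any vendor mentioned in this article; the rule syntax, the class names (`Trade`, `BlockTrade`, `SuspiciousTrade`) and the `flaggedBy` property are all assumptions made for the example. Terms starting with `?` are variables.

```python
def match(pattern, triple, binding):
    """Extend a variable binding so that the pattern equals the triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def infer(facts, rules):
    """Apply the rules repeatedly until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            bindings = [{}]
            for pattern in body:
                extended = []
                for b in bindings:
                    for fact in facts:
                        b2 = match(pattern, fact, b)
                        if b2 is not None:
                            extended.append(b2)
                bindings = extended
            for b in bindings:
                new_fact = tuple(b.get(term, term) for term in head)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

# Rules are plain data: (list of body patterns, head pattern).
rules = [
    ([("?x", "type", "?c"), ("?c", "subClassOf", "?d")], ("?x", "type", "?d")),
    ([("?t", "type", "Trade"), ("?t", "flaggedBy", "?a")],
     ("?t", "type", "SuspiciousTrade")),
]
facts = [
    ("trade:1", "type", "BlockTrade"),
    ("BlockTrade", "subClassOf", "Trade"),
    ("trade:1", "flaggedBy", "alert:9"),
]
result = infer(facts, rules)
print(("trade:1", "type", "SuspiciousTrade") in result)
```

Adding a new entity type or business rule here means adding a fact or a rule entry; the `infer` function itself stays unchanged.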
Using EKG-based tools, the compliance department of a financial institution was able to detect and analyze a wide range of suspicious activities based on criteria such as the number of transactions and alarms for certain traders. Implementing the tool made it possible to clarify the criteria for describing normal and abnormal trading operations and to develop statistical models for analyzing patterns of suspicious activities.
A solution for preventing fraud is also in Metaphacts' portfolio. The model built within this solution describes the business logic associated with customers and their invoices, contracts, payments, and claims. The solution allows the anti-fraud division to quickly identify suspicious claims using automated tagging and graph search.
Data structures in the financial industry lend themselves well to standardization because they are based on global business rules in this area. Features specific to different markets and jurisdictions are describable by extending standard ontologies. The need for international exchange of financial data has led to the creation of several mature, common ontology data models: besides the FIBO ontology mentioned above, it is necessary to point out the LKIF (Legal Knowledge Interchange Format) ontology for describing regulatory requirements. Based on these ontologies, more specific models are created, such as The Bank Regulation Ontology. The availability of such ontologies gives organizations in the banking sector an advantage that many other industries do not have: it is possible to implement EKG-class tools with minimal effort to create an ontology model, relying on ready-made specifications. The same factor simplifies data exchange between different organizations in the financial sector due to the availability of unified data models for exchange (XBRL taxonomy, a widely used format for exchange of financial statements, is also relevant).
We have considered several examples of enterprise knowledge graph applications in banks and the financial sector and have seen that the prerequisite for their use is a heterogeneous, complex data structure and a diversity of data sources. Consolidating such data into a single graph with formally described semantics makes it possible to solve various applied tasks efficiently and with significantly fewer resources than other tools require. Despite the inertia of the financial sector, which stems from the desire to minimize risks, modernizing approaches to data governance is necessary to avoid drowning in a sea of fragmented and low-quality information. Large financial institutions that are early adopters of such technologies are improving their operational efficiency and gaining important competitive advantages.
If we look at innovations as opportunities rather than risks, what could be the future trajectory of EKG penetration into the banking sector? We believe that in a relatively short time the use of graphs to consolidate information on customers and assets, ensure the quality and availability of information, and combat fraud will become common practice.
In our opinion, the next major step will be the data-centric transformation of banks' IT infrastructure, where data, rather than applications, will become the key IT asset. This will lead, in particular, to the emergence of banking application development pipelines that do not have their own data warehouses and interact with a single array of corporate information available through a data virtualization platform. In the course of the evolutionary transition from the use of cumbersome and expensive proprietary application systems to a data-centric approach, businesses will be able to quickly implement any new functional requirements, test business hypotheses and gain competitive advantage through a truly flexible IT infrastructure.
In the near future, we can expect progress in Natural Language Understanding technologies that use EKG, which will lead to the emergence of truly “smart” chatbots and assistants capable of maintaining a meaningful dialog with the customer using available data. Such assistants promise to improve the customer experience, relieve the burden on support services, and improve the availability of information for bank employees.
DataVera invites Kazakhstani financial services organizations to start solving accumulated data governance problems with the help of modern technologies. We are ready to design and implement solutions for any functional tasks that require processing complex, heterogeneous information, not limited to scenarios similar to those described in this article. The slogan “data is the new oil” has long ceased to be a marketing cliché. In today's environment, the willingness to quickly embrace innovation and use non-standard approaches to solving data challenges is indeed one of the key factors of business efficiency.
Our specialists have competencies and practical experience in implementing EKG projects in large companies, building ontology models, designing and implementing complex IT projects in banking and other industries. We will be glad to realize the potential of innovative technologies in the interests of your organization.