Handling the temporal nature of data is not an easy task. Imagine a simple customer database that says Jane Doe lives in Sydney and works as a clerk. Your manager is about to contact Jane but can only guess when this information was obtained, whether Jane has since moved to another city or changed jobs, and how long she has lived in Sydney.
A typical operational database does not take time into account: it represents only the "current" state of real-world objects, and even that state is not truly current but more or less outdated. It does not record when the data were obtained or when the object was actually in that state.
A slightly more advanced approach records all changes to an object's properties in a history log. This lets you track changes and estimate how out of date your information is, but not much more than that. What if we need to select all customers who lived in Sydney in 2015? Or all customers who have ever worked as a clerk? Most databases with a change history log will not give you a ready answer.
At DataVera, we believe that the potential of the temporal dimension of business data has yet to be unlocked. Our EKG Provider data virtualization platform provides an API for retrieving the state of data at any point in time. Any change written to the platform can be accompanied by a timestamp reflecting when the actual properties of the object changed (or will change). We believe this gives the data a new quality: it can now yield much more analytical information, which can be monetized.
DataVera EKG Provider represents all data according to an ontology model. In the ontological world, time is usually modeled explicitly. If an entity can have different states, each state is represented as a separate object called a "time-part". This means that if we want to track the history of our customers' places of residence and employment, we must create a separate model object for each fact that a customer lives in a certain city or holds a certain job. This is semantically correct, but rather inconvenient and slow for data processing algorithms. A simple question like "who lived in Sydney in 2015?" becomes a complex SPARQL query that is not very efficient on millions of objects.
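To make the cost of explicit time modeling concrete, here is a minimal sketch in plain Python rather than SPARQL. It assumes a hypothetical `ResidenceState` class standing in for a "time-part" object; the class, its fields, and the `lived_in` helper are illustrative names, not part of EKG Provider or of any ontology standard. Even in this toy form, the question "who lived in Sydney in 2015?" turns into an interval-overlap scan over state objects instead of a lookup on the customer.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ResidenceState:
    """One 'time-part': the fact that a customer lived in a city for a period."""
    customer: str
    city: str
    valid_from: date
    valid_to: Optional[date]  # None means the state is still open


def lived_in(states, city, period_start, period_end):
    """Return customers with a residence state in `city` overlapping the period."""
    result = set()
    for s in states:
        if s.city != city:
            continue
        state_end = s.valid_to or date.max
        # Two closed intervals overlap when each starts before the other ends.
        if s.valid_from <= period_end and state_end >= period_start:
            result.add(s.customer)
    return result


# Every change of residence needs its own state object (hypothetical data).
states = [
    ResidenceState("Jane Doe", "Melbourne", date(2010, 1, 1), date(2014, 6, 30)),
    ResidenceState("Jane Doe", "Sydney", date(2014, 7, 1), None),
    ResidenceState("John Roe", "Sydney", date(2016, 3, 1), None),
]

in_sydney_2015 = lived_in(states, "Sydney", date(2015, 1, 1), date(2015, 12, 31))
print(in_sydney_2015)  # {'Jane Doe'}
```

The number of state objects grows with every change to every tracked property, which is the overhead the paragraph above describes.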
Our approach implements data temporality at the platform level. You can run a simple query, such as "who lives in Sydney?", augmented with a timestamp denoting the moment of interest, and get the correct result that reflects the past state of the data. Forget about hundreds of additional data objects for the states of each object, and forget about reifying properties just to reflect their temporal nature. Work with a simple data structure that is still a "snapshot" of the real world, combined with the ability to scroll the time axis in any direction.
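The idea of a plain query plus a timestamp can be sketched as follows. This is not the EKG Provider API, which the article does not show; `TemporalStore`, `write`, `value_at`, and `who` are hypothetical names for a toy in-memory store, and the sketch assumes each write carries the date from which the new value applies.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date


class TemporalStore:
    """Toy in-memory store: each (subject, property) keeps a time-ordered history."""

    def __init__(self):
        self._history = defaultdict(list)  # (subject, prop) -> [(effective, value)]

    def write(self, subject, prop, value, effective):
        """Record that `prop` of `subject` took `value` starting on `effective`."""
        entries = self._history[(subject, prop)]
        entries.append((effective, value))
        entries.sort(key=lambda e: e[0])  # keep history ordered by effective date

    def value_at(self, subject, prop, as_of):
        """Return the value in force on `as_of`, or None if nothing was known yet."""
        entries = self._history.get((subject, prop), [])
        idx = bisect_right([e[0] for e in entries], as_of)
        return entries[idx - 1][1] if idx else None

    def who(self, prop, value, as_of):
        """Answer 'which subjects had prop == value on as_of?'."""
        subjects = {s for (s, p) in self._history if p == prop}
        return {s for s in subjects if self.value_at(s, prop, as_of) == value}


store = TemporalStore()
store.write("Jane Doe", "city", "Melbourne", date(2010, 1, 1))
store.write("Jane Doe", "city", "Sydney", date(2014, 7, 1))
store.write("Jane Doe", "city", "Perth", date(2019, 2, 1))
store.write("John Roe", "city", "Sydney", date(2016, 3, 1))

print(store.who("city", "Sydney", date(2015, 6, 1)))  # {'Jane Doe'}
print(store.who("city", "Sydney", date(2020, 1, 1)))  # {'John Roe'}
```

The caller still sees one object per customer with ordinary properties; the history lives behind the lookup, and moving along the time axis only means changing the `as_of` argument.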
Every piece of data, every property value in EKG Provider is annotated with the source of the data (user, external IT system, etc.) and the time when the corresponding real-world event occurred. This increases the trust in and the value of the data, and opens the way to more intelligent decision-making.
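One way such annotations might be used is sketched below. The `AnnotatedValue` record and the `latest_from` helper are hypothetical, as are the source names; the rule of preferring the most recent value from a trusted source is an assumption made for illustration, not a documented EKG Provider feature.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AnnotatedValue:
    """A property value together with its provenance annotations."""
    value: str
    source: str       # who or what supplied the value (user, external system, ...)
    event_date: date  # when the real-world change happened, not when it was recorded


def latest_from(values, trusted_sources):
    """Return the most recent value whose source is trusted, or None."""
    candidates = [v for v in values if v.source in trusted_sources]
    return max(candidates, key=lambda v: v.event_date, default=None)


# Hypothetical job history for one customer, gathered from three sources.
job_values = [
    AnnotatedValue("clerk", "crm_import", date(2014, 7, 1)),
    AnnotatedValue("accountant", "hr_system", date(2016, 5, 1)),
    AnnotatedValue("manager", "web_form", date(2018, 3, 1)),
]

best = latest_from(job_values, trusted_sources={"crm_import", "hr_system"})
print(best.value, best.source, best.event_date)  # accountant hr_system 2016-05-01
```

Because the source and the event date travel with each value, a consumer can decide for itself how much weight a given value deserves.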