Two years ago I dedicated the first few technical posts on this blog to the topic of modelling time in data. To my surprise these posts started getting a lot of traffic, and still continue to do so. Since an understanding of temporality — in how data is modelled, and especially in how it is processed — was at the foundation of this project, I decided to wrap these blog posts into a talk, which I presented recently at the PyData Global 2021 conference.
The talk shows the pitfalls of anemic and non-temporal data models, and how both OLTP and OLAP systems are gradually converging on storing event-based historical data. It argues that Batch Processing is becoming an inadequate tool for processing such data, and that Stream Processing, which evolved mostly separately from OLAP, now offers the most versatile toolset for working with historical data.
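To make the contrast concrete, here is a minimal sketch of the two modelling styles. All names (`Event`, `balance_at`, the account data) are hypothetical illustrations, not code from the talk: the anemic model overwrites state and loses history, while the event-based model keeps an append-only log of immutable facts and derives state by folding over it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Anemic model: each update overwrites state, discarding history.
balances = {"acct-1": 100}
balances["acct-1"] = 70  # why the balance changed is now lost forever

# Event-based model: an append-only log of immutable, timestamped facts.
@dataclass(frozen=True)
class Event:
    account: str
    delta: int
    occurred_at: datetime

events = [
    Event("acct-1", 100, datetime(2021, 1, 1, tzinfo=timezone.utc)),
    Event("acct-1", -30, datetime(2021, 2, 1, tzinfo=timezone.utc)),
]

def balance_at(account: str, as_of: datetime) -> int:
    # Current state is just a fold over past events, so we can
    # reconstruct it at ANY point in time -- impossible above.
    return sum(
        e.delta
        for e in events
        if e.account == account and e.occurred_at <= as_of
    )
```

With the event log, `balance_at("acct-1", ...)` answers both "what is the balance now?" and "what was it in mid-January?" — the temporal dimension the anemic model throws away.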
You can find the pre-recorded version (and the link to the conference recording) here:
I hope you’ll find it useful — let us know what you think about this theory of data convergence! We’ve recently created a Discord server where you can chat with us and other people about everything data-related.