If I had to name one positive thing that came out of 2020, it would probably be how much more open and accessible tech conferences became during the pandemic. Taking conferences online was a huge challenge, and some unfortunately were not able to adjust in time (e.g. Strange Loop), but those that did drew huge audiences. I definitely broke my personal record in conference attendance this year.
I was also happy to present at the Data+AI Summit (formerly Spark Summit) last month. It was the first public appearance for Open Data Fabric, so it was quite exciting to share it with the world:
The talk goes into a little more detail on the current state of data science, the root problems behind the ongoing reproducibility crisis, and why a radically new approach to managing data is long overdue.
Kamu CLI Update
Meanwhile, we’ve also been busy making lots of improvements to kamu-cli, and we’ve added a whole bunch of new examples that you can easily follow. The examples serve both as a great introduction to temporal data processing and to managing data with Open Data Fabric. Through a few simple and seemingly routine problems they demonstrate how inadequate our current “local optimum” of batch processing is, and how advantageous it is to apply the stream processing paradigm to all kinds of temporal data, no matter how frequently it is updated. A generic illustration of the idea follows below.
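To make the contrast concrete, here is a minimal sketch of the stream processing approach, written in plain PySpark Structured Streaming rather than taken from kamu's examples. The input path, output path, and column names are all hypothetical; the point is that event-time windows and watermarks let results update continuously as late data arrives, instead of re-running a batch job.

```python
# A generic PySpark Structured Streaming sketch (NOT kamu's own API):
# a per-city daily average computed over event-time windows.
# All paths and column names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("temporal-demo").getOrCreate()

# Each record carries its own event time, e.g.:
# {"event_time": "2020-12-01T09:30:00Z", "city": "Vancouver", "temp_c": 7.1}
readings = (
    spark.readStream
    .schema("event_time TIMESTAMP, city STRING, temp_c DOUBLE")
    .json("/data/weather/")  # hypothetical input location
)

# The watermark tells the engine how long to wait for late records,
# so results converge incrementally instead of requiring a
# "sit and wait, then recompute everything" batch cycle.
daily_avg = (
    readings
    .withWatermark("event_time", "1 day")
    .groupBy(F.window("event_time", "1 day"), "city")
    .agg(F.avg("temp_c").alias("avg_temp_c"))
)

query = (
    daily_avg.writeStream
    .outputMode("append")
    .format("parquet")
    .option("path", "/data/daily_avg/")           # hypothetical output
    .option("checkpointLocation", "/data/_ckpt")  # required by streaming sinks
    .start()
)
query.awaitTermination()
```

The same query keeps producing correct daily aggregates whether the data arrives every second or once a month, which is exactly the property the kamu-cli examples build on.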
As I said in my talk, it's time to face it: batch processing's only answer to temporal problems is to sit and wait until those problems disappear. That's definitely not how the data pipelines of the future will be built.
Join us, and let’s instead face the real problems head-on!