As a company, you rarely get the opportunity to start from scratch. As a startup, you do, and you have to choose whether to go with the status quo or to create something new and test the boundaries of what is possible.
Operational vs. Analytical
For many years, the data plane within an organization has been split between operational and analytical data efforts. This split was driven partly by the two sides using different data models and having different workload characteristics, but it was also enforced by a hard requirement: analytical efforts must never impact operational ones.
The analytical realm has been of internal value to an organization, a tool to improve itself. It provided insights and guidance on the direction to take by observing state and behavior within the operational realm. It was a tool for managers and investors, much less for operational employees and customers. The operational realm, by contrast, has been focused on executing the core business processes. Times have changed, and this separation isn't as clear-cut as it used to be.
Speed is demanded
Customers require more innovative products and services and insights into their relationship with the organization. This desire drives the analytics right into the operational realm. On top of that, the speed at which organizations operate is constantly increasing. For example, it isn't enough for a company to reply to a customer the same day; the customer demands an answer now. As a result, where we previously had hours or even days to perform our analytics for a limited audience, we now have to serve a broader audience in a matter of seconds.
Building from the ground up
These insights required us to drop traditional approaches and rethink how our data platform would be built from the ground up. We started by defining the criteria our platform should meet and the functionalities it needed to provide. We also drew on respected views on this matter, such as a blog post by Confluent. Eventually, each criterion could be assigned to one of these categories:
Flexibility
Robustness
Scalability
Responsiveness
Maintainability
So how's that for a list of buzzwords? Let me dive a bit deeper into what these mean in the context of KOR.
KOR is a startup, and while we have a pretty good idea of what we want to accomplish, we have the advantage of being agile and having speed of execution work in our favor. We need to push boundaries out without breaking any existing functionality (or the bank, for that matter). This requires any solution we come up with to provide flexibility to quickly change or add new components to the platform without impacting the other components.
KOR operates within a regulated financial services space, which demands high uptime and imposes strict requirements on data consistency. We simply cannot lose data, period. This is a common requirement for any data system and is usually tackled at the infrastructure layer, but solution designers often overlook one implication: not losing data also means we need to be able to deal with human mistakes. Traditionally, we did so by taking backups, but restoring a backup turned out to be an enormous and error-prone job. We want to make sure we tackle robustness early on, within the foundation of our architecture.
At its core (pun intended), KOR is a data platform, and while we have a good idea of the amount of data we'll need to deal with, we will most definitely be off in our estimations. Being a startup also means we will have a growth curve: relatively small data volumes initially, growing as time passes. This growth is compounded by regulatory retention requirements, which might require us to retain data for decades to come while keeping that data available to be queried. Combined, these factors drive the need for scalability, not only in the way we store our data but also in the way we process it.
While in many cases it's fine to take a few hours to analyze the data, we believe decreasing that delay will lead to better quality and to fewer mistakes with less impact. The longer it takes to discover an error, the greater the damage it can do to an organization. We also believe that reducing the delay in processing and reporting opens up new opportunities, both for our clients and for the regulators. As a result, responsiveness became one of our objectives.
Last but not least, we required our solution to be maintainable. KOR’s ability to move fast comes from optimizing maintainability through automation. This allows us to spend time on what matters most: creating value for our clients. Our platform components should be predictable and easy to understand, deploy, and manage.
Memorizing data backbone
At KOR, we made up our minds and started designing our foundation from the ground up, setting out to define an approach in which events are central to the organization's workings and are treated as first-class citizens. The result is a company with a memorizing data backbone: a ledger of events interconnecting all our products and processes.
There is no central data model. Instead, each of our services uses a specialized data model, one that is optimized to get the most out of the service in terms of efficiency and performance. When events arrive from the backbone, the service applies them to its data model. When a service makes changes, it updates the data in its own model and sends out an event on the data backbone for other services to react to.
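To make this flow concrete, here is a minimal in-memory sketch in Python. It is an illustration under stated assumptions, not KOR's actual implementation: the `Backbone`, `TradeService`, and `ExposureService` classes, the `trade_reported` event type, and the replay-on-subscribe behavior are all hypothetical names and choices made up for this example. A real backbone would be a durable, distributed log rather than a Python list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical event type, e.g. "trade_reported"
    payload: dict  # event data; the shape is an assumption for this sketch


class Backbone:
    """In-memory stand-in for the event ledger (append-only list)."""

    def __init__(self) -> None:
        self._ledger: List[Event] = []
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None], replay: bool = True) -> None:
        # Assumption: because the backbone "memorizes" events, a new
        # subscriber can first be fed everything recorded so far.
        if replay:
            for event in self._ledger:
                handler(event)
        self._subscribers.append(handler)

    def publish(self, event: Event) -> None:
        self._ledger.append(event)  # record first, then notify
        for handler in list(self._subscribers):
            handler(event)


class TradeService:
    """Owns a per-trade model and emits an event after each local change."""

    def __init__(self, backbone: Backbone) -> None:
        self._backbone = backbone
        self.trades: Dict[str, dict] = {}

    def report_trade(self, trade_id: str, counterparty: str, notional: float) -> None:
        self.trades[trade_id] = {"counterparty": counterparty, "notional": notional}
        self._backbone.publish(Event("trade_reported", {
            "trade_id": trade_id,
            "counterparty": counterparty,
            "notional": notional,
        }))


class ExposureService:
    """Keeps a running total per counterparty, a model shaped for its own queries."""

    def __init__(self, backbone: Backbone) -> None:
        self.exposure: Dict[str, float] = {}
        backbone.subscribe(self.apply)

    def apply(self, event: Event) -> None:
        if event.kind != "trade_reported":
            return  # ignore events this service does not care about
        counterparty = event.payload["counterparty"]
        self.exposure[counterparty] = (
            self.exposure.get(counterparty, 0.0) + event.payload["notional"]
        )


if __name__ == "__main__":
    backbone = Backbone()
    trades = TradeService(backbone)
    exposure = ExposureService(backbone)
    trades.report_trade("T1", "ACME", 100.0)
    trades.report_trade("T2", "ACME", 50.0)
    late_joiner = ExposureService(backbone)  # built after the fact, from replay
    print(exposure.exposure, late_joiner.exposure)
```

The two services never share a schema. `TradeService` stores individual trades, while `ExposureService` stores only an aggregate, and the event is the sole contract between them. The late-joining service shows one possible benefit of keeping the ledger: a new component can build its own model from history without touching the existing ones.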