SQL Event Store. Maybe Yes?

5 minute read Published: 2022-07-14

Last year I published a post about how using a SQL based event store might not be the best idea. My argument in that post boils down to two things.

  1. Building an event-store on top of a SQL database isn't as easy as you'd think
  2. Because of the above you should consider using a purpose built tool

What the post did not do a good job of was elaborating on the trade-offs of using a purpose built event store. Just like building an event store on SQL isn't easy, using a purpose built event store as your source of truth datastore comes with some interesting challenges. This post is an exploration of those challenges.

Terminology

Before we go further it's a good idea to get some terminology out of the way to make sure we're speaking the same language.

Stream
A sequence of ordered events, typically for a single process or entity e.g. User-1
Category
A grouping of related streams e.g. User-1 belongs to the User category
Optimistic Concurrency Control
The ability to write to a stream if and only if it's last event has a certain version

What is a purpose built event store?

Yves Lorphelin has created a fantastic enumeration of expected APIs for event stores. I'll however say that in most scenarios you really only need three APIs.

  1. Appending events to a stream with optimistic concurrency control
  2. Reading a stream
  3. Reading across streams (on a per-category basis)

On top of these basics a purpose built event store like EventStoreDB will give you two killer features:

  1. Real-time streaming as a first class citizen
  2. A globally ordered $all stream

So why would I consider NOT using a purpose built event store?

I think you'd be wise to consider at least two challenges

  1. Your cloud provider doesn't have a managed offering
  2. You might not need real-time streaming

Your cloud provider doesn't have a managed offering

If you're building something, the last thing you want is more hardware/infrastructure to manage. Most cloud providers however do not have a cloud-native event store offering. This in turn means you can either look for a third party to manage your event store, or manage it yourself.

Contrast this with the ubiquity of managed and cloud native PostgreSQL offerings. AWS Aurora, as an example, is fully compatible with the postgres API and will likely cost you far less than a managed event store. It has a 99.99% uptime SLA, offers easy read-replicas, backups, and more.

Managed EventStoreDB on the other hand will give you a 99.5% uptime SLA. That's almost 2 days of downtime a year! The last thing you want for your primary datastore is a system that is unreliable.

You might not need real-time streaming

While real-time streaming sounds like an absolute must-have it comes with a set of challenges. Firstly it tends to be push based. Push based systems are inherently less reliable than pull based systems.

The way these push-based real-time subscriptions work relies on having a connection open to the event store at any given time. Because these connections are long-lived and the protocol might not be the simplest, the job of handling network failures ends up in your application layer.

What happens if the node you're connected to goes down, or there's a network failure?

In these cases your connection should drop. Maybe you had the foresight to implement a retry, so your application will attempt to open a new connection subscribing from your current checkpoint. You may also get into a situation where your application will never be able to connect to the event-store again. As an example, Alpine linux has some well known issues with how DNS is handled, sometimes it's better to let your process crash and leave it to your orchestrator to do the work of recycling it.

A popular approach to handling network failures these days is to employ side-car proxies like Istio. Istio is a man-in-the-middle attack you perpetrate on yourself because it promises you nice things like "solving all your network issues." In practice, it tends to break your network in very subtle ways. It tries to be "smart" so your application doesn't have to be. One way it tries to be helpful is by handling retries on your behalf. In reality this means your application won't know that the event store is down, from the application's perspective it's just taking a really, really long time to respond.

Takeaways

Using a purpose built event store is no bed of roses. All decisions have trade-offs. In many cases starting your journey out on a SQL database is the right decision. It means fewer moving parts, and less time spent on maintaining infrastructure. The ubiquity of relational databases has meant a lot of innovative things have popped up around them. These days if you need real-time push-based streaming from Postgres then Superbase has got you covered. Migrating relational databases has long been a risky and hard endeavour, but services like Amazon's DMS have cropped up promising to take much of that burden away from you. The list goes on and on.

Ultimately, SQL gives you options, and options are something we shouldn't throw away early in our journey.