
The promise of radically simpler data management

Data was never the “new oil”. Whoever coined that phrase reached for a lazy metaphor and has clearly never worked with data.

Unlike oil, data ain’t running out. It’s exploding. And it doesn’t need to be “mined” either. 

For 20 years, I have been creating, productising, and selling enterprise data management to the world’s most complex organisations. 

These organisations all have two things in common: access to data in their (own!) systems is way, WAY too complicated, borderline impossible; and data quality is low.

Granted, data is a boring, abstract subject. But data is where the money is. Even if you’re used to hearing large market/TAM numbers, the figures for data management and data integration will blow you away: $110bn and $15bn per year, respectively.

So, data is HUGE. But it’s also stuck in the 90s. The tools out there are depressing. One bank I sold to in the past was quoted over $100,000 to add two columns to a CSV file.

And that’s nothing. Projects fail constantly, year round, because access to data is too hard, and that becomes a huge barrier to adopting any new tech. So everyone, from the CEO down, has to care about this.

Making access easy — Enter AI

The current generation of AI marks the first time we can feel genuinely excited about making it easier to tap into corporate data stores.

With a rethink of tooling, access to data should become radically easy and available to anyone in a large business.

Plugging holes in data quality (missing values, duplicates, etc.) can now become proactive and automated, rather than breaking someone’s workday at 5pm on a Friday, when they want to go home.

This means that when integrating new technology, firms can move away from that sort of disaster project and instead focus on customer functionality.

Why isn’t this already done?

In the SMB world, you can buy integration-as-a-service platforms that sort out your data. They just hook up your systems: Hubspot, Netsuite, Xero, whatever you’re using.

In a large financial institution, you can’t do that: nothing is standard, everything is complex, there are tons of customisations, and the systems involved are often poorly documented.

So, data access in this environment is just fundamentally hard: it requires understanding the business context of the data, proprietary system behaviour, and annoying file formats that require lexical data manipulation (COBOL copybooks, etc). 
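To make that concrete, here is a toy sketch of the kind of lexical manipulation a fixed-width mainframe record demands. The field layout below is invented for illustration; real copybooks pile on packed decimals (COMP-3), REDEFINES clauses, and EBCDIC encodings:

```python
# Invented layout: (field name, start offset, length, type to cast to).
# A real layout would come from parsing the copybook itself.
LAYOUT = [
    ("account_id", 0, 8, str),
    ("currency",   8, 3, str),
    ("balance",   11, 9, int),  # e.g. PIC 9(9): an unsigned zoned integer
]

def parse_record(line: str) -> dict:
    """Slice one fixed-width record into typed fields."""
    record = {}
    for name, start, length, cast in LAYOUT:
        raw = line[start:start + length]
        record[name] = cast(raw) if cast is int else raw.strip()
    return record
```

None of this is hard in isolation; the pain is that every system has its own layout, and the documentation for it retired years ago.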

AI has a huge part to play here. Imagine a world where companies can swap out decades-old legacy systems with ease because the glue code is generated — people will bite your hands off.

Going active with data quality

Once you have access to data, you need to control its quality. Most of us are familiar with data quality problems: check out a few records in whatever CRM you’re working with. Chances are, there are gaps and errors. We want to fix those.
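Here is a minimal sketch of that “check out some records” audit: flag missing values and duplicates in a batch of CRM-style contacts. The field names are illustrative:

```python
def audit(records, required=("name", "email")):
    """Return (index, issue) pairs for missing fields and duplicate emails."""
    issues = []
    seen = {}  # normalised email -> index of first record that used it
    for i, rec in enumerate(records):
        for field in required:
            if not rec.get(field):
                issues.append((i, f"missing {field}"))
        key = (rec.get("email") or "").lower()
        if key and key in seen:
            issues.append((i, f"duplicate of record {seen[key]}"))
        elif key:
            seen[key] = i
    return issues
```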

Unfortunately, the current data quality industry is a bit like a doctor who looks at your sore throat, says “yup, you have a problem”, and then doesn’t fix it for you (but charges you a ton).

Thanks a lot, doc.

Take a guess: in firms with 10,000 operations staff and 5,000 IT staff, how much time do you think is wasted on workarounds attributable to bad data quality? How many processes exist only as workarounds, providing no value?

I’ll leave it to you. But it’s a number that runs into the millions, per company.

We can do much better with AI, thanks to:

  • tabular foundation models
  • better ability to generate synthetic data, and
  • wider availability of structured and unstructured data in general

Data quality improvement should move from passive to active, and from once-in-a-while to continuous, so that we can finally get to a reasonably “self-healing” architecture.
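What “active” might look like in miniature: a check-and-repair pass meant to run on every batch rather than once a quarter. The rules and field names here are invented for illustration:

```python
def heal(record):
    """Apply repair rules to one record; return (fixed record, repair log)."""
    fixed = dict(record)
    repairs = []
    # Rule 1: normalise known country-name variants to an ISO code.
    if fixed.get("country") in ("UK", "U.K.", "United Kingdom"):
        fixed["country"] = "GB"
        repairs.append("country normalised to ISO code")
    # Rule 2: strip whitespace and lowercase emails.
    if fixed.get("email"):
        cleaned = fixed["email"].strip().lower()
        if cleaned != fixed["email"]:
            fixed["email"] = cleaned
            repairs.append("email cleaned")
    return fixed, repairs
```

The sketch is rule-based; the AI-shaped version replaces hand-written rules with learned ones, but the continuous check-then-repair loop stays the same.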

So what does it all mean?

Back to brass tacks. Messing around with data costs businesses billions of dollars per year in aggregate.

Rapid and well-functioning access to data is vital for any company of reasonable size. It is the only way to understand the market and customers, and to keep the company safe.

And yet, despite all the advances in AI, access to data is stuck two decades in the past: documents are written, translated by hand into SQL statements, and plonked into a proprietary ETL tool.

Dear hero-startup changing that… give us a shout 🙂