
How ENSEK Built a Data Platform for Day-One Gross Margin Accuracy

Written by Adam Seed | Mar 5, 2026 12:46:28 PM

As ENSEK continues to invest in stronger, more scalable gross margin reporting, the technical foundations behind that progress are coming into sharper focus. In this post, we look under the hood at the data platform that made day‑one gross margin accuracy possible — and how it was engineered to deliver speed, scale and audit confidence from the outset.

 

Quick Take: ENSEK’s Data Platform Story

ENSEK’s Head of Data, Adam Seed, explains how we engineered a high-performance data platform that delivers “day-one” gross margin accuracy. In line with ENSEK’s mission to power Lives, not just Loads, we built the platform to deliver financial clarity at scale — processing over 100 billion rows of data in just four hours, with 99.99999% accuracy.  

By leveraging Databricks and a layered lakehouse architecture, we ensured the solution supports operational goals, enables customer integration, and lays the foundation for future cost data inclusion and advanced analytics maturity. 


Why Gross Margin Accuracy Is Business-Critical

Gross margin reporting is one of the most important data products we deliver at ENSEK. It’s not just a report — it’s a product in its own right. We sell it. Customers rely on it. And it has to be right. 

Every month, by 9:00am on the 1st, we need to deliver a complete gross margin output that includes all data up to midnight the night before. That’s a hard service-level agreement (SLA). It’s tight — some might say ridiculous — but it’s non-negotiable. And it’s not just about speed. The output has to reconcile to our sales ledger within GBP 150,000. When you’re dealing with billions in revenue, that’s a rounding error. But getting there? That’s a whole different story.  
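
To make that reconciliation requirement concrete, the sketch below shows the general shape of such a tolerance check. It is purely illustrative: the Spark session, the table names (gross_margin_output, sales_ledger) and the column names are assumptions for the example, not ENSEK’s actual schema or process.

```python
# Purely illustrative tolerance check; table and column names are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

TOLERANCE_GBP = 150_000  # the reconciliation threshold described above

# Total revenue in the month-end gross margin output (hypothetical table).
gm_total = (
    spark.table("gross_margin_output")
    .agg(F.sum("revenue_gbp").alias("total"))
    .first()["total"]
)

# Total posted to the sales ledger for the same period (hypothetical table).
ledger_total = (
    spark.table("sales_ledger")
    .agg(F.sum("amount_gbp").alias("total"))
    .first()["total"]
)

variance = abs(gm_total - ledger_total)
if variance > TOLERANCE_GBP:
    raise ValueError(f"Reconciliation failed: variance of GBP {variance:,.2f} exceeds tolerance")
```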

 

The Scale Problem: Processing 100 Billion+ Rows in Hours

Our first major hurdle was the scale of the data. For one customer alone, we’re talking about 10 million meters, each generating half-hourly data. That’s roughly 36 billion rows per year. Multiply that over seven years and you’re into the hundreds of billions. The complexity increases because this data comes from a number of fundamental sources, including Ignition, our billing system, and industry flows; it’s not just volume, it’s breadth. Processing this data in under nine hours is not trivial.

When we last ran the job on our production database, it took three and a half days to complete. That’s obviously not going to cut it when you’ve got a nine-hour window. And then there’s the accuracy requirement — reconciling to within GBP 150,000. That’s where things get really tricky. You’re dealing with messy realities like unallocated cash. How do you represent that on a usage marker? It’s not straightforward.

 

How We Engineered a Four-Hour Gross Margin Pipeline

To meet the SLA, we built a new data platform using Databricks and a lakehouse architecture. (A lakehouse combines the structure of traditional databases with the scalability of data lakes.) We structured the platform into three layers: 

  • Point-in-Time Layer (PiT): A forensic snapshot of all data sources at any moment, enabling traceability and auditability (see the sketch after this list).

  • Feature Vault: The core data engineering layer, where business logic and quality checks are applied.

  • Public Data Layer: A simplified, analytics-ready dataset with a full data dictionary and ownership metadata.
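
To illustrate how a point-in-time layer can support that kind of traceability, the sketch below uses Delta Lake time travel to read a source table exactly as it stood at a chosen moment. The table name and timestamp are hypothetical; this shows the general pattern rather than ENSEK’s actual implementation.

```python
# Illustrative only: Delta Lake time travel as one way to realise a
# point-in-time snapshot. Table name and timestamp are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read the raw source exactly as it stood at midnight on the 1st, giving a
# forensic, auditable view that downstream layers can build on.
pit_snapshot = (
    spark.read
    .option("timestampAsOf", "2026-03-01 00:00:00")
    .table("raw.industry_flows")
)

# The Feature Vault would apply business logic and quality checks on top of
# this frozen view, and the Public Data Layer would expose the simplified,
# documented result.
pit_snapshot.createOrReplaceTempView("industry_flows_pit")
```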

This architecture, together with a combination of streaming and batch processing, enables us to process gross margin data in four hours, with the remaining time used for data preparation and validation. We also offer visualisation through Sigma, our newly adopted tool, which is capable of the complex, accountancy-style visualisations that gross margin reporting demands. On top of that, we’ve built a backbone data integration service that ships terabytes of raw and aggregated data directly to our customers’ data warehouses, offered as a service we call data lake integration. This means they don’t need to hire an army of data engineers to get data into a usable format; we handle the complexity for them, using Amazon Web Services (AWS) to ensure scalability and performance.
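
As a rough sketch of what combining streaming ingest with a month-end batch aggregation can look like on Databricks, consider the example below. The paths, table names, columns and trigger settings are assumptions for illustration, not ENSEK’s pipeline.

```python
# Illustrative sketch of a streaming-plus-batch pattern on Databricks.
# Paths, table names and column names are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Streaming: incrementally land half-hourly meter readings into a Delta table
# with Auto Loader, so month-end processing starts from already-ingested data.
(
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .load("s3://example-bucket/meter-readings/")   # hypothetical landing path
    .writeStream.format("delta")
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/meter-readings/")
    .trigger(availableNow=True)                    # process what is available, then stop
    .toTable("feature_vault.meter_readings")
)

# Batch: at month end, aggregate the landed readings into an analytics-ready table.
(
    spark.table("feature_vault.meter_readings")
    .groupBy("account_id", F.date_trunc("month", F.col("reading_ts")).alias("month"))
    .agg(F.sum("consumption_kwh").alias("consumption_kwh"))
    .write.mode("overwrite")
    .saveAsTable("public.gross_margin_consumption")
)
```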

 

What Makes Our Data Platform Stand Out

What sets our approach apart is the level of engineering and precision we’ve achieved. We’re not just hitting the SLA — we’re smashing it. The gross margin job now runs in four hours. That gives us a buffer for prep and validation, and it means we’re consistently delivering on time. 

And the accuracy? We’re talking about “twelve nines” — 99.9999999999%. That level of precision didn’t come easy. We probably got to 99% in six months. But chasing that last fraction? That took another eight months. It’s a testament to the team’s commitment and the robustness of the platform. 

We don’t cut corners. The data we deliver includes everything — warts and all — so customers get the full picture. That transparency builds trust.

 

Results: From Days to Hours, with Industry-Leading Accuracy

Here’s what’s changed, in real terms: 

  • Processing time reduced from 3.5 days to 4 hours.

  • On-time delivery every month (SLA achieved by 9:00am). 

  • Integration with National Grid and customer systems is underway.

  • Embedded BI: use of Sigma to deliver this as a true SaaS product offering, removing barriers to access and understanding.

  • Scalable data delivery via AWS and backbone integration service. 

These results help us work smarter, stay compliant, and deliver faster — exactly what we’re aiming for.

 

What’s on the Horizon: From Gross Revenue to Full Margin Intelligence 

Right now, our gross margin product is essentially a gross revenue report. We’ve got the consumption data, but we’re still working on integrating all the cost elements — distribution, renewables, transmission. That’s the next big step. It’s on the 2026 roadmap, though it was originally planned for this year. (Market-wide half-hourly changes pushed us back a bit.) 

We’re also looking at Delta Sharing — a Databricks feature that enables secure data sharing between platforms. It’s a cleaner, more efficient way to get data where it needs to go. 
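
By way of illustration, consuming a table through Delta Sharing can be as simple as the sketch below, using the open-source delta-sharing Python client. The profile file and the share, schema and table names are hypothetical placeholders, not an ENSEK endpoint.

```python
# Illustrative only: reading a shared table with the open-source
# delta-sharing client. Profile and table names are hypothetical.
import delta_sharing

# The provider issues a profile file containing the sharing endpoint and token.
profile = "config.share"

# Shared tables are addressed as <profile>#<share>.<schema>.<table>.
table_url = f"{profile}#example_share.gross_margin.monthly_output"

# Load the shared table into a pandas DataFrame on the consumer side.
df = delta_sharing.load_as_pandas(table_url)
print(df.head())
```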

In parallel, we’re advancing our data maturity model — a framework for evolving how we capture, structure, and use data to drive business value.  We are looking far beyond traditional dashboards. The next frontier is conversational analytics powered by Artificial Intelligence (AI). However, in a highly regulated energy market, sending sensitive financial and commercial data to public AI models is a non-starter. That is why we are exploring a self-hosted Large Language Model (LLM) deployed directly within our secure infrastructure. This approach guarantees absolute data sovereignty.  

The grand vision? Allowing you to query your “Ledger of Truth” in plain English, asking “Why did our gross margin increase for these particular products?”, and getting an instant, auditable answer without your data ever leaving the ENSEK ecosystem. This is how we redefine financial intelligence.

We’ve come a long way — but there’s still a lot more to build.


Note: ENSEK’s Financial Assurance capability is delivered through our Ignition platform and is not available as a standalone product. This ensures seamless data integration, governance, and audit-grade controls.