The challenge of matching property data in the real world
Originally published by Geovation Scotland
Building Atlas joined Geovation Scotland in March this year as an Enterprise-in-Residence at the Meadowbank House headquarters. Here, Engineering Lead Paul Symmers tells us more about the challenges facing retrofitting in the UK and what Building Atlas can do to help…
To reach the UK’s Net Zero goals, a building needs to be retrofitted every 2 minutes. At Building Atlas, we built a platform to help building owners reach that ambition. We work with thousands of buildings at a time. That should be simple – buildings are physical things, right?
Not exactly. It depends on who you ask. To:
Royal Mail, a building is a delivery address
the Valuation Office Agency, it’s a set of tax-assessable units
a local authority, it’s a mix of units, land parcels and services
utilities, it’s often just a billing address
an owner, it’s a home, office or shop
And to us, a building is a dynamic collection of attributes (structure, use, ownership, tenancy and energy performance) drawn from a myriad of datasets that often disagree with each other.
If you’re building anything in the UK property or retrofit space, you’ve likely hit this wall too. So, let’s talk about the elephant in the postcode: matching data is much harder than it should be.
The core problem: “a building” isn’t one thing
Matching data sounds simple in theory. You take two datasets and find the same building in each. But which “same” are we talking about?
A postcode might cover 170 flats, a pub and a kebab shop
One physical building could have multiple postcodes or three different street addresses (we’ve seen it)
Units in the same building might appear on different business rates records, or not at all
EPCs (Energy Performance Certificates) might cover one unit or five, or the same unit might have one EPC per floor if each floor has its own heating system
And don’t get us started on energy meters: meter IDs such as MPANs are linked to addresses that often bear little resemblance to reality and have no other identifiers to match against
In other words, matching data to buildings is a constant battle between representation and reality.
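To make the one-to-many problem above concrete, here is a minimal sketch that checks whether a postcode-only join would be ambiguous. The records, postcodes and source names are made up for illustration and do not come from any real dataset.

```python
from collections import defaultdict

# Hypothetical, made-up records: (source, postcode, address text).
records = [
    ("epc", "ZZ1 1AA", "Flat 1, 10 Example Street"),
    ("epc", "ZZ1 1AA", "Flat 2, 10 Example Street"),
    ("rates", "ZZ1 1AA", "Ground Floor, 10 Example St"),
    ("rates", "ZZ1 1AB", "The Example Arms, 12 Example Street"),
    ("epc", "ZZ1 1AB", "12 Example Street"),
]


def group_by_postcode(rows):
    """Bucket records by postcode, then by source within each postcode."""
    buckets = defaultdict(lambda: defaultdict(list))
    for source, postcode, address in rows:
        buckets[postcode][source].append(address)
    return buckets


def ambiguous_postcodes(rows):
    """Return postcodes where any single source has more than one record,
    so a join on postcode alone cannot be one-to-one."""
    result = []
    for postcode, by_source in group_by_postcode(rows).items():
        if any(len(addresses) > 1 for addresses in by_source.values()):
            result.append(postcode)
    return sorted(result)


print(ambiguous_postcodes(records))
```

In this toy data only the first postcode is flagged, because two EPC records share it. Real portfolios tend to have far more of these collisions, which is why postcode alone is rarely a safe join key.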
We’re lucky to have had support from Geovation Scotland, but most people don’t get that kind of access.
Manual matching is a dead end (we’ve tried it)
Let’s say you’re an asset owner or a consultant trying to assess a building portfolio without tools like ours. Could you match the data yourself? Technically, yes – but here’s what you’d be up against:
EPCs are public, but the address might not match what’s on the front door, especially for multi-unit buildings. You could have five certificates for one building, or one certificate for five units
Business rates can be searched manually, but making sense of which unit is which (or whether anything’s missing) is incredibly hard
Building geometry, roof type and construction materials are essential for retrofit planning, but they aren’t easily available unless you have the technical infrastructure to access the OS National Geographic Database (NGD)
Even identifying which addresses are part of the same physical building requires spatial joins with NGD or AddressBase, which is beyond the capabilities and budget of a typical non-technical user
You can do some of this manually. But it’ll take about an hour per building, and you still won’t get the complete picture. That’s why so many teams fall back to physical site visits, starting from scratch every time.
It’s not scalable and it’s not necessary.
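For readers wondering what a spatial join involves, the sketch below groups address points by the building footprint they fall inside, using a plain ray-casting point-in-polygon test. It is an illustration only: the coordinates and identifiers are invented, nothing here reflects the actual NGD or AddressBase schemas, and a real pipeline would normally use a GIS library or a spatial database instead of hand-rolled geometry.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon, given as (x, y) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's y value can be crossed
        # by a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_addresses_to_buildings(addresses, footprints):
    """Map each building id to the address ids located inside its footprint."""
    matches = {building_id: [] for building_id in footprints}
    unmatched = []
    for address_id, (x, y) in addresses.items():
        for building_id, polygon in footprints.items():
            if point_in_polygon(x, y, polygon):
                matches[building_id].append(address_id)
                break
        else:
            # No footprint contained this address point.
            unmatched.append(address_id)
    return matches, unmatched


# Hypothetical footprints and address points in arbitrary local coordinates.
footprints = {
    "building_a": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "building_b": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
addresses = {
    "addr_1": (2, 3),
    "addr_2": (8, 8),
    "addr_3": (25, 5),
    "addr_4": (15, 5),  # sits in the gap between the two footprints
}

matches, unmatched = join_addresses_to_buildings(addresses, footprints)
print(matches, unmatched)
```

Even this toy version shows the useful by-product of a spatial join: addresses that land in no footprint are surfaced explicitly and can be reviewed separately.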
What we do at Building Atlas
Our mission is to decarbonise buildings at scale. To do this we need accurate, detailed representations of buildings to model the best energy efficiency retrofit measures for each of them. We’ve built the foundation to untangle this mess for all non-domestic buildings across the country.
At Building Atlas, we’ve built a property intelligence engine that connects disparate datasets (AddressBase, NGD, business rates, EPCs, utilities, planning) using UPRNs where possible, and smart inference where not. We treat each building as a flexible data object, with a parent-child hierarchy of units and a clear record of how confident we are in each match.
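As a rough illustration of the idea, a building with a parent-child hierarchy of units and a confidence score per match could be modelled along these lines. This is a sketch under our own assumptions, not the Building Atlas data model: the class names, field names, UPRN value and 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Attribute:
    name: str          # e.g. "epc_rating" (hypothetical attribute name)
    value: object
    source: str        # dataset the value came from
    confidence: float  # 0.0 to 1.0: how strongly the record matched


@dataclass
class BuildingNode:
    label: str
    uprn: Optional[str] = None
    attributes: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def add_child(self, child):
        """Attach a unit beneath this node and return it for chaining."""
        self.children.append(child)
        return child

    def walk(self):
        """Yield this node and every descendant unit, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def low_confidence(self, threshold=0.8):
        """List (label, attribute name, source) for matches below the threshold."""
        return [
            (node.label, attr.name, attr.source)
            for node in self.walk()
            for attr in node.attributes
            if attr.confidence < threshold
        ]


# Made-up example: one building with two child units.
building = BuildingNode("Example House")
shop = building.add_child(BuildingNode("Ground floor shop", uprn="000000000001"))
office = building.add_child(BuildingNode("First floor office"))
shop.attributes.append(Attribute("epc_rating", "C", "epc", 0.95))
office.attributes.append(Attribute("rateable_use", "office", "business_rates", 0.6))

print(building.low_confidence())
```

Keeping the source and confidence on every attribute, instead of on the building as a whole, is what lets weak matches be reviewed individually without discarding the strong ones.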
Key features of our approach:
UPRN-first: Wherever possible we anchor data to UPRNs
Conflict-aware: If two sources disagree on floor area or usage we don’t hide it, we highlight it
Change-friendly: Buildings evolve. Our system allows for unit splits, refurbishments and name changes: the data is always live
Source-transparent: We know where each piece of information came from and how strong the match is
Scalable: With just an address, our AI can build an energy model for any of the 1.7 million non-domestic buildings in the UK in seconds
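The UPRN-first and conflict-aware ideas in the list above can be sketched in a few lines. This is a simplified stand-in, not the Building Atlas engine: plain string similarity from Python's `difflib` takes the place of "smart inference", and the field names, the 0.85 similarity threshold and the 5% tolerance are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalise(address):
    """Lowercase and replace punctuation with spaces so similar addresses compare well."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in address.lower())
    return " ".join(cleaned.split())


def match_record(record, reference, min_similarity=0.85):
    """Return (reference key, method, confidence), trying the UPRN first."""
    uprn = record.get("uprn")
    if uprn and uprn in reference:
        return uprn, "uprn", 1.0
    # Fallback: pick the most similar reference address (a hypothetical heuristic).
    best_key, best_score = None, 0.0
    target = normalise(record["address"])
    for key, ref in reference.items():
        score = SequenceMatcher(None, target, normalise(ref["address"])).ratio()
        if score > best_score:
            best_key, best_score = key, score
    if best_score >= min_similarity:
        return best_key, "address_similarity", best_score
    return None, "unmatched", 0.0


def find_conflicts(values_by_source, tolerance=0.05):
    """Flag pairs of sources whose numeric values differ by more than the tolerance."""
    items = sorted(values_by_source.items())
    conflicts = []
    for i, (src_a, val_a) in enumerate(items):
        for src_b, val_b in items[i + 1:]:
            if abs(val_a - val_b) > tolerance * max(abs(val_a), abs(val_b)):
                conflicts.append((src_a, src_b))
    return conflicts


# Made-up reference data keyed by placeholder UPRNs.
reference = {
    "000000000001": {"address": "Unit 1, 10 Example Street"},
    "000000000002": {"address": "Unit 2, 10 Example Street"},
}

by_uprn = match_record({"uprn": "000000000002", "address": "anything"}, reference)
by_text = match_record({"address": "UNIT 1 10 EXAMPLE STREET"}, reference)
no_match = match_record({"address": "Totally different place"}, reference)
floor_area_conflicts = find_conflicts({"epc": 100.0, "business_rates": 140.0})
print(by_uprn, by_text, no_match, floor_area_conflicts)
```

Returning the match method alongside the confidence keeps the result source-transparent: a downstream user can tell a hard identifier match from an inferred one, and disagreements between sources are reported instead of silently resolved.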
The result lets our customers (from local authorities to estates or investors) make smarter decisions about where to invest, retrofit or intervene. Not just based on raw data but based on matched intelligence they can trust.
We’re not starting from scratch – the UK is ahead
The UK already has some of the best property data infrastructure in the world. Thanks to Ordnance Survey, HM Land Registry and the introduction of UPRNs, we have the foundations for a joined-up national property graph.
But UPRNs still aren’t used consistently: they are not yet present across all government datasets, and they rarely appear in private ones. That’s the gap.
We don’t need a revolution; we just need to connect the dots. Geovation Scotland, Registers of Scotland and OS have laid the groundwork; now it’s about adoption, integration and scale.
Let’s stop matching data in the dark
We believe the future of proptech, retrofit and climate investment depends on answering one simple question: “what even is a building, and how do we know?”
If that’s the kind of thing that keeps you up at night – or if you’re wrestling with postcode pain, EPC mismatches or rogue billing addresses – we’d love to swap notes.
We’re Building Atlas. We help public and private sector clients target retrofits with better property data, and we’re always happy to talk shop…
Drop us a line at hello@buildingatlas.io
Paul Symmers is Senior ML Engineer at Building Atlas