In this episode, Amir Bormand sits down with Harrison Tang, CEO and Co-founder of Spokeo, to explore a problem most people in AI, data, and digital identity overlook: entity resolution. Harrison unpacks how billions of fragmented data records are connected, how we determine what's true in a world of generated content, and why trust and privacy are becoming the new battlegrounds in tech.
They discuss the philosophical foundations of identity, the technical challenges of resolving entities at scale, and how GenAI complicates truth detection. If you're building in data, trust, or anything AI-related, this is required listening.
🧠 Key Takeaways:
Entity resolution is foundational to how we understand digital identity, but it’s far from solved, especially as AI-generated noise increases.
Spokeo resolves 600M+ entities from 19B+ records, using distributed computing and multiple “criteria of truth” (consensus, authority, coherence, etc.).
Generative AI can create content—but not verify it. It’s great for mock/test data, but not for discerning truth.
The real challenge? Detecting fake content. Harrison breaks down the four pillars: provenance, detection, governance, and education.
Privacy ≠ Security. Identity and access management sits above entity resolution, and is crucial for enforcing data control.
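To give a feel for why resolving entities at this scale needs more than brute force, here is a minimal Python sketch. It counts the pairs a naive all-against-all comparison would need, then shows "blocking", a common entity-resolution technique that only compares records sharing a cheap key. This is an illustration only, not Spokeo's actual pipeline; the record fields and the `blocking_key` rule are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def naive_pair_count(n: int) -> int:
    """Number of distinct pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def blocking_key(record: dict) -> str:
    # Hypothetical key: first letter of the last name plus ZIP code.
    return f"{record['last_name'][:1].lower()}|{record['zip']}"


def candidate_pairs(records: list) -> list:
    """Group records by blocking key and pair them only within a group."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    pairs = []
    for members in blocks.values():
        pairs.extend(combinations(members, 2))
    return pairs


# Made-up sample records, for illustration only.
records = [
    {"first_name": "Jon", "last_name": "Smith", "zip": "91101"},
    {"first_name": "Jonathan", "last_name": "Smith", "zip": "91101"},
    {"first_name": "Ana", "last_name": "Silva", "zip": "10001"},
    {"first_name": "Jon", "last_name": "Smyth", "zip": "91101"},
]

pairs = candidate_pairs(records)
print(naive_pair_count(len(records)), "naive pairs vs", len(pairs), "after blocking")
print(naive_pair_count(19_000_000_000), "naive pairs for 19B records")
```

Blocking trades recall for speed: two records for the same person that land in different blocks are never compared, so real systems typically combine several keys.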
⏱️ Timestamped Highlights:
00:55 – What Spokeo does and the scale of its data
02:10 – What is entity resolution? Why it matters
04:10 – The challenge of comparing 19B records against each other
06:00 – Garbage in, garbage out: why data quality starts at ingestion
07:10 – The five criteria of truth: consensus, authority, consistency, coherence, correspondence
10:40 – Where GenAI helps (and fails) in entity resolution
13:00 – Can AI discern truth like a human? Harrison’s take on AGI skepticism
16:20 – The rise of fake data and the opportunity for Spokeo
18:15 – AI provenance, invisible watermarks, and content authenticity
21:00 – The four pillars of trust in the AI age
23:00 – How privacy impacts data workflows and IAM
25:30 – Why entity resolution sits at the foundation of identity systems
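Two of the criteria of truth mentioned at 07:10, consensus and authority, can be sketched as a weighted vote: each source's claim counts toward a value, scaled by how much that source is trusted. This is a minimal Python illustration under assumed inputs, not the method described in the episode; the source names, weights, and the `resolve_value` helper are all hypothetical.

```python
from collections import defaultdict


def resolve_value(claims, authority):
    """Pick the value with the highest total authority-weighted support.

    claims: list of (source, value) tuples.
    authority: dict mapping source -> weight; unknown sources default to 1.0.
    """
    scores = defaultdict(float)
    for source, value in claims:
        scores[value] += authority.get(source, 1.0)
    # Ties are broken by the value itself so the result is deterministic.
    return max(scores.items(), key=lambda kv: (kv[1], kv[0]))[0]


# Hypothetical claims about one person's city of residence.
claims = [
    ("source_a", "Pasadena"),
    ("source_b", "Pasadena"),
    ("source_c", "Los Angeles"),
]

# Consensus wins when the dissenting source is only slightly more trusted.
print(resolve_value(claims, {"source_c": 1.5}))
# Authority wins when one source is trusted far more than the others.
print(resolve_value(claims, {"source_c": 3.0}))
```

The other criteria (consistency, coherence, correspondence) would need additional signals, such as how well a value fits the rest of the record, and are left out of this sketch.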
💬 Quote of the Episode:
“The problem of who we are has existed since the beginning of the human race. And in the digital world, that question is more important than ever.” — Harrison Tang
🔗 Resources Mentioned:
W3C Credentials Community Group – where Spokeo contributes on decentralized identity standards
Adobe Content Authenticity Initiative – cited in the discussion of content provenance and identifying AI-generated content
Zero-shot prompting – asking a GenAI model to generate realistic data from a prompt alone, with no worked examples
🎯 Career Tips (from the episode):
While there wasn’t a dedicated segment on careers, Harrison did hint at a big opportunity area:
If you're in data or security, AI-generated fake content is a growing risk—and a career edge for anyone working on provenance, detection, and digital trust systems.