Snowflake, Salesforce, dbt Labs, BlackRock, Tableau, ThoughtSpot, and more than a dozen other leading tech firms have unveiled a joint initiative called the Open Semantic Interchange (OSI) aimed at standardizing how business data is defined, shared, and understood across platforms, with the goal of eliminating what many consider AI’s biggest bottleneck: semantic fragmentation of enterprise data. Disparate internal definitions of key terms (e.g., what counts as a “customer” or an “active user”) often degrade AI model outputs and force companies into weeks of manual cleanup and reconciliation. The OSI specification will be open source, vendor neutral, and immediately usable (via formats like YAML), so existing tools can adopt it without a complete re-architecture. Backers argue that although giving up control over proprietary definitions may seem counterintuitive, the real competitive leverage will shift to whoever builds the best AI tools, not whoever owns the semantics.
Sources: CIO Dive, VentureBeat
Key Takeaways
– Enterprises are paying a steep price for inconsistent data semantics: fragmented definitions across business units and tools greatly slow or degrade AI implementation.
– OSI (Open Semantic Interchange) aims to establish a vendor-neutral, open standard so that disparate tools, dashboards, and AI systems can “speak the same language” with respect to business meaning.
– Proponents argue that standardizing semantics doesn’t reduce competition; it shifts the locus of competition from owning definitions to innovating on top of them (e.g., in user experience and AI capabilities).
In-Depth
In the world of enterprise AI, models aren’t the only bottleneck; the real culprit often lies in the data itself, specifically in how business terms get defined, stored, and interpreted across systems. The recent announcement of the Open Semantic Interchange (OSI) initiative is a response to that challenge. Spearheaded by Snowflake alongside partners like Tableau, ThoughtSpot, BlackRock, Salesforce, dbt Labs, and several others, OSI is designed to tackle semantic fragmentation: when different parts of a company use different definitions of critical terms like “customer,” “active user,” “churn,” or even “metric,” AI models trained on those conflicting inputs generate unreliable outputs.
OSI seeks to provide a standardized semantic layer: a shared specification that encodes definitions, relationships, business logic, and AI-oriented synonyms or instructions as metadata, in formats like YAML. The idea is that once metrics are defined according to OSI, dashboards, notebooks, machine learning models, BI tools, and analytics platforms can all inherit or align with those definitions without reinventing or reconciling them; a rough sketch of what such a definition might look like follows below.
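To make that concrete, here is a minimal, purely illustrative sketch of what a shared metric definition in a YAML-based semantic layer could look like. The field names and structure below are assumptions made for the sake of the example; the actual OSI schema has not been detailed in the announcement.

```yaml
# Hypothetical illustration only: these field names are not the published OSI schema.
metrics:
  - name: active_users
    label: "Active Users"
    description: "Distinct users with at least one session in the trailing 30 days."
    expression: "COUNT(DISTINCT user_id)"            # how the metric is computed
    source_table: analytics.sessions                 # canonical table it is computed from
    filters:
      - "session_start >= DATEADD(day, -30, CURRENT_DATE)"
    synonyms:                                        # hints for AI assistants and natural-language queries
      - "monthly active users"
      - "MAU"
    owner: data-governance-team                      # who maintains the definition
```

Once a definition like this lives in one shared, machine-readable file, a dashboard, a notebook, and an AI assistant can all resolve “active users” to the same expression, rather than each tool maintaining its own slightly different variant.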
This move is being driven by urgency: AI deployments have repeatedly stumbled not because algorithms are bad, but because the data feeding them is messy, duplicated, inconsistent, or siloed. A CIO study found that about half of CEOs admitted their AI investments had left them with fragmented tech stacks. Meanwhile, research shows that data-quality issues cost enterprises hundreds of millions of dollars in lost value when AI models underperform due to flawed or unclean data inputs.
By stepping back from proprietary control over semantic definitions, companies are betting that a shared foundation will accelerate AI adoption, reduce rework, and free up R&D to focus on experience, innovation, and model improvements rather than firefighting data mismatches. Of course, governance will matter: ensuring the standard remains neutral, adaptable, and not hijacked by dominant players will be essential. But if successful, OSI could shift the paradigm: instead of every business building its own, slightly incompatible semantic map, there will be shared roads everyone can drive on — which in aggregate could unlock real returns on the huge sums already being invested in AI.

