Snowflake Cortex AI Implementations: What We've Learned the Hard Way

Most of the Cortex AI content I read online describes an implementation that goes more or less to plan. That hasn’t always matched my experience. Here’s a more honest version of what these projects actually look like — what moves fast, what doesn’t, and what I wish more clients knew before we started.

The semantic modelling phase sounds technical. It mostly isn’t.

When I tell clients we’re going into semantic modelling, they picture engineers writing code. The reality is that a significant chunk of that phase is sitting in rooms — or on calls — helping people from finance, sales, and ops agree on what words mean.
What does “active customer” mean? What counts as revenue? Which date field do we use when someone asks about a transaction? These aren’t edge cases — they’re the questions that surface the moment you try to encode business logic into something a machine has to interpret consistently. In most organisations, those definitions exist implicitly. Everyone has a slightly different working version and they’ve learned to live with the ambiguity.
Cortex AI doesn’t have that luxury. The semantic layer has to pick one answer and commit to it. On good projects, those conversations are over quickly because someone with authority has already done the alignment work — often as part of a prior BI project. On harder projects, the AI implementation becomes the forcing function for fights that have been quietly deferred for years. That can take weeks, and there’s no shortcut through it.
One domain typically takes between one and three weeks, not because the technical work is slow, but because the negotiation is unpredictable. Once definitions are locked, the build itself is fast.
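To make "pick one answer and commit to it" concrete, here is a minimal Python sketch of a definitions register that refuses a second, conflicting meaning for the same term. Everything in it is hypothetical and for illustration only: the `Definition` and `DefinitionRegister` names, the `owner` field, and the example wording are my assumptions, and none of it is Snowflake's own semantic model format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """One agreed business definition (hypothetical structure)."""
    term: str        # e.g. "active customer"
    expression: str  # the agreed logic, in whatever form the team uses
    owner: str       # who has the authority to change it


class DefinitionRegister:
    """Holds exactly one committed definition per term."""

    def __init__(self):
        self._by_term = {}

    @staticmethod
    def _key(term):
        # Normalise case and whitespace so "Active Customer" and
        # "active  customer" count as the same term.
        return " ".join(term.lower().split())

    def commit(self, definition):
        key = self._key(definition.term)
        existing = self._by_term.get(key)
        if existing is not None and existing.expression != definition.expression:
            # A second, different meaning is a conflict to resolve with
            # people, not something to silently overwrite.
            raise ValueError(
                f"'{definition.term}' is already defined by {existing.owner} "
                f"as: {existing.expression}"
            )
        self._by_term[key] = definition

    def lookup(self, term):
        return self._by_term[self._key(term)]
```

If finance commits one meaning of "active customer" and sales later tries to commit a different one, the second `commit` raises instead of overwriting. That is the point: the disagreement surfaces as an explicit conversation rather than as two quietly different numbers.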

The question bank matters more than the demo

Before we show anything to an executive, we test against a question bank built from real stakeholder queries — things pulled from actual Slack threads and email chains, not invented test cases. If I have to make up questions to test against, that’s usually a signal the use case isn’t defined well enough yet.
On recent builds, we’ve formalised this into a proper eval framework that tracks SQL accuracy, answer relevance, and response quality across runs, with results in a spreadsheet rather than in someone’s head. It feels like extra overhead at the time. It consistently saves rework later. One bad answer that gets screenshotted and shared around before you’ve properly tested costs far more to recover from than a thorough eval costs to run.
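As a rough illustration of what tracking those three measures across runs can look like, here is a small Python sketch. It is not our actual framework. The `EvalResult` fields, the 0.0 to 1.0 scoring scale, and the function names are assumptions made for the example, and how each score gets produced (a reviewer, a comparison against expected results) is left out.

```python
import csv
from dataclasses import dataclass, asdict, fields
from statistics import mean


@dataclass
class EvalResult:
    """One question from the bank, scored for one run (hypothetical shape)."""
    run_id: str
    question: str      # taken from real stakeholder queries
    sql_correct: bool  # did the generated SQL return the expected result?
    relevance: float   # 0.0 to 1.0, assumed scale
    quality: float     # 0.0 to 1.0, assumed scale


def summarise_run(results, run_id):
    """Average the three tracked measures for a single run."""
    rows = [r for r in results if r.run_id == run_id]
    if not rows:
        raise ValueError(f"no results recorded for run {run_id!r}")
    return {
        "sql_accuracy": mean(1.0 if r.sql_correct else 0.0 for r in rows),
        "avg_relevance": mean(r.relevance for r in rows),
        "avg_quality": mean(r.quality for r in rows),
    }


def write_results(results, handle):
    """Write every scored question as a CSV row, ready for a spreadsheet."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(EvalResult)])
    writer.writeheader()
    for r in results:
        writer.writerow(asdict(r))
```

Calling `summarise_run` for each run and comparing the numbers side by side shows whether a change to the semantic layer helped or hurt before anyone outside the project sees an answer.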

The architecture decision nobody talks about enough

On one project, a single Cortex AI platform needed to serve two genuinely different user types: executives asking high-level KPI questions, and analysts drilling into granular, entity-specific data. Same interface, very different grains, different tables, different tolerances for response time.
Our solution was a dual-agent architecture — one agent for executive-level queries, one for granular questions, with intelligent routing deciding which to hit. It worked. But it worked because we mapped that requirement in week one. Discovering it in week eight would have meant rebuilding, not iterating.
Discovery isn’t just scoping. It’s what makes the right architecture decision obvious before you’ve committed to one.
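For readers who want a picture of the dual-agent split described above, here is a deliberately naive Python sketch. It swaps in a plain keyword heuristic as a stand-in for the routing step and is not how the project's routing was built. The hint words, the identifier pattern, and the `route_question` name are all assumptions for illustration.

```python
import re

# Hypothetical hints that a question targets a specific entity or record.
GRANULAR_HINTS = ("invoice", "order id", "account number", "sku", "transaction")

# An identifier-looking token such as "INV-20431" or "884213" (assumed pattern).
# Four-digit numbers are excluded on purpose so years like 2024 do not match.
ID_PATTERN = re.compile(r"\b(?:[A-Z]{2,5}-\d+|\d{5,})\b")


def route_question(question):
    """Return "granular" or "executive" for a natural-language question."""
    if ID_PATTERN.search(question):
        return "granular"
    lowered = question.lower()
    if any(hint in lowered for hint in GRANULAR_HINTS):
        return "granular"
    # Anything without an entity-level signal goes to the KPI-level agent.
    return "executive"
```

The value of a sketch like this is less the heuristic than the contract: one function decides the grain up front, so each agent can be tuned for its own tables and its own tolerance for response time.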

What “done” actually looks like

My benchmark: a non-technical user asks a question they’d previously have needed to wait on an analyst to answer, gets a result, and trusts it. That last part is the hard one.
The most concrete signal is what happens to the reporting environment around the Cortex build. When the semantic layer is stable and the conversational interface is live, teams typically find a significant portion of their report estate becomes redundant — reports that existed purely to answer slight variations of the same recurring questions. The long tail of ad hoc requests shifts to the agent. That’s the version of done worth aiming for.

The thing I always say upfront now

At the end of every engagement, there’s a conversation about who owns the semantic layer going forward. It matters more than most clients expect. The model evolves — business definitions change, new domains get added, the underlying tables shift. If there’s no data team with the authority to maintain it, the model degrades quietly after handover and nobody quite knows why the answers started getting worse.
I’ve seen this happen. I say it explicitly at kickoff now.

Thinking about a Cortex build?

Ten to twelve weeks to a production-ready first domain is achievable — with the right foundation and a team that can make decisions. If you’re not sure whether your data environment is ready, our readiness assessment will tell you. Let’s talk.

This blog was written by Wil Grebner, Principal Consultant at EdgeRed.

About EdgeRed

EdgeRed is an Australian AI and data consultancy, part of The Omnia Collective group, with teams in Sydney and Melbourne. We build things that work in production — agentic AI, machine learning, data engineering, and Microsoft Fabric implementation. 250+ projects. 100+ clients. 100% Australian on-shore team.
