Jan 12, 2026
Building AI That Understands Capital Markets
| Author | Audience |
|---|---|
| Joe Barhouch, AI Engineer | Business |
In part 1 of this series, we explored why generic AI fails in Capital Markets: the high cost of error, the "looks right but is wrong" problem, and the complexity of hybrid data.
These challenges point to a fundamental reality: AI solutions and products serving Capital Markets need to be built differently from day one.
Domain Knowledge is not a single step in your AI pipeline. It is intelligence that must exist at every layer. If one layer fails, the entire pipeline fails.
Training on Capital Markets data teaches an AI vocabulary, but it does not teach it how your business operates or which calculation methodology to use. Domain Knowledge is not a dataset you add to training; it is an integral part of the architecture.
The Unstructured Data Stack
Capital Markets decisions rely on context found in research reports, regulatory filings, and market commentary. Generic AI treats these simply as text to search. Domain-aware AI treats them as structured information with specific attributes.
Metadata Extraction
Before a system can find the right document, it needs to know what each document contains. You need attributes that matter for enterprise decisions: company, time period, report type, and sector.
Consider an analyst asking: "What’s Morgan Stanley’s latest outlook on consumer discretionary?"
Without proper metadata, a generic system searches for those keywords and might return a 2022 report because of high textual similarity. With domain-specific metadata, the system filters to the most recent Morgan Stanley research before searching. You're searching the right shelf, not the entire library.
By identifying these attributes upfront, firms significantly increase retrieval accuracy while reducing the high costs associated with erroneous data outputs.
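The "right shelf" idea above can be sketched in a few lines. This is a minimal, illustrative example (the `Report` fields and the term-overlap score are stand-ins; a real system would use richer metadata and semantic similarity): filter on metadata first, then rank only within that shelf.

```python
from dataclasses import dataclass

@dataclass
class Report:
    publisher: str
    sector: str
    published: str  # ISO date string, e.g. "2025-11-03"
    text: str

def filter_then_search(reports, publisher, sector, query_terms):
    """Filter on metadata first ('the right shelf'), then rank by a
    naive term-overlap score (a stand-in for semantic similarity)."""
    shelf = [r for r in reports
             if r.publisher == publisher and r.sector == sector]
    shelf.sort(key=lambda r: r.published, reverse=True)  # newest first
    def overlap(report):
        words = set(report.text.lower().split())
        return sum(term in words for term in query_terms)
    return max(shelf, key=overlap) if shelf else None

reports = [
    Report("Morgan Stanley", "Consumer Discretionary", "2022-04-01",
           "outlook on consumer discretionary spending and margins"),
    Report("Morgan Stanley", "Consumer Discretionary", "2025-11-03",
           "latest outlook on consumer discretionary spending"),
    Report("Other Bank", "Consumer Discretionary", "2025-10-01",
           "outlook on consumer discretionary spending"),
]
best = filter_then_search(reports, "Morgan Stanley",
                          "Consumer Discretionary",
                          ["outlook", "consumer", "discretionary"])
```

Because the shelf is sorted newest-first before ranking, the 2022 report cannot win on textual similarity alone.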
Query Understanding and Retrieval
Metadata only matters if your retrieval system uses it effectively.
First, the system must understand the intent. When a user requests "Q3 performance," the system must recognise this as a time filter (reports covering the most recent Q3), not a keyword search that matches documents mentioning "Q3" from previous years.
Second, retrieval cannot rely solely on textual similarity. A query about Apple revenue can return supply-chain commentary because the language overlaps, even though the context is wrong. Domain-aware retrieval applies hard constraints – right company, right database, right period, right report type.
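The two steps above – intent parsing and hard constraints – can be sketched together. This is a toy example under stated assumptions: the parser only understands quarter mentions like "Q3" and interprets them as the most recently completed occurrence, and documents carry a `period` metadata field.

```python
import re
from datetime import date

def parse_constraints(query, today=date(2026, 1, 12)):
    """Turn a natural-language query into hard metadata constraints.
    Toy parser: it only understands quarter mentions like 'Q3',
    interpreted as the most recently completed occurrence."""
    constraints = {}
    match = re.search(r"\bQ([1-4])\b", query, re.IGNORECASE)
    if match:
        quarter = int(match.group(1))
        quarter_end_month = quarter * 3
        year = today.year if quarter_end_month < today.month else today.year - 1
        constraints["period"] = f"{year}-Q{quarter}"
    return constraints

def retrieve(documents, constraints):
    """Apply hard constraints first; similarity ranking would come after."""
    return [d for d in documents
            if all(d.get(k) == v for k, v in constraints.items())]

docs = [
    {"period": "2023-Q3", "text": "Q3 performance was strong."},
    {"period": "2025-Q3", "text": "Performance review for the quarter."},
]
constraints = parse_constraints("Q3 performance")
hits = retrieve(docs, constraints)
```

Note that the first document literally contains the string "Q3" but is excluded: the constraint is on the period metadata, not the keyword.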
Validation and Grounding
Even with good retrieval, systems can produce answers that sound confident but are wrong.
A validation layer with Domain Knowledge enables two checks. Does the answer address the question as intended? And is it grounded in the sources? Generic agents may introduce metrics that were not present or misattribute information. A validation layer catches these errors before they reach the user.
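The grounding check can be illustrated with a deliberately simple rule: any numeric claim in the answer must appear somewhere in the retrieved sources. A production validator would be far more sophisticated (matching units, rounding, and paraphrase), but the shape of the check is the same.

```python
import re

def ungrounded_numbers(answer, sources):
    """Return numeric claims in the answer that appear in no source
    passage -- a simple grounding check (exact string match only)."""
    claims = re.findall(r"\d+(?:\.\d+)?%?", answer)
    source_text = " ".join(sources)
    return [c for c in claims if c not in source_text]

sources = ["Net revenue grew 12% year over year to $4.1bn in Q3."]
grounded = ungrounded_numbers("Revenue grew 12% to $4.1bn.", sources)
hallucinated = ungrounded_numbers("Revenue grew 15% to $5.2bn.", sources)
```

The first answer passes (every figure traces back to a source); the second is flagged before it reaches the user.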
The Structured Data Stack
While unstructured data provides context, structured data provides the quantitative foundation for Capital Markets decisions. But structured intelligence only works when Domain Knowledge is embedded across data, analytics, and workflows as part of the architecture.
Data (Ontology & Semantic Layer)
Structured data only becomes usable when its meaning is explicit. In practice, structured data begins by connecting, ingesting, and unifying fragmented internal and external sources into a governed, reusable data model. This model captures how entities, metrics, and relationships are defined and exposed through a semantic layer.
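What a governed, reusable definition looks like in practice can be sketched as a semantic-layer entry. All names below (`net_revenue`, the source feeds, the expression) are illustrative, not real definitions: the point is that there is exactly one place where the metric's meaning lives.

```python
# A toy semantic-layer entry: one governed definition of a metric that
# every downstream query reuses. All names here are illustrative.
SEMANTIC_LAYER = {
    "net_revenue": {
        "expression": "SUM(gross_revenue - returns - discounts)",
        "grain": ["company", "fiscal_quarter"],
        "currency": "USD",
        "sources": ["internal_gl", "vendor_feed_a"],
    },
}

def metric_definition(name):
    """Look up the single governed definition instead of letting each
    consumer re-derive the metric ad hoc."""
    if name not in SEMANTIC_LAYER:
        raise KeyError(f"Metric '{name}' is not defined in the semantic layer")
    return SEMANTIC_LAYER[name]
```

Every dashboard, copilot, and query that needs net revenue pulls the same expression, grain, and currency, rather than reimplementing its own version.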
Bridging the "Unstructured-Structured" Gap
The true power of a domain-aware ontology lies in its ability to map structured data to unstructured context. Rather than treating tables and documents as separate silos, the ontology serves as a semantic middle layer that performs Entity Resolution. For instance, it ensures that a "ticker symbol" in a quantitative database is recognised as the exact same "object" as a company mentioned in a qualitative research report.
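A minimal sketch of that Entity Resolution step, assuming a precomputed alias table (the entries below are illustrative, not real reference data): a ticker from a quantitative database and a company name from a research report both resolve to one canonical entity ID.

```python
# Minimal entity-resolution sketch: tickers from a quant database and
# name variants from research text map to one canonical entity ID.
# The alias table is illustrative, not real reference data.
ALIASES = {
    "aapl": "ENT:APPLE",
    "apple": "ENT:APPLE",
    "apple inc.": "ENT:APPLE",
    "msft": "ENT:MICROSOFT",
    "microsoft": "ENT:MICROSOFT",
}

def resolve(mention):
    """Normalise a mention and look it up; returns None when unknown,
    which a production system would route to human review."""
    return ALIASES.get(mention.strip().lower())

# A ticker in a database row and a company name in a research report
# are now the same object:
same_entity = resolve("AAPL") == resolve("Apple Inc.")
```

Real systems replace the static table with fuzzy matching against security-master data, but the contract is the same: every mention resolves to one entity.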
By creating these "why" links, the ontology ensures that downstream analytics operate on the right definitions and the full context: linking the what (quantitative metrics) with the why (qualitative precedents and exceptions). This architecture transforms data from generic rows into semantically rich intelligence that reflects how Capital Markets teams actually reason about securities and portfolios.
Analytics
Analytics encode financial methodology, not just computation.
Generic systems can generate queries, but they cannot reliably determine how calculations should be performed. A request for “average returns” is meaningless without knowing whether to compound, how to weight, and which time conventions apply.
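The "average returns" ambiguity has a concrete cost, which a worked example makes visible. The sketch below compares the simple (arithmetic) average with the compounded (geometric) average for the same two periods; it ignores weighting and time conventions, which add further choices on top.

```python
def arithmetic_mean(returns):
    """Simple average of per-period returns."""
    return sum(returns) / len(returns)

def geometric_mean(returns):
    """Compounded (geometric) average per-period return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(returns)) - 1.0

# +50% then -50%: the simple average says you broke even,
# but the compounded average shows you lost money.
period_returns = [0.50, -0.50]
simple = arithmetic_mean(period_returns)      # 0.0
compounded = geometric_mean(period_returns)   # about -0.134
```

Both numbers are "the average return"; only Domain Knowledge determines which one the business actually means.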
Domain-aware analytics combine traditional Business Intelligence with reasoning and quantitative validation. Domain Knowledge ensures analytics follow industry best practices and firm-specific rules, enforcing correct methodologies for each business. This is what enables explainable logic and materially higher accuracy than generic approaches.
Workflows
Structured data and analytics only create value when delivered through workflows that reflect how decisions are actually made.
Domain-aware systems deliver intelligence through purpose-built workflows that define which questions are asked, in what order, and what outputs are required to move from analysis to action. These workflows are proactive, surfacing relevant insights rather than relying on ad hoc exploration.
Domain Knowledge governs both industry-standard workflows and organisation-specific adaptations, ensuring outputs are relevant to role, context, and decision.
Why Architecture Wins Over Training
These layers are sequentially dependent. If metadata extraction fails, retrieval searches the wrong documents. If the semantic model misses business logic, the analytics compute the wrong numbers. This is why prompt engineering cannot solve architectural problems.
Firms successfully deploying AI in Capital Markets recognise that training teaches patterns, but architecture enforces knowledge.
This is where Engine AI differentiates itself. Rather than relying on training alone, Engine AI is built around workflow-led architecture, treating workflows as the primary interface between data, analytics, and decision-making.
In this model, workflows are not an afterthought. They are deliberately designed and delivered through a layered interface: fixed workflows expressed as dashboards for repeatable decisions; flexible workflows supported by copilots for exploration and analysis; and conversational workflows for natural-language interaction with both structured and unstructured content.
By embedding Capital Markets Domain Knowledge directly into these workflows (defining which questions are asked, in what order, and under what constraints), Engine AI closes the last mile between insight and action. The result is intelligence that is consistent, explainable, and reliable in production, and that teams can act on with confidence.
