The Flow Retriever lets you use headless Flow Designer flows as virtual data sources in the RAG pipeline. Instead of querying a search index, the system executes a retriever flow that can query databases, call APIs, run custom logic, or combine multiple sources — and the results are merged into the LangChain context alongside traditional search results.

How It Works

User sends a message
  └── LangChain Orchestrator
       ├── Azure AI Search (index-based sources) ────── documents
       ├── Flow Retriever (flow-based sources) ───────── documents
       ├── Vector Retriever (external vector stores) ─── documents   ──→  merged into RAG context
       └── Web Search (if enabled) ───────────────────── documents
  1. The orchestrator identifies FLOW_RETRIEVER entries in the chat’s dataSources[]
  2. Each retriever source is executed in parallel via performFlowRetrieverRAG()
  3. The flow receives { query, connectionId, _retrieverInstructions } as inputs
  4. The flow’s final output is parsed as structured documents
  5. Documents are merged into the RAG context for the LLM to cite

Execution Contract

Retriever flows receive these inputs automatically:
| Input | Description |
| --- | --- |
| query / user_input | The user’s latest message |
| connectionId | Resolved data connection ID (fixed or user-selected) |
| _retrieverInstructions | Natural-language instructions set by the chat admin |
| _userId | Authenticated user’s ID |
| _userUpn | Authenticated user’s UPN |
| _userEmail | Authenticated user’s email |
| Any flowParams.* | Additional parameters configured by the admin |
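The inputs listed above could be assembled along the following lines. This is a minimal sketch, not the platform's implementation: `buildRetrieverInputs`, `RetrieverInputArgs`, and `UserIdentity` are hypothetical names, and both the flattening of `flowParams` into top-level keys and the precedence of reserved keys over admin parameters are assumptions.

```typescript
// Hypothetical shapes: only the input key names come from the table above.
interface UserIdentity {
  id: string;
  upn: string;
  email: string;
}

interface RetrieverInputArgs {
  query: string;
  connectionId?: string;
  retrieverInstructions?: string;
  flowParams?: Record<string, unknown>;
  user: UserIdentity;
}

function buildRetrieverInputs(args: RetrieverInputArgs): Record<string, unknown> {
  const flowParams = args.flowParams ?? {};
  return {
    // Assumption: admin parameters are spread first so reserved keys below win.
    ...flowParams,
    query: args.query,
    user_input: args.query,
    // Fall back to the connection stored in flowParams when none is resolved.
    connectionId: args.connectionId ?? flowParams["connectionId"],
    _retrieverInstructions: args.retrieverInstructions ?? "",
    // Identity values come from the authenticated session, not from flowParams.
    _userId: args.user.id,
    _userUpn: args.user.upn,
    _userEmail: args.user.email,
  };
}
```

Spreading `flowParams` before the reserved keys means an admin parameter named `_userId` cannot overwrite the authenticated identity in this sketch.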
The flow’s final LLM node should produce one of these output formats:
  • JSON array: [{ "title": "...", "content": "...", "url": "..." }]
  • JSON object with a documents or results key: { "documents": [...] } or { "results": [...] }
  • Plain text: wrapped as a single document automatically
JSON output can be wrapped in markdown code fences (```json ... ```).
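The three accepted output formats can be normalized with a parser along these lines. This is an illustrative sketch only: `parseRetrieverOutput`, `RetrieverDocument`, the fallback titles, and the handling of empty or unrecognized output are assumptions, not the platform's actual behavior.

```typescript
interface RetrieverDocument {
  title: string;
  content: string;
  url?: string;
}

// Built with repeat() so this example can itself live inside a code fence.
const FENCE = "`".repeat(3);
const FENCE_PATTERN = new RegExp(
  "^" + FENCE + "(?:json)?\\s*([\\s\\S]*?)\\s*" + FENCE + "$",
  "i",
);

// Remove one surrounding markdown code fence, if present.
function stripCodeFence(text: string): string {
  const trimmed = text.trim();
  const match = trimmed.match(FENCE_PATTERN);
  return (match?.[1] ?? trimmed).trim();
}

// Coerce an arbitrary JSON value into a document (fallback titles are assumed).
function toDocument(item: unknown, index: number): RetrieverDocument {
  if (typeof item === "object" && item !== null) {
    const rec = item as Record<string, unknown>;
    const title = rec["title"];
    const content = rec["content"];
    const url = rec["url"];
    const doc: RetrieverDocument = {
      title: typeof title === "string" ? title : `Result ${index + 1}`,
      content: typeof content === "string" ? content : JSON.stringify(item),
    };
    if (typeof url === "string") doc.url = url;
    return doc;
  }
  return { title: `Result ${index + 1}`, content: String(item) };
}

function parseRetrieverOutput(raw: string): RetrieverDocument[] {
  const text = stripCodeFence(raw);
  if (text === "") return [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    // Not JSON: wrap the plain text as a single document.
    return [{ title: "Flow result", content: text }];
  }
  if (Array.isArray(parsed)) return parsed.map(toDocument);
  if (typeof parsed === "object" && parsed !== null) {
    const obj = parsed as Record<string, unknown>;
    const list = obj["documents"] ?? obj["results"];
    if (Array.isArray(list)) return list.map(toDocument);
  }
  // Valid JSON in an unrecognized shape: keep it as one text document.
  return [{ title: "Flow result", content: text }];
}
```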

Connection Modes

| Mode | connectionMode | Behavior |
| --- | --- | --- |
| Fixed (admin-set) | fixed | The admin selects a data connection at configuration time. It’s baked into flowParams.connectionId and used for every query. The sidebar shows a read-only label. |
| User selects at runtime | user | End users see a connection picker in the right sidebar. They choose a connection before each query. The selected ID is passed as runtimeConnectionIds. |
| Hybrid | hybrid | Admin sets a default connection, but users can override it at runtime from the sidebar. Falls back to the admin default if the user hasn’t chosen one. |
All connections are ACL-checked before execution — the user must have access to the connection in the Data Platform Connections registry.
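The three modes and the ACL check could be combined roughly as follows. This is a sketch under stated assumptions: `resolveConnectionId`, `ConnectionSettings`, and the `hasAccess` callback are hypothetical, and since the structure of `runtimeConnectionIds` is not described here, the caller is assumed to have already looked up the single ID selected for this source.

```typescript
type ConnectionMode = "fixed" | "user" | "hybrid";

// Hypothetical subset of a data source's configuration.
interface ConnectionSettings {
  connectionMode: ConnectionMode;
  flowParams?: { connectionId?: string };
}

function resolveConnectionId(
  source: ConnectionSettings,
  runtimeConnectionId: string | undefined,
  hasAccess: (connectionId: string) => boolean,
): string {
  const adminDefault = source.flowParams?.connectionId;
  let resolved: string | undefined;

  switch (source.connectionMode) {
    case "fixed":
      // Any runtime selection is ignored; the admin's choice always applies.
      resolved = adminDefault;
      break;
    case "user":
      resolved = runtimeConnectionId;
      break;
    case "hybrid":
      // The user's choice overrides the default; otherwise fall back to it.
      resolved = runtimeConnectionId ?? adminDefault;
      break;
  }

  if (!resolved) {
    throw new Error(`No connection resolved for mode "${source.connectionMode}"`);
  }
  // The ACL check runs for every mode, including admin-fixed connections.
  if (!hasAccess(resolved)) {
    throw new Error(`Access denied for connection "${resolved}"`);
  }
  return resolved;
}
```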

Configuring a Chat with Flow Retriever

  1. Open a chat’s Edit Form → Data Sources
  2. Check Flow Retriever
  3. Select a Retriever Flow from the dropdown (only flows with flowType: retriever appear)
  4. Choose a Connection Mode:
    • Fixed — pick a connection from the dropdown
    • User selects at runtime — users choose in the sidebar
  5. Optionally add Retriever Instructions — natural-language guidance injected as {{_retrieverInstructions}}
  6. Save the chat

Creating a Retriever Flow

Retriever flows are standard Flow Designer flows with a few constraints:
  1. Set flowType to retriever — In the flow editor, set the flow type to “Retriever”. This marks the flow as headless and makes it available in the Flow Retriever dropdown.
  2. No user interaction nodes — Retriever flows must not contain FORM_PROMPT or HUMAN nodes. They execute headlessly with no user interaction at execution time.
  3. Accept the standard inputs — The Start node should expect query (the user’s search text) and optionally connectionId and _retrieverInstructions.
  4. Return structured documents — The final LLM/output node should produce a JSON array of documents:
    [
      { "title": "Row 1", "content": "The data from this row...", "url": "optional-link" },
      { "title": "Row 2", "content": "Another result..." }
    ]
    

Example: Database Retriever Flow

A typical SQL retriever flow has this node graph:
Start → LLM (NL→SQL) → Tool (Execute SQL) → LLM (Format Results) → End
Nodes:
| # | Node Type | Purpose |
| --- | --- | --- |
| 1 | Start | Receives { query, connectionId } |
| 2 | LLM | Converts the natural-language query into SQL using a system prompt with schema context. Receives {{_retrieverInstructions}} for domain-specific guidance. |
| 3 | Tool | Executes the generated SQL against the connectionId using the SQL tool provider |
| 4 | LLM | Formats the raw query results into the structured [{ title, content }] JSON array |
| 5 | End | Returns the formatted documents |
System prompt for node 2 (NL→SQL):
You are a SQL query generator. Convert the user's natural-language question
into a valid SQL query for the connected database.

{{_retrieverInstructions}}

User question: {{query}}

Respond with ONLY the SQL query, no explanation.
System prompt for node 4 (Format Results):
Format the following SQL query results as a JSON array.
Each element must have "title" and "content" fields.
The title should be a short identifier. The content should be
a natural-language summary of the row data.

Results:
{{sql_results}}

Respond with ONLY the JSON array.
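
To make the `{{query}}` and `{{_retrieverInstructions}}` placeholders in the prompts above concrete, here is a minimal sketch of placeholder substitution. Flow Designer's real templating rules are not described here, so `renderTemplate` and its behavior (unknown placeholders become empty strings) are assumptions for illustration only.

```typescript
// Replace {{name}} placeholders with values from the flow inputs.
// Assumption: missing or null values render as an empty string.
function renderTemplate(template: string, inputs: Record<string, unknown>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_match, name: string) => {
    const value = inputs[name];
    return value === undefined || value === null ? "" : String(value);
  });
}

// Example values are hypothetical.
const nlToSqlPrompt = renderTemplate(
  "{{_retrieverInstructions}}\n\nUser question: {{query}}",
  {
    query: "Which customers ordered last week?",
    _retrieverInstructions: "Only query the orders and customers tables.",
  },
);
```

Because the admin's instructions are interpolated directly into the system prompt, they shape every SQL query the flow generates for that chat.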

Bootstrapped Retriever Flows

Two example retriever flows are included and can be seeded via Admin → Bootstrap Assets (#/admin/bootstrap) → Bootstrap Flows:
| Flow | Group | Description |
| --- | --- | --- |
| Database Retriever | Retriever Flows | NL→SQL→Execute→Format pipeline for relational databases |
| Cosmos DB Retriever | Retriever Flows | NL→Cosmos SQL→Execute→Format pipeline for Azure Cosmos DB |
These flows are production-ready starting points. Clone and customize them for your specific database schemas and query patterns.

Security

  • Connection ACL: every connection is checked against the user’s entitlements before execution
  • Flow validation: only flows with flowType: retriever are accepted
  • User identity injection: _userId, _userUpn, and _userEmail are injected server-side for row-level filtering
  • Timeout: each flow execution times out after 30 seconds
  • Parallel execution: multiple retriever sources execute in parallel; failures are isolated per source
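
The timeout and failure-isolation behavior can be pictured with a sketch like the one below. It is not the body of `performFlowRetrieverRAG()`; `runRetrieversInParallel`, `withTimeout`, and the `RunFlow` callback are hypothetical names, and only the 30-second default comes from the list above.

```typescript
interface RetrieverDocument {
  title: string;
  content: string;
  url?: string;
}

// Hypothetical callback that executes one retriever flow.
type RunFlow = (flowId: string) => Promise<RetrieverDocument[]>;

// Reject if the work does not settle within ms milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

async function runRetrieversInParallel(
  flowIds: string[],
  runFlow: RunFlow,
  timeoutMs = 30_000,
): Promise<{ documents: RetrieverDocument[]; errors: string[] }> {
  // allSettled keeps one failing or slow source from rejecting the whole batch.
  const settled = await Promise.allSettled(
    flowIds.map((id) => withTimeout(runFlow(id), timeoutMs)),
  );

  const documents: RetrieverDocument[] = [];
  const errors: string[] = [];
  settled.forEach((result, i) => {
    if (result.status === "fulfilled") {
      documents.push(...result.value);
    } else {
      errors.push(`${flowIds[i]}: ${String(result.reason)}`);
    }
  });
  return { documents, errors };
}
```

In this sketch the timeout only stops waiting for a slow flow; it does not cancel the underlying execution, which would need separate support from the flow runner.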

IDataSourceConfig Fields (Flow Retriever)

See Datasource Catalog → IDataSourceConfig Fields for the full field reference. Flow Retriever–specific fields are summarized below.
| Field | Type | Description |
| --- | --- | --- |
| type | 'flowretriever' | Identifies this as a flow retriever source |
| flowId | string | FK to the retriever flow in the flows Cosmos container |
| flowParams | Record<string, any> | Pre-configured inputs (e.g., { connectionId: '...' }) |
| connectionMode | 'fixed' \| 'user' \| 'hybrid' | How the connection is resolved |
| retrieverInstructions | string | NL instructions injected as {{_retrieverInstructions}} |
| indexName | '' | Always empty; flow retrievers don’t use search indexes |
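
As an illustration, the fields above might be typed and populated as shown below. The interface name `FlowRetrieverSourceConfig`, the helper `selectFlowRetrievers`, and all example values are hypothetical; treating `retrieverInstructions` as optional is an assumption based on the configuration steps, not on the full `IDataSourceConfig` definition.

```typescript
type ConnectionMode = "fixed" | "user" | "hybrid";

// Hypothetical narrowing of IDataSourceConfig to the flow retriever fields.
interface FlowRetrieverSourceConfig {
  type: "flowretriever";
  flowId: string;
  flowParams: Record<string, any>;
  connectionMode: ConnectionMode;
  retrieverInstructions?: string; // assumed optional
  indexName: "";
}

// Example entry for a chat's dataSources[]; IDs are made up.
const salesDbRetriever: FlowRetrieverSourceConfig = {
  type: "flowretriever",
  flowId: "flow-sales-db-retriever",
  flowParams: { connectionId: "conn-sales-readonly" },
  connectionMode: "fixed",
  retrieverInstructions: "Only query the orders and customers tables.",
  indexName: "",
};

// Pick out flow retriever entries from a mixed list of data sources.
function selectFlowRetrievers(
  dataSources: Array<{ type: string }>,
): FlowRetrieverSourceConfig[] {
  return dataSources.filter(
    (s): s is FlowRetrieverSourceConfig => s.type === "flowretriever",
  );
}
```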