RAG Retriever

v2.0.1Updated

The RAG Retriever LOP queries a vector index created by a RAG Index LOP and returns semantically relevant text chunks. It communicates with the embedding_sidecar over HTTP for all query operations, supporting both single and batch queries. Results can be output as raw text or as an augmented chat table ready for downstream LLM operators.

Agent Tool Integration

🔧 GetTool Enabled 1 tool

This operator exposes 1 tool that allow Agent and Gemini Live LOPs to retrieve relevant text chunks from a connected knowledge base using single or multi-query semantic search.

Use the Tool Debugger operator to inspect exact tool definitions, schemas, and parameters.

When connected to an Agent LOP, the agent receives a retrieve_knowledge tool. The agent can pass a single query string or an array of queries to retrieve information about multiple topics in one call. Batch queries are processed in a single request for efficiency. Results are returned with similarity scores and source metadata.

If ‘Allow Agent Parameter Control’ is enabled, the agent can also override ‘Top K Results’ and ‘Similarity Threshold’ per query. Otherwise, the operator’s parameter values are always used.

Requirements

SideCar: The operator requires the embedding_sidecar to be running. It is started automatically when a query is triggered.
RAG Index LOP: A configured RAG Index operator with an active (loaded) index must be referenced via the ‘RAG_Index OP’ parameter.

Input/Output

Inputs

Input 1 (optional): A chat table DAT or text DAT used as the query source when ‘Search Mode’ is set to ‘Last User’, ‘Last Assistant’, or ‘Full Chat’.
Input 2 (optional): A text DAT used as the query source when ‘Search Mode’ is set to ‘Input 2’.

Outputs

Output 1: Retrieved text chunks. The format depends on ‘Out1 Output Mode’:
- Raw Context: Chunks separated by --- dividers as plain text.
- Chat Table: The original input conversation (if present) plus a system message containing the retrieved context, formatted for direct use with downstream Chat or Agent operators.

Usage Examples

Basic Retrieval with a Custom Query

Set ‘RAG_Index OP’ to point at a configured RAG Index LOP that has an active index.
Set ‘Search Mode’ to ‘Custom’.
Enter your search text in ‘Query Phrase’.
Set ‘Top K Results’ to the number of chunks you want returned (e.g., 5).
Optionally raise ‘Similarity Threshold’ to filter out low-relevance results.
Pulse ‘Query Index’ to run the search.
Results appear in Output 1 based on the selected output mode.

Augmenting a Chat Conversation with Context

Wire a chat table DAT (from a Chat or Agent LOP) into Input 1.
Set ‘RAG_Index OP’ to your active RAG Index LOP.
Set ‘Search Mode’ to ‘Last User’ to search using the most recent user message.
Set ‘Out1 Output Mode’ to ‘Chat Table’.
Pulse ‘Query Index’.
The output table now contains the original conversation plus a system message with the retrieved context — wire this into a Chat or Agent LOP to give the model relevant knowledge.

Using with an Agent for Automatic Retrieval

Place an Agent LOP and a RAG Retriever LOP in your network.
Point ‘RAG_Index OP’ to your active RAG Index LOP.
Connect the RAG Retriever to the Agent’s tool inputs.
The agent will automatically call retrieve_knowledge when it needs information from the knowledge base, passing single queries or lists of queries as needed.

Search Modes

Last User: Extracts the most recent user message from the Input 1 chat table as the query.
Last Assistant: Extracts the most recent assistant message from the Input 1 chat table.
Full Chat: Combines all user and assistant messages from the Input 1 chat table into a single query.
Input 2: Uses the text content of Input 2 as the query.
Custom: Uses the text entered in ‘Query Phrase’ as the query.

Best Practices

Start with a ‘Similarity Threshold’ of 0 and increase it gradually to find the right cutoff for your data. Setting it too high may filter out useful results.
Use ‘Full Chat’ search mode sparingly — longer queries can reduce precision. ‘Last User’ is usually the best default for chat-based workflows.
When using the agent tool with multiple queries, group related topics together (e.g., comparing pricing across vendors) to get organized, per-topic results.
The ‘Add Text’ parameter lets you customize the system message prefix that wraps the retrieved context in Chat Table output mode.

Troubleshooting

“Cannot query: No valid index connection”: The referenced RAG Index LOP either has no loaded index or is not correctly referenced. Verify that the RAG Index operator has an active index (its ‘Active Index’ indicator should be on) and that ‘RAG_Index OP’ points to it.
“Custom mode requires Query Phrase parameter to not be empty”: When using ‘Custom’ search mode, you must enter text in the ‘Query Phrase’ field.
No results returned: Try lowering ‘Similarity Threshold’ or increasing ‘Top K Results’. The index may not contain content relevant to the query.
“Embedding server not available”: The SideCar could not be started or reached. Check that the SideCar system component is running and that the embedding_sidecar process is healthy.

Parameters

Retriever

Query Index (Query) op('rag_retriever').par.Query Pulse

Default:: False

RAG_Index OP (Indexsource) op('rag_retriever').par.Indexsource COMP

Default:: "" (Empty String)

Query Phrase (Queryphrase) op('rag_retriever').par.Queryphrase Str

Default:: "" (Empty String)

Add Text (Addtext) op('rag_retriever').par.Addtext Str

Default:: "" (Empty String)

Status Log (Statuslog) op('rag_retriever').par.Statuslog Str

Default:: "" (Empty String)

Active Index (Indexstatus) op('rag_retriever').par.Indexstatus Toggle

Default:: False

Top Result Only (Topresultonly) op('rag_retriever').par.Topresultonly Toggle

Default:: False

Top K Results (Topk) op('rag_retriever').par.Topk Int

Default:: 0
Range:: 0 to 1
Slider Range:: 0 to 10

Similarity Threshold (Similaritythreshold) op('rag_retriever').par.Similaritythreshold Float

Default:: 0.0
Range:: 0 to 1
Slider Range:: 0 to 1

Display Results (Displayresults) op('rag_retriever').par.Displayresults Toggle

Default:: False

Header

Clear Results (Clearresults) op('rag_retriever').par.Clearresults Pulse

Default:: False

Changelog

v2.0.12026-03-26

Rename sidecar reference from embedding_server to embedding_sidecar throughout — aligns with finalized sidecar naming scheme
Updated EnsureSidecar and GetSidecarUrl calls in RetrieverEXT.py
Updated sidecar field in manifest.json

v2.0.02026-03-02

Refactor to HTTP sidecar client, remove llama-index dependency - Add batch query support via /query/batch endpoint - Preserve GetTool and HandleRAGQuery agent interface - Add sidecar field to manifest
Initial commit

v1.2.02025-07-30

Added IsIndexConnected tdu.Dependency for reactive state tracking
Added new "Input Query" search mode that reads from in2_query operator
Enhanced connection logic to use IndexActive dependency from IndexCreator
Added Addtext parameter for context prefix
Changed input handling from inputs[0] to in1_conversation operator
Fixed missing os import and improved error handling
Updated GetTool and HandleRAGQuery to use reactive dependencies

v1.1.02025-06-30

added GetTool method to the operator so it can be used by the LOPs controllers

v1.0.02024-11-06

Initial release