RAG Retriever

v2.0.1 (Updated)

The RAG Retriever LOP queries a vector index created by a RAG Index LOP and returns semantically relevant text chunks. It communicates with the embedding_sidecar over HTTP for all query operations, supporting both single and batch queries. Results can be output as raw text or as an augmented chat table ready for downstream LLM operators.

🔧 GetTool Enabled 1 tool

This operator exposes one tool that allows Agent and Gemini Live LOPs to retrieve relevant text chunks from a connected knowledge base using single or multi-query semantic search.

When connected to an Agent LOP, the agent receives a retrieve_knowledge tool. The agent can pass a single query string or an array of queries to retrieve information about multiple topics in one call. Batch queries are processed in a single request for efficiency. Results are returned with similarity scores and source metadata.

If ‘Allow Agent Parameter Control’ is enabled, the agent can also override ‘Top K Results’ and ‘Similarity Threshold’ per query. Otherwise, the operator’s parameter values are always used.
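The override rule above can be sketched in plain Python. This is only an illustration of the described behavior, not the operator's actual code: the function `resolve_tool_args` and the argument names `query`, `top_k`, and `similarity_threshold` are assumptions, since the real tool schema is not shown here.

```python
def resolve_tool_args(args, op_top_k, op_threshold, allow_agent_control):
    """Merge agent-supplied tool arguments with the operator's own settings.

    `args` is a hypothetical dict of tool-call arguments; the key names
    are assumptions made for this sketch.
    """
    query = args['query']
    # Accept either a single query string or a list of queries.
    queries = [query] if isinstance(query, str) else list(query)

    # Operator parameter values are the baseline.
    top_k, threshold = op_top_k, op_threshold
    if allow_agent_control:
        # Only honor agent overrides when control is explicitly enabled.
        top_k = args.get('top_k', top_k)
        threshold = args.get('similarity_threshold', threshold)

    return {'queries': queries, 'top_k': top_k, 'similarity_threshold': threshold}
```

With control disabled, any `top_k` the agent sends is ignored and the operator's value is used. With control enabled, the agent's value takes precedence only for the fields it actually supplies.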

  • SideCar: The operator requires the embedding_sidecar to be running. It is started automatically when a query is triggered.
  • RAG Index LOP: A configured RAG Index operator with an active (loaded) index must be referenced via the ‘RAG_Index OP’ parameter.
  • Input 1 (optional): A chat table DAT or text DAT used as the query source when ‘Search Mode’ is set to ‘Last User’, ‘Last Assistant’, or ‘Full Chat’.
  • Input 2 (optional): A text DAT used as the query source when ‘Search Mode’ is set to ‘Input 2’.
  • Output 1: Retrieved text chunks. The format depends on ‘Out1 Output Mode’:
    • Raw Context: Chunks separated by --- dividers as plain text.
    • Chat Table: The original input conversation (if present) plus a system message containing the retrieved context, formatted for direct use with downstream Chat or Agent operators.
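Scripts that consume the Raw Context output may want the chunks as a list. The following is a minimal sketch, assuming each `---` divider sits on its own line (the exact whitespace around dividers is not specified here). `split_raw_context` is a hypothetical helper, not part of the operator.

```python
def split_raw_context(text):
    """Split Raw Context output into a list of chunk strings.

    Assumes chunks are separated by lines containing only '---'.
    """
    chunks, current = [], []
    for line in text.splitlines():
        if line.strip() == '---':
            # Divider reached: close the chunk collected so far.
            chunks.append('\n'.join(current).strip())
            current = []
        else:
            current.append(line)
    # Keep whatever follows the last divider.
    chunks.append('\n'.join(current).strip())
    # Drop empty chunks produced by leading or trailing dividers.
    return [c for c in chunks if c]
```

Inside TouchDesigner, the input string would typically come from the `.text` of a DAT that receives Output 1.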
  1. Set ‘RAG_Index OP’ to point at a configured RAG Index LOP that has an active index.
  2. Set ‘Search Mode’ to ‘Custom’.
  3. Enter your search text in ‘Query Phrase’.
  4. Set ‘Top K Results’ to the number of chunks you want returned (e.g., 5).
  5. Optionally raise ‘Similarity Threshold’ to filter out low-relevance results.
  6. Pulse ‘Query Index’ to run the search.
  7. Results appear in Output 1 based on the selected output mode.
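Steps 2 to 6 above can also be driven from a script. The sketch below uses the parameter names and menu values listed in the parameter reference (`Searchmode`, `Queryphrase`, `Topk`, `Similaritythreshold`, `Query`). The helper name `run_custom_query` is hypothetical, and the sketch assumes ‘RAG_Index OP’ already points at an active index.

```python
def run_custom_query(retriever, phrase, top_k=5, threshold=0.0):
    """Configure a RAG Retriever for a one-off custom search and trigger it.

    `retriever` is the operator itself, e.g. op('rag_retriever').
    """
    if not phrase.strip():
        # Custom mode rejects an empty Query Phrase, so fail early.
        raise ValueError("Custom mode requires a non-empty query phrase")

    pars = retriever.par
    pars.Searchmode = 'custom'            # step 2
    pars.Queryphrase = phrase             # step 3
    pars.Topk = top_k                     # step 4
    pars.Similaritythreshold = threshold  # step 5
    pars.Query.pulse()                    # step 6: run the search
```

In a TouchDesigner textport or script this might be called as `run_custom_query(op('rag_retriever'), 'warranty terms')`, with the operator path adjusted to your network.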

Augmenting a Chat Conversation with Context

  1. Wire a chat table DAT (from a Chat or Agent LOP) into Input 1.
  2. Set ‘RAG_Index OP’ to your active RAG Index LOP.
  3. Set ‘Search Mode’ to ‘Last User’ to search using the most recent user message.
  4. Set ‘Out1 Output Mode’ to ‘Chat Table’.
  5. Pulse ‘Query Index’.
  6. The output table now contains the original conversation plus a system message with the retrieved context — wire this into a Chat or Agent LOP to give the model relevant knowledge.
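The parameter changes in steps 3 to 5 can be scripted as well. This is a sketch under the assumption that Input 1 is already wired and ‘RAG_Index OP’ is set. `augment_last_user_message` is a hypothetical helper name, and the optional `prefix` maps to the ‘Add Text’ parameter.

```python
def augment_last_user_message(retriever, prefix=None):
    """Query with the latest user message and emit an augmented chat table.

    `retriever` is the operator itself, e.g. op('rag_retriever').
    """
    pars = retriever.par
    pars.Searchmode = 'last_user'  # search with the most recent user message
    pars.Outputmode = 'chat'       # Output 1 becomes a chat table
    if prefix is not None:
        # Optional system message prefix wrapping the retrieved context.
        pars.Addtext = prefix
    pars.Query.pulse()
```

Leaving `prefix` as `None` keeps whatever ‘Add Text’ value is already set on the operator.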

Using with an Agent for Automatic Retrieval

  1. Place an Agent LOP and a RAG Retriever LOP in your network.
  2. Point ‘RAG_Index OP’ to your active RAG Index LOP.
  3. Connect the RAG Retriever to the Agent’s tool inputs.
  4. The agent will automatically call retrieve_knowledge when it needs information from the knowledge base, passing single queries or lists of queries as needed.
  • Last User: Extracts the most recent user message from the Input 1 chat table as the query.
  • Last Assistant: Extracts the most recent assistant message from the Input 1 chat table.
  • Full Chat: Combines all user and assistant messages from the Input 1 chat table into a single query.
  • Input 2: Uses the text content of Input 2 as the query.
  • Custom: Uses the text entered in ‘Query Phrase’ as the query.
  • Start with a ‘Similarity Threshold’ of 0 and increase it gradually to find the right cutoff for your data. Setting it too high may filter out useful results.
  • Use ‘Full Chat’ search mode sparingly — longer queries can reduce precision. ‘Last User’ is usually the best default for chat-based workflows.
  • When using the agent tool with multiple queries, group related topics together (e.g., comparing pricing across vendors) to get organized, per-topic results.
  • The ‘Add Text’ parameter lets you customize the system message prefix that wraps the retrieved context in Chat Table output mode.
  • “Cannot query: No valid index connection”: The referenced RAG Index LOP either has no loaded index or is not correctly referenced. Verify that the RAG Index operator has an active index (its ‘Active Index’ indicator should be on) and that ‘RAG_Index OP’ points to it.
  • “Custom mode requires Query Phrase parameter to not be empty”: When using ‘Custom’ search mode, you must enter text in the ‘Query Phrase’ field.
  • No results returned: Try lowering ‘Similarity Threshold’ or increasing ‘Top K Results’. The index may not contain content relevant to the query.
  • “Embedding server not available”: The SideCar could not be started or reached. Check that the SideCar system component is running and that the embedding_sidecar process is healthy.
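Some of the conditions above can be checked from a script before pulsing ‘Query Index’. The sketch below is an assumption-laden illustration: `preflight_problems` is a hypothetical helper, and it assumes the retriever's own ‘Active Index’ toggle (`Indexstatus`) reflects whether the referenced index is loaded. It does not check SideCar health.

```python
def preflight_problems(retriever):
    """Return a list of likely configuration problems (empty if none found).

    `retriever` is the operator itself, e.g. op('rag_retriever').
    """
    pars = retriever.par
    problems = []

    if pars.Indexsource.eval() is None:
        problems.append("'RAG_Index OP' does not point at an operator")
    elif not pars.Indexstatus.eval():
        # Assumption: this toggle mirrors the connected index state.
        problems.append("Referenced RAG Index has no active index")

    if pars.Searchmode.eval() == 'custom' and not pars.Queryphrase.eval().strip():
        problems.append("'Query Phrase' is empty in Custom search mode")

    return problems
```

An empty list only means these particular checks passed. A query can still return no results if the threshold is too high or the index lacks relevant content.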
Query Index (Query) op('rag_retriever').par.Query Pulse
Default:
False
RAG_Index OP (Indexsource) op('rag_retriever').par.Indexsource COMP
Default:
"" (Empty String)
Search Mode (Searchmode) op('rag_retriever').par.Searchmode Menu
Default:
last_user
Options:
last_user, in2, full_chat, last_assistant, custom
Query Phrase (Queryphrase) op('rag_retriever').par.Queryphrase Str
Default:
"" (Empty String)
Add Text (Addtext) op('rag_retriever').par.Addtext Str
Default:
"" (Empty String)
Status Log (Statuslog) op('rag_retriever').par.Statuslog Str
Default:
"" (Empty String)
Active Index (Indexstatus) op('rag_retriever').par.Indexstatus Toggle
Default:
False
Top Result Only (Topresultonly) op('rag_retriever').par.Topresultonly Toggle
Default:
False
Top K Results (Topk) op('rag_retriever').par.Topk Int
Default:
0
Range:
0 to 1
Slider Range:
0 to 10
Similarity Threshold (Similaritythreshold) op('rag_retriever').par.Similaritythreshold Float
Default:
0.0
Range:
0 to 1
Slider Range:
0 to 1
Display Results (Displayresults) op('rag_retriever').par.Displayresults Toggle
Default:
False
Out1 Output Mode (Outputmode) op('rag_retriever').par.Outputmode Menu
Default:
raw
Options:
raw, chat
Header
Clear Results (Clearresults) op('rag_retriever').par.Clearresults Pulse
Default:
False
v2.0.1 (2026-03-26)
  • Rename sidecar reference from embedding_server to embedding_sidecar throughout, aligning with finalized sidecar naming scheme
  • Updated EnsureSidecar and GetSidecarUrl calls in RetrieverEXT.py
  • Updated sidecar field in manifest.json
v2.0.0 (2026-03-02)
  • Refactor to HTTP sidecar client, remove llama-index dependency
  • Add batch query support via /query/batch endpoint
  • Preserve GetTool and HandleRAGQuery agent interface
  • Add sidecar field to manifest
  • Initial commit
v1.2.0 (2025-07-30)
  • Added IsIndexConnected tdu.Dependency for reactive state tracking
  • Added new "Input Query" search mode that reads from in2_query operator
  • Enhanced connection logic to use IndexActive dependency from IndexCreator
  • Added Addtext parameter for context prefix
  • Changed input handling from inputs[0] to in1_conversation operator
  • Fixed missing os import and improved error handling
  • Updated GetTool and HandleRAGQuery to use reactive dependencies
v1.1.0 (2025-06-30)
  • Added GetTool method to the operator so it can be used by the LOPs controllers
v1.0.0 (2024-11-06)
  • Initial release