Skip to content
  1. OPERATORS
  2. RETRIEVERS

Search RAG

v1.0.0new

search_rag builds a vector index from documents and searches it for relevant context. Use it when a document table or folder of text/Markdown files should become a semantic retrieval source for panels, agents, or chat-context injection.

The operator sends document content to the embedding_sidecar, chunks it, embeds it with the selected Provider and Model, and stores an active vector index. Search queries return matching chunks into results_table and the output DAT. Optional reranking can run a second pass over retrieved chunks.

The operator replaces the older separate RAG index/retriever pattern by combining indexing, persistence, search, and agent tool exposure in one COMP.

  1. Connect a document table to input 1, or set Input Mode to Folder and choose Document Folder plus File Pattern.
  2. Choose Provider, Model, Chunk Size, Overlap, and Index Name on the Index page.
  3. Pulse Create Index and wait until Health is Ready and Progress reaches 1.
  4. Set Query, Top K, Min Similarity, and optional Enable Rerank, then pulse Search.
  5. Inspect results_table or the output DAT. Use Output Mode Chat when input 2 carries a chat table and you want retrieved context appended as a system row.
  6. Enable Save to File, choose Index Folder, and pulse Save Index when the index should survive sidecar/project restarts.
  • Input 1: Document table with content/text/body plus optional id, filename, and metadata columns.
  • Input 2: Optional chat table used only when Output Mode is Chat.
  • Output 1: Raw retrieved context text, separated by dividers. In Chat mode, the operator also writes an internal output_table with copied chat rows plus a context system row.

search_rag exposes GetTool() only after the index is active and Enable Tool is on. The tool name comes from Tool Name, defaulting to search_index, and lets an agent search the built vector index.

When Allow Agent Control is enabled, the agent may also override result count and similarity threshold for a call. Tool Preset can turn the tool off or configure whether that control is available.

  • agent: Calls the built index as a document search tool.
  • source_dat: Supplies normalized document rows for indexing.
  • source_docs: Supplies local document content.
  • source_crawl4ai: Feeds crawled web content into the index.
  • source_github: Feeds repository text into the index.
  • search_text: Provides lightweight lexical search alongside semantic retrieval.
  • The embedding_sidecar must be running and pass preflight before indexing or search can complete.
  • Agents do not see the search tool on a fresh placement; the index must be active first.
  • Changing Provider, Model, Dimension, chunking, or document source can make an existing index stale. Rebuild or upsert before relying on results.
  • Auto Index is useful while iterating but can trigger work when source tables or folders change.
  • Without Save to File, the active index is not durable across sidecar restarts.
  • OpenAI embedding requires a configured OpenAI API key; Local uses Ollama at localhost:11434.
Status (Status) op('search_rag').par.Status Str

Current vector-index status.

Default:
"" (Empty String)
Health (Indexhealth) op('search_rag').par.Indexhealth Menu
Default:
not_built
Options:
not_built, building, ready, stale, error
Progress (Progress) op('search_rag').par.Progress Float
Default:
0.0
Range:
0 to 1
Search Header
Query (Query) op('search_rag').par.Query Str

Search query for the active vector index.

Default:
"" (Empty String)
Search (Search) op('search_rag').par.Search Pulse
Default:
False
Config Header
Top K (Topk) op('search_rag').par.Topk Int

Maximum number of vector-search chunks to return.

Default:
5
Range:
1 to 100
Min Similarity (Similaritythreshold) op('search_rag').par.Similaritythreshold Float

Minimum similarity score for returned chunks.

Default:
0.0
Range:
0 to 1
Reranking Header
Enable Rerank (Rerank) op('search_rag').par.Rerank Toggle

Run a second-pass reranker on retrieved chunks when available.

Default:
False
Rerank Model (Rerankmodel) op('search_rag').par.Rerankmodel StrMenu
Default:
default
Menu Options:
  • Default (FlashRank) (default)
  • ms-marco-MiniLM-L-12-v2 (ms-marco-MiniLM-L-12-v2)
Output Header
Output Mode (Outputmode) op('search_rag').par.Outputmode Menu

Choose whether search output is raw context text or a chat-table context injection.

Default:
raw
Options:
raw, chat
Context Prefix (Addtext) op('search_rag').par.Addtext Str
Default:
Use the following context to answer the question:
Actions Header
Clear Results (Clearresults) op('search_rag').par.Clearresults Pulse
Default:
False
Reset (Reset) op('search_rag').par.Reset Pulse

Clear results, index, and logs

Default:
False
Source Header
Input Mode (Inputmode) op('search_rag').par.Inputmode Menu
Default:
auto
Options:
auto, doctable, folder
Document Folder (Documentfolder) op('search_rag').par.Documentfolder Folder
Default:
"" (Empty String)
File Pattern (Filepattern) op('search_rag').par.Filepattern Str
Default:
*.txt *.md
Embedding Header
Provider (Embedmodel) op('search_rag').par.Embedmodel Menu
Default:
local
Options:
local, openai
Model (Ollamamodel) op('search_rag').par.Ollamamodel StrMenu
Default:
nomic-embed-text
Menu Options:
  • nomic-embed-text (nomic-embed-text)
  • nomic-embed-text-v2-moe (recommended) (nomic-embed-text-v2-moe)
  • bge-m3 (multilingual) (bge-m3)
  • qwen3-embedding:0.6b (lightweight) (qwen3-embedding:0.6b)
  • mxbai-embed-large (mxbai-embed-large)
  • all-minilm (fast, small) (all-minilm)
Dimension (Embeddimension) op('search_rag').par.Embeddimension Int

Optional embedding dimension truncation. 0 uses the model default.

Default:
0
Range:
0 to 4096
Chunking Header
Strategy (Chunkstrategy) op('search_rag').par.Chunkstrategy Menu
Default:
recursive
Options:
recursive, fixed
Chunk Size (Chunksize) op('search_rag').par.Chunksize Int

Target characters per indexed chunk.

Default:
1024
Range:
64 to 16384
Overlap (Chunkoverlap) op('search_rag').par.Chunkoverlap Int

Characters shared between adjacent chunks.

Default:
128
Range:
0 to 4096
Actions Header
Index Name (Indexname) op('search_rag').par.Indexname Str
Default:
"" (Empty String)
Auto Index (Autoindex) op('search_rag').par.Autoindex Toggle
Default:
True
Create Index (Createindex) op('search_rag').par.Createindex Pulse
Default:
False
Mark Dirty (Dirtyindex) op('search_rag').par.Dirtyindex Pulse
Default:
False
Clear Index (Clearindex) op('search_rag').par.Clearindex Pulse
Default:
False
Persistence Header
Save to File (Savetofile) op('search_rag').par.Savetofile Toggle
Default:
False
Index Folder (Indexfolder) op('search_rag').par.Indexfolder Folder
Default:
"" (Empty String)
Load on Start (Loadonstart) op('search_rag').par.Loadonstart Toggle
Default:
False
Save Doc Table (Savedattable) op('search_rag').par.Savedattable Toggle
Default:
True
Save Index (Saveindex) op('search_rag').par.Saveindex Pulse
Default:
False
Load Index (Loadindex) op('search_rag').par.Loadindex Pulse
Default:
False
Stats Header
Documents (Documentcount) op('search_rag').par.Documentcount Int

Number of source documents tracked for the current index.

Default:
0
Range:
0 to 1
Chunks (Chunkcount) op('search_rag').par.Chunkcount Int

Number of indexed chunks currently tracked.

Default:
0
Range:
0 to 1
Last Indexed (Lastindexed) op('search_rag').par.Lastindexed Str
Default:
"" (Empty String)
Preset (Toolpreset) op('search_rag').par.Toolpreset Menu
Default:
full
Options:
custom, off, readonly, interactive, full
Enable Tool (Enablesearchindex) op('search_rag').par.Enablesearchindex Toggle
Default:
True
Allow Agent Control (Allowagentcontrol) op('search_rag').par.Allowagentcontrol Toggle
Default:
False
Tool Config Header
Tool Name (Toolname) op('search_rag').par.Toolname Str
Default:
search_index
Description (Tooldescription) op('search_rag').par.Tooldescription Str

Description exposed to agents for this vector search tool.

Default:
Search a vector index of documents for relevant information.
v1.0.02026-05-02
  • reorganized parameter layout: Search page first with query/config/reranking/output, Index page second with source/embedding/chunking/persistence - added Reset pulse and Clear Results pulse to Search page - added Status label reset and Indexhealth readout - updated category to Retrievers
  • set release_level to prod
  • Initial search_rag structure