Search RAG

v1.0.0 (new)

search_rag builds a vector index from documents and searches it for relevant context. Use it when a document table or folder of text/Markdown files should become a semantic retrieval source for panels, agents, or chat-context injection.

What It Does

The operator sends document content to the embedding_sidecar, chunks it, embeds it with the selected Provider and Model, and stores an active vector index. Search queries return matching chunks into results_table and the output DAT. Optional reranking can run a second pass over retrieved chunks.

The operator replaces the older separate RAG index/retriever pattern by combining indexing, persistence, search, and agent tool exposure in one COMP.

Typical Workflow

1. Connect a document table to input 1, or set Input Mode to Folder and choose Document Folder plus File Pattern.
2. Choose Provider, Model, Chunk Size, Overlap, and Index Name on the Index page.
3. Pulse Create Index and wait until Health is Ready and Progress reaches 1.
4. Set Query, Top K, Min Similarity, and optional Enable Rerank, then pulse Search.
5. Inspect results_table or the output DAT. Use Output Mode Chat when input 2 carries a chat table and you want retrieved context appended as a system row.
6. Enable Save to File, choose Index Folder, and pulse Save Index when the index should survive sidecar/project restarts.
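
The workflow above can also be driven from a script. The following is a minimal sketch, assuming the COMP is reachable as op('search_rag') and that its parameters accept value assignment and pulse() in the usual TouchDesigner way. The helper names configure_index and run_search are hypothetical; the parameter names are taken from the Parameters reference on this page.

```python
def configure_index(comp, index_name, chunk_size=1024, overlap=128):
    """Set Index-page parameters, then pulse Create Index."""
    comp.par.Indexname = index_name
    comp.par.Chunksize = chunk_size
    comp.par.Chunkoverlap = overlap
    comp.par.Createindex.pulse()


def run_search(comp, query, top_k=5, min_similarity=0.0, rerank=False):
    """Set Search-page parameters, then pulse Search."""
    comp.par.Query = query
    comp.par.Topk = top_k
    comp.par.Similaritythreshold = min_similarity
    comp.par.Rerank = rerank
    comp.par.Search.pulse()


# Inside TouchDesigner only (op() does not exist elsewhere):
# rag = op('search_rag')
# configure_index(rag, 'project_docs')
# ... wait until Progress reaches 1, then:
# run_search(rag, 'How do I save the index?', top_k=3, min_similarity=0.3)
```

Indexing is not instantaneous, so a script should not pulse Search immediately after Create Index; check Progress first, as in the manual workflow.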

Inputs And Outputs

Input 1: Document table with content/text/body plus optional id, filename, and metadata columns.
Input 2: Optional chat table used only when Output Mode is Chat.
Output 1: Raw retrieved context text, separated by dividers. In Chat mode, the operator also writes an internal output_table with copied chat rows plus a context system row.
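
As an illustration of the Input 1 layout, the sketch below builds table rows from a list of Python dicts. It is a hedged example, not the operator's own loader: the helper name build_document_rows is hypothetical, and storing metadata as a JSON string is an assumption.

```python
import json

# Column names the page lists as acceptable content columns.
CONTENT_COLUMNS = ("content", "text", "body")


def build_document_rows(docs):
    """Return table rows (header first) for a list of document dicts."""
    rows = [["id", "filename", "content", "metadata"]]
    for i, doc in enumerate(docs):
        # Take the first non-empty content-like field.
        content = next((doc[k] for k in CONTENT_COLUMNS if doc.get(k)), "")
        if not content:
            continue  # nothing to index for this document
        rows.append([
            str(doc.get("id", i)),
            doc.get("filename", ""),
            content,
            # Assumption: metadata is serialized as JSON text.
            json.dumps(doc.get("metadata", {}), sort_keys=True),
        ])
    return rows


# In TouchDesigner the rows could be written to a Table DAT, for example:
# table = op('documents'); table.clear()
# for row in build_document_rows(docs): table.appendRow(row)
```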

Agent Tool Use

search_rag exposes GetTool() only after the index is active and Enable Tool is on. The tool name comes from Tool Name, defaulting to search_index, and lets an agent search the built vector index.

When Allow Agent Control is enabled, the agent may also override result count and similarity threshold for a call. Tool Preset can turn the tool off or configure whether that control is available.
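
The effect of Allow Agent Control can be pictured as follows. This is a sketch under stated assumptions: the argument names top_k and min_similarity and the function resolve_search_args are hypothetical, and clamping to the documented parameter ranges (1 to 100, 0 to 1) is an assumption about sensible behavior, not confirmed operator logic.

```python
def resolve_search_args(top_k, min_similarity, agent_args, allow_agent_control):
    """Pick the result count and threshold used for one tool call."""
    if not allow_agent_control:
        # Agent-supplied overrides are ignored; operator settings win.
        return top_k, min_similarity
    k = int(agent_args.get("top_k", top_k))
    s = float(agent_args.get("min_similarity", min_similarity))
    # Assumption: overrides are clamped to the parameter ranges.
    k = max(1, min(100, k))
    s = max(0.0, min(1.0, s))
    return k, s
```

With control disabled, a call asking for 20 results still gets the operator's Top K; with control enabled, the same call gets 20.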

Works Well With

agent: Calls the built index as a document search tool.
source_dat: Supplies normalized document rows for indexing.
source_docs: Supplies local document content.
source_crawl4ai: Feeds crawled web content into the index.
source_github: Feeds repository text into the index.
search_text: Provides lightweight lexical search alongside semantic retrieval.

Gotchas

The embedding_sidecar must be running and pass preflight before indexing or search can complete.
Agents do not see the search tool on a fresh placement; the index must be active first.
Changing Provider, Model, Dimension, chunking, or document source can make an existing index stale. Rebuild or upsert before relying on results.
Auto Index is on by default. It is useful while iterating, but it can trigger indexing work whenever source tables or folders change.
Without Save to File, the active index is not durable across sidecar restarts.
OpenAI embedding requires a configured OpenAI API key; Local uses Ollama at localhost:11434.
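
One way to notice a stale index from project code is to fingerprint the settings and source text that went into it, then compare before trusting results. The sketch below is a project-side illustration only; index_fingerprint and is_stale are hypothetical helpers, and nothing here describes how the operator itself tracks staleness.

```python
import hashlib
import json


def index_fingerprint(provider, model, dimension, chunk_size, overlap, doc_texts):
    """Hash the settings and document text that define an index."""
    settings = {
        "provider": provider,
        "model": model,
        "dimension": dimension,
        "chunk_size": chunk_size,
        "overlap": overlap,
    }
    digest = hashlib.sha256()
    digest.update(json.dumps(settings, sort_keys=True).encode("utf-8"))
    for text in doc_texts:
        digest.update(b"\x00")  # separator so ["ab"] differs from ["a", "b"]
        digest.update(text.encode("utf-8"))
    return digest.hexdigest()


def is_stale(saved_fingerprint, **current):
    """True when current settings or documents no longer match the saved index."""
    return index_fingerprint(**current) != saved_fingerprint
```

Storing the fingerprint next to a saved index gives a cheap check after changing Provider, Model, Dimension, chunking, or the document source.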

Parameters

Search

Status (Status) op('search_rag').par.Status Str

Current vector-index status.

Default: "" (Empty String)

Progress (Progress) op('search_rag').par.Progress Float

Default: 0.0
Range: 0 to 1

Search Header

Query (Query) op('search_rag').par.Query Str

Search query for the active vector index.

Default: "" (Empty String)

Search (Search) op('search_rag').par.Search Pulse

Default: False

Config Header

Top K (Topk) op('search_rag').par.Topk Int

Maximum number of vector-search chunks to return.

Default: 5
Range: 1 to 100

Min Similarity (Similaritythreshold) op('search_rag').par.Similaritythreshold Float

Minimum similarity score for returned chunks.

Default: 0.0
Range: 0 to 1
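
How Top K and Min Similarity interact can be sketched as a threshold followed by a cut-off. The example assumes cosine similarity as the score, which this page does not state; select_chunks and cosine are hypothetical names used for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_chunks(query_vec, chunks, top_k=5, min_similarity=0.0):
    """chunks is a list of (text, vector); returns [(score, text)], best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    # Min Similarity drops weak matches before Top K limits the count.
    kept = [item for item in scored if item[0] >= min_similarity]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:top_k]
```

A high Min Similarity can therefore return fewer than Top K chunks, or none at all.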

Reranking Header

Enable Rerank (Rerank) op('search_rag').par.Rerank Toggle

Run a second-pass reranker on retrieved chunks when available.

Default: False

Rerank Model (Rerankmodel) op('search_rag').par.Rerankmodel StrMenu

Default: default

Menu Options:

Default (FlashRank) (default)
ms-marco-MiniLM-L-12-v2 (ms-marco-MiniLM-L-12-v2)
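
Conceptually, reranking re-scores the chunks that vector search already returned and re-orders them. The sketch below shows only that shape: term_overlap is a toy scorer standing in for a real reranker model, and both function names are hypothetical. It is not how FlashRank or ms-marco-MiniLM-L-12-v2 compute relevance.

```python
def term_overlap(query, text):
    """Toy scorer: fraction of query terms that also appear in the chunk."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


def rerank(query, chunks, score_fn=term_overlap):
    """Second pass: re-order already retrieved chunks by score_fn, best first."""
    return sorted(chunks, key=lambda text: score_fn(query, text), reverse=True)
```

Because the second pass only sees retrieved chunks, Top K still bounds what the reranker can promote.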

Output Header

Context Prefix (Addtext) op('search_rag').par.Addtext Str

Default: Use the following context to answer the question:
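
To show where Context Prefix fits, the sketch below assembles a context string from retrieved chunks and appends it to a chat table as a system row. The divider string, the role/message column layout, and the helper names build_context and append_context_row are all assumptions made for illustration.

```python
DEFAULT_PREFIX = "Use the following context to answer the question:"


def build_context(chunks, prefix=DEFAULT_PREFIX, divider="\n---\n"):
    """Join chunks with a divider and put the prefix in front."""
    body = divider.join(chunk.strip() for chunk in chunks)
    return f"{prefix}\n\n{body}" if prefix else body


def append_context_row(chat_rows, context):
    """Copy the chat rows and add the context as a trailing system row."""
    # Assumption: the chat table has two columns, role and message.
    return [list(row) for row in chat_rows] + [["system", context]]
```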

Actions Header

Clear Results (Clearresults) op('search_rag').par.Clearresults Pulse

Default: False

Reset (Reset) op('search_rag').par.Reset Pulse

Clear results, index, and logs

Default: False

Index

Source Header

Document Folder (Documentfolder) op('search_rag').par.Documentfolder Folder

Default: "" (Empty String)

File Pattern (Filepattern) op('search_rag').par.Filepattern Str

Default: *.txt *.md
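
The default File Pattern reads like a space-separated list of glob patterns. Assuming that interpretation, and assuming a non-recursive, case-insensitive match (none of which this page confirms), file selection could be previewed like this; matches_pattern and list_documents are hypothetical helpers.

```python
from fnmatch import fnmatch
from pathlib import Path


def matches_pattern(filename, patterns="*.txt *.md"):
    """True if the filename matches any space-separated glob pattern."""
    name = filename.lower()
    return any(fnmatch(name, pattern.lower()) for pattern in patterns.split())


def list_documents(folder, patterns="*.txt *.md"):
    """Sorted files directly inside folder that match the patterns."""
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and matches_pattern(path.name, patterns)
    )
```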

Embedding Header

Model (Ollamamodel) op('search_rag').par.Ollamamodel StrMenu

Default: nomic-embed-text

Menu Options:

nomic-embed-text (nomic-embed-text)
nomic-embed-text-v2-moe (recommended) (nomic-embed-text-v2-moe)
bge-m3 (multilingual) (bge-m3)
qwen3-embedding:0.6b (lightweight) (qwen3-embedding:0.6b)
mxbai-embed-large (mxbai-embed-large)
all-minilm (fast, small) (all-minilm)

Dimension (Embeddimension) op('search_rag').par.Embeddimension Int

Optional embedding dimension truncation. 0 uses the model default.

Default: 0
Range: 0 to 4096
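
Dimension truncation can be illustrated as keeping the leading components of a vector. The sketch assumes the truncated vector is re-normalized to unit length, a common practice for models trained to support shortened embeddings; whether the operator does this is not stated here, and truncate_embedding is a hypothetical name.

```python
import math


def truncate_embedding(vec, dimension=0):
    """Keep the first `dimension` values; 0 keeps the model default length."""
    if dimension <= 0 or dimension >= len(vec):
        return list(vec)
    head = list(vec[:dimension])
    # Assumption: re-normalize so cosine scores stay comparable.
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head
```

Vectors of different lengths cannot be compared, which is why changing Dimension makes an existing index stale.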

Chunking Header

Chunk Size (Chunksize) op('search_rag').par.Chunksize Int

Target characters per indexed chunk.

Default: 1024
Range: 64 to 16384

Overlap (Chunkoverlap) op('search_rag').par.Chunkoverlap Int

Characters shared between adjacent chunks.

Default: 128
Range: 0 to 4096
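
Chunk Size and Overlap together define a sliding window over each document. The sketch below uses fixed character windows; since Chunk Size is described as a target, the operator may well split on more natural boundaries, so treat chunk_text as a hypothetical illustration of the arithmetic only.

```python
def chunk_text(text, chunk_size=1024, overlap=128):
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each window advances 896 characters, so a 2000-character document yields three chunks.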

Actions Header

Index Name (Indexname) op('search_rag').par.Indexname Str

Default: "" (Empty String)

Auto Index (Autoindex) op('search_rag').par.Autoindex Toggle

Default: True

Create Index (Createindex) op('search_rag').par.Createindex Pulse

Default: False

Mark Dirty (Dirtyindex) op('search_rag').par.Dirtyindex Pulse

Default: False

Clear Index (Clearindex) op('search_rag').par.Clearindex Pulse

Default: False

Persistence Header

Save to File (Savetofile) op('search_rag').par.Savetofile Toggle

Default: False

Index Folder (Indexfolder) op('search_rag').par.Indexfolder Folder

Default: "" (Empty String)

Load on Start (Loadonstart) op('search_rag').par.Loadonstart Toggle

Default: False

Save Doc Table (Savedattable) op('search_rag').par.Savedattable Toggle

Default: True

Save Index (Saveindex) op('search_rag').par.Saveindex Pulse

Default: False

Load Index (Loadindex) op('search_rag').par.Loadindex Pulse

Default: False
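
The persistence parameters can be set together from a script. This is a minimal sketch, assuming standard TouchDesigner parameter assignment and pulse(); persist_index is a hypothetical helper, and the folder path in the usage comment is only an example.

```python
def persist_index(comp, index_folder, load_on_start=True):
    """Enable file persistence, choose the folder, and pulse Save Index."""
    comp.par.Savetofile = True
    comp.par.Indexfolder = index_folder
    comp.par.Loadonstart = load_on_start
    comp.par.Saveindex.pulse()


# Inside TouchDesigner only:
# persist_index(op('search_rag'), 'indexes/project_docs')
```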

Stats Header

Documents (Documentcount) op('search_rag').par.Documentcount Int

Number of source documents tracked for the current index.

Default: 0
Range: 0 to 1

Chunks (Chunkcount) op('search_rag').par.Chunkcount Int

Number of indexed chunks currently tracked.

Default: 0
Range: 0 to 1

Last Indexed (Lastindexed) op('search_rag').par.Lastindexed Str

Default: "" (Empty String)

Tool

Enable Tool (Enablesearchindex) op('search_rag').par.Enablesearchindex Toggle

Default: True

Allow Agent Control (Allowagentcontrol) op('search_rag').par.Allowagentcontrol Toggle

Default: False

Tool Config Header

Tool Name (Toolname) op('search_rag').par.Toolname Str

Default: search_index

Description (Tooldescription) op('search_rag').par.Tooldescription Str

Description exposed to agents for this vector search tool.

Default: Search a vector index of documents for relevant information.

Changelog

v1.0.0 (2026-05-02)

Reorganized parameter layout: Search page first with query/config/reranking/output, Index page second with source/embedding/chunking/persistence.
Added Reset pulse and Clear Results pulse to Search page.
Added Status label reset and Indexhealth readout.
Updated category to Retrievers.
Set release_level to prod.
Initial search_rag structure.