Search Text
v1.0.0
search_text builds a fast BM25 text index over tables, folders, or an existing RAG index. Use it when you need lightweight lexical retrieval, exact phrase support, required/excluded terms, code-aware tokenization, or a dependency-free companion to semantic search.
What It Does
The operator reads the selected source, builds an in-memory BM25 index, and writes ranked hits to results_table plus search statistics to stats_table. It supports Source Table, RAG Index, Direct Table, and Folder input modes, optional chunking, field weighting, deduplication, and multi-query rank fusion.
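As a rough illustration of what BM25 ranking computes, here is a minimal sketch of the standard scoring formula. It is not the operator's implementation: the function name bm25_scores, the pre-tokenized input, and the non-negative IDF variant are assumptions chosen for the example. Only the k1 and b defaults mirror the K1 and B parameters documented on this page.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document (docs are lists of tokens)."""
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant (an assumption for this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # k1 controls term-frequency saturation, b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [["bm25", "index"], ["vector", "search"], ["bm25", "bm25", "ranking"]]
scores = bm25_scores(["bm25"], docs)
```

With b set to 0, two documents with the same term counts score identically regardless of length, which matches the description of B as a length-normalization control.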
Search syntax supports quoted phrases, +required terms, -excluded terms, and | separated multi-query searches when Multi-Query is enabled.
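The syntax above can be pictured with a small parser. This is a sketch under stated assumptions, not the operator's tokenizer: ParsedQuery, parse_query, and split_multi_query are hypothetical names, and lowercasing plus whitespace splitting are simplifications.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ParsedQuery:
    phrases: list = field(default_factory=list)   # "quoted phrases"
    required: list = field(default_factory=list)  # +term
    excluded: list = field(default_factory=list)  # -term
    terms: list = field(default_factory=list)     # plain terms


# Either a double-quoted phrase or a run of non-whitespace characters.
_TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def parse_query(text):
    """Split one query string into phrases, required, excluded, and plain terms."""
    parsed = ParsedQuery()
    for phrase, word in _TOKEN.findall(text):
        if phrase:
            parsed.phrases.append(phrase.lower())
        elif word.startswith("+") and len(word) > 1:
            parsed.required.append(word[1:].lower())
        elif word.startswith("-") and len(word) > 1:
            parsed.excluded.append(word[1:].lower())
        else:
            parsed.terms.append(word.lower())
    return parsed


def split_multi_query(text, multi_query=False):
    """Split on | only when multi-query mode is on, mirroring the Multi-Query toggle."""
    if not multi_query:
        return [text.strip()]
    return [part.strip() for part in text.split("|") if part.strip()]


parsed = parse_query('"exact phrase" +bm25 -vector index')
```

Here parsed holds one phrase, one required term, one excluded term, and one plain term. With multi_query left off, a | in the text is passed through as part of a single query.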
Typical Workflow
- On the BM25 page, choose Input Mode and set the matching source: Source Table, RAG Index Source, Direct Table DAT, or Document Folder.
- Choose a Search Preset and Tokenizer, then adjust K1 / B only when the preset is not enough.
- Enable Chunking for long documents, or Field Weighting when selected table columns should influence ranking more strongly.
- On the Search page, enter Query and set Top K Results, Minimum Score, Deduplicate, and Multi-Query as needed.
- Pulse Search. With Auto Index enabled, the index builds or rebuilds before searching when the source has changed.
- Inspect results_table, stats_table, and doc_index_table.
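The Search-page steps of this workflow can also be scripted. The helper below is a minimal sketch: run_text_search is a hypothetical name, the parameter names (Query, Topk, Minscore, Search) are taken from the Parameters section, and the range check simply mirrors the documented Topk range. It assumes the source and index settings have already been configured on the BM25 page.

```python
def run_text_search(search_op, query, top_k=10, min_score=0.0):
    """Set the Search-page parameters on the operator and pulse Search."""
    if not 1 <= top_k <= 100:
        raise ValueError("Topk must be between 1 and 100")
    search_op.par.Query = query
    search_op.par.Topk = top_k
    search_op.par.Minscore = min_score
    # Pulse parameters are triggered with .pulse() rather than assigned.
    search_op.par.Search.pulse()
```

Inside a project this would be called as run_text_search(op('search_text'), '"exact phrase" +bm25'), after which the results and statistics tables can be inspected as in the last step.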
Inputs And Outputs
- Inputs: No connector inputs. Select data through Source Table, RAG Index Source, Direct Table DAT, or Document Folder.
- Output 1: Search results table.
- Output 2: Search statistics table.
Agent Tool Use
search_text exposes GetTool() after the text index is ready. The main tool name comes from Tool Name, defaulting to search_data, and lets an agent run single-query or multi-query BM25 searches.
When Expose Fetch Tool is enabled, the operator also exposes a companion fetch tool, derived as {Tool Name}_fetch unless Fetch Tool Name is set. The fetch tool retrieves the full document for a returned doc_id, including reassembled chunks when chunking was used.
Allow Agent Control lets the agent override search limits and BM25 scoring parameters for a call. Return Columns can keep agent results compact by returning selected source-table fields instead of full content.
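The naming rule for the fetch tool and the single-query versus multi-query split can be sketched as follows. Both functions are hypothetical illustrations, not the operator's code. The query and queries field names come from the changelog, while the error messages and the choice to reject a call that supplies neither field are assumptions.

```python
def fetch_tool_name(tool_name="search_data", fetch_tool_name_override=""):
    """Use the explicit Fetch Tool Name when set, else derive {Tool Name}_fetch."""
    override = fetch_tool_name_override.strip()
    return override if override else f"{tool_name}_fetch"


def validate_search_args(args):
    """Accept either 'query' (one string) or 'queries' (a list), never both."""
    has_query = bool(args.get("query"))
    has_queries = bool(args.get("queries"))
    if has_query and has_queries:
        raise ValueError("Pass either 'query' or 'queries', not both")
    if not has_query and not has_queries:
        # Assumption: an empty call is treated as an error in this sketch.
        raise ValueError("Pass one of 'query' or 'queries'")
    return [args["query"]] if has_query else list(args["queries"])
```

Normalizing both shapes to a list of query strings keeps the downstream search path identical for single and multi-query calls.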
Works Well With
- agent: Uses the search and fetch tools after the index is ready.
- search_rag: Pairs lexical BM25 search with semantic vector retrieval.
- graph: Provides direct keyword retrieval alongside relationship traversal.
- source_dat: Supplies table data for Source Table or Direct Table modes.
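When lexical and semantic results are paired, the two ranked lists have to be merged somehow. This page does not say which fusion method is used, so the sketch below shows reciprocal rank fusion purely as one common technique. The function name and the constant k=60 are assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids; earlier ranks contribute more."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


lexical = ["a", "b", "c"]    # e.g. BM25 order
semantic = ["b", "d", "a"]   # e.g. vector-search order
fused = reciprocal_rank_fusion([lexical, semantic])
```

Documents that appear high in both lists rise to the top, and only ranks are used, so the two score scales never need to be made comparable.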
Gotchas
- Agents do not see the search tool until the index is ready.
- Auto Index can rebuild on Search when input data changes. Disable it when you want explicit Dirty Index / Search control.
- Field Weighting can overemphasize boosted columns on structured metadata tables; use it only when those columns should dominate ranking.
- RAG Index mode requires a ready index source that exposes chunks the operator can read.
- Fetch tool lookups depend on stable doc_id values; chunked documents are reassembled by original document id.
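To see why stable doc_id values matter for reassembly, here is a minimal sketch of character-based chunking with overlap and the matching reassembly step. The record layout (doc_id, start, text) and both function names are assumptions. Only the chunk_size and overlap defaults mirror the Chunksize and Overlap parameters.

```python
def chunk_text(doc_id, text, chunk_size=1000, overlap=200):
    """Split text into overlapping character chunks tagged with the source doc_id."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append({"doc_id": doc_id, "start": start,
                       "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the document
        start += step
    return chunks


def reassemble(chunks, doc_id):
    """Rebuild the original text for one doc_id from its chunks."""
    parts = sorted((c for c in chunks if c["doc_id"] == doc_id),
                   key=lambda c: c["start"])
    text = ""
    for c in parts:
        # Skip the characters already covered by earlier chunks.
        text += c["text"][len(text) - c["start"]:]
    return text


source = "abcdefghijklmnopqrstuvwxyz" * 10
chunks = chunk_text("doc1", source, chunk_size=100, overlap=20)
restored = reassemble(chunks, "doc1")
```

If the doc_id attached to a chunk changed between indexing and fetching, the filter in reassemble would miss that chunk and the fetched document would come back incomplete.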
Parameters
Search
op('search_text').par.Status Str
Current text-search status.
- Default: "" (Empty String)
op('search_text').par.Indexsize Str
- Default: "" (Empty String)
op('search_text').par.Lastsearchtime Str
- Default: "" (Empty String)
op('search_text').par.Query Str
Search query. Supports: "exact phrase", -exclude, +required terms.
- Default: "" (Empty String)
op('search_text').par.Search Pulse
- Default: False
op('search_text').par.Topk Int
Maximum number of text-search results to return.
- Default: 10
- Range: 1 to 100
op('search_text').par.Minscore Float
Minimum relevance score required for a result to be kept.
- Default: 0.0
- Range: 0 to 10
op('search_text').par.Dedup Toggle
Remove duplicate hits from the same source document.
- Default: True
op('search_text').par.Multiquery Toggle
Split the query on | and fuse the per-query rankings.
- Default: False
op('search_text').par.Autoindex Toggle
Automatically rebuild the index before searching when inputs have changed.
- Default: True
op('search_text').par.Clearresults Pulse
- Default: False
op('search_text').par.Sourcetable DAT
- Default: "" (Empty String)
op('search_text').par.Ragindex COMP
- Default: "" (Empty String)
op('search_text').par.Directtable DAT
- Default: "" (Empty String)
op('search_text').par.Documentfolder Folder
- Default: "" (Empty String)
op('search_text').par.Filepattern Str
- Default: *.txt *.md *.py
op('search_text').par.K1 Float
BM25 term-frequency saturation. Higher values reward repeated terms more.
- Default: 1.5
- Range: 0.1 to 3
op('search_text').par.B Float
BM25 length normalization. 0 disables length normalization; 1 applies it fully.
- Default: 0.75
- Range: 0 to 1
op('search_text').par.Enablechunking Toggle
- Default: False
op('search_text').par.Chunksize Int
Maximum characters per indexed chunk when chunking is enabled.
- Default: 1000
- Range: 100 to 10000
op('search_text').par.Overlap Int
Characters shared between adjacent chunks when chunking is enabled.
- Default: 200
- Range: 0 to 1000
op('search_text').par.Fieldweighting Toggle
Boost selected table columns while building the text index.
- Default: False
op('search_text').par.Boostcolumns Str
Comma-separated column names to boost, such as name,title.
- Default: name,title
op('search_text').par.Boostfactor Float
Score multiplier for boosted columns during indexing.
- Default: 3.0
- Range: 1 to 10
op('search_text').par.Dirtyindex Pulse
Mark the text index dirty so the next search rebuilds it.
- Default: False
op('search_text').par.Clearindex Pulse
Clear indexed documents and index state.
- Default: False
op('search_text').par.Reset Pulse
Clear results, index, and logs.
- Default: False
op('search_text').par.Toolname Str
Unique tool name for agent integration.
- Default: search_data
op('search_text').par.Allowagentcontrol Toggle
- Default: False
op('search_text').par.Tooldescription DAT
Optional text DAT with custom tool description. Leave empty to use default.
- Default: "" (Empty String)
op('search_text').par.Describedatasource Toggle
Append data source info to the tool description.
- Default: False
op('search_text').par.Agentresultstable Toggle
- Default: True
op('search_text').par.Returncolumns Str
Comma-list of source table columns to include in compact tool results. Empty returns full content.
- Default: "" (Empty String)
op('search_text').par.Enablefetchtool Toggle
Expose a companion fetch tool for retrieving full document content by doc_id.
- Default: True
op('search_text').par.Fetchtoolname Str
Leave empty to auto-derive as "{Toolname}_fetch".
- Default: "" (Empty String)
Changelog
v1.0.0 (2026-05-02)
- reorganized parameter layout: Search page first with status/query/config/actions, BM25 page with source/preset/algorithm/chunking/weighting/maintenance
- added Reset pulse and Status/Indexsize/Lastsearchtime readouts to Search page top
- moved agent behavior pars to Tools page
- updated compose.json pairs_with (rank_fusion to search_merge)
- updated category to Retrievers
- split query oneOf into separate query/queries tool fields
- removed required constraint on query parameter
- added mutual exclusion validation for query vs queries
- Initial search_text structure