RAG Index

v2.1.0 (Updated)

The RAG Index operator builds vector store indices from your documents. Feed it a folder of files or a structured document table, choose an embedding model, and it produces a searchable index that downstream operators like RAG Retriever can query. All embedding and vector storage is handled by the embedding_sidecar service over HTTP — the operator does not run any vector or embedding code locally. Indices can be saved to disk and reloaded across sessions.

Requirements:
  • SideCar: The LOPs SideCar must be running with the embedding_sidecar service. The operator starts the sidecar automatically when needed.
  • Embedding Provider: A running Ollama instance with an embedding model pulled (e.g. ollama pull nomic-embed-text). The OpenAI option is also available in the Embedding Model menu.
  • Input 1 (optional): A Table DAT with columns doc_id, filename, content, metadata. Used when Input Mode is set to Doc Table or Auto Detect.

Outputs: none wired. The operator maintains internal tables (documents, index info, stats) and holds the index on the embedding sidecar for connected RAG Retriever operators to query.

To build an index from a folder:
  1. On the Index page, set Input Mode to “Folder”.
  2. Set Document Folder to the path containing your files.
  3. Set File Pattern to match your documents (e.g. *.txt *.md *.py). Separate multiple patterns with spaces.
  4. Choose an Embedding Model — “Local (Ollama)” for local processing or “OpenAI” for cloud embeddings.
  5. If using Ollama, pick a model from the Ollama Model menu (nomic-embed-text, mxbai-embed-large, or all-minilm).
  6. Optionally adjust Chunk Size and Chunk Overlap to control how documents are split.
  7. Give the index a name in Index Name or let the operator generate one automatically.
  8. Pulse Create Index. The Current Status and Progress fields update in real time as documents are sent to the embedding sidecar for processing.
To build an index from a wired document table:
  1. Wire a Table DAT into the operator’s first input. The table must have doc_id, filename, content, and metadata columns (metadata as JSON strings).
  2. Set Input Mode to “Doc Table” (or leave on “Auto Detect” — it will detect the wired input automatically).
  3. Choose your embedding model and pulse Create Index.
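The rows for the wired input can be assembled in script. A sketch of building data in the expected doc_id/filename/content/metadata shape, with metadata serialized as a JSON string — `make_doc_rows` is a hypothetical helper, not part of the operator:

```python
import json

def make_doc_rows(docs):
    """Build a header row plus one row per document; metadata is a JSON string."""
    rows = [["doc_id", "filename", "content", "metadata"]]
    for i, doc in enumerate(docs):
        rows.append([
            str(i),
            doc["filename"],
            doc["content"],
            json.dumps(doc.get("metadata", {})),  # stored as a JSON string
        ])
    return rows

rows = make_doc_rows([
    {"filename": "notes.md", "content": "TouchDesigner RAG notes",
     "metadata": {"tag": "demo"}},
])
```

In TouchDesigner you would write these rows into a Table DAT (e.g. via `appendRow`) and wire that DAT into the operator's first input.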
To save and reload an index:
  1. Enable Sync to File to persist index data to disk during creation.
  2. Set Index Folder to your preferred save location. If left blank, the operator saves to project/index/{index_name}/ automatically.
  3. Pulse Save Index at any time to manually save the current index state. A config.json file is saved alongside the vector data with embedding settings and index statistics.
  4. To reload a saved index, set Index Folder to the directory containing your saved index and pulse Load Index. The operator restores embedding model settings from the saved config automatically.
  5. Enable Load on Start to automatically reload the index when the TouchDesigner project opens (requires Sync to File to be enabled).

Pulse Clear All to remove the current index and all internal tables, both locally and on the embedding sidecar. This resets the operator to a clean state for rebuilding.

If index creation is taking too long, pulse Stop Index Creation to cancel. Note that the server-side embedding operation may still complete.

Tips:
  • Start with default chunk settings and adjust based on retrieval quality. Smaller chunks give more precise results but increase index size.
  • Use local embeddings (Ollama) for privacy-sensitive data or offline workflows. OpenAI embeddings tend to produce higher quality results for general text.
  • Name your indices using the Index Name field before creating — this makes saved folders easier to identify and prevents auto-generated names.
  • Save to file for any index you want to persist. In-memory indices on the embedding sidecar are lost when the SideCar stops.
  • Check the stats table after index creation for a detailed breakdown of document counts, chunk counts, token estimates, and file types processed.
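The first tip's trade-off comes from how Chunk Size and Chunk Overlap interact: each chunk shares its tail with the head of the next. An illustrative character-based splitter (the operator's real splitter may be token- or sentence-aware):

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size chunks where consecutive chunks share
    `chunk_overlap` characters. Illustrative only."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

# chunk_text("abcdefghij", 4, 2) -> ["abcd", "cdef", "efgh", "ghij"]
```

Smaller chunks localize matches more precisely, but each document produces more chunks (and so more vectors), which is why the index grows.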
Troubleshooting:
  • “Embedding server not available”: the SideCar must be running with the embedding_sidecar service. It should start automatically; check the SideCar operator if the issue persists.
  • Index creation stalls or errors: Check the Logger for detailed messages. Common causes are an unreachable Ollama server or missing API keys for OpenAI embeddings.
  • “No documents to process”: Verify your Document Folder path and File Pattern, or check that your wired input table has data rows beyond the header.
  • Embedding model errors: Ensure Ollama is running (ollama serve) and the selected model is pulled (ollama pull nomic-embed-text).
  • Load fails with “Index folder not found”: Confirm the Index Folder path points to a directory that was previously saved by this operator, containing a config.json and the vector store data.
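For the Ollama-related failures above, a quick reachability probe can confirm whether the server is answering before digging into the operator. 127.0.0.1:11434 is Ollama's default address; note the operator itself reaches Ollama through the embedding sidecar, so this check is only a diagnostic:

```python
import urllib.request
import urllib.error

def ollama_reachable(base_url="http://127.0.0.1:11434", timeout=2.0):
    """Return True if an HTTP server answers at base_url within the timeout."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200  # Ollama's root endpoint replies 200
    except (urllib.error.URLError, OSError):
        return False
```

If this returns False, start the server with `ollama serve` and confirm the model is pulled (`ollama pull nomic-embed-text`) before retrying index creation.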
Parameters:
  • Create Index (Createindex): op('rag_index').par.Createindex, Pulse. Default: False
  • Index Name (Indexname): op('rag_index').par.Indexname, Str. Default: "" (empty string)
  • Input Mode (Inputmode): op('rag_index').par.Inputmode, Menu. Default: auto. Options: auto, doctable, folder
  • Document Folder (Documentfolder): op('rag_index').par.Documentfolder, Folder. Default: "" (empty string)
  • File Pattern (Filepattern): op('rag_index').par.Filepattern, Str. Default: "" (empty string)
  • Current Status (Status): op('rag_index').par.Status, Str. Default: "" (empty string)
  • Active Index (Activeindex): op('rag_index').par.Activeindex, Toggle. Default: False
  • Embedding Model (Embedmodel): op('rag_index').par.Embedmodel, Menu. Default: local. Options: local, openai
  • Ollama Model (Ollamamodel): op('rag_index').par.Ollamamodel, StrMenu. Default: "" (empty string). Menu options: nomic-embed-text, mxbai-embed-large, all-minilm
  • Chunk Size (Chunksize): op('rag_index').par.Chunksize, Int. Default: 0. Range: 0 to 1. Slider range: 0 to 1
  • Chunk Overlap (Chunkoverlap): op('rag_index').par.Chunkoverlap, Int. Default: 0. Range: 0 to 1. Slider range: 0 to 1
  • Sync to File (Savetofile): op('rag_index').par.Savetofile, Toggle. Default: False
  • Index Folder (Indexfolder): op('rag_index').par.Indexfolder, Folder. Default: "" (empty string)
  • Save Index (Saveindex): op('rag_index').par.Saveindex, Pulse. Default: False
  • Load Index (Loadindex): op('rag_index').par.Loadindex, Pulse. Default: False
  • Load on Start (Loadonstart): op('rag_index').par.Loadonstart, Toggle. Default: False
  • Clear All (Clearall): op('rag_index').par.Clearall, Pulse. Default: False
  • Progress (Progress): op('rag_index').par.Progress, Float. Default: 0.0. Range: 0 to 1. Slider range: 0 to 1
  • Stop Index Creation (Stopindex): op('rag_index').par.Stopindex, Pulse. Default: False
Version History:
v2.1.0 (2026-03-16)
  • Added RAG index creation with embedding_sidecar integration
  • Implemented document processing from tables and folders
  • Added index persistence and configuration saving
v2.0.0 (2026-03-02)
  • Refactor to HTTP sidecar client, remove llama-index dependency
  • All vector operations via embedding_server over HTTP
  • Add collection name sanitization
  • Add sidecar field to manifest
v1.1.2 (2026-03-01)
  • Replace torch import check with importlib.metadata for TD 32050+ compatibility
  • Initial commit
v1.1.1 (2025-08-03)
  • Fixed missing Ollama embeddings integration by adding llama-index-embeddings-ollama to installation packages
  • Added llama-index-embeddings-huggingface and llama-index-embeddings-openai to ensure all embedding types work properly
  • Resolved "No module named 'llama_index.embeddings.ollama'" error during embedding model initialization
v1.1.0 (2025-07-30)
  • Added IndexActive tdu.Dependency for reactive state tracking
  • Switched to Ollama for local embeddings to fix numpy conflicts
  • Added Ollamamodel parameter with nomic-embed-text, mxbai-embed-large, all-minilm options
  • Enhanced config saving/loading to include Ollama model selection
  • Improved error handling and logging
v1.0.0 (2024-11-06)
  • Initial release