RAG Index
The RAG Index operator builds vector store indices from your documents. Feed it a folder of files or a structured document table, choose an embedding model, and it produces a searchable index that downstream operators like RAG Retriever can query. All embedding and vector storage is handled by the embedding_sidecar service over HTTP — the operator does not run any vector or embedding code locally. Indices can be saved to disk and reloaded across sessions.
Requirements
- SideCar: The LOPs SideCar must be running with the embedding_sidecar service. The operator starts the sidecar automatically when needed.
- Embedding Provider: A running Ollama instance with an embedding model pulled (e.g. ollama pull nomic-embed-text). The OpenAI option is also available in the Embedding Model menu.
Input/Output
Inputs
- Input 1 (optional): A Table DAT with columns doc_id, filename, content, metadata. Used when Input Mode is set to Doc Table or Auto Detect.
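As a sketch of what Input 1 expects, the rows below build the four-column layout described above, with the metadata column serialized as a JSON string. The document IDs, filenames, and metadata keys here are illustrative, not prescribed by the operator.

```python
import json

# Header row matching the columns the operator expects on Input 1.
header = ["doc_id", "filename", "content", "metadata"]

# Each data row carries the document text plus metadata as a JSON string.
docs = [
    ("doc-001", "intro.md", "RAG combines retrieval with generation.",
     {"source": "handbook", "tags": ["rag", "intro"]}),
    ("doc-002", "setup.txt", "Pull an embedding model before indexing.",
     {"source": "notes"}),
]

rows = [header] + [
    [doc_id, filename, content, json.dumps(meta)]
    for doc_id, filename, content, meta in docs
]

for row in rows:
    print(row)
```

In TouchDesigner, rows shaped like this would typically be written into a Table DAT that is then wired into the operator's first input.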
Outputs
No wired outputs. The operator maintains internal tables (documents, index info, stats) and holds the index on the embedding sidecar for use by connected RAG Retriever operators.
Usage Examples
Indexing a Folder of Documents
- On the Index page, set Input Mode to “Folder”.
- Set Document Folder to the path containing your files.
- Set File Pattern to match your documents (e.g. *.txt *.md *.py). Separate multiple patterns with spaces.
- Choose an Embedding Model — “Local (Ollama)” for local processing or “OpenAI” for cloud embeddings.
- If using Ollama, pick a model from the Ollama Model menu (nomic-embed-text, mxbai-embed-large, or all-minilm).
- Optionally adjust Chunk Size and Chunk Overlap to control how documents are split.
- Give the index a name in Index Name or let the operator generate one automatically.
- Pulse Create Index. The Current Status and Progress fields update in real time as documents are sent to the embedding sidecar for processing.
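The space-separated File Pattern field behaves like a set of glob patterns. A minimal sketch of that matching logic, using Python's standard fnmatch module (the operator's exact matching rules are not documented here, so treat this as an approximation):

```python
from fnmatch import fnmatch

def matches_pattern(filename, pattern_field):
    """Return True if the filename matches any space-separated glob pattern."""
    return any(fnmatch(filename, p) for p in pattern_field.split())

pattern_field = "*.txt *.md *.py"
files = ["notes.txt", "readme.md", "index.py", "image.png"]
selected = [f for f in files if matches_pattern(f, pattern_field)]
print(selected)  # image.png is filtered out
```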
Indexing from a Document Table
- Wire a Table DAT into the operator’s first input. The table must have doc_id, filename, content, and metadata columns (metadata as JSON strings).
- Set Input Mode to “Doc Table” (or leave on “Auto Detect” — it will detect the wired input automatically).
- Choose your embedding model and pulse Create Index.
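Before pulsing Create Index, it can help to sanity-check the wired table against the required schema. The helper below is a hypothetical pre-flight check, not part of the operator; it verifies the header columns and that each metadata cell parses as JSON:

```python
import json

REQUIRED = ("doc_id", "filename", "content", "metadata")

def validate_table(rows):
    """Check header columns and metadata JSON; return a list of problems."""
    if not rows:
        return ["table is empty"]
    header = rows[0]
    missing = [c for c in REQUIRED if c not in header]
    if missing:
        return [f"missing columns: {missing}"]
    meta_col = header.index("metadata")
    problems = []
    for i, row in enumerate(rows[1:], start=1):
        try:
            json.loads(row[meta_col])
        except (ValueError, IndexError):
            problems.append(f"row {i}: metadata is not valid JSON")
    return problems

table = [
    ["doc_id", "filename", "content", "metadata"],
    ["doc-001", "a.md", "hello", '{"source": "test"}'],
    ["doc-002", "b.md", "world", "not json"],
]
print(validate_table(table))  # flags row 2
```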
Saving and Reloading Indices
- Enable Sync to File to persist index data to disk during creation.
- Set Index Folder to your preferred save location. If left blank, the operator saves to project/index/{index_name}/ automatically.
- Pulse Save Index at any time to manually save the current index state. A config.json file is saved alongside the vector data with embedding settings and index statistics.
- To reload a saved index, set Index Folder to the directory containing your saved index and pulse Load Index. The operator restores embedding model settings from the saved config automatically.
- Enable Load on Start to automatically reload the index when the TouchDesigner project opens (requires Sync to File to be enabled).
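The save/load cycle amounts to writing a config.json next to the vector data and reading it back on load. The sketch below shows that round trip; the field names in this config dict are assumptions for illustration, as the operator's actual config.json keys are not documented here:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical config fields; the operator's real config.json keys may differ.
config = {
    "index_name": "my_docs",
    "embedding_model": "Local (Ollama)",
    "ollama_model": "nomic-embed-text",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "stats": {"documents": 12, "chunks": 87},
}

# Mirrors the default project/index/{index_name}/ layout in a temp directory.
index_dir = Path(tempfile.mkdtemp()) / "index" / config["index_name"]
index_dir.mkdir(parents=True, exist_ok=True)
(index_dir / "config.json").write_text(json.dumps(config, indent=2))

# On load, the embedding settings are restored alongside the vector data.
restored = json.loads((index_dir / "config.json").read_text())
print(restored["ollama_model"])
```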
Clearing and Rebuilding
Pulse Clear All to remove the current index and all internal tables, both locally and on the embedding sidecar. This resets the operator to a clean state for rebuilding.
If an index creation is taking too long, pulse Stop Index Creation to cancel. Note that the server-side embedding operation may still complete.
Best Practices
- Start with default chunk settings and adjust based on retrieval quality. Smaller chunks give more precise results but increase index size.
- Use local embeddings (Ollama) for privacy-sensitive data or offline workflows. OpenAI embeddings tend to produce higher quality results for general text.
- Name your indices using the Index Name field before creating — this makes saved folders easier to identify and prevents auto-generated names.
- Save to file for any index you want to persist. In-memory indices on the embedding sidecar are lost when the SideCar stops.
- Check the stats table after index creation for a detailed breakdown of document counts, chunk counts, token estimates, and file types processed.
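To see why smaller chunks increase index size, here is a minimal character-based sketch of overlapped chunking. It is a simplification: whether the operator splits on characters, tokens, or sentences is not documented here, but the size/count trade-off it shows holds either way.

```python
def chunk_text(text, chunk_size=100, overlap=20):
    """Split text into fixed-size character chunks, with each chunk
    sharing `overlap` trailing characters with the next one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
    return chunks

text = "x" * 250

# Default-ish settings: 4 chunks.
print([len(c) for c in chunk_text(text, chunk_size=100, overlap=20)])

# Halving the chunk size nearly doubles the chunk count (larger index,
# more precise retrieval units).
print(len(chunk_text(text, chunk_size=50, overlap=10)))
```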
Troubleshooting
- “Embedding server not available”: The SideCar service needs to be running with the embedding_sidecar. It should start automatically, but check the SideCar operator if issues persist.
- Index creation stalls or errors: Check the Logger for detailed messages. Common causes are an unreachable Ollama server or missing API keys for OpenAI embeddings.
- “No documents to process”: Verify your Document Folder path and File Pattern, or check that your wired input table has data rows beyond the header.
- Embedding model errors: Ensure Ollama is running (ollama serve) and the selected model is pulled (ollama pull nomic-embed-text).
- Load fails with “Index folder not found”: Confirm the Index Folder path points to a directory that was previously saved by this operator, containing a config.json and the vector store data.
Parameters
- op('rag_index').par.Createindex (Pulse) - Default: False
- op('rag_index').par.Indexname (Str) - Default: "" (Empty String)
- op('rag_index').par.Documentfolder (Folder) - Default: "" (Empty String)
- op('rag_index').par.Filepattern (Str) - Default: "" (Empty String)
- op('rag_index').par.Status (Str) - Default: "" (Empty String)
- op('rag_index').par.Activeindex (Toggle) - Default: False
- op('rag_index').par.Chunksize (Int) - Default: 0 - Range: 0 to 1 - Slider Range: 0 to 1
- op('rag_index').par.Chunkoverlap (Int) - Default: 0 - Range: 0 to 1 - Slider Range: 0 to 1
- op('rag_index').par.Savetofile (Toggle) - Default: False
- op('rag_index').par.Indexfolder (Folder) - Default: "" (Empty String)
- op('rag_index').par.Saveindex (Pulse) - Default: False
- op('rag_index').par.Loadindex (Pulse) - Default: False
- op('rag_index').par.Loadonstart (Toggle) - Default: False
- op('rag_index').par.Clearall (Pulse) - Default: False
- op('rag_index').par.Progress (Float) - Default: 0.0 - Range: 0 to 1 - Slider Range: 0 to 1
- op('rag_index').par.Stopindex (Pulse) - Default: False
Changelog
v2.1.0 (2026-03-16)
- Added RAG index creation with embedding_sidecar integration
- Implemented document processing from tables and folders
- Added index persistence and configuration saving
v2.0.0 (2026-03-02)
- Refactor to HTTP sidecar client, remove llama-index dependency
- All vector operations via embedding_server over HTTP
- Add collection name sanitization
- Add sidecar field to manifest
v1.1.2 (2026-03-01)
- Replace torch import check with importlib.metadata for TD 32050+ compatibility
- Initial commit
v1.1.1 (2025-08-03)
- Fixed missing Ollama embeddings integration by adding llama-index-embeddings-ollama to installation packages
- Added llama-index-embeddings-huggingface and llama-index-embeddings-openai to ensure all embedding types work properly
- Resolved "No module named 'llama_index.embeddings.ollama'" error during embedding model initialization
v1.1.0 (2025-07-30)
- Added IndexActive tdu.Dependency for reactive state tracking
- Switched to Ollama for local embeddings to fix numpy conflicts
- Added Ollamamodel parameter with nomic-embed-text, mxbai-embed-large, all-minilm options
- Enhanced config saving/loading to include Ollama model selection
- Improved error handling and logging
v1.0.0 (2024-11-06)
- Initial release