RAG Index Operator
v1.1.1
What's new
RAG Index v1.1.1 [ July 31, 2025 ]
- Added Ollama embeddings integration
- Enhanced embedding type support for HuggingFace and OpenAI
- Resolved import errors for embedding model initialization
The RAG Index operator is designed to create and manage vector store indices from various document sources. These indices are crucial for Retrieval Augmented Generation (RAG) workflows, allowing LLMs to retrieve relevant information from a knowledge base before generating responses. It supports different input modes (document tables, folders) and embedding models, and provides options for chunking, saving, and loading indices.
Requirements
- Python Packages:
  - llama-index
  - tiktoken
  - torch (with CUDA support for local embeddings on GPU)
- ChatTD Operator: Required for asynchronous task execution and embedding model management.
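Before creating an index, it can help to confirm that the packages listed above are importable from the Python environment TouchDesigner uses. The following is a minimal sketch using only the standard library. The import names (`llama_index`, `tiktoken`, `torch`) are assumed from the package names, and the helpers `check_requirements` and `cuda_available` are hypothetical, not part of the operator.

```python
import importlib.util

# Import names assumed to correspond to the packages listed above.
REQUIRED_MODULES = ("llama_index", "tiktoken", "torch")


def check_requirements(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    found = {}
    for name in modules:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found[name] = False
    return found


def cuda_available():
    """Report whether torch is installed and sees a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    for module, ok in check_requirements().items():
        print(f"{module}: {'ok' if ok else 'missing'}")
    print("CUDA available:", cuda_available())
```

If a package is reported missing, install it into the environment that ChatTD manages rather than a system-wide Python.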
Input/Output
Inputs
- Input 1 (Document Table, optional): A Table DAT containing document data with columns like doc_id, filename, content, metadata. Used when Input Mode is Doc Table or Auto Detect.
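The document table layout described above can be sketched as plain Python rows before they are written into a Table DAT. This is only an illustration: the `doc_id` scheme (a short SHA-256 of the content), the JSON-encoded metadata, and the helper names are assumptions, since the page does not specify them. `clear()` and `appendRow()` are the standard Table DAT methods.

```python
import hashlib
import json

# Column names taken from the Inputs description above.
COLUMNS = ["doc_id", "filename", "content", "metadata"]


def build_doc_rows(documents):
    """Turn (filename, content, metadata) tuples into table rows.

    The doc_id scheme (short SHA-256 of the content) and JSON-encoded
    metadata are assumptions for illustration.
    """
    rows = [list(COLUMNS)]
    for filename, content, metadata in documents:
        doc_id = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
        rows.append([doc_id, filename, content, json.dumps(metadata)])
    return rows


def fill_table(table_dat, rows):
    """Write rows into a Table DAT using its clear() and appendRow() methods."""
    table_dat.clear()
    for row in rows:
        table_dat.appendRow(row)


rows = build_doc_rows([
    ("intro.md", "RAG combines retrieval with generation.", {"source": "notes"}),
    ("setup.txt", "Install the required Python packages.", {"source": "manual"}),
])
# Inside TouchDesigner, with a hypothetical Table DAT named 'docs_table':
# fill_table(op('docs_table'), rows)
```

The resulting Table DAT would then be wired into Input 1 with Input Mode set to Doc Table or Auto Detect.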
Outputs
- Documents Table (Table DAT): Stores information about the processed documents, including their content, metadata, and hash.
- Index Info Table (Table DAT): Provides details about created and loaded indices, such as name, number of nodes, creation timestamp, and status.
Parameters
Page: Index
Create Index (Createindex)
op('rag_index').par.Createindex
Pulse - Default: None
Index Name (Indexname)
op('rag_index').par.Indexname
Str - Default: None
Document Folder (Documentfolder)
op('rag_index').par.Documentfolder
Folder - Default: None
File Pattern (Filepattern)
op('rag_index').par.Filepattern
Str - Default: None
Current Status (Status)
op('rag_index').par.Status
Str - Default: None
Active Index (Activeindex)
op('rag_index').par.Activeindex
Toggle - Default: false
Chunk Size (Chunksize)
op('rag_index').par.Chunksize
Int - Default: 1024
Chunk Overlap (Chunkoverlap)
op('rag_index').par.Chunkoverlap
Int - Default: 20
Sync to File (Savetofile)
op('rag_index').par.Savetofile
Toggle - Default: false
Index Folder (Indexfolder)
op('rag_index').par.Indexfolder
Folder - Default: None
Save Index (Saveindex)
op('rag_index').par.Saveindex
Pulse - Default: None
Load Index (Loadindex)
op('rag_index').par.Loadindex
Pulse - Default: None
Load on Start (Loadonstart)
op('rag_index').par.Loadonstart
Toggle - Default: false
Clear All (Clearall)
op('rag_index').par.Clearall
Pulse - Default: None
Header
Progress (Progress)
op('rag_index').par.Progress
Float - Default: 0.0
Stop Index Creation (Stopindex)
op('rag_index').par.Stopindex
Pulse - Default: None
Page: About
ChatTD (Chattd)
op('rag_index').par.Chattd
OP - Default: None
Show Built In Pars (Showbuiltin)
op('rag_index').par.Showbuiltin
Toggle - Default: false
Bypass (Bypass)
op('rag_index').par.Bypass
Toggle - Default: false
Version (Version)
op('rag_index').par.Version
Str - Default: None
Last Updated (Lastupdated)
op('rag_index').par.Lastupdated
Str - Default: None
Creator (Creator)
op('rag_index').par.Creator
Str - Default: None
Website (Website)
op('rag_index').par.Website
Str - Default: None
Usage Examples
Creating an Index from a Folder of Documents
- Set Document Folder to the path containing your text or markdown files.
- Set File Pattern to *.txt *.md (or other relevant patterns).
- Choose your Embedding Model (e.g., Local Embed).
- Adjust Chunk Size and Chunk Overlap as needed.
- Pulse Create Index.
- Monitor the Status and Progress parameters.
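The same steps can be scripted from a Text DAT or the textport. The sketch below uses only the parameter names listed on the Index page. The helper functions are hypothetical, and the Embedding Model is left to the UI because its scripting name is not listed on this page.

```python
def create_index_from_folder(rag, folder, pattern="*.txt *.md",
                             chunk_size=1024, chunk_overlap=20,
                             name=None):
    """Set the folder-mode parameters on a RAG Index operator and pulse Create Index.

    `rag` is the operator, e.g. op('rag_index') inside TouchDesigner.
    """
    rag.par.Documentfolder = folder
    rag.par.Filepattern = pattern
    rag.par.Chunksize = chunk_size
    rag.par.Chunkoverlap = chunk_overlap
    if name is not None:
        rag.par.Indexname = name
    # The Embedding Model is chosen in the UI; its scripting name is not
    # listed on this page, so it is not set here.
    rag.par.Createindex.pulse()


def read_progress(rag):
    """Return (status, progress) from the operator's read-out parameters."""
    return str(rag.par.Status.eval()), float(rag.par.Progress.eval())
```

Inside TouchDesigner this might be called as `create_index_from_folder(op('rag_index'), 'C:/docs', name='kb')`, where the folder path and index name are placeholders. Because index creation runs asynchronously through ChatTD, `read_progress` would be polled afterwards rather than read once immediately.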
Saving and Loading an Index
- After creating an index, ensure Sync to File is enabled or pulse Save Index.
- Set Index Folder to your desired save location.
- To load an existing index, set Index Folder and pulse Load Index (or enable Load on Start).
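Saving and loading can be scripted in the same way. This is a minimal sketch using the parameter names from the Index page. The helper names are hypothetical, and the index folder is set before pulsing so the operator knows where to write or read.

```python
def save_index(rag, index_folder):
    """Point the operator at a folder and pulse Save Index."""
    rag.par.Indexfolder = index_folder
    rag.par.Saveindex.pulse()


def load_index(rag, index_folder, load_on_start=False):
    """Point the operator at a folder, pulse Load Index, and optionally
    enable Load on Start for later sessions."""
    rag.par.Indexfolder = index_folder
    rag.par.Loadonstart = load_on_start
    rag.par.Loadindex.pulse()
```

For example, `save_index(op('rag_index'), 'C:/indices/kb')` followed in a later session by `load_index(op('rag_index'), 'C:/indices/kb')`, where the path is a placeholder.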
Technical Notes
- This operator leverages the llama-index library for RAG functionalities.
- Index creation can be resource-intensive, especially for large document sets or complex embedding models.
- The ChatTD operator is essential for managing Python dependencies and providing access to embedding models.
- Chunking parameters (Chunk Size, Chunk Overlap) significantly impact the quality of retrieval. Experiment with these values based on your document content.
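To build intuition for how Chunk Size and Chunk Overlap interact, the sketch below splits a token list into overlapping windows. It is an illustration only: the operator delegates splitting to llama-index, which may count model tokens and respect sentence boundaries, whereas this hypothetical `chunk_tokens` helper treats each list item as one token.

```python
def chunk_tokens(tokens, chunk_size=1024, chunk_overlap=20):
    """Split a token list into windows of chunk_size that share
    chunk_overlap tokens with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


words = [f"w{i}" for i in range(10)]
chunks = chunk_tokens(words, chunk_size=4, chunk_overlap=1)
# chunks -> [['w0', 'w1', 'w2', 'w3'],
#            ['w3', 'w4', 'w5', 'w6'],
#            ['w6', 'w7', 'w8', 'w9']]
```

Smaller chunks give more precise matches but less context per retrieved node, and a larger overlap reduces the chance that a relevant passage is cut at a chunk boundary, at the cost of more nodes to embed.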
Related Operators
- RAG Retriever: Used to query the created index.
- ChatTD: Provides core services and embedding models.
- Source Ops: Can be used to generate document tables for input.