
Source Github Operator

The Source Github LOP (formerly GitHubParser) is designed to ingest and parse content from public or private GitHub repositories, transforming it into a structured DAT table suitable for use with the Rag Index LOP. It supports parsing documentation files (.md, .rst), issues, pull requests, code files (specific languages), and wikis. This allows you to create a comprehensive knowledge base from a GitHub repository for RAG applications.

Note: Requires the requests Python library. Authentication is recommended to avoid strict GitHub API rate limits.

Parameters are organized into pages.

Repository URL (Repourl) op('source_github').par.Repourl Str
Default:
None
Branch/Tag (Branch) op('source_github').par.Branch Str
Default:
None
Parse Repository (Parse) op('source_github').par.Parse Pulse
Default:
None
Stop Processing (Stop) op('source_github').par.Stop Pulse
Default:
None
Current Status (Status) op('source_github').par.Status Str
Default:
None
Progress (Progress) op('source_github').par.Progress Float
Default:
None
API Rate Limit (Ratelimit) op('source_github').par.Ratelimit Str
Default:
None
Clear All (Clear) op('source_github').par.Clear Pulse
Default:
None
Caution: Exposing the viewer of a large index table can be heavy on performance. (Header)
Display (Display) op('source_github').par.Display Menu
Default:
index
Options:
index, content
Select Doc (Selectdoc) op('source_github').par.Selectdoc Int
Default:
1
Range:
1 to N/A
Slider Range:
1 to N/A
Display File (Displayfile) op('source_github').par.Displayfile Str
Default:
None
Include Documentation (Includedocs) op('source_github').par.Includedocs Toggle
Default:
Off
Doc File Patterns (Docpatterns) op('source_github').par.Docpatterns Str
Default:
None
Include Wiki (Includewiki) op('source_github').par.Includewiki Toggle
Default:
Off
Include Issues/PRs (Includeissues) op('source_github').par.Includeissues Toggle
Default:
Off
Issue State (Issuestate) op('source_github').par.Issuestate Menu
Default:
all
Options:
all, open, closed
Max Issues/PRs (Issuelimit) op('source_github').par.Issuelimit Int
Default:
0
Range:
0 to 1000
Include Comments (Includecomments) op('source_github').par.Includecomments Toggle
Default:
Off
Include Code Files (Includecode) op('source_github').par.Includecode Toggle
Default:
Off
Code Languages (Codelanguages) op('source_github').par.Codelanguages Str
Default:
None
Include Code Context (Includecontext) op('source_github').par.Includecontext Toggle
Default:
Off
Max File Size (KB) (Maxfilesize) op('source_github').par.Maxfilesize Int
Default:
0
Range:
0 to 10000
Ignore Patterns (Ignorepaths) op('source_github').par.Ignorepaths Str
Default:
None
Max Directory Depth (Maxdepth) op('source_github').par.Maxdepth Int
Default:
0
Range:
0 to 50
Use Authentication (Useauth) op('source_github').par.Useauth Toggle
Default:
Off
GitHub Token (Token) op('source_github').par.Token Str
Default:
None
ChatTD (Chattd) op('source_github').par.Chattd OP
Default:
None
Show Built In Pars (Showbuiltin) op('source_github').par.Showbuiltin Toggle
Default:
Off
Bypass (Bypass) op('source_github').par.Bypass Toggle
Default:
Off
Version (Version) op('source_github').par.Version Str
Default:
None
Lastupdated (Lastupdated) op('source_github').par.Lastupdated Str
Default:
None
Creator (Creator) op('source_github').par.Creator Str
Default:
None
Website (Website) op('source_github').par.Website Str
Default:
None
Available Callbacks:
  • onParseStart
  • onParseComplete
  • onFileProcessed
  • onIssueProcessed
  • onRateLimitUpdate
  • onError

Parsing a Public Repository

  1. Set ‘Repository URL’ (e.g., github.com/derivative/TouchDesigner-Samples).
  2. Set ‘Branch/Tag’ (usually main or master).
  3. Configure Rules (e.g., disable ‘Include Code Files’ if not needed).
  4. Pulse ‘Parse Repository’.
  5. Monitor ‘Status’, ‘Progress’, and ‘API Rate Limit’.
  6. Output appears in the index_table DAT.
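
The steps above can also be driven from a script inside TouchDesigner. The following is a minimal sketch, assuming the operator is named source_github as in the parameter expressions listed earlier. The helpers start_parse and read_status are hypothetical names used for illustration; they only touch parameters documented on this page, and the assumption that toggles accept Python booleans is not confirmed by the source.

```python
def start_parse(lop, repo_url, branch="main", include_code=False):
    """Set documented parameters on a Source Github LOP and start a parse.

    `lop` is the operator itself, e.g. op('source_github') inside
    TouchDesigner. This helper is illustrative, not part of the operator.
    """
    lop.par.Repourl = repo_url          # 'Repository URL'
    lop.par.Branch = branch             # 'Branch/Tag'
    lop.par.Includecode = include_code  # 'Include Code Files' toggle
    lop.par.Parse.pulse()               # same as pressing 'Parse Repository'


def read_status(lop):
    """Return the read-out parameters that step 5 suggests monitoring."""
    return {
        "status": lop.par.Status.eval(),
        "progress": lop.par.Progress.eval(),
        "rate_limit": lop.par.Ratelimit.eval(),
    }
```

Inside TouchDesigner this might be called as `start_parse(op('source_github'), 'github.com/derivative/TouchDesigner-Samples')`, followed by occasional calls to `read_status(op('source_github'))` while the parse runs.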

Parsing with Authentication (Private or Higher Rate Limit)

  1. Enable ‘Use Authentication’ on the Auth page.
  2. Paste your GitHub Personal Access Token into ‘GitHub Token’.
  3. Set ‘Repository URL’ and other parameters as needed.
  4. Pulse ‘Parse Repository’.
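
To show what a token changes at the API level, here is a hedged sketch of the request headers the GitHub REST API accepts for token authentication. It is not taken from the operator's implementation, which this page does not describe; build_github_headers is a hypothetical helper, and GITHUB_TOKEN is an assumed environment variable name that keeps the token out of the project file.

```python
import os


def build_github_headers(token=None):
    """Build request headers for the GitHub REST API.

    With a token, requests count against the authenticated rate limit and
    can reach private repositories the token is allowed to read.
    """
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


# GITHUB_TOKEN is an assumed variable name; the operator does not read it
# on its own. The value would still need to be copied into 'GitHub Token'.
token = os.environ.get("GITHUB_TOKEN")
headers = build_github_headers(token)
```

These headers can be passed to `requests.get("https://api.github.com/rate_limit", headers=headers)` to confirm that a token is valid before pasting it into ‘GitHub Token’.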

Limiting and Filtering Parsed Content

  1. Adjust ‘Max Issues/PRs’, ‘Max File Size (KB)’, ‘Max Directory Depth’.
  2. Use ‘Doc File Patterns’ and ‘Ignore Patterns’ to specifically include/exclude paths.
  3. Specify desired ‘Code Languages’ if including code.
  4. Pulse ‘Parse Repository’.
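
The effect of combining these limits and patterns can be pictured with a small filter function. This is only a sketch under stated assumptions: the page does not say which pattern syntax the operator uses or how it treats a value of 0, so the code assumes glob-style patterns and treats 0 as "no limit". The function should_parse is hypothetical.

```python
from fnmatch import fnmatchcase


def should_parse(path, size_kb, doc_patterns, ignore_patterns,
                 max_size_kb=0, max_depth=0):
    """Decide whether a repository file passes the filters.

    Assumptions (not confirmed by the operator's documentation):
    patterns are glob-style, and 0 means "no limit" for size and depth.
    """
    depth = path.count("/")  # files at the repository root have depth 0
    if max_depth and depth > max_depth:
        return False
    if max_size_kb and size_kb > max_size_kb:
        return False
    # Ignore patterns win over include patterns.
    if any(fnmatchcase(path, pat) for pat in ignore_patterns):
        return False
    return any(fnmatchcase(path, pat) for pat in doc_patterns)


docs = ["*.md", "*.rst"]
ignored = ["node_modules/*"]
keep = should_parse("docs/intro.md", 12, docs, ignored)
skip = should_parse("node_modules/pkg/README.md", 3, docs, ignored)
```

Checking the ignore patterns before the include patterns is a design choice made for this sketch, so that a broad include such as `*.md` cannot pull in files from an excluded directory.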
  • Parsing relies on the GitHub REST API. Unauthenticated requests are limited to about 60 per hour, so use a Personal Access Token (PAT) for anything beyond a very small repository.
  • The process runs asynchronously via ChatTD.
  • Large repositories can take considerable time to parse.
  • Ensure your PAT has the necessary scopes to access the repository content (files, issues, wiki).
  • The output index_table is formatted for direct use with the Rag Index LOP.
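
The rate-limit note above can be checked by hand, because the GitHub REST API reports the remaining quota in its response headers. The sketch below reads those headers; summarize_rate_limit is a hypothetical helper, and how the operator itself fills ‘API Rate Limit’ is not documented here.

```python
from datetime import datetime, timezone


def summarize_rate_limit(headers):
    """Summarize GitHub's rate-limit response headers.

    Returns (remaining, limit, reset_time_utc). `headers` can be the
    case-insensitive mapping from a requests response; a plain dict must
    use the exact key spelling shown below.
    """
    remaining = int(headers["X-RateLimit-Remaining"])
    limit = int(headers["X-RateLimit-Limit"])
    # The reset value is a UTC epoch timestamp in seconds.
    reset = datetime.fromtimestamp(int(headers["X-RateLimit-Reset"]),
                                   tz=timezone.utc)
    return remaining, limit, reset


# Sample values for illustration only, not real API output.
sample = {
    "X-RateLimit-Remaining": "57",
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Reset": "1700000000",
}
remaining, limit, reset = summarize_rate_limit(sample)
```

With a live response, `summarize_rate_limit(response.headers)` gives a quick way to tell whether enough requests remain before starting a long parse.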