Source Github Operator

The Source Github LOP (formerly GitHubParser) ingests and parses content from public or private GitHub repositories and transforms it into a structured DAT table suitable for use with the Rag Index LOP. It can parse documentation files (such as .md and .rst), issues, pull requests, code files in the languages you select, and wikis. This lets you build a comprehensive knowledge base from a GitHub repository for RAG applications.

Note: Requires the requests Python library. Authentication is recommended to avoid strict GitHub API rate limits.

Parameters are organized into pages.

Repository URL (Repourl) op('source_github').par.Repourl Str
Default:
None
Branch/Tag (Branch) op('source_github').par.Branch Str
Default:
main
Parse Repository (Parse) op('source_github').par.Parse Pulse
Default:
None
Stop Processing (Stop) op('source_github').par.Stop Pulse
Default:
None
Current Status (Status) op('source_github').par.Status Str
Default:
Ready
Progress (Progress) op('source_github').par.Progress Float
Default:
0
API Rate Limit (Ratelimit) op('source_github').par.Ratelimit Str
Default:
"" (Empty String)
Clear All (Clear) op('source_github').par.Clear Pulse
Default:
None
Caution: Viewing large index tables can be slow Header
Display (Display) op('source_github').par.Display Menu
Default:
index
Options:
index, content
Select Doc (Selectdoc) op('source_github').par.Selectdoc Int
Default:
0
Display File (Displayfile) op('source_github').par.Displayfile Str
Default:
"" (Empty String)
Include Documentation (Includedocs) op('source_github').par.Includedocs Toggle
Default:
On
Doc File Patterns (Docpatterns) op('source_github').par.Docpatterns Str
Default:
*.md *.rst *.txt docs/* wiki/*
Include Wiki (Includewiki) op('source_github').par.Includewiki Toggle
Default:
On
Include Issues/PRs (Includeissues) op('source_github').par.Includeissues Toggle
Default:
On
Issue State (Issuestate) op('source_github').par.Issuestate Menu
Default:
all
Options:
all, open, closed
Max Issues/PRs (Issuelimit) op('source_github').par.Issuelimit Int
Default:
10
Range:
1 to 1000
Slider Range:
10 to 100
Include Comments (Includecomments) op('source_github').par.Includecomments Toggle
Default:
On
Include Code Files (Includecode) op('source_github').par.Includecode Toggle
Default:
On
Code Languages (Codelanguages) op('source_github').par.Codelanguages Str
Default:
python javascript typescript
Include Code Context (Includecontext) op('source_github').par.Includecontext Toggle
Default:
On
Max File Size (KB) (Maxfilesize) op('source_github').par.Maxfilesize Int
Default:
500
Range:
1 to 10000
Slider Range:
100 to 1000
Ignore Patterns (Ignorepaths) op('source_github').par.Ignorepaths Str
Default:
node_modules/* .git/* tests/*
Max Directory Depth (Maxdepth) op('source_github').par.Maxdepth Int
Default:
10
Range:
1 to 50
Slider Range:
5 to 20
Use Authentication (Useauth) op('source_github').par.Useauth Toggle
Default:
Off
GitHub Token (Token) op('source_github').par.Token Str
Default:
None
ChatTD (Chattd) op('source_github').par.Chattd OP
Default:
/dot_lops/ChatTD
Show Built In Pars (Showbuiltin) op('source_github').par.Showbuiltin Toggle
Default:
Off
Bypass (Bypass) op('source_github').par.Bypass Toggle
Default:
Off
Available Callbacks:
  • onParseStart
  • onParseComplete
  • onFileProcessed
  • onIssueProcessed
  • onRateLimitUpdate
  • onError
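
The callback names above are listed without signatures. The following is a minimal sketch of what a callbacks script could look like, assuming each callback receives a single `info` dictionary. That signature and the `'path'` and `'error'` keys are assumptions for illustration; check the callbacks DAT the operator generates for the real ones.

```python
# Hypothetical callback handlers for the Source Github LOP.
# Assumption: each callback receives one `info` dict. The real signature
# and the keys inside `info` are not documented on this page.

processed_files = []
errors = []


def onParseStart(info):
    # Reset the bookkeeping lists at the start of every run.
    processed_files.clear()
    errors.clear()


def onFileProcessed(info):
    # 'path' is a hypothetical key; fall back to the whole payload if absent.
    processed_files.append(info.get('path', info))


def onError(info):
    # 'error' is a hypothetical key.
    errors.append(info.get('error', info))


def onParseComplete(info):
    print(f"Parsed {len(processed_files)} files with {len(errors)} errors")
```

Collecting results in module-level lists keeps the handlers simple; a larger project would more likely write them to a DAT or a log.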

Basic Public Repository Parsing

1. Set 'Repository URL' (e.g., `github.com/derivative/TouchDesigner-Samples`).
2. Set 'Branch/Tag' (usually `main` or `master`).
3. Configure Rules (e.g., disable 'Include Code Files' if not needed).
4. Pulse 'Parse Repository'.
5. Monitor 'Status', 'Progress', and 'API Rate Limit'.
6. Output appears in the `index_table` DAT.
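
The same steps can be scripted from Python. This is a sketch that uses only the parameter names listed on this page; the helper names `start_parse` and `read_status` are hypothetical, and `lop` stands for the operator itself, for example `op('source_github')` inside TouchDesigner.

```python
def start_parse(lop, repo_url, branch='main'):
    """Set the repository parameters on a Source Github LOP and start parsing.

    `lop` is the operator itself, e.g. op('source_github') in TouchDesigner.
    """
    lop.par.Repourl = repo_url
    lop.par.Branch = branch
    lop.par.Parse.pulse()


def read_status(lop):
    """Return the monitoring parameters as plain Python values."""
    return {
        'status': lop.par.Status.eval(),
        'progress': lop.par.Progress.eval(),
        'rate_limit': lop.par.Ratelimit.eval(),
    }
```

Inside TouchDesigner this might be called as `start_parse(op('source_github'), 'github.com/derivative/TouchDesigner-Samples')`, followed by occasional calls to `read_status` while the parse runs.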

Parsing with Authentication (Private or Higher Rate Limit)

1. Enable 'Use Authentication' on the Auth page.
2. Paste your GitHub Personal Access Token into 'GitHub Token'.
3. Set 'Repository URL' and other parameters as needed.
4. Pulse 'Parse Repository'.
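
To avoid pasting a token into a saved project file, the token can be read from an environment variable at startup. This sketch assumes the variable is called `GITHUB_TOKEN`, which is a common convention rather than something this operator requires; `enable_auth_from_env` is a hypothetical helper.

```python
import os


def enable_auth_from_env(lop, env_var='GITHUB_TOKEN'):
    """Copy a token from the environment into the LOP's Auth parameters.

    Returns True if a token was found and applied, False otherwise.
    `env_var` is an assumed name; use whatever your environment defines.
    """
    token = os.environ.get(env_var, '').strip()
    if not token:
        # Leave the operator untouched so it keeps running unauthenticated.
        return False
    lop.par.Useauth = True
    lop.par.Token = token
    return True
```

Call it with the operator, for example `enable_auth_from_env(op('source_github'))`, before pulsing 'Parse Repository'.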

Customizing Parsing Rules

1. Adjust 'Max Issues/PRs', 'Max File Size (KB)', 'Max Directory Depth'.
2. Use 'Doc File Patterns' and 'Ignore Patterns' to specifically include/exclude paths.
3. Specify desired 'Code Languages' if including code.
4. Pulse 'Parse Repository'.
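
To get a feel for how space-separated patterns such as the defaults for 'Doc File Patterns' and 'Ignore Patterns' might select paths, here is a small sketch built on Python's `fnmatch` module. The operator's actual matching logic is not documented here, so both the glob semantics and the rule that ignore patterns take precedence are assumptions.

```python
from fnmatch import fnmatchcase

# Default values copied from the parameter list on this page.
DOC_PATTERNS = '*.md *.rst *.txt docs/* wiki/*'
IGNORE_PATTERNS = 'node_modules/* .git/* tests/*'


def matches_any(path, patterns):
    """True if `path` matches at least one space-separated glob pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns.split())


def classify(path, doc_patterns=DOC_PATTERNS, ignore_patterns=IGNORE_PATTERNS):
    """Return 'ignored', 'doc', or 'other' for a repository-relative path.

    Assumption: ignore patterns are checked first and win over doc patterns.
    """
    if matches_any(path, ignore_patterns):
        return 'ignored'
    if matches_any(path, doc_patterns):
        return 'doc'
    return 'other'
```

Under these assumptions, `classify('node_modules/pkg/README.md')` is ignored even though it ends in `.md`, which is a useful case to test against a real repository before a long parse.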
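  • Parsing relies on the GitHub REST API. Unauthenticated requests are limited to 60 per hour per IP address, while requests authenticated with a Personal Access Token (PAT) are allowed 5,000 per hour, so use a PAT for anything beyond a small repository.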
  • The process runs asynchronously via ChatTD.
  • Large repositories can take considerable time to parse.
  • Ensure your PAT has the necessary scopes to access the repository content (files, issues, wiki).
  • The output index_table is formatted for direct use with the Rag Index LOP.
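
For reference, GitHub reports rate-limit state in the `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` response headers. The sketch below shows how those headers could be read outside the operator; it is not the operator's internal code, and `build_headers` and `parse_rate_limit` are hypothetical helper names.

```python
from datetime import datetime, timezone


def build_headers(token=None):
    """Build request headers for the GitHub REST API."""
    headers = {'Accept': 'application/vnd.github+json'}
    if token:
        # A PAT is sent as a bearer token.
        headers['Authorization'] = f'Bearer {token}'
    return headers


def parse_rate_limit(response_headers):
    """Extract rate-limit details from GitHub's response headers."""
    limit = int(response_headers['X-RateLimit-Limit'])
    remaining = int(response_headers['X-RateLimit-Remaining'])
    # The reset value is a Unix timestamp in UTC seconds.
    reset_at = datetime.fromtimestamp(
        int(response_headers['X-RateLimit-Reset']), tz=timezone.utc
    )
    return {'limit': limit, 'remaining': remaining, 'reset_at': reset_at}
```

With the `requests` library, the headers of a response from `requests.get('https://api.github.com/rate_limit', headers=build_headers(token))` can be passed straight to `parse_rate_limit` to confirm that a token is being picked up.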