GitHub Connector
Sync documentation and code files from your GitHub repositories into Knowledge Raven.
What Gets Synced
Knowledge Raven indexes text-based files from your repositories. This includes:
- Documentation: .md, .mdx, .rst, .txt
- Code: .py, .js, .ts, .go, .java, .rs, .rb, .php, .cs, .swift, and 40+ other languages
- Config: .yaml, .yml, .json, .toml, .ini, .tf, .hcl
- Scripts: .sh, .bash, .zsh, .ps1
- Special files: Dockerfile, Makefile, README, LICENSE, CHANGELOG, .gitignore, etc.
Binary files, images, and files larger than 50 MB are skipped automatically.
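The eligibility rules above can be sketched as a small filter. This is only an illustration, not Knowledge Raven's actual implementation: the function name, the constant names, and the reading of "50 MB" as 50 MiB are assumptions, and the extension set is a short subset of the list above.

```python
from pathlib import PurePosixPath

# Illustrative subset of the extensions listed above (the full list is longer).
ALLOWED_EXTENSIONS = {
    ".md", ".mdx", ".rst", ".txt",          # documentation
    ".py", ".js", ".ts", ".go", ".java",    # code
    ".yaml", ".yml", ".json", ".toml",      # config
    ".sh", ".bash", ".zsh", ".ps1",         # scripts
}

# Files that are matched by name rather than by extension.
SPECIAL_FILENAMES = {
    "Dockerfile", "Makefile", "README", "LICENSE", "CHANGELOG", ".gitignore",
}

# Assumption: "50 MB" is interpreted as 50 MiB here.
MAX_FILE_BYTES = 50 * 1024 * 1024


def is_eligible(path: str, size_bytes: int) -> bool:
    """Return True if a repository file would be indexed under these rules."""
    if size_bytes > MAX_FILE_BYTES:
        return False
    name = PurePosixPath(path).name
    if name in SPECIAL_FILENAMES:
        return True
    # Compare extensions case-insensitively, so "GUIDE.MD" is treated like "guide.md".
    return PurePosixPath(name).suffix.lower() in ALLOWED_EXTENSIONS
```

For example, `is_eligible("docs/guide.md", 1024)` is true, while `is_eligible("assets/logo.png", 1024)` is false because `.png` is not in the allowed set.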
How to Connect
Step 1 — Open Connectors
In your Knowledge Raven Dashboard, navigate to Connectors and click Add Connector, then select GitHub.
Step 2 — Authorize with GitHub
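You will be redirected to GitHub’s OAuth authorization page. Knowledge Raven uses a standard OAuth App (not a GitHub App) and requests the repo scope so that it can read your repositories, including private ones. GitHub does not offer OAuth Apps a read-only variant of this scope.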
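For orientation, the redirect in this step targets GitHub's standard OAuth authorization endpoint. The sketch below shows roughly how such a URL is assembled. The client ID and callback URL are placeholders, and the helper name is hypothetical; Knowledge Raven builds this URL for you.

```python
import secrets
from urllib.parse import urlencode

# GitHub's documented OAuth authorization endpoint.
GITHUB_AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def build_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the URL a user is sent to in order to grant the repo scope."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "repo",
        "state": state,  # random value echoed back by GitHub to guard against CSRF
    }
    return f"{GITHUB_AUTHORIZE_URL}?{urlencode(params)}"


# Placeholder values for illustration only.
state = secrets.token_urlsafe(16)
url = build_authorize_url("example-client-id", "https://example.com/callback", state)
```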
Step 3 — Select Repositories
After authorization, Knowledge Raven lists your accessible repositories. Select the repositories you want to sync. For large repositories, you can expand the file tree to select specific folders.
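Folder selection amounts to a path-prefix check on each file. A minimal sketch follows, assuming that an empty selection means the whole repository is synced; the function name is hypothetical.

```python
def in_selected_folders(path: str, selected_folders: list[str]) -> bool:
    """Return True if a repository path lies inside one of the selected folders."""
    if not selected_folders:
        # Assumption: no folder selection means the entire repository is synced.
        return True
    for folder in selected_folders:
        prefix = folder.strip("/")
        # Require a "/" boundary so that "docs" does not also match "docs-old/".
        if prefix == "" or path == prefix or path.startswith(prefix + "/"):
            return True
    return False
```

Matching on `prefix + "/"` rather than on the bare prefix is the important detail: a plain `startswith("docs")` would also pull in sibling folders such as `docs-old`.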
Step 4 — Select a Knowledge Base
Choose which Knowledge Raven knowledge base the synced content should appear in.
Step 5 — Initial Sync
Knowledge Raven fetches the repository tree, checks file extensions against the supported list, downloads eligible files, and indexes them.
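The first two parts of this step can be approximated with GitHub's Git Trees REST endpoint. The sketch below is an assumption about how such a sync could work, not a description of Knowledge Raven's internals. The function names are hypothetical, and `is_eligible` stands for whatever extension and size check is applied.

```python
import json
import urllib.request
from typing import Callable


def fetch_tree(owner: str, repo: str, ref: str, token: str) -> dict:
    """Fetch the full file tree of a repository at a branch, tag, or commit SHA."""
    url = f"https://api.github.com/repos/{owner}/{repo}/git/trees/{ref}?recursive=1"
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def select_blobs(tree_response: dict, is_eligible: Callable[[str, int], bool]) -> list[dict]:
    """Keep only file entries ("blob") that pass the eligibility check."""
    return [
        entry
        for entry in tree_response.get("tree", [])
        if entry["type"] == "blob" and is_eligible(entry["path"], entry.get("size", 0))
    ]
```

Each returned entry carries the file's `path`, `sha`, and `size`, so the download step only has to touch eligible files. For very large repositories GitHub sets `truncated` to true in the response, in which case a single recursive call is not enough.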
Change Detection
The GitHub connector uses SHA-based change detection: Knowledge Raven stores the Git blob SHA (a hash of the file's content) for each file and re-indexes only files whose SHA has changed. This is more reliable than timestamp-based detection, because the SHA changes exactly when the file's content changes.
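To make the idea concrete, here is a minimal sketch of SHA-based change detection. `git_blob_sha` reproduces how Git hashes file content (SHA-1 over a `blob <size>\0` header plus the bytes). `files_to_reindex` and the dictionary layout are hypothetical and stand in for however Knowledge Raven stores its SHAs.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Compute the SHA-1 that Git assigns to a file (blob) with this content."""
    header = f"blob {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


def files_to_reindex(stored: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return paths that are new or whose SHA differs from the stored one.

    Both arguments map file path to blob SHA: `stored` from the previous sync,
    `current` from the latest repository tree.
    """
    return sorted(path for path, sha in current.items() if stored.get(path) != sha)
```

In practice the SHA does not need to be recomputed locally, since GitHub's tree listing already includes it for every file. Comparing two path-to-SHA maps is therefore enough to find changed files without downloading unchanged ones.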
Token Handling
GitHub OAuth App tokens are long-lived: they do not expire unless revoked, so no refresh tokens are needed.
Source Deep-Linking
When your agent cites a file from GitHub, the source_link points to the file's page on GitHub (e.g., github.com/org/repo/blob/main/docs/guide.md).
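A link of the form shown in the example can be assembled from the repository coordinates and the file path. The helper below is a hypothetical sketch; the only thing taken from the text above is the `github.com/<org>/<repo>/blob/<ref>/<path>` shape.

```python
from urllib.parse import quote


def build_source_link(owner: str, repo: str, ref: str, path: str) -> str:
    """Build a github.com link to a file at a given branch, tag, or commit."""
    # Percent-encode spaces and other special characters, but keep "/" separators.
    safe_ref = quote(ref, safe="/")
    safe_path = quote(path, safe="/")
    return f"https://github.com/{owner}/{repo}/blob/{safe_ref}/{safe_path}"
```

Calling `build_source_link("org", "repo", "main", "docs/guide.md")` yields `https://github.com/org/repo/blob/main/docs/guide.md`.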
Troubleshooting
Private repositories not showing — Make sure you authorized with an account that has access to the repository, and that the repo scope was granted.
Large files skipped — Files over 50 MB are skipped to prevent memory issues. This is expected behavior for large data dumps or binary assets.