
GitHub Connector

Sync documentation and code files from your GitHub repositories into Knowledge Raven.

What Gets Synced

Knowledge Raven indexes text-based files from your repositories. This includes:

  • Documentation: .md, .mdx, .rst, .txt
  • Code: .py, .js, .ts, .go, .java, .rs, .rb, .php, .cs, .swift, and 40+ other languages
  • Config: .yaml, .yml, .json, .toml, .ini, .tf, .hcl
  • Scripts: .sh, .bash, .zsh, .ps1
  • Special files: Dockerfile, Makefile, README, LICENSE, CHANGELOG, .gitignore, etc.

Binary files, images, and files larger than 50 MB are skipped automatically.
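The filtering rule above can be sketched as a small predicate. This is an illustration only, not Knowledge Raven's actual implementation: the names is_eligible, SUPPORTED_EXTENSIONS, SPECIAL_FILENAMES, and MAX_FILE_BYTES are hypothetical, the extension set is a subset of the documented list, and reading "50 MB" as 50 MiB is an assumption.

```python
from pathlib import PurePosixPath

# Assumption: "50 MB" is read as 50 MiB; the exact cutoff is not documented.
MAX_FILE_BYTES = 50 * 1024 * 1024

# Subset of the documented extensions; the real list covers 40+ more languages.
SUPPORTED_EXTENSIONS = {
    ".md", ".mdx", ".rst", ".txt",
    ".py", ".js", ".ts", ".go", ".java", ".rs", ".rb", ".php", ".cs", ".swift",
    ".yaml", ".yml", ".json", ".toml", ".ini", ".tf", ".hcl",
    ".sh", ".bash", ".zsh", ".ps1",
}

# Files matched by name rather than by extension.
SPECIAL_FILENAMES = {
    "Dockerfile", "Makefile", "README", "LICENSE", "CHANGELOG", ".gitignore",
}


def is_eligible(path: str, size_bytes: int) -> bool:
    """Return True if a repository file would be indexed under the rules above."""
    if size_bytes > MAX_FILE_BYTES:
        return False  # oversized files are skipped
    name = PurePosixPath(path).name
    if name in SPECIAL_FILENAMES:
        return True
    # Images and other binaries fall through here: their extensions are not listed.
    return PurePosixPath(name).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, is_eligible("docs/guide.md", 1024) is accepted, while is_eligible("assets/logo.png", 1024) is rejected because .png is not in the supported set.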

How to Connect

Step 1 — Open Connectors

In your Knowledge Raven Dashboard, navigate to Connectors and click Add Connector, then select GitHub.

Step 2 — Authorize with GitHub

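You will be redirected to GitHub’s OAuth authorization page. Knowledge Raven uses a standard OAuth App (not a GitHub App) and requests the repo scope. GitHub's OAuth scopes do not include a read-only option for private repositories, so the repo scope technically grants both read and write access; Knowledge Raven uses it to read your repositories.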
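For readers curious what the redirect looks like, the sketch below builds an authorization URL for GitHub's standard OAuth web flow. The endpoint and the client_id, redirect_uri, scope, and state parameters are GitHub's documented ones; the function name build_authorize_url and the example values are hypothetical, and this is not Knowledge Raven's own code.

```python
import secrets
from urllib.parse import urlencode

GITHUB_AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def build_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the URL a user is sent to in order to approve the repo scope."""
    params = {
        "client_id": client_id,        # identifies the OAuth App
        "redirect_uri": redirect_uri,  # where GitHub sends the user afterwards
        "scope": "repo",               # the scope named in this guide
        "state": state,                # random value checked on return (CSRF guard)
    }
    return f"{GITHUB_AUTHORIZE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_authorize_url(
        client_id="YOUR_CLIENT_ID",
        redirect_uri="https://example.com/callback",
        state=secrets.token_urlsafe(16),
    ))
```

The state value should be generated per request and compared when GitHub redirects back, so that a forged callback can be rejected.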

Step 3 — Select Repositories

After authorization, Knowledge Raven lists your accessible repositories. Select the repositories you want to sync. For large repositories, you can expand the file tree to select specific folders.

Step 4 — Select a Knowledge Base

Choose which Knowledge Raven knowledge base the synced content should appear in.

Step 5 — Initial Sync

Knowledge Raven fetches the repository tree, checks file extensions against the supported list, downloads eligible files, and indexes them.

Change Detection

The GitHub connector uses SHA-based change detection: Knowledge Raven stores the Git blob SHA (a hash of the file's content) for each file and re-indexes only the files whose SHA has changed. This is more reliable than timestamp-based detection, because the SHA changes if and only if the content does.
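A minimal sketch of this comparison follows. The git_blob_sha function reproduces how Git itself hashes a file (SHA-1 over a "blob <size>\0" header plus the content). The files_to_reindex helper and the dictionary layout mapping paths to SHAs are hypothetical, chosen for illustration; how Knowledge Raven stores SHAs internally is not documented here.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Compute the SHA-1 that Git assigns to a file's content (a blob object)."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


def files_to_reindex(stored: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return paths that are new or whose SHA differs from the stored one.

    Both arguments map file path -> blob SHA (hypothetical layout).
    """
    return sorted(path for path, sha in current.items() if stored.get(path) != sha)


if __name__ == "__main__":
    stored = {"README.md": git_blob_sha(b"v1"), "docs/guide.md": git_blob_sha(b"same")}
    current = {
        "README.md": git_blob_sha(b"v2"),        # content changed
        "docs/guide.md": git_blob_sha(b"same"),  # unchanged, skipped
        "docs/new.md": git_blob_sha(b"hello"),   # newly added
    }
    print(files_to_reindex(stored, current))
```

Because the hash depends only on the bytes of the file, a commit that touches a file without changing its content (or a fresh clone with new timestamps) does not trigger re-indexing.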

Token Handling

GitHub OAuth App tokens are long-lived — they don’t expire unless revoked. No refresh tokens are needed.

Source Deep-Linking

When your agent cites a file from GitHub, the source_link points to the file's page on GitHub (e.g., github.com/org/repo/blob/main/docs/guide.md), where the file is shown in GitHub's file viewer.
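A link of the form shown in the example can be assembled from the repository owner, repository name, branch, and file path. The sketch below is an assumption about how such a link might be built, not Knowledge Raven's code; build_source_link is a hypothetical name.

```python
from urllib.parse import quote


def build_source_link(owner: str, repo: str, ref: str, path: str) -> str:
    """Build a github.com blob URL for a file at a given branch or tag."""
    # Percent-encode spaces and other special characters, keeping "/" separators.
    safe_ref = quote(ref, safe="/")
    safe_path = quote(path, safe="/")
    return f"https://github.com/{owner}/{repo}/blob/{safe_ref}/{safe_path}"


if __name__ == "__main__":
    print(build_source_link("org", "repo", "main", "docs/guide.md"))
    # prints https://github.com/org/repo/blob/main/docs/guide.md
```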

Troubleshooting

Private repositories not showing — Make sure you authorized with an account that has access to the repository, and that the repo scope was granted.

Large files skipped — Files over 50 MB are skipped to prevent memory issues. This is expected behavior for large data dumps or binary assets.
