Added 'Training GitHub Repository Embeddings using Stars' resource#316

Open
Puzer wants to merge 1 commit into igrigorik:gh-pages from Puzer:gh-pages
Puzer commented Jan 4, 2026

Hi @igrigorik! First off, huge thanks for maintaining GH Archive. It's an incredible resource.

I recently used the dataset (processing ~1TB of raw data via BigQuery) to train repository embeddings based on the starring patterns of 4M+ developers. The result is a client-side semantic search engine running entirely in the browser via WASM.

I've added a link to the project and the demo in the "Research, visualizations..." section. I hope you find it a worthy addition to the list!

Links:
