Methodology

GitPersona turns public GitHub data into a developer character sheet. Here's exactly how each number is produced — and where it's an estimate.

Data source

Everything comes from the public GitHub REST and GraphQL APIs: your profile, your public repositories, per-repo language byte counts, and (when a token is configured) your contribution history. We only ever read public data. No sign-in, no write access.

Results are cached so repeat visits are fast and we stay friendly to GitHub's rate limits.
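As a rough illustration of the caching idea, here is a minimal in-memory sketch. The class name ScanCache, the time-to-live expiry, and the scan callback are all assumptions made for this example; the actual cache backend and expiry policy are not described here.

```python
import time
from typing import Callable


class ScanCache:
    """Hypothetical in-memory cache with a time-to-live per entry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, dict]] = {}

    def get_or_scan(self, username: str, scan: Callable[[str], dict]) -> dict:
        key = username.lower()  # GitHub usernames are case-insensitive
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh hit: no GitHub API calls needed
        result = scan(username)  # miss or expired: run a real scan
        self._entries[key] = (now, result)
        return result
```

A repeat visit within the assumed time-to-live returns the stored summary without calling GitHub again, which is what keeps request volume low.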

Confidence labels

Every major stat carries a confidence level, shown right next to it:

High confidence: repo count, stars, forks, followers, account age

Medium confidence: language breakdown, years active

Estimated: total commits (medium confidence when a GitHub token is configured), total source lines

Estimated source lines

Counting lines exactly would require cloning and scanning every repo. Instead we derive an estimate from GitHub's per-language byte counts, which already exclude vendored code, lockfiles, generated files, and binaries. We convert bytes to lines using per-language density factors and apply a conservative discount.

The result is labelled "estimated current source lines"; it is not an exact count of every line you've ever written. For large accounts, a subset of repositories is measured precisely and the long tail is approximated from repo size.
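The byte-to-line conversion can be sketched as follows. The density factors, the fallback value, and the discount below are illustrative assumptions, not the values GitPersona actually uses, and estimate_lines is a hypothetical name.

```python
# Assumed average bytes per source line, per language (illustrative only).
BYTES_PER_LINE = {"Python": 35.0, "TypeScript": 38.0, "Go": 30.0}
DEFAULT_BYTES_PER_LINE = 40.0  # assumed fallback for languages not listed
DISCOUNT = 0.85  # assumed conservative discount applied to the total


def estimate_lines(language_bytes: dict[str, int]) -> int:
    """Convert per-language byte counts into a discounted line estimate."""
    total = 0.0
    for language, byte_count in language_bytes.items():
        density = BYTES_PER_LINE.get(language, DEFAULT_BYTES_PER_LINE)
        total += byte_count / density
    return round(total * DISCOUNT)


# 3,500 bytes of Python at 35 bytes per line is 100 lines, discounted to 85.
print(estimate_lines({"Python": 3500}))
```

The input has the same shape as the per-language byte counts GitHub reports for a repository: a mapping from language name to bytes.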

Estimated commits

With a GitHub token, we sum your commit contributions year by year using GitHub's own contribution accounting (medium confidence). Without a token, GraphQL is unavailable, so we fall back to a heuristic based on repository size and age (estimated). Either way the number is clearly labelled.
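The two paths can be sketched like this. The function name, the repo fields (size_kb, created), and both heuristic weights are assumptions made for illustration; the real fallback heuristic is not specified here beyond using repository size and age.

```python
from datetime import date
from typing import Optional

# Hypothetical heuristic weights; illustrative assumptions only.
COMMITS_PER_100_KB = 1.0
COMMITS_PER_REPO_YEAR = 10.0


def estimate_commits(
    yearly_contributions: Optional[dict[int, int]],
    repos: list[dict],
    today: date,
) -> tuple[int, str]:
    """Return (commit estimate, confidence label)."""
    if yearly_contributions is not None:
        # Token path: sum GitHub's per-year commit contribution counts.
        return sum(yearly_contributions.values()), "medium"
    # Fallback path: guess from repository size and age.
    total = 0.0
    for repo in repos:
        age_years = (today - repo["created"]).days / 365.25
        total += repo["size_kb"] / 100 * COMMITS_PER_100_KB
        total += max(age_years, 0.0) * COMMITS_PER_REPO_YEAR
    return round(total), "estimated"
```

Returning the label together with the number keeps the confidence level attached to the value wherever it is displayed.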

The six scores

Each score is a deterministic 0–100 value computed from your repos, languages, and activity, using a bounded saturation curve so prolific accounts don't trivially max out:

Output — how much you've built (repos, LOC, commits, popularity).
Consistency — how regularly and recently you code.
Depth — repo size, maintenance, and commits per project.
Diversity — language and project-type variety.
Polish — descriptions, topics, licenses, demos, READMEs, stars.
Experimentation — new repos per year and breadth of small projects.
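To show what a bounded saturation curve does, here is a minimal sketch using an exponential form. The specific curve and the scale parameter are assumptions; the text above only says the curve is bounded and deterministic, so the real formula may differ.

```python
import math


def saturating_score(raw: float, scale: float) -> float:
    """Map a non-negative raw signal onto 0-100 with diminishing returns.

    `scale` (an assumed tuning parameter) is the raw value at which the
    score reaches roughly 63 out of 100.
    """
    raw = max(raw, 0.0)
    return 100.0 * (1.0 - math.exp(-raw / scale))


# Doubling an already large input barely moves the score.
for repos in (0, 10, 50, 100, 200):
    print(repos, round(saturating_score(repos, scale=50.0), 1))
```

Because each additional unit of input adds less than the previous one, an account with 500 repos scores only slightly higher than one with 200, rather than both being clipped to exactly 100 by a hard cap.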

How archetypes work

Your archetype is rules-based and deterministic. We score all 19 archetypes against your profile signals (score dimensions, language categories and shares, fork ratio, average stars) and assign the best match. Broad behavioral types coexist with language-specific ones, so a profile dominated by a single language reads as that specialist. The same profile always produces the same archetype.

Prototype Alchemist
Frontend Craftsman
Backend Engineer
Open Source Monk
Full-Stack Shapeshifter
AI Tinkerer
Systems Goblin
Weekend Builder
Silent Operator
Tutorial Survivor
TypeScript Native
JavaScript Native
Pythonista
Markup Artisan
Rustacean
Gopher
JVM Engineer
Mobile Native
Data Scientist
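A rules-based, deterministic matcher of this kind can be sketched as below. Only three archetypes are shown, and every rule, weight, and profile key (python_share, experimentation, polish, avg_stars) is a hypothetical stand-in for the real signals.

```python
from typing import Callable

Profile = dict[str, float]

# Hypothetical rules: each maps profile signals to a match score.
ARCHETYPE_RULES: list[tuple[str, Callable[[Profile], float]]] = [
    ("Pythonista", lambda p: p.get("python_share", 0.0) * 100),
    ("Prototype Alchemist", lambda p: p.get("experimentation", 0.0)),
    (
        "Open Source Monk",
        lambda p: 0.5 * p.get("polish", 0.0) + 0.5 * min(p.get("avg_stars", 0.0), 100.0),
    ),
]


def best_archetype(profile: Profile) -> str:
    """Score every archetype and return the best match."""
    best_name, best_score = ARCHETYPE_RULES[0][0], float("-inf")
    for name, rule in ARCHETYPE_RULES:
        score = rule(profile)
        if score > best_score:  # strict '>' keeps the earliest rule on ties
            best_name, best_score = name, score
    return best_name


print(best_archetype({"python_share": 0.9, "experimentation": 60.0}))
```

With a 90% Python share the language-specific rule outscores the behavioral one, so the profile reads as the specialist. The fixed rule order and strict comparison are what make ties resolve the same way on every run.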

Privacy

Public scans use only public GitHub data. Cached scan summaries can be deleted on request. We never expose private repositories, and the project requests no write permissions.

Limitations

GitHub's API is rate-limited. The repo size GitHub reports includes git history, so line counts approximated from it are rough. Contribution data for private work isn't visible without authorization. Treat the estimated numbers as a well-informed approximation, not an audit.

