🧗 Vibe Grader

Type a boulder name, get a V-grade.


How this works

Vibe Grader predicts a bouldering V-grade from nothing but the route's name.

Why?

Three reasons, in roughly honest order:

For the memes. Boulder grades are an endless discourse machine: should Return of the Sleepwalker get downgraded, is that really a V5 in my gym, did the first ascensionist sandbag it? Vibe Grader throws one more totally unqualified opinion into every grade argument.
As an experiment. How much grade signal actually leaks into the name alone? Turns out: a little, but it's real. A name-only model beats blindly guessing the average grade (see the numbers below), which is a slightly unhinged thing to be true.
As a data-driven gut check. Grades are famously subjective: a single number that decades of egos have argued over. A model trained on ~80,000 community grades won't give you the grade, but it does give you a reproducible, opinion-free vibe to argue against. Sometimes a dumb baseline is a useful baseline.

The model

It's trained on ~80,000 real boulder problems from OpenBeta (open-licensed climbing data sourced from Mountain Project), each with a name and a community grade. Each name becomes TF-IDF features — character chunks like cri, eath and whole words like crimp, slab — and a Ridge regression maps those to a number. Nothing to do with hold size or wall angle, just which letters and words tend to show up on hard vs. easy problems. At its most accurate it lands within about ±2 V-grades on average (see the numbers below), so treat it as a vibe, not a verdict.
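To make the feature step a bit more concrete, here is a rough sketch that pulls character chunks and whole words out of a name using only the Python standard library. The helper name_features and the chunk sizes (3 and 4 letters, picked to match the cri and eath examples) are assumptions, not the app's actual settings, and the TF-IDF weighting and the Ridge fit are left out entirely.

```python
import re
from collections import Counter


def name_features(name, sizes=(3, 4)):
    """Count character chunks and whole words in a route name.

    Hypothetical helper: the chunk sizes are a guess, not the app's config.
    """
    words = re.findall(r"[a-z0-9]+", name.lower())
    feats = Counter()
    for word in words:
        feats["word:" + word] += 1
        for n in sizes:
            # Slide a window of n letters across the word.
            for i in range(len(word) - n + 1):
                feats["chars:" + word[i:i + n]] += 1
    return feats


feats = name_features("Crimp of Death")
print(sorted(feats))
```

A real pipeline would then down-weight chunks that show up in almost every name (the IDF part) and let the regression learn one coefficient per chunk or word.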

The two grades (the sandbag toggle)

Both grades come from the exact same algorithm and data. The only difference is how the training examples are weighted:

The vibe grade (default): every grade level counts equally in training, however few problems carry it, so rare hard grades pull their weight and the model isn't shy about calling a gnarly name V-hard.
Sandbagged mode: tuned for the lowest average error. Since most real boulders are easy-to-mid, it plays the odds and under-calls scary names.
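One way to picture the two weightings, as a sketch only: the exact scheme isn't spelled out here, so the inverse-frequency formula below is an assumption, and the function names are made up for illustration. On toy data, a weighted average already shows the effect: equal weight per problem gets dragged toward the easy crowd, while equal weight per grade level does not.

```python
from collections import Counter


def uniform_weights(grades):
    """Every problem counts the same (the lowest-average-error setup)."""
    return [1.0 for _ in grades]


def balanced_weights(grades):
    """Every grade level gets the same total weight (assumed scheme)."""
    counts = Counter(grades)
    n_levels = len(counts)
    return [len(grades) / (n_levels * counts[g]) for g in grades]


def weighted_mean(grades, weights):
    """Average grade where each problem contributes according to its weight."""
    return sum(g * w for g, w in zip(grades, weights)) / sum(weights)


# Toy data: three V0s and a single V5.
grades = [0, 0, 0, 5]
print(weighted_mean(grades, uniform_weights(grades)))   # → 1.25
print(weighted_mean(grades, balanced_weights(grades)))  # → 2.5
```

The same idea carries over to the regression: with balanced weights, the lone V5 counts as much as all three V0s combined.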

Is it actually any good?

Measured on the held-out 20% test set, by mean absolute error (average distance from the real grade — lower is better):

±2.16 V: just guessing the average grade every time. The dumb baseline to beat.
±1.99 V: sandbagged mode. It actually beats the baseline by playing the odds.
±2.67 V: the vibe grade. Yes, worse than guessing the average, but that's the price of being willing to shout "V100!" at a scary name.
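For anyone who wants to see how that kind of number is computed, here is a minimal sketch of mean absolute error and the guess-the-average baseline. The grades are invented toy values, not the app's data, and the variable names are placeholders.

```python
def mean_absolute_error(actual, predicted):
    """Average distance between real and predicted grades, in V-grades."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy grades, invented for illustration (not the app's data).
train_grades = [0, 1, 2, 3, 9]
test_grades = [1, 2, 6]

# The dumb baseline: predict the training average for every problem.
average = sum(train_grades) / len(train_grades)
baseline = [average] * len(test_grades)

print(average)                                     # → 3.0
print(mean_absolute_error(test_grades, baseline))  # → 2.0
```

A model earns its keep only if its predictions, scored the same way on the same held-out problems, come in under the baseline's number.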

The V15 gap

Climbing is a pyramid: there are tons of easy boulders and very few elite ones. OpenBeta (≈ Mountain Project, US-centric) has almost nothing above V15. We patched the very top (V16–V17, plus the proposed V18 Exodia) from a small open list of famous hard problems, but V14–V15 stays thin. So predictions up high are extrapolated from a handful of examples.

Old names, new boulders

The model only ever saw names that already had grades, so it's really learning the naming conventions of climbs that have been around a while — not anything physical. Type a brand-new boulder and you're betting that whoever named it followed the same herd instincts as everyone before them. Name your V2 "Burden of Sharma Roof Low" and the model will happily believe you. Garbage in, gnarly out.

The confidence curve

The bar chart isn't a real probability. Ridge gives a single number; we draw a bell curve around it using the model's typical error. It's a picture of uncertainty, not a calibrated distribution.
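Roughly, the drawing step could look like the sketch below. The function name confidence_bars, the V0 to V17 range, and the example numbers are all assumptions for illustration; which error figure the app actually uses for the curve's width isn't stated here.

```python
import math


def confidence_bars(predicted, typical_error, max_grade=17):
    """Bar heights for V0..max_grade from a bell curve around one number."""
    raw = [
        math.exp(-0.5 * ((g - predicted) / typical_error) ** 2)
        for g in range(max_grade + 1)
    ]
    total = sum(raw)
    # Normalise so the bars sum to 1; this is cosmetic, not calibration.
    return [r / total for r in raw]


bars = confidence_bars(predicted=4.3, typical_error=2.0)
peak = max(range(len(bars)), key=bars.__getitem__)
print(f"tallest bar: V{peak}")  # → tallest bar: V4
```

The curve has the same width for every name, which is exactly why it shows typical error rather than how sure the model is about this particular boulder.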
