MODEL 05 · LOCALE-AWARE INFERENCE

Gender Inferencer

Powered by PNEUMA-DD (production) · 53M given-name forms in MondoGraph

Predict likely gender from a given name — locale-aware, calibrated, and honest about ambiguity. Andrea is feminine in Italian and largely masculine in German. Kim is unisex in English, predominantly female in Korean. The model returns a full distribution, not a guess.

Try it — given name → gender distribution

A locale hint is optional. When one is supplied, the model conditions on the country distribution from Nomograph.

Model card

Approach: Nomograph corpus lookup (locale-conditioned) + XLM-R fallback for OOV names
Labels: masculine · feminine · unisex · unknown
Locales: 187 (ISO 3166 alpha-2 × language)
Accuracy: 96.5% locale-conditioned · 91.2% locale-blind
Calibration: Returns posterior P(M), P(F), P(unknown); ECE 0.024
Refusal threshold: If max(P) < 0.65 → unknown (configurable)
Note: Stats are descriptive (corpus-frequency-based), not normative.
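
The refusal rule in the model card can be sketched in a few lines of Python. This is a minimal illustration of "if max(P) < 0.65 → unknown" applied to a returned distribution, not the production implementation; the function name `top_label` and the constant `DEFAULT_THRESHOLD` are hypothetical.

```python
from typing import Dict

# 0.65 is the default refusal threshold stated in the model card.
DEFAULT_THRESHOLD = 0.65


def top_label(distribution: Dict[str, float],
              threshold: float = DEFAULT_THRESHOLD) -> str:
    """Return the most probable label, or "unknown" when evidence is too weak.

    `top_label` is a hypothetical helper; it only mirrors the rule
    "if max(P) < threshold, answer unknown".
    """
    if not distribution:
        return "unknown"
    label, prob = max(distribution.items(), key=lambda item: item[1])
    # Refuse rather than guess when the strongest posterior is below the threshold.
    return label if prob >= threshold else "unknown"
```

With the distribution from the API example (feminine at 0.97), `top_label` returns "feminine". A made-up split such as masculine 0.55, feminine 0.40, unknown 0.05 falls below 0.65 and returns "unknown". Raising `threshold` makes the rule refuse more often.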

API

curl -X POST https://api.mondonomo.ai/v1/gender \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"name": "Andrea", "locale": "it-IT"}'

Response:

{
  "top": "feminine",
  "confidence": 0.97,
  "distribution": { "masculine": 0.02, "feminine": 0.97, "unknown": 0.01 },
  "locale": "it-IT",
  "bearers_in_corpus": 428392
}

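
The same call can be made from Python with only the standard library. The sketch below reuses the endpoint, bearer token, and field names from the curl example. Everything else is an assumption: the helper names (`build_request`, `parse_response`, `infer`) are hypothetical, the `Content-Type` header does not appear in the curl example, and omitting `locale` from the payload is assumed to be allowed because the locale hint is described as optional.

```python
import json
import urllib.request

API_URL = "https://api.mondonomo.ai/v1/gender"  # endpoint from the curl example


def build_request(name, locale=None, token=""):
    """Build the POST request without sending it (hypothetical helper)."""
    payload = {"name": name}
    if locale:
        # Assumption: the locale field may be left out when no hint is available.
        payload["locale"] = locale
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: not shown in the curl example, but typical for JSON bodies.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_response(body):
    """Extract (top, confidence, distribution) from a JSON response body."""
    result = json.loads(body)
    return result["top"], result["confidence"], result["distribution"]


def infer(name, locale=None, token=""):
    """Send the request and return the parsed result (performs a network call)."""
    request = build_request(name, locale, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_response(response.read())
```

Keeping request construction and response parsing separate from the network call makes both easy to check offline, for example by feeding `parse_response` the sample response shown above.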
On ethics. Gender inference from names is a statistical signal, not an identity claim. The model returns probabilities derived from corpus frequency; it refuses on ambiguous evidence rather than guessing. Use it for aggregate analytics, salutation localization (Señora García vs Hi Maria), or to flag records that need follow-up — never as a system of record for an individual's gender.
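
One way to follow this guidance in code is to keep the probabilities instead of stamping a label on each person. The sketch below is an illustration under assumptions, not part of the product: `expected_counts` sums the returned distributions into cohort-level expected counts, and `needs_follow_up` lists record ids whose strongest posterior is below a threshold. Both function names, the record shape (id, distribution), and the sample values are hypothetical; 0.65 is the default refusal threshold from the model card.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Distribution = Dict[str, float]


def expected_counts(distributions: Iterable[Distribution]) -> Dict[str, float]:
    """Sum per-label probabilities into expected counts for a whole cohort."""
    totals: Dict[str, float] = defaultdict(float)
    for dist in distributions:
        for label, prob in dist.items():
            totals[label] += prob
    return dict(totals)


def needs_follow_up(records: Iterable[Tuple[str, Distribution]],
                    threshold: float = 0.65) -> List[str]:
    """Return ids of records whose strongest posterior is below the threshold."""
    return [
        record_id
        for record_id, dist in records
        if max(dist.values(), default=0.0) < threshold
    ]


# Made-up records for illustration only.
records = [
    ("r1", {"masculine": 0.02, "feminine": 0.97, "unknown": 0.01}),
    ("r2", {"masculine": 0.50, "feminine": 0.40, "unknown": 0.10}),
]
cohort = expected_counts(dist for _, dist in records)
flagged = needs_follow_up(records)  # only "r2" is below 0.65
```

Nothing here writes a gender back onto an individual record. The cohort totals support aggregate reporting, and flagged ids can be routed to a step where the person states their own preference.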

RELATED

Use after parsing.