Production · serving today

PNEUMA-DD · DATA-DRIVEN VARIANT

The fastest way to understand a proper name.

Same task surface as PNEUMA (entity type, language, country, gender, parsing), but powered directly by MondoGraph token-frequency statistics. Transparent, debuggable, and shipping in production today, with typical latency under 50 ms. No training and no GPU, just the corpus.

TRY IT — FULL PNEUMA-DD OUTPUT

Type a full name. See the whole stack run.

One call to /api/v1/parse returns language, parse structure, gender posterior, and per-token country/language distributions, all computed from MondoGraph at a p50 latency of about 25 ms.
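
A minimal sketch of what such a call could look like from Python. Only the /api/v1/parse path comes from this page; the host api.example.com, the use of POST, and the JSON field names "name" and "locale" are assumptions for illustration, not the documented contract.

```python
import json
import urllib.request

# Assumption: placeholder host. Only the /api/v1/parse path comes from this page.
BASE_URL = "https://api.example.com"


def build_parse_request(full_name, locale=None):
    """Build (but do not send) a JSON POST request for /api/v1/parse.

    The body fields "name" and "locale" are hypothetical names.
    """
    payload = {"name": full_name}
    if locale is not None:
        payload["locale"] = locale
    return urllib.request.Request(
        BASE_URL + "/api/v1/parse",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_name(full_name, locale=None, timeout=2.0):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_parse_request(full_name, locale)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before anything leaves the machine.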

Input is processed in memory for inference only: it is not stored, not logged with identifying content, and not used for training. See Data handling.

WHY DATA-DRIVEN

The fastest, most transparent model is no model at all.

Most name-understanding tasks have explicit ground truth in MondoGraph: Andrea is masculine in Italian because 92% of Italian Andreas are men. PNEUMA-DD returns that posterior directly, with the bearer count as evidence. No black box, no hallucination, and every prediction is auditable down to the source rows.

~25 ms · p50 latency
96.5% · gender accuracy, locale-conditioned
96.1% · entity-type accuracy
100% · predictions explainable

LIVE DEMOS

Each task, in your browser.

Same endpoints we use in production. Type a name, see a real result.

METHOD NOTE

Bayesian posteriors over MondoGraph rows.

For an input name n and an optional locale prior L, PNEUMA-DD computes P(class | n, L) by direct count over the 556M-row token_stats table, with Dirichlet smoothing for low-evidence cases and a temperature parameter for calibration. Convention detection in the parser uses a CRF over locale-conditioned slot priors. For out-of-corpus tokens, the model falls back to PNEUMA (in development) or a character-n-gram nearest-neighbor lookup.
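
The count-based posterior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not PNEUMA-DD's implementation: the function name smoothed_posterior, the alpha pseudo-count, treating the locale prior as a distribution over classes, and applying temperature as a power on the smoothed probabilities are all choices made for this example.

```python
def smoothed_posterior(counts, prior, alpha=1.0, temperature=1.0):
    """Estimate P(class | name, locale) from raw counts.

    counts:      {class: bearer count for this name}  (hypothetical input shape)
    prior:       {class: prior probability}, summing to 1; stands in for the locale prior L
    alpha:       Dirichlet pseudo-count; matters most when evidence is low
    temperature: > 1 flattens the distribution, < 1 sharpens it
    """
    if alpha <= 0 or temperature <= 0:
        raise ValueError("alpha and temperature must be positive")

    total = sum(counts.get(c, 0) for c in prior)

    # Posterior mean under a Dirichlet(alpha * prior) prior.
    smoothed = {
        c: (counts.get(c, 0) + alpha * prior[c]) / (total + alpha) for c in prior
    }

    # Temperature scaling, then renormalize so the result sums to 1.
    scaled = {c: p ** (1.0 / temperature) for c, p in smoothed.items()}
    z = sum(scaled.values())
    return {c: s / z for c, s in scaled.items()}


# Illustrative counts only, not corpus values.
example = smoothed_posterior(
    {"feminine": 8, "masculine": 2},
    {"feminine": 0.5, "masculine": 0.5},
    alpha=2.0,
)
print(example)  # {'feminine': 0.75, 'masculine': 0.25}
```

With no observed counts the function returns the prior unchanged, which is the behavior smoothing is meant to provide for low-evidence names.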

EXPLAIN-WHY

Every prediction has receipts.

Each API response includes a bearers_in_corpus field and a list of the supporting row counts. Audit a gender call: Andrea, it-IT → masculine 0.97 · 428,392 bearers. Teams in KYC and other regulated industries can show a regulator the exact evidence behind each prediction. Set explain: true in the demos to see the receipts.
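
A receipt line like the one above could be rendered from a response with a few lines of Python. This is a sketch only: bearers_in_corpus is the field named on this page, while the "name", "locale", "label", and "posterior" keys and the sample values are invented for illustration and are not corpus data.

```python
def format_receipt(response):
    """Render a one-line audit receipt from a response-like dict.

    Only "bearers_in_corpus" is a field named on this page; the other
    keys are hypothetical.
    """
    return "{name}, {locale} → {label} {posterior:.2f} · {bearers:,} bearers".format(
        name=response["name"],
        locale=response["locale"],
        label=response["label"],
        posterior=response["posterior"],
        bearers=response["bearers_in_corpus"],
    )


# Invented sample payload, for illustration only.
sample = {
    "name": "Maria",
    "locale": "es-ES",
    "label": "feminine",
    "posterior": 0.99,
    "bearers_in_corpus": 1200,
}
print(format_receipt(sample))  # Maria, es-ES → feminine 0.99 · 1,200 bearers
```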