MONDOPHON · UNIVERSAL NAME MAPPING
MondoPhon is the universal proper-name-to-proper-name mapping engine: an ensemble of Transformers and weighted finite-state transducers (WFSTs) covering every script in active use. Romanization, deromanization, phonetic transcription, grapheme-to-phoneme (G2P) and phoneme-to-grapheme (P2G) conversion all sit under one API.
TASK SURFACE
Every script-to-script and script-to-sound conversion you need to handle the world's names. Where a learned Transformer wins (named entities, soft phonetic rules), MondoPhon uses it. Where a WFST wins (deterministic Romanization standards, composable cascades), it uses that. The ensemble routes each request per task and per language.
尤金 → Yóujīn
ณัฐกรณ์ → Natthakon
يوسف → Yusuf
Yusuf · ar → يوسف
Eugene · ru → Юджин
Kenta · ja → 健太
Eugen · de → /ˈɔʏɡn̩/
ณัฐกรณ์ · th → /nát̚.tʰa.kɔːn/
/ˈjuːdʒɪn/ · en → Eugene · Eugen · Eugéne
Top-3 with beam-90 search
Yusuf ≈ Joseph ≈ Youssef ≈ Yosef
IPA2vec cosine over 128-dim embeddings
ar → IPA → Latn → IPA → ja
Multi-hop conversion with confidence at every step
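A multi-hop conversion such as the chain above can be pictured as a list of converters applied in order, each returning its output together with a confidence score. The sketch below is illustrative only: `Hop`, `run_chain`, `chain_confidence` and the stub converters are hypothetical names, not part of the MondoPhon API, and multiplying per-hop confidences assumes the hops are independent.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A converter takes text and returns (output, confidence in [0, 1]).
Converter = Callable[[str], Tuple[str, float]]


@dataclass
class Hop:
    source: str       # e.g. "ar"
    target: str       # e.g. "IPA"
    output: str       # text produced by this hop
    confidence: float  # confidence reported for this hop


def run_chain(text: str, steps: List[Tuple[str, str, Converter]]) -> List[Hop]:
    """Apply each converter to the previous hop's output, keeping every step."""
    hops: List[Hop] = []
    current = text
    for source, target, convert in steps:
        output, confidence = convert(current)
        hops.append(Hop(source, target, output, confidence))
        current = output
    return hops


def chain_confidence(hops: List[Hop]) -> float:
    """Overall confidence as the product of per-hop confidences (assumption)."""
    total = 1.0
    for hop in hops:
        total *= hop.confidence
    return total


def stub(output: str, confidence: float) -> Converter:
    """Hypothetical stand-in for a real model call; ignores its input."""
    return lambda _text: (output, confidence)


if __name__ == "__main__":
    # Placeholder outputs and made-up confidences, for illustration only.
    steps = [
        ("ar", "IPA", stub("<ipa-1>", 0.9)),
        ("IPA", "Latn", stub("<latn>", 0.8)),
        ("Latn", "IPA", stub("<ipa-2>", 0.9)),
        ("IPA", "ja", stub("<ja>", 0.7)),
    ]
    for hop in run_chain("يوسف", steps):
        print(f"{hop.source} -> {hop.target}: {hop.output} ({hop.confidence:.2f})")
```

Keeping every intermediate `Hop` rather than only the final string is what makes a per-step confidence display possible: a weak link in the chain can be spotted and reported.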
LEGACY MODELS · THE SCIENCE BEHIND
The full ensemble is in preview, but its production components have been published and evaluated, and are live on the API. They show the research lineage that MondoPhon will unify into one endpoint.
The current production P2G inside MondoPhon. ByT5-based seq2seq, trained on WikiPron + 1M Nomograph augmentations. The companion models IPA2vec and similarIPA also power the soundalike search.
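Soundalike search over embeddings comes down to ranking candidates by cosine similarity. The following is a minimal sketch of that ranking step, not the IPA2vec implementation: the function names are hypothetical, and the vectors would in practice be the 128-dimensional embeddings rather than the tiny toy vectors used in the usage note.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_soundalikes(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k candidate names, most similar to the query first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

With toy two-dimensional vectors, `rank_soundalikes([1.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}, top_k=2)` ranks `a` first and `c` second, because `a` points in the same direction as the query and `b` is orthogonal to it.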
The Thai romanization path in MondoPhon today. ByT5-based, with a surprisingly competitive AyutthayaAlpha-VerySmall variant. Published with Chulalongkorn University; the corpus comes from the Handbook of Top Thai Names.
ENSEMBLE ARCHITECTURE
Naïve character-level transliteration breaks on every irregular form (silent letters, tones, kunya). Pure neural models hallucinate when scaled to 150+ scripts. MondoPhon routes per task and per language: deterministic Romanization standards (RTGS, ALA-LC, BGN/PCGN, Hepburn, Pinyin) run as WFST cascades for speed and auditability, while named-entity-aware mappings and phonetic prediction run on the Transformer ensemble. At inference, the router decides which path to take based on input language, target script, and confidence.
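The routing decision described above can be sketched as a small lookup plus a confidence check. Everything in this example is an assumption made for illustration: the `Request` type, the `choose_path` function, the contents of the standards table and the 0.8 threshold are hypothetical and do not describe the actual MondoPhon router.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical table: (input language, target script) -> deterministic standard.
WFST_STANDARDS: Dict[Tuple[str, str], str] = {
    ("th", "Latn"): "RTGS",
    ("ja", "Latn"): "Hepburn",
    ("zh", "Latn"): "Pinyin",
}


@dataclass(frozen=True)
class Request:
    language: str  # input language code, e.g. "zh"
    target: str    # target script or "IPA"


def choose_path(
    req: Request,
    wfst_confidence: Optional[float] = None,
    threshold: float = 0.8,  # assumed cut-off, not a documented value
) -> str:
    """Pick the WFST cascade when a standard applies and confidence is high enough."""
    standard = WFST_STANDARDS.get((req.language, req.target))
    if standard is None:
        # No deterministic standard for this pair (e.g. phonetic prediction).
        return "transformer"
    if wfst_confidence is not None and wfst_confidence < threshold:
        # The deterministic path is unsure, so fall back to the learned model.
        return "transformer"
    return f"wfst:{standard}"
```

For example, `choose_path(Request("zh", "Latn"), 0.95)` selects the Pinyin cascade, while `choose_path(Request("de", "IPA"))` falls through to the Transformer path because no deterministic standard is registered for that pair.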