Onomastics · scientific research on names with MondoGraph

WHAT BECOMES POSSIBLE

Three centuries of dictionary footnotes, now queryable in a single graph.

Onomastic knowledge has lived in prose — paragraphs in Behind The Name, footnotes in regional baby-name handbooks, scholarly chapters on Hebrew theophorics. Useful to a human reader; opaque to a machine. The MondoGraph stack turns that knowledge into structured data, with first-class support for every script and naming tradition.

Cross-cultural comparison

Compare entire naming systems quantitatively. Which countries share the most given names? Which pairs of languages encode the same etymon under different scripts? Answers now come from one query, not a literature survey.

Diachronic tracing

Track how a single name diffuses through history — from a Hebrew theophoric, through Greek translation, medieval Latinization, regional vernacularization, and contemporary global re-romanization. Etymology as a graph traversal.
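As a minimal sketch of what "etymology as a graph traversal" can mean in practice, the snippet below walks derived-from links from a modern form back to its oldest recorded ancestor. The dictionary, the labels, and the function name trace_lineage are illustrative assumptions, not the MondoGraph schema or API.

```python
from typing import Dict, List

# Hypothetical toy data: each form points to the form it derives from.
DERIVED_FROM: Dict[str, str] = {
    "John (English)": "Iohannes (Latin)",
    "Iohannes (Latin)": "Ioannes (Greek)",
    "Ioannes (Greek)": "Yochanan (Hebrew)",
}


def trace_lineage(form: str, edges: Dict[str, str] = DERIVED_FROM) -> List[str]:
    """Follow derived-from edges until a form with no recorded parent."""
    path = [form]
    seen = {form}
    while path[-1] in edges:
        parent = edges[path[-1]]
        if parent in seen:  # guard against cycles in noisy data
            break
        seen.add(parent)
        path.append(parent)
    return path


if __name__ == "__main__":
    print(" <- ".join(trace_lineage("John (English)")))
```

A real traversal would likely branch, since a form can have several competing etymologies, but the single-parent walk is enough to show the idea.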

Cultural-heritage projects

Onomastic atlases that previously took a generation to assemble can now be produced as derived publications — bilingual country-pair volumes, regional handbooks, religious-tradition encyclopedias — with the database doing the heavy lifting.

METHODOLOGICAL FOUNDATION

Names as a measure of cultural similarity.

The flagship onomastic paper to come out of MondoGraph is Lauc's Navigating Linguistic Similarities Among Countries Using Fuzzy Sets of Proper Names (Names, vol. 72, 2024) — a methodology for measuring how similar two countries are by treating their forename inventories as fuzzy sets and computing a distance between them.

"This paper examines the commonalities among several countries and languages through the lens of proper names, especially forenames. It posits that the investigation of these names offers a fresh perspective on language similarity due to their distinct influence from cross-cultural interactions and language contact compared to regular vocabulary. The results show a notable correlation between the commonality of proper names across languages and the overarching commonality of the languages themselves." Davor Lauc · Names: A Journal of Onomastics, vol. 72 · 2024 · read the paper ↗

The method introduces a novel similarity measure that generalizes set similarity by accounting for the distances between member elements (using PolyIPA-derived phonetic distance for the forename case). Applied across 88 countries, the resulting clusters recover known linguistic families — but also surface surprising affinities (e.g. between historically separated diaspora communities) that flat-vocabulary similarity misses entirely. The paper is the methodological anchor for everything below.
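To make the idea of a set similarity that accounts for distances between members more concrete, here is a rough sketch of one possible "soft Jaccard" over fuzzy sets of names. It is not the measure defined in the paper: the formula is an assumption chosen for illustration, membership weights are hypothetical, and normalized edit distance stands in for the PolyIPA-derived phonetic distance.

```python
from typing import Dict


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def name_similarity(a: str, b: str) -> float:
    """1.0 for identical strings, 0.0 when every character differs."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def soft_overlap(src: Dict[str, float], dst: Dict[str, float]) -> float:
    """For each name in src, take its best membership-weighted match in dst."""
    return sum(
        max((min(mu, nu) * name_similarity(a, b) for b, nu in dst.items()), default=0.0)
        for a, mu in src.items()
    )


def fuzzy_set_similarity(A: Dict[str, float], B: Dict[str, float]) -> float:
    """Symmetric soft Jaccard; equals plain Jaccard for crisp sets with exact matching."""
    total = sum(A.values()) + sum(B.values())
    if total == 0:
        return 1.0
    inter = (soft_overlap(A, B) + soft_overlap(B, A)) / 2
    return inter / (total - inter)


if __name__ == "__main__":
    hr = {"ivan": 1.0, "ana": 0.8}   # hypothetical membership degrees
    pl = {"iwan": 0.3, "anna": 1.0}
    print(round(fuzzy_set_similarity(hr, pl), 3))
```

The point of the design is that near-matches such as "ivan" and "iwan" contribute partial credit instead of counting as a complete miss, which is what a flat set intersection would do.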

RESEARCH SUBPROJECTS

From general infrastructure to focused atlases.

The MondoGraph stack supports specialized research surfaces — each filtering, annotating, and publishing on a particular slice of the world's names. Two are highlighted here; more are in production with academic partners.

Published · books available

Named by God — global biblical onomastics

SUBPROJECT 01 · INTERNATIONAL BIBLICAL ONOMASTIC SOCIETY

A large-scale research and publishing project dedicated to the systematic study, mapping, and dissemination of biblical personal names. The intellectual core is a rigorously structured corpus of 2,365 biblical names, each treated as a unit of cultural transmission: original scriptural form in Hebrew, Greek, or Aramaic; phonetic transliterations; etymological trajectory through ancient Near Eastern languages; narrative role across 35,003 indexed Bible verses; social graph of associated biblical figures; and differentiated spread across 202 countries and 12 writing systems.

The comparative orthographic layer comprises 20,540 attested spelling variants, constituting the most comprehensive multilingual register of biblical name forms assembled to date. Distribution data challenges the assumption that biblical nomenclature is a Western or Christian phenomenon — Brazil alone hosts an estimated 119M bearers of biblical names, followed by India (116M) and Mexico (100M).

The methodological lineage traces to the foundational study by Martinjak, Lauc & Skelac, Towards Analysis of Biblical Entities and Names using Deep Learning (IJACSA, vol. 14, no. 5, 2023; DOI 10.14569/IJACSA.2023.0140552) — which applied NLP and social-network centrality metrics (Degree, Closeness, Betweenness) to Polish, Croatian, and English translations of the Gospel of Mark, seeding the agenda Named by God now executes at civilizational scale.
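For readers unfamiliar with the centrality metrics named above, the following sketch computes degree and closeness centrality on a tiny co-occurrence graph using only the standard library. The edge list is a made-up toy example, not data from the cited study, and betweenness is left out to keep the example short.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy co-occurrence edges between named figures.
TOY_EDGES = [("Jesus", "Peter"), ("Jesus", "James"), ("Jesus", "John"), ("Peter", "Andrew")]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    graph: Dict[str, Set[str]] = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Share of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n < 2:
        return {v: 0.0 for v in graph}
    return {v: len(neighbours) / (n - 1) for v, neighbours in graph.items()}


def closeness_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Reachable node count divided by the sum of shortest-path lengths (BFS)."""
    scores: Dict[str, float] = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


if __name__ == "__main__":
    g = build_graph(TOY_EDGES)
    print(degree_centrality(g))
    print(closeness_centrality(g))
```

On a connected graph this closeness value matches the usual definition; for disconnected graphs it only considers reachable nodes, which is a simplification.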

PUBLISHED VOLUMES · BILINGUAL COUNTRY PAIRS

From the Black Forest to the Bible Belt · German–English · 333p

From the Adriatic to the Vistula · Croatian–Polish · 333p

From Lilies to Lotuses · Thai–English · 333p

From Eagle to White Eagle · Polish–English · 333p

From Alter to Torri Gate · Croatian–Japanese · 333p

Biblical Baby Names Encyclopedia: A Cross-Cultural Guide · Marcus Paterson · ASIN B0GP76NFDD

Sacred Names Across the Pacific: Biblical Names Between Korea and the United States · Marcus Paterson & Jayden Yoon · ASIN B0GDV2PLXH · the first rigorous cross-cultural excavation of biblical nomenclature between fundamentally different linguistic universes

Names indexed: 2,365
Bible verses tagged: 35,003
Countries with bearer data: 202
Writing systems: 12
Spelling variants: 20,540
Global biblical-name bearers: ~1B people
Top country: BR · 119M

In progress · build replays in ~12h

Formalised Etymology

SUBPROJECT 02 · SEMANTIC NETWORK FOR NAME ETYMOLOGY

A semantic network of name etymologies across languages and scripts — designed for retrieval, reasoning, and downstream NLP. Etymological knowledge has been trapped in prose: dictionary entries, Wiktionary articles, scholarly footnotes. A user searching for the Korean surname 김 could not easily learn that it romanizes to Kim, traces to Chinese 金 ("gold"), shares a Wikidata concept with Japanese キム, and is carried by ten million people in the corpus. This project links it all in one graph.

The result is a typed graph with three node kinds (Form, Etymon, Concept) and twelve edge predicates. A separate claim layer records who said what, with what confidence, citing which source row, so every fact is auditable.

variant-of · derived-from · transliteration-of · cognate-of · hypocorism-of · feminine-of · masculine-of · compound-of · borrowed-from · named-after · contains-element · contracted-from
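One way to picture the three node kinds, the twelve predicates, and the separate claim layer is the data-structure sketch below. The class names, field names, source labels, row identifiers, and confidence values are all hypothetical placeholders; only the node kinds, the predicate names, and the 김 / Kim / 金 example come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class NodeKind(Enum):
    FORM = "Form"
    ETYMON = "Etymon"
    CONCEPT = "Concept"


class Predicate(Enum):
    VARIANT_OF = "variant-of"
    DERIVED_FROM = "derived-from"
    TRANSLITERATION_OF = "transliteration-of"
    COGNATE_OF = "cognate-of"
    HYPOCORISM_OF = "hypocorism-of"
    FEMININE_OF = "feminine-of"
    MASCULINE_OF = "masculine-of"
    COMPOUND_OF = "compound-of"
    BORROWED_FROM = "borrowed-from"
    NAMED_AFTER = "named-after"
    CONTAINS_ELEMENT = "contains-element"
    CONTRACTED_FROM = "contracted-from"


@dataclass(frozen=True)
class Node:
    kind: NodeKind
    label: str
    lang: str = ""  # hypothetical field: language or script tag


@dataclass(frozen=True)
class Edge:
    subject: Node
    predicate: Predicate
    obj: Node


@dataclass(frozen=True)
class Claim:
    edge: Edge
    source: str        # who said it (placeholder names below)
    source_row: str    # which source row is cited (placeholder ids below)
    confidence: float  # with what confidence (made-up values below)


def claims_for(edge: Edge, claims: List[Claim]) -> List[Claim]:
    """Return the claims backing an edge, most confident first."""
    return sorted((c for c in claims if c.edge == edge), key=lambda c: c.confidence, reverse=True)


kim_hangul = Node(NodeKind.FORM, "김", "ko")
kim_latin = Node(NodeKind.FORM, "Kim", "ko-Latn")
gold = Node(NodeKind.ETYMON, "金", "zh")

romanization = Edge(kim_latin, Predicate.TRANSLITERATION_OF, kim_hangul)
origin = Edge(kim_hangul, Predicate.DERIVED_FROM, gold)

claims = [
    Claim(origin, "example-source-b", "row-7", 0.6),
    Claim(origin, "example-source-a", "row-1", 0.9),
    Claim(romanization, "example-source-a", "row-2", 0.95),
]
```

Keeping claims separate from edges means the same edge can be supported, or contradicted, by several sources without duplicating the edge itself.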

The graph is built from eleven idempotent ingestion stages against open resources: Behind The Name, Paranames (Wikidata parallel-name corpus), polyglotnames transliteration pairs, Wiktionary etymology TSVs, and the Etymological Wordnet (de Melo, 2013), with a small Firecrawl pass for residue. The structured extraction step uses no LLM; Wiktionary prose is handled by rule-based parsing. The full build is reproducible from disk in roughly 12 hours.
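"Idempotent" here means a stage can be re-run without changing the result. A minimal sketch of that property, assuming a content-addressed store (the function names, the in-memory dictionary, and the sample rows are illustrative, not the project's actual pipeline):

```python
import hashlib
from typing import Dict, Iterable


def row_key(source: str, raw: str) -> str:
    """Deterministic key: the same source row always maps to the same key."""
    return hashlib.sha256(f"{source}\x00{raw}".encode("utf-8")).hexdigest()


def ingest(store: Dict[str, dict], source: str, rows: Iterable[str]) -> int:
    """Insert rows that are not yet present; return how many were added."""
    added = 0
    for raw in rows:
        key = row_key(source, raw)
        if key not in store:
            store[key] = {"source": source, "raw": raw}
            added += 1
    return added


if __name__ == "__main__":
    store: Dict[str, dict] = {}
    rows = ["Kim\ttransliteration-of\t김", "김\tderived-from\t金"]  # made-up sample rows
    print(ingest(store, "example-source", rows))  # first run adds both rows
    print(ingest(store, "example-source", rows))  # replay adds nothing
```

Because keys depend only on the source name and row content, replaying a stage after a crash or a partial run leaves the store in the same state as a single clean run.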

The method extends Davor Lauc, Tomislava Lauc & Vjera Lopina, The Formalization of Multilingual Etymologies into Semantic Networks: AI Methodologies for Etymological Dictionaries, presented at DH Benelux 2024 — Breaking Silos, Connecting Data, Irish College, Leuven, 5–7 June 2024. The paper introduces a hybrid LLM-plus-curation pipeline for turning etymological dictionary text into structured networks; this project operationalises it at scale across 100+ languages.

Form nodes (names + word forms): 11.3M
Etymon nodes (historical source words): 2.7M
Concept nodes (Wikidata Q-IDs): 2.7M
Typed edges: 14.4M
Claims with provenance: 21.4M
Edge predicates: 12
Languages with first-class support: 100+

WHO USES NELMA FOR ONOMASTIC RESEARCH

Four kinds of partner.

Academic linguists

Co-author papers, receive derivative datasets published on Hugging Face under Apache 2.0, and participate in cross-language methodology design.

Cultural-heritage orgs

Diaspora projects, religious traditions, national encyclopedias: Mondonomo produces the data, and you produce the narrative.

Publishers

The bilingual country-pair format demonstrates a working pipeline from database to printed book — replicable across new language pairs.

Field societies

The International Biblical Onomastic Society partners with Mondonomo on Named by God. Comparable arrangements available for other field societies.

Email research@mondonomo.ai

PAPERS & BOOKS

Quantitative onomastics, at civilizational scale.