Resources
I build and share open tools, datasets, and code to support reproducible research in phonology and linguistic analysis.
Code
BUFIA-AR
An implementation of the Bottom-Up Factor Inference Algorithm (BUFIA) over autosegmental representations, used to learn tonal and phonotactic patterns from structured data.
- Tonotactic learning over autosegmental graphs
- Structured and interpretable representations
- Experimental comparison with probabilistic learners
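To give a feel for the bottom-up search, here is a minimal sketch of the idea on plain tone strings rather than autosegmental graphs. It is a simplification for illustration, not the BUFIA-AR code: the function name `infer_forbidden_factors`, the substring-based factor relation, and the toy data are all assumptions. The search starts from the empty factor, extends attested factors by one symbol, and records the most general unattested factors as constraints.

```python
from collections import deque


def infer_forbidden_factors(data, alphabet, max_len):
    """Breadth-first, bottom-up search for minimal unattested substrings.

    Hypothetical helper: factors are contiguous substrings of tone strings,
    a stand-in for subgraphs of autosegmental representations.
    """
    constraints = []
    queue = deque([""])  # start from the most general (empty) factor
    seen = {""}
    while queue:
        cand = queue.popleft()
        # Skip candidates already ruled out by a more general constraint.
        if any(c in cand for c in constraints):
            continue
        if any(cand in word for word in data):
            # Attested factor: explore its one-symbol extensions.
            if len(cand) < max_len:
                for sym in alphabet:
                    for nxt in (cand + sym, sym + cand):
                        if nxt not in seen:
                            seen.add(nxt)
                            queue.append(nxt)
        else:
            # Unattested and not subsumed: record it as a constraint.
            constraints.append(cand)
    return constraints


# Toy data (invented): tone strings in which adjacent tones always differ.
toy_data = ["H", "L", "HL", "LH", "HLH", "LHL"]
forbidden = infer_forbidden_factors(toy_data, alphabet="HL", max_len=3)
print(forbidden)
```

On this toy data the search returns `HH` and `LL` and never proposes longer factors that contain them, which is what keeps the learned grammar small and readable.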
Prosodic Extractor
A tool for extracting syllabic, segmental, and tonal structures from orthographic input and converting them into multi-tier representations.
- Automatic extraction of prosodic structure
- Multi-tier representation generation
- Configurable language-specific mappings
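As a rough illustration of tier extraction, the sketch below splits an orthographic form into a segmental tier and a tonal tier. It is not the Prosodic Extractor's actual interface: the function `extract_tiers`, the diacritic-to-tone mapping, the default high tone on unmarked vowels, and the choice to treat every vowel letter as one tone-bearing unit are assumptions made for the example. Syllabification is left out.

```python
import unicodedata

# Assumed mapping from combining diacritics to tone labels
# (grave = low, acute = high, circumflex = falling).
DEFAULT_TONE_MARKS = {"\u0300": "L", "\u0301": "H", "\u0302": "HL"}


def extract_tiers(word, vowels="aeiou", tone_marks=DEFAULT_TONE_MARKS,
                  default_tone="H"):
    """Return segmental and tonal tiers for one orthographic word.

    Hypothetical helper: each vowel letter is one tone-bearing unit, and
    vowels without a tone diacritic receive `default_tone`.
    """
    segments, tones = [], []
    # NFD separates base letters from their combining diacritics.
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch):
            # A tone mark overrides the default tone of the vowel before it;
            # other combining marks (e.g. a length macron) are ignored here.
            if ch in tone_marks and segments and segments[-1] in vowels:
                tones[-1] = tone_marks[ch]
            continue
        segments.append(ch)
        if ch in vowels:
            tones.append(default_tone)
    return {"segmental": segments, "tonal": tones}


# Illustrative input only, not an item from any dataset.
tiers = extract_tiers("b\u00e0b\u00e2ta")
print(tiers)
```

Passing a different `tone_marks` dictionary or `vowels` string is one simple way to swap in language-specific mappings without changing the extraction loop.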
Datasets / Corpora
Hausa Tonotactic Dataset
A curated dataset of Hausa lexical items annotated with tonal structure, used in computational learning experiments.
- 600+ lexical items