Resources

I enjoy building and sharing open computational tools, datasets, and code to support reproducible research in phonology and linguistic analysis.

Code

BUFIA-AR

An implementation of the Bottom-Up Factor Inference Algorithm (BUFIA) over autosegmental representations (AR), for learning tonal and phonotactic patterns from structured data.

  • Tonotactic learning over autosegmental graphs
  • Structured and interpretable representations
  • Experimental comparison with probabilistic learners

GitHub
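
To give a rough idea of the bottom-up search, here is a minimal sketch that works over plain tone strings rather than autosegmental graphs. It is not the code in the repository: the function name `bufia_strings`, the toy data, and the choice of contiguous substrings as factors are all assumptions made for illustration, and word boundaries are ignored.

```python
from itertools import product


def bufia_strings(data, alphabet, max_len):
    """Infer forbidden contiguous factors up to max_len, shortest first.

    Simplified, string-based illustration (hypothetical helper): a factor is
    banned when it never occurs in the data and no already-banned factor
    occurs inside it.
    """
    # Collect every attested substring of length 1..max_len.
    attested = {
        w[i:j]
        for w in data
        for i in range(len(w))
        for j in range(i + 1, min(len(w), i + max_len) + 1)
    }
    banned = []
    for n in range(1, max_len + 1):  # most general (shortest) factors first
        for combo in product(alphabet, repeat=n):
            factor = "".join(combo)
            if any(b in factor for b in banned):
                continue  # already excluded by a more general constraint
            if factor not in attested:
                banned.append(factor)
    return banned


# Toy tone strings (invented for this example, not taken from any dataset).
toy_data = ["H", "L", "HL", "LH", "HLH", "LHL"]
constraints = bufia_strings(toy_data, "HL", 3)
print(constraints)
```

On this toy input the search stops at the two-tone factors: longer candidates either occur in the data or already contain a banned factor, so no redundant constraints are added.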

Prosodic Extractor

A tool for extracting syllabic, segmental, and tonal structures from orthographic input and converting them into multi-tier representations.

  • Automatic extraction of prosodic structure
  • Multi-tier representation generation
  • Configurable language-specific mappings

GitHub
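
As an illustration of what a multi-tier representation can look like, the sketch below splits a tone-marked orthographic string into a segmental tier, a tonal tier, and association links. It is not the tool's actual interface: the function `extract_tiers`, the diacritic-to-tone mapping, the default tone, and the vowel set are assumptions, and syllabification and vowel length are left out.

```python
import unicodedata

# Hypothetical mapping from combining diacritics to tone labels; a real
# configuration would be language-specific.
TONE_MARKS = {"\u0300": "L", "\u0301": "H", "\u0302": "HL"}
VOWELS = set("aeiou")


def extract_tiers(word, default_tone="H"):
    """Return segmental and tonal tiers plus (segment, tone) association links."""
    segments, tones, links = [], [], []
    last_vowel_tone = None  # index of the tone linked to the preceding vowel
    for ch in unicodedata.normalize("NFD", word.lower()):
        if ch in TONE_MARKS:
            # A tone mark overrides the default tone of the vowel it follows.
            if last_vowel_tone is not None:
                tones[last_vowel_tone] = TONE_MARKS[ch]
        elif unicodedata.combining(ch):
            continue  # skip diacritics that are not in the tone mapping
        else:
            segments.append(ch)
            if ch in VOWELS:
                tones.append(default_tone)
                last_vowel_tone = len(tones) - 1
                links.append((len(segments) - 1, last_vowel_tone))
            else:
                last_vowel_tone = None
    return {"segments": segments, "tones": tones, "links": links}


# Illustrative input string, not an item from any dataset.
tiers = extract_tiers("kàta")
print(tiers)
```

Keeping the tone mapping in a plain dictionary is one simple way to make the language-specific part configurable without touching the extraction logic.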


Datasets / Corpora

Hausa Tonotactic Dataset

A curated dataset of Hausa lexical items annotated with tonal structure, used in computational learning experiments.

  • 600+ lexical items

Download

