This data set consists of twenty Excel spreadsheets (.xlsx format), one for each language represented in the Oto-Manguean Inflectional Class Database. Each spreadsheet provides a list of verbs, together with information on the inflectional exponents of these verbs and details of their membership in inflectional classes. Each row in a spreadsheet is a unique record containing the details for a single verb. Each column is a field common to all records for that language. Each file can be used as a stand-alone data set in its own right, given that each language is a unique system and each file encodes different information for each language.
Grammatical meaning is expressed by changes in word form (inflection): for example, in English, the ending -s is added to a verb to show that it has a third person singular subject in the present tense. This is fairly simple. In other languages the systems are more intricate. In Spanish, the ending for 'we' in 'we sing' (cant-amos) differs from the one in 'we want' (quer-emos), which in turn differs from the one in 'we open' (abr-imos). Such inflectional classes are apparently useless: the variation is independent of meaning and must simply be memorised. Yet these systems are widely found, highly structured, and remarkably resilient over time. They represent a unique sort of complexity with implications for theories of language and mind. Our knowledge of inflectional classes is largely limited to European languages. We move beyond this, focusing on the Oto-Manguean languages of Mexico, numbering about 200, many of which are threatened. They have a rich array of suffixes, prefixes, complex tonal patterns and stem alternations, which can co-occur in a single word form. This results in the interaction of multiple layered inflectional classes, drastically increasing complexity. These languages provide important evidence of the degree of inflectional complexity a language can tolerate.
The data were collected using three methods: (i) primary data which we collected in the field; (ii) primary data from our collaborators; and (iii) secondary data from both published and unpublished sources. Data for Tilapa Otomi were collected by Enrique Palancar over the course of several field trips to Tilapa, Mexico, where he worked with an elderly woman who is one of the few remaining speakers of the language. Data for the following languages were generously made available to us by our collaborators: Acazulco Otomi (Néstor Hernandez-Green); Matlatzinca (Leonardo Carranza); San Pedro Amuzgos Amuzgo (Yuni Kim and Fermín Tapia); Yaitepec Chatino (Jeffrey Rasch); Yoloxóchitl Mixtec (Jonathan Amith and Rey Castillo García); and Zenzontepec Chatino (Eric Campbell). Data for the remaining thirteen languages (as well as additional data for San Pedro Amuzgos Amuzgo) were extracted from various published and unpublished sources, including PhD theses, dictionaries, grammars and published articles. The references for all the sources we used are given in the "References.txt" file. Where a publication is also available in electronic format, the URL is provided under "Publications" in the "Related resources" section below.
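Because each spreadsheet holds one verb per row and has its own set of columns, a user working with several languages may first want to see which fields are shared and which are language-specific. The following is a minimal sketch of one way to do that, not a tool supplied with the data set. It assumes that pandas and openpyxl are installed, that the verb records sit on the first worksheet with headers in the first row, and that the files are in a single folder; the function names and any column names used with them are hypothetical.

```python
from pathlib import Path


def load_language_tables(directory):
    """Read every .xlsx file in `directory` into a table keyed by file name.

    Assumption: pandas and openpyxl are installed, and the verb records are
    on the first worksheet with column headers in the first row.
    """
    import pandas as pd  # imported here so compare_fields works without pandas

    return {
        path.stem: pd.read_excel(path, sheet_name=0)
        for path in sorted(Path(directory).glob("*.xlsx"))
    }


def compare_fields(columns_by_language):
    """Split field names into those found in every language and the rest.

    `columns_by_language` maps a language name to its list of column names.
    Returns (shared_fields, {language: language_specific_fields}).
    """
    field_sets = {lang: set(cols) for lang, cols in columns_by_language.items()}
    shared = set.intersection(*field_sets.values()) if field_sets else set()
    specific = {lang: sorted(cols - shared) for lang, cols in field_sets.items()}
    return sorted(shared), specific
```

With the files in a folder (the folder name is a placeholder), the two functions could be combined as `tables = load_language_tables("data")` followed by `compare_fields({name: list(df.columns) for name, df in tables.items()})`. Since each file encodes different information, the list of shared fields may well be short or empty.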