- Amharic WIC Corpus
  Substantially cleaned version of the existing morphologically annotated WIC Corpus.
- Somali Web Corpus
  Somali web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.
- Tigrinya Web Corpus
  Tigrinya web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.