ARTICLE
TITLE

Web Corpora of Volga-Kama Uralic Languages

SUMMARY

This paper presents corpora of five minority Uralic languages that belong to or are adjacent to the Volga-Kama area, which has been characterized as a Sprachbund (Bereczki 1983, Helimski 2003). A total of 11 corpora contain written and, in one case, spoken texts in Udmurt, Komi, Meadow Mari, Erzya and Moksha. The described resources are “web corpora” both in terms of their accessibility (all of them are accessible through a web-based query interface) and, in most cases, in terms of their medium (almost all texts come from web resources, such as digital newspapers and social media). The paper describes the corpora from the user's perspective, focusing on their search capabilities and on research questions that can be studied with their help. All corpora are available at http://volgakama.web-corpora.net/.
