SUMMARY
Open science is based on freely and openly available scientific publications and data. The latter enable the verification and improvement of previous research. In the context of language technologies and manually annotated language resources, they also enable the training of new text processing tools. However, just like scientific publications, research data need to be properly cited: only proper citation makes experiments reproducible, and citations are the most important indicator of how interesting and useful researchers' work is to the community, playing a major role in their success with research grant proposals and in their career trajectories. In this paper, we survey the landscape of linguistic data citation (mainly of language corpora) in six leading Slovene scientific journals (Jezik in slovstvo, Slavistična revija, Slovenščina 2.0, Linguistica, Slovene Linguistic Studies and Jezikoslovni zapiski) and in the proceedings of two scientific conferences focused on linguistics (Jezikovne tehnologije in digitalna humanistika and Obdobja) for the seven-year period from 2013 to 2019. We consider 1,074 papers and analyse the results both quantitatively and qualitatively. From the quantitative perspective, we show that, overall, only about a fourth of the papers include the use of language resources, and that in the later period (2018–2019) the use of language resources is over twice as frequent as in the earlier period (2013–2017). We classify the manner of language resource citation into five categories (e.g. citing the hyperlink in the text or citing the key paper about the resource) and show that how a resource is cited depends, to a large extent, on the instructions for authors of the particular publication. Our qualitative analysis focuses mainly on resources deposited in the repository of the CLARIN.SI research infrastructure and shows that these are, with few exceptions, incorrectly cited.
We summarise the findings using the so-called Austin principles, show what has already been achieved within the CLARIN.SI infrastructure, and propose guidelines for citing linguistic research data, together with ways to implement them.