
Linguistics Data and Corpora

A corpus (plural: corpora) is, generally speaking, a substantive collection of language data processed into machine-readable form for research purposes. Corpora are sampled or collected under comparatively less controlled but more “ecological” conditions, which allows a wider range of research questions to be posed. Corpus linguistics methods have become popular due to the increased availability of corpora and statistical tools (Wallis, 2020, pp. 3-5).

This guide lists sources of linguistics datasets, along with tools and guides for the statistical analysis of linguistic data. Use the left-side menu to browse, and contact Ying Liu at yingliu(at)uvic.ca if you have questions or suggestions.

Wallis, S. (2020). Statistics in corpus linguistics research: A new approach (1st ed.). Routledge.

The Open Handbook of Linguistic Data Management by Andrea L. Berez-Kroeker, Bradley James McDonnell, Eve Koller, and Lauren B. Collister (Editors)
ISBN: 0-262-36607-X

Publication Date: 2021

The SAGE Handbook of Qualitative Data Analysis by Uwe Flick (Editor)
ISBN: 9781446208984

Publication Date: 2013-12-27

This handbook is the first to provide a state-of-the-art overview of the whole field of qualitative data analysis (QDA), from general analytic strategies used in qualitative research to approaches specific to particular types of qualitative data, including talk, text, sounds, images, and virtual data.
