Workshop 2B - 1:45-3:30

A variationist playbook: Best practices in corpus constitution, exploitation, handling, analysis and interpretation

University of Ottawa Sociolinguistics Laboratory

Armed with a research question, what do you do next? Drawing on years of experience in developing and refining methods for constructing massive compendia of vernacular speech, we aim in this workshop to share tried-and-true protocols for each step of a variationist analysis, along with tips on avoiding unforeseen pitfalls along the way. While there is no one-size-fits-all approach to quantitative research, we illustrate with the techniques we’ve found most helpful in our own studies of language variation and change. Topics covered will include corpus constitution, efficient transcription protocols, data computerization and manipulation, software use, and best practices in data analysis.

Learn how to design and build a corpus that addresses a specific research question. Extend your sample stratification far beyond standard social categories (age, sex, etc.) by constructing content-based indices most relevant to the particular axes of variation in your speech community. Find out how to exploit unconventional data sources, tapping into the possibilities of bilingual, enclave and minority varieties. Expand the diachronic axis of changes in progress beyond apparent time by appealing to the rich historical resources of popular theatre, personal correspondence, folklore recordings and even prescriptive grammars!
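As a toy illustration of a content-based stratification index: each speaker is rated on locally relevant dimensions gleaned from interview content, and the summed score assigns the speaker to a stratum. All dimension names, scores and cut-offs below are invented for illustration; the appropriate dimensions depend entirely on the axes of variation in the speech community under study.

```python
# Hypothetical content-based index: speakers are scored on several
# community-relevant dimensions, and the scores are summed into a
# single stratification index.
components = ["local_ties", "occupation_orientation", "network_density"]

# Illustrative ratings (0-3) assigned from interview content.
speakers = {
    "S01": {"local_ties": 2, "occupation_orientation": 1, "network_density": 3},
    "S02": {"local_ties": 0, "occupation_orientation": 0, "network_density": 1},
}

def index_score(scores):
    """Sum a speaker's ratings across all index components."""
    return sum(scores[c] for c in components)

def stratify(score, cutoffs=(2, 5)):
    """Bucket a summed score into low/mid/high strata (cut-offs illustrative)."""
    lo, hi = cutoffs
    return "low" if score <= lo else "mid" if score <= hi else "high"

for speaker, scores in speakers.items():
    total = index_score(scores)
    print(speaker, total, stratify(total))
```

The point of the sketch is that the index is built from interview content itself, so it can be recomputed or re-weighted as the analysis matures, rather than being fixed at the sampling stage like age or sex.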

Build an adaptable token file that works for you by exploiting the features of software such as Microsoft Excel. Variable linguistic phenomena for which a traditional variable cannot be defined (e.g. bilingual code-switching) highlight the analytical strengths of spreadsheet programs. Discover time-saving extraction procedures, techniques for dealing with rare variants, and how to distinguish form from variable context. We’ll demonstrate how to operationalize claims in the literature as factors in a multivariate analysis, focusing on the essential link between coding and hypothesis-testing as well as methods for maximizing the reliability of coding procedures.
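The coding step above can be sketched in miniature: each row of the token file is one token of the variable, coded for the variant realized plus the factor groups that operationalize hypotheses from the literature, and variant rates can then be tallied per factor level before a full multivariate run. The variable, factor names and data below are invented purely for illustration.

```python
import csv
from collections import Counter
from io import StringIO

# Hypothetical token file: one row per token of a binary variable,
# coded for the variant realized and two hypothesized factor groups.
token_file = StringIO("""speaker,variant,following_segment,style
S01,deleted,consonant,casual
S01,retained,vowel,casual
S02,deleted,consonant,careful
S02,retained,consonant,careful
S03,deleted,consonant,casual
S03,retained,vowel,careful
""")

tokens = list(csv.DictReader(token_file))

def rate_by_factor(tokens, factor, target="deleted"):
    """Rate of the target variant within each level of one factor group."""
    hits = Counter((t[factor], t["variant"]) for t in tokens)
    totals = Counter(t[factor] for t in tokens)
    return {level: hits[(level, target)] / totals[level] for level in totals}

print(rate_by_factor(tokens, "following_segment"))
# → {'consonant': 0.75, 'vowel': 0.0}
```

Because every factor column corresponds to a claim to be tested, adding or recoding a column is cheap, which is what makes the spreadsheet-style token file adaptable as hypotheses evolve.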

Finally, we will discuss why initial variable rule results are best conceived as a series of exploratory manoeuvres which can be used to hone your analysis. From interactions to crossovers to lexical epiphenomena, preliminary runs can be riddled with problems. Using techniques such as parallel runs, cross-tabulations, frequency information and trends in rates and conditioning, we’ll show how to work your way through to significant and reliable results.
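One of the diagnostics mentioned above, the cross-tabulation, can be sketched as follows: tabulating variant rates in each cell of two crossed factor groups shows at a glance whether an apparent effect of one group is carried by a single level of the other, i.e. a candidate interaction. Data and factor names are invented for illustration.

```python
from collections import Counter

# Hypothetical coded tokens: (variant, following_segment, style).
tokens = [
    ("deleted",  "consonant", "casual"),  ("deleted",  "consonant", "casual"),
    ("deleted",  "consonant", "careful"), ("retained", "consonant", "careful"),
    ("retained", "vowel",     "casual"),  ("deleted",  "vowel",     "casual"),
    ("retained", "vowel",     "careful"), ("retained", "vowel",     "careful"),
]

def cross_tab(tokens, target="deleted"):
    """Target-variant count, cell N and rate for each cell of the two
    crossed factor groups."""
    hits = Counter((seg, sty) for var, seg, sty in tokens if var == target)
    ns = Counter((seg, sty) for _, seg, sty in tokens)
    return {cell: (hits[cell], ns[cell], hits[cell] / ns[cell])
            for cell in sorted(ns)}

for cell, (k, n, rate) in cross_tab(tokens).items():
    print(cell, f"{k}/{n} = {rate:.2f}")
```

If the rate gradient across styles differs by following segment, the two factor groups are not independent, which is exactly the kind of problem a preliminary multivariate run can mask and a cross-tabulation exposes.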