Opening talk

Corpus, context, culture: histories and futures
Professor Ronald Carter, University of Nottingham

This opening talk reviews some key developments in corpus linguistics over the past 40 years and considers what we might learn from them, in both theory and practice, for the future development of the field. The principal focus is on the main themes of the conference, and examples are given which suggest how corpus linguistics can continue to address issues of interdisciplinary applications and involve other fields of study, with particular reference to language learning, stylistics and professional and internet communication. Issues of context and culture will be explored with reference to spoken corpus data and to a recent research collaboration between Nottingham University and Cambridge University Press in the development of an e-language corpus (the CANELC corpus) where (sometimes multimodal) language use is marked by forms associated with both spoken and written grammars. Exploration of data from this and other corpora suggests the need for a fuller recognition of key notions of 'emergent' cultures, 'dynamic' contexts and mixed research methods appropriate to different data streams.



Keynote speakers

Place-making in Brooklyn, New York
Professor Beatrix Busse, University of Heidelberg, Germany

In the model of urbanity (Busse and Warnke 2014), language and other semiotic modes function as parameters both of urban variation and of processes of urban variational place-making to create identity, belonging, attachment and place. These discursive practices are always historically motivated and mobile, and their linguistic patterning and social styling are highly declarative. In this paper, my focus will be on selected – so far mostly gentrified – neighbourhoods in Brooklyn, New York. I shall investigate contemporary and historical linguistic and multimodal means of reflecting on people’s sense of belonging as place-making activities (Cresswell 2004, Friedmann 2010) in Park Slope, Williamsburg and Brooklyn Heights. I will show when, how, and why not just dialect features, but practices on all levels of language as well as repetitive linguistic and semiotic patterns index, that is, “enregister” (Johnstone 2009), social value and construe Brooklyn as a brand. These include, for example, comparisons and negation, and range around interplaying themes such as generic reference to Brooklyn as a whole, a contrast between Brooklyn and Manhattan, and a focus on aspects of home and on art. Methodologically, I will show that investigations of urban place-making are in need of (more) corpus-assisted approaches and, at the same time, should combine these with qualitative methodologies which also use more disparate linguistic practices or multimodal artefacts as their objects of analysis. Therefore, the data to be analysed will be a corpus of semi-structured interviews I conducted with Brooklynites in February 2012 and – among others – narrative fiction, (historical) newspaper discourse and the semiotic landscape in these neighbourhoods.


  • Busse, Beatrix and Ingo H. Warnke. 2014. “Sprache im urbanen Raum – Konzeption und Forschungsfelder der Urban Linguistics.” In: Handbuch Sprachwissen. Ed. Ekkehard Felder and Andreas Gardt. Berlin: de Gruyter Mouton.
  • Cresswell, Tim. 2004. Place: A Short Introduction. Oxford: Blackwell.
  • Friedmann, John. 2010. “Place and Place-Making in Cities. A Global Perspective.” Planning Theory & Practice 11.2: 149-165.
  • Johnstone, Barbara. 2009. “Pittsburghese Shirts: Commodification and the Enregisterment of an Urban Dialect.” American Speech 84.2: 157-175.




The Contexts and Cultures of Interdisciplinary Research Discourse
Professor Susan Hunston, University of Birmingham

Academic discourse has long been a key focus of research in corpus linguistics as well as other forms of discourse analysis, and for good reason. There is a practical value in investigating a form of language that is crucial for students and professional users of English. Studies of academic discourse interrogate and demonstrate the link between language, culture and context. Academic discourse contributes to the social construction of knowledge. For the most part, studies have focused on distinct academic disciplines, for example contrasting features associated with stance. In the study reported in this paper we focus on interdisciplinary (ID) research discourse. We see this as an area of importance because ID research has a high priority in effecting societal change, yet ID discourse is often reported as constituting a site of difficulty. (‘We talk different languages.’) Aside from this utilitarian value, investigating ID texts, as opposed to those focused on a single discipline, has the potential to shake things up, methodologically and conceptually. Whereas there are established ways of comparing disciplinary discourses, methods of looking at an ID field are open to innovation.
This paper presents three aspects of our study so far, and considers the contribution these might make to the broader study of discourse using corpus linguistics. The first is corpus design, including the relationship between a corpus and a community of practice, and the role of an interdisciplinary journal in constructing that community. The paper considers the usefulness of a topographic as opposed to a typographic representation of disciplines. Secondly, the paper considers the question of sameness or change in meaning, using phrases that include the wordform ‘change’ as an example and considering to what extent this is a contested site. Finally, the paper examines aspects of stance marking in selected texts.


The Corpus as Social History - Prostitution in the Seventeenth Century
Professor Tony McEnery, University of Lancaster

To what extent can corpora illuminate the past? Can the close readings traditionally associated with the study of history gain from the tools and methods of corpus analysis? In this talk I will discuss work I have undertaken with historian Dr. Helen Baker looking at an issue in social history - prostitution in seventeenth-century England. Social history in particular represents an interesting topic where the corpus might contribute - while the documentary sources and analyses associated with major historical events and figures are typically many and well analysed, the documents associated with the everyday, the unexceptional, are more sparse. In the case of marginalised or criminalised groups the documentary evidence outside of court proceedings is widely scattered and typically indirect. Prostitutes (we deal only with female prostitutes in this talk) are a good example of such a marginalised group - indeed in such a case the marginalisation is enhanced by class and gender as well as criminality. I begin by considering what social historians have claimed about prostitution in this period. I then move to look at what the corpus shows us, using the latest version of the EEBO corpus available at Lancaster University - 1.5 billion words of lemmatised, POS-tagged and spelling-regularised written texts from the 15th, 16th, 17th and 18th centuries. Using corpus techniques to explore the texts, I show what a corpus may show the historian about prostitution in the period - and what historians can offer to corpus linguists who are approaching texts from this period.

Corpus research for SLA: The importance of mixing methods
Dr. Ute Römer, Georgia State University

Over the past few decades, the growing availability of native speaker and learner corpora has enabled Second Language Acquisition (SLA) researchers to study patterns in the linguistic input of learners as well as in their language output in a more empirical and systematic way than previously possible. Corpus linguists have contributed considerably to a better understanding of central aspects of second language (L2) learners’ production and of differences between learner and native speaker English (see, for example, the influential work of Sylviane Granger and her team, and of Stefan Gries and Stefanie Wulff). This talk argues that corpus linguistics has a lot to offer to research and practice in SLA, especially if different methods and data types are combined in a “methodological pluralism” sense (McEnery & Hardie 2012). The talk also suggests that progress in corpus-based SLA research will depend to some extent on successful collaborations between corpus linguists and scholars from other fields. After a brief overview of some existing uses of corpora in SLA, the talk will present findings from two case studies that benefited from mixing methods and from the presenter’s collaboration with researchers from neighboring disciplines, including a computational linguist, a genre expert, a psycholinguist, and a cognitive linguist.
While not based on learner output, the first study examines language that can serve as a model to learners and has clear implications for second language teaching practice (see Wulff, Römer & Swales 2012). It combines quantitative and qualitative approaches to the distribution of attended and unattended “this” in successful student writing across disciplines and determines what learners need to know about “this” in this particular writing context. The second study examines the role that constructions play in second language acquisition. It uses data from a native speaker corpus, learner corpora, and psycholinguistic experiments to investigate what influences L2 learners’ acquisition and processing of English verb-argument constructions (see Ellis, O’Donnell & Römer 2013; Römer, O’Donnell & Ellis Forthcoming; Römer, Roberson, O’Donnell & Ellis 2014). The talk discusses the implications of both studies before closing with thoughts on desiderata and future avenues for corpus-based SLA research.


  • Ellis, N. C., M. B. O’Donnell & U. Römer. 2013. Usage-based language: Investigating the latent structures that underpin acquisition. Language Learning 63(Supp. 1): 25-51.
  • McEnery, T. & A. Hardie. 2012. Corpus Linguistics. Method, Theory and Practice. Cambridge: Cambridge University Press.
  • Römer, U., M. B. O’Donnell & N. C. Ellis. Forthcoming. Using COBUILD grammar patterns for a large-scale analysis of verb-argument constructions: Exploring corpus data and speaker knowledge. In: Nicholas Groom, Maggie Charles & Suganthi John (eds.). Corpora, Grammar and Discourse: In Honour of Susan Hunston. Amsterdam: John Benjamins.
  • Römer, U., A. Roberson, M. B. O’Donnell & N. C. Ellis. 2014. Linking learner corpus and experimental data in studying second language learners’ knowledge of verb-argument constructions. ICAME Journal.
  • Wulff, S., U. Römer & J. M. Swales. 2012. Attended/unattended this in academic student writing: Quantitative and qualitative perspectives. Corpus Linguistics and Linguistic Theory 8(1): 129-157.

Building onto the corpus-driven approach: a wider look on meaning
Professor Wolfgang Teubert, University of Birmingham

What makes the corpus-driven approach stand out in language studies is its appeal as a ‘scientific’ methodology. Using computational tools to identify, count and measure real language data, we obtain dependable findings. Scientific practice, however, is no different from any other social practice: it is discursively constructed. In the absence of a ‘real’ foundation, there cannot be a ‘true’ bottom-up approach. All corpus research presupposes a consensus on the arbitrary decisions underlying our research question, and the findings obtained have to be interpreted to make sense.
Meaning is found only in discourse. In my investigation of the discourse object ‘human rights,’ I will move from ambiguous collocation profiles to what texts actually say about this object by assigning a meaning to this lexical item. The meaning of human rights is, as I see it, the entirety of what is said about this lexical item, i.e. of all the paraphrases we find in discourse. Yet what counts as a paraphrase is a matter of interpretation. The corpus-driven approach offers candidates we can accept or reject. The study of paraphrastic content is thus a necessary extension of traditional corpus linguistics. It combines a methodological approach with an interpretive endeavour that is free from methodical constraints.
In order to make sense of human rights in a specific text, or text segment, we have to uncover its intertextual links, thus revealing how it differs from what has been said before. We will not understand what human rights means in a specific context unless we have analysed those links. Texts thus can be seen as the nodes of dynamic, diachronically evolving networks held together by intertextual links. Therefore corpus linguistics has to concern itself with the diachronic dimension of discourse if it is to pave the way for interpreting what a given text (segment) or a lexical item means. Again, there is no method to capture intertextuality – it is up to the arbitrary decisions of an interpretive community.
Language is not a natural phenomenon; it is a cultural artefact. Linguistics, including the corpus-driven approach, belongs to the human sciences.

