Transcription Challenges in the Creation of the Pauper Petition Corpus

Blog post written by Anita Auer (29 May 2020)

In this week’s blog post, we want to give you some insight into the work that we have been doing over the last few weeks and what challenges we have been facing.

Before we tell you more about the challenges, here is some relevant background information. In the history of the English language, the sources at our disposal for linguistic investigation, i.e. texts written or published, were largely produced by the educated layers of society. As compulsory elementary schooling was only introduced in 1870, a great part of society did not receive (regular) schooling before then. The labouring poor may have been taught to read and write (some words and phrases) in Sunday schools or factory schools, and some taught themselves. As a result, literacy levels varied a lot before elementary schooling was introduced.

Within this context, pauper petitions are extremely valuable for historical (socio)linguists because they give us, for the first time in the history of the English language, access to written material produced by the lower strata of society. Before then, the language of the labouring poor had always been mediated by an educated person, e.g. a playwright would depict the language of lower-class characters in a play, or a court scribe would record the language of a person in depositions. In the case of the pauper petitions, we are dealing with the actual language of the paupers. It should, however, be pointed out that the writer of the petition was not always the applicant for relief him- or herself but sometimes somebody else who could write. This is important for the social (metalinguistic) information linked to the letters, but we will talk about this issue some other time. The investigation of the language of the labouring poor allows us to shed light on and to compare educational opportunities and language repertoires across social strata. Moreover, lower-class language may give us new insights into language variation and change in the history of English.

The basis for our investigations of the language of the labouring poor is a corpus of c. 2050 pauper petitions covering the period c. 1795-1835. It is therefore very important for us that the transcriptions we make are philologically accurate: line and page breaks are kept as in the original, the spelling of the words reflects the spelling in the original petition and is not standardised or otherwise edited, self-corrections are indicated, a change of hand is indicated, etc. An accurate reflection of language use at the time is essential for our analyses of linguistic levels such as spelling (including phonetic spelling, i.e. reflections of pronunciation), morphology and syntax (cf. Auer et al. 2014). To illustrate what such a philologically accurate transcription can look like, here is an extract from a petition from Buckinghamshire from the early 19th century:

sur i Wish you to take great Care

hoW you go out as thar is a great

Maney uerrey Bead Men and

Women going ahought 2 uerry

Shocking Murders has Been

Commited at this time plase to

tall My WiFe Not to go to

sall her things a Bought the

Cunterrey as i am a Fread

some misforten Will hapen to her

tall her i shall Come to see her

as soon as i Can Make My Money

From him

As you can see, the language differs from Standard written English as we know it today. This is not surprising if we consider that literacy levels varied a lot at the time (there was no compulsory elementary schooling then). One of our transcription challenges is the seemingly random capitalisation that you will notice in the extract, both word-initially and in the middle of words. When transcribing, which we currently do on the basis of facsimiles, we sometimes end up with different versions, e.g. some team members propose lower case while others propose upper case. We thus have some ambiguous cases. Some letter writers clearly only knew certain characters in either lower or upper case. This is reflected in the transcription, where we transcribe the character (upper or lower case) as it is written in the original.

Apart from the capitalisation issue, we also tend to have discussions about the spelling of <a/o>, <e/i> and <u/v> in some of the petitions. While it would be easier to simply go for today’s standard forms, we try to represent the spelling accurately because it can provide insight into language change over time. For example, the use of <u> instead of <v> in haue and other words was still common during the early part of the Early Modern English period (1500-1700) but then gradually disappeared (cf. Rutkowska 2016). If we were to find examples of <u> for <v> in the pauper petitions, this would suggest that the spelling survived in English much longer than previously assumed, notably in a text type that has not been systematically studied yet. More generally, we find many linguistic differences between printed language and hand-written documents like letters and diaries: some linguistic features that were undergoing change can no longer be found in printed language but are retained in hand-written documents.
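For readers who like to see this in practice, the kind of disagreement described above (upper vs. lower case, and <a/o>, <e/i>, <u/v>) can be flagged automatically when two team members transcribe the same line. The following is only a minimal sketch in Python and not the procedure our project actually uses: the function names are made up for illustration, the second reading of the line is invented, and the sketch assumes that both transcribers produced lines of the same length.

```python
# Letter pairs that tend to cause discussion; treated here as "known ambiguous".
AMBIGUOUS_PAIRS = {frozenset("ao"), frozenset("ei"), frozenset("uv")}


def classify(char_a, char_b):
    """Classify a one-character disagreement between two transcriptions."""
    if char_a.lower() == char_b.lower():
        return "case"    # same letter, only upper vs. lower case differs
    if frozenset((char_a.lower(), char_b.lower())) in AMBIGUOUS_PAIRS:
        return "letter"  # one of the <a/o>, <e/i>, <u/v> pairs
    return "other"


def compare_lines(line_a, line_b):
    """Return (position, char_a, char_b, kind) for every differing character.

    Simplifying assumption: both lines have the same length. If they do not,
    the whole line is reported as a single "length" disagreement.
    """
    if len(line_a) != len(line_b):
        return [(None, line_a, line_b, "length")]
    return [
        (i, a, b, classify(a, b))
        for i, (a, b) in enumerate(zip(line_a, line_b))
        if a != b
    ]


# Reading A is the line from the extract; reading B is a hypothetical
# alternative reading, invented purely for this example.
reading_a = "Maney uerrey Bead Men and"
reading_b = "maney verrey Bead men and"

for position, a, b, kind in compare_lines(reading_a, reading_b):
    print(position, a, b, kind)
```

A list like this does not decide which reading is correct; it only points the team to the characters that need another look at the facsimile.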

We carried out a haue vs. have poll on Twitter on 28 May 2020 to get an impression of whether other people have similar problems deciding on the spelling variant. This was certainly confirmed by the result. Here is the full extract of the petition from 1811 that we have used as an illustration:

We invite you to send us your transcription of this extract by posting a response on Twitter or by sending us a Twitter message. Many thanks for your help with this!

References

Auer, Anita, Mikko Laitinen, Moragh Gordon & Tony Fairman. 2014. An Electronic Corpus of Letters of Artisans and the Labouring Poor (England, c. 1750-1835): Compilation Principles and Coding Conventions. In Lieven Vandelanotte, Kristin Davidse, Caroline Gentens & Ditte Kemps (eds.), Recent Advances in Corpus Linguistics. Developing and Exploiting Corpora, 9–29. Amsterdam/New York: Rodopi.

Rutkowska, Hanna. 2016. Orthographic regularization in Early Modern English printed books. Grapheme distribution and vowel length indication. In Cinzia Russi (ed.), Current Trends in Historical Sociolinguistics, 165–193. Warsaw/Berlin: De Gruyter Open.