The Parsed Corpus of Early English Correspondence (PCEEC) contains 4970 personal letters by 666 writers, altogether 2.2 million words of running text from the years 1410-1681. The letters have been selected to be as socially representative as possible of the literate social ranks of the time. In addition to the flat text version, the corpus is available in a part-of-speech tagged version and a parsed version. These two versions contain the same texts as the flat text version, together with the additional linguistic coding. The corpus also comes with two manuals, one outlining the corpus and the other explaining the annotation.
Nevalainen, Terttu and Helena Raumolin-Brunberg. 2003. Historical Sociolinguistics. London: Longman.
Nevalainen, Terttu and Helena Raumolin-Brunberg (eds). 1996. Sociolinguistics and Language History: Studies Based on the Corpus of Early English Correspondence. (Language and Computers 15). Amsterdam and Atlanta: Rodopi.
The Corpus of Early English Correspondence Sampler (CEECS, identification number 2461), published in 1998 and deposited in the University of Oxford Text Archive in 2003, is a flat text version of some of the texts included in PCEEC. The full Corpus of Early English Correspondence (CEEC) was completed in 1998. It contains texts which for copyright reasons are not included in either CEECS or PCEEC, but which are available in digitised form for in-house use by the CEEC project team. The CEEC is being supplemented by an extension (CEECE, 1682-1800) and a supplement (1403-1681); these two corpora are still being compiled and are in in-house use in Helsinki.
For more information, see Raumolin-Brunberg, Helena & Terttu Nevalainen (2007). “Historical sociolinguistics: The Corpus of Early English Correspondence.” In: Creating and Digitizing Language Corpora, Volume 2: Diachronic Databases, ed. by Joan C. Beal, Karen P. Corrigan & Hermann L. Moisl, 148-171. Houndmills: Palgrave Macmillan.