UK Archives Discovery Forum, 7th March 2013. An open old↔modern surname index for digitized archives. Patrick Hanks, UWE; Sean Cunningham, TNA; Paul Cullen,


1 UK Archives Discovery Forum, 7th March 2013. An open old↔modern surname index for digitized archives. Patrick Hanks, UWE; Sean Cunningham, TNA; Paul Cullen, UWE; Richard Coates, UWE

2 Theme of the talk
- Of all linguistic and historical data, surnames are among the most unstable.
- Digitization of archives presents an opportunity for new approaches to the statistical study of issues such as the relationship between surnames and localities.
- This implies new approaches to the transcription of data (rigorously respecting the distinction between transcription and interpretation).
- We have some suggestions to make regarding the UKAD objective of developing international agreement on standards for mark-up and database structure.

3 Overview of the FaNUK project
- Family Names of the United Kingdom (AHRC-funded)
- An XML database
- List of entries from the 1997 electoral roll
- Explanations of origins and history (in progress)
- 1881 census: geographical distribution of surnames before mass mobility and increased immigration
- Select a “main entry” for each cluster of spellings, on the basis of etymology and frequency
- 20,000 main entries; 25,000 modern variant spellings
- Being linked to innumerable medieval spellings
- Medieval name forms could be linked to the main entries in the FaNUK database
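The “main entry” selection can be pictured with the minimal sketch below. It assumes the cluster of spellings has already been grouped by etymology (the expert step) and handles only the frequency part, picking the most frequent spelling as the headform. The function name `choose_main_entry` and the bearer counts are invented for illustration; they are not FaNUK code or FaNUK figures.

```python
def choose_main_entry(cluster_counts):
    """Return the most frequent spelling in a cluster.

    cluster_counts maps each spelling to a number of bearers.
    Ties are broken alphabetically so the result is deterministic.
    """
    if not cluster_counts:
        raise ValueError("empty cluster")
    # Highest count first; alphabetical order among equal counts.
    return min(cluster_counts, key=lambda s: (-cluster_counts[s], s))


# Hypothetical counts, NOT real census or electoral-roll figures.
cluster = {"Pegden": 300, "Pagden": 200, "Pigden": 50}
main_entry = choose_main_entry(cluster)
variants = sorted(s for s in cluster if s != main_entry)
```

In practice etymology may override raw frequency, so a real selection step would probably need a manual override as well.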

4 Interface between history, geography, and philology
- H. B. Guppy (1890) was the first to suspect a systematic association between surnames and localities.
- Guppy’s hypothesis has become increasingly statistically testable since c. 2009.
- Vast amounts of primary historical evidence are becoming available in machine-readable form.

5 Computational analysis of data
- We are applying techniques developed in corpus linguistics to the study of this primary onomastic data, e.g. to measure the statistical association of surnames with localities on the basis of the primary evidence.
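As an illustration of what a corpus-linguistic association measure looks like when applied to names, the sketch below computes pointwise mutual information (PMI) between surnames and localities. PMI is chosen here only as a familiar example; the talk does not say which measure the project uses. The function name `pmi_table` and the toy records are invented.

```python
import math
from collections import Counter


def pmi_table(records):
    """Compute PMI (in bits) for each observed (surname, locality) pair.

    records is an iterable of (surname, locality) tuples, one per person.
    A positive score means the surname occurs in that locality more often
    than it would if surnames and localities were independent.
    """
    records = list(records)
    n = len(records)
    pair_counts = Counter(records)
    name_counts = Counter(name for name, _ in records)
    place_counts = Counter(place for _, place in records)
    return {
        (name, place): math.log2(
            (count / n) / ((name_counts[name] / n) * (place_counts[place] / n))
        )
        for (name, place), count in pair_counts.items()
    }


# Toy records, invented for illustration only.
toy = (
    [("Pegden", "Sussex")] * 3
    + [("Pegden", "Kent")]
    + [("Smith", "Sussex")] * 2
    + [("Smith", "Kent")] * 2
)
scores = pmi_table(toy)
```

With real data, low-frequency pairs inflate PMI, so a frequency threshold or a different statistic would probably be needed.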

6 Pegden and variants: 1881 distributions (Steve Archer’s British Surname Atlas). Locative name from Pegden Farm in Lindfield (Sussex). [Distribution maps: Pegden, Pagden, Pigden]

7 Rochester: 1881 distribution (Steve Archer’s British Surname Atlas)

8 Some currently available digitized sources (from TNA and elsewhere)
- Late 14th-century Poll Taxes (ed. C. Fenwick)
- Parish Registers
  – Digitized by members of the Federation of Family History Societies and the LDS Church (the IGI)
- PROB11 (probate records of the Canterbury Prerogative Court, 1383-1687)
- Chancery Proceedings (1386-1558)
- Feet of Fines (C12-1509): Chris Phillips
- Wonderful resources!
  – but, alas, there are slight differences in format, which make file comparison difficult
  – Do we need an agreement on a standard format?

9 IGI (193 million records)

10 14th-century Poll Taxes (200,000 records)

11 PROB11 (TNA)

12 An associated project (British Academy funding): Old↔modern surname index to the C14 Poll Tax
- There are significant problems in relating medieval forms to modern forms of surnames.
  – Many people assume that the relationships are obvious, but all too often they aren’t.
- In many cases, linguistic expertise (supported by circumstantial evidence) is required to make the connections with confidence.
- E.g. Yelling (a Somerset surname) must surely be from Yelling (a place-name in Hunts), even though Somerset is far from Hunts.
- Need for studies of early migration.

13 Example: Sapsford
- Clemens Sabrichworth’, 1381, in the Poll Tax (Colchester, Essex)
  – indexed by FaNUK as Sapsford (because that’s the usual modern form of the surname: 756 bearers)
  – The surname derives from Sawbridgeworth (Herts), which until recently was locally pronounced Sapsford or Saps(w)’th

14 Misidentifications
In the Kent Hundred Rolls Project (www.kentarchaeology.ac), 2006:
- Walwarecchare is modernized confidently but erroneously as Walmer. Should be Waldershare (near Deal).
- Middelton is modernized as Middleton. Should be Milton (Regis) (near Sittingbourne).
- Uppecham is modernized as Petham. Should be (East or Up) Peckham (near Maidstone).
- Stephani de Hokeregg is modernized as Stephen of Hucking. Should be Stephen of Hockeredge (i.e. Hockeredge near Cranbrook).
Better not to modernize at all than to modernize erroneously!

15 More misidentifications
In The Survey of Archbishop Pecham’s Kentish Manors 1283-85, ed. Kenneth Witney, Kent Records vol. 28 (Kent Arch. Soc., 2000), the medieval forms of names are rarely given. They are wrongly identified surprisingly often:
  – Ferur is rendered Ferryman (but the modern surname Ferrer = ‘smith’)
  – Ismonger is rendered Fishmonger (but modern Isemonger = ‘ironmonger’)
  – Yue is rendered Yew, as if referring to the tree (but modern Ive is from a personal name Ive)
  – Sewen is rendered Sweyn (but modern Sewin is from a pers. name Sawin)
  – Idoyn is rendered Jordan (but modern Iddon is from the fem. pers. name Idonea)
  – Cissor is sometimes rendered Sawyer, sometimes Tailor. Why?

16 Some other dubious interpretations
The Northumberland Lay Subsidy Roll of 1296, ed. Constance Fraser (Soc. of Antiquaries of Newcastle upon Tyne, 1968):
- Thomas ad Fontem has been modernized as Thomas Spring.
- BUT Fons / Fontem in medieval Latin normally represents the English surname that later became Well or Wells. (There is indeed a Middle English surname atte Spring, but it refers to residence near a plantation of saplings, not to water, and it would not be represented by ad Fontem.)

17 Transcribe first; then interpret/translate!
Even the simplest-looking case of a common occupational surname must be treated with caution. For example:
- “John Mercator” appears in dozens of medieval deeds in Canterbury Cathedral Archives (Kent). This used to be modernized in the online catalogue as John Merchant, until the archivist Elizabeth Finn noticed that a seal attached to one of these deeds gives the name as John Chapman.
- Molinarius may be the modern surname Miller or Milner, or Millward, ... or ...
- Faber ... Smith or Ferrer ... or?
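Since one Latin occupational term can stand for several English surnames, a lookup is safer if it returns a list of candidates and leaves the choice to further evidence (such as the seal in the Mercator case). The sketch below records only the pairings named on this slide; the names `LATIN_CANDIDATES` and `candidates` are hypothetical, and the lists are deliberately incomplete, as the slide’s trailing “or ...” indicates.

```python
# Candidate modern surnames for Latin occupational terms, taken only
# from the examples on the slide; the lists are not exhaustive.
LATIN_CANDIDATES = {
    "Mercator": ["Merchant", "Chapman"],
    "Molinarius": ["Miller", "Milner", "Millward"],
    "Faber": ["Smith", "Ferrer"],
}


def candidates(latin_form):
    """Return possible modern surnames for a Latin form.

    An empty list means "no known candidates": the form should be
    left as transcribed rather than modernized by guesswork.
    """
    # Return a copy so callers cannot mutate the shared table.
    return list(LATIN_CANDIDATES.get(latin_form, []))
```

For instance, `candidates("Mercator")` offers both Merchant and Chapman, whereas a form not in the table comes back with no suggestion at all.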

18 Concluding the problem
Surnames are unstable
  – over 50% of all current surnames are variants of some other name.
  – They need to be studied statistically.
  – The (highly variable) medieval and Tudor forms need to be indexed to a selected modern form of each name.
Transcription vs. interpretation:
1. Transcribe verbatim: “diplomatic” transcriptions are preferable; digitize. If not diplomatic, declare what level of interpretation is used.
2. Then (separately) translate/interpret the data; don’t just assume that the correct modernizations are ‘obvious’.
3. Place-name scholars take a similar view: see for example O. J. Padel, ‘Place-Names and Calendaring Practices’ in M. Hicks (ed., 2012): The 15th-century Inquisitions Post Mortem: A Companion. Boydell, Woodbridge.
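The separation of transcription from interpretation can be pictured as two distinct record types, so that an interpretation can be revised without touching the verbatim text. This is a minimal sketch only: the class and field names are assumptions made for illustration, not a proposed standard, and the example values come from the Sapsford slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transcription:
    """Verbatim record of what the source says; immutable once created."""
    text: str                    # name form exactly as transcribed
    source: str                  # e.g. "Poll Tax"
    year: int
    place: str
    level: str = "diplomatic"    # declared level of interpretation


@dataclass
class Interpretation:
    """A separate, revisable claim about a transcription."""
    transcription: Transcription
    modern_headform: Optional[str] = None   # None = not yet interpreted
    rationale: str = ""


record = Transcription("Clemens Sabrichworth’", "Poll Tax", 1381, "Colchester, Essex")
reading = Interpretation(
    record,
    modern_headform="Sapsford",
    rationale="from Sawbridgeworth (Herts), locally pronounced Sapsford",
)
```

Making the transcription frozen while leaving the interpretation mutable mirrors the slide’s ordering: the first is fixed evidence, the second is scholarly judgement that may change.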

19 Next Steps
In summer 2012, UWE and TNA agreed a partnership for a three-year project to create a digital resource to:
  – Identify names in selected name-rich historical records
  – Link early name forms to reliable headforms
  – Begin indexing and linguistic interpretation
Original plan: to analyse, index and compare three large datasets:
  – Fenwick’s 1377-81 Poll Tax returns;
  – TNA’s digitised catalogue descriptions of early Chancery bills and ancient petitions;
  – the IGI data

20 Intended Outcomes
- Link early forms of names to a reliable inventory of modern names in the FaNUK database
- Identify significant associations between surnames and localities over time
- Develop an Old↔Modern index of surnames
- Plot the continuity between medieval and modern name forms
- Create statistical analytic procedures for re-use with datasets
- Agree a publicly available ‘gold standard’ of old and modern spellings of every established UK surname (i.e. the nineteenth-century names still extant)
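An Old↔Modern index is, at its simplest, a pair of lookups built from the same list of links, so that a search can start from either end. The sketch below is one hedged way to picture it; `build_index` is a hypothetical name, and the pairs are limited to old and modern forms already cited in the earlier slides.

```python
from collections import defaultdict


def build_index(pairs):
    """Build old->modern and modern->old lookups from (old, modern) pairs.

    Sets are used on both sides because one old form may map to several
    modern names and one modern name usually has several old forms.
    """
    old_to_modern = defaultdict(set)
    modern_to_old = defaultdict(set)
    for old, modern in pairs:
        old_to_modern[old].add(modern)
        modern_to_old[modern].add(old)
    # Plain dicts, so that looking up an unknown form raises KeyError
    # instead of silently creating an empty entry.
    return dict(old_to_modern), dict(modern_to_old)


# Links taken from the examples in this talk.
PAIRS = [
    ("Sabrichworth", "Sapsford"),
    ("Ferur", "Ferrer"),
    ("Ismonger", "Isemonger"),
    ("Yue", "Ive"),
    ("Sewen", "Sewin"),
    ("Idoyn", "Iddon"),
]
old_to_modern, modern_to_old = build_index(PAIRS)
```

A fuller index would also need to carry the source, date and place of each old form, since the same spelling can belong to different names in different regions.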

21 New Departures
- For various reasons, further discussion between the partners has refined the scope of the project.
- The project now aims to combine the intended outcomes of the former plan towards a funding bid to create an archival tool that delivers linguistically and historically reliable authority data on ancient name forms.
- Intended as a name-relational cataloguing toolkit, to be freely available via the TNA website
- Will cluster variant spellings of surnames according to their derivation and geographical distribution
- Will use the linguistic and onomastic expertise of FaNUK and draw data from a wider pool of earlier TNA record series (in addition to Poll Tax, Chancery bills, petitions, and IGI)

22 A toolkit for surname cataloguing
- An open-access toolkit resource that incorporates the authority of a structured name inventory/database will be of practical benefit to a wide archival user base.
- It will facilitate the recall of name data without damaging (linguistic or onomastic) precision.
- It will be maintained and refined as part of TNA’s sustainable catalogue technology.
- It will be an infrastructure, not an interface.
- Name data created to the standards defined at the start of the project (stemming from FaNUK’s existing expertise) will broaden the range of evidence included and allow contributions from diverse users over time.

23 Feedback - Questions
- How might the archival community use such a tool?
- How can archivists contribute to its development, range and usability?
- Should there be a mechanism for archival users to contribute name data (IP and copyright issues)?
- What early records data should be created to provide a versatile foundation for comparison with modern name forms?
- How might it be delivered to users as a web tool?
- How might it adapt to the broadening role envisaged for TNA’s catalogue?

