Peer-Reviewed Article

Accepting Editor: Alexei Lavrentev, CNRS & Université de Lyon.
Recommending Reader: Yan Greub, CNRS & Université de Lorraine.
Recommending Reader: Sabine Tittel, Heidelberger Akademie der Wissenschaften.
Received: 2014-12-02
Revised: 2015-03-10
Published: 2015-11-22

Abstract

Digitization of dictionaries originally in book form as well as the creation of online dictionaries has revolutionized the way dictionaries are presented and offers not only the opportunity of presenting textual links between dictionary headwords but the possibility of directly connecting one online dictionary to another. This article is an introduction to one of the new functions of the online Anglo-Norman dictionary, i.e. cross-referencing, the provision of links from the Anglo-Norman dictionary entries to other relevant medieval and modern dictionaries. In addition to establishing the usefulness of cross-referencing for dictionary users and presenting how this has been achieved in the Anglo-Norman dictionary, this article examines some of the potential pitfalls that need to be addressed when implementing live links to other dictionaries.

Keywords: online dictionary; cross-referencing; headword; link; hyperlink.


Introduction

§ 1    As David Trotter has said, no dictionary exists in isolation, and all dictionaries are part of an international and multilingual network of lexicographical resources which collectively attempt to record and explain the vocabulary of the languages of the world (Trotter 2011, 28). Digitization of dictionaries originally in book form as well as the creation of online dictionaries has revolutionized, if not the fundamental relationships between words or dictionaries, the way dictionaries are presented and the amount of information that can be provided. Digitization, therefore, offers the possibility of not only presenting textual links between words and dictionaries but of actually directly connecting one online dictionary to another. This article, then, is an introduction to one of the new functions added to the online Anglo-Norman dictionary (AND), i.e. cross-referencing, the provision of links from the AND entries to other relevant medieval and modern dictionaries. In short, cross-referencing means that we are connecting the headwords in the AND to words from the same root, or of the same etymon, in other dictionaries, both medieval and modern. These links, as I will demonstrate with examples from the AND, can show us how widespread or narrow the usage of any particular headword listed in the AND was, what sort of texts and contexts it was used in, how it emerged in different varieties of French – for example continental Old and Middle French in comparison to Anglo-Norman – and whether the word has survived to modern day English or French.

Choice of dictionaries for cross-referencing

§ 2    The Anglo-Norman dictionary functions as the record of the variety of French introduced into Anglo-Saxon England after the Norman conquest, and this variety subsequently dramatically relexified the English language. In other words, the AND can be considered the link between French and English lexicography (Trotter 2011). The term Anglo-Norman originates from the time that the language was regarded as the regional dialect of the Norman invaders who came across the Channel with William the Conqueror, and although the term Anglo-French or the currently popular French of England perhaps reflects the reality of the varying ethnic background of the people using the language, as well as their linguistic competence, better than the somewhat restrictive Anglo-Norman, the dictionary project preserves the old name. William Rothwell notes in the Introduction to the on-line AND: The title of this second edition of the Dictionary preserves the old name purely in order to maintain continuity with the first edition, which adopted Anglo-Norman as being the term in current use in academic circles at the time in the later nineteen-forties when the idea of a glossary of the medieval French of Britain was first mooted (Rothwell n.d.). The AND has connections to other dictionaries of Romance languages, both medieval and modern, as well as to Germanic languages and, of course, to medieval Latin, the third important language of medieval Britain. 
All of these connections are reflected in the choice of dictionaries the AND links to: the main French etymological dictionary FEW, the etymological dictionary of Old French DEAF, two key medieval French dictionaries, Godefroy and Tobler-Lommatzsch, the principal Middle French online dictionary DMF (which is also the only dictionary fully constructed as an online dictionary), the modern French dictionary TLF, the recently completed British Medieval Latin dictionary (DMLBS), the Middle English dictionary (MED) and the Oxford English dictionary (OED). In full, the AND currently links to the following dictionaries: Französisches Etymologisches Wörterbuch (FEW); Le Dictionnaire du moyen Français (DMF); Frédéric Godefroy, Dictionnaire de l’ancienne langue Française et de tous ses dialectes du IXe au XVe siècle (GdF), (Paris: F. Vieweg, 1881-1902) as well as the complement to Godefroy (GdFC); Tobler-Lommatzsch: Altfranzösisches Wörterbuch (TL); Dictionnaire étymologique de l'ancien Français (DEAF) (Städtler et al. 2012); Le trésor de la langue Française informatisé (TLF); Oxford English dictionary (OED); Middle English dictionary (MED); Dictionary of medieval Latin from British sources (DMLBS) (London: Oxford University Press). These dictionaries will be referred to in this paper by the abbreviations provided here.

Figure 1: Screenshot from the AND

Linking method 1

§ 3    As is apparent from Figure 1, the AND currently has two different ways of providing links to related dictionaries, depending on the format of the target dictionary and the feasibility of linking electronically. Although DMLBS is the only dictionary in our list that is currently available only in paper format, not all of the other target dictionaries are digitized or available online in a format that would allow direct links. Figure 1 shows that the current links from the AND to GdF, GdFC and TL are to their book form. TL is available on CD-ROM but not online, so direct linking is not possible, whereas the site hosting GdF and GdFC has been deemed unstable because it is privately run and not attached to any institution. It is therefore impossible to know how long the site will remain online or be supported, so an online link to it was deemed unfeasible at this point. Figure 1 also shows the first linking method, which is as follows: the dictionary siglum is followed by the volume number, page number and the headword. In the case of FEW, the headword is the etymon, as it is imperative to provide the root word to permit any queries or further research based on the etymon. We use the empty set sign to indicate that no cross-reference exists, or that it is not known or available in a dictionary: in Figure 1 hanellissement (breathing) has no cross-reference in GdFC, DEAF, TLF, MED or OED. Linking to FEW is currently being developed so that it is possible to provide both the electronic link and the book reference, as demonstrated in Figure 1. This is possible because ATILF (Analyse et Traitement Informatique de la Langue Française, a joint research unit of Le centre national de la recherche scientifique and the University of Lorraine) has implemented and publicized a linking interface to its page images of FEW, which will allow the AND to add live links to the references. 
Linking to DEAF is currently problematic because the linking system varies according to the entry: DEAF has published the letters G to K in book format, and although this information is also available online, it currently consists of image files from the book version, which is only a temporary phase in the development of a fully functional and searchable online dictionary (for the état présent of DEAF and more information concerning the dictionary in its evolving format, see e.g. the préface to fascicule F1: Städtler et al. 2012; Tittel 2010a; Tittel 2010b). Other letters are also available online in DEAFpré as short articles, but these are still being revised and have yet to be published as full dictionary articles. Therefore, the AND headwords in G to K will have a reference to the book form, whereas entries from M onwards, as well as from A to E and the majority of F, will provide the corresponding DEAF headword if it is available, but currently leave the link part open for the later inclusion of a reference. DEAF has published the first part of F in book form with at least another fascicle due shortly, and these references will be added to the AND as and when the fascicles become available. However, all other letters will probably only be published in electronic format, which the AND would then link to once the site has been finalised. This is obviously not an ideal solution, but it cannot be avoided at present, and the team decided that it is important to provide the corresponding headwords in DEAF, even if the AND cannot currently provide a live link to the site. This is especially important with rare words, as sometimes the only cross-reference to the AND entry might be found in DEAF. Similar issues with other dictionaries cannot always be avoided either, for example when a cross-reference points to a DMF entry that is not yet live in the current version of that dictionary. 
See for example DMF flasquet: the headword can be found when one performs a search for an individual headword entry, but it is not yet to be found in the DMF search by etymon (FEW). The hyperlink also does not work currently and gives an error message indicating that the entry is currently waiting to be processed. We have nevertheless included the link from the AND headword flasket. But then this is the nature of dictionary work – ever changing and evolving! The key issue really is that none of the cross-referencing will work unless each dictionary project develops and publicizes a system for stable links that will not break in the future.
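
The book-reference format used by the first linking method (siglum, then volume and page, then headword, with the empty set sign where no cross-reference is available) can be illustrated with a minimal Python sketch. This is not the AND's own code: the class name BookReference, its field names and the exact spacing of the rendered string are assumptions made for illustration; only the order of the parts, the TL example (volume 1, page 284, alenissement) and the use of the empty set sign come from the article.

```python
from dataclasses import dataclass
from typing import Optional

EMPTY_SET = "\u2205"  # the empty set sign used when no cross-reference exists


@dataclass
class BookReference:
    """Hypothetical model of a non-hyperlinked cross-reference."""
    siglum: str                      # dictionary abbreviation, e.g. "TL"
    volume: Optional[str] = None     # volume number, if the dictionary has one
    page: Optional[str] = None       # page (or column) reference
    headword: Optional[str] = None   # headword, or the etymon in the case of FEW

    def render(self) -> str:
        # Nothing recorded at all: show the siglum with the empty set sign.
        if not self.headword and not self.page:
            return f"{self.siglum} {EMPTY_SET}"
        # Volume and page are joined with a comma, as in "1,284".
        location = ",".join(part for part in (self.volume, self.page) if part)
        # A headword without a location (the DEAF case) is rendered on its own.
        return " ".join(part for part in (self.siglum, location, self.headword) if part)


print(BookReference("TL", "1", "284", "alenissement").render())
print(BookReference("GdFC").render())
```

A reference that has a headword but no location yet, as described for most DEAF entries, renders as the siglum followed by the headword only, leaving room for the location to be added later.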

Linking method 2

§ 4    The second method in the AND to provide cross-references is by live hyperlinks to other online dictionaries, which will take the user directly to the corresponding form in those dictionaries. The entry hanellissement (Figure 1) can only be hyperlinked to FEW and DMF, the links visible in blue, so another headword such as illuminer provides a better example (Figure 2).

Figure 2: Screenshot from the AND

§ 5    As is visible in this screenshot, the verb illuminer is a Latinism, as is indicated by the FEW etymon illuminare as well as the DMLBS headword. The verb can be found in all of the medieval as well as modern French and English dictionaries, and hyperlinks are provided for the cross-references in FEW, DMF, TLF, OED and MED. Apart from the quick and convenient direct links to these other online dictionaries, the page also provides an immediate visual clue as to which dictionaries list the word, and in what form. For some users, this provides all of the information they need without their having to look up the cross-references in the other dictionaries. The AND also provides an easy system for making links to the AND itself, as can be seen in Figure 2: at the foot of each entry, the user can copy and paste the line inside the angle brackets, which provides a persistent link to the entry concerned.

Implementation of links

§ 6    So how does all this happen in practice? The addition of cross-referencing is taking place simultaneously in two different ways: the main editors of the AND, Geert de Wilde and Heather Pagan, add these links in the course of revising the dictionary entries from the letter N to Z, whereas I began the process of adding links to the entries already revised, i.e. A to M. The whole team works with an XML editor, epcEdit, which has been modified and expanded for the editorial team’s specific needs. It is possible to retrieve the cross-references section of an existing entry from the project's data management system (DMS) without having to modify the XML file of the full entry; this will create a skeleton epcEdit document with the necessary links to the full entry so that it can be automatically merged into the existing entry. Once the cross-references have been merged, the entry will be rendered in the normal DMS browser interface with the cross-references in place and the links active. This separate retrieval and saving system allows two or more editors to work on the same headword entry or the cross-reference section simultaneously without causing any version control issues. Figure 3 illustrates this skeleton epcEdit document with cross-references already in place. When the file is retrieved from the server, the AND server generates the cross-referencing document with the appropriate lemma already in place, here hanellissement, as well as each of the required targets, i.e. dictionaries, in a specific order. The settings of the siglum and linkable attributes are therefore supplied in advance and do not need to be added or edited further by the editors. As is visible from Figure 3, both linkable and non-linkable items have only two elements: a <link_form> element, which is the text to be displayed as the target of the link, i.e. headword or etymon (e.g. TL alenissement in Figure 3), and either a <link_loc> or a <link_target> element. 
Non-linkable items have a <link_loc> element, which shows the reference location for the dictionary concerned (e.g. TL 1,284 referring to volume 1, page 284). Linkable items have a <link_target> element which requires the information set by the target dictionary as identifier. In the case of DMF and TLF, this is the lemmatic form (e.g. hanelissement) whereas OED and MED both have numeric IDs (see Figure 4). Any item where both of these elements are left empty (e.g. GdFC in Figure 3) will automatically render with the empty set symbol without the editors needing to mark them as null. The only instance when the editor(s) would need to add anything to the skeleton document provided would be if there were multiple references to the same source. This can be done easily by adding two elements for the next reference after the closing element of the existing reference. The document type definition (DTD) will ensure that the correct ones are chosen for insertion at appropriate places.
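
The rendering rules described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AND's actual implementation: the element names link_form, link_loc and link_target and the attributes siglum and linkable are taken from the article, but the wrapper elements xrefs and xref, the attribute values "yes" and "no", and the URL template on example.org are hypothetical, since the article does not specify them.

```python
import xml.etree.ElementTree as ET

EMPTY_SET = "\u2205"  # rendered when an item has been left empty

# Hypothetical skeleton document; only link_form, link_loc, link_target,
# siglum and linkable are names mentioned in the article.
SKELETON = """
<xrefs lemma="hanellissement">
  <xref siglum="TL" linkable="no">
    <link_form>alenissement</link_form>
    <link_loc>1,284</link_loc>
  </xref>
  <xref siglum="GdFC" linkable="no">
    <link_form/>
    <link_loc/>
  </xref>
  <xref siglum="DMF" linkable="yes">
    <link_form>hanelissement</link_form>
    <link_target>hanelissement</link_target>
  </xref>
</xrefs>
"""

# Placeholder URL pattern, NOT the real DMF linking interface.
URL_TEMPLATES = {"DMF": "https://dmf.example.org/entry/{target}"}


def text_of(parent, tag):
    """Return the stripped text of a child element, or '' if absent or empty."""
    child = parent.find(tag)
    return (child.text or "").strip() if child is not None else ""


def render_xref(xref):
    """Render one cross-reference item as display text."""
    siglum = xref.get("siglum")
    form = text_of(xref, "link_form")
    loc = text_of(xref, "link_loc")
    target = text_of(xref, "link_target")
    # Items whose elements are all empty render as the empty set symbol.
    if not form and not loc and not target:
        return f"{siglum} {EMPTY_SET}"
    # Linkable items use the identifier set by the target dictionary.
    if xref.get("linkable") == "yes" and target:
        url = URL_TEMPLATES[siglum].format(target=target)
        return f'{siglum} <a href="{url}">{form}</a>'
    # Non-linkable items show the book location followed by the form.
    return " ".join(part for part in (siglum, loc, form) if part)


def render_all(xml_text):
    root = ET.fromstring(xml_text)
    return [render_xref(item) for item in root.findall("xref")]


for line in render_all(SKELETON):
    print(line)
```

Multiple references to the same source would simply appear as further items with the same siglum, and the editors never need to mark an empty item as null, because the empty case is handled at rendering time.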

Figure 3: Screenshot of the epcEdit cross-references only document

Figure 4: Screenshot from epcEdit work window

§ 7    Currently the FEW entries in the digital version do not contain a <link_target> element of the kind that is present for purposefully devised linking interfaces such as those of the DMF or the OED, and therefore the FEW items have a <link_loc> element as found in the non-linkable dictionaries. The <link_target> elements are highlighted in yellow purely for visual purposes, to differentiate these fields from <link_loc> elements.

Issues arising from the automated process of linking

§ 8    The AND has therefore adopted a system of adding the cross-references manually rather than using an automated process or macro, whereas some of the bigger online dictionaries, such as DMF and OED, seem to have automated the process to a certain extent. Automation of course means that the work can be done quickly. Manual linking is admittedly a slower process, but it has other benefits. The main one is that we examine each headword and entry individually, which in turn minimises the risk of linking to the wrong words or etymons. Automated linking is undeniably quicker, but mistakes creep in more often. The following section will demonstrate some of the issues arising from the automated process.

§ 9    The OED links to both the Old English Dictionary and MED. One of the current recurring errors with the OED links to MED is visible in the OED entry junior (Figure 5).

Figure 5: Screenshot from the OED

§ 10    As is visible in the entry for junior, the MED link provided on the right-hand side of the page, marked here with the blue box, is to MED joinour, which actually signifies joiner, furniture maker. This is clearly not the same word, nor even derived from the same etymon. It is possible that the forms of the two words are similar enough for the automatic linking to pick up joinour in MED for junior in OED, with the consequence that OED joiner does not link to MED at all. A similar issue can be found with the MED entry jonk n.2, which in nautical terms means an old cable or rope (Figure 6).

Figure 6: Screenshot from the MED

§ 11    The MED headword here is linked to OED junk n.3, which actually is the name for a common type of native sailing vessel in the Chinese seas. But the nautical theme is probably the only thing that links these words, as they do not derive from the same etymon. Some comedy will also inevitably present itself: MED ferte (boldness, fierceness; strength) is erroneously linked to OED fart, which, as OED notes in the entry, is not in decent use and clearly does not correspond in sense or etymon to ferte. It should be remembered, however, that some of the entries in the OED have not been revised since the 19th century and these issues will probably be picked up eventually by an editor revising such entries or alternatively by helpful volunteers.

§ 12    Similar problems appear in the French online dictionaries, which have also automated their linking to cross-references in other dictionaries. The DMF, for example, erroneously links its entry multe (see Figure 7), which means amende (and translates as fine, penalty), to AND mulet 1 (translates as mule) instead of the correct Anglo-Norman cognate multe 1, which not only has the same meaning as the DMF entry (mulct, fine, penalty) but is also identical in form. In other cases a link is missing in one direction: the DMF headword jupe (meaning a tunic mostly worn by men) is not linked to the modern French dictionary TLF jupe (which in modern use refers primarily to a skirt), although it is the same word and TLF actually links to the DMF entry. Apart from linking to wrong headwords, the automated system also only picks up the headwords in the target dictionaries. Therefore, because the word larder is listed in the TLF under the headword lard, the DMF entry for larder 1 does not pick up this term, and the two entries are not linked. Similarly, any verbs that are listed in an entry for a noun, e.g. guerdonner under the headword guerdon, are not picked up by the DMF automated linking system. Another issue arises in the linking of headwords that are identical in form but different in meaning and are distinguished by numbers, e.g. the AND entries enfeoffer 1 (substantival form feoffor) and enfeoffer 2 (the verb enfeoff). The DMF headword enfieffer, which corresponds to the AND verb enfeoffer 2, is erroneously linked to enfeoffer 1. Sometimes the automated process misses cross-references altogether, as is the case with DMF engouler, which is not linked to TLF engouler, although TLF links its entry to the DMF entry. In OED, for any headwords that are both nouns and adjectives, a link is provided to only one of the two in MED (either adjective or noun) but not to both, even if the OED entry would require both terms to be linked. 
In these cases, the AND will have one live hyperlink to the OED and two to the MED to cover all cross-references.
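
One of the pitfalls described above, a linker that only consults the headwords of the target dictionary, can be made concrete with a small Python sketch. It does not reproduce how the DMF or any other dictionary actually generates its links, which the article does not document; the data structure and function names are hypothetical, and only the relationships larder under lard and guerdonner under guerdon come from the text.

```python
from typing import Dict, List, Optional

# Toy model of a target dictionary: headword -> forms listed inside that entry.
# The structure is hypothetical; the two relationships are those named above.
TARGET_ENTRIES: Dict[str, List[str]] = {
    "lard": ["larder"],
    "guerdon": ["guerdonner"],
}


def headword_only_lookup(form: str) -> Optional[str]:
    """Naive automated lookup: succeeds only if the form is itself a headword."""
    return form if form in TARGET_ENTRIES else None


def build_subentry_index(entries: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every headword and every sub-entry form to its hosting headword."""
    index: Dict[str, str] = {}
    for headword, subforms in entries.items():
        index.setdefault(headword, headword)
        for subform in subforms:
            index.setdefault(subform, headword)
    return index


def subentry_aware_lookup(form: str, index: Dict[str, str]) -> Optional[str]:
    """Lookup that also finds forms listed under another headword."""
    return index.get(form)


index = build_subentry_index(TARGET_ENTRIES)
print(headword_only_lookup("larder"))          # no link is found
print(subentry_aware_lookup("larder", index))  # the hosting entry is found
```

Even an index of this kind would not resolve numbered homographs or near-identical forms with different etymons, which is where the individual examination of each entry by an editor remains necessary.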

Figure 7: Screenshot from the DMF

Case studies

§ 13    In addition to the usefulness of providing instantaneous links for the dictionary user to other medieval or modern dictionaries, as well as providing a quick reference to sources of additional information on the etymology of the headwords (namely FEW and DEAF), the linking can result in interesting and even surprising discoveries. For the modern reader, the words locust and lobster refer to two very different species of the animal kingdom and at first glance they do not seem to have much in common. Locust, the modern English word for an insect associated with migrating hordes that ravage whole areas of countryside, especially in Africa and Asia, by consuming all vegetation in their path, derives from the Old French and Anglo-Norman word locuste (DMF locuste, from Latin LOCUSTA: insect, locust, grasshopper). But in fact the Latin word locusta (DMLBS 1634a/b) originally signified a lobster or a similar crustacean (Lewis and Short 1989, 1075b), and the application to the locust was suggested by the resemblance in shape. The form used in some of the Latin examples reads as locusta marina – a form adopted in some Anglo-Norman texts. It should also be noted that whereas the modern English lobster, deriving from Old English (lopustre, lopystre, loppestre), only includes the original sense of the word, the modern French langouste, a form already present in Anglo-Norman (languste) to signify locust, includes both senses, even if today the term is almost exclusively used to signify lobster. TLF langouste (sense B) lists the sense locust as still in use in the 19th century, so it survived into modern French even if it is no longer in use today. For more information, see my blog post Word of the Month for March 2014 (Nara 2014); the same blog also features other interesting Anglo-Norman words and terms discussed by the editorial team. 
Another interesting example is morris dance, a lively traditional dance that we identify as something quintessentially English. This dance is performed in formation by a group of dancers in a distinctive costume, usually wearing bells and ribbons and carrying handkerchiefs and sticks. The earliest form in the OED is morisk dance (under OED Morisk). The AND also has the headword morisk, which is a later form than the noun form lettre de morisk, which signifies Moresque design, Arabesque ornament. Morisk, the Anglo-Norman adjectival form from Letters and Petitions (1390-1412), signifies Moorish in style: Moorish, characteristic of the style (of painting, decoration, etc.) of the Moors. Morris dance therefore originally refers to a dance in Moorish style and any entertainment involving such dances. The word derives ultimately from the classical Latin maurus, meaning a Moor, Moroccan, a black man (DMLBS maurus 1 1738b).

Conclusion

§ 14    To conclude, this article has aimed to show how one of the new functionalities of the AND, the linking of cross-references, illustrates the way in which the digitization of book dictionaries and the creation of online dictionaries have fundamentally changed the functionality of dictionaries. In addition to establishing the usefulness of cross-referencing for dictionary users, i.e. the provision of direct links to other relevant dictionaries, both medieval and modern, as well as to the sources of etymological information, this article has pointed out some of the pitfalls that need to be addressed when implementing links, as well as the potential of cross-referencing as a further research tool. Although not all dictionaries are online or digitised yet, the potential is there for all the relevant related dictionaries to be interlinked in such a way that it will become possible to rapidly review the entirety of lexicographical evidence, irrespective of language (Trotter 2011, 28). There have been recent discussions between various dictionary projects with a view to linking dictionaries even more closely together, but no concrete project exists as yet. As David Trotter continues, it would be a significant step forwards, and would allow us to reassemble in its full multilingual complexity the lexical landscape of medieval Europe (ibid., 28).

Works cited

Lewis, Charlton, and Charles Short. 1989. A Latin dictionary. Oxford: Clarendon Press.

Nara, Katariina. 2014. Word of the month: Locusts and lobsters. The Anglo-Norman words blog. Accessed March 13. http://anglonormandictionary.blogspot.co.uk/

Rothwell, William. (n.d.). Anglo-French and the AND. Introduction to the online Anglo-Norman dictionary. Accessed April 12, 2015. http://www.anglo-norman.net/sitedocs/main-intro.shtml?session=SLON1005T1415873994

Städtler, Thomas, Stephen Dörr, Sabine Tittel, M. Kiwitt, and Frankwalt Möhren (eds.). 2012. Dictionnaire étymologique de l'Ancien Français (DEAF). Berlin: De Gruyter.

Tittel, Sabine. 2010a. Le « DEAF électronique » – un avenir pour la lexicographie. Revue de Linguistique Romane, 74: 301-311.

Tittel, Sabine. 2010b. Dynamic access to a static dictionary: A lexicographical «cathedral» lives to see the twenty-first century – the Dictionnaire étymologique de l’Ancien Français. Cahiers du Cental 7: 295-302.

Trotter, David. 2011. Bytes, words, texts: The Anglo-Norman dictionary and its text-base. Digital Medievalist 7. http://digitalmedievalist.org/journal/7/trotter/