GRAPHEM (Grapheme based Retrieval and Analysis for PalaeograpHic Expertise of medieval Manuscripts)

The research program GRAPHEM (Grapheme based Retrieval and Analysis for PalaeograpHic Expertise of medieval Manuscripts) brings together five research teams: three computer science laboratories (LIRIS, LIFO, and LIPADE) and two humanities institutions (Institut de recherche et d’histoire des textes and École nationale des chartes). Its aim is to improve the data mining and image processing techniques applied to medieval scripts and their classification.

Partners

The project was funded by the French Research Agency (Agence Nationale de la Recherche) for the period Jan. 2008 – Jul. 2011, and led by LIRIS (UMR 5205, Laboratoire d’InfoRmatique en Image et Systèmes d’information). The partners involved are:

Corpus

The photographic material used during the project is the documentation gathered for the Catalogues des manuscrits datés in France, a collection of 9000 digital images of dated or datable medieval book scripts from the 9th to the 16th century.

Results

The major achievements are:

  • 4 different content-based image retrieval systems, working on 9000 images
  • 1 comparison tool (to compare the results of the different systems)
  • 1 measurement tool for palaeographers (Graphoskop)
  • 1 result visualization system

EpiDoc

The EpiDoc (Epigraphic Documents) Collaborative (http://epidoc.sf.net/) is an international community of scholars working with texts in ancient Greek and Latin inscribed on durable materials (principally stone), convened by Tom Elliott of the University of North Carolina at Chapel Hill. EpiDoc is also a collection of recommendations for encoding inscriptions in XML; many of the issues involved are equally relevant to scholars working with other ancient and medieval texts. The EpiDoc Guidelines are a subset of the TEI and track the latest release of the TEI Guidelines.

The EpiDoc Guidelines (http://www.stoa.org/epidoc/gl/latest/) are a detailed collection of suggested conventions for encoding Leiden-style epigraphic transcription in XML, currently authored in XML and disseminated in a variety of formats. The EpiDoc Homepage (http://epidoc.sf.net) also provides access to both stable and development versions of the Guidelines; an associated RelaxNG schema and TEI ODD; and software and other tools created by members of the collaborative for editing, manipulating, or processing texts.
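
To illustrate the relationship between XML encoding and Leiden-style transcription, the following is a minimal Python sketch, not part of the EpiDoc tooling. It assumes a tiny invented fragment using the TEI elements supplied, expan/abbr/ex and gap, and renders it with a deliberately simplified subset of the Leiden conventions (square brackets for lost text, round brackets for expanded abbreviations). The function names and the sample text are hypothetical; the EpiDoc Guidelines and stylesheets cover far more cases.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented sample, not taken from a real inscription.
SAMPLE = (
    f'<ab xmlns="{TEI_NS}">'
    'D<supplied reason="lost">is</supplied> '
    '<expan><abbr>M</abbr><ex>anibus</ex></expan> '
    '<gap reason="lost" quantity="3" unit="character"/> sacrum'
    '</ab>'
)


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def to_leiden(elem):
    """Render an element and its descendants as a simplified Leiden string."""
    name = local_name(elem.tag)
    # Text inside this element, plus each child followed by its trailing text.
    inner = (elem.text or "") + "".join(
        to_leiden(child) + (child.tail or "") for child in elem
    )
    if name == "supplied":   # text restored by the editor: [abc]
        return f"[{inner}]"
    if name == "ex":         # expansion of an abbreviation: a(bc)
        return f"({inner})"
    if name == "gap":        # lost text: one dot per lost character
        qty = elem.get("quantity")
        return "[" + ("." * int(qty) if qty else "---") + "]"
    return inner             # any other element: keep its content only


print(to_leiden(ET.fromstring(SAMPLE)))  # D[is] M(anibus) [...] sacrum
```

The recursion keeps the text that follows each child element (its "tail" in ElementTree terms), which is easy to lose when flattening mixed content.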

ESawyer (Electronic Sawyer)

Overview

The ‘Electronic Sawyer’ presents in searchable and browsable form a revised, updated, and expanded version of Peter Sawyer’s ‘Anglo-Saxon Charters: an Annotated List and Bibliography’, published by the Royal Historical Society in 1968. Its main content derives from Sawyer’s catalogue, with corrections and modifications, and with additional data collected by Dr Susan Kelly, Dr Rebecca Rushforth, and others. Dr Rushforth was also responsible for the development of the database which lies behind the online version of this catalogue.

The ‘Electronic Sawyer’ is hosted on a server at the Department of Digital Humanities (DDH), King’s College London, and is an integral part of the ‘Kemble’ website, which is hosted on a server at the University of Cambridge.

‘Kemble’ is the website of the British Academy-Royal Historical Society Joint Committee on Anglo-Saxon Charters. The BA-RHS Committee on Anglo-Saxon Charters has undertaken this work as part of the publication of a new multi-volume edition of the corpus of Anglo-Saxon charters. Further information about the project, and about ‘Sawyer’, the ‘Revised Sawyer’, and the ‘Electronic Sawyer’, is available on the ‘Kemble’ website.

The production of the ‘Electronic Sawyer’ would not have been possible without the generous support of the Arts and Humanities Research Council in funding the work of Dr Rushforth and others. Dr Kelly’s work, at an earlier stage, was funded by The Leverhulme Trust and The Newton Trust. We are also most grateful to Professor Harold Short, and to CCH, for their expertise and support.

[Text reproduced (with minor edits) from project website with permission.]

Project Status

  • Technical development completed (2011)
  • Content undergoing revision (last update 2008)

Project Team

  • Simon Keynes (Department of Anglo-Saxon, Norse, and Celtic, University of Cambridge) – Project Director
  • Harold Short and Paul Spence – Technical Research Directors
  • Zaneta Au – Web Developer (2008 Release)
  • John Bradley – Database consultancy
  • Beatriz Caballero – Interface Developer (2011 Release)
  • Arianna Ciula – Analyst (2008 Release)
  • Emma Connolly (ASNC) – Kemble website (2005, 2010)
  • Susan Kelly (ASNC) – Research Officer (1991-6)
  • Eleonora Litta Modignani Picozzi – Analyst (2008 Release)
  • Sean Miller (ASNC) – Revised Sawyer online (1999)
  • Rebecca Rushforth (ASNC) – Research Officer (2003-8)
  • Peter Stokes – Analyst and Web Developer (2011 Release)
  • Paul Vetch – Lead Interface Developer
  • Rory Naismith (ASNC), David Pelteret (Independent Scholar), Levi Roach (ASNC), David Woodman (ASNC) – Revised Sawyer

EPT (Edition Production Technology)

A software suite based on Eclipse, under development by the Electronic Boethius project. EPT supports the creation of image-based editions, in which images (for example, of folios) are transcribed and marked up through a single integrated interface. Its most notable feature is support for overlapping markup: markup of the textual structure (pages, folios, etc.) and of the textual content (lines, paragraphs, etc.) may overlap. It achieves this with a start-and-end milestone technique. The editor is highly customisable and, since it is based on Eclipse, further plugins could be developed.
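
The milestone idea can be sketched in a few lines of Python. This is an illustration of the general technique only, not EPT's actual implementation or file format: each span is written as a pair of empty start and end elements instead of one enclosing element, so two spans can overlap and the result is still well-formed XML. The element names (lineStart, sentenceEnd, and so on), the n attribute and the function name are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def milestone_markup(text, spans):
    """Encode possibly overlapping spans as empty start/end milestone tags.

    spans is a list of (name, start, end) character offsets, end exclusive.
    """
    events = []
    for idx, (name, start, end) in enumerate(spans):
        events.append((start, 1, f'<{name}Start n="{idx}"/>'))
        events.append((end, 0, f'<{name}End n="{idx}"/>'))
    # Sort by offset; at the same offset, emit end tags before start tags.
    events.sort(key=lambda event: (event[0], event[1]))

    out, pos = [], 0
    for offset, _, tag in events:
        out.append(escape(text[pos:offset]))  # text since the last milestone
        out.append(tag)
        pos = offset
    out.append(escape(text[pos:]))
    return "<text>" + "".join(out) + "</text>"


text = "one two three four"
# A "line" covering "one two" overlaps a "sentence" covering "two three".
spans = [("line", 0, 7), ("sentence", 4, 13)]
xml = milestone_markup(text, spans)
print(xml)

# The output parses as XML even though the two spans overlap.
root = ET.fromstring(xml)
print([child.tag for child in root])
```

With ordinary enclosing elements, the same two spans could not be nested without splitting one of them. The trade-off is that software has to pair each start milestone with its end (here through the shared n value) to recover the span.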

A message from Dot Porter to the Digital Medievalist mailing list provides further information:

The EPT enables an editor to:

  1. Create a project by importing digital images, transcript (which can be a pre-existing XML document, or a text document with no markup), and a DTD or set of DTDs. (Details on what I mean by “set of DTDs” – not TEI tagsets! – can be found by following the tutorial links on the demo site, see below).
  2. Insert markup into this project through user-friendly, completely configurable markup templates. In the EPT the editor views text & image side-by-side, and the markup software includes functionality for connecting pieces of text with the corresponding image sections.
  3. The full version of EPT includes additional tools for more advanced editing – collation (tools for both types – comparing versions of the same text, and describing the structure of the physical object), statistical analysis, paleographic description, glossary development.

There are obvious start-up costs involved here – it’s not simple to get started. You’ll need to have your images and transcript (though it is possible to transcribe-as-you-go, if you import a blank text file into a new project). You’ll need to have your DTD, and if you’re concerned about overlapping markup you’ll need to divide that DTD into smaller, well-formed DTDs (the demo example projects come with such DTDs, based on TEI, which you’re welcome to use as-is and extend for your own projects). You’ll also need to do a fair amount of configuration. On top of this, you’ll need to learn to use the software, which has a fairly steep learning curve. Once the project is created and the software has been configured to suit the project, though, any editor comfortable with point-and-click technology should be able to create the electronic edition.

For a nifty article on using the EPT to help solve an editorial problem, see Kevin Kiernan’s article “The source of the Napier fragment of Alfred’s Boethius” in the inaugural issue of the Digital Medievalist journal (DOI: 10.16995/dm.7).

For the history of the EPT, and a discussion of the early aims & developments of the software and the relationship between eBo and ARCHway, see the article “The ARCHway Project: Architecture for Research in Computing for Humanities through Research, Teaching, and Learning” (Kiernan et al.), forthcoming in Literary and Linguistic Computing (abstract at http://llc.oxfordjournals.org/cgi/content/abstract/fqi018?ijkey=a2FHqg7XTTULMJz&keytype=ref – full text is available online if your library subscribes to LLC online). Note that this article is based on our presentations at ACH/ALLC 2003 so some of the specifics are out of date.

For a working demo, including text projects and tutorials for getting started with your own projects, visit http://beowulf.engl.uky.edu/~eft/EPPT-Demo.html

The source code for the EPT is being released in stages, corresponding with the finishing dates of the two supporting projects. The ARCHway Project finished at the end of January, and the source code for that project, the “Development EPT”, will be released very shortly. The Development EPT is a general version of the EPT. It lacks some of the editing functionality in the Stable EPT – it has the basic image-text linking, but lacks the more specialized tools described above in #3. The Development EPT is designed to be extended – if you have access to computer science support (an RA with experience coding in Java, especially using the Eclipse development platform), you can extend the Development EPT to serve the particular needs of your individual project.

 

Further information

http://dblab.csr.uky.edu/~eiaco0/publications/DigiCULT04.html

DocExplore

Homepage

Description

DocExplore is an EU INTERREG IVa project investigating computer-based access to and analysis of historical manuscripts. The project commenced on 1 April 2009.

The aims of the project can be summarised as empowering citizens on both sides of the Channel to engage with, explore and study their cultural heritage, as embodied in written and printed documents, in meaningful, informative, accessible and entertaining ways, through the provision of transparent computer-based interactive tools. We therefore envisage developing a generic document analysis framework which provides a basic operational infrastructure and interactive toolkit.

The framework for exploring historical documents will address three strata of usage:

  • Observation tasks (e.g. in interacting with exhibits)
  • Informal information assembly, manipulation and coordination (e.g. searching documents, comparing texts, etc.)
  • High-level formal textual research (automated reading tools, advanced annotations, etc.)

 

Keywords

  • Disciplines: Computer Sciences, Digital editions

 

Team

  • LITIS, Université de Rouen
  • EDA, University of Kent
  • Rouen Nouvelles Bibliothèques
  • Canterbury Cathedral Archives
  • GRHIS, Université de Rouen
  • CMEMS, University of Kent

 

Contact