Thierry Pellé | 27 May 19:38 2016

Best place for linkGrp blocks

Dear all,
	I wonder where the best place is to insert (typed) <linkGrp> elements.

	Currently, I aggregate all <linkGrp> elements in a companion document,
annotated with a "how to read links" manual.

Thank you!

Frederik Elwert | 27 May 11:01 2016

Linking page breaks and witnesses

Dear all,

I guess this is a trivial question for anybody who works with critical
editions, but I couldn’t figure it out: What is the usual way to link a
page break to the witness in which it appears?

Let’s say I have a list of witnesses:

      <witness xml:id="alif" n="أ">Princeton Garrett 724Y</witness>
      <witness xml:id="ba" n="ب">Damad İbrahim 1043</witness>
      <witness xml:id="dschim" n="ج">Bodleian Laud Or. 192</witness>

In the apparatus (lem/rdg), I refer to them using a pointer in @wit. For
page breaks, @ed and @edRef seem to be the relevant places. The TEI
critical edition toolbox suggests using the witness ID in @ed:

    <pb ed="alif" />

However, I understand that @ed is of type teidata.word, and thus not a
pointer, but more of a standardised label (like @key). In that case,
ed="أ" would seem more appropriate.

There is also @edRef, but in the examples, it links to a bibliographic
entry (bibl) about the edition, not a witness. Would it be correct to
use something like

    <pb edRef="#alif" />
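
As a minimal sketch of what the two attributes would look like side by side (not an established recommendation): @ed carries the siglum as a plain label, while @edRef points at the witness by its xml:id. The folio number in @n is invented for illustration, and whether pointing @edRef at a <witness> rather than a <bibl> is accepted practice is exactly the open question here.

```xml
<listWit>
  <witness xml:id="alif" n="أ">Princeton Garrett 724Y</witness>
</listWit>

<!-- fragment from the text body; n="12r" is a hypothetical folio number -->
<!-- ed is a label (teidata.word), edRef is a pointer to the witness -->
<pb ed="أ" edRef="#alif" n="12r"/>
```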


José Calvo | 27 May 07:59 2016

Metadata about cast in theatre

The question has come up in several situations: what would be a good way to encode metadata about the cast of a play in TEI? Let's say we want to encode metadata such as the gender, social level, and function (lover, enemy, protagonist, servant...) of each character in the play. I hoped to find attributes for this on roleDesc, but I didn't. There are several possible solutions (external CSV files, empty elements with @type and @subtype like <seg type="gender" subtype="male"/>, defining a new namespace...), but none is really satisfying. Does anyone have experience with this, or a solution? I would be grateful for your help.
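
One further option, sketched here only as an assumption and not as tested practice, would be to leave the cast list as the source prints it and describe each character in a <listPerson> in the header, pointing back at the <role> with @corresp. The ids, names, and values below are invented for illustration; <person>, <socecStatus>, and <trait> are existing TEI elements from the namesdates module.

```xml
<!-- in the text: the cast list keeps only what the source prints -->
<castList>
  <castItem>
    <role xml:id="role-ana">Ana</role>
    <roleDesc>a servant</roleDesc>
  </castItem>
</castList>

<!-- in the header (profileDesc/particDesc): hypothetical metadata -->
<listPerson>
  <person corresp="#role-ana" sex="F">
    <socecStatus>servant class</socecStatus>
    <trait type="function"><desc>servant</desc></trait>
  </person>
</listPerson>
```

This keeps the metadata inside the TEI file without a new namespace, at the cost of maintaining the link between each <role> and its <person>.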

Thanks, with kind regards,
José Calvo


José Calvo Tello

Digital Humanist

University of Würzburg, Department for Literary Computing

ron.vandenbranden | 26 May 15:33 2016

encoding complex sources in an apparatus


We're starting an edition project that builds on an existing digital edition in which a number of published versions of a literary work were encoded using the parallel-segmentation method. Apart from updating the encoding from P4 to P5, a new phase is planned in which intermediary text versions, such as manuscript versions, proofs, etc., will be incorporated in the edition. As is usual with hand-edited texts, these intermediary text versions are complex sources in themselves. Often, different correction layers can be distinguished, but they generally do not allow any claims about their internal chronology. I am looking for a way to integrate the encoding of such complex sources in an apparatus, and would like to check whether my analysis makes sense.

Suppose we have following text versions:

  Ed1: the first edition
  P2: a proof for the second edition (based on Ed1, with manual corrections)
    P2a: corrections in red pencil
    P2b: corrections in black pencil
  Ed2: the second edition

If version P2 were included in the digital edition, it seems that it would only make sense to take into account the sum of its correction layers (P2a and P2b). Still, the editor wants to distinguish between the different annotation layers. For this fine-grained distinction, I'm considering the use of the @change attribute to link additions and deletions to the identified revision rounds.

Perhaps an example fragment could clarify things:

  Ed1   The weather was fine.
  P2    The <add change="#P2a">fall</add> <restore change="#P2b"><del>weather</del></restore> was <add change="#P2a">cool, but</add> <subst change="#P2b"><del>fine</del><add>nice</add></subst>.
  Ed2   The September weather was cool, but nice.

If these were combined into a parallel-segmented apparatus, that could look as follows:

  The
  <app>
    <rdg wit="#Ed1"/>
    <rdg wit="#P2"><add change="#P2a">fall</add></rdg>
    <rdg wit="#Ed2">September</rdg>
  </app>
  <app type="pseudo">
    <rdg wit="#Ed1 #Ed2">weather</rdg>
    <rdg wit="#P2"><restore change="#P2b"><del>weather</del></restore></rdg>
  </app>
  was
  <app>
    <rdg wit="#Ed1"/>
    <rdgGrp type="pseudo">
      <rdg wit="#P2"><add change="#P2a">cool, but</add></rdg>
      <rdg wit="#Ed2">cool, but</rdg>
    </rdgGrp>
  </app>
  <app>
    <rdg wit="#Ed1">fine</rdg>
    <rdgGrp type="pseudo">
      <rdg wit="#P2"><subst change="#P2b"><del>fine</del><add>nice</add></subst></rdg>
      <rdg wit="#Ed2">nice</rdg>
    </rdgGrp>
  </app>.

In this example, readings that don't actually differ are grouped as "pseudo" variants (there are better labels, no doubt) so they can be ignored when they're not relevant to the selected comparison set in the edition. Yet, when one text version is studied in detail, an option could be offered to visually distinguish between the different correction layers.

To complete this idea, I would define the revision rounds in <listChange> elements in the <profileDesc> section in the header, linked to the witness definitions they apply to with a @corresp attribute:

  <teiHeader>
    <fileDesc>
      <sourceDesc>
        <listWit>
          <witness xml:id="Ed1">the first edition</witness>
          <witness xml:id="P2">a proof for the second edition (based on Ed1, with manual corrections)</witness>
          <witness xml:id="Ed2">the second edition</witness>
        </listWit>
      </sourceDesc>
    </fileDesc>
    <profileDesc>
      <creation>
        <listChange>
          <listChange corresp="#P2">
            <change xml:id="P2a">corrections in red pencil</change>
            <change xml:id="P2b">corrections in black pencil</change>
          </listChange>
        </listChange>
      </creation>
    </profileDesc>
  </teiHeader>

Does this all make sense, am I overlooking something, just stating the obvious, or has this been done better before? Any advice welcome!

Kind regards,


--
Ron Van den Branden
CTB - Centrum voor Teksteditie en Bronnenstudie / Centre for Scholarly Editing and Document Studies
KANTL - Koninklijke Academie voor Nederlandse Taal- en Letterkunde / Royal Academy of Dutch Language and Literature
Koningstraat 18
b-9000 Gent
Belgium
E-mail : ron.vandenbranden <at>
Elena González-Blanco | 24 May 01:35 2016

DH@Madrid Summer School 27 June-1 July, also online!

**Apologies for cross-posting**


Dear colleagues,

We are pleased to announce our DH@Madrid Summer School 2016 at LINHD-UNED from 27 June to 1 July. This year the central topic is “Digital Technologies applied to the study of poetry”. It will cover different technologies and approaches to DH standards and methods, such as TEI-XML and semantic web technologies, along with shorter introductions to stylometry and R.

The course can be followed in person or virtually (completely online!). Registration is the same in both cases, but virtual students will have streaming videos and presentations online.

The course is sponsored by HDH (Asociación Hispánica de Humanidades Digitales), AAHD (Asociación Argentina de Humanidades Digitales), and DIXIT (Digital Scholarly Editing Initial Training Network). Members of all these groups will receive a 10% discount on the registration fees.

Please find attached the complete program and registration information:

Dates: 27 June to 1st July 2016

Place: Sala Sáenz Torrecilla, Facultad de Económicas, UNED, Madrid – or your own computer…

More information registration process and program:

Best regards


Elena González Blanco: egonzalezblanco <at>

Gimena del Rio Riande: gdelrio.riande <at>

Clara Martínez Cantón cimartinez <at>


Dpto. de Literatura Española y Teoría de la Literatura, Despacho 722

Facultad de Filología, UNED

Paseo Senda del Rey 7
28040 MADRID
tel. 91 3986873

@linhduned



**Apologies for any duplicate messages**

Dear friends,

I am pleased to announce that registration is now open for this year's summer course, “Technologies applied to the study of poetry”, organised by LINHD as part of the UNED summer courses. It is a digital humanities course that, focusing on poetic analysis, offers a panoramic tour of the main technologies in the digital humanities, from text markup with XML-TEI to the semantic web, and also introduces stylometry and natural language processing as technologies that, combined, can yield novel and stimulating research results.

The course can be followed in person or virtually, live or recorded, with dedicated forums for consulting the teachers.

The course is sponsored by HDH (Asociación Hispánica de Humanidades Digitales), AAHD (Asociación Argentina de Humanidades Digitales), and DIXIT (Digital Scholarly Editing Initial Training Network). Members of any of these organisations will receive an additional 10% discount on the registration fee.

Dates: 27 June to 1 July 2016

Place: Sala Sáenz Torrecilla, Facultad de Económicas, UNED, Madrid, or your own computer…

More information, registration and program at:

We look forward to seeing you!


Tommie Usdin | 23 May 23:19 2016

[ANN] Balisage 2016 Program Posted

Balisage: The Markup Conference
2016 Program Now Available

Balisage: where serious markup practitioners and theoreticians meet every August.

The 2016 program includes papers discussing reducing ambiguity in
linked-open-data annotations, the visualization of XSLT execution
patterns, automatic recognition of grant- and funding-related
information in scientific papers, construction of an interactive
interface to assist cybersecurity analysts, rules for graceful
extension and customization of standard vocabularies, case studies of
agile schema development, a report on XML encoding of subtitles for
video, an extension of XPath to file systems, handling soft hyphens in
historical texts, an automated validity checker for formatted pages,
one no-angle-brackets editing interface for scholars of German family
names and another for scholars of Roman legal history, and a survey of
non-XML markup such as Markdown.

XML In, Web Out: A one-day Symposium on the sub rosa XML that powers
an increasing number of websites will be held on Monday, August 1.

If you are interested in open information, reusable documents, and
vendor and application independence, then you need descriptive markup,
and Balisage is the conference you should attend. Balisage brings
together document architects, librarians, archivists, computer
scientists, XML practitioners, XSLT and XQuery programmers, implementers of
XSLT and XQuery engines and other markup-related software, Topic-Map
enthusiasts, semantic-Web evangelists, standards developers,
academics, industrial researchers, government and NGO staff,
industrial developers, practitioners, consultants, and the world's
greatest concentration of markup theorists. Some participants are busy
designing replacements for XML while others still use SGML (and know
why they do).

Discussion is open, candid, and unashamedly technical.

Balisage 2016 Program:

Symposium Program:

NOTE: TEI members are eligible for discount registration at Balisage! 

Balisage: The Markup Conference 2016          mailto:info <at>
August 2-5, 2016                     
Preconference Symposium: August 1, 2016                +1 301 315 9631

Thierry Pellé | 20 May 13:17 2016

Encoding poem and its translation with alignment

    As part of HALMA's TALIE project to repurpose La Cerda's commentary on Virgil's works, we would like to
encode a poem and its non-versified translation. The encoding must enable alignment between the two.

For instance (simplified version)

- the Latin verses are

Te quoque, magna Pales, et te memorande canemus
Pastor ab Amphryso: vos, syluae, amnesque Lycaei.
Cetera, quae vacuas tenuissent carmina mentes,

- and their French non-versified translation is

Toi aussi, grande Palès, et toi, que l’on doit mentionner en tant que berger de l'Amphryse, nous te
chanterons, et vous, forêts et rivières du
Lycée. Les autres sujets qui auraient tenu les esprits oisifs sous le charme des vers sont tous divulgués

We would like to align the two at sentence level. For instance,

«Te quoque, magna Pales, et te memorande canemus
Pastor ab Amphryso: vos, syluae, amnesque Lycaei.»

must be aligned with

«Toi aussi, grande Palès, et toi, que l’on doit mentionner en tant que berger de l'Amphryse, nous te
chanterons: et vous, forêts et rivières du Lycée.»

and

«Cetera, quae vacuas tenuissent carmina mentes,
Omnia iam vulgata.»

with

«Les autres sujets qui auraient tenu les esprits oisifs sous le charme des vers sont tous divulgués désormais.»

The problem is that sentences are spread across multiple <l> elements, breaking the XML hierarchy.

I think I could use <milestone> (or <anchor>) to mark the beginning of each sentence but, as a TEI beginner, I
would like to know what the current best practices in TEI P5 are for such a case.
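
To make the question concrete, here is a sketch of one possible encoding, offered as an assumption and not as established best practice: each sentence is cut into <seg> pieces that stay inside their <l>, the pieces are chained with @next and @prev, and a <linkGrp> pairs the first piece of each Latin sentence with the corresponding <s> in the translation. All xml:id values are invented for illustration.

```xml
<lg xml:lang="la">
  <l><seg xml:id="la.s1a" next="#la.s1b">Te quoque, magna Pales, et te memorande canemus</seg></l>
  <l><seg xml:id="la.s1b" prev="#la.s1a">Pastor ab Amphryso: vos, syluae, amnesque Lycaei.</seg></l>
  <l><seg xml:id="la.s2a" next="#la.s2b">Cetera, quae vacuas tenuissent carmina mentes,</seg></l>
  <l><seg xml:id="la.s2b" prev="#la.s2a">Omnia iam vulgata.</seg></l>
</lg>

<p xml:lang="fr">
  <s xml:id="fr.s1">Toi aussi, grande Palès, et toi, que l’on doit mentionner en tant que berger de l'Amphryse, nous te chanterons: et vous, forêts et rivières du Lycée.</s>
  <s xml:id="fr.s2">Les autres sujets qui auraient tenu les esprits oisifs sous le charme des vers sont tous divulgués désormais.</s>
</p>

<!-- each link pairs the head of a Latin sentence chain with its French sentence -->
<linkGrp type="alignment">
  <link target="#la.s1a #fr.s1"/>
  <link target="#la.s2a #fr.s2"/>
</linkGrp>
```

Compared with <milestone> or <anchor>, which mark only a point, the chained <seg> elements delimit the whole sentence while leaving the <l> hierarchy untouched.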

Thank you
Thierry Pellé

PS: Sorry for my frenchy english.

Andrew Jewell | 19 May 20:27 2016

2016 issue of "Scholarly Editing" published

We are pleased to announce the publication of the newest issue of Scholarly Editing: The Annual of the Association for Documentary Editing (vol. 37, 2016), online at www.scholarlyediting.org. Scholarly Editing publishes peer-reviewed editions of primary source materials of cultural significance while continuing the tradition of publishing articles and reviews about scholarly editing. As with previous issues, vol. 37 features diverse content of interest to a range of disciplines and created by an international collection of scholars. Please see below for the full table of contents for the 2016 issue.

Amanda Gailey (gailey <at> and Andrew Jewell (ajewell2 <at>
Editors, Scholarly Editing: The Annual of the Association for Documentary Editing

  • Recent Editions by Ellen C. Hickman (Papers of Thomas Jefferson: Retirement Series)
Gerrit Brüning | 19 May 14:42 2016

FolioViews flat file format to TEI?

Dear all,

Do any of you have experience with converting data from the proprietary
FolioViews flat file format to TEI or other useful XML?
It's been a while since FolioViews has been used to publish electronic
editions on CD-ROM. But I hope that there is someone out there who has
already tried to save such data for an up-to-date online edition.
A key feature of this flat file format is that it has angle-bracketed start
tags but no end tags.
Any advice or hint is highly appreciated.



Dr. Gerrit Brüning
Goethe-Universität Frankfurt am Main | Institut für deutsche Literatur und
ihre Didaktik | IG-Hochhaus 1.155
Freies Deutsches Hochstift | Historisch-kritische Edition von Goethes Faust
Bruening <at>

Roberto Rosselli Del Turco | 17 May 08:14 2016

Reminder: Call for papers/posters: Digital editions: representation, interoperability, text analysis and infrastructures

Dear all,
this is a reminder that the deadline for the AIUCD 2016 conference is 
getting closer (May 31), anybody interested is invited to submit an 



Digital editions: representation, interoperability, text analysis and
infrastructures

	Fifth AIUCD Annual Conference
	7-9 September 2016
	Aula Magna S. Trentin, Ca’ Dolfin, Dorsoduro 3825/e - 30123 Venezia

                       CALL FOR PAPERS AND POSTERS

[Full announcement available on:]

The AIUCD 2016 conference is devoted to the representation and study of 
the text under different points of view (resources, analysis, 
infrastructures), in order to bring together philologists, historians, 
digital humanists, computational linguists, logicians, computer 
scientists and software engineers to discuss the text.

It is time for research infrastructures to be able to guarantee 
interoperability and integration between the instruments for 
philological studies and the instruments for the analysis of large 
textual corpora, breaking down the rigid barriers between digital and 
computational philology on the one hand, and corpus linguistics on the 
other hand.

As a consequence, without ruling out other possible topics belonging to 
the Digital Humanities area, we solicit your contributions (talks and 
posters) on these topics:

_Representation and Interoperability_

  * Which digital representation models prove most effective for 
overcoming the dichotomy between diplomatic and critical editions?
  * How to integrate multimedia products (such as 2D images, 3D models, 
audio, video) in the digital edition?
  * How to apply the methods of digital philology to multimedia products 
(such as film quotations, restored versions, musical variations, etc.)?
  * How to build a constructive dialogue between traditional 
philologists and digital philologists?

_Text Analysis and Digital Objects Processing_

  * Which extensions are needed, in order to apply the methods of 
computational linguistics to the study of variants?
  * How to create linguistic and textual analysis chains starting from 
texts that present variants?
  * How can computational linguistic tools be used to bring out regions 
of interest in large amounts of text on which to focus the attention?
  * What is the state of art for the analysis of digital objects?
  * How to assess the quality of analyses produced by means of the 
crowdsourcing method?


  * What can research infrastructures offer for the management of digital 
  * How to conduct a study of requirements for infrastructures so that they 
are increasingly accessible to both digital humanists and traditional 
  * How can Digital humanities scholars be put in contact with the community 
of traditional scholars?

_Communities and Collaboration_

  * Which benefits do the interaction and the involvement of teachers, 
high school and university students in digital editions projects bring 
to research activities?
  * How can digital libraries collaborate to create, access, share and 
reuse digital resources?
  * How may teachers and students get interested in the dissemination of 
research results?
  * How do digital libraries contribute to the dissemination of research 
  * How to prepare a shared syllabus, in order to train digital 
humanists to become aware of the problems and potentialities of 
digital editions?
  * Which are the best practices to enroll a broader audience in the use 
of digital editions?

*Abstract submission*

The contributions (talks and posters), to be proposed in the form of an 
abstract of 1000 words maximum, in PDF format, must be loaded through 
the EasyChair Web site at this URL: Abstracts will be 
accepted in Italian or in English.

The deadline for submission of abstracts to the Programme Committee is 
scheduled for midnight on May 31, 2016. Information on the acceptance 
will be communicated to the authors by June 30, 2016.

*Abstract preparation and evaluation*

The abstract should describe the objectives of the contribution, a brief 
reference to the state of the art, the methodology adopted, and - if 
possible - the results achieved or expected. It should also contain a 

The call for papers welcomes three types of contributions: (1) full 
paper, mainly to discuss innovative methodologies; (2) short paper, 
mainly to present accomplished research outputs; (3) poster, mainly to 
present early and innovative work in progress.

The conference proposals will be selected through peer-reviewing by at 
least two Italian and/or foreign scholars expert in the fields of 
(Digital) Humanities and/or Computer Science.

At the end of the evaluation process, the Scientific Committee may 
decide to move an accepted proposal to a different category.

*Instructions for talks*

Full papers will last 30 minutes (20-25 min + 5-10 min for questions). 
Short papers will last 20 minutes (15 min + 5 min for questions). The 
conference room is equipped with a computer, a projector, and internet 

*Instructions for posters*

Posters will be accepted in Italian or in English:

  * The best configuration of your poster is A1 vertical (841mm x 594mm).
  * Posters will be displayed in a dedicated space at the Conference 
venue. Display panels will be provided. Please bring your printed poster 
as we are unable to provide printing service.
  * Display panels for posters will be ready by Wednesday, 7th September 
2016 at 10:00 am and all posters should be put up before 2:00 pm.
  * Personal laptop computers may be used at the poster display area. 
Should your presentation include a laptop, please inform the organizing 
committee on acceptance of your proposal.
  * Specific sessions will be scheduled in the conference programme for 
authors to provide the audience with a quick intro (max 2 minutes) to 
their poster.

Further information will be progressively published on the conference 




Roberto Rosselli Del Turco   roberto.rossellidelturco at
Dipartimento di Studi        roberto.rossellidelturco at
Umanistici                   Then spoke the thunder  DA
Universita' di Torino        Datta: what have we given?  (TSE)

  Hige sceal the heardra,     heorte the cenre,
  mod sceal the mare,       the ure maegen litlath.  (Maldon 312-3)

adam | 15 May 14:14 2016

joint effort docx to HTML


I am relatively new to the list but have followed TEI for quite a while.
Essentially my introduction was via Sebastian Rahtz, the kind and
generous man that he was. I was sorry to see him go.

Lately I have started a non-profit foundation interested in turning
around scholarly workflows. There are many problems in these workflows
but perhaps one of the highest value is getting content out of docx (90%
of scholarly articles and monographs originate in docx) and into other
formats. We are particularly interested in HTML for many reasons, mostly
so we can edit the content online, but also because conversion chains
can be relatively easily formed to get from HTML into other formats that
publishers need (eg. nicely formatted & paginated PDF for printing, EPUB

We have been using the TEI stylesheets and OxGarage, but OxG is a little
awkward for us and we need more from our tool chain than OxG can provide
so we have built (relatively quickly) an engine to manage these
conversions using the TEI stylesheets. However, stylesheet conversion is
forever in need of tweaking, it seems, and I was contemplating processes
to continuously improve the conversions. One way is to hire someone, and
we may do that, and another is to build a community effort around this.
I think we might be able to appeal to the scholarly infrastructure
providers and get some traction on a shared effort. To do this I was
contemplating how this might be actioned. One way is to have a whole lot
of individual actors trying stuff out and making pull requests, but that
seems a little haphazard. Better to work out some sort of coordinating
mechanism and perhaps also a shared tool set to provide a kind of
'central hub' for conversion trials and manual QA etc...a central space
where many people could contribute in one way or the other...

So...this is a long winded way of getting to the heart of the
matter...I'd love to know if anyone on this list has experience trying
to set up anything similar? I'm specifically meaning something beyond
putting the stylesheets on github and waiting for contributions - ie.
setting up community processes and tools for a shared effort to refine a
specific conversion type (in this case docx to HTML).

I'm sure there are many here that have this experience, and I'd be
grateful for any advice or introductions that may be able to take me a
little further down this path.

Many thanks,




Adam Hyde