DC-2014 Special Session: RDF Application Profiles and Tools for Metadata Validation and Quality Control

RDF Application Profiles and Tools for Metadata Validation and Quality Control
Half-day Special Session @ DC-2014
Thursday, 9 October 2014 - 1:30-5:00

LOCATION: Austin, Texas, USA
VENUE: AT&T Executive Education & Conference Center (http://www.meetattexas.com/)
CONFERENCE WEBSITE: http://purl.org/dcevents/dc-2014

Session Abstract: 

This session will focus on establishing requirements for implementing Application Profiles from the perspective of software developers. In particular, our interests include the requirements necessary for performing validation and quality checks within tools, and the extent to which established and developing constraint languages remain valuable in our context.

Scope and Motivation: 

Over the last fourteen years, the DCMI community has focused much of its efforts on the development of Application Profiles as a means to enable reuse of properties across multiple schemas, as well as constraint languages to express those profiles. Building on the DC-2013 special session Application Profiles as an Alternative to OWL Ontologies, this session will explore the requirements for defining and implementing Application Profiles from the perspective of software developers and other implementers. In particular, our session will focus on the requirements necessary for performing validation and quality checks within tools, and the extent to which established and developing constraint languages, such as Description Set Profiles and Shape Expressions/RDF Data Shapes, remain valuable in our context.

Confirmed Panelists:
  • Mark Matienzo (mark <at> matienzo.org), DPLA, USA (Facilitator)
  • Kevin Ford, Library of Congress, USA
  • Thomas Johnson, Oregon State University, USA
  • Eric Miller, Zepheira, USA
  • David Wood, 3 Round Stones, USA

Open Questions Guiding the Session:
  1. How can Application Profile-based validation provide meaningful feedback to a user editing a "record" or set of statements?
  2. From the perspective of an implementer, what do we mean by "validation," and does it mean different things to implementers building user-facing tools than to those building automated systems to perform these checks?
  3. How are existing constraint languages valuable to implementers, particularly if the tools we are building cannot interpret or act on them natively?
  4. Should we prioritize developing tools that can interpret serialized constraint definitions, or ensuring that our tools and systems can serialize their constraints into one of these languages?

Special Session Sponsors:
  • Digital Public Library of America (DPLA)
  • DCMI Technical Board

You can register using the day-rate option to DC-2014 or join us for the full DC-2014 program at http://dcevents.dublincore.org/index.php/IntConf/index/pages/view/registration-2014.

Don't procrastinate, register now! The Conference discount hotel block rate at the AT&T Center ends 12 September. This Special Session and the DC-2014 Conference cap the second week of the Austin City Limits Music Festival (http://www.aclfestival.com/), and hotel rooms will become increasingly scarce as the Conference dates approach.

Mark Matienzo
   Director of Technology, DPLA
page for TEI wiki on SharedCanvas


 From a few scattered mentions online, it seems that SharedCanvas ( 
http://www.shared-canvas.org/ ) has some way of working with TEI 
documents, or being used by people who also use TEI.  But I can't make 
out enough of the relation to actually create a page at 
http://wiki.tei-c.org/index.php/Category:Tools .  Is there anyone who 
knows something about it who'd be willing to create this page?  The TEI 
community would appreciate it!


DC-2014 in Austin, Texas, USA--Early Registration closing soon

DC-2014: "Metadata Intersections: Bridging the Archipelago of Cultural Memory"

DATES:  8-11 October 2014
LOCATION: Austin, Texas, USA

It is just under two weeks until the closing of Early Registration for DC-2014, the International Conference on Dublin Core & Metadata Applications.  If you work or study in any sector of the metadata ecosystem, you will not want to miss DC-2014 in Austin, Texas.  In addition to an excellent set of peer-reviewed full papers, project reports and posters, there is an array of pre- and post-Conference tutorials and workshops, Conference Special Sessions and Best Practices Posters & Demonstrations:

Tutorials & Workshops:
  • Fonds & Bonds: Archival Metadata, Tools, and Identity Management (full day workshop at the Harry Ransom Center)
  • Training the Trainers for Linked Data (full day, hands-on workshop)
  • RDF Validation in the Cultural Heritage Community (1/2 day tutorial)
  • Overview: Positioning DCMI & Dublin Core in the Metadata Ecosystem (1/2 day tutorial)

Special Sessions:
  • RDF Application Profiles and Tools for Metadata Validation and Quality Control (sponsored by Digital Public Library of America (DPLA) & DCMI Technical Board)
  • Schema.org, SchemaBibExtend -- Structured Data on the Web (sponsored by OCLC & Yandex)
  • BIBFRAME -- Expressing and Connecting Bibliographic Data (sponsored by the Library of Congress)
  • DCMI Roundtable (experts on BIBFRAME, Europeana Data Model (EDM), DPLA Metadata Application Profile (MAP), Schema.org/SchemaBibExtend, and more -- sponsored by DCMI/Technical Board)

The program is available at http://dcevents.dublincore.org/IntConf/index/pages/view/program14 and the paper, project report and poster abstracts are available at http://dcevents.dublincore.org/IntConf/index/pages/view/abstracts-2014.

Don't procrastinate, register now! The Conference weekend caps the second week of the Austin City Limits Music Festival (http://www.aclfestival.com/) and hotel rooms will become increasingly scarce as Conference time approaches.

Register for TAPAS Workshop at TEI 2014

Dear Colleagues,

As you finalize your plans for attending the 2014 TEI Conference (October 22-24), the TAPAS project team hopes you will consider attending our inaugural workshop on Saturday, October 25, as part of the conference’s annual workshop series (http://tei.northwestern.edu/workshops/).

This one-day workshop, led by Syd Bauman and Julia Flanders, will introduce participants to the full range of TAPAS services for archiving, sharing, transforming, and publishing TEI data. Conversations and hands-on activities will also focus on topics in metadata, TEI validation and troubleshooting, and incorporating TAPAS into TEI project workflows.

For more information on our workshop, please see our news writeup at http://www.tapasproject.org/news/2014/07/09/inaugural-tapas-workshop-2014-tei-conference-northwestern-university-saturday

To register, please email us at tei.publishing <at> gmail.com, with the subject line: Register | October 2014 TAPAS Workshop. Please be sure to include your full name and, if applicable, institutional affiliation.


div1, div2, ...

I'm a bit confused. I've made a number of schemas in Roma the way I used to (basically RTFM). For a few years
now I have had a tei_lite and a tei_all sedimenting on my hard disks. Our collection validates against all, but
the new tei_lite one doesn't seem to support div1, div2... Has that way of expressing hierarchy been
deprecated recently?
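
Before choosing which schema to validate against, it may help to see how much of a collection actually relies on numbered divisions. The following is only a rough sketch, not part of Roma or any TEI tool: the function name `count_divs` and the sample document are hypothetical, and it assumes the files are in the TEI P5 namespace.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"  # TEI P5 namespace
DIV_NAME = re.compile(r"^div[1-7]?$")   # matches div and div1..div7


def count_divs(xml_text):
    """Count <div> and numbered <divN> elements in a TEI document string."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    prefix = "{" + TEI_NS + "}"
    for elem in root.iter():
        # ElementTree spells namespaced tags as "{namespace}localname"
        if elem.tag.startswith(prefix):
            local = elem.tag[len(prefix):]
            if DIV_NAME.match(local):
                counts[local] += 1
    return counts


# Hypothetical sample mixing numbered and unnumbered divisions
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div1><div2><p>numbered</p></div2></div1>
    <div><div><p>unnumbered</p></div></div>
  </body></text>
</TEI>"""

print(count_divs(SAMPLE))
```

A file whose count contains only `div` would not be affected by a schema that leaves out the numbered variants.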

I have some problems with the new tei_all and xmllint --relaxng. Is that a schema bug or a software bug? Or is it
just me?



Is EEBO-TCP the biggest TEI archive?

I am currently working on an essay about the imminent release of 25,000
EEBO-TCP texts into the public domain, to be followed in 2020 by the
release of another 45,000 texts, adding up to an archive of at least one
version of every "book" published in the English-speaking world before

I'll be grateful for any evidence that would lead me to modify two
propositions that are part of this essay:

1. The EEBO-TCP archive is by a wide margin the largest single archive of TEI
texts in the world.
2. Anglophone Early Modern Studies will be in a privileged position to
experience changing modes of reading in a digital world ('distant',
'scalable', 'strategic', 'macro') because there is no other historical
discipline where the print record of a period of comparable duration and
importance has been digitized with comparable density and completeness.

An immediate challenge to #2 might come from the classicists, where just
about every text that has survived has been digitized, often very
carefully. But the record itself is much more fragmentary: we know the
titles of ~1,000 Greek plays, but fewer than 50 have survived. By
contrast, about 800 English plays before 1660 have come down to us, and we
know the titles of ~150 lost plays.

Take advantage of the Early Registration Discount for TEI 2014 by August 31

Dear Colleague,

a quick reminder that the Early Registration discount of 20% for the 2014
TEI conference will end August 31, 2014. If you register before
that deadline you save yourself a little money, and you give the
conference organizers very useful information about how many people to
plan for. For your sake and ours, please take advantage of the discount
and register at https://www.conftool.net/tei2014/

Conference papers will run from Wednesday morning (October 22) to Friday
noon (October 24). The annual TEI members meeting is scheduled for Friday,
October 24, 9 am. A list of plenary sessions and papers is available at
the conference web site http://tei.northwestern.edu/keynote-speakers and
http://tei.northwestern.edu/papers/. I expect to publish the actual
schedule in the second week of September.

In previous years, workshops have been held before the conference. This
year they will be held on Saturday, October 25, following the conference.
For a list of the workshops and their times, consult the program at

For hotel information consult http://tei.northwestern.edu/local-info/.
Getting from O'Hare airport to Evanston is uncomplicated and usually a
matter of less than an hour. More details about this later.

Best wishes for what little remains of the summer.

ETCL Web Developer / Programmer Job Posting

The Electronic Textual Cultures Lab at the University of Victoria is 
looking for a full-time (35 hours per week) web programmer to work with 
its team on several initiatives, including:

• developing digital humanities projects within an academic
framework, and

• assisting in the development of plugins and features for an online
journal environment.

The ETCL is a leading-edge Humanities research lab, working on a variety 
of exciting projects. Self-motivated personalities are essential. 
Individual development and new ideas are encouraged. Read more about us at


Experience & Qualifications:

The successful candidate should have completed a computer science or 
other relevant degree program and have demonstrated experience in the 
following tools and technologies:

- Strong programming fundamentals and experience with the following:

• Current web development technology, including PHP and JavaScript
• XML/XSLT/HTML5/CSS3 and W3C Standards

• Relational databases, including design, in the context of literary 
analysis (esp. MySQL and PostgreSQL)

• Content management systems or similar, including WordPress and Drupal

• Apache and Linux server administration

- Commitment to and interest in contributing to Free or Open Source 
Software (F/OSS)

- Experience in distributed collaboration using git, mailing lists, and 
issue tracking software

- Additional consideration given for:

• Knowledge of, or experience with, Public Knowledge Project software 
(e.g. Open Journal Systems) or a similar open source project is valuable 
but not necessary

• Experience with current interface development using AJAX, JQuery,
Bootstrap or similar tools

• Experience with other relevant technologies, such as Python, Ruby on
Rails, Elasticsearch

• Experience with graphic design in a web-based context

• Experience with Solr and Tika

• Experience with OpenID Connect and/or other SSO technologies

The ability and desire to learn technologies on this list that the 
candidate lacks is an asset.

Position Duties:

• Develop and implement database-driven websites in a humanities
research context

• Conduct open-source software research

• Participate in meetings and constructive discussion with other team members

• Engage in requirement elicitation

• Offer consultation, technical planning, and project solutions

• Understand humanities concepts and find ways to realize them as
technical solutions

• Develop plugins, new features, and integrations in and with Open 
Journal Systems and other open source tools

• Provide reporting and documentation

This contractual position is initially for an 8-month term, from 
September 2014 to April 2015, with possibility of renewal. Salary for 
this position is competitive in the academic market and will be 
commensurate with experience and qualifications.

Applications, comprising a brief cover letter, a resume, links to 
completed projects, PHP code samples, and the names and contact 
information for at least two referees, may be sent electronically to 
etcl <at> uvic.ca. Applications will be received and reviewed until the 
position is filled. Salary will be commensurate, in the university 
context, with expertise and experience.

Position subject to funding approval.

Different attributes permitted for: '<pron>' & '<pVar>'


I am creating a dictionary in which each entry includes phonetic transcriptions in IPA and SAMPA-based notation, encoded with '<pron>' and (where needed) '<pVar>' elements.

Ideally I would use the '@notation' attribute to express this in both of these elements; however, that attribute is not available on <pVar>, so I have been using '@style' there.

I realize I could change both to use '@style', but 'notation' is the semantically more accurate term for describing a phonetic alphabet system.

Given that the core purpose of both elements is to express some kind of phonetic information, it seems that they should have the same attributes available to express it.

The problem I am describing may be relevant to others doing the same kind of work, but at the core of this post, I'd like to see about changing this.

Any thoughts, advice? 
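
To make the asymmetry concrete, here is a small sketch of the encoding described above and of what a processing script then has to do with it. The sample entry, its transcriptions, and the helper names `notation_of` and `pronunciations` are hypothetical; the attribute usage (notation on pron, style standing in for it on pVar) simply mirrors the workaround described in this post and is not a recommendation.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry: notation on <pron>, style used as a stand-in on <pVar>.
ENTRY = """<entry xmlns="http://www.tei-c.org/ns/1.0">
  <form>
    <orth>water</orth>
    <pron notation="ipa">ˈwɔːtə</pron>
    <pVar style="ipa">ˈwɑːtər</pVar>
    <pron notation="sampa">"wO:t@</pron>
  </form>
</entry>"""


def notation_of(elem):
    """Return the phonetic notation recorded on a <pron> or <pVar> element.

    Because the two elements carry the information in different
    attributes, the lookup needs a fallback from notation to style.
    """
    return elem.get("notation") or elem.get("style")


def pronunciations(xml_text):
    """List (element name, notation, transcription) for each pron/pVar."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        local = elem.tag.split("}", 1)[-1]  # strip the "{namespace}" prefix
        if local in ("pron", "pVar"):
            found.append((local, notation_of(elem), elem.text))
    return found


for name, notation, text in pronunciations(ENTRY):
    print(name, notation, text)
```

If both elements allowed the same attribute, `notation_of` would reduce to a single attribute lookup, which is the practical cost of the current difference.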

Call for application — internship/position on LMF-TEI

Short-term research internship/position -- Comparing and Improving Lexical Representation Standards

Inria is looking for a highly motivated young researcher (PhD holder or PhD student) to provide an in-depth analysis of ISO 24613 (Lexical Markup Framework - LMF) and its current application in both the Text Encoding Initiative (TEI) guidelines and the W3C OntoLex initiative.

The researcher’s background and skills should comprise:

- Research interest in lexical information (from a lexicographic, corpus linguistic or computational linguistic viewpoint). The researcher may work on his/her own data.

- Understanding of data modeling in XML and/or OWL, with basic skills in XSLT. Experience with XML schema languages is a plus.

The core activity of the short research stay will be to examine how well the LMF meta-model is reflected in the TEI guidelines and the current OntoLex specification, in order to create a customisation of the TEI guidelines that has at least the same coverage as the OntoLex specification. The work will include gathering lexical samples that could serve as a proof of concept for this customisation.

Salary: may range from 1100€ to 2100€ (after deductions) depending on status and experience

Duration: 5-month employment, starting as soon as possible

Location: Berlin (Germany) with the work contract established in France. Depending on the current location and constraints of the applicant, the precise organisation of work can be subject to further agreements.

Contact: applications comprising a research CV and a motivation letter should be sent to laurent.romary <at> inria.fr

Background reading:

Laurent Romary. TEI and LMF crosswalks. Stefan Gradmann and Felix Sasaki. Digital Humanities: Wissenschaft vom Verstehen, Humboldt Universität zu Berlin, to appear — http://hal.inria.fr/hal-00762664

Laurent Romary, Werner Wegstein. Consistent modelling of heterogeneous lexical structures. Journal of the Text Encoding Initiative, TEI Consortium, 2012 — http://hal.inria.fr/hal-00704511

Lothar Lemnitzer, Laurent Romary, Andreas Witt. Representing human and machine dictionaries in Markup languages. Ulrich Heid. Dictionaries. An International Encyclopedia of Lexicography. Supplementary volume: Recent developments with special focus on computational lexicography, Mouton de Gruyter, 2014 — http://hal.inria.fr/inria-00441215

John McCrae, Dennis Spohr, Philipp Cimiano, “Linking Lexical Resources and Ontologies on the Semantic Web with Lemon”, in The Semantic Web: Research and Applications, Lecture Notes in Computer Science Volume 6643, 2011, pp 245-259

Job: Contract Historian/Electronic Editor, Foreign Relations series

Dear TEI-L recipients,

Please forward this to any candidates you think would be interested.


Job Announcement for Contract Historian/Electronic Editor

Department of State, Contract Historian/Electronic Editor, Foreign
Relations series

Location: Washington, DC, USA

The Digital History Advisor in the Office of the Historian, U.S.
Department of State, Washington, DC, seeks a contract editor with a
strong background in documentary and electronic editing and history to
perform quality control reviews of digitized volumes from the Foreign
Relations of the United States series, the official documentary record
of U.S. foreign relations. The position requires extensive experience
in electronic editing and related technologies, particularly TEI XML,
ODD, XML schema technologies, XPath, XQuery or XSLT, Git or
Subversion, and familiarity with HTML and CSS. Additional requirements
include knowledge of U.S. foreign relations and diplomatic history as
demonstrated by, at minimum, 18 credit hours of history at the
undergraduate or graduate level.

This is a remote telework position, and applicants must possess a
modern computer, a licensed copy of the current version of oXygen XML
Editor, and a high speed broadband internet connection.

Projected start date is September 1, 2014, with 12 months of funding
(possibly renewable) at the GS-13 equivalent pay rate.

Interested candidates should submit a resume and cover letter to Dr.
Joseph Wicentowski at wicentowskijc <at> state.gov by August 20, 2014;
please send any inquiries to this address too. For more information
about the Office of the Historian please visit the website,