Dan Matei | 1 Nov 2009 12:48

Time-dependent identities

Me too (adding to the "identity complications")...

The discussion on "Tim Berners-Lee on the Semantic Web" gives me the opportunity to share some of my
practical problems related to identities.

The triggers:

<snip id="1">
From: Alexander Johannesen <alexander.johannesen_at_nyob> 
Date: Fri, 30 Oct 2009 16:25:38 +1100

An example of this complexity is what semantics are
being applied at the time of use of the identifier. If we use the
un.org one, we must encompass the whole of UN's history in that
identifier, the ever-changing organisation. But if OCLC used theirs,
what semantics are within? The first era? The era after the cold war?
What it is now? This is about bias of identifiers.
</snip>

<snip id="2">
From: McDonald, Stephen <Steve.McDonald_at_nyob> 
Date: Fri, 30 Oct 2009 11:30:18 -0400

>  1. http://www.un.org/    (controlled by themselves)
>  2. http://id.oclc.org/org/d8445    (controlled by third-party)

How about a counter-example?  Try www.digital.com.  Digital Equipment
Corporation used to be very important.  You could have used
www.digital.com as a URI in the 90's, but the company went out of

Karen Coyle | 1 Nov 2009 19:16

Re: Time-dependent identities

>
> NB. I'm concerned here only with "subject identifiers" (not with   
> "resource identifiers/locators").

I'm not sure that makes a difference -- identification should be the  
same for both, right?

>
> Case 1.
>
> In our archaeological database we have (sometimes) to distinguish a   
> geo-political entity (as a legal body) from its
> territory. E.g. The Habsburg Empire. Its territory was (very)   
> time-dependent. So, when we locate a historical site we
> have to be very careful with the jurisdiction. How to identify   
> conveniently the Habsburg Empire's territory ?
>
> Besides: one identifier also for the "Austrian Empire" (i.e. ante   
> 1867) and one for "Austria-Hungary (1867 - 1918) ?
> That is, 3 in toto ?

Right. Then the trick is to create some place where you identify the  
relationships between them, kind of "preceding/succeeding", like we do  
for serial titles. The LC authority file has things like "Italy  
(Fascist Republic : 1943-1945)". However, I don't think we actually do  
this very well in the name authority files, because we don't generally  
include specific relationships, so this is an area where adding  
relationships could really help disambiguate.
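
The "one identifier per thing, plus explicit relationships" idea can be sketched roughly as follows. This is only an illustration: the `http://example.org/id/` namespace, the relation names `succeededBy` and `partOf`, and the function names are all hypothetical, and the only dates used are the ones mentioned in the thread (before 1867, and 1867 to 1918).

```python
from dataclasses import dataclass
from typing import List, Optional

BASE = "http://example.org/id/"  # hypothetical namespace, not a real service


@dataclass(frozen=True)
class Period:
    """One identifier per distinct 'thing', with the years it covers."""
    uri: str
    label: str
    start: Optional[int]  # None = start year not stated in the thread
    end: Optional[int]    # inclusive; None = open-ended


PERIODS = [
    Period(BASE + "austrian-empire", "Austrian Empire", None, 1867),
    Period(BASE + "austria-hungary", "Austria-Hungary", 1867, 1918),
]

# Third identifier for the whole, plus the relationships between all three.
WHOLE = BASE + "habsburg-empire"
RELATIONS = [
    (BASE + "austrian-empire", "succeededBy", BASE + "austria-hungary"),
    (BASE + "austrian-empire", "partOf", WHOLE),
    (BASE + "austria-hungary", "partOf", WHOLE),
]


def periods_for_year(year: int) -> List[str]:
    """Return the URI of every period whose date range covers the year."""
    hits = []
    for p in PERIODS:
        after_start = p.start is None or p.start <= year
        before_end = p.end is None or year <= p.end
        if after_start and before_end:
            hits.append(p.uri)
    return hits


def successor_of(uri: str) -> Optional[str]:
    """Follow a 'succeededBy' relation, if one is recorded."""
    for subj, pred, obj in RELATIONS:
        if subj == uri and pred == "succeededBy":
            return obj
    return None
```

A transition year such as 1867 deliberately returns both periods, so the cataloguer still has to decide which jurisdiction applies to a given site.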

Basically, every new "thing" or concept gets a different identifier,  

Karen Coyle | 1 Nov 2009 19:34

Re: Tim Berners-Lee on the Semantic Web

Great examples, Ross!

I do think you have a mix here of Work and Manifestation (because of  
the LCCN and ISBN and the links to manifestations), but I wonder if we  
won't be resolving that with class and domain definitions... So that  
the subjects will be defined as Work, and then you've connected that  
Work to Manifestations by including those manifestation-related  
identifiers?

As an FYI, whereas you have included both display forms and URI forms  
of subjects separately as:

<dc:subject>Soldiers--Fiction</dc:subject>
<dc:subject>Teenage girls--Fiction</dc:subject>
<dc:subject>Teenage pregnancy--Fiction</dc:subject>
<terms:subject rdf:resource="http://id.loc.gov/authorities/sh2007101961#concept"/>
<terms:subject rdf:resource="http://id.loc.gov/authorities/sh2008104232#concept"/>
<terms:subject rdf:resource="http://id.loc.gov/authorities/sh2008108377#concept"/>

Andy Powell recently put up examples on his blog in this format:

  <dcterms:subject>
    <rdf:Description rdf:about="http://id.loc.gov/authorities/sh85101653#concept">
      <rdf:value>Physics</rdf:value>
    </rdf:Description>
  </dcterms:subject>
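
For anyone who wants to generate the combined form (URI plus display label in one `dcterms:subject`), here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `subject_element` is made up for illustration; the namespace URIs are the standard RDF and DC Terms ones, and the concept URI and label are the ones from the example above.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"

# Ask ElementTree to serialize with the usual prefixes instead of ns0/ns1.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dcterms", DCTERMS)


def subject_element(concept_uri: str, label: str) -> ET.Element:
    """Build a dcterms:subject carrying both the concept URI and its label."""
    subject = ET.Element(f"{{{DCTERMS}}}subject")
    description = ET.SubElement(
        subject,
        f"{{{RDF}}}Description",
        {f"{{{RDF}}}about": concept_uri},
    )
    value = ET.SubElement(description, f"{{{RDF}}}value")
    value.text = label
    return subject


elem = subject_element(
    "http://id.loc.gov/authorities/sh85101653#concept", "Physics"
)
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)
```

The point of the nested form is that the label travels with the URI, so a consumer that cannot dereference id.loc.gov still has something to display.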

Alexander Johannesen | 2 Nov 2009 03:25

Re: Time-dependent identities

On Sun, Nov 1, 2009 at 22:48, Dan Matei <Dan <at> cimec.ro> wrote:
> NB. I'm concerned here only with "subject identifiers" (not with "resource identifiers/locators").

Karen asks whether these two aren't the same, or says that the difference
doesn't matter. My answer is, well, they are very distinct things
which happen to correlate in the way they look. One is a URI that
resolves, the other is just a string of characters, so they
have two distinctly different semantics.

> Case 1.
...
> How to identify conveniently the Habsburg Empire's territory ?

Geographic territories are inherently difficult to mark up in a
non-graphic way. I can say "King Harold was the King of Norway" and
refer to a specific period in time, yet the geographical markup of
that time was substantially different from the times before and
after. In the RDF world, I have no idea how to best do it. I think a
stream of GPS coordinates and such is massively overkill, but
perhaps the only thing that will be precise enough? (Well, precise
until you discover that borders are a very imprecise science of the
past)

In a Topic Map we often have topics that represent the whole, so ;

   http://library.org/history/europe/power/Habsburg_Monarchy

If you want to be specific about certain markups within it, they are
in essence their own subjects ;


Ross Singer | 2 Nov 2009 16:12

Re: Tim Berners-Lee on the Semantic Web

On Sun, Nov 1, 2009 at 2:34 PM, Karen Coyle <lists <at> kcoyle.net> wrote:
> Great examples, Ross!

Thanks, but, obviously, it's really, really, really, really rough.
>
> I do think you have a mix here of Work and Manifestation (because of the
> LCCN and ISBN and the links to manifestations), but I wonder if we won't be
> resolving that with class and domain definitions... So that the subjects
> will be defined as Work, and then you've connected that Work to
> Manifestations by including those manifestation-related identifiers?

Yes, well, you've touched on a good point here.  There is sort of an
assumption here that these are /mostly/ Manifestations.  I've largely
avoided explicit FRBR assertions because 1) there is no persistence
layer to this app and there is no obvious way to "proxy" works or
expressions from the LCCN Permalinks service, and 2) I need to see more
examples of records to know if there's even consistency in determining
if they are manifestations or not.

So, right -- this is largely a pragmatic approach trying to use what's
actually available in our current data set.  And, of course, it's a
work in progress.

That being said, there's at least a nod to FRBR WEMI in there, rather
than saying http://lccn.heroku.com/94510751 is the same thing as
http://dbpedia.org/resource/The_Hobbit_films (which, now that I'm
looking at it, is the wrong resource anyway -- it should be
http://dbpedia.org/resource/The_Hobbit_%281977_film%29 -- might need
to use Freebase for films) - it's using dcterms:isVersionOf to help
explain that we're actually referring to a videocassette of the movie,

McGrath, Kelley C. | 2 Nov 2009 18:00

Re: At Univ. of South Carolina, the Card Catalog's Graceful Departure

I can see why someone might want to demonstrate certain concepts using a card catalog, but I can't imagine
it's helping today's students very much.

The sad thing is that the card catalog performed certain functions (e.g., giving people a sense of where
they're at in the big picture, providing a sensible organization and display of many types of materials)
better than current OPACs do. IMHO, there are several reasons for this.

1. Our data is still designed for printing cards, rather than providing machine-manipulable data for
today's environment. The MARC format, despite some visionary elements, was designed for the practical
task of printing cards. Our data is overly focused on text strings and not designed for easy extraction and
manipulation of parts of the record. We retain practices that were designed to save space on cards. A lot of
things don't work well in the OPAC because they were designed to produce data to be interpreted and filed by
a human being. We need to modernize what data we record and how we record it. As it is, the form of the data
often is an obstacle to developing the systems we need.

2. On the other hand, designers of OPACs seem to often fail to understand the purpose of the data and thus
implement interfaces that are at cross purposes with what the data is trying to do. An obvious example is
the failure of most OPACs to display materials that are available in many versions and are usually meant to
be collocated by uniform titles (e.g., classical Western music) in any way that would help someone browse
or make it easy for them to know what the library actually has.

Sometimes, it seems like OPAC designers don't take the time to understand how something works, even when it
actually is relatively straightforward and meant to work in a computer-based environment. One of my pet
peeves with our catalog (SirsiDynix Symphony) is that it has a completely dysfunctional language
limiter. This is based on the language 008 fixed field and the 041 field, which contains additional
languages other than the main one in the 008. This is coded data. SirsiDynix offers only under-retrieval
(008 only, which misses any bilingual materials or alternate soundtracks or subtitle tracks on DVDs) or
over-retrieval (008 + the entire 041, including $h, which is for the original or intermediate
translation language and makes no sense mushed in with the limiter for languages that
items are actually useful in). We have the over-retrieval setting, which is okay usually for DVDs, but is a
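
The complaint about the language limiter suggests a setting between the two that are described: index the 008 language plus every 041 subfield except $h. A minimal sketch, assuming the record has already been reduced to the three-character 008 language code and a plain dict of 041 subfields. This data shape and the function name are hypothetical; it is not SirsiDynix configuration or any MARC library's API.

```python
from typing import Dict, List, Set

# 008 values that carry no usable language: blanks and the fill characters.
NO_LANGUAGE = {"", "|||"}


def limiter_languages(lang_008: str, field_041: Dict[str, List[str]]) -> Set[str]:
    """Languages an item should match in a language limiter.

    lang_008  -- the language code taken from the 008 fixed field
    field_041 -- subfield code -> list of language codes,
                 e.g. {"a": ["eng", "spa"], "h": ["jpn"]}
    """
    languages: Set[str] = set()

    main = lang_008.strip()
    if main not in NO_LANGUAGE:
        languages.add(main)

    for subfield, codes in field_041.items():
        if subfield == "h":
            # $h records the original or intermediate-translation language,
            # not a language the item in hand is actually usable in.
            continue
        languages.update(codes)

    return languages


# A DVD with English and Spanish soundtracks, French subtitles,
# originally made in Japanese (illustrative values only).
dvd = limiter_languages("eng", {"a": ["eng", "spa"], "j": ["fre"], "h": ["jpn"]})
```

With the 008-only setting this DVD would be found under English alone; with the whole 041 it would also turn up under Japanese, which is the over-retrieval described above.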

Walker, David | 2 Nov 2009 20:29

"Limited" Google Books Search ?

Section 4.1 (a)(vi)(1)(b) of the proposed Google Book Search settlement [1], in talking about
"institutional subscriptions,"  says:

    Subscription for each of the classes of institutions identified in
    Section 4.1(a)(iv) (Pricing Bands), including Institutional
    Subscriptions for each of the discipline-based collections that may
    be offered, Institutional Subscriptions that provide access to the
    entire Institutional Subscription Database, and any Limited
    Subscriptions.

As far as I can tell, "Limited Subscriptions" is nowhere else defined in the document.  I'm curious if
anyone has any insight into this?

I ask because the document says in an earlier section that institutional subscriptions will be based on
"prices for comparable products and services."  Based on what we're paying now for, say, Safari, I'm
guessing a 10-million (or so) volume e-book collection is going to be VERY expensive.

Further, the document says that Google can (only?) offer two "versions" of subscriptions: (1) the entire
database, or (2) "discipline-based collections."

It seems to me, though, that if an undergraduate institution cannot afford the entire GBS database -- which
I think may be entirely likely -- "discipline-based" collections won't be a suitable alternative.  It's
not like we would only buy a Sociology GBS collection, for example, and tell everyone else they're out of luck.

So I think there is a need for a version of Google Book Search that would span all disciplines, but not include
all books.  A kind of Google Book Search Elite (compared to Premier or Complete), to borrow an EBSCO naming convention.

I wonder, then, if the "Limited Subscription" is just such a thing?

--Dave

Ed Jones | 2 Nov 2009 21:19

Re: "Limited" Google Books Search ?

Limited Subscription is defined in 1.83 as "an Institutional Subscription offered to a library that
allows the subscribing library access only to the Books Digitized from that library, or only to the Books
held by that library."


Walker, David | 2 Nov 2009 22:30

Re: "Limited" Google Books Search ?

Thanks, Ed.  I guess it helps to read the "Definitions" section of the document. :-P

Section 4.1 (c) says that "Google may work through intermediaries to sell Institutional Subscriptions." 
I wonder, then, if maybe libraries will be able to license smaller sets of Google Book Search through one of
these "intermediaries," maybe like how we are licensing Safari books through ProQuest?

--Dave

==================
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu

Ed Summers | 2 Nov 2009 22:31

Re: Tim Berners-Lee on the Semantic Web

On Mon, Nov 2, 2009 at 10:12 AM, Ross Singer <rossfsinger <at> gmail.com> wrote:
> On Sun, Nov 1, 2009 at 2:34 PM, Karen Coyle <lists <at> kcoyle.net> wrote:
>> I also wonder what we'll do with situations where we have:
>>
>>        Teenage girls -- Fiction -- Comic books, strips, etc.
>>
>> and id.loc.gov has only
>>
>>        Teenage girls -- Fiction
>>
>> It seems that (other than the problem of matching a longer string to a
>> shorter, rather than vice-versa) we'll want a way to say: this subject
>> heading is an extension of this LC subject in the LC authority file. It
>> seems like it could be a simple relationship... yes?
>
> Hmm... "probably".  Perhaps (the non-existent)
> <http://lccn.heroku.com/subjects/Teenage girls -- Fiction -- Comic
> books, strips, etc> <skos:broaderTransitive>
> <http://id.loc.gov/authorities/sh2008112612> ?  That could possibly be
> the way to bridge subjects /based/ on authorities to authorities,
> yeah?

At the time there wasn't enough real world use of SKOS w/ coordination
to warrant cooking something into the SKOS vocabulary itself. However,
as Karen points out there is a real need.

Since the components are controlled:

  Teenage girls -- Fiction <http://id.loc.gov/authorities/sh2008112612#concept>
  Comic books, strips, etc <http://id.loc.gov/authorities/sh99001401#concept>
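
The "longer string to a shorter" matching Karen mentions can be sketched as a longest-prefix lookup over the heading's components. This is only a rough illustration: the `AUTHORITIES` dict stands in for a real lookup against id.loc.gov, the function names are made up, and which property should link the local heading to the match (skos:broader or something else) is left open, as it is in the thread. The two URIs are the ones quoted above.

```python
from typing import Dict, List, Optional, Tuple

# Stand-in for an authority lookup; only the two headings quoted in the thread.
AUTHORITIES: Dict[str, str] = {
    "Teenage girls -- Fiction": "http://id.loc.gov/authorities/sh2008112612#concept",
    "Comic books, strips, etc": "http://id.loc.gov/authorities/sh99001401#concept",
}


def split_heading(heading: str) -> List[str]:
    """Split on the '--' separator and trim spaces and a trailing period."""
    return [part.strip().rstrip(".") for part in heading.split("--")]


def longest_authorized_prefix(
    heading: str, authorities: Dict[str, str]
) -> Tuple[Optional[str], List[str]]:
    """Return (URI of the longest matching leading heading, leftover parts)."""
    parts = split_heading(heading)
    for n in range(len(parts), 0, -1):
        candidate = " -- ".join(parts[:n])
        if candidate in authorities:
            return authorities[candidate], parts[n:]
    return None, parts


uri, leftover = longest_authorized_prefix(
    "Teenage girls -- Fiction -- Comic books, strips, etc.", AUTHORITIES
)
# Leftover components may themselves be controlled, as with the form subdivision here.
leftover_uris = [AUTHORITIES.get(part) for part in leftover]
```

Here the full three-part string has no authority record, so the match falls back to "Teenage girls -- Fiction", and the remaining component resolves to its own concept URI.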

