Re: Resignation

On 9/1/07, Rinne, Nathan (ESC) <RinneN <at> district279.org> wrote:
> You would have to rely on proven
> authority work - the work that is now increasingly seen as less
> important by more and more people in the Googlized atmosphere we inhabit

This is where you're wrong, and this is why I've said in the past that
you're not reading what I write. The idea here is to take what we've
done so far and push it further, especially now that technologies
advance beyond certain points. We need to work together to make sure
that the excellent work that the library world so far has done is not
only shared with the world, but extended and continued.

> - in order to do this very concrete experiment (where else will you get
> the more-or-less complete biographies of the various authors -
> Wikipedia?)

The paper I pointed to which did auto-classification across Project
Gutenberg chose LCSH specifically so that librarians could verify the
result. They could have chosen anything. Maybe it was a cry for us to
join them? Anyways, I think they did this especially for people like
you who *already* have faith in the LCSH. Did you read the paper in
full? The conclusion has a golden nugget for you. :)

> Maybe, "ironic" was the wrong choice of words.  I just feel
> the need to point out that it's not just "natural selection" that makes
> this possible, you know.  :)

Natural selection happens in social and cultural contexts as well as
in nature. Just thought I'd point that out. :)


Conal Tuohy | 1 Sep 03:23

Re: Resignation

Nathan wrote:

> Thanks for your reply.  I see that I overestimated the capabilities of
> scanning technology (Jim's example helped here).

I don't personally think that the capabilities of OCR software are the limiting factor. Note that Google's
low OCR quality is notorious among "etext" practitioners. It is possible to do much better. In any case,
the reason why Google is satisfied with such low quality output is that for their purposes they are able to
"make do" with it. Text of such low quality can actually suffice as input for the Bayesian techniques under
consideration. Remember, these algorithms are not attempting to READ the text in a human fashion. They
are just looking at relative word frequencies. If they have to discard a bunch of unlikely-looking words,
or if a lot of misspelled words creep into their input, etc., that is not necessarily going to disrupt their
abilities too much.
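
To illustrate the point about word frequencies, here is a minimal sketch of a multinomial naive Bayes classifier in Python. It is not the software used in any of the experiments mentioned in this thread. The function names (tokenize, train, classify), the two subject labels, and the tiny training texts are all invented for the example. It assumes a crude filter that throws away any token containing non-letters, and it simply ignores words never seen in training, which is one simple way misspellings can end up doing little harm.

```python
import math
import string
from collections import Counter


def tokenize(text):
    """Lower-case, strip surrounding punctuation, keep only alphabetic tokens.

    Tokens with digits in them (typical OCR debris such as "wha1e") are dropped.
    """
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if w.isalpha()]


def train(labelled_docs):
    """Count word frequencies per label from (label, text) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for label, text in labelled_docs:
        word_counts.setdefault(label, Counter()).update(tokenize(text))
        doc_counts[label] += 1
    vocab = set().union(*word_counts.values())
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest log-probability under naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    # Words never seen in training (including most misspellings) are ignored.
    tokens = [w for w in tokenize(text) if w in vocab]
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for w in tokens:
            # Add-one (Laplace) smoothing so unseen words never give log(0).
            score += math.log((counts[w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: two made-up subject labels.
model = train([
    ("whaling", "the whale ship sailed the sea with harpoon and captain"),
    ("railways", "the train engine ran on the railway track to the station"),
])

# Noisy, OCR-like input: several words are garbled and get discarded.
print(classify(model, "tbe wha1e and the captain on the sea, sh1p harpo0n"))
```

In this toy run the garbled tokens are dropped or ignored, and the remaining clean words still favour the "whaling" label. A real corpus would need far more training text and probably a better noise filter.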

So personally, the reason I'm not leaping to perform the suggested experiment is that there's a lot of work
in tracking down all the books of the selected authors, and scanning them page by page. It would take weeks
to do. At the end, I guess a lot of the electronic text would also be unusable for anything else (for
copyright reasons). Although I can see that some "doubting Thomases" might be impressed by the results of
such an experiment, I don't have the resources personally to do it. However, I think there's enough proof
of the capabilities of Bayesian methods in the scientific literature ... so I don't personally have a need
to perform such an experiment just to satisfy myself it can be done. Which is not to say I don't intend to use
these techniques in my own work though!

I work in a digitisation centre in a university library, and we have done some experiments to harvest
subject classification from full text of digitised magazine articles. What has kept us from developing
the idea further already is mostly a lack of time. There have been some technical problems with the
software, in that the MATLAB-based implementation we tried is designed to work with small pieces of text
(abstracts or newspaper articles) rather than novel-length books. I believe these are surmountable
though, and I think that by using our university grid computing facility we will be able to scale up to deal
with large corpora OK. I'm still hopeful that we'll have something "in production" within the next

Tim Spalding | 1 Sep 05:14

Re: Paying for the NGC

I have no opinion on funding aggregation and I'm squishy on
open-source pledges. I think the question is more basic:

If libraries paid their tech people better, they'd get better ones to
start with, and retain the good ones longer.

So, if that's true, what barriers—financial, institutional,
cultural—prevent that from coming to pass?

Now, I'm going to be my own "on the other hand." Actually, as Paul
Graham argues*, the best hackers aren't really motivated by
money—unless it's a life-changing amount. Although he was talking
about private-sector wages—the difference between 80k and 130k, for
example—there's still something there. Good hackers care about their
freedom on the job (and the amount of bs they have to deal with), the
problems they're given and the tools they get to use. In those
respects too, libraries are more severely disadvantaged.

Tim

*http://www.itconversations.com/shows/detail188.html /
http://www.paulgraham.com/gh.html

"Great programmers are sometimes said to be indifferent to money. This
isn't quite true. It is true that all they really care about is doing
interesting work. But if you make enough money, you get to work on
whatever you want, and for that reason hackers are attracted by the
idea of making really large amounts of money. But as long as they
still have to show up for work every day, they care more about what
they do there than how much they get paid for it. Economically, this


Re: Paying for the NGC

On 9/1/07, Tim Spalding <tim <at> librarything.com> wrote:
> If libraries paid their tech people better, they'd get better ones to
> start with, and retain the good ones longer.
...
> Now, I'm going to be my own "on the other hand." Actually, as Paul
> Graham argues*, the best hackers aren't really motivated by
> money—unless it's a life-changing amount. Although he was talking

I think that if libraries made up for the crap pay with really
interesting tasks or a fun and stimulating environment (which, for a
hacker, quite often means less bureaucracy, fewer meetings/committees
and less micromanagement), you could assemble a really great bunch of
hackers who would sacrifice a lot for the good cause. University
environments have certainly taken this approach and succeeded.

Actually, come to think of it, I've never met any hacker driven by
money, and I'm saying that as an ex-consultant, too. I think we can
safely say that no one started working for the library for the pay, though.
:)

Alex
--
 ---------------------------------------------------------------------------
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
------------------------------------------ http://shelter.nu/blog/ --------

Lynn Reynish | 1 Sep 08:50

Re: Paying for the NGC

Tim Spalding said:

"If libraries paid their tech people better, they'd get better ones to start with, and retain the good ones longer."

I'd modify that by substituting "staff in general" for "tech people" - but I agree with the general
sentiment. That said, it is indeed frequently not about the money. I agree with Mr. Graham's notion that
"good hackers care about their freedom on the job (and the amount of bs they have to deal with), the problems
they're given and the tools they get to use." But again, this sentiment is true of many more "traditional"
library staff - from cataloguers to children's staff. In fact, I'm hard pressed to think of any worker who
wouldn't like this type of job freedom.

I've been lucky. I've worked with almost 30 libraries of various types in my fairly short career (the joy of
working for consortia) and I have only dealt with one that was resistant to new ideas for seemingly no good
reason. At the other libraries, new ideas sometimes couldn't go ahead for various reasons: lack of money,
lack of time/staff, lack of expertise, feeling that it simply wasn't appropriate for their users - but
some new ideas were always tried. I know that other systems folks (or library staff) have not been so fortunate.

Alexander Johannesen said:

"Why? What has LIS got that the rest of the world could benefit from?"

Ooookay. I see that James Weinheimer has already responded to the general tone of this snark. My response is
more specific. CS is fairly infamous in the general business world and in the library world for not
listening to customers - whether on the help desk or at the design stage. It's bad enough that there are
streams of articles written on how to "talk" to your technical staff or how to "rein in" your technical
staff. Having the attitude that your "client" has nothing to teach you (be they library, insurance
company, Fortune 500 company or a fellow employee) does nothing to dispel this belief.

I certainly don't expect a programmer to understand the intricacies of MARC or authorities or circulation
or outreach service or anything else (I certainly don't). I do expect them to consult someone who does and


Re: Paying for the NGC

> "Why? What has LIS got that the rest of the world could benefit from?"

On 9/1/07, Lynn Reynish <lreynish <at> reginalibrary.ca> wrote:
> Ooookay. I see that James Weinheimer has already responded to the general
> tone of this snark.

It's a shame you didn't see my response to that, in which I made it
clear that this is a serious question, and not meant as a snark at
all. As I can see that that has colored the rest of your post, I'll
leave most of it alone. But:

> I certainly don't expect a programmer to understand the intricacies of MARC
> or authorities or circulation or outreach service or anything else (I certainly don't).

I do. If the hackers don't understand what the librarians are on
about, why are they there?

> Many libraries (and individual library staff members) feel ignored and derided by
> programmers and technology companies so it's hardly surprising that they
> can't just "trust" you.

Hehe, wow. Are you somehow saying that because I know programming I
have no librarian chutzpah in me? That programming and librarianship
are somehow at odds?

> I'm a programmer and I don't "trust" people or technology - I haven't drunk that Kool-Aid.

Hmm, what do you trust?

Alex

Ross Singer | 1 Sep 14:21

Re: Paying for the NGC

On 9/1/07, Lynn Reynish <lreynish <at> reginalibrary.ca> wrote:

> "Why? What has LIS got that the rest of the world could benefit from?"
>
> Ooookay. I see that James Weinheimer has already responded to the general tone of this snark. My response
> is more specific. CS is fairly infamous in the general business world and in the library world for not
> listening to customers - whether on the help desk or at the design stage. It's bad enough that there are
> streams of articles written on how to "talk" to your technical staff or how to "rein in" your technical
> staff. Having the attitude that your "client" has nothing to teach you (be they library, insurance
> company, Fortune 500 company or a fellow employee) does nothing to dispel this belief.

At this point I finally feel the need to jump in and clear something up.
This started with Jim's initial response to Alex about "the CS people
giving us what they want and getting mad when we criticize" and
continues here.

Alex, Conal, Jonathan and others are talking about researchers (and
implementors) in the fields of Information Retrieval and Artificial
Intelligence.

All of the responses have been about what seem to be campus systems
administrators, arrogant developers and help desk staff, and that's
why "CS can't be trusted to help".

Please understand what sort of strawman this is and kindly refrain
from it, since your campus sysadmin and help desk person probably
would have no idea what these researchers (and implementors) do.  The
programmer might, but I'm doubtful.

I wanted to make an analogy like "My next door neighbor is a Special


Re: Paying for the NGC

On 9/1/07, Ross Singer <ross.singer <at> library.gatech.edu> wrote:
> Please understand what sort of strawman this is and kindly refrain
> from it, since your campus sysadmin and help desk person probably
> would have no idea what these researchers (and implementors) do.  The
> programmer might, but I'm doubtful.

Thanks, Ross, for clarifying. Maybe my point wasn't clear enough (and
I'm trying to be very clear about that). Luckily I've made this
educational video to help us understand where this debate is up to:

   http://www.youtube.com/watch?v=BrO0TttczJc

What did the LIS researchers and thinkers ever do for us? But more to
the point, what are they doing right now? It's a very genuine
question, because if they're doing cool and sexy things, then not only
is there a communication problem, but I want in on the action!

Alex
--
 ---------------------------------------------------------------------------
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
------------------------------------------ http://shelter.nu/blog/ --------

Weinheimer Jim | 2 Sep 14:18

Re: Fwd: Re: [NGC4LIB] Resignation

> I'm fine with the idea of testing computational linguistics' ability
> to distinguish David Johnsons, but I wouldn't wave the banner of
> AACR2/LCRI authority control's achievements too high on this one.
> When I checked just now, LC's undifferentiated personal name
> authority for "Johnson, David" had 12 differentiated identities
> resident under that one heading. Current AACR2/LCRI rules restrict
> the allowable qualifiers fairly narrowly. Even if sound statistics
> could sort out the works of "David Johnson" among these twelve
> identities, the rules would still have them all sharing one
> heading--as a matter of principle, I suppose. :)

The non-unique names are not the fault of the bibliographic standards: the problem is that the cataloger
did not have enough information to distinguish the author from the others. All non-unique names should be
seen to be in a type of temporary mode: a separate heading will be made sooner or later when somebody has more information.

But the other issue you raise is more important: should there be more and other ways to distinguish forms of
names, other than dates, e.g.
Johnson, David, writer on railroads.

There used to be this sort of flexibility at one time, but this type of practice disappeared as the rules
began to expand and networks grew. Perhaps it's time to reconsider these kinds of practices--although I
am sure that people could find lots of problems with it immediately, and I am personally skeptical, but it
could be tried.

As I have mentioned in several postings, before there is a tremendous amount of work done creating new tools
that may be unsustainable in the long run, I wish we would try using the information we have at our disposal
in better ways. WorldCat Identities does a great job of giving people an idea of an author by letting users
see the bibliographic records immediately with the heading. Perhaps the problem of identifying authors
is not so much a problem of not enough information; it may just be a matter of displaying the information we
have more effectively.

Selden Deemer | 2 Sep 15:37

Re: Paying for the NGC

A fact of life is that the entire library marketplace just isn't
all that big. The ALA summer conference, the biggest annual event
in the North American library world, attracts about 20,000 people.
The 20th annual Dragon*Con (fantasy/SF convention) is taking place
in Atlanta this weekend, with an expected draw of 30,000 people.
Most of the front page of today's Sunday paper is devoted to
football coverage; the sports sections occupy 48 pages, while the
entire Arts & Books section is 10 pages, of which 2 and a half are
devoted to books.

We who work in libraries can be passionate about what we do, but
the fact is that what we do just isn't valued that much in our
contemporary culture. Overall, I would say that pay is commensurate
with the value placed by society on what we do.

Selden Deemer, Library Systems Administrator
Emory University Libraries
Atlanta, Georgia
EMAIL:  libssd <at> emory.edu
PHONE:  404-727-0271
FAX:    404-727-0827

On Sep 1, 2007, at 11:00 PM, Automatic digest processor wrote:

> "If libraries paid their tech people better, they'd get better ones
> to start with, and retain the good ones longer."

