Re: [fcrepo-user] [NEWBIE] Accessing external resource with HTTP
Stefano Cossu <scossu@...>
2013-05-16 16:59:35 GMT
I eventually opted for having all the data streams managed, mostly due
to an unpredictable structure of the source URIs.
It is good to know that I can have the HTTP auth option though. I was
also testing Fedora's capabilities...
Thanks,
Stefano
On 5/16/13 11:17 AM, fedora-commons-users-request@... wrote:
>
> Stefano,
>
> I assume that some sort of access control in Fedora (a policy) will be
> applied to the Fedora external datastream or object, or that the
> repository itself will not be publicly exposed? Otherwise, Fedora
> basically becomes an open back window to access content that's protected
> by a locked front door.
>
> I think the primary reason why this issue hasn't come up before is
> because most repository owners have direct control over the objects
> within their repository, and can manage the authnz architecture to suit
> their own needs. If you have some degree of control over who can access
> the images on the backend webserver, then the simplest solution would be
> to configure the backend web server to allow requests from the Fedora
> host to pass through unimpeded, using IP access control.
>
> I took a look at the source code, and the method that makes the actual
> request is
>
> https://github.com/fcrepo/fcrepo/blob/master/fcrepo-server/src/main/java/org/fcrepo/server/storage/DefaultExternalContentManager.java
>
> line 280: private MIMETypedStream getFromWeb(ContentManagerParams params)
>
> Rich is correct: populating ContentManagerParams would get the
> credentials injected for you. I don't recall how those parameters get
> populated, though.
>
> Turning debug logging on will give you *a lot* of information about the
> handling of the request.
>
> -- Scott
>
> On 05/09/2013 04:02 PM, Stefano Cossu wrote:
>> @Scott: I can't afford to have this datastream managed by Fedora,
>> because each one is several megabytes and there are over a million of them.
>>
>> @Rich: the http://user:pass@host/resource syntax works with cURL too, that's
>> why I gave it a shot even though I don't know what Fedora actually uses
>> to connect to remote servers.
>> Your source link is very interesting though. I'll give it a look. I
>> wonder how this hasn't been brought up before. Accessing resources
>> through authentication seems like quite a common task to me, and I hoped
>> I could do it without hacking the Fedora code.
>>
>> Thanks
>> sc
>>
>>
>> Stefano Cossu
>> Director of Application Services, Collections
>>
>> The Art Institute of Chicago
>> 116 S. Michigan Ave.
>> Chicago, IL 60603
>> 312-499-4026
>>
>>
>> On 5/9/13 3:45 PM, fedora-commons-users-request@... wrote:
>>> Date: Thu, 9 May 2013 16:45:16 -0400
>>> From: Benjamin Armintor <armintor@...>
>>> Subject: Re: [fcrepo-user] [NEWBIE] Accessing external resource with
>>> HTTP authentication
>>> To: "Support and info exchange list for Fedora users."
>>> <fedora-commons-users@...>
>>> Message-ID:
>>> <CADQQ8TPFxo8va7uB2nCb6XvbJzCbf1s7UnU7hfTCqhpntWBHwg@...>
>>> Content-Type: text/plain; charset="iso-8859-1"
>>>
>>> Yes, in that sense it would be straightforward, though the question of
>>> where the credentials would be stored was one of the things that derailed the
>>> feature in the first place!
>>>
>>>
>>> On Thu, May 9, 2013 at 4:40 PM, Rich d'Rich <rich.d.rich@...> wrote:
>>>
>>>> AFAIK the username:password@ syntax is a browser artefact that the Java
>>>> HTTP access library Fedora uses (Apache Commons HttpClient) doesn't
>>>> support.
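To illustrate the point about the username:password@ syntax: a client that does not honour credentials embedded in the URL has to split them out itself and send them as an HTTP Basic Authorization header. The following is only a sketch using the JDK's own java.net.URI and java.util.Base64; the class and method names are hypothetical and are not part of Fedora or of Apache Commons HttpClient.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper: not part of Fedora or Apache Commons HttpClient. */
final class UserInfoSplitter {

    private UserInfoSplitter() {
    }

    /** Returns the same URI with any "user:password@" part removed. */
    static URI withoutUserInfo(URI uri) {
        try {
            return new URI(uri.getScheme(), null, uri.getHost(), uri.getPort(),
                    uri.getPath(), uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rebuild URI: " + uri, e);
        }
    }

    /**
     * Builds the value of an HTTP Basic "Authorization" header from the
     * user-info part of the URI, or returns null if the URI has none.
     */
    static String basicAuthHeader(URI uri) {
        String userInfo = uri.getUserInfo(); // e.g. "username:password"
        if (userInfo == null) {
            return null;
        }
        String encoded = Base64.getEncoder()
                .encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

With the URL from the original post, withoutUserInfo would yield https://imageserver/myHugePicture, and the header value would be sent separately. Note that this only moves the credentials out of the request URL; they would still sit in plain text wherever the original URL is stored.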
>>>>
>>>> This also means that you can't do a server-server import where the source
>>>> Fedora server requires authentication, and it causes problems with
>>>> disseminators.
>>>>
>>>> However, looking at the code, most of the "wiring" is there:
>>>>
>>>> https://github.com/fcrepo/fcrepo/blob/master/fcrepo-server/src/main/java/org/fcrepo/server/access/DefaultAccess.java
>>>>
>>>> it just needs getDatastreamDissemination (around line 1145) to extract a
>>>> username and password from somewhere and put it into ContentManagerParams.
>>>> Ideally, there would be a configured table of known external servers and
>>>> credentials that could be kept secure so passwords aren't bandied about.
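The "configured table of known external servers and credentials" could look roughly like the sketch below: a lookup keyed by host and port, so that datastream URLs never need to carry a password. This is an assumption about one possible design, not existing Fedora code; ExternalServerCredentials, Login, register and lookup are all hypothetical names, and how the table is loaded and protected on disk is left open.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of credentials for known external servers. */
final class ExternalServerCredentials {

    /** A username/password pair for one external server. */
    static final class Login {
        final String username;
        final char[] password;

        Login(String username, char[] password) {
            this.username = username;
            this.password = password.clone();
        }
    }

    private final Map<String, Login> byHostAndPort = new ConcurrentHashMap<>();

    /** Registers credentials for a host and port, e.g. ("imageserver", 443). */
    void register(String host, int port, Login login) {
        byHostAndPort.put(key(host, port), login);
    }

    /** Looks up credentials for the server that a datastream URL points at. */
    Optional<Login> lookup(URI location) {
        String host = location.getHost();
        if (host == null) {
            return Optional.empty();
        }
        int port = location.getPort();
        if (port == -1) {
            // No explicit port: fall back to the scheme's default.
            port = "https".equalsIgnoreCase(location.getScheme()) ? 443 : 80;
        }
        return Optional.ofNullable(byHostAndPort.get(key(host, port)));
    }

    private static String key(String host, int port) {
        return host.toLowerCase(Locale.ROOT) + ":" + port;
    }
}
```

Keying on host and port rather than on the full URL means one entry would cover every datastream served by the same backend, which matters when there are over a million of them.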
>>>>
>>>> I may be wrong, though, and there may already be a way to do this.
>>>>
>>>>
>>>> On 10 May 2013 06:36, Scott Prater <prater@...> wrote:
>>>>
>>>>> Stefano --
>>>>>
>>>>> Are you ingesting the datastreams as managed datastreams, or as redirect
>>>>> or external datastreams?
>>>>>
>>>>> If the former, once Fedora ingests the FOXML, the object is referred to
>>>>> by its internal Fedora URI, and no source URLs or passwords are exposed
>>>>> in any object export.
>>>>>
>>>>> If the datastreams are managed, then you may want to take a compromise
>>>>> approach: fetch them to the local machine using curl or some such tool,
>>>>> then ingest the local file. Once it's ingested, you can delete the
>>>>> local file.
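The fetch, ingest, delete sequence described above can be sketched in Java as follows. The sketch covers only the staging part: it assumes the content has already been opened as an InputStream (for example from an authenticated HTTP response) and takes the ingest step as a callback, because the actual Fedora ingest call is outside its scope. StagedIngest and stageAndIngest are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Consumer;

/** Hypothetical helper for staging remote content before a managed ingest. */
final class StagedIngest {

    private StagedIngest() {
    }

    /**
     * Copies the source stream to a temporary local file, hands that file to
     * the ingest callback, and deletes the file afterwards, even on failure.
     */
    static void stageAndIngest(InputStream source, Consumer<Path> ingest) {
        Path staged = null;
        try {
            staged = Files.createTempFile("datastream-", ".tmp");
            Files.copy(source, staged, StandardCopyOption.REPLACE_EXISTING);
            ingest.accept(staged);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (staged != null) {
                try {
                    Files.deleteIfExists(staged);
                } catch (IOException ignored) {
                    // Best effort: a leftover temp file is not fatal.
                }
            }
        }
    }
}
```

Because the file is removed in the finally block, neither the source URL nor the password ends up in the stored object; only the bytes are handed over.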
>>>>>
>>>>> Managed datastreams are usually preferred to external or redirect
>>>>> datastreams; there are use cases for external and redirect datastreams
>>>>> (which is why they exist), but the normal case is to store datastreams
>>>>> as managed.
>>>>>
>>>>> -- Scott
>>>>>
>>>>> On 05/09/2013 01:08 PM, Benjamin Armintor wrote:
>>>>>> Stefano-
>>>>>> I remember some conversation a couple of years ago about supporting
>>>>>> BASIC auth in services, but as far as I know it didn't go anywhere.
>>>>>> Maybe another committer remembers something? In any case, I don't see
>>>>>> why storing the credentials like that wouldn't work, if you can accept
>>>>>> the plain-text issues you cite.
>>>>>>
>>>>>> As far as certs go, I'm afraid you're on your own. I will warn you that
>>>>>> Java errs on the side of verification unless you instruct it not to, so
>>>>>> invalid certs will cause other problems.
>>>>>>
>>>>>> - Ben
>>>>>>
>>>>>>
>>>>>> On Thu, May 9, 2013 at 12:32 PM, Stefano Cossu <scossu@...
>>>>>> <mailto:scossu@...>> wrote:
>>>>>>
>>>>>> Hi there,
>>>>>> I'm starting to tinker with Fedora and trying to write a CMA workflow.
>>>>>> I'm building a digital object that should grab an image datastream from
>>>>>> an HTTPS server which requires basic authentication.
>>>>>> I tried inserting the authentication data in the URL for the datastream,
>>>>>> but now I have 2 problems:
>>>>>> 1) Username and password are stored in plain text in the FOXML, visible
>>>>>> to everyone who looks up that record in Fedora, as well as all over the
>>>>>> logs.
>>>>>> 2) I still can't connect to the server this way. The server's
>>>>>> certificate is expired; I don't know if that plays a role.
>>>>>>
>>>>>> Fedora throws this error:
>>>>>>
>>>>>> ERROR 2013-05-09 11:04:28.618 [http-8080-1] (BaseRestResource)
>>>>>> Unexpected error fulfilling REST API request
>>>>>> org.fcrepo.server.errors.HttpServiceNotFoundException:
>>>>>> [DefaultExternalContentManager] returned an error. The underlying error
>>>>>> was a org.fcrepo.server.errors.GeneralException The message was "Error getting
>>>>>> https://username:password@imageserver/myHugePicture" .
>>>>>> at org.fcrepo.server.storage.DefaultExternalContentManager.getExternalContent(DefaultExternalContentManager.java:152)
>>>>>> ~[fcrepo-server-3.6.2.jar:na]
>>>>>> at org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination(DefaultAccess.java:1148)
>>>>>> ~[fcrepo-server-3.6.2.jar:na]
>>>>>> at org.fcrepo.server.rest.DatastreamResource.getDatastream(DatastreamResource.java:247)
>>>>>> ~[fcrepo-server-3.6.2.jar:na]
>>>>>> [...]
>>>>>>
>>>>>> And the image server's Apache error log:
>>>>>>
>>>>>> [Thu May 09 11:04:25 2013] [info] [client 10.80.25.47] Connection to child 0 established (server imageserver:443)
>>>>>> [Thu May 09 11:04:25 2013] [info] Seeding PRNG with 144 bytes of entropy
>>>>>> [Thu May 09 11:04:25 2013] [info] [client 10.80.25.47] SSL library error 1 in handshake (server imageserver:443)
>>>>>> [Thu May 09 11:04:25 2013] [info] SSL Library Error: 336151608 error:14094438:SSL routines:SSL3_READ_BYTES:tlsv1 alert internal error
>>>>>> [Thu May 09 11:04:25 2013] [info] [client 10.80.25.47] Connection closed to child 0 with abortive shutdown (server imageserver:443)
>>>>>> [...]
>>>>>>
>>>>>> Of course, I can always use a redirect datastream and let the client
>>>>>> deal with authentication and SSL, but I'd like to hide the source URI if
>>>>>> possible.
>>>>>>
>>>>>>
>>>>>> Below is the FOXML representation of my object:
>>>>>>
>>>>>> <foxml:digitalObject VERSION="1.1" PID="test:dervPub_obj"
>>>>>>     xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
>>>>>>   <foxml:objectProperties>
>>>>>>     <foxml:property NAME="info:fedora/fedora-system:def/model#state" VALUE="Active"/>
>>>>>>     <foxml:property NAME="info:fedora/fedora-system:def/model#label" VALUE="Disseminator object"/>
>>>>>>     <foxml:property NAME="info:fedora/fedora-system:def/model#ownerId" VALUE="fedoraAdmin"/>
>>>>>>     <foxml:property NAME="info:fedora/fedora-system:def/model#createdDate" VALUE="2013-05-09T15:37:41.708Z"/>
>>>>>>     <foxml:property NAME="info:fedora/fedora-system:def/view#lastModifiedDate" VALUE="2013-05-09T15:37:41.892Z"/>
>>>>>>   </foxml:objectProperties>
>>>>>>   <foxml:datastream ID="AUDIT" STATE="A" CONTROL_GROUP="X" VERSIONABLE="false">
>>>>>>     <foxml:datastreamVersion ID="AUDIT.0" LABEL="Audit Trail for this object"
>>>>>>         CREATED="2013-05-09T15:37:41.708Z" MIMETYPE="text/xml"
>>>>>>         FORMAT_URI="info:fedora/fedora-system:format/xml.fedora.audit">
>>>>>>       <foxml:xmlContent>
>>>>>>         <audit:auditTrail>
>>>>>>           <audit:record ID="AUDREC1">
>>>>>>             <audit:process type="Fedora API-M"/>
>>>>>>             <audit:action>addDatastream</audit:action>
>>>>>>             <audit:componentID>SOURCE_IMG</audit:componentID>
>>>>>>             <audit:responsibility>fedoraAdmin</audit:responsibility>
>>>>>>             <audit:date>2013-05-09T15:37:41.892Z</audit:date>
>>>>>>             <audit:justification/>
>>>>>>           </audit:record>
>>>>>>         </audit:auditTrail>
>>>>>>       </foxml:xmlContent>
>>>>>>     </foxml:datastreamVersion>
>>>>>>   </foxml:datastream>
>>>>>>   <foxml:datastream ID="DC" STATE="A" CONTROL_GROUP="X" VERSIONABLE="true">
>>>>>>     <foxml:datastreamVersion ID="DC1.0" LABEL="Dublin Core Record for this object"
>>>>>>         CREATED="2013-05-09T15:37:41.708Z" MIMETYPE="text/xml"
>>>>>>         FORMAT_URI="http://www.openarchives.org/OAI/2.0/oai_dc/" SIZE="388">
>>>>>>       <foxml:xmlContent>
>>>>>>         <oai_dc:dc xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
>>>>>>           <dc:title>Disseminator object</dc:title>
>>>>>>           <dc:identifier>test:dervPub_obj</dc:identifier>
>>>>>>         </oai_dc:dc>
>>>>>>       </foxml:xmlContent>
>>>>>>     </foxml:datastreamVersion>
>>>>>>   </foxml:datastream>
>>>>>>   <foxml:datastream ID="RELS-EXT" STATE="A" CONTROL_GROUP="X" VERSIONABLE="false">
>>>>>>     <foxml:datastreamVersion ID="RELS-EXT.0" LABEL="Relationships"
>>>>>>         CREATED="2013-05-09T15:37:41.837Z" MIMETYPE="application/rdf+xml"
>>>>>>         FORMAT_URI="info:fedora/fedora-system:FedoraRELSExt-1.0" SIZE="273">
>>>>>>       <foxml:xmlContent>
>>>>>>         <rdf:RDF>
>>>>>>           <rdf:Description rdf:about="info:fedora/test:dervPub_obj">
>>>>>>             <hasModel rdf:resource="info:fedora/test:dervPub_CModel"/>
>>>>>>           </rdf:Description>
>>>>>>         </rdf:RDF>
>>>>>>       </foxml:xmlContent>
>>>>>>     </foxml:datastreamVersion>
>>>>>>   </foxml:datastream>
>>>>>>   <foxml:datastream ID="SOURCE_IMG" STATE="A" CONTROL_GROUP="E" VERSIONABLE="true">
>>>>>>     <foxml:datastreamVersion ID="SOURCE_IMG.0" LABEL="full sized image"
>>>>>>         CREATED="2013-05-09T15:37:41.892Z" MIMETYPE="image/jpeg">
>>>>>>       <foxml:contentLocation TYPE="URL"
>>>>>>           REF="https://username:password@imageserver/myHugePicture"/>
>>>>>>     </foxml:datastreamVersion>
>>>>>>   </foxml:datastream>
>>>>>> </foxml:digitalObject>
>>>>>>
>>>>>> I would really appreciate your help.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Fedora-commons-users mailing list
>>>>>> Fedora-commons-users@...
>>>>>> <mailto:Fedora-commons-users@...>
>>>>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>>>>> --
>>>>> Scott Prater
>>>>> Shared Development Group
>>>>> General Library System
>>>>> University of Wisconsin - Madison
>>>>> prater@...
>>>>> 5-5415