Tobias Kind | 1 Jun 04:45 2007

WIKI content and design

Hi Noel,

Thanks for the links. Regarding the BO wiki, I think registered, known and
active people should be able to edit content. Web 2.0 time!

So if I want to add some stuff to the news page, it should be possible for
me to do that once I have been cleared by the BO wiki admin.
I would not add spam, and the admin always has the right to remove content
anyway.
Maybe this would bring more life into the BO website.
Some important parts like the agenda, aims and strategies can be protected
from editing by users.
I think most input comes from blogs anyway, but a clear collection
of aims, links and agendas on the BO website is needed.

Regarding the Blueobelisk.org style (is this the main site?):
I would like it to have some static links, like the old Plone page from CUBIC
http://almost.cubic.uni-koeln.de/jrg/
You have Strategy and Agenda, People, Projects, News etc.
These Plone templates for research groups are usually well designed.

Kind regards
Tobias
Noel O'Boyle | 1 Jun 10:30 2007

Re: WIKI content and design

On 01/06/07, Tobias Kind <tkind@...> wrote:
> Hi Noel,
>
> Thanks for the links. Regarding the BO WIKI I think it should be possible
> to edit content for registered and known and active people. Web 2.0 time!
>
> So if I want to add some stuff on the news page
> it should be possible for me to do that after I was cleared by the BO WIKI
> admin.
> I would not add spam and the admin always has the right to remove content
> anyway.
> Maybe this would bring more life into the BO website.

This is all already possible. Just email this list asking for permission.

> Some important parts like agenda, aims and strategies can be protected from
> edit by users,

Not really necessary.

> I think most input comes from blogs anyway, but a clear collection
> of aims, links, agendas on the BO website is needed.

Right. I thought it was quite clear by now, but feel free to add more.

> Regarding the Blueobelisk.org style (is this the main site?)
Yes - this is not just a wiki for the Blue Obelisk. This is a website
for the Blue Obelisk, which we maintain using a wiki.

> I like it to have some static LINKs, like the old plone page from CUBIC

Christoph Steinbeck | 1 Jun 13:08 2007

Re: WIKI content and design

Hi Tobias, Noel, and everybody,

When we move this to SourceForge, which will require some investigation,
we can take care of these things.
Currently, the wiki lives on my site at Cologne.

I remember that there were some limitations on running MediaWiki on SF.
I am not sure whether they still exist.
If someone has time to install MediaWiki in our BO webspace on SF, I would
really appreciate it. Then the existing MySQL tables can easily be moved
from Cologne into the SF MySQL database.

cheers,

Chris

Tobias Kind wrote:
> Hi Noel,
> 
> Thanks for the links. Regarding the BO WIKI I think it should be possible
> to edit content for registered and known and active people. Web 2.0 time!
> 
> So if I want to add some stuff on the news page
> it should be possible for me to do that after I was cleared by the BO WIKI
> admin.
> I would not add spam and the admin always has the right to remove content
> anyway.
> Maybe this would bring more life into the BO website.
> Some important parts like agenda, aims and strategies can be protected from
> edit by users,

peter murray-rust | 1 Jun 17:32 2007

WWMM server down

Just to say that WWMM got hit by a power cut that has caused problems, and
it may not be back up for a little while. I don't think it affects BO
particularly, other than links from those of us here who post blogs
aggregated by PlanetBO.

P.
Peter Murray-Rust
Unilever Centre for Molecular Sciences Informatics
University of Cambridge,
Lensfield Road,  Cambridge CB2 1EW, UK
+44-1223-763069
Tobias Kind | 2 Jun 02:49 2007

The paradigm shift in data sharing in chemistry and the Blue Obelisk movement

I think as open data advocates we should rethink our strategies towards
open data spectral collections, including NMR, MS and IR spectra, crystal
structure data and chemical property data. I am posting this to the Blue
Obelisk mailing list to obtain more input. There were some interesting
discussions in the blogosphere, but I am still stuck in Web 1.0, so I need to
post it to the BO mailing list. Later I will put this comment on the new
BlueObelisk wiki and collect discussions (via votes or comments;
technology?), and we can compile a list of chemistry journals on BlueObelisk
with comments on their data sharing policies for chemistry data. Additionally,
editors and editorial boards will be contacted. Chemistry journal list:

http://www.cas.org/expertise/cascontent/caplus/corejournals.html
http://www.nlm.nih.gov/bsd/journals/subjects.html

Some recent BlogLinks (2007):
http://www.sennoma.net/main/archives/2006/12/where_are_the_data_can_i_have.php
http://researchremix.wordpress.com/2007/05/30/diverse-journal-requirements-for-data-sharing/
http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=350

****************************************************************
The paradigm shift in data sharing in chemistry and the Blue Obelisk
movement

Tobias Kind - fiehnlab.ucdavis.edu

There are some historic reasons that chemists depend on commercial databases

Sanford Dickert | 2 Jun 05:52 2007

Re: The paradigm shift in data sharing in chemistry and the Blue Obelisk movement

Tobias -

Simply put, this is an excellent post and discussion.  It has been one of our foci to allow for open data exchange, much along the lines of Creative Commons, with regard to the spectral data that we are building up, and we are engaged in efforts to work on the legal issues surrounding the ownership of the spectral data.

If you are interested in learning more about our efforts, please send an email to sanford-l6WDR9V+PS2RcMt6Q3OCCwC/G2K4zDHf@public.gmane.org.

Sanford

On 6/1/07, Tobias Kind <tkind-ZnEz5tD0I2KVc3sceRu5cw@public.gmane.org> wrote:
I think as open data advocates we should rethink our strategies towards
open data spectral collections, including NMR, MS and IR spectra, crystal
structure data and chemical property data. I am posting this to the Blue
Obelisk mailing list to obtain more input. There were some interesting
discussions in the blogosphere, but I am still stuck in Web 1.0, so I need to
post it to the BO mailing list. Later I will put this comment on the new
BlueObelisk wiki and collect discussions (via votes or comments;
technology?), and we can compile a list of chemistry journals on BlueObelisk
with comments on their data sharing policies for chemistry data. Additionally,
editors and editorial boards will be contacted. Chemistry journal list:

http://www.cas.org/expertise/cascontent/caplus/corejournals.html
http://www.nlm.nih.gov/bsd/journals/subjects.html

Some recent BlogLinks (2007):
http://www.sennoma.net/main/archives/2006/12/where_are_the_data_can_i_have.php
http://researchremix.wordpress.com/2007/05/30/diverse-journal-requirements-for-data-sharing/
http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=350

****************************************************************
The paradigm shift in data sharing in chemistry and the Blue Obelisk
movement

Tobias Kind - fiehnlab.ucdavis.edu

There are historic reasons why chemists depend on commercial databases
and commercial spectral collections. These sources are usually very reliable,
curated resources which are either subscription-based or based on a one-time
fee. However, chemists still publish in the same manner as 100 years ago.
Obstacles which end up hindering science (think of it as the Bible being
published only in Latin) are:

O1) Spectral databases or experimental molecular property data are license
protected in such a way that derivative work on large data sets cannot easily
be done. This is the case if only one spectrum at a time can be investigated
from a collection of one million spectra (instead of bulk access), or if
bulk access is forbidden in the EULA or license.

O2) Subscription prices for data collections are too high (I am not talking
about hundreds of dollars, but tens of thousands of dollars for a modern
spectral or cheminformatics lab).

O3) A combination of both of the above reasons, which ultimately leads to a
deadlock in new research synergies, even for well-equipped academic labs and
especially for smaller companies.

O4) Many scattered and incomplete data collections exist instead of a
complete collection of spectral or molecular property data. Hundreds of labs
have their own little private collections.

O5) If meta-data such as toxicity data, spectral properties etc. are not
published in an appropriate way or just get lost, then this is a waste of
resources, money and time, and in the case of toxicity tests involving
animals it is even a severe ethical issue. This data loss additionally
means paying twice, because chemists have to repeat the experiments again
and again (let's say for NMR spectra), and after 2 or 3 people have
confirmed a spectrum, such copyrighted spectra are collected and sold
back to the chemists. Thousands of man-years of research go down the
drain.

In the case of meta-data such as molecular spectra or molecular properties,
the target for change should not be the commercial publishers but the
scientists themselves. One assumption is that publishing in an OA journal
does not mean that spectral metadata in CML format is automatically included.
Another assumption is that there will still be a mix of open access and
commercial publishers in the near future.

A1) The power to change publishing behavior lies in the hands of Editorial
Boards and Editors. These are usually honorary or experienced scientists in
their field. If they can be convinced that supporting spectral data and
chemical property data as CML or XML is valuable to "their" journal or to
science in general, they have the power to change that.

A2) Additional power lies in the hands of reviewers, who can gradually
request that as much spectral data and chemical property data as possible is
submitted electronically to an open access repository with every
publication. They could also reject a submission if no such minimal data is
delivered.

A3) Some power lies in the hands of chemists themselves, who can simply submit
spectral or property data as a CML or XML supplement with every publication
(this requires a change of mindset and currently means some more work).

To solve these problems there are some requirements.

R1) Software tools for easy extraction and submission of spectral
data or experimental molecular property data should exist. For spectral data
this can be either free software (such as the existing Bioclipse) or any new
commercial software.

R2)  The problem of linking molecular structures to molecular data to
publications is not yet solved. This is a very chemistry specific problem.
This includes the InChI codes and PubChem IDs as unique identifiers for
molecule structures (with many unresolved problems) and their connection to
the properties and spectra and the linking to the publication via the DOI
number. See also http://sourceforge.net/projects/spectra-chem

R3) Open access and commercial publishers should be directly involved in
such a process, because the metadata should be linked to the publication
itself via the DOI number and the meta-data should have a DOI pointer or any
other link to the publication.

R4) The data structures (how data is sent to a database, definitions of CML
or XML files) must follow minimum standards. The submission process should
start immediately, because the definition of data standards can take decades.
Common sense would be a good starting point. Existing exchange formats can
be used directly (JCAMP, netCDF, CML). For example, in the case of mass
spectra, the minimum would be the name, InChI, PubChem ID, DOI, formula, MW,
m/z values and intensities.
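
As a rough illustration of the minimal mass spectrum record listed in R4,
here is a sketch in Python. The class name, the field names and the simple
sanity checks are assumptions made for this example, not an agreed Blue
Obelisk format. The DOI and the peak list are placeholders, not measured data.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class MassSpectrumRecord:
    """Hypothetical minimal record: one molecule, one publication, one spectrum."""
    name: str
    inchi: str
    pubchem_cid: int
    doi: str
    formula: str
    molecular_weight: float
    peaks: list = field(default_factory=list)  # list of (m/z, intensity) pairs

    def problems(self):
        """Return a list of human-readable problems; empty if the record looks usable."""
        issues = []
        if not self.inchi.startswith("InChI="):
            issues.append("InChI must start with 'InChI='")
        if not self.doi.startswith("10."):
            issues.append("DOI must start with '10.'")
        if not self.peaks:
            issues.append("at least one (m/z, intensity) peak is required")
        for mz, intensity in self.peaks:
            if mz <= 0 or intensity < 0:
                issues.append(f"invalid peak: {(mz, intensity)}")
        return issues


record = MassSpectrumRecord(
    name="water",
    inchi="InChI=1S/H2O/h1H2",
    pubchem_cid=962,
    doi="10.1234/placeholder",            # placeholder, not a real publication
    formula="H2O",
    molecular_weight=18.015,
    peaks=[(18.0, 100.0), (17.0, 20.0)],  # illustrative values only
)

print(record.problems())
print(json.dumps(asdict(record), indent=2))
```

JSON is used here only because it needs no extra libraries; the same fields
could equally be written as CML or JCAMP.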

Solutions must include academia and commercial publishers and commercial
database or software providers. The move towards open chemistry data
collections will come without question, but this should be seen as an
opportunity for new services. Many commercial cheminformatics
companies operate at the forefront of technology. So instead of copying data
out of paper or PDF journals (a job which can be done by computers), they
could free their workforce from this boring task and let them work on truly
new innovations.

S1) The collection and submission of data with every publication must start
now. Think of it like the eternal beta state in Web 2.0. This must be
triggered by a paradigm shift (revolution) or a petition (slow) or an
organization like Blue Obelisk (small but growing).

S2) Targets must be Editorial Boards and Editors and later reviewers of the
most innovative OA and commercial chemistry, biochemistry and
chemoinformatics journals. They must be convinced that open data supplements
are good for science. There should be a requirement to supply such data with
every publication.

S3) The meta-data must be submitted as an openly accessible (OA) supplement
to the journal or to an open-data collector such as NMRShiftDB or SPECTRa or
RedHen Spectra or CrystalEye. The publication itself can still be
copyrighted if needed. The problem is that currently only NMRShiftDB is
in a complete working state. A good solution would be one global open data
collector for chemistry (for example hosted on SourceForge) instead of many
specialized solutions.

S4) Dogmatism about data formats should be avoided; for molecular property
data even Excel XML or OpenOffice XML data or SQL dumps should be allowed.
For spectral data only exchange formats like JCAMP, netCDF, CML or XML
should be used. Supporting information as PDF or JPG for data collections
should be forbidden, because converting it back to machine-readable data
causes multiple problems.
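
The format policy in S4 could be checked automatically at submission time.
The following Python sketch is only an illustration: the function name and
the mapping from file extensions to formats are assumptions (the extensions
are common conventions, not part of the proposal), and a real check would
look inside the file rather than trusting its name.

```python
from pathlib import Path

# Assumed extension conventions for the exchange formats named in S4.
ACCEPTED_SPECTRAL = {
    ".jdx": "JCAMP-DX",
    ".dx": "JCAMP-DX",
    ".cdf": "netCDF",
    ".nc": "netCDF",
    ".cml": "CML",
    ".xml": "XML",
}

# Formats S4 wants to forbid for data collections.
REJECTED = {".pdf", ".jpg", ".jpeg"}


def check_spectral_supplement(filename):
    """Return (accepted, reason) for a spectral data supplement file name."""
    suffix = Path(filename).suffix.lower()
    if suffix in ACCEPTED_SPECTRAL:
        return True, ACCEPTED_SPECTRAL[suffix]
    if suffix in REJECTED:
        return False, "page or image format, not machine readable"
    return False, "unrecognised format"


for name in ("spectrum.jdx", "run42.nc", "figure3.pdf"):
    print(name, check_spectral_supplement(name))
```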

S5) The spectral data, molecular property data and molecular structures must
be published under an open data license which allows commercial and
non-commercial reuse and redistribution (like Creative Commons Attribution,
CC-BY). Commercial reuse is important because data curation still costs
money. New innovative chemistry software or databases would rely on such
large open data collections. Open science can take these data collections
and provide basic services, and hence push both science itself and commercial
operations forward in innovation.



_______________________________________________
Blue-obelisk mailing list
Blue-obelisk-1MXgZGmSEouYropViTRrB8edHCjMHISts0AfqQuZ5sE@public.gmane.org
http://hardly.cubic.uni-koeln.de/mailman/listinfo/blue-obelisk

peter murray-rust | 2 Jun 12:45 2007

Re: The paradigm shift in data sharing in chemistry and the Blue Obelisk movement

In reply to both of these posts.

Tobias - this is a useful summary - I think all these points have
been made within the BO list, the chemical blogosphere and more generally
within the Open Access and Open Data community. It's worth realising
that whatever we do here will require a lot of work and we have to
make our case carefully. (I have been making all of these points in
different forums for about 5 years and while they are accepted in
some communities - eScience, digital libraries, etc. - they have very
little traction in chemistry. Indeed Henry Rzepa and I wrote a
"manifesto for Open chemistry" in 2004.) Note that NMRShiftDB is not
highly valued by many in the chemistry community. It will be
interesting to see what they think of CrystalEye when we launch it.
One important thing is that we must be *better* than the current
position - free/open is not always compelling in this community. We
have to produce more than arguments - we have to create new things
that people want. I wish it weren't so.

I think we should restrict our work to mainstream chemistry. Work is
already going on elsewhere. Moreover, chemistry is seen in the Open
Access and Open Data communities as an area of darkness, so it's a
simple and attractive target to point to. I also think we should
restrict ourselves to data - Open Access is already well worked over.

I think we could do the following - but again be warned it's a lot of work:
* collect metadata systematically from publishers. Heather (Remix) 
has done this very nicely for some in the biomedical community. For 
that we need an agreed list of journals (note that policies vary 
between journals of the same publisher). For each we should list:
- publisher details and contacts
- current openly stated license policies
- apparent current practice (formats, etc.). You will see I asked the 
blogosphere for some of this yesterday in synthetic org chem
- anecdotes as to whether the license is, in fact, waivable if authors ask.
* prepare a document summarising our views on desirable practice. 
This will include some of the points below, some of our previous 
manifestos, etc. It will need to be very carefully worded as we 
cannot go forward with a substandard document. It *must* refer to 
general protocols on publications (e.g. the ALPSP/STM document) which 
IMO should induce the publishers to open data but doesn't in 
chemistry. We should also collect policies from funding bodies.
* then (or earlier) enlist the help of sympathetic bodies. These
might include SPARC (e.g. on the Open Data mailing list) and Peter
Suber. They will be able to relate our suggestions to other
manifestos, protocols, etc.
* (optional but desirable). Show from our own discipline the value of 
Openness. This is not easy as we don't have many examples. We might 
also collect examples of negative (anticommons) practice.
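
The per-journal information in the first bullet above could be kept in a
simple shared table. The Python sketch below shows one possible shape for
such a record and writes it out as CSV. All names are hypothetical and the
sample entry is a placeholder, not a statement about any real journal.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class JournalDataPolicy:
    """One row per journal, since policies vary within a publisher."""
    journal: str
    publisher: str
    contact: str
    stated_license_policy: str   # current openly stated license policy
    current_practice: str        # apparent practice: formats, supplements, etc.
    waiver_anecdotes: str = ""   # whether the license was waived when authors asked


def to_csv(entries):
    """Serialise the entries to CSV text with one header row."""
    buffer = io.StringIO()
    names = [f.name for f in fields(JournalDataPolicy)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


entries = [
    JournalDataPolicy(
        journal="Journal A",                      # placeholder
        publisher="Publisher X",                  # placeholder
        contact="editor@example.org",
        stated_license_policy="not stated",
        current_practice="spectra as PDF supplement",
    )
]

print(to_csv(entries))
```

A flat CSV file is easy to paste into a wiki table or a spreadsheet, which
matters more here than a rich schema.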

Then - and only then - do we have a credible public face to take to
publishers. Remember that many of these have a large data business
(abstracts, databases) and have every reason to oppose us. The
publishers have lobbied the EU to refuse the request to make funded
research open. So it is naive to think that a good argument will carry the day.

What I have written is a lot of work. Until recently I would have 
said that it required formal research funding to carry out. But I now 
believe in the power of the blogosphere and I think we have an ideal 
position. But it has to be done well. And that is work.

We might start with subfields. SPECTRa created a questionnaire for 
crystallography, comp chem and spectra. It took a person-year to take 
it round colleagues. A person-year is now quite feasible in the 
blogosphere - much can be done at coffee meetings, etc.

It is critical that we have a systematic and professional approach to 
our interaction with journals and editors. Here is an example of part 
of a possible standard letter

Dear (X), Editor of Y

We represent a large group of practising chemical scientists who are
concerned that lack of access to primary data is holding back
chemistry and the sciences that rely on it. We note that bodies such as
CODATA, [funding bodies], provosts, etc. have expressed similar
concerns in science and have argued for...  We have summarised a
number of areas where access to data enhances science and lack of
access is harmful. We have prepared a summary of the practices we
feel would be valuable for journals publishing chemistry and ask for
your help in clarifying your current practice and your comments on
our protocol.

All our discussion is hosted Openly and we will publish your reply 
verbatim and with attribution. Comments on our site will be factual 
rather than judgmental but we may make comparisons with other 
publishers. We shall record lack of a reply after [... days] as 
"failed to reply".

=========

So if you wish to take up the challenge - and no-one will think less 
if you do not - you can see the way that I feel we should take it. 
Others may take this up as well and you are unlikely to be without 
help. But building a systematic framework is essential and probably tedious.

P.

Peter Murray-Rust
Unilever Centre for Molecular Sciences Informatics
University of Cambridge,
Lensfield Road,  Cambridge CB2 1EW, UK
+44-1223-763069
Egon Willighagen | 5 Jun 22:55 2007

Re: Re: Chemical test file repository

On Tuesday 17 April 2007, Egon Willighagen wrote:
> Based on Daniel's good proposal, I would like to propose a slightly
> different hierarchy:
>
> <dir name="x-foo"
>      notest="gchempaint" test="kfile-chemical"
>      content-test="mixed">
>   <subdir name="valid"/>
>   <subdir name="invalid"/>
>
>   <!-- hence, less duplication of facts:
>        @contains already defined by file@mime;
>        (sub)dir@contains defined in index.xml in that dir; etc. -->
>
>   <!-- I would personally leave it up to programs to decide which dirs
>        to pick, and suggest to drop dir@test and dir@notest -->
>
>   <chemfiles>
>     <file name="foo.bar" src="" mime="" valid="yes"
>           producedBy="X" license=""/>
>
>     <!-- for license we can use a dictionary -->
>   </chemfiles>
> </dir>

OK, I have started setting this up in BO's new SVN on SF. I could use some
help with adding the LGPL-ed test files from the CDK project.

Additionally, some XSLT sheets need to be set up to create HTML pages, or,
alternatively, we could use some PHP magic to dynamically convert the XML to
HTML or so...
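
To make the XML-to-HTML step concrete, here is a minimal sketch in Python
rather than XSLT or PHP, so it is a stand-in for the idea and not the
implementation planned for the repository. It assumes a well-formed index
with the element and attribute names from the proposal quoted above; the
mime and license values in the sample are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Sample index.xml following the proposed hierarchy (attribute values are hypothetical).
INDEX_XML = """\
<dir name="x-foo" content-test="mixed">
  <subdir name="valid"/>
  <subdir name="invalid"/>
  <chemfiles>
    <file name="foo.bar" src="" mime="chemical/x-foo" valid="yes"
          producedBy="X" license="LGPL"/>
  </chemfiles>
</dir>
"""


def index_to_html(xml_text):
    """Render one directory index as an HTML heading, a subdir list and a file table."""
    root = ET.fromstring(xml_text)
    lines = [f"<h1>{escape(root.get('name', ''))}</h1>", "<ul>"]
    for sub in root.findall("subdir"):
        name = escape(sub.get("name", ""))
        lines.append(f'  <li><a href="{name}/">{name}/</a></li>')
    lines.append("</ul>")
    lines.append("<table>")
    lines.append("  <tr><th>File</th><th>MIME type</th><th>Valid</th>"
                 "<th>Produced by</th><th>License</th></tr>")
    for entry in root.findall("chemfiles/file"):
        keys = ("name", "mime", "valid", "producedBy", "license")
        row = "".join(f"<td>{escape(entry.get(k, ''))}</td>" for k in keys)
        lines.append(f"  <tr>{row}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(index_to_html(INDEX_XML))
```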

Egon

-- 
e.willighagen@...
Blog: http://chem-bla-ics.blogspot.com/
GPG: 1024D/D6336BA6
Noel O'Boyle | 8 Jun 14:40 2007

Wiki has moved to sourceforge

Dear all

The Blue Obelisk wiki/website has moved to http://blueobelisk.sf.net.
There were some difficulties so all user accounts will have to be
recreated. You should use the same user name for your account as
originally used, if possible.

Please let me know if there are any problems. For the record, the
Admin password is the same as that for the mysql database (SF admins
have access to this).

An attempted install of Bad Behavior (spam filtering) didn't work.
Maybe Geoff or someone else can look at this at a later date.

Regards,
    Noel
Christoph Steinbeck | 8 Jun 16:32 2007

Re: Wiki has moved to sourceforge

Thanks, Noel, for doing the work :-)

BOers: It should be mentioned that if you use the same username, the
system will connect this username with your old edits!
It worked nicely for me. So, if you want to keep connected with your
past, there is a good reason for choosing the same username :-)

I have switched the BO wiki at CUBIC to read-only but will keep it
online as long as possible for reference.

I've redirected http://www.blueobelisk.org to
http://blueobelisk.sourceforge.net, a switch which may take a while to
be recognized by the name servers.

Cheers,

Chris

Noel O'Boyle wrote:
> Dear all
> 
> The Blue Obelisk wiki/website has moved to http://blueobelisk.sf.net.
> There were some difficulties so all user accounts will have to be
> recreated. You should use the same user name for your account as
> originally used, if possible.
> 
> Please let me know if there are any problems. For the record, the
> Admin password is the same as that for the mysql database (SF admins
> have access to this).
> 
> An attempted install of bad behavior (spam filtering) didn't work.
> Maybe Geoff or someone can look at this at a later date.
> 
> Regards,
>    Noel

-- 
PD Dr. Christoph Steinbeck (c.steinbeck@...)
Gastdozent für Chemieinformatik
Univ. Tuebingen, WSI-RA, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071-29-78978   Fax (+49/0) 7071-29-5091

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..
