Angshu Kar | 3 Jan 2006 21:37

loading yeast data failing...

Hi,

Could you please help me resolve the following error?

I run:

./load_seqdatabase.pl --dbname=USBA --dbuser=postgres --format=fasta
--driver=Pg --pipeline="SeqProcessor::Accession" yeast_nrpep.fasta

The error:

Loading yeast_nrpep.fasta ...

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::SeqAdaptor (driver) failed, values were
("gi|4261605|gb|AAD13905.1|S58126_11111111111111","gi|4261605|gb|AAD13905.1|S58126_11111111111111","gi|4261605|gb|AAD13905.1|S58126_11111111111111","Unknown
[Saccharomyces cerevisiae]","0","") FKs (19,<NULL>)
ERROR:  value too long for type character varying(40)
---------------------------------------------------
Could not store gi|4261605|gb|AAD13905.1|S58126_11111111111111:
------------- EXCEPTION  -------------
MSG: error while executing statement in
Bio::DB::BioSQL::SeqAdaptor::find_by_unique_key: ERROR:  current transaction
is aborted, commands ignored until end of transaction block
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key
/home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:951
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key
/home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:855
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:205

Hilmar Lapp | 3 Jan 2006 22:17

Re: loading yeast data failing...

You could do that but first that puts you out of sync with the
official schema, and second if you look at the value it isn't really
an accession number anyway that's causing the problem but rather a
concatenation of identifiers, accession numbers, and namespace
acronyms. Since you're using a custom SeqProcessor anyway already why
don't you just add a line or two of code that parses the display_id
value into the accession and identifier? (for instance, the token
between two '|' characters following the token 'gb')

   -hilmar

On 1/3/06, Angshu Kar <angshu96 <at> gmail.com> wrote:
> Hi,
>
> Could you please help me resolve the following error?
>
> I run:
>
> ./load_seqdatabase.pl --dbname=USBA --dbuser=postgres --format=fasta
> --driver=Pg --pipeline="SeqProcessor::Accession" yeast_nrpep.fasta
>
> The error:
>
> Loading yeast_nrpep.fasta ...
>
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::SeqAdaptor (driver) failed, values were
> ("gi|4261605|gb|AAD13905.1|S58126_11111111111111","gi|4261605|gb|AAD13905.1|S58126_11111111111111","gi|4261605|gb|AAD13905.1|S58126_11111111111111","Unknown
> [Saccharomyces cerevisiae]","0","") FKs (19,<NULL>)
> ERROR:  value too long for type character varying(40)

Angshu Kar | 4 Jan 2006 02:41

Re: loading yeast data failing...

Hi Hilmar,

On what basis should I parse? I found the following 3 entries (arbitrary) in
the bioentry table. Each of these 3 entries went into all of the name,
identifier and accession fields! And the version field contains all 0s!

gi|51013395|gb|AAT92991.1|
gi|732941|emb|CAA54130.1|
gi|6321883|ref|NP_011959.1|

So, here for record 1: gi|51013395 is the identifier, AAT92991 is the
accession number, 1 is the version. Am I right? And then what is the name?

I also found the following entry in the same 3 fields of the same
table:

AT1G08520.1

I don't understand this! I used the TAIR6 dataset. How should I parse this data?
Could you please advise on how to resolve this?

Thanks,
Angshu

On 1/3/06, Hilmar Lapp <hlapp <at> gmx.net> wrote:
> You could do that but first that puts you out of sync with the
> official schema, and second if you look at the value it isn't really
> an accession number anyway that's causing the problem but rather a
> concatenation of identifiers, accession numbers, and namespace
> acronyms. Since you're using a custom SeqProcessor anyway already why

Hilmar Lapp | 4 Jan 2006 02:47

Re: loading yeast data failing...

On 1/3/06, Angshu Kar <angshu96 <at> gmail.com> wrote:
> Hi Hilmar,
>
> On what basis should I parse? I found the following 3 entries (arbitrary) in
> the bioentry table. The same 3 entries all went to each of the name,
> identifier and accession fields! And the version field contains all 0s!
>
>
> gi|51013395|gb|AAT92991.1|
> gi|732941|emb|CAA54130.1|
>  gi|6321883|ref|NP_011959.1|
>
> So, here for record 1: gi|51013395 is the identifier, AAT92991 is the
> accession number, 1 is the version. Am I right? And then what is the name?

I'd use only 51013395 as the identifier. Other than that: correct.
There is no name in the above examples, either because the entry
doesn't have one designated, or because the tool that wrote the FASTA
file didn't put it into the identifier part. FASTA format doesn't
define these things. Have you checked whether the description contains
a name somewhere? If there isn't one, I'd default name to accession
number.
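
The split described above could be sketched roughly as follows. This is only an illustration in plain Perl: parse_ncbi_id is a hypothetical helper, not a BioPerl function, and the list of database tags (gb, emb, dbj, ref) is an assumption based on the examples in this thread. IDs that don't match the gi|...| layout, such as AT1G08520.1, are passed through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split an NCBI-style FASTA ID
# such as "gi|51013395|gb|AAT92991.1|" into identifier, accession, version.
sub parse_ncbi_id {
    my ($display_id) = @_;

    # Defaults: keep the whole string as accession, version 0.
    my %parts = (
        identifier => undef,
        accession  => $display_id,
        version    => 0,
    );

    # Assumed layout: gi|<number>|<db tag>|<accession>[.<version>]|...
    if ($display_id =~ /^gi\|(\d+)\|(?:gb|emb|dbj|ref)\|([^|.]+)(?:\.(\d+))?/) {
        $parts{identifier} = $1;
        $parts{accession}  = $2;
        $parts{version}    = defined $3 ? $3 : 0;
    }

    # No name is present in these IDs, so default it to the accession.
    $parts{name} = $parts{accession};
    return \%parts;
}

for my $id ('gi|51013395|gb|AAT92991.1|', 'gi|6321883|ref|NP_011959.1|') {
    my $p = parse_ncbi_id($id);
    print join("\t", $p->{identifier}, $p->{accession}, $p->{version}), "\n";
}
```

Inside a SeqProcessor, these fields would be assigned to the sequence object in process_seq rather than copying display_id wholesale; which setters to call should be checked against the BioPerl documentation.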

>
> Also I found out just the following entry in the 3 same fields in the same
> table:
>
>  AT1G08520.1
>
> I'm not getting this! I used the TAIR6 dataset. How to parse this data?

Angshu Kar | 4 Jan 2006 02:56

Re: loading yeast data failing...

Thanks Hilmar.
Now I've another query:

Here is the accessor.pm I'm using (one written by Marc):

use strict;
use vars qw(@ISA);
use lib '/home/akar/local/perl/';
use Bio::Seq::BaseSeqProcessor;
use Bio::SeqFeature::Generic;

@ISA = qw(Bio::Seq::BaseSeqProcessor);

sub process_seq
{
  my ($self, $seq) = @_;
  $seq->accession_number($seq->display_id);
  return ($seq);
}

Could you please let me know what is display_id here? Also which variable
contains the "gi|51013395|gb|AAT92991.1|" string?

Thanks,
Angshu

On 1/3/06, Hilmar Lapp <hlapp <at> gmx.net> wrote:
>
> On 1/3/06, Angshu Kar <angshu96 <at> gmail.com> wrote:

Hilmar Lapp | 4 Jan 2006 03:07

Re: loading yeast data failing...

I suggest you read the SeqIO HOWTO and have a look at the FASTA format
definition (try Google - it's your friend).

Hint: you're answering your own question. Did someone forbid you to
play around and use the debugger (or simple print statements for that
matter)?
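
For the FASTA side of the question, a few print statements are enough to see what ends up where. The sketch below is plain Perl with no BioPerl dependency; split_fasta_header is a hypothetical helper, and the header line is reconstructed from the values in the error message at the top of the thread, so treat it as an assumption. By convention the first whitespace-delimited token after '>' is the ID (what BioPerl exposes as display_id) and the rest of the line is the description.

```perl
use strict;
use warnings;

# Hypothetical helper: split a FASTA header line into ID and description.
sub split_fasta_header {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^>//;                          # drop the leading '>'
    my ($id, $desc) = split /\s+/, $line, 2;  # ID is the first token
    return ($id, defined $desc ? $desc : '');
}

# Header reconstructed from the failed insert reported earlier (assumption).
my ($id, $desc) = split_fasta_header(
    '>gi|4261605|gb|AAD13905.1|S58126_11111111111111 Unknown [Saccharomyces cerevisiae]'
);

print "display_id:  $id\n";
print "description: $desc\n";
print "id length:   ", length($id), "\n";
```

The printed length makes it easy to compare the ID against the character varying(40) limit named in the error.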

On 1/3/06, Angshu Kar <angshu96 <at> gmail.com> wrote:
> Thanks Hilmar.
> Now I've another query:
>
> Here is the accessor.pm I'm using (one written by Marc):
>
> use strict;
> use vars qw(@ISA);
> use lib '/home/akar/local/perl/';
> use Bio::Seq::BaseSeqProcessor;
> use Bio::SeqFeature::Generic;
>
> @ISA = qw(Bio::Seq::BaseSeqProcessor);
>
> sub process_seq
> {
>   my ($self, $seq) = @_;
>   $seq->accession_number($seq->display_id);
>   return ($seq);
> }
>
> Could you please let me know what is display_id here? Also which variable
> contains the "gi|51013395|gb|AAT92991.1|" string?

Angshu Kar | 4 Jan 2006 03:15

Re: loading yeast data failing...

I'll try that out Hilmar. And thanks for the clue. :)
Scent a good mentor in you. :)

Thanks again,
Angshu

PS: And no one forbade me, but being a tyro I don't feel confident enough to
fiddle with the real data!

On 1/3/06, Hilmar Lapp <hlapp <at> gmx.net> wrote:
>
> I suggest you read the SeqIO HOWTO and have a look at the FASTA format
> definition (try Google - it's your friend).
>
> Hint: you're answering your own question. Did someone forbid you to
> play around and use the debugger (or simple print statements for that
> matter)?
>
> On 1/3/06, Angshu Kar <angshu96 <at> gmail.com> wrote:
> > Thanks Hilmar.
> > Now I've another query:
> >
> > Here is the accessor.pm I'm using (one written by Marc):
> >
> > use strict;
> > use vars qw(@ISA);
> > use lib '/home/akar/local/perl/';
> > use Bio::Seq::BaseSeqProcessor;
> > use Bio::SeqFeature::Generic;
> >

Dr. Dhundy R. Bastola | 3 Jan 2006 05:40

Please help with bioperl install WIN


Hi all,

I would really appreciate it if someone could help. I followed the
instructions for installing bioperl on my laptop. I know ppm is
installed, and I do get the ppm> prompt. However, when I type 'rep add Bioperl
http://bioperl.org/DIST'

I get the message "Unknown or ambiguous command 'rep'; type 'help' for
commands". Help does not show any 'rep' commands.

Thanks

Kiran

kiranbina <at> gmail.com

Barry Moore | 4 Jan 2006 03:33

RE: loading yeast data failing...

Angshu-

You should read the following documents carefully before asking more
questions like this one. This is yet another example that demonstrates
that you ask questions before you try to solve the problem yourself.  Do
you have a copy of Programming Perl sitting next to you on the desk?  If
not, you should, and it should be tattered and worn before you hit the
list with basic questions like that.  Now try these documents and the
suggestions below, repent of your ways, and good luck.

http://www.catb.org/~esr/faqs/smart-questions.html

http://chicago.pm.org/meetings/20031202/perl-debug.txt

http://debugger.perl.org/580/perldebug.html

Now, to get you headed on your way with this problem specifically, what
you want to know about the perl debugger for this issue is:

You can run it like this:

perl -d your_script.pl

You can burrow into your code to the module in question like this:

c Path::To::Your::accessor::process_seq

Once there you can step through code with n or s.

Finally, you can look at variables (and objects and methods called on
[...]

Barry Moore | 4 Jan 2006 03:37

RE: Please help with bioperl install WIN

Kiran,

That looks correct to me.  The full command is repository, you could try
that.  What version of ppm are you using?  Try ppm> version.  If you're
using an older version then maybe older versions didn't accept the rep
abbreviation.

Barry

> -----Original Message-----
> From: bioperl-l-bounces <at> portal.open-bio.org [mailto:bioperl-l-
> bounces <at> portal.open-bio.org] On Behalf Of Dr. Dhundy R. Bastola
> Sent: Monday, January 02, 2006 9:40 PM
> To: bioperl-l <at> bioperl.org
> Subject: [Bioperl-l] Please help with bioperl install WIN
> 
> 
> 
> Hi all,
> 
> I would really appreciate if some one could help . I followed the
> instruction for installing bioperl in my laptop. I know the ppm is
> installed. I do get the ppm> prompt. However, when I type 'rep add Bioperl
> http://bioperl.org/DIST
>
> I get the message 'Unknown or ambiguous command 'rep'; type 'help' for
> commands. Help does not show any 'rep' commands.
> 
> Thanks

