I Heart Robotics | 29 Apr 2013 16:30

Checksums

Is there a better way of passing checksums of streams?
The use case is sending YAML data over a serial connection that occasionally gets errors or dropped characters.

%YAML 1.2
%CRC B8C5938E
---
a: foo
b: bar
c: baz
...
%YAML 1.2
%CRC 498BBC05
---
a: alpha
b: beta
c: gamma
...
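
For illustration, the framing above could be produced and checked roughly as follows. This is a minimal Python sketch under assumptions the post does not state: the checksum is taken to be CRC-32 (as computed by `zlib.crc32`) over the UTF-8 bytes of the lines between `---` and `...`, and the helper names `frame` and `unframe` are hypothetical. `%CRC` is not a directive defined by the YAML specification.

```python
import zlib


def frame(body: str) -> str:
    """Wrap one YAML document body in the %YAML/%CRC framing (assumed layout)."""
    crc = zlib.crc32(body.encode("utf-8")) & 0xFFFFFFFF
    return f"%YAML 1.2\n%CRC {crc:08X}\n---\n{body}...\n"


def unframe(text: str):
    """Yield (body, ok) for each framed document found in a received stream."""
    crc, body, in_body = None, [], False
    for line in text.splitlines(keepends=True):
        stripped = line.rstrip("\r\n")
        if not in_body:
            if stripped.startswith("%CRC "):
                crc = stripped[5:].strip()
            elif stripped == "---":
                in_body = True
        elif stripped == "...":
            payload = "".join(body)
            actual = f"{zlib.crc32(payload.encode('utf-8')) & 0xFFFFFFFF:08X}"
            # A missing %CRC line counts as a failed check.
            yield payload, crc is not None and crc.upper() == actual
            crc, body, in_body = None, [], False
        else:
            body.append(line)
```

A receiver would discard (or request again) any document whose flag is false. The sketch does not recover from damage to the `---` or `...` marker lines themselves, which would need a resynchronisation rule of its own.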

Misha Penkov | 18 Apr 2013 13:17

PyYAML hangs when reading a large YAML file

Hi,

I'm trying to load data from a large YAML file (approx. 300MB).  PyYAML hangs when I try to load using this code:

    import yaml
    y = yaml.load(open("/tmp/tmp6aJfKz"))

The YAML file itself is valid (it is being output by OpenCV).

Can anyone suggest a way to diagnose this problem?  I can provide the file, but as I've already mentioned, it's fairly large.

Cheers,
Michael
_______________________________________________
Yaml-core mailing list
Yaml-core@...
https://lists.sourceforge.net/lists/listinfo/yaml-core
KOSEKI Kengo | 13 Apr 2013 13:10

Top level block scalar without indentation

Hi,

I noticed that LibYAML doesn't parse a block scalar without indentation.

  Sample: 1
  ---
  XXX
  YYY
  ZZZ

  Sample: 2
  --- |
  XXX
  YYY
  ZZZ

I expected that these samples would be parsed like this:

  [{"Sample": 1}, "XXX YYY ZZZ"]
  [{"Sample": 2}, "XXX\nYYY\nZZZ\n"]

LibYAML can parse Sample 1 but not Sample 2.

  $ ./run-parser Sample1.yaml
  [1] Parsing 'Sample1.yaml': SUCCESS (11 events)

  $ ./run-parser Sample2.yaml
  [1] Parsing 'Sample2.yaml': FAILURE (10 events)

PyYAML returned an error:

  expected '<document start>', but found '<scalar>'

I checked some implementations. The results are:

  Syck (YAML 1.0) - Ruby
    Sample1 ... PASS
    Sample2 ... PASS

  YAML.pm (YAML 1.0) - Perl
    Sample1 ... FAIL
    Sample2 ... PASS

  PyYAML (YAML 1.1) - Python / LibYAML
    Sample1 ... PASS
    Sample2 ... FAIL

  Psych     (YAML 1.1) - Ruby / LibYAML
    Sample1 ... PASS
    Sample2 ... FAIL

  SnakeYAML (YAML 1.1) - Java / based on LibYAML
    Sample1 ... PASS
    Sample2 ... FAIL

  JS-YAML (YAML 1.2) - JavaScript
    Sample1 ... PASS
    Sample2 ... PASS

  Yamerl (YAML 1.2) - Erlang
    Sample1 ... PASS
    Sample2 ... PASS

My questions are:

  (1) Which parsers implement this correctly at this point?
  (2) Is there a possibility that LibYAML will support Sample 2,
      on the YAML 1.1 trunk or on a future 1.2 branch?

In the YAML 1.0 spec, Example 4.21 says:

  --- |
  Usually top level nodes are not indented.

In the YAML 1.2 spec, Example 9.5 says:

  %YAML 1.2
  --- |
  %!PS-Adobe-2.0
  ...

  %YAML 1.2
  ---
  !!str "%!PS-Adobe-2.0\n"
  ...

I couldn't find an example like these in the YAML 1.1 spec.
It may depend on whether production [161] detect(m) is allowed to return 0.

I found this message.

 * http://osdir.com/ml/text.yaml.general/2008-07/msg00001.html

> The spec requires that continuation lines of a flow scalar are indented
> with at least one space. LibYAML relaxes this rule by allowing
> multi-line flow scalars not to be indented,

So I realized that the Sample 1 result comes from LibYAML relaxing
the spec. But what about the block scalar in Sample 2?

I think this notation is important because it makes it possible to
attach metadata to plain text.

An email could be represented like this:

  Date: 2013-04-13 20:10:53+09:00
  Subject: Top level block scalar without indentation
  From: koseki@...
  To: yaml-core@...
  --- |
  Hi,
  I noticed that ...

A template engine could use it to generate HTML:

  title: YAML block scalar
  template: blog
  published: 2013-04-13
  --- |
  <article>
    <h1>xxxxxx</h1>
    <pre>
      Here must
      be written in
      multi lines.
    </pre>
  </article>
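
Until a parser accepts the unindented form, the same layout can be handled outside the YAML parser. The following is a minimal Python sketch, not something any of the libraries above provide: the function name `split_front_matter` is hypothetical, and it assumes the marker line is exactly `--- |` and that everything after it is the body.

```python
def split_front_matter(text: str):
    """Split text at the first line that is exactly `--- |`.

    Returns (header, body): header is the YAML text before the marker,
    body is everything after it, left untouched.
    Raises ValueError if no marker line is present.
    """
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.rstrip("\r\n") == "--- |":
            return "".join(lines[:i]), "".join(lines[i + 1:])
    raise ValueError("no '--- |' marker found")
```

The header can then be passed to any YAML parser as an ordinary mapping. Unlike a real block scalar, this split does not end the body at a later `---` or `...` line.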

Best regards,

-- 
koseki

Patrick Pelletier | 28 Mar 2013 01:31

feature request for libyaml: support "\/" escape sequence for YAML 1.2/JSON compatibility

I attempted to submit a feature request ticket for libyaml:

http://pyyaml.org/newticket?component=libyaml

but it rejected my ticket, telling me it was spam.  (I could go off on a 
whole rant about how so-called "spam filtering" makes it impossible for 
legitimate users to get anything done these days, but that would be 
off-topic.)

Anyway, since I can't submit a ticket, sending the feature request to 
this list seemed like the next-best thing.

Here is the text of the ticket I attempted to submit to libyaml:

YAML 1.2 adds support for the escape sequence "\/", which was not 
present in YAML 1.1:

http://www.yaml.org/spec/1.2/spec.html#id2776092

YAML 1.2 added this escape sequence in order to be compatible with JSON,
since YAML's goal is to be a superset of JSON.

Although libyaml is only a YAML 1.1 parser, it would be nice to have 
this feature, and adding it shouldn't cause any trouble with parsing 
YAML 1.1.

This is all that's needed:

{{{
--- a/scanner.c~
+++ b/scanner.c
@@ -3164,6 +3164,10 @@ yaml_parser_scan_flow_scalar(yaml_parser_t *parser, yaml_
                         *(string.pointer++) = '\\';
                         break;

+                    case '/':
+                        *(string.pointer++) = '/';
+                        break;
+
                     case 'N':   /* NEL (#x85) */
                         *(string.pointer++) = '\xC2';
                         *(string.pointer++) = '\x85';
}}}
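
The effect of the patch can be shown outside of libyaml with a small escape table. This is a minimal Python sketch of double-quoted escape handling, not libyaml's code: the names `SIMPLE_ESCAPES` and `unescape` are hypothetical, and the sketch leaves out the `\x`, `\u` and `\U` numeric escapes and line folding.

```python
# Single-character escapes of YAML double-quoted scalars (subset).
SIMPLE_ESCAPES = {
    "0": "\0", "a": "\a", "b": "\b", "t": "\t", "n": "\n", "v": "\v",
    "f": "\f", "r": "\r", "e": "\x1b", " ": " ", '"': '"', "\\": "\\",
    "/": "/",        # the escape requested above: in YAML 1.2 and JSON, not in YAML 1.1
    "N": "\x85",     # NEL
    "_": "\xa0",     # non-breaking space
    "L": "\u2028",   # line separator
    "P": "\u2029",   # paragraph separator
}


def unescape(s: str) -> str:
    """Expand the single-character escapes in the content of a double-quoted scalar."""
    out, i = [], 0
    while i < len(s):
        ch = s[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(s):
            raise ValueError("dangling backslash at end of scalar")
        code = s[i + 1]
        if code not in SIMPLE_ESCAPES:
            raise ValueError(f"unknown escape sequence \\{code}")
        out.append(SIMPLE_ESCAPES[code])
        i += 2
    return "".join(out)
```

With the `"/"` entry removed, JSON text such as `"http:\/\/example.com"` is rejected as an unknown escape, which mirrors the situation the ticket describes.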

--Patrick

Trans | 21 Feb 2013 19:22

Re: Load Delegation

Shoot. "good practice" != "enforce", so I will have to deal with the implications of the possibility in the parser. :-(

Hey, btw, could you change the default respond-to address of the mailing list to "yaml-core-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org"?  I try to remember to add it every time I use respond, but I *always* forget eventually. It makes for a very broken conversation for others trying to read on the list!!!



On Thu, Feb 21, 2013 at 1:08 PM, Oren Ben-Kiki <oren-vmbulFz3td5g9hUCZPvPmw@public.gmane.org> wrote:
> I think it is "good practice" at the very least. Unless there's a really good reason not to (e.g., resolving mapping of types that don't quite match between different systems, such as between Perl numbers to Java integers, or whatever).
>
> On Thu, Feb 21, 2013 at 7:52 PM, Trans <transfire-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> > Interesting. So do you think it would make sense to enforce a one to one relationship between class and tag, per schema?




--
Sorry, says the barman, we don't serve neutrinos. A neutrino walks into a bar.

Trans <transfire-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
7r4n5.com      http://7r4n5.com


Trans | 21 Feb 2013 17:47

Re: Load Delegation




On Thu, Feb 21, 2013 at 11:04 AM, Oren Ben-Kiki <oren-vmbulFz3td5g9hUCZPvPmw@public.gmane.org> wrote:
> I'm nervous about the "same object comes in via two different tags" for several reasons.
>
> First, an object has only one (full) tag. Of course one can have several equivalent shorthands that expand to the same tag.

Yes, that *should* be the case. But people fudge. Ruby's Psych implementation recognizes approx. 13 different tags when one defines a single "full" tag. Perfect round-tripping is of course not necessary, but I thought it would be nice if it were at least smart enough to have `document == YAML.dump(YAML.load(document))`, i.e. if you immediately emit what you just loaded it will come out exactly the same, local tags and all. Or is that a bad idea in itself? Should all local tags always become the "one full tag" when emitted? Note, the same goes for decoration. If a !!str, for instance, was loaded using the literal `|` form, it seems like a nice idea to remember that and emit it the same way.

> Second, since one either gives a tag or dereferences an anchor but never both, how do you manage to have "the same object" with two different tags in the same YAML document?

Less likely to occur in the same document, but it could. For instance, a schema could translate a !!int to an Integer object, but also !number to an Integer object. But Integer is singleton in Ruby, so there is only ever one object for a given number. So if a document has:

   ---
   a: !!int 42
   b: !number 42

Then how is that going to re-emit? I guess one could just say that's a bad schema, and we can only expect one class to have but one emit tag per schema, regardless of what it might have been loaded in with. That would eliminate the round-tripping portion of my case, at least in so far as the tag goes. Is perfect round tripping simply not something one can expect?

> Finally, at least in Ruby, you can get away with a lot if you monkey-patch either specific objects or even classes, but I hesitate to promote this :-)

Actually, it's the monkey patching I am trying to avoid. By delegating I can store the tag and decoration used for the loaded node in the delegator, so that re-emission looks the same. If I don't use the delegator, the only way to do this is via monkey-patching --which I too would rather not do.

Trans | 21 Feb 2013 16:47

Load Delegation

I'd like to get some feedback on an idea I have for implementing a YAML parser.

I was thinking that instead of instantiating native types directly, it could instantiate a delegator around the native type. For example, instead of (in Ruby code):

     YAML.load('--- "string"') => "string"

It would produce

     YAML.load('--- "string"') => Y("string")

For all practical purposes Y("string") behaves just like "string".

The main reason for doing this is immutable types. Immutable types are difficult to load with self-referencing anchors --indeed the only solution I found was to forbid it. Immutable types also might not round trip well, b/c they are often singleton. So if the same object comes in via two different tags --say a schema supports both `!foo` and `foo.org,2000:foo`-- it is only possible to emit it with one or the other. There is no way to remember which it came in with. Using the delegator is also nice b/c schemas then don't necessarily need to specify the type-class a tag goes with; they can just define how to load the representation. Without the delegator, the schema has to provide the class so that it can be allocated ahead of time in case there is an anchor/reference for it.

The downside, of course, is that as close as Y("string") is to mimicking "string", there are always going to be a few ways in which it is not the same. Typically these differences don't matter. But when doing tricky things, in particular meta-programming kinds of things, it could cause issues --in which case one would have to be sure to manually "unwrap" the delegation.
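
To make the trade-off concrete, here is a minimal sketch of such a delegator, written in Python rather than Ruby. It is an illustration only, not Psych's API or any existing library: the class name `Y` follows the post, while the attributes `yaml_tag` and `yaml_style` and the method `unwrap` are hypothetical names.

```python
class Y:
    """Delegating wrapper that remembers how a YAML node was written."""

    def __init__(self, value, tag=None, style=None):
        self._value = value
        self.yaml_tag = tag      # tag the node was loaded with, e.g. "!number"
        self.yaml_style = style  # scalar style, e.g. "|" for a literal block

    def __getattr__(self, name):
        # Only called when normal lookup fails: forward to the wrapped value.
        return getattr(self._value, name)

    # Special methods bypass __getattr__, so the common ones are forwarded by hand.
    def __eq__(self, other):
        return self._value == (other._value if isinstance(other, Y) else other)

    def __hash__(self):
        return hash(self._value)

    def __str__(self):
        return str(self._value)

    def __repr__(self):
        return f"Y({self._value!r})"

    def unwrap(self):
        """Return the plain native value."""
        return self._value
```

Two wrappers around the same immutable value, for example `Y(42, tag="!number")` and `Y(42, tag="tag:yaml.org,2002:int")`, compare equal while each keeps its own tag for re-emission. The downside shows up as well: `isinstance(Y("string"), str)` is false, so code that checks types needs `unwrap()` first.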

So what do you think? Is this a good idea? Or am I bat-shit crazy and just asking for trouble?

Thanks.

Trans | 3 Feb 2013 20:39

Re: Tag Spec


On Sun, Feb 3, 2013 at 1:31 PM, Oren Ben-Kiki <yaml-oren-vmbulFz3td5g9hUCZPvPmw@public.gmane.org> wrote:
> The question isn't one of ease of parsing (we all know YAML isn't easy to parse :-).

:-) Indeed!

> It is a matter of ambiguity. The !<verbatim> syntax was added very late in the game, to allow avoiding the tag prefix games.
>
> The thinking isn't "domain" vs. "local", the thinking is "full arbitrary URI" (inside <...>) vs. "suffix added to some arbitrary URI prefix" (without the <...>). The difference between "!", "!!" and "!foo!" is just "which URI prefix should we use here".

Though in practice that's what people see. Perhaps the better terminology is "globally unique" vs "local" tags. Although determining if a prefix is applied would depend on whether the tag contains a `:` or not, sticking with the current spec, !<...> tags would still not be resolved. To clarify the difference:

    %TAG ! tag:foo.org/
    ---
    - !foo               => !<tag:foo.org/foo>
    - !tag:bar.org/bar   => !<tag:bar.org/bar>

Also, I think this would open up `!<foo>` to be a legal local tag, rather than the degenerate global tag it is now b/c it is not a valid URI. So, without a `%TAG !` directive:

    ---
    - !foo               => !<foo>

So the `!` would no longer have any significance in a tag's *name*. It would be used only to designate a tag and to substitute prefixes, but a local tag would not need to be `!foo` any more, just `foo`.

> If we said "we don't attach any prefix if what follows the "!" looks like a complete URI" we'd be entering a world of pain. URIs can be in all sorts of forms: "urn:isbn:0-395-36341-1" is a URI and hence (if someone wanted) !<urn:isbn:0-395-36341-1> would be a valid (if somewhat insane) tag. So "looking like a complete URI" isn't really easy to define. "Looking like a tag URI" is well-defined, but YAML really doesn't insist on using "tag URIs", even though we call the "node type tags", well, tags.

I'm not seeing how this translates into a world of pain. Keeping it simple, as in "does it contain a `:`", should suffice b/c `:` is required of a valid URI. Local tags can live without them. Some adjustment might be required by end users in rare cases. Like your example, they could use !<urn:isbn#0-395-36341-1>. I don't think it's too much to ask of end-users that local tags not use `:`. (If it really is too much to ask, then perhaps a different escape notation could be allowed, e.g. `![urn:isbn:0-395-36341-1]`.)

(Note, before I suggested that `/` be an indicator of a global tag too. In that case if no `:` is present, `tag:` would be assumed. That just seemed like a nice convenience, but it is not necessary.)
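
The difference between the two rules can be written down in a few lines. This is a minimal Python sketch covering only shorthands that use the primary `!` handle (no `!!`, `!name!` or verbatim tags); the function names `resolve_current` and `resolve_proposed` are hypothetical, and the second function encodes the proposal from this message, not anything in the YAML spec.

```python
def resolve_current(shorthand: str, primary_prefix: str = "!") -> str:
    """Current rule: the `!` handle is always replaced by its prefix."""
    if not shorthand.startswith("!"):
        raise ValueError("expected a tag shorthand starting with '!'")
    return primary_prefix + shorthand[1:]


def resolve_proposed(shorthand: str, primary_prefix: str = "") -> str:
    """Proposed rule: a suffix containing ':' is taken as a complete tag."""
    if not shorthand.startswith("!"):
        raise ValueError("expected a tag shorthand starting with '!'")
    suffix = shorthand[1:]
    return suffix if ":" in suffix else primary_prefix + suffix
```

With `%TAG ! tag:foo.org/` in effect, `resolve_proposed("!tag:bar.org/bar", "tag:foo.org/")` gives `tag:bar.org/bar`, whereas `resolve_current` with the same arguments gives `tag:foo.org/tag:bar.org/bar`. Without a directive, `!foo` stays the local tag `!foo` under the current rule and becomes plain `foo` under the proposal.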

Trans | 3 Feb 2013 19:02

Tag Spec

I was reading over the spec on Tags. There is an aspect of it that is confusing. To the general viewer, it is not intuitive that the following tag is not a "domain" tag:

    --- !tag:foo.org:org

It is actually a local tag with a very domain-esque name. The correct tag is:

    --- !<tag:foo.org:org>

That's a very technical distinction and I think too difficult.

Would it be acceptable to just designate domain tags as any tag that contains a `:` or `/`? I realize it might not be as efficient to parse, but it would be a whole lot more comprehensible to humans.

Trans | 1 Feb 2013 22:24

Re: Help understanding tag resolution confusion

Ok, so all tags are *verbatim*, and there is no official spec that says `tag:foo` should be considered the same as `foo`. Is that correct?

Also, to be clear, is there any reason that Psych transforms `foo.org,2000/foo` into `foo.org,2000:foo`? Why would it change the '/' to a ':'? Maybe b/c of an old spec or something?

I appreciate the help.

On Fri, Feb 1, 2013 at 3:35 PM, Oren Ben-Kiki <oren-vmbulFz3td5g9hUCZPvPmw@public.gmane.org> wrote:
> I'm not certain about the specific library implementation, but the spec is pretty clear on tag resolution (well... as clear as I succeeded in making it :-)
>
> Different verbatim tags in general map to different types, but an application is allowed to define any mapping it wants between tags and native types, so if it chooses to make !<foo> and !<bar> both be the same type baz, that's its business (though it is arguably confusing and needs a good reason). It _sounds_ like this is what the library is doing as an intentional practice, I'm not certain why.
>
> Have fun,
>
>     Oren Ben-Kiki
>
> 2013/2/1 Trans <transfire-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>
> > Sorry, typo:
> >
> > s/`tag:foo.org:foo` of `tag:foo`/`tag:foo.org:foo` or `tag:foo`/

--
Sorry, says the barman, we don't serve neutrinos. A neutrino walks into a bar.

Trans <transfire <at> gmail.com>
7r4n5.com      http://7r4n5.com


Trans | 1 Feb 2013 20:34

Help understanding tag resolution confusion

Hi, I've been working with the Psych library, which is the YAML library that Ruby uses, built upon libyaml.

Psych provides a way to add a domain tag. Below I add a tag with the `foo.org` domain and the name `foo`;
it creates a tag from that called `tag:foo.org:foo`. It also creates a tag called `tag:foo`.
(See https://github.com/tenderlove/psych/blob/master/lib/psych.rb#L304-L308)

    require 'yaml'

    YAML.add_domain_type('foo.org', 'foo'){ |*a| "frak" }
    => ["tag:foo.org:foo", #<Proc:0x00000000eec280@(irb):2>]

But then I can use all of the following tags in the YAML to resolve to this domain tag:

    YAML.load('--- !<tag:foo.org:foo> "a"')
    => "frak"

    YAML.load('--- !<tag:foo> "a"')
    => "frak"

    YAML.load('--- !tag:foo.org:foo "a"')
    => "frak"

    YAML.load('--- !tag:foo "a"')
    => "frak"

    YAML.load('--- !foo.org:foo "a"')
    => "frak"

    YAML.load('--- !foo "a"')
    => "frak"

Are all of these tags actually valid and resolvable to `tag:foo.org:foo` or `tag:foo`?
It seems like the `tag:` is meaningless and can match if it is there or if it is not there.
Is that right? Or is this non-spec behaviour on Psych's part?

