Michael Strecke | 2 Aug 2006 00:38

[OSM-dev] API question: segment/#/ways

The resulting output contains a list of the ways (with their segments)
to which this segment belongs.

My question: Is the segment list of these ways guaranteed to be complete
(as opposed to the ways' segment list delivered by the "map" command)?
Erik Johansson | 2 Aug 2006 12:56

[OSM-dev] Re: Re: [OSM-talk] planet.dump

On 8/2/06, Jonas Svensson <jonass <at> lysator.liu.se> wrote:
> On Wed, 2 Aug 2006, SteveC wrote:
>
> > > I believe that the code that generates the dumps is
> > >     http://svn.openstreetmap.org/utils/planet.osm/planet.rb
> > >
> > > I'm sure that Steve would appreciate patches to make it better :)
> >
> > yes please
>
> I do not know anything about ruby at the moment. But one thing to check is
> any changes in the code or the environment between the May and July dumps.
> The May dump seems to be properly (html-entity) encoded, while the July one
> is lacking. That is one problem, the other is the data in the database. The
> July dump suggests (at least to me) that the database contains a mixture
> of charsets. Maybe we also have to make sure any data entered into the
> database is either converted to utf-8 or at least tagged with the proper
> coding.

The past dumps used an XML library; now the output is printed with plain puts.
The section that needs improvement is:

			v1 = v.gsub(/[']/,"&apos;") # escape quotes
			v2 = v1.gsub(/</,"&lt;") # escape <
			v3 = v2.gsub(/>/,"&gt;") # escape >
			puts "<tag k='#{k}' v='#{v3}' />"

I think you can manage that. ;-)
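
As a starting point, here is a minimal sketch of a more complete escape.
It is not the code in planet.rb; the names escape_xml_attr and tag_line
are made up for illustration. It also escapes "&" and double quotes, and
does everything in one pass so that the "&" of an already inserted
entity is not escaped a second time. It does nothing about the charset
problem, which is a separate issue.

```ruby
# Hypothetical helpers, not part of planet.rb.
XML_ESCAPES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  "'" => "&apos;",
  '"' => "&quot;"
}.freeze

# Escape the five XML special characters in a single gsub pass.
def escape_xml_attr(value)
  value.to_s.gsub(/[&<>'"]/) { |ch| XML_ESCAPES[ch] }
end

# Build one <tag/> line in the same shape as the snippet above.
def tag_line(k, v)
  "<tag k='#{escape_xml_attr(k)}' v='#{escape_xml_attr(v)}' />"
end

puts tag_line("name", "Fish & Chips <'best'>")
```

Escaping the key as well as the value costs nothing and avoids surprises
if a key ever contains a quote.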
Immanuel Scholz | 2 Aug 2006 13:16

[OSM-dev] planet.rb rewrite

Hi,

Before anyone puts too much work into it, I could do a rewrite of planet.rb
this evening, using my experience from little-osm. ;-)

The features I promise (that are not already in current planet.rb):
- constant memory consumption, regardless of db-size
- UTF-8 compliant
- correctness of timestamp=... for all objects

scheduled release: this evening ;)

Ciao, Imi
Nick Black | 2 Aug 2006 13:48

[OSM-dev] Dev is down?

There seems to be no web access to dev, though I can still get ssh access.

Anyone know why this is?

Cheers,

Nick
Immanuel Scholz | 3 Aug 2006 00:46

[OSM-dev] new planet.rb and planet2mysql.rb

Hi,

I uploaded a rewrite of planet.rb and a new script called planet2mysql.rb to 
subversion (.../utils/planet.osm/).

The former should read the current mysql database and output a planet.osm file 
(to stdout). There is no support for latitude/longitude boundaries, areas or 
history.
The latter script reads a planet.osm file and writes to a mysql database.

Some facts:
- both scripts use constant memory (regardless of the db-size)
- both scripts should be UTF-8 compliant
- both scripts are only quick-tested (more testing tomorrow)
- servinfo.rb is needed (should default to root <at> localhost without password)
- no dependency on libxml (sorry David ;)
- planet.rb outputs the time in RFC rather than mysql's own time format
- Expect "hours" to read a dump with planet2mysql (REXML).

Ciao, Imi.
Erik Johansson | 3 Aug 2006 10:05

[OSM-dev] Re: Re: [OSM-talk] planet.dump

On 8/3/06, Jonas Svensson <jonass <at> lysator.liu.se> wrote:
> On Wed, 2 Aug 2006, Raphael Jacquot wrote:
>
> > Jonas Svensson wrote:
> > > Has there been any discussion on how to handle international names and
> > > character encodings? Also things like writing direction (left-to-right,
> > > right-to-left and others)?
> > >
> > > I notice that the MapFeatures-page mentions International name, local name
> > > and regional name so there must have been some thinking on this subject.
> >
> >
> > for starters, the whole thing should be UTF-8
>
> Yes, wouldn't it be good to change the API to require strings (like names)
> to be UTF-8 when sent to the server/database? If possible also change the
> server to validate strings to be valid UTF-8.

Valid UTF-8 isn't enough. Some time ago, for example, someone[1]
complained about the encoding in planet.osm: they gave the example of
"Älvsjövägen" and said it looked horrible in the dump, because it was
UTF-8 encoded with "&"-entities.

That was perfectly valid UTF-8, but perhaps not the thing you want to
have in the DB. And I don't see how you can make sure applications
handle that correctly, because someone will always write a small
one-line script and make a mess.
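
To make the point concrete, here is a small sketch (the helper name
valid_utf8? is my own, not anything in the OSM code). A validity check
rejects Latin-1 bytes, but it happily accepts a name that some script
has already turned into entities, because that text is plain ASCII.

```ruby
# Hypothetical helper: true if the given bytes form valid UTF-8.
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# The street name as raw Latin-1 bytes: not valid UTF-8.
latin1 = "\xC4lvsj\xF6v\xE4gen".b

# The same name after a script escaped each UTF-8 byte as an entity.
# Pure ASCII, so it passes the check even though the data is mangled.
entities = "&#195;&#132;lvsj&#195;&#182;v&#195;&#164;gen"

puts valid_utf8?(latin1)    # false
puts valid_utf8?(entities)  # true
```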

So
1. only pass valid UTF-8 chars

Immanuel Scholz | 3 Aug 2006 10:19

Re: [OSM-dev] Re: Re: [OSM-talk] planet.dump

Hi,

> E.g. some time ago someone[1] complained about the encoding in
> planet.osm, they gave the example of "Älvsjövägen" and said it looked
> horrible in the dump, it was UTF-8 encoded with "&"-entities.

That was a bug, where the server encoded the byte representation of UTF-8
as html-escapes, which is not valid.

> That was perfectly valid UTF-8, but perhaps not the thing you want to
> have in the DB.

Well, the xml was "well formed", but it was not a valid representation of
the original data. And it was fortunately not in the DB ;-).
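
To illustrate the difference (a sketch only; the method names are made
up and this is not the server's code): escaping every non-ASCII byte
yields two references per character, while escaping per character
yields the one reference that actually names it.

```ruby
# encoding: utf-8

# What the bug amounted to: one numeric reference per UTF-8 *byte*.
def escape_bytes(s)
  s.bytes.map { |b| b < 128 ? b.chr : format("&#%d;", b) }.join
end

# What a correct escape looks like: one reference per *character*.
def escape_codepoints(s)
  s.each_char.map { |c| c.ord < 128 ? c : format("&#%d;", c.ord) }.join
end

puts escape_bytes("Ä")       # &#195;&#132;
puts escape_codepoints("Ä")  # &#196;
```

An XML parser reads the first form as the two characters U+00C3 and
U+0084, which is why the name came out garbled.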

Ciao, Imi
SteveC | 3 Aug 2006 18:21

Re: [OSM-dev] new planet.rb and planet2mysql.rb

It doesn't look like it checks whether a segment's nodes are visible?

Yes, the db is inconsistent: when you delete a node it should delete
its segments, but it doesn't.

*  <at>  02/08/06 11:46:17 PM immanuel.scholz <at> gmx.de wrote:
> Hi,
> 
> I uploaded a rewrite of planet.rb and a new script called planet2mysql.rb to 
> subversion (.../utils/planet.osm/).
> 
> The former should read the current mysql database and output a planet.osm file 
> (to stdout). There is no support for latitude/longitude boundaries, areas or 
> history.
> The latter script reads a planet.osm file and writes to a mysql database.
> 
> Some facts:
> - both scripts use constant memory (regardless of the db-size)
> - both scripts should be UTF-8 compliant
> - both scripts are only quick-tested (more testing tomorrow)
> - servinfo.rb is needed (should default to root <at> localhost without password)
> - no dependency to libxml (sorry David ;)
> - planet.rb outputs the time in RFC rather than mysql's own time format
> - Expect "hours" to read a dump with planet2mysql (REXML).
> 
> 
> Ciao, Imi.
> 
> _______________________________________________
> dev mailing list

Tommy Persson | 4 Aug 2006 02:11

Re: [OSM-dev] new planet.rb and planet2mysql.rb

SteveC <steve <at> asklater.com> writes:

> It doesn't look like it checks whether a segment's nodes are visible?
> 
> Yes, the db is inconsistent and when you delete a node it should delete
> its segments. But it doesn't.

Should it really?  I would expect the deletion of a node to fail if
the segments containing the node are not deleted first.  That way you
guard somewhat against mistakes and buggy code.
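
Roughly what I have in mind, as a toy in-memory sketch (the class and
method names are invented; this is not the OSM schema or API):

```ruby
# Hypothetical model for illustration only.
class NodeInUseError < StandardError; end

Segment = Struct.new(:id, :from, :to)

class Store
  def initialize
    @nodes = {}
    @segments = {}
  end

  def add_node(id)
    @nodes[id] = true
  end

  def add_segment(id, from, to)
    @segments[id] = Segment.new(id, from, to)
  end

  def delete_segment(id)
    @segments.delete(id)
  end

  def node?(id)
    @nodes.key?(id)
  end

  # Refuse to delete a node while any segment still refers to it.
  def delete_node(id)
    users = @segments.values.select { |s| s.from == id || s.to == id }
    unless users.empty?
      ids = users.map(&:id).join(", ")
      raise NodeInUseError, "node #{id} is still used by segment(s) #{ids}"
    end
    @nodes.delete(id)
  end
end

store = Store.new
store.add_node(1)
store.add_node(2)
store.add_segment(10, 1, 2)

begin
  store.delete_node(1)
rescue NodeInUseError => e
  puts "refused: #{e.message}"
end

store.delete_segment(10)
store.delete_node(1)   # succeeds now that no segment uses node 1
```

The client has to delete the segment explicitly first, so a stray node
delete cannot silently take segments with it.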

--

-- 
/Tommy Persson
SteveC | 4 Aug 2006 02:14

Re: [OSM-dev] new planet.rb and planet2mysql.rb

*  <at>  04/08/06 01:11:12 AM tpe <at> ida.liu.se wrote:
> SteveC <steve <at> asklater.com> writes:
> 
> > It doesn't look like it checks whether a segment's nodes are visible?
> > 
> > Yes, the db is inconsistent and when you delete a node it should delete
> > its segments. But it doesn't.
> 
> Should it really?  I would expect that the deletion of a node should
> fail if the segments containing the node are not deleted first.  In that
> way you guard somewhat against mistakes and buggy code.

Either way, it means performing extra db lookups on delete. The scheme you
outline is what I'll aim for with the Rails-based OSM, I think.

have fun,

SteveC steve <at> asklater.com http://www.asklater.com/steve/
