StoneyBoh | 1 May 01:01 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

Against my better judgment ... a few opposing opinions.

> i'm just concerned that the last time this was brought up
> there seemed to have been some resistance, but if everyone else is
> cool with it, as they seem to be (or maybe they're just bored of
> PartNumberStyle by now!) then i think it's good :)

I'm not cool with it, but quite honestly I am tired of these debates and 
don't have the inclination nor energy to discuss this ad infinitum.  So, 
I will respond 1 time and 1 time only with my arguments here - take them 
or leave them, but I won't be arguing back and forth.  Whatever the 
community ultimately decides, I'll accept. (But I will note that the 
community should be more than just Brian and Chris, who seem to be the 
only ones with the energy to debate something like this lately:) )

To take a couple specific points.

>> For common practice in English, the argument for "Parts 1?3" was that it was
>> the most correct for English, to not have the spaces.  That was noted at
>> Wikipedia, however, within a section which also noted that the most correct,
>> for English, is that an en-dash be used.
>>     

That's nice and all, but MBz is a collection of data, not a written 
work.  Therefore, typography rules do not need to apply.  You may prefer 
them to, but that is a preference not a requirement.

>> I'd also note that just about every modern word processor automatically
>> makes this substitution, for [0-9] numeric ranges, transparently converting
>> 1-9 into 1?9;

Brian Schweitzer | 1 May 02:26 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

>> For common practice in English, the argument for "Parts 1?3" was that it was
>> the most correct for English, to not have the spaces.  That was noted at
>> Wikipedia, however, within a section which also noted that the most correct,
>> for English, is that an en-dash be used.
>>

> That's nice and all, but MBz is a collection of data, not a written
> work.  Therefore, typography rules do not need to apply.  You may prefer
> them to, but that is a preference not a requirement.

Well, the *only* reason given, other than "I like it better", for using "1-3" instead of "1 - 3" was that it is more typographically correct.  I fail to see the real logic in arguing typographic correctness for spacing, but not for the character in between the spaces.
 
>> I'd also note that just about every modern word processor automatically
>> makes this substitution, for [0-9] numeric ranges, transparently converting
>> 1-9 into 1?9;

> So?  We don't use word processors to edit MBz data, nor do people tag
> their files using Word.

I wasn't suggesting that we do use word processors to edit MB.  However, I was suggesting that the use of correct typography, the en-dash included, is not unusual in the modern day, with modern software.  As an alternate example, I would point to Wikipedia; I think it could be suggested, without offending anyone here, that far more people edit there than edit MusicBrainz.  Yet, consulting their manual of style, it not only suggests, but *directs* the use of an en-dash, when appropriate.  http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style#Dashes
 
>> The new Guess Case is intelligent enough to do this substitution on
>> the fly, also transparently.
>>

> Nice.  That would certainly be helpful, but that is only half the
> problem.  I am sitting here reading and responding to this e-mail in
> Thunderbird, on a reasonably modern Windows machine (ok, I'm still
> running XP :P) and notice how your n-dashes displayed in the couple of
> paragraphs above ... as question marks (i.e. unknown characters that I
> can only guess are en-dashes).  I'm betting some mp3 players will have
> problems too.  I'm willing to accept that some reverse cyrillic
> character or some Kanji text doesn't display right on my screen or in my
> mp3 player, but I am not willing to accept that what should (to me) be a
> commonly used character - a dash (in the generic sense) - doesn't even
> display correctly.

I can't speak to why your install of Thunderbird isn't showing the en-dash.  Checking their docs, I *can*, however, confirm that the en-dash, em-dash, and all other Unicode characters are supported.  They also have been in the default Windows font since at least Windows 98, as well as supported by Mac and Linux, by default, for at least half a decade.  The only thing I can think of that might be causing you this problem is that your mailserver itself is mangling UTF-8 into something else, such as basic ASCII.  However, this sounds more like an argument for a new mailserver, not an argument for or against en-dashes.  :P
 
>> I think we could all agree, however, that, even if it's the easiest to type,
>> the hyphen-minus (the key on most keyboards) is the least correct range
>> indication character.

> You would be incorrect that we could all agree to that. (I would say
> that any character out of the dash-family would be much worse).

The hyphen-minus, by definition, has no typographical meaning.  It is not even a valid punctuation mark in *any* language or script.  It simply exists because, at one time in history, given no room to fit multiple dash, hyphen, and minus keys, and given that the output was pretty rough (and thus the distinction between them could not be detected anyhow), a compromise was made, *specifically for typewriters*.  The key then carried over to computer keyboards because they initially used, what else, typewriter keyboards.  Thus how can it be argued that it still is the best character, now that the correct typographical characters do exist, and have existed for sufficient enough time that every single computer (ok, save perhaps some of those still running Windows 95) on the planet supports them, without even changing fonts?  "Easiest", perhaps, but definitely not "best".

> We don't
> need a typographically correct character to indicate a range in a
> database.  We could choose to use one if we like, but we don't need to.
> We just need to agree on one that represents what we want it to.  Hell,
> we could agree to adopt the phrase " to " if we wanted to, or how about
> two dots ("..") like some programming languages use?  I personally think
> we should pick the character closest to what people expect to see (i.e.
> a dash of some sort) and that is easy for anyone to enter (i.e. it
> exists on western keyboards without the need of any macros or special
> gymnastics to type.)  To me, that means the thing that is next to the 0
> and above the o and the p on my keyboard.  I don't know (and don't care)
> whether that is a hyphen, a dash, a minus, or a thingamawhatchacallit.

We could also agree that red is blue.  :P

Seriously, I understand the argument about the hyphen-minus being the easiest to type.  We don't "need" to support anything at all, right?  However, we're talking about something here which is unarguably the more correct character to use.  The suggested guideline also, quite specifically, does not say that using a hyphen-minus is incorrect or disallowed, only that using an en-dash is preferred.  Also, the number of cases where there is something funky going on, and Guess Case is not used, is pretty small, with regards to the totality of the database.  So most of these would be autocorrected to en-dashes, even without the user doing anything at all.  For those that still end up entered using hyphen-minuses, there's nothing in the suggested guideline which would make those edits in any way whatsoever incorrect; definitely nothing in this guideline would suggest that an edit using a hyphen-minus should be voted against, just on that basis.
 
> So, to sum up my feelings:
> - MBz collects data, not printed text, and therefore does not need to
> follow typographical rules that are intended to make printed text "look
> better".

But "data" becomes text.  MusicBrainz data ends up used in many different contexts, not just as a source for taggers or a raw data dump (such as a release's listing on the MB site.)  Why should we not, when we so easily can support it, suggest that correct typography indeed then be used?  We don't need to include all the accented characters either - "Johann Johannsson" is just as comprehensible as "Jóhann Jóhannsson".  However, it's not as correct, so we use the accented o's.  (And for the record, Jóhann Jóhannsson is a lot more difficult for me to type, using a US keyboard with Linux English/US layout, without reference to character lookups, than is Johann Johannsson.)
 
> - Dash types are font dependent, and the differences between them will
> be lost on many people.

The apparent difference may be lost, but the inherent meaning is not.  Just because a hyphen-minus and an en-dash may look identical, in a given font, in a given context, that does not then make them identical characters, nor is the computer suddenly then rendered unable to recognize that they are different characters with differing typographical meaning.
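
The point that these are distinct characters regardless of how a font draws them can be checked directly. The following is only a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration and is not part of any MusicBrainz tool.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    # Hyphen-minus (the keyboard key), en dash, em dash, figure dash, minus sign.
    for ch in ("-", "\u2013", "\u2014", "\u2012", "\u2212"):
        print(describe(ch))
```

Even where two of these glyphs render identically, a comparison such as `"-" == "\u2013"` is false, so software reading the data can always tell them apart.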
 
> - If we add characters that don't display properly in applications
> people use, when a very close facsimile character is available, we risk
> alienating MBz contributors.

What applications?  Windows 95?  Windows 3.1?  Very old mp3 players that don't support even the very most basic Unicode characters?  Any software that today cannot display an en-dash is at least ten years old - for any software or hardware unable to render an en-dash correctly, there's bigger problems present, when attempting to use MB data, than whether or not we allow the use of correct typography.
 
> - The more MBz grows, the more inclusive we need to be, so we should be
> encouraging people to contribute by making it easy for them to do so.

Hence we allow the use of the hyphen-minus, and only "prefer" the en-dash.  Hence we provide a tool (the new Guess Case) which is capable of detecting proper situations to use an en-dash, at least with regards to Part Number Style.  However, I don't think that this is really a good reason to not use correct typography.  If Wikipedia can require the use of correct typography, we can at least suggest it, without our then making it "too hard" for the new editor to figure things out.  (Personally, given the number of people who enter things in ALL CAPS, I think some new users won't care how we word the guideline, or what we do or don't suggest re: typography...  but that's just me :P).
 
> - Even if it is not "mandatory" but just "preferred", then we will be
> encouraging some typographically-minded editors to spend untold hours
> running around the database and cleaning up people's dashes - and with
> all the work that needs to be done, is that really the best use of these
> people's time?  Already, I have spent 15 minutes or more writing this

Well, as I mentioned a while back, the number of cases where even "Parts 1-3" (per the guideline as it currently is written) occur are quite few, vs those where all sorts of other mess are present and not in compliance with *any* official or proposed PartNumberStyle.  So should some editor(s) decide to try to clean some of that mess up, more power to them.  :)  But seriously, if someone is typographically minded, and cares to spend time changing hyphen-minuses into en-dashes, with regards to Part Number Style, why should we argue against that?  Everyone who edits contributes his or her own time to MusicBrainz, and no matter how they edit, (hopefully), the data benefits.  Is it really for you or me to decide that, say, adding 100 ARs to a release really would be a better use of some other editor's time, vs their going through to convert the hyphen-minuses?  No one is telling anyone to do it, if he or she thinks it a waste of time; anyone doing it would be doing it *because he or she wanted to*.

Brian
_______________________________________________
Musicbrainz-style mailing list
Musicbrainz-style@...
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-style
Chad Wilson | 1 May 05:50 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

On 1/05/2009 7:01 a.m., StoneyBoh wrote:
> Against my better judgment ... a few opposing opinions.
>
>> i'm just concerned that the last time this was brought up there seemed to have been some resistance, but if everyone else is cool with it, as they seem to be (or maybe they're just bored of PartNumberStyle by now!) then i think it's good :)
>
> I'm not cool with it, but quite honestly I am tired of these debates and don't have the inclination nor energy to discuss this ad infinitum. So, I will respond 1 time and 1 time only with my arguments here - take them or leave them, but I won't be arguing back and forth. Whatever the community ultimately decides, I'll accept. (But I will note that the community should be more than just Brian and Chris, who seem to be the only ones with the energy to debate something like this lately:) ) To take a couple specific points.

+1000. This debate has lost its appeal entirely to me. It's largely philosophical in nature and will likely never be resolved.

> So, to sum up my feelings:
> - MBz collects data, not printed text, and therefore does not need to follow typographical rules that are intended to make printed text "look better".
> - Dash types are font dependent, and the differences between them will be lost on many people.
> - If we add characters that don't display properly in applications people use, when a very close facsimile character is available, we risk alienating MBz contributors.
> - The more MBz grows, the more inclusive we need to be, so we should be encouraging people to contribute by making it easy for them to do so.
> - Even if it is not "mandatory" but just "preferred", then we will be encouraging some typographically-minded editors to spend untold hours running around the database and cleaning up people's dashes - and with all the work that needs to be done, is that really the best use of these people's time? Already, I have spent 15 minutes or more writing this e-mail, and that's time I didn't spend working on the database.
>
> My thoughts. Take them, leave them, curse at them at your leisure. Done. Peace.
>
> Jeff

Completely agreed, 100%. In the current lack-of-debate situation we have, I think I would personally veto any RFV that recommends use of obscure typography for "normal situations", including the en-dash. Brian, people are so burnt out from this thread that there's almost zero chance of getting any discussion and this needs a lot more philosophical thought amongst a variety of contributors before going ahead with such a recommendation, IMO.

I don't know how widely it is and isn't supported (which is why I'd rather hear from more people than just Brian on the "breadth of support" issue); however, I want to know more. We have enough out-of-touch (to some) rules and guidelines that I sometimes struggle to defend, let alone further deviating from the mainstream for purposes as inconsequential as typographical correctness. It's just a song title. Next will be appropriate use of angle apostrophes, quotes etc. It feels like a slippery slope to me, and this is setting a typography precedent that I don't believe is the direction the community as a whole necessarily feels we should go in. It feels to me like the direction that at least one person wants to go in (which is fine), but that everyone else is too tired to argue the point (not fine!).

I've now sat down and drafted about 3 different replies to this thread (at various intervals) which have been deleted; this is my fourth attempt. I feel exasperated!

Chad / voice

Brian Schweitzer | 1 May 06:44 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

> Completely agreed, 100%. In the current lack-of-debate situation we have, I think I would personally veto any RFV that recommends use of obscure typography for "normal situations", including the en-dash. Brian, people are so burnt out from this

I give up.

Why does the MusicBrainz style community, at least the vocal part, consider standard typography to be "obscure", "confusing", "difficult", etc?

* I do know that I was taught standard typography, including the ellipsis, en-dash, em-dash, and hyphen, in school at around age 12, and didn't consider it difficult, hard, or confusing.  (And then we covered it again, when I was learning French.)

* Wikipedia does not consider it confusing, and in fact requires its use. 

* The authors of standard word processors don't consider it obscure, to such a point that they auto-insert it. 

* Most common wiki utilities include support for it (using --, ---, etc, to make insertion even easier, without having to even look up how to type an en-dash, etc.)

* Guidelines for the use of correct typography are present in just about any English style reference book you care to look at.

Yet, any time the suggestion is made that the en-dash, the em-dash, the ellipsis - or any other punctuation not found in the basic ASCII 127 character set - be allowed, the same people threaten preemptive vetoes, before an RFV is even made.  Notice the emphasis on "allowed".  Not mandated, not required, merely allowed.  I even went so far as, when rewriting Guess Case, to make sure that it had support to allow auto-correcting this typography, when possible.

Last time this was discussed, around 14 months ago, the argument was against it because it would have been required.  A suggestion was made a few times that Guess Case could be made to auto-correct the typography, so users didn't have to type it.  Both points would seem to be addressed here; the hyphen-minus is still allowed, just given lesser preference than the *correct* typographical mark, and the new Guess Case has been rewritten from scratch to perform, as part of its functions, exactly that suggested auto-correction.
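
To make the kind of auto-correction described above concrete, here is a rough sketch in Python. It is not the actual Guess Case code (which is not shown in this thread); the function name `prefer_en_dash` and the regular expression are assumptions made purely for illustration. It only touches a hyphen-minus between two numbers that follow the word "Parts", and it leaves the surrounding spacing alone, since the spacing question is a separate part of this RFC.

```python
import re

EN_DASH = "\u2013"

# Hypothetical pattern: "Parts", a number, an optional space, a hyphen-minus,
# an optional space, and another number. Only the hyphen-minus is replaced.
_PART_RANGE = re.compile(r"(\bParts\s+\d+\s*)-(\s*\d+)")


def prefer_en_dash(title: str) -> str:
    """Swap the hyphen-minus in a 'Parts N-M' range for an en dash."""
    return _PART_RANGE.sub(r"\g<1>" + EN_DASH + r"\g<2>", title)


if __name__ == "__main__":
    # Titles without a "Parts N-M" range are returned unchanged.
    for title in ("Foo, Parts 1-3", "Foo, Parts 1 - 3", "Foo-Bar, Part 1"):
        print(repr(prefer_en_dash(title)))
```

Because the substitution is limited to this one pattern, an editor who types the hyphen-minus never has to know how to enter an en-dash, and titles that merely contain a hyphen elsewhere are not altered.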

Look, MusicBrainz includes the other 100k or so characters in Unicode.  So why, when it comes to this handful of typographical marks, is there such antipathy towards at least allowing those of us who care about typographical correctness to actually have it?

I also don't buy this argument about people being tired of this debate.  The argument so far, be it about the spaces or allowing the en-dash, has come down to "I don't like it, so I'll veto it".  That's not a reasonable debate, that's a pissing contest, to see who can outlast whom.

Please, those of you who oppose even *allowing* correct typography, explain to me why this is such a difficult thing to allow to pass?  With the exception of Chris, there was no comment on this RFC until it hit the end of day 6, just before it would have gone to RFV.  Yet, look at the reasons given for opposition (I've tried to list all the reasons so far given):

1) " MBz is a collection of data, not a written work.  Therefore, typography rules do not need to apply."
2) "n-dashes displayed in the couple of paragraphs above ... as question mark"
3) "some mp3 players will have problems" / "it'll probably give some mp3 players a fit"
4) the en-dash is "obscure typography"
5) "i don't believe it really reflects common practice in english"
6) ..."or record sleeves"
7) "Dash types are font dependent"
8) "If we add characters that don't display properly in applications people use, when a very close facsimile character is available, we risk alienating MBz contributors."
9) "we should be encouraging people to contribute by making it easy for them to do so"
10) "...some typographically-minded editors to spend untold hours running around the database and cleaning up people's dashes"

Are any of these really arguments against allowing the use of *correct* typography?

1) Data which consists of a collection of interrelated strings is not "written"?  Also consider that that data can be, and already is, incorporated into text, where it then quite definitely becomes part of a written text.  (Even if it somehow wasn't already???)

2) Any computer sold in at least the past decade has included support for any correct typography we might discuss, out of the box and by default.  That some email servers (and IRC bots, I should mention... :P) still don't support UTF-8 should not be our concern...

3) This is entirely a tagger issue.  Picard can replace the en-dash with a hyphen-minus (as well as replacing any other characters a user might wish to replace), if it is a problem.  However, tagger or mp3 player issues should not define anything with regards to the data.  Should we also forbid the use of any other non-ASCII characters, for the same reason?  Of course not.  :)

4) This may be more a comment on various educational systems, I don't know.  However, the en-dash (as well as the figure dash, the em-dash, the hyphen, the minus symbol, the ellipsis, and the guillemet) has an entry in every English style guideline manual in my library.  Speaking only for myself, it was part of my grammar school education on basic English language.  And, when I studied a foreign language, those same punctuation marks were re-taught in the first year.  So how, then, is the en-dash "obscure"?  Would you consider, then, a semi-colon or full colon to be obscure typography?  They are used far less often, in typographically correct English, than the en-dash or ellipsis, yet would the argument then be that, because they appear on the US keyboard layout, they're somehow not as "obscure"?

5) First, since when did any of our guidelines reflect usage of anything in common English?  Capitalization Standard English definitely isn't "common English".  This is an argument that, because "common English" doesn't really actually mean anything, can be made to argue for or against *anything*.

6) Re: record (or CD, or tape, etc) sleeves, first, PartNumberStyle mostly ignores that text anyhow, but second, if this argument were actually to be used here, it would be the first time, to my knowledge, that we allow *any* guideline to be determined by actually arguing that a record sleeve is always perfectly interpretable, especially with regards to punctuation.  Given all the text effects that appear on sleeves, attempting to determine whether the character (if even present, outside of PartNumberStyle) used is a hyphen, en-dash, or something else...  that's just a hopeless case.  One could just as easily argue that all the characters on sleeve are en-dashes, em-dashes, hyphens, minuses, or figure dashes, and not hyphen-minuses.

7) Every character is font dependent.  A correctly designed font should render an en-dash the width of a capital N, and an em-dash the width of a capital M.  A hyphen-minus has no defined width, mainly because a hyphen-minus has *no defined meaning*.  Whether or not the dashes and hyphens actually are distinguishable, to the human eye, for a given font, does not change the fact that in fonts that are correctly designed, and intended for readability/legibility, these characters are plainly distinguishable.  Also, whether or not they can be distinguished, regardless of the font, the computer still knows what character is being used, and, given #1 above, can then render or wrap the text properly when the characters are included from the database into other text.

8) See #3

9) "Allowed" or "Preferred" is not the same as "Required".  Just because we encourage one character, the other character would not then somehow become wrong or incorrect.  Providing the correct tools, like Guess Case, should be the answer here, not dumbing down the data, simply to make things "easier".

10) What people choose to edit is their own business.  If someone wants to take the time to edit typography, why is this a problem?    A facetious question, but should we also eliminate the capitalization standards, or even just get rid of all the guidelines entirely?  That definitely would make data easier for new people to enter, as well as avoiding all that wasted time by all those editors going around using guess case (!) to fix capitalization and style issues...  :P

Brian
Sami Sundell | 1 May 07:01 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")


Brian Schweitzer wrote:

Just notes on a couple of issues.

> (such as a release's listing on the MB site.)  Why should we not,
> when we so easily can support it, suggest that correct typography
> indeed then be used?

I actually have nothing against this, but it should be automatic, and 
automatic in all data uses. Since hyphen-minus is easier to type, 
that's what people will use. At least I will.

If Guess Case or some other automatic mangler then changes it into some 
dash, that's okay by me. But similarly, if I use that release on my MP3 
tags, I fully expect it to turn into hyphen-minus at least 
semi-automatically.

> We don't need to include all the accented characters either - "Johann
> Johannsson" is just as comprehensible as "Jóhann Jóhannsson".

Depends on language, and accent. Finnish accented characters 
(particularly ä and ö) change the pronunciation significantly, and I 
could give you quite a list of words that have very different meaning, 
and the only difference is accents, or lack of them. I suspect the same 
is true of many other languages as well.

So no, that really doesn't apply to hyphen-minus vs. whatever-dash.

> - If we add characters that don't display properly in applications
> people use, when a very close facsimile character is available, we
> risk alienating MBz contributors.
>
>
> What applications?  Windows 95?  Windows 3.1?  Very old mp3 players
> that don't support even the very most basic Unicode characters?  Any

My MP3s are all tagged with ISO-8859-1. My phone doesn't support UTF-8, 
and SqueezeCenter dies horribly with UTF-16 (at least on Perl 5.10, I've 
heard rumors older Perl versions behave better).

> software that today cannot display an en-dash is at least ten years
> old - for any software or hardware unable to render an en-dash

Phone is from 2007, SqueezeCenter from middle of April.

Yeah, in theory the majority of software should support Unicode. In 
practice, there are still problems, and I suspect there will be for some 
time still. And it's not just single software, as you can see from my 
case it's also combinations of software.
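
The ISO-8859-1 case can be sketched as follows. This is only an illustration in Python of the "semi-automatic" downgrade described above, not the behaviour of Picard or any other tagger; the names `FALLBACKS` and `to_latin1_tag` are made up for the example. ISO-8859-1 has no en-dash, so encoding one directly raises `UnicodeEncodeError`, while accented letters such as "ó" encode without trouble.

```python
# Hypothetical fallback table: dash-like characters that ISO-8859-1 lacks,
# each mapped to the plain hyphen-minus.
FALLBACKS = {
    "\u2012": "-",  # figure dash
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2212": "-",  # minus sign
}


def to_latin1_tag(text: str) -> bytes:
    """Downgrade dashes, then encode for an ISO-8859-1-only tag field.

    Any other character outside ISO-8859-1 becomes "?" (errors="replace").
    """
    downgraded = text.translate(str.maketrans(FALLBACKS))
    return downgraded.encode("iso-8859-1", errors="replace")


if __name__ == "__main__":
    print(to_latin1_tag("Foo, Parts 1\u20133"))
    print(to_latin1_tag("J\u00f3hann J\u00f3hannsson"))
```

A mapping like this keeps the database free to store the en-dash while letting a tagger write something a Latin-1-only player or phone can display.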

-- 
  Sami Sundell
  ssundell@...
Bogdan Butnaru | 1 May 22:45 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

+ 1

-- Bogdan Butnaru

On Fri, May 1, 2009 at 6:44 AM, Brian Schweitzer
<brian.brianschweitzer@...> wrote:
> I give up.
>
> Why does the MusicBrainz style community, at least the vocal part, consider
> standard typography to be "obscure", "confusing", "difficult", etc?
>
> * I do know that I was taught standard typography, including the ellipsis,
> en-dash, em-dash, and hyphen, in school at around age 12, and didn't
> consider it difficult, hard, or confusing.  (And then we covered it again,
> when I was learning French.)
>
> * Wikipedia does not consider it confusing, and in fact requires its use.
>
> * The authors of standard word processors don't consider it obscure, to such
> a point that they auto-insert it.
>
> * Most common wiki utilities include support for it (using --, ---, etc, to
> make insertion even easier, without having to even look up how to type an
> en-dash, etc.)
>
> * Guidelines for the use of correct typography are present in just about any
> English style reference book you care to look at.
>
> Yet, any time the suggestion is made that the en-dash, the em-dash, the
> ellipsis - or any other punctuation not found in the basic ASCII 127
> character set - be allowed, the same people threaten preemptive vetoes,
> before an RFV is even made.  Notice the emphasis on "allowed".  Not
> mandated, not required, merely allowed.  I even went so far as, when
> rewriting Guess Case, to make sure that it had support to allow
> auto-correcting this typography, when possible.
>
> Last time this was discussed, around 14 months ago, the argument was against
> it because it would have been required.  A suggestion was made a few times
> that Guess Case could be made to auto-correct the typography, so users
> didn't have to type it.  Both points would seem to be addressed here; the
> hyphen-minus is still allowed, just given lesser preference than the
> *correct* typographical mark, and the new Guess Case has been rewritten from
> scratch to perform, as part of its functions, exactly that suggested
> auto-correction.
>
> Look, MusicBrainz includes the other 100k or so characters in Unicode.  So
> why, when it comes to this handful of typographical marks, is there such
> antipathy towards at least allowing those of us who care about typographical
> correctness to actually have it?
>
> I also don't buy this argument about people being tired of this debate.  The
> argument so far, be it about the spaces or allowing the en-dash, has come
> down to "I don't like it, so I'll veto it".  That's not a reasonable debate,
> that's a pissing contest, to see who can outlast whom.
>
> Please, those of you who oppose even *allowing* correct typography, explain
> to me why this is such a difficult thing to allow to pass?  With the exception of
> Chris, there was no comment on this RFC until it hit the end of day 6, just
> before it would have gone to RFV.  Yet, look at the reasons given for
> opposition (I've tried to list all the reasons so far given):
>
> 1) " MBz is a collection of data, not a written work.  Therefore, typography
> rules do not need to apply."
> 2) "n-dashes displayed in the couple of paragraphs above ... as question
> mark"
> 3) "some mp3 players will have problems" / "it'll probably give some mp3
> players a fit"
> 4) the en-dash is "obscure typography"
> 5) "i don't believe it really reflects common practice in english"
> 6) ..."or record sleeves"
> 7) "Dash types are font dependent"
> 8) "If we add characters that don't display properly in applications people
> use, when a very close facsimile character is available, we risk alienating
> MBz contributors."
> 9) "we should be encouraging people to contribute by making it easy for them
> to do so"
> 10) "...some typographically-minded editors to spend untold hours running
> around the database and cleaning up people's dashes"
>
> Are any of these really arguments against allowing the use of *correct*
> typography?
>
> 1) Data which consists of a collection of interrelated strings is not
> "written"?  Also consider that that data can be, and already is,
> incorporated into text, where it then quite definitely becomes part of a
> written text.  (Even if it somehow wasn't already???)
>
> 2) Any computer sold in at least the past decade has included support for
> any correct typography we might discuss, out of the box and by default.
> That some email servers (and IRC bots, I should mention... :P) still don't
> support UTF-8 should not be our concern...
>
> 3) This is entirely a tagger issue.  Picard can replace the en-dash with a
> hyphen-minus (as well as replacing any other characters a user might wish to
> replace), if it is a problem.  However, tagger or mp3 player issues should
> not define anything with regards to the data.  Should we also forbid the use
> of any other non-ASCII characters, for the same reason?  Of course not.
> :)
>
> 4) This may be more a comment on various educational systems, I don't know.
> However, the en-dash (as well as the figure dash, the em-dash, the hyphen,
> the minus symbol, the ellipsis, and the guillemet) has an entry in every
> English style guideline manual in my library.  Speaking only for myself, it
> was part of my grammar school education on basic English language.  And,
> when I studied a foreign language, those same punctuation marks were
> re-taught in the first year.  So how, then, is the en-dash "obscure"?  Would
> you consider, then, a semi-colon or full colon to be obscure typography?
> They are used far less often, in typographically correct English, than the
> en-dash or ellipsis, yet would the argument then be that, because they appear
> on the US keyboard layout, they're somehow not as "obscure"?
>
> 5) First, since when did any of our guidelines reflect usage of anything in
> common English?  Capitalization Standard English definitely isn't "common
> English".  This is an argument that, because "common English" doesn't really
> actually mean anything, can be made to argue for or against *anything*.
>
> 6) Re: record (or CD, or tape, etc) sleeves, first, PartNumberStyle mostly
> ignores that text anyhow, but second, if this argument were actually to be
> used here, it would be the first time, to my knowledge, that we allow *any*
> guideline to be determined by actually arguing that a record sleeve is
> always perfectly interpretable, especially with regards to punctuation.
> Given all the text effects that appear on sleeves, attempting to determine
> whether the character (if even present, outside of PartNumberStyle) used is
> a hyphen, en-dash, or something else...  that's just a hopeless case.  One
> could just as easily argue that all the characters on a sleeve are en-dashes,
> em-dashes, hyphens, minuses, or figure dashes, and not hyphen-minuses.
>
> 7) Every character is font dependent.  A correctly designed font should
> render an en-dash the width of a capital N, and an em-dash the width of a
> capital M.  A hyphen-minus has no defined width, mainly because a
> hyphen-minus has *no defined meaning*.  Whether or not the dashes and
> hyphens actually are distinguishable, to the human eye, for a given font,
> does not change the fact that in fonts that are correctly designed, and
> intended for readability/legibility, these characters are plainly
> distinguishable.  Also, whether or not they can be distinguished, regardless
> of the font, the computer still knows what character is being used, and,
> given #1 above, can then render or wrap the text properly when the
> characters are included from the database into other text.
>
> 8) See #3
>
> 9) "Allowed" or "Preferred" is not the same as "Required".  Just because we
> encourage one character, the other character would not then somehow become
> wrong or incorrect.  Providing the correct tools, like Guess Case, should be
> the answer here, not dumbing down the data, simply to make things "easier".
>
> 10) What people choose to edit is their own business.  If someone wants to
> take the time to edit typography, why is this a problem?    A facetious
> question, but should we also eliminate the capitalization standards, or even
> just get rid of all the guidelines entirely?  That definitely would make
> data easier for new people to enter, as well as avoiding all that wasted
> time by all those editors going around using guess case (!) to fix
> capitalization and style issues...  :P
>
> Brian
>
> _______________________________________________
> Musicbrainz-style mailing list
> Musicbrainz-style@...
> http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-style
>
SwissChris | 1 May 23:38 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

+ 1

On Fri, May 1, 2009 at 10:45 PM, Bogdan Butnaru <bogdanb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
+ 1

-- Bogdan Butnaru



On Fri, May 1, 2009 at 6:44 AM, Brian Schweitzer
<brian.brianschweitzer <at> gmail.com> wrote:
> I give up.
>
> Why does the MusicBrainz style community, at least the vocal part, consider
> standard typography to be "obscure", "confusing", "difficult", etc?
>
> * I do know that I was taught standard typography, including the ellipsis,
> en-dash, em-dash, and hyphen, in school at around age 12, and didn't
> consider it difficult, hard, or confusing.  (And then we covered it again,
> when I was learning French.)
>
> * Wikipedia does not consider it confusing, and in fact requires its use.
>
> * The authors of standard word processors don't consider it obscure, to such
> a point that they auto-insert it.
>
> * Most common wiki utilities include support for it (using --, ---, etc, to
> make insertion even easier, without having to even look up how to type an
> en-dash, etc.)
>
> * Guidelines for the use of correct typography are present in just about any
> English style reference book you care to look at.
>
> Yet, any time the suggestion is made that the en-dash, the em-dash, the
> ellipsis - or any other punctuation not found in the basic ASCII 127
> character set - be allowed, the same people threaten preemptive vetoes,
> before an RFV is even made.  Notice the emphasis on "allowed".  Not
> mandated, not required, merely allowed.  I even went so far as, when
> rewriting Guess Case, to make sure that it had support to allow
> auto-correcting this typography, when possible.
>
> Last time this was discussed, around 14 months ago, the argument was against
> it because it would have been required.  A suggestion was made a few times
> that Guess Case could be made to auto-correct the typography, so users
> didn't have to type it.  Both points would seem to be addressed here; the
> hyphen-minus is still allowed, just given lesser preference than the
> *correct* typographical mark, and the new Guess Case has been rewritten from
> scratch to perform, as part of its functions, exactly that suggested
> auto-correction.
>
> Look, MusicBrainz includes the other 100k or so characters in Unicode.  So
> why, when it comes to this handful of typographical marks, is there such
> antipathy towards at least allowing those of us who care about typographical
> correctness to actually have it?
>
> I also don't buy this argument about people being tired of this debate.  The
> argument so far, be it about the spaces or allowing the en-dash, has come
> down to "I don't like it, so I'll veto it".  That's not a reasonable debate,
> that's a pissing contest, to see who can outlast whom.
>
> Please, those of you who oppose even *allowing* correct typography, explain
> to me why this is such a hard thing to allow to pass?  With the exception of
> Chris, there was no comment on this RFC until it hit the end of day 6, just
> before it would have gone to RFV.  Yet, look at the reasons given for
> opposition (I've tried to list all the reasons so far given):
>
> 1) " MBz is a collection of data, not a written work.  Therefore, typography
> rules do not need to apply."
> 2) "n-dashes displayed in the couple of paragraphs above ... as question
> mark"
> 3) "some mp3 players will have problems" / "it'll probably give some mp3
> players a fit"
> 4) the en-dash is "obscure typography"
> 5) "i don't believe it really reflects common practice in english"
> 6) ..."or record sleeves"
> 7) "Dash types are font dependent"
> 8) "If we add characters that don't display properly in applications people
> use, when a very close facsimile character is available, we risk alienating
> MBz contributors."
> 9) "we should be encouraging people to contribute by making it easy for them
> to do so"
> 10) "...some typographically-minded editors to spend untold hours running
> around the database and cleaning up people's dashes"
>
> Are any of these really arguments against allowing the use of *correct*
> typography?
>
> 1) Data which consists of a collection of interrelated strings is not
> "written"?  Also consider that that data can be, and already is,
> incorporated into text, where it then quite definitely becomes part of a
> written text.  (Even if it somehow wasn't already???)
>
> 2) Any computer sold in at least the past decade has included support for
> any correct typography we might discuss, out of the box and by default.
> That some email (and IRC bots, I should mention... :P) servers still don't
> support UTF-8 should not be our concern...
>
> 3) This is entirely a tagger issue.  Picard can replace the en-dash with a
> hyphen-minus (as well as replacing any other characters a user might wish to
> replace), if it is a problem.  However, tagger or mp3 player issues should
> not define anything with regards to the data.  Should we also forbid the use
> of any other non ASCII 127 characters, for the same reason?  Of course not.
> :)
>
> 4) This may be more a comment on various educational systems, I don't know.
> However, the en-dash (as well as the figure dash, the em-dash, the hyphen,
> the minus symbol, the ellipsis, and the guillemet) has an entry in every
> English style guideline manual in my library.  Speaking only for myself, it
> was part of my grammar school education on basic English language.  And,
> when I studied a foreign language, those same punctuation marks were
> re-taught in the first year.  So how, then, is the en-dash "obscure"?  Would
> you consider, then, a semi-colon or full colon to be obscure typography?
> They are used far less often, in typographically correct English, than the
> en-dash or ellipsis, yet would the argument then be that, because they appear
> on the US keyboard layout, they're somehow not as "obscure"?
>
> 5) First, since when did any of our guidelines reflect usage of anything in
> common English?  Capitalization Standard English definitely isn't "common
> English".  This is an argument that, because "common English" doesn't really
> actually mean anything, can be made to argue for or against *anything*.
>
> 6) Re: record (or CD, or tape, etc) sleeves, first, PartNumberStyle mostly
> ignores that text anyhow, but second, if this argument were actually to be
> used here, it would be the first time, to my knowledge, that we allow *any*
> guideline to be determined by actually arguing that a record sleeve is
> always perfectly interpretable, especially with regards to punctuation.
> Given all the text effects that appear on sleeves, attempting to determine
> whether the character (if even present, outside of PartNumberStyle) used is
> a hyphen, en-dash, or something else...  that's just a hopeless case.  One
> could just as easily argue that all the characters on a sleeve are en-dashes,
> em-dashes, hyphens, minuses, or figure dashes, and not hyphen-minuses.
>
> 7) Every character is font dependent.  A correctly designed font should
> render an en-dash the width of a capital N, and an em-dash the width of a
> capital M.  A hyphen-minus has no defined width, mainly because a
> hyphen-minus has *no defined meaning*.  Whether or not the dashes and
> hyphens actually are distinguishable, to the human eye, for a given font,
> does not change the fact that in fonts that are correctly designed, and
> intended for readability/legibility, these characters are plainly
> distinguishable.  Also, whether or not they can be distinguished, regardless
> of the font, the computer still knows what character is being used, and,
> given #1 above, can then render or wrap the text properly when the
> characters are included from the database into other text.
>
> 8) See #3
>
> 9) "Allowed" or "Preferred" is not the same as "Required".  Just because we
> encourage one character, the other character would not then somehow become
> wrong or incorrect.  Providing the correct tools, like Guess Case, should be
> the answer here, not dumbing down the data, simply to make things "easier".
>
> 10) What people choose to edit is their own business.  If someone wants to
> take the time to edit typography, why is this a problem?    A facetious
> question, but should we also eliminate the capitalization standards, or even
> just get rid of all the guidelines entirely?  That definitely would make
> data easier for new people to enter, as well as avoiding all that wasted
> time by all those editors going around using guess case (!) to fix
> capitalization and style issues...  :P
>
> Brian
>
Paul C. Bryan | 2 May 00:38 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

So, +1 from me too.

Not to muddy the waters too much on this RFC, but eventually I'd love to
see sanctioned use of other punctuation and typography: ellipses, 
guillemets, ordinals, etc.

I'm not in favour of disabling the MB database to handle media players,
filesystems, etc. that have encoding disabilities. Filtering and
transforming text to support ASCII, ISO 8859-x, etc. should be the
function of taggers, not MB proper.
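
A minimal sketch of what that kind of tagger-side filtering could look like,
in Python. The helper name `to_ascii_punctuation` and the fallback table are
hypothetical, chosen only for illustration; this is not how Picard or any
other MusicBrainz tool actually does it.

```python
# Hypothetical tagger-side filter (an assumption for illustration, not
# Picard's real code): the database keeps the typographic characters, and
# the tagger maps them to ASCII stand-ins only for players that need it.
PUNCTUATION_FALLBACKS = {
    "\u2010": "-",    # hyphen
    "\u2012": "-",    # figure dash
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2212": "-",    # minus sign
    "\u2026": "...",  # horizontal ellipsis
    "\u00ab": "<<",   # left-pointing guillemet
    "\u00bb": ">>",   # right-pointing guillemet
}

# str.maketrans accepts a dict of single characters mapped to strings.
_FALLBACK_TABLE = str.maketrans(PUNCTUATION_FALLBACKS)


def to_ascii_punctuation(title: str) -> str:
    """Return title with typographic punctuation replaced by ASCII stand-ins."""
    return title.translate(_FALLBACK_TABLE)


if __name__ == "__main__":
    print(to_ascii_punctuation("Foo, Parts 1\u20133"))
```

The stored title is left untouched; only the copy written to the file tags
is transformed, so a user whose player handles Unicode can simply skip the
step.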

Finally, on how this was handled, I too feel somewhat burnt-out from:

a) the style discussion precursor to the RFC,
b) the sheer amount of content being thrown back and forth,
c) having to diff wiki pages to see what changed.

As to c), I am in favour of future proposals remaining documented within
the email message to the list rather than being made by reference per a
wiki link to a page (or even a diff).

Paul

On Fri, 2009-05-01 at 00:44 -0400, Brian Schweitzer wrote:
>         Completely agreed, 100%. In the current lack-of-debate
>         situation we have, I think I would personally veto any RFV
>         that recommends use of obscure typography for "normal
>         situations", including the en-dash. Brian, people are so burnt
>         out from this 
> 
StoneyBoh | 2 May 01:01 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

Ok, one more quick reply.

I'm sorry Brian, my e-mail was not intended to piss you off, but just to 
give an opposing opinion.  As I indicated, these are my thoughts, folks 
can disagree with them or label me as wonky for holding them, that's 
fine.  I just wanted to make sure such an opinion was voiced, as there 
had not been much feedback on either side of the fence here.  This is 
not a pissing contest, just an exchange of ideas.

As I said in my e-mail - although probably not clear enough - I will 
support whatever the community decides.  If lots of folks in the 
community think this is important, then that's fine and I will be ok 
with it.  I just hadn't heard much support either way until now.  It 
seems clear now that at least some others support your intent, and that 
wasn't at all clear before now.  As has been stated recently, style 
decisions don't have to be unanimous, but they should have support of a 
good chunk of the community.  Has anyone heard from our style-leader lately?

This is a subject you are obviously very passionate about.  That can be 
good.  Just make sure you look hard at all sides of the issues before 
assuming everyone will surely agree with you.

Again, peace.
Jeff
Brian Schweitzer | 2 May 02:02 2009

Re: RFC: PartNumberStyle rewrite (was "Foo, Parts 1-3" vs, "Foo, Parts 1 - 3")

On Fri, May 1, 2009 at 7:01 PM, StoneyBoh <jshoj-9q/xBM6aKHVWk0Htik3J/w@public.gmane.org> wrote:
> Ok, one more quick reply.
>
> I'm sorry Brian, my e-mail was not intended to piss you off, but just to
> give an opposing opinion.  As I indicated, these are my thoughts, folks
> can disagree with them or label me as wonky for holding them, that's
> fine.  I just wanted to make sure such an opinion was voiced, as there
> had not been much feedback on either side of the fence here.  This is
> not a pissing contest, just an exchange of ideas.
>
> As I said in my e-mail - although probably not clear enough - I will
> support whatever the community decides.  If lots of folks in the
> community think this is important, then that's fine and I will be ok
> with it.  I just hadn't heard much support either way until now.  It
> seems clear now that at least some others support your intent, and that
> wasn't at all clear before now.  As has been stated recently, style
> decisions don't have to be unanimous, but they should have support of a
> good chunk of the community.  Has anyone heard from our style-leader lately?
>
> This is a subject you are obviously very passionate about.  That can be
> good.  Just make sure you look hard at all sides of the issues before
> assuming everyone will surely agree with you.
>
> Again, peace.
> Jeff

Oh definitely, and my apologies for losing my cool as well.  :)

Personally, I've not seen Jim in a long while...

The only thing I'd suggest reconsidering, regarding what you said above, is
the part about "As has been stated recently, style decisions don't have to be
unanimous".  Maybe I missed something, but did something change regarding the
"any one person can veto an RFV, and that kills it" policy?  That's what had
annoyed me: a preemptive veto being sent to the list, which essentially kills
any proposal and makes any further discussion seem simply a waste of time...

Brian