James Clark | 17 Sep 2003 04:10

Re: Editing basics: splitting elements

DuCharme, Bob (LNG-CHO) wrote:

> To extrapolate this to a design principle, I find it very productive to 
> have the cursor end up where I can type away with PCDATA. 
>  
> In general, Emacs is a content creation tool for me

Creating new marked-up content is one very important kind of usage. But 
I don't think it's the only one.  Another kind is adding markup to 
unmarked-up content. nxml-mode as a whole needs to work well for both, 
without any customization.  One XML editor I've used makes > electric 
and automatically insert the closing tag.  That's handy for the former 
kind of usage but super-annoying for the latter.  Similarly C-RET needs 
to work well for both kinds of users.

On the other hand, when adding a new command, we can target that new 
command exclusively at one kind of usage.  For example, we can add a 
C-c C-e command for creating a new element that inserts the end-tag and 
leaves point between tags; this would be good only for the former kind 
of usage.  Or we can add a C-c C-r command for tagging a region, that's 
good only for the latter.
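
As a rough illustration of the difference between the two commands, here is a minimal sketch in Python that models the buffer as a string and point as an integer offset. The function names insert_element and tag_region are hypothetical, and this is not how nxml-mode implements anything; it only shows where point would end up in each case.

```python
def insert_element(text, point, name):
    """Insert <name></name> at point and leave point between the two tags.

    Models the "creating new content" command: the author can start
    typing the element's content immediately.
    """
    start_tag = f"<{name}>"
    end_tag = f"</{name}>"
    new_text = text[:point] + start_tag + end_tag + text[point:]
    return new_text, point + len(start_tag)


def tag_region(text, start, end, name):
    """Wrap the existing text between start and end in <name>...</name>.

    Models the "adding markup to unmarked-up content" command: the
    content already exists, so point is left after the end-tag.
    """
    if start > end:
        start, end = end, start
    start_tag = f"<{name}>"
    end_tag = f"</{name}>"
    new_text = text[:start] + start_tag + text[start:end] + end_tag + text[end:]
    return new_text, end + len(start_tag) + len(end_tag)
```

The first function suits someone typing fresh content; the second suits someone marking up text that is already in the buffer.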

Another point to bear in mind is that nxml-mode isn't just for creating 
document-based XML (i.e. XML intended to create human-readable 
documents).  It's also for more data-oriented XML: this sort of XML may 
not have any PCDATA.  I want to be able to use nxml-mode to edit not 
just DocBook but also my XSLT stylesheets.

There's a danger that by excessive tuning for one kind of editing task 
or one kind of XML, we make it unsuitable for other kinds of task or 
kinds of XML.

James Clark | 17 Sep 2003 06:24

Re: nxml-mode performance on a big file

Sebastian Rahtz wrote:

> my machine is  P3 933, I think; when I tried it again today, after
> removing the DOCTYPE declaration to avoid confusion, it looked to be
> taking around a minute on average to do the first validation. Killing
> the buffer and reloading it takes that down to 30 seconds. If I start
> editing the root element it has to start again, of course, and that
> seems to take a while to get wound up again (30-60 seconds).

I clocked it at about 40 seconds, though it seems to depend on what sort 
of state Emacs is in (I guess memory usage).

Part of the explanation is that your file has a lot of references to 
character entities.  When you remove the DOCTYPE, there are nearly 2000 
errors.   Errors are represented using Emacs overlays, and Emacs doesn't 
handle large numbers of overlays efficiently.  I suspect it's not 
typical to have this many errors, so I'm not too worried about it. 
Nonetheless I have put in a change that seems to help with initial 
loading of a file with lots of errors.

The other explanation of why it took longer than I expected is simply 
that the markup in your file is very dense.  The time taken is 
proportional more to the number of elements and attributes than to the 
size of the file.
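
To get a feel for how dense a file's markup is, one could count elements and attributes and compare that with the byte size. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the function name markup_density is made up for illustration, and the counts are only a proxy for the work a validator has to do, not a measurement of nxml-mode itself.

```python
import xml.etree.ElementTree as ET


def markup_density(xml_text):
    """Return the byte size, element count and attribute count of a document."""
    root = ET.fromstring(xml_text)
    elements = 0
    attributes = 0
    # iter() visits the root element and every descendant element.
    for el in root.iter():
        elements += 1
        attributes += len(el.attrib)
    return {
        "bytes": len(xml_text.encode("utf-8")),
        "elements": elements,
        "attributes": attributes,
    }
```

Two files of similar size can give very different element and attribute counts, which is the quantity the time taken is said to track.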

The RELAX NG validation proper accounts for only a small fraction of the 
overall time.  Apart from the overlay issue, there's no one single 
bottleneck.  If I invested a *lot* of time, I might (at best) make it 
twice as fast (and probably introduce some bugs and decrease 
maintainability).  However, I don't think there's any way I could make 
[...]

Sebastian Rahtz | 17 Sep 2003 11:04

Re: extra characters inserted at front of file


> Very strange.  You could do M-x list-coding-systems with and without -q 
> to see if my guess is correct.  You also might try grepping your 
> packages for
> 
>   "define-coding-system-alias 'utf-8"

ah, I found it, I think. 
Mule-UCS defines  "utf-8-ws", which adds the signature. looks like that
was getting picked up by mistake. and I think you already said
not to use mule....
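
For anyone who wants to check a file for the extra characters outside Emacs, here is a minimal sketch in Python. It assumes that "the signature" means the UTF-8 byte order mark (the bytes EF BB BF); the function names are hypothetical.

```python
import codecs


def has_utf8_signature(data: bytes) -> bool:
    """True if the byte string starts with the UTF-8 signature (BOM)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_signature(data: bytes) -> bytes:
    """Return data without a leading UTF-8 signature, if one is present."""
    if has_utf8_signature(data):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Read the file in binary mode before calling these, so that no decoding step removes the signature first.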
-- 
Sebastian Rahtz <sebastian.rahtz <at> computing-services.oxford.ac.uk>
OUCS

------------------------ Yahoo! Groups Sponsor ---------------------~-->
Upgrade to 128-Bit SSL Security!
http://us.click.yahoo.com/p7cEmB/s7qGAA/yigFAA/2U_rlB/TM
---------------------------------------------------------------------~->

To unsubscribe from this group, send an email to:
emacs-nxml-mode-unsubscribe <at> yahoogroups.com

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 

Sebastian Rahtz | 17 Sep 2003 11:09

Re: nxml-mode performance on a big file


> The other explanation of why it took longer than I expected is simply 
> that the markup in your file is very dense.  The time taken is 
> proportional more to the number of elements and attributes than to the 
> size of the file.

that's a fair point. thanks for the explanations. I quite agree with your
conclusion that performance isn't a serious issue at the
moment.

> > By the way, I note that xsd:IDREF attributes are not validated.
> > Something for the TODO?
> 
> Right. RELAX NG DTD compatibility is not implemented at all. 
> Eventually, I want something more functional than ID/IDREF (e.g. 
> whatever DSDL comes up with); I don't think ID/IDREF is enough in a 
> schema world.

OK. Fair enough. 
I'll action myself to add an emacs menu which includes
an option to run an external validator for occasions when one
needs that check.
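
As a stopgap, the ID/IDREF check itself is simple enough to sketch outside Emacs. The Python below is only an illustration: it assumes, hypothetically, that every attribute named "id" is an ID and every attribute named "ref" is an IDREF, whereas a real check would take those types from the schema.

```python
import xml.etree.ElementTree as ET


def check_idrefs(xml_text, id_attr="id", idref_attr="ref"):
    """Return (duplicate_ids, dangling_refs) for a document.

    id_attr and idref_attr are assumed attribute names, not something
    read from a schema.
    """
    root = ET.fromstring(xml_text)
    seen = set()
    duplicates = []
    refs = []
    for el in root.iter():
        id_value = el.get(id_attr)
        if id_value is not None:
            if id_value in seen:
                duplicates.append(id_value)
            seen.add(id_value)
        ref_value = el.get(idref_attr)
        if ref_value is not None:
            refs.append(ref_value)
    # References are resolved only after the whole document has been
    # read, because an IDREF may point forward to a later ID.
    dangling = [r for r in refs if r not in seen]
    return duplicates, dangling
```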
-- 
Sebastian Rahtz <sebastian.rahtz <at> computing-services.oxford.ac.uk>
OUCS


Norman Walsh | 17 Sep 2003 16:38

Re: Editing basics: splitting elements


/ James Clark <jjc <at> jclark.com> was heard to say:
| There's a danger that by excessive tuning for one kind of editing task 
| or one kind of XML, we make it unsuitable for other kinds of task or 
| kinds of XML.

Conversely, if it isn't tuned sufficiently to satisfy the needs of a
class of authors, they won't use it.

I wonder if there are more than two classes: document creators and
document editors. If not, maybe two sets of commands can be created. I
can imagine one set bound to X, Y, Z, etc. and the other set bound to
C-u X, C-u Y, C-u Z, etc. with some sort of toggle to invert the set
bindings.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <normyahoo <at> nwalsh.com> | I don't make predictions. I never
http://nwalsh.com/                  | have and I never will.--Tony Blair

DuCharme, Bob (LNG-CHO) | 17 Sep 2003 18:40

RNG and selective validation

(This is a general RNG question and not an nxml question, but despite my
OASIS membership I can't figure out how to subscribe to the mailing list
there. The "mailing list" link on
http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=relax-ng doesn't
go anywhere, and selecting http://lists.oasis-open.org/ob/adm.pl  from
http://www.oasis-open.org/private/using_email.php just lists xml.org mailing
lists, without the RNG one.)

The following might look familiar to some of you who saw a similar issue
that I recently posted on xml-dev. It's more ambitious here, because, well,
because RNG can handle it. It all works, but I just want to see if anyone
has any suggestions about the basic model for doing what I'll call selective
validation: a schema that accepts any well-formed document, except that any
elements and attributes it comes across that are declared in that schema
must conform to those declarations. 

The sample schema below accepts anything, but checks that any beta and gamma
elements conform to the included declarations. Furthermore, if any quantity
or rev attributes show up anywhere, the processor should validate their
types. It's basically a more generalized version of the sample schema shown
at the end of section 11 of the tutorial at
http://www.relaxng.org/tutorial-20011203.html#IDAFLZR.

My plan is that if I decide to add a new element for the schema to watch
for, I add its name to the exception list and I add its declaration to the
elements group. Similarly, to add a new attribute that isn't supposed to be
part of a declared element, I add its name to the attribute exception list
and add its declaration to the attribute declaration list. 

Does this make sense? 
[...]

Bruce D'Arcus | 17 Sep 2003 18:54

Re: RNG and selective validation


On Wednesday, September 17, 2003, at 12:40  PM, DuCharme, Bob (LNG-CHO) 
wrote:

> This is a general RNG question and not an nxml question, but despite my
> OASIS membership I can't figure out how to subscribe to the mailing 
> list
> there.

Yeah, I ran into this too. I had understood I could subscribe to 
relaxng-comment and post my newbie questions there, but I was never 
able to figure out how.  James was nice enough to forward my question 
to the list and respond there, but I don't think he wants to do that on 
a regular basis.

Bruce


James Clark | 18 Sep 2003 02:35

Re: RNG and selective validation

DuCharme, Bob (LNG-CHO) wrote:

> (This is a general RNG question and not an nxml question, but despite my
> OASIS membership I can't figure out how to subscribe to the mailing list
> there. The "mailing list" link on
> http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=relax-ng doesn't
> go anywhere, and selecting http://lists.oasis-open.org/ob/adm.pl  from
> http://www.oasis-open.org/private/using_email.php just lists xml.org mailing
> lists, without the RNG one.)

I believe the relax-ng mailing list is the TC mailing list, so you have 
to join the TC (which you can do as an OASIS member) to do that.   The 
relax-ng-comment is public, but I think they are making that 
non-subscribable.  I think general RNG questions are appropriate for 
xml-dev.

> The following might look familiar to some of you who saw a similar issue
> that I recently posted on xml-dev. It's more ambitious here, because, well,
> because RNG can handle it. It all works, but I just want to see if anyone
> has any suggestions about the basic model for doing what I'll call selective
> validation: a schema that accepts any well-formed document, except that any
> elements and attributes it comes across that are declared in that schema
> must conform to those declarations. 

What's the practical utility of this?  If somebody mistypes "gamma" as 
"gama", then it will pass as valid rather than being caught by the 
validator.
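
To make the concern concrete, here is a minimal sketch of selective validation in Python. The rules are invented for illustration and are not taken from the schema under discussion: it assumes that a gamma element may not contain child elements and that a quantity attribute must be an integer. Anything without a rule is accepted.

```python
import xml.etree.ElementTree as ET


def _is_integer(value):
    """True if the attribute value parses as an integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


# Hypothetical declarations; names with no entry here are never checked.
ELEMENT_RULES = {
    "gamma": lambda el: len(el) == 0,  # assumed: gamma has no child elements
}
ATTRIBUTE_RULES = {
    "quantity": _is_integer,  # assumed: quantity must be an integer
}


def selective_errors(xml_text):
    """Check only the declared names and accept everything else."""
    errors = []
    for el in ET.fromstring(xml_text).iter():
        rule = ELEMENT_RULES.get(el.tag)
        if rule is not None and not rule(el):
            errors.append(f"element {el.tag}")
        for name, value in el.attrib.items():
            check = ATTRIBUTE_RULES.get(name)
            if check is not None and not check(value):
                errors.append(f"attribute {name}")
    return errors
```

With these rules a gamma element that breaks its declaration is reported, but the same content under the misspelled name gama has no rule and so produces no error at all.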

James

James Clark | 18 Sep 2003 02:40

Re: Editing basics: splitting elements

Norman Walsh wrote:

> I wonder if there are more than two classes: document creators and
> document editors.

I think there are classes of task rather than author.  I might be 
creating one day and editing the next.

> If not, maybe two sets of commands can be created. I
> can imagine one set bound to X, Y, Z, etc. and the other set bound to
> C-u X, C-u Y, C-u Z, etc. with some sort of toggle to invert the set
> bindings.

I don't think toggling will work well when the same user both edits and 
creates.

Also I hope eventually that nxml-mode will make it into Emacs.  That 
means I have to play by the rules as far as key bindings are concerned 
(basically I have to use C-c followed by another control character as 
far as possible).

James


peter17ring | 18 Sep 2003 12:58

Re: Editing basics: splitting elements

It may not be pretty, but after all, one of the nice things about
Emacs is that you can add alternative keymaps. Why not have an official
kosher keymap, but prepare for alternative keymaps?

Apart from Emacs keybinding conventions, like using C-c as prefix in
user defined commands, I can think of a couple of sources of design
criteria.

First, there's the general question of what constitutes an efficient
tool. There's a family of modeling techniques known as GOMS that
analyzes the user complexity of interactive systems. One of the
simplistic GOMS variants, the Keystroke-Level Model, essentially
counts the number of keypresses and choices that must be made to
accomplish a simple goal. From this point of view, if you can save
some keystrokes, it's OK to have keybindings that depend on history. 
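
A Keystroke-Level Model estimate is just a sum of per-operator times, so it is easy to sketch. The Python below uses often-quoted operator times from the KLM literature (about 0.2 s per keystroke for a skilled typist, 1.1 s for pointing, 0.4 s for homing, 1.35 s for mental preparation); treat them as rough assumptions. The way a command is encoded as a string of operators is likewise hypothetical.

```python
# Approximate operator times in seconds. These are assumptions for
# illustration and would need calibrating for real users.
OPERATOR_SECONDS = {
    "K": 0.2,   # press one key
    "P": 1.1,   # point with a mouse
    "H": 0.4,   # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation before an action
}


def klm_estimate(operators):
    """Sum the operator times for a sequence such as "MKK"."""
    return sum(OPERATOR_SECONDS[op] for op in operators)
```

For example, one mental step followed by two keystrokes is written "MKK", and a design that needs that sequence twice costs twice as much under this model.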

Norman Walsh is considering the different classes of potential users
(or user roles). I can see some additional dimensions here: which
other tools are these people likely to use? I suggest that some
knowledge about this can be gathered from psgml users. I would expect
that psgml users could be categorized as

emacs buffs, psgml is one out of at least fifty modes
  vs.
psgml users, couldn't care less about other modes

power markup operators, use psgml along with Arbortext Epic, jEdit in
XML mode, or Topologi Collaborative Markup Editor (is that a name?!)
  vs.
uses psgml because the price is right and emacs was already installed
[...]