Jeremias Maerki | 1 Mar 2005 15:02

page-breaking strategies and performance

I finally have Knuth's "Digital Typography" and have let his well-written
words enlighten me. In [1] Simon outlined different strategies for
page-breaking, obviously closely following the different approaches
defined by Knuth. At first glance, I'd say that "best-fit" is probably
the obvious strategy to select, especially if TeX is happy with it.
Obviously, it can't find the optimal solution like this but the
additional overhead (memory and CPU power) of a look-ahead/total-fit
strategy is simply too much and unnecessary for things like invoices and
insurance policies which are surely some of the most popular use cases
of XSL-FO. Here, speed is extremely important. People writing
documentation (maybe using DocBook) or glossy stock reports have
additional requirements and don't mind the longer processing time and
additional memory requirements. This leads me to the question if we
shouldn't actually implement two page-breaking strategies (in the end,
not both right now). For a speed-optimized algorithm, we could even
think about ignoring side-floats.

Obviously, in this model we would have to make sure that we use a common
model for both strategies. For example, we still have to make sure that
the line layout gets information on the available IPD on each line, but
probably this will not be a big problem to include later.

An enhanced/adjusted box/glue/penalty model sounds like a good idea to
me especially since Knuth hints at that in his book, too. There's also a
question if part of the infrastructure from line breaking can be reused
for page breaking, but I guess rather not.
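
To make the box/glue/penalty idea concrete, here is a minimal sketch in
Java. Everything in it (class names, the factory methods, returning an
infinite ratio when there is no stretch or shrink to use) is an
assumption made for illustration and not existing FOP code. "width"
stands for the size in the breaking direction, so for page breaking it
would be the block-progression dimension. The ratio follows Knuth's
definition: (target - natural) divided by the total stretch when the
material is too short, or by the total shrink when it is too long.

```java
import java.util.Arrays;
import java.util.List;

/** Toy box/glue/penalty model; all names are illustrative, not FOP classes. */
class KnuthElementsDemo {

    enum Kind { BOX, GLUE, PENALTY }

    static final class Element {
        final Kind kind;
        final int width;    // natural size in the breaking direction
        final int stretch;  // used by glue only
        final int shrink;   // used by glue only
        final int penalty;  // used by penalties only

        private Element(Kind kind, int width, int stretch, int shrink, int penalty) {
            this.kind = kind;
            this.width = width;
            this.stretch = stretch;
            this.shrink = shrink;
            this.penalty = penalty;
        }

        static Element box(int width) {
            return new Element(Kind.BOX, width, 0, 0, 0);
        }

        static Element glue(int width, int stretch, int shrink) {
            return new Element(Kind.GLUE, width, stretch, shrink, 0);
        }

        static Element penalty(int width, int penalty) {
            return new Element(Kind.PENALTY, width, 0, 0, penalty);
        }
    }

    /** Adjustment ratio needed to make the given elements fill the target size. */
    static double adjustmentRatio(List<Element> elements, int target) {
        int natural = 0;
        int stretch = 0;
        int shrink = 0;
        for (Element e : elements) {
            if (e.kind == Kind.PENALTY) {
                continue; // a penalty's width only counts at a chosen break
            }
            natural += e.width;
            stretch += e.stretch;
            shrink += e.shrink;
        }
        if (natural == target) {
            return 0.0;
        }
        int flex = natural < target ? stretch : shrink;
        if (flex == 0) {
            // Assumption: report "cannot be adjusted" as an infinite ratio.
            return natural < target ? Double.POSITIVE_INFINITY : Double.NEGATIVE_INFINITY;
        }
        return (double) (target - natural) / flex;
    }

    public static void main(String[] args) {
        List<Element> page = Arrays.asList(
                Element.box(100), Element.glue(10, 5, 2), Element.box(100));
        System.out.println(adjustmentRatio(page, 215));
    }
}
```

An "enhanced" model as discussed above would presumably add further
element kinds or fields; this sketch only covers the three classic ones.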

As for the plan to implement a new page-breaking mechanism: I've got to
do it now. :-) I'm sorry if this may put some pressure on some of you.
I'm also not sure if I'm fit already to tackle it, but I've got to

Jeremias Maerki | 1 Mar 2005 15:09

Skype-conference on page-breaking?

To speed things up could we hold a conference (using Skype, for example)
to discuss further details on page-breaking? I'd volunteer to sum up any
results during that discussion for the archives. I have Finn on my Skype
radar already.

Jeremias Maerki

Renaud Richardet | 1 Mar 2005 15:26

Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer

Victor and Jeremias, thanks for your input.

Victor, I've checked out your aXSL. I'll study it and come back to you
if I have questions.

Jeremias wrote:

> > Speaking of startVParea(), could we rename it to something more meaningful?
> > Proposition: TransformPosition, or something like this.
> > Deleted the methods moved to AbstractRenderer.
> Actually, I like startVParea() (or rather startViewportArea like I would
> rather call it) because only for viewport a new transformation matrix is
> necessary. I think when you port the matrix concatenation from the PDF
> renderer over to Java2D in startVParea() you will start to understand
> what's going on here. 
OK,  thanks. That makes sense.

> > fop.area.CTM: added two getters for e and f. If there's another way to get those
> > values, please let me know.
> Normally, we use toArray() but I guess these two getters are ok and
> don't hurt although I think they are not necessary because you need to
> use all other values in the CTM, too, to get the reference orientation
> stuff right. See above.
OK, I'll use the available toArray() instead.
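
For the record, a minimal sketch of how a six-element matrix could be
carried over to Java2D. It assumes that the array from toArray() is
ordered {a, b, c, d, e, f} as in a PDF matrix; I have not verified that
against the CTM source, so the hard-coded array below is only a
stand-in. The class and method names are made up for the example.
java.awt.geom.AffineTransform accepts exactly this flat layout (m00 m10
m01 m11 m02 m12), and e and f then come back via getTranslateX() and
getTranslateY().

```java
import java.awt.geom.AffineTransform;

/** Illustrative only: maps a PDF-style matrix array onto a Java2D transform. */
class CtmToJava2D {

    /** Builds a Java2D transform from a six-element matrix {a, b, c, d, e, f}. */
    static AffineTransform toAffineTransform(double[] matrix) {
        if (matrix.length != 6) {
            throw new IllegalArgumentException("expected 6 matrix entries");
        }
        // Flat-matrix constructor order: m00 m10 m01 m11 m02 m12
        return new AffineTransform(matrix);
    }

    /** Concatenates a viewport matrix onto the current transform, returning a new transform. */
    static AffineTransform concatenated(AffineTransform current, double[] matrix) {
        AffineTransform result = new AffineTransform(current);
        result.concatenate(toAffineTransform(matrix));
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the array a CTM would supply (assumed order a..f).
        double[] ctm = {1, 0, 0, 1, 72, 144};
        AffineTransform current = AffineTransform.getScaleInstance(2, 2);
        AffineTransform at = concatenated(current, ctm);
        System.out.println(at.getTranslateX() + " " + at.getTranslateY());
    }
}
```

In a renderer the result would typically be applied with
Graphics2D.transform() when a viewport starts and the previous
transform restored when it ends.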

> > The enclosed image doesn't have ipd/bpd
> > either. Again: is this normal? I have a workaround in mind (getting those
> > values through the FopImage), but it doesn't sound right.
> In this case it is probably better to fix the LMs. I've started doing
> that but haven't finished. ATM this is lower priority for me. I can send

Renaud Richardet | 1 Mar 2005 15:38

Re: Skype-conference on page-breaking?

I would be pleased to listen.

Renaud

Victor Mote | 1 Mar 2005 15:52

RE: page-breaking strategies and performance

Jeremias Maerki wrote:

> processing time and additional memory requirements. This 
> leads me to the question if we shouldn't actually implement 
> two page-breaking strategies (in the end, not both right 
> now). For a speed-optimized algorithm, we could even think 
> about ignoring side-floats.
> 
> Obviously, in this model we would have to make sure that we 
> use a common model for both strategies. For example, we still 
> have to make sure that the line layout gets information on 
> the available IPD on each line, but probably this will not be 
> a big problem to include later.

This is an excellent idea. It has from time to time gone under the moniker
LayoutStrategy or pluggable layout. To do it without duplicating everything
requires that the other pieces of the system be modularized, the concerns
separated so that they can be reused. The upside is tremendous and the cost
pays for itself in developer productivity.

Victor Mote

Jeremias Maerki | 1 Mar 2005 15:54

Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer


On 01.03.2005 15:26:04 Renaud Richardet wrote:
> > Another thought: One of my low-priority tasks is to create a little
> > application that renders a test suite with all of FOP's renderers
> > creating bitmap images for each generated document and ultimately
> > creating a little website that lets us compare the output. PDFs and PS
> > files can be converted to bitmaps using GhostScript. Maybe you might
> > want to write such a thingy. I won't get to it before I get to updating
> > the PS renderer to full quality.
> That would be good. Do you mean something like the "Bitmap production"
> you documented on FopAndJava2D [1]? This is what I intend to work on
> after the basic Java2DRenderer works.

Yes, exactly, at least for the AWT/Java2DRenderer. For PDF and PS you
need to use GhostScript for the conversion.

Jeremias Maerki

Jeremias Maerki | 1 Mar 2005 20:12

Re: [XML Graphics - FOP Wiki] Updated: PageLayout

Simon, I've tried to think your example through. If I read the spec
right about space resolution then I get the impression that we may need
to do more in this area than find a suitable box/glue/penalty
combination. There may be several spaces which need to be taken into
account during resolution. There's the precedence and the conditionality
that needs to be evaluated. I think we may need to create special
elements that can hold this information (or reference it). They need to
be distinguishable so we can apply the resolution rules properly.

I believe your example should then look like this:

- box
- penalty (w=0, p=infinite)
- space
- glue (w=0, y=0, z=0)
- space
- penalty (w=0, p=infinite)
- box

A more complex example would look like this:

<fo:block space-after="5pt">
  <fo:block>a line</fo:block>
  <fo:block space-after="3pt">
     blah blah
  </fo:block>
</fo:block>
<fo:block space-before="10pt">
  blah bla
</fo:block>
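
A rough sketch of what such "special elements" and a resolution step
could look like, in Java. This is a simplified reading of the space
resolution rules, not a complete implementation: it looks only at the
optimum value, models "force" as a boolean, and drops leading
conditional spaces only when the sequence starts a reference area. All
names are hypothetical. The three spaces of the example above (3pt,
5pt, 10pt, all with default precedence and conditionality) are used as
input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified space resolution; illustrative names, optimum values only. */
class SpaceResolutionDemo {

    static final class Space {
        final double optimum;
        final int precedence;
        final boolean force;    // precedence="force"
        final boolean discard;  // conditionality="discard"

        Space(double optimum, int precedence, boolean force, boolean discard) {
            this.optimum = optimum;
            this.precedence = precedence;
            this.force = force;
            this.discard = discard;
        }

        /** A space with the default precedence (0) and conditionality (discard). */
        static Space plain(double optimum) {
            return new Space(optimum, 0, false, true);
        }
    }

    static double resolve(List<Space> sequence, boolean startsReferenceArea) {
        List<Space> kept = new ArrayList<>(sequence);
        if (startsReferenceArea) {
            // Conditional spaces at the start of a reference area are suppressed.
            while (!kept.isEmpty() && kept.get(0).discard) {
                kept.remove(0);
            }
        }
        if (kept.isEmpty()) {
            return 0.0;
        }
        // Forcing spaces suppress all others and add up.
        boolean anyForced = false;
        double forcedSum = 0.0;
        for (Space s : kept) {
            if (s.force) {
                anyForced = true;
                forcedSum += s.optimum;
            }
        }
        if (anyForced) {
            return forcedSum;
        }
        // Otherwise only the highest precedence survives, then the largest optimum.
        int top = Integer.MIN_VALUE;
        for (Space s : kept) {
            top = Math.max(top, s.precedence);
        }
        double result = Double.NEGATIVE_INFINITY;
        for (Space s : kept) {
            if (s.precedence == top) {
                result = Math.max(result, s.optimum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Space> spaces = Arrays.asList(Space.plain(3), Space.plain(5), Space.plain(10));
        System.out.println(resolve(spaces, false)); // no break between the blocks
        System.out.println(resolve(spaces, true));  // sequence starts a new page
    }
}
```

Under these simplifications the three spaces resolve to 10pt when no
break falls between the blocks and to nothing when the sequence starts
a new page, which is why the elements need to carry precedence and
conditionality until the break decision is known.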

bugzilla | 1 Mar 2005 21:48

DO NOT REPLY [Bug 33597] - [Patch] for xdocs "Design and Implementation"

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=33597>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=33597

spepping <at> leverkruid.nl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Additional Comments From spepping <at> leverkruid.nl  2005-03-01 21:48 -------
Renaud,

Applied. Thanks.

There is a patch with a refactoring of the implementation of Knuth's
algorithm, in bug 32612. It does not change the basic algorithm. You
may wish to wait with any documentation until that is applied.

Simon

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

Simon Pepping | 1 Mar 2005 22:18

Re: page-breaking strategies and performance

On Tue, Mar 01, 2005 at 03:02:38PM +0100, Jeremias Maerki wrote:
> As for the plan to implement a new page-breaking mechanism: I've got to
> do it now. :-) I'm sorry if this may put some pressure on some of you.
> I'm also not sure if I'm fit already to tackle it, but I've got to
> do it anyway. Since I don't want to work with a series of patches like
> you guys did earlier, I'd like to create a branch to do that on as soon
> as we've agreed on a strategy. Any objections to that?

That is a good idea.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl

Simon Pepping | 1 Mar 2005 22:25

Re: page-breaking strategies and performance

On Tue, Mar 01, 2005 at 07:52:27AM -0700, Victor Mote wrote:
> Jeremias Maerki wrote:
> 
> > processing time and additional memory requirements. This 
> > leads me to the question if we shouldn't actually implement 
> > two page-breaking strategies (in the end, not both right 
> > now). For a speed-optimized algorithm, we could even think 
> > about ignoring side-floats.
> > 
> > Obviously, in this model we would have to make sure that we 
> > use a common model for both strategies. For example, we still 
> > have to make sure that the line layout gets information on 
> > the available IPD on each line, but probably this will not be 
> > a big problem to include later.
> 
> This is an excellent idea. It has from time to time gone under the moniker
> LayoutStrategy or pluggable layout. To do it without duplicating everything
> requires that the other pieces of the system be modularized, the concerns
> separated so that they can be reused. The upside is tremendous and the cost
> pays for itself in developer productivity.

The idea of having two page breaking strategies is OK. But it is also
a goal that is yet far over the horizon.

I hope this is smaller than having pluggable layout. We should try to
express the layout constraints in a simple language, which can be used
by the algorithms of both strategies. Knuth's model is an effort to
achieve that, and a PageLayoutManager which receives the Knuth
elements and invokes the appropriate algorithm goes with it.
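
As an illustration of that arrangement, here is a small Java sketch.
The interface and class names are hypothetical, and the input is
reduced to rigid block heights instead of real Knuth elements, with the
assumption that every block fits on a page on its own. Both strategies
read the same input: the fast one fills each page as far as possible
and never looks back, the other uses dynamic programming to minimise
the sum of squared unused space over all pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Two interchangeable page-breaking strategies over one common input model. */
class PluggableBreakingDemo {

    /** Hypothetical strategy interface: returns, per page, the index one past its last block. */
    interface PageBreakingStrategy {
        List<Integer> findBreaks(int[] heights, int pageHeight);
    }

    /** Greedy, single pass: break as late as possible on every page. */
    static final PageBreakingStrategy BEST_FIT = (heights, pageHeight) -> {
        List<Integer> breaks = new ArrayList<>();
        int used = 0;
        for (int i = 0; i < heights.length; i++) {
            if (used > 0 && used + heights[i] > pageHeight) {
                breaks.add(i);
                used = 0;
            }
            used += heights[i];
        }
        if (heights.length > 0) {
            breaks.add(heights.length);
        }
        return breaks;
    };

    /** Look-ahead: minimise the total squared slack with dynamic programming. */
    static final PageBreakingStrategy TOTAL_FIT = (heights, pageHeight) -> {
        int n = heights.length;
        long[] best = new long[n + 1]; // best[i]: minimal cost for the first i blocks
        int[] prev = new int[n + 1];   // prev[i]: start of the last page in that solution
        Arrays.fill(best, Long.MAX_VALUE);
        best[0] = 0;
        for (int end = 1; end <= n; end++) {
            int used = 0;
            for (int start = end - 1; start >= 0; start--) {
                used += heights[start];
                if (used > pageHeight) {
                    break;
                }
                if (best[start] == Long.MAX_VALUE) {
                    continue;
                }
                long slack = pageHeight - used;
                long cost = best[start] + slack * slack;
                if (cost < best[end]) {
                    best[end] = cost;
                    prev[end] = start;
                }
            }
        }
        List<Integer> breaks = new ArrayList<>();
        for (int i = n; i > 0; i = prev[i]) {
            breaks.add(0, i);
        }
        return breaks;
    };

    /** Stand-in for a layout manager that only knows the strategy interface. */
    static final class PageLayoutManagerSketch {
        private final PageBreakingStrategy strategy;

        PageLayoutManagerSketch(PageBreakingStrategy strategy) {
            this.strategy = strategy;
        }

        List<Integer> layout(int[] heights, int pageHeight) {
            return strategy.findBreaks(heights, pageHeight);
        }
    }

    public static void main(String[] args) {
        int[] heights = {4, 3, 3, 6};
        System.out.println(new PageLayoutManagerSketch(BEST_FIT).layout(heights, 10));
        System.out.println(new PageLayoutManagerSketch(TOTAL_FIT).layout(heights, 10));
    }
}
```

For the sample input the greedy strategy breaks after the third block
(pages of 10 and 6), while the look-ahead strategy breaks after the
second (pages of 7 and 9). The price is the nested loop and the two
arrays, which is the kind of overhead the speed-oriented use cases
would want to avoid.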
