Nicolas Lacombe | 2 Dec 19:16 2009

Arabic decoder

Hi,

I'm using HarfBuzz with FreeType to render Arabic text. I must say that this library is exactly what I was looking for, great job!

Anyway, I spotted something strange. I don't know if it's a bug or if I didn't correctly understand the spirit behind HarfBuzz.

When rendering the Arabic equivalent of "la", it should give me only one glyph (لا).

However, HarfBuzz renders the ligature glyph correctly but does not get rid of the "a". See screenshot here:

http://img522.yfrog.com/i/21856341.png/


Here's my (simplified) render loop:

HB_ShaperItem shaper_item;

    shaper_item.string = (HB_UChar16 *) g_utf8_to_utf16((gchar*)txt, -1, NULL, &numberOfWords, NULL);

    shaper_item.kerning_applied = 0;
    shaper_item.stringLength = 0;
    shaper_item.shaperFlags = 0;
    shaper_item.font = &hbFont;
    shaper_item.face = hbFace;
    shaper_item.glyphIndicesPresent = 0;
    shaper_item.initialGlyphCount = 0;

    shaper_item.item.bidiLevel = 0;

    out_glyphs = (HB_Glyph*)malloc(numberOfWords * sizeof(HB_Glyph));
    memset(out_glyphs, 0, numberOfWords * sizeof(HB_Glyph));
    out_attrs = (HB_GlyphAttributes*)malloc(numberOfWords * sizeof(HB_GlyphAttributes));
    memset(out_attrs, 0, numberOfWords * sizeof(HB_GlyphAttributes));
    out_advs = (HB_Fixed*)malloc(numberOfWords * sizeof(HB_Fixed));
    memset(out_advs, 0, numberOfWords * sizeof(HB_Fixed));
    out_offsets = (HB_FixedPoint*)malloc(numberOfWords * sizeof(HB_FixedPoint));
    memset(out_offsets, 0, numberOfWords * sizeof(HB_FixedPoint));
    out_logClusters = (unsigned short*)malloc(numberOfWords * sizeof(unsigned short));
    memset(out_logClusters, 0, numberOfWords * sizeof(unsigned short));

    shaper_item.glyphs = out_glyphs;
    shaper_item.attributes = out_attrs;
    shaper_item.advances = out_advs;
    shaper_item.offsets = out_offsets;
    shaper_item.log_clusters = out_logClusters;
    shaper_item.num_glyphs = numberOfWords;
    shaper_item.stringLength = numberOfWords;

    int l = 0;
    while(1){
       
        shaper_item.num_glyphs = numberOfWords;
        if (!hb_utf16_script_run_next(NULL, &shaper_item.item, shaper_item.string, numberOfWords , &iterator))
            break;

        memset(out_glyphs, 0, numberOfWords * sizeof(HB_Glyph));
        HB_ShapeItem(&shaper_item);
        for (unsigned int j = 0; j < shaper_item.item.length; j++)
        {
            ind[l++] = out_glyphs[j];
        }
    }

So at the end, ind contains the indices of all the glyphs I need to draw.

Is there a way HarfBuzz can tell me to get rid of a glyph?

Thanks, Nico.


Nicolas Lacombe | 4 Dec 08:21 2009

Re: Arabic decoder

Problem solved.

In my loop:

    for (unsigned int j = 0; j < shaper_item.item.length; j++)

I was iterating over the length of the item (the number of input characters), not the number of glyphs the shaper produced.

With

    for (unsigned int j = 0; j < shaper_item.num_glyphs; j++)

it looks better.
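
To illustrate why the two counts differ, here is a minimal sketch that does not use HarfBuzz at all. `toy_shape`, `append_glyphs` and the ligature glyph id are hypothetical stand-ins: the toy shaper merges LAM + ALEF into one glyph, so it reports fewer glyphs than input characters, and the caller copies only as many glyphs as were reported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LAM            0x0644u  /* ARABIC LETTER LAM */
#define ALEF           0x0627u  /* ARABIC LETTER ALEF */
#define LAM_ALEF_GLYPH 0xFEFBu  /* arbitrary glyph id chosen for this sketch */

/* Hypothetical stand-in for a shaper (not the HarfBuzz API): maps each
 * UTF-16 code unit to a "glyph id", but merges LAM + ALEF into a single
 * ligature glyph. Returns the number of glyphs written, which can be
 * smaller than n_chars. glyphs must have room for n_chars entries. */
size_t toy_shape(const uint16_t *text, size_t n_chars, uint32_t *glyphs)
{
    assert(text != NULL && glyphs != NULL);
    size_t n_glyphs = 0;
    for (size_t i = 0; i < n_chars; i++) {
        if (text[i] == LAM && i + 1 < n_chars && text[i + 1] == ALEF) {
            glyphs[n_glyphs++] = LAM_ALEF_GLYPH;
            i++;  /* the ALEF was consumed by the ligature */
        } else {
            glyphs[n_glyphs++] = text[i];
        }
    }
    return n_glyphs;
}

/* Append exactly n_glyphs entries (the shaper's output count, not the
 * input length) to dst. Returns the new length of dst. */
size_t append_glyphs(uint32_t *dst, size_t dst_len,
                     const uint32_t *glyphs, size_t n_glyphs)
{
    assert(dst != NULL && glyphs != NULL);
    for (size_t j = 0; j < n_glyphs; j++)
        dst[dst_len++] = glyphs[j];
    return dst_len;
}
```

Looping up to the input length instead would also copy the stale second slot of the glyph buffer, which matches the extra "a" seen in the screenshot.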

I'm still having problems rendering complex sentences, but it seems to work fine on small letters... I'm wondering whether I have to take into account the advances given by the shaper... for now I'm just letting FreeType take care of it for me... needs some testing.

Are you aware of any problems in the Arabic/Syriac translation module, or is it supposed to work fine?


Behdad answered this question with:

"Supposed to work fine.  Though you need a bidi engine also.  Try setting bidi level to 1 instead of 0..."


Which does not appear to change much. However, I resolved my problem by changing fonts... the font is very important, and I'm having trouble finding good Persian and Syriac fonts.
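
As background to the bidi-level suggestion: in the Unicode Bidirectional Algorithm, even embedding levels are left-to-right and odd levels are right-to-left, so level 1 marks a run as RTL. A minimal sketch of that rule follows; the helper names are hypothetical and not part of HarfBuzz.

```c
#include <assert.h>
#include <stdbool.h>

/* Even embedding levels are left-to-right, odd levels are right-to-left
 * (Unicode Bidirectional Algorithm). */
bool level_is_rtl(unsigned level)
{
    return (level & 1u) != 0;
}

/* Hypothetical helper: the lowest level that gives a top-level run the
 * requested direction, 0 for LTR and 1 for RTL. */
unsigned base_level_for(bool rtl)
{
    return rtl ? 1u : 0u;
}
```

A full bidi engine assigns these levels per run from the text itself; the sketch only shows what the number means.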



Martin Hosken | 4 Dec 09:09 2009

Thoughts on harfbuzz API

Dear All,

I'm in the process of writing a python wrapper to help with testing harfbuzz before hopefully integrating
Graphite. This gives me a good way to review the API :) and here are some thoughts.

1. Features

Currently a feature in hb-shape.h is defined as an association between two char * over a range. My
understanding of all smart font technologies is that they work with longs. So I would suggest making the
name and value entries unsigned longs rather than char *.

2. Script and Lang

Currently script and lang values are enums. I can sort of understand script being an enum from a default
block-based run segmenter. But given that fonts, again, use long tags for this, I would suggest making them
open-ended, either as char * or as unsigned long. This is particularly true for the lang tag, and I would
suggest it as a more helpful way to go for the script tag too.

Yours, humbly submitted,
Martin

Evan Martin | 4 Dec 18:18 2009

Re: Arabic decoder

On Thu, Dec 3, 2009 at 11:21 PM, Nicolas Lacombe <n.lacombe@...> wrote:
> I'm still having problems rendering complex sentences, but it seems to work
> fine on small letters... I'm wondering whether I have to take into account
> the advances given by the shaper... for now I'm just letting FreeType take
> care of it for me... needs some testing.

The code under contrib/ (I notice you're using
hb_utf16_script_run_next) was code cooked up just for Google Chrome on
Linux.  It has some bugs in more complicated Arabic text and rather
than track those down I plan to just port to harfbuzz-ng and switch to
using ICU in the process.

I'm not certain that's the cause of your problem, but to save yourself
some time you might want to compare the misrendering you're getting
against the way Google Chrome fails on sites like
http://www.quranexplorer.com/Quran/Default.aspx .
Lars Knoll | 4 Dec 13:47 2009

Re: Arabic decoder

On Friday 04 December 2009 08:21:16 am Nicolas Lacombe wrote:
> problem solved:

[snip]

> I'm still having problems rendering complex sentences, but it seems to work
> fine on small letters... I'm wondering whether I have to take into account
> the advances given by the shaper... for now I'm just letting FreeType take
> care of it for me... needs some testing.

There's a reason harfbuzz delivers you advances and positions for the glyphs 
;-) You'll need to use them to get correct layout.
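
A rough sketch of what using them can look like for a horizontal run is given below. The types are hypothetical stand-ins for the shaper's output arrays, values are assumed to be in 26.6 fixed point (1/64 pixel) as FreeType uses, and the offset sign convention is an assumption to check against the real API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the shaper's advance and offset arrays. */
typedef long fixed26_6;                          /* assumed 1/64 pixel units */
typedef struct { fixed26_6 x, y; } fixed_point;  /* per-glyph offset */
typedef struct { fixed26_6 x, y; } draw_pos;     /* where to draw a glyph */

/* Compute the draw position of each glyph in a horizontal run.
 * Each glyph is drawn at pen + offset, then the pen moves by that glyph's
 * advance. A mark attached to a base typically has a zero advance and a
 * non-zero offset. Returns the total advance of the run. */
fixed26_6 layout_run(const fixed26_6 *advances, const fixed_point *offsets,
                     size_t n_glyphs, fixed26_6 pen_x, fixed26_6 baseline_y,
                     draw_pos *out)
{
    assert(advances != NULL && offsets != NULL && out != NULL);
    const fixed26_6 start_x = pen_x;
    for (size_t i = 0; i < n_glyphs; i++) {
        out[i].x = pen_x + offsets[i].x;
        out[i].y = baseline_y + offsets[i].y;
        pen_x += advances[i];
    }
    return pen_x - start_x;
}
```

Advancing by the rasterizer's unshaped per-glyph advance instead skips any positioning adjustments the shaper applied, which is one way complex text ends up misplaced.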

> Are you aware of any problems in the Arabic/Syriac translation module, or is
> it supposed to work fine?

We're using the code directly in Qt and haven't gotten any bug reports about 
problems with arabic or syriac for quite some time.

Cheers,
Lars

_______________________________________________
HarfBuzz mailing list
HarfBuzz <at> lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Nicolas Lacombe | 4 Dec 18:26 2009

Re: Arabic decoder

Thanks for the answer.

I found that FreeType and HarfBuzz don't give the same advance for some glyphs. Since I assumed the advance only depends on the font settings, how is that possible?

I guess HarfBuzz is doing more than just looking at the font, so I should use its advances then.



Behdad Esfahbod | 5 Dec 18:50 2009

Re: Thoughts on harfbuzz API

On 12/04/2009 03:09 AM, Martin Hosken wrote:
> Dear All,
>
> I'm in the process of writing a python wrapper to help with testing harfbuzz before hopefully integrating
> Graphite. This gives me a good way to review the API :) and here are some thoughts.

Thanks Martin.

> 1. Features
>
> Currently a feature in hb-shape.h is defined as an association between two char * over a range. My
> understanding of all smart font technologies is that they work with longs. So I would suggest making the
> name and value entries unsigned longs rather than char *.

That may be true, but from a user point of view, I'd rather keep it as generic 
as possible.  Jonathan and I discussed also providing an integer API, and that 
most probably will happen at some point, but I want to keep the hb_shape() API 
as is.

> 2. Script and Lang
>
> Currently script and lang values are enums.

Are they?  Script is, but not lang:

typedef const void *hb_language_t;

hb_language_t
hb_language_from_string (const char *str);

const char *
hb_language_to_string (hb_language_t language);

For script, we need the Unicode script anyway.  Though we would allow, for 
example, providing the OpenType script tag directly too.

behdad

> Yours, humbly submitted,
> Martin
Martin Hosken | 7 Dec 04:33 2009

Re: Thoughts on harfbuzz API

Dear Behdad,

> > I'm in the process of writing a python wrapper to help with testing harfbuzz before hopefully
> > integrating Graphite. This gives me a good way to review the API :) and here are some thoughts.

I've got it to the point that I can print out a list of glyphs & positions, etc. from a text buffer and a font. So
that's probably far enough to be useful.

> > 1. Features
> >
> > Currently a feature in hb-shape.h is defined as an association between two char * over a range. My
> > understanding of all smart font technologies is that they work with longs. So I would suggest making the
> > name and value entries unsigned longs rather than char *.
> 
> That may be true, but from a user point of view, I'd rather keep it as generic 
> as possible.  Jonathan and I discussed also providing an integer API, and that 
> most probably will happen at some point, but I want to keep the hb_shape() API 
> as is.

But that openness comes with a cost. The cost is that the mapping between the input and what is stored in the
font has to be thoroughly described. Let me take each of the 3 aspects (features, lang, script) in turn.

1. Features

Inside a font, a feature identifier is either a 16-bit number in the case of AAT, a 32-bit tag in OT, or a 32-bit
num/tag in Graphite. Both AAT and Graphite have an optional linkage from a feature identifier to a
language string for its name. Now how might we interpret the feature identifier string (name)? It could be
an ASCII number which is converted to either a 16-bit or 32-bit number, or it could be a 4-char tag that gets
converted to a long, or it could be a UI-level name that has to be interpreted via a specified (or defaulted)
language id and the name table. Ultimately, I would suggest that it has to map down to a long (which for AAT
can be further truncated to a 16-bit id). Given that the choice of what the input char * may be is up to the
calling application, I would suggest that the mapping is best done there and that the application just passes in
the long, thus reducing the complexity of harfbuzz.

Likewise, the value of a feature ultimately has to come down to a number. In OT it can be more than just 0 or 1,
since some newer features take a numeric parameter. So I would suggest that, to keep harfbuzz simple, it is
passed as a long.

There is nothing to stop us later adding helper functions that can fill in the entries of a feature struct
from char *s. But I would suggest we start simply.
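
A minimal sketch of the application-side mapping suggested above, assuming the caller decides that an all-digit string is a numeric id and anything else is a 4-char tag. The helper names `feature_tag_from_chars` and `feature_id_from_string` are hypothetical and not part of harfbuzz; only the packing of four ASCII characters into a big-endian 32-bit value follows the OT tag layout.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: pack up to four ASCII characters into a
 * big-endian 32-bit tag, padding short names with spaces. */
static uint32_t feature_tag_from_chars(const char *s)
{
    assert(s != NULL);
    uint32_t tag = 0;
    size_t len = strlen(s);
    for (size_t i = 0; i < 4; i++) {
        unsigned char c = (i < len) ? (unsigned char)s[i] : (unsigned char)' ';
        tag = (tag << 8) | c;
    }
    return tag;
}

/* Returns 1 if the string is non-empty and consists only of decimal digits. */
static int is_all_digits(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/* Hypothetical application policy: digits mean a numeric id (AAT or
 * Graphite style), anything else is treated as a 4-char tag. */
static uint32_t feature_id_from_string(const char *s)
{
    assert(s != NULL);
    if (is_all_digits(s))
        return (uint32_t)strtoul(s, NULL, 10);
    return feature_tag_from_chars(s);
}
```

With this policy a string such as "1234" is read as the number 1234 rather than as a tag, which is exactly the kind of ambiguity that argues for resolving the identifier in the calling application.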

2. Langs

I was about to write a similar argument for langs, but then realised you are right. The lang identifier
should be a full string. My main concern here is that the list of languages supported by harfbuzz be open. I
think your current solution works well: allowing an initialised cache and caching the rest.
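
To illustrate why an opaque pointer keeps the language list open, here is a small interning sketch. It is not harfbuzz's implementation; `lang_handle_t`, `lang_from_string` and `lang_to_string` are hypothetical names that only mirror the shape of the `hb_language_t` API quoted earlier, and the cache is a plain linked list with no locking and case-sensitive matching.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Opaque handle, same shape as "typedef const void *hb_language_t". */
typedef const void *lang_handle_t;

struct lang_node {
    struct lang_node *next;
    char *tag;
};

/* Cache of every language string seen so far (never freed in this sketch). */
static struct lang_node *lang_cache = NULL;

/* Return a stable handle for str; equal strings yield the same handle.
 * Returns NULL only if memory allocation fails. */
static lang_handle_t lang_from_string(const char *str)
{
    assert(str != NULL);
    for (struct lang_node *n = lang_cache; n != NULL; n = n->next)
        if (strcmp(n->tag, str) == 0)
            return n->tag;

    struct lang_node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    size_t len = strlen(str) + 1;
    n->tag = malloc(len);
    if (n->tag == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->tag, str, len);
    n->next = lang_cache;
    lang_cache = n;
    return n->tag;
}

/* The handle is the interned string itself. */
static const char *lang_to_string(lang_handle_t lang)
{
    return (const char *)lang;
}
```

Because handles are interned, two languages can be compared with a pointer comparison, and a language unknown at build time costs only one extra cache entry.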

3. Scripts

As for languages, I think we have an opportunity here to make harfbuzz resilient against Unicode version
changes. If the script is passed as a string instead of as a member of an enum, then there is no enum that has to
be updated every Unicode release with all the new scripts that have been added. It's a simple matter to
dictate that the string is interpreted via ISO 15924. This will make harfbuzz more stable, especially
when it ends up in embedded devices without an annual upgrade cycle.

This is not to say that a segmenter can't work with a closed set of scripts (although the more that can be done
to open such things up, the better). Also the mapping from script to shaper in OT would become a search
(binary perhaps) rather than a simple array lookup. But I think the gained forward compatibility would be
worth the cost.
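
A sketch of the script-to-shaper search mentioned above, assuming scripts arrive as four-letter ISO 15924 codes. The table contents and the shaper labels ("arabic", "indic", and so on) are hypothetical placeholders, not harfbuzz's actual shaper names; the point is only that an unknown code falls through to a default instead of needing a new enum value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct script_entry {
    const char *iso15924;   /* four-letter ISO 15924 code */
    const char *shaper;     /* hypothetical shaper label */
};

/* Must stay sorted by code (strcmp order) for bsearch to work. */
static const struct script_entry script_table[] = {
    { "Arab", "arabic" },
    { "Deva", "indic"  },
    { "Hebr", "hebrew" },
    { "Thai", "thai"   },
};

/* bsearch comparator: key is a C string, elem is a table entry. */
static int compare_code(const void *key, const void *elem)
{
    const struct script_entry *entry = elem;
    return strcmp((const char *)key, entry->iso15924);
}

/* Look up the shaper for a script code; unknown scripts get "default". */
static const char *shaper_for_script(const char *code)
{
    assert(code != NULL);
    const struct script_entry *hit =
        bsearch(code, script_table,
                sizeof script_table / sizeof script_table[0],
                sizeof script_table[0], compare_code);
    return hit != NULL ? hit->shaper : "default";
}
```

A script added in a later Unicode release, passed as a code this table has never seen, simply selects the default shaper.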

Yours,
Martin
Martin Hosken | 9 Dec 02:44 2009

More progress on hbng & graphite and questions

Dear Behdad,

I'm making good progress. Down to fixing bugs.

1. hb_ft_get_glyph_metrics seems to be returning all its values in 26.6 fixed point. I assume this isn't
meant to be the standard for harfbuzz. But it might be. We had a similar issue in graphite and rounding
errors here encouraged us to go with floats.

2. Are you sure you want top side bearing for your y_offset in glyph bearings in hb-ft.c? Perhaps it's ascent
minus that?

3. My code is pretty much all in C++, so it is going to pull in libstdc++. How hard do we have to try not to do
that? I've followed the same approach in configure.ac of only including graphite integration if the
library is there at build time.
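
On point 1, a short sketch of what 26.6 fixed point means in practice: the value is an integer count of 1/64 units, so converting to whole units loses up to 63/64 unless it is rounded. The type and function names below are hypothetical and are not taken from hb-ft.c or FreeType.

```c
#include <assert.h>
#include <stdint.h>

/* 26.6 fixed point: 26 integer bits, 6 fractional bits (units of 1/64). */
typedef int32_t fixed26_6;

/* Exact conversion to floating point. */
static double fixed26_6_to_double(fixed26_6 v)
{
    return v / 64.0;
}

/* Convert a double to 26.6, rounding to the nearest 1/64. */
static fixed26_6 fixed26_6_from_double(double d)
{
    /* The integer part must fit in 26 bits including sign. */
    assert(d > -33554432.0 && d < 33554432.0);
    return (fixed26_6)(d * 64.0 + (d >= 0.0 ? 0.5 : -0.5));
}

/* Round to the nearest whole unit, halves rounding up. Uses floor
 * division so negative values behave the same on every compiler. */
static int32_t fixed26_6_round(fixed26_6 v)
{
    int32_t shifted = v + 32;
    int32_t q = shifted / 64;
    if (shifted % 64 < 0)
        q--;
    return q;
}
```

Accumulating already-rounded advances across a run is where the errors add up, which is one argument for keeping the fractional part (or floats) until the final pixel position.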

I'll probably think of more issues as I go on.

Yours,
Martin
Martin Hosken | 10 Dec 04:47 2009

graphite and python available

Dear All,

A git repo with graphite and python additions is available from git@...:harfbuzz-dev/harfbuzz-dev.git

If you are tracking behdad's harfbuzz-ng repo then you probably can just pull from above without needing to
clone, but I'm a bit of a git newbie so I have probably done it all wrong. Comments and suggestions on both the git
side and the programming are welcome.

Yours,
Martin

