Dirk Reiners | 8 Mar 2012 19:01

Doxygen Performance Issues and Solutions


	Hi All,

we've been struggling with doxygen's performance for a while. Generating the 
docs for our project (OpenSG) with the stock 1.7.4 on F14 takes about 10 hours 
(and that's without using dot).

So I ran the whole thing through cachegrind (don't ask me how long that took ;) 
and found a few hotspots. The biggest by far were the isAccessibleFrom and 
isAccessibleFromWithExpScope functions in util.cpp. They use a QDict to store 
results, which requires building a string key every time a test is made, and 
that is very slow. I replaced the string key with a little struct and the QDict 
with a std::vector (it is really just used as a stack), which gave a 
significant speed boost.
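
A minimal sketch of the idea described above, not the actual patch: it assumes the container only has to remember which accessibility queries are currently in progress, so a query can be identified by a few pointers instead of a freshly built string. The types Definition and FileDef and the names AccessEntry and AccessStack are hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the scope, file and item objects.
struct Definition {};
struct FileDef {};

// One in-progress query, identified by pointers rather than a string key.
struct AccessEntry {
    const Definition *scope;
    const FileDef    *fileScope;
    const Definition *item;

    bool operator==(const AccessEntry &o) const {
        return scope == o.scope && fileScope == o.fileScope && item == o.item;
    }
};

// Stack of the queries currently being evaluated.
class AccessStack {
public:
    // Returns true if the same query is already on the stack.
    // Searches from the top, since recent entries are the likeliest match.
    bool contains(const AccessEntry &e) const {
        for (auto it = m_entries.rbegin(); it != m_entries.rend(); ++it)
            if (*it == e) return true;
        return false;
    }
    void push(const AccessEntry &e) { m_entries.push_back(e); }
    void pop() {
        assert(!m_entries.empty());
        m_entries.pop_back();
    }
    std::size_t depth() const { return m_entries.size(); }

private:
    std::vector<AccessEntry> m_entries;
};
```

Comparing three pointers avoids the allocation and hashing that a string key needs on every lookup; the linear search is acceptable as long as the stack stays shallow.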

I also tried adding a cache to isAccessibleFrom, but that had some problems (it 
needed to be flushed a few times during the process, and I'm not sure exactly 
why, or whether I caught all cases), and it did not make a very big difference 
in the end. It's in there, but it's easy to take out (search for 
USE_ISACCESSIBLEFROM_CACHE in util.cpp and the clearIsAccessibleFromCache(); 
calls in doxygen.cpp).

The other big one was the creation of the PNG files for the class diagrams etc. 
The images doxygen creates are pretty simple and have long single color runs. On 
these images the brute force encoder (which is still in the code) runs quite a 
bit faster than the smart one that is used by default. To make things even 
better I added some special case code that explicitly tries to find single color 
runs and quickly encodes them. The resulting PNGs are slightly larger than the 
original ones, but the encoding is noticeably faster.
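
The run-finding step mentioned above could look roughly like the following. This is only a sketch of detecting single-color runs in one scanline of palette indices; it is not the encoder from the patch, and how the runs are then turned into compressed PNG data is not shown. The names Run and findRuns are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A maximal stretch of identical pixels within one scanline.
struct Run {
    std::uint8_t color;   // palette index of the run
    std::size_t  length;  // number of consecutive pixels with that index
};

// Splits a scanline into maximal single-color runs, in order.
std::vector<Run> findRuns(const std::vector<std::uint8_t> &row) {
    std::vector<Run> runs;
    std::size_t covered = 0;
    std::size_t i = 0;
    while (i < row.size()) {
        // Extend j past every pixel that matches the pixel at i.
        std::size_t j = i + 1;
        while (j < row.size() && row[j] == row[i]) ++j;
        runs.push_back(Run{row[i], j - i});
        covered += j - i;
        i = j;
    }
    // Sanity check: the runs cover the scanline exactly once.
    assert(covered == row.size());
    return runs;
}
```

For images that consist mostly of background with a few lines and boxes, each scanline collapses into a handful of runs, which is what makes a special-case fast path worthwhile.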


Dimitri Van Heesch | 10 Mar 2012 21:13

Re: Doxygen Performance Issues and Solutions

Hi Dirk,

On Mar 8, 2012, at 19:01 , Dirk Reiners wrote:

> 
> 	Hi All,
> 
> we've been struggling with doxygen's performance for a while. Generating the docs for our project
> (OpenSG) with the stock 1.7.4 on F14 takes about 10 hours (and that's without using dot).
> 
> So I ran the whole thing through cachegrind (don't ask me how long that took ;) and I found a few hotspots. The
> biggest one by far were the isAccessibleFrom and isAccessibleFromWithExpScope functions in util.cpp.
> They use a QDict for storing results, which requires building a string for the key every time a test is made,
> which is very slow. I replaced that with a little struct and the QDict with an std::vector (it is really just
> used as a stack), which gave a significant speed boost.
> 
> I also tried adding a cache to isAccessibleFrom, but that had some problems (it needed to be flushed a few
> times during the process, and I'm not sure exactly why or if I got all cases), and it did not make a very big
> difference in the end. It's in there, but it's easy to take out (search for USE_ISACCESSIBLEFROM_CACHE in
> utils.cpp and the clearIsAccessibleFromCache(); calls in doxygen.cpp).
> 
> The other big one was the creation of the PNG files for the class diagrams etc. The images doxygen creates
> are pretty simple and have long single color runs. On these images the brute force encoder (which is still
> in the code) runs quite a bit faster than the smart one that is used by default. To make things even better I
> added some special case code that explicitly tries to find single color runs and quickly encodes them. The
> resulting PNGs are slightly larger than the original ones, but the encoding is noticeably faster.
> 
> With my patches on top of the latest SVN the time goes down to ~70 minutes, or an order of magnitude less,
> which is still not really fast but a lot more reasonable, IMHO. :)


Robert Abel | 17 Mar 2012 07:39

[PATCH] VHDL Component Instantiation Fixes

Hi,

I noticed that component instantiations were handled improperly:

Doubly instantiated components were just hidden and not documented at
all. This resulted in wrong documentation. The original hack (keeping a
list inside the parser) might have been intended to relieve the
"inheritance graphs" of some clutter, but that did not work properly either.
Component Instances would also point to "dummy.html" instead of their
containing architecture's page. I fixed that quick and dirty as well.
Still, the whole VHDL module seems like one big hack anyway :-/

Patch is against the latest trunk.

Signed-off-by: Robert Abel <abel <at> uni-bielefeld.de>

diff -Naur doxygena/src/classdef.cpp doxygen/src/classdef.cpp
--- doxygena/src/classdef.cpp	2012-03-17 07:28:12.739596100 +0100
+++ doxygen/src/classdef.cpp	2012-03-17 05:59:25.844915100 +0100
@@ -2529,7 +2529,7 @@
 // returns TRUE iff class definition `bcd' represents an (in)direct base 
 // class of class definition `cd'.
 
-bool ClassDef::isBaseClass(ClassDef *bcd, bool followInstances,int level)
+bool ClassDef::isBaseClass(ClassDef *bcd, bool followInstances,int level, int maxlevel)
 {
   bool found=FALSE;
   //printf("isBaseClass(cd=%s) looking for %s\n",name().data(),bcd->name().data());

Robert Abel | 18 Mar 2012 02:10

[BUG] Tag File Parsing Issue

Hi,

I'm currently using tag files to stitch multiple documentation files (for multiple programming languages) together.

There seem to be two bugs related to tag/xml files:
  1. Tag files are always written with encoding='ISO-8859-1' (doxygen.cpp ll.10604).
    Yet I could not find a single call to a conversion function that would actually convert the tag file output from the source input file encoding given by INPUT_ENCODING. That doesn't seem right!
  2. There is a bug in tagreader.cpp. Basically, QXmlSimpleReader (qxml.cpp) reads each XML input file according to the encoding stated in that file. However, the TagFileParser handler in tagreader.cpp stores all incoming QString (16-bit) strings in m_curString, which is a QCString (8-bit), inside the bool characters handler (tagreader.cpp ll. 789).
    This effectively annihilates the correctly parsed XML source encoding when m_curString is assigned to different information entities, e.g. when assigning group titles in ll. 664. While I'm not 100% sure what happens during this implicit conversion, I reckon QString will use the thread locale to convert the QCString back to 16-bit, resulting in gibberish when the thread locale and the XML encoding mismatch.

As a quick fix for 2.), I changed the declaration of m_curString to QString so no conversions take place (but there may be memory overhead wrt explicit/implicit sharing I read?). I didn't notice any immediate problems with this hack.

As a fix for 1.) I propose to either actually convert to ISO-8859-1 from the INPUT_ENCODING, or just declare the XML file to be encoded using INPUT_ENCODING. The latter would be simplest and cleanest, IMHO.

Also, please notice that 2.) cannot just be fixed by fixing 1.), since tag files might be produced by "3rd party" software using any encoding they wish. The quick and dirty fix for 2.) I did would probably need some revisitation by someone who knows about the memory overhead/sharing capabilities involved and can decide on a proper course of action. (Which is why I'm hesitant to post a [one-line] patch...)
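
To make the encoding mismatch in 2.) concrete, here is a small sketch using standard-library strings instead of QString/QCString. It is not doxygen's conversion code: toUtf8 and narrowLossy are hypothetical helpers that only show how the same 16-bit text ends up as different bytes depending on how it is squeezed into an 8-bit string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encodes UTF-16 code units as UTF-8. Surrogate pairs are combined;
// an unpaired surrogate is encoded as-is to keep the sketch short.
std::string toUtf8(const std::u16string &in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // the low surrogate has been consumed
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            assert(cp <= 0x10FFFF);
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Lossy narrowing: keeps only the low byte of each code unit, which
// matches Latin-1 for U+0000..U+00FF and destroys everything above.
std::string narrowLossy(const std::u16string &in) {
    std::string out;
    for (char16_t c : in) out += static_cast<char>(c & 0xFF);
    return out;
}
```

For the single character U+00FC, toUtf8 produces the two bytes C3 BC while narrowLossy produces the single byte FC. Code that later interprets one of these byte strings under the other convention will show garbage, which is consistent with the gibberish group titles described above.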

Regards

Robert Abel

------------------------------------------------------------------------------
This SF email is sponsosred by:
Try Windows Azure free for 90 days Click Here 
http://p.sf.net/sfu/sfd2d-msazure
_______________________________________________
Doxygen-develop mailing list
Doxygen-develop <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/doxygen-develop
Robert Abel | 20 Mar 2012 12:56

Re: [PATCH] VHDL Component Instantiation Fixes

On 19.03.2012 20:27, Dimitri Van Heesch wrote:
> Can you send me an example that I can use to test your patch?

Yes, please find an example here
<https://sites.google.com/site/rawbdagslair/DoxygenTestProject.7z?attredirects=0&d=1>.
My example includes the VHDL instantiation bug/graph creation bug and the XML tag file bug I posted more recently to the list.

You will have to run

old-doxygen DoxyfileB
old-doxygen DoxyfileA

new-doxygen DoxyfileB
new-doxygen DoxyfileA_Fix

srcA contains a DFF (sub_component), a senseless inverted DFF (sub_sub_component) and a top-level entity. Instantiating sub_sub_component in toplevel will hide sub_component in toplevel completely, depending on the order in which the parser processes the instantiations. So you will see different results on each parse when you switch the locations of dffA, dffNotA and dffB.

On a side note, I just noticed that detailed descriptions on instantiations don't work for some reason...

srcB is only for showcasing the XML tag file bug. It creates two groups with "special characters" (though they aren't /that/ special...). When you look inside docA, you will find them to be gibberish.

> I assume your maxlevel hack is used to see if a class is a direct base class or sub class, right?
Yes. It could be altered to be a boolean: either only direct "inheritance" or inheritance over all levels. That is what would be needed for VHDL at least. However, I thought it might come in handy to have control over the depth of the recursion so the abort() can be avoided by other language parsers etc.
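
A depth-limited base-class search along these lines might be sketched as follows. This is not the patched ClassDef::isBaseClass: ClassNode, kHardLimit, the parameter order and the convention that a negative maxLevel means "no limit" are all assumptions made for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for a class definition.
struct ClassNode {
    std::vector<const ClassNode *> bases;  // direct base classes
};

// Assumed safety net against cyclic graphs, instead of aborting.
const int kHardLimit = 256;

// Returns true if `base` can be reached from `cd` in at most `maxLevel`
// inheritance steps. maxLevel == 1 checks direct bases only; a negative
// maxLevel means the depth is limited only by kHardLimit.
bool isBaseClass(const ClassNode &cd, const ClassNode &base,
                 int maxLevel = -1, int level = 0) {
    assert(level >= 0);
    if (maxLevel >= 0 && level >= maxLevel) return false;  // budget used up
    if (level >= kHardLimit) return false;                 // give up quietly
    for (const ClassNode *b : cd.bases) {
        if (b == &base) return true;
        if (isBaseClass(*b, base, maxLevel, level + 1)) return true;
    }
    return false;
}
```

With an integer limit, a caller that only cares about direct "inheritance" passes 1, while a caller that wants the full hierarchy omits the argument. A boolean would cover the same two cases but nothing in between.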

Regards,

Robert
Dimitri Van Heesch | 20 Mar 2012 20:28

Re: [BUG] Tag File Parsing Issue

Hi Robert,

On Mar 18, 2012, at 2:10 , Robert Abel wrote:

> Hi,
> 
> I'm currently using tag files to stitch multiple documentation files (for multiple programming
> languages) together.
> 
> There seem to be two bugs related to tag/xml files:
> 	1. Tag files are statically produced with encoding='ISO-8859-1' (doxygen.cpp ll.10604).
> Yet there is not one instance of a conversion function used that I could find that would actually convert
> any tag file output from the source input file encoding given using INPUT_ENCODING. That doesn't seem right!

Indeed, all internal strings inside doxygen are UTF-8 encoded, so the tag files as well. I'll correct this.

> 	2. There is a bug in tagreader.cpp. Basically, QXmlSimpleReader (qxml.cpp) will read any XML input
> file according to its encoding stated in each file. However, the TagFileParser handler in tagreader.cpp
> will store all incoming QString (16bit) strings inside m_curString which is a QCString (8bit) inside
> bool characters (tagreader.cpp ll. 789).
> This effectively annihilates the correctly parsed XML source encoding when curString is assigned to
> different information entities, e.g. when assigning group titles in ll. 664. While I'm not 100% sure what
> happens at this implicit conversion, I reckon the QString will be using the thread locale to convert the
> QCString back to 16bit, thus resulting in gibberish when thread locale and XML encoding mismatch.
> As a quick fix for 2.), I changed the declaration of m_curString to QString so no conversions take place
> (but there may be memory overhead wrt explicit/implicit sharing I read?). I didn't notice any immediate
> problems with this hack.

I plan to remove the implicit conversion from QString to QCString and use an explicit .utf8() everywhere.

> 
> As a fix for 1.) I propose to either actually convert to ISO-8859-1 from the INPUT_ENCODING, or just
> declare the XML file to be encoded using INPUT_ENCODING. The latter would be simplest and cleanest, IMHO.
> Also, please notice that 2.) cannot just be fixed by fixing 1.), since tag files might be produced by "3rd
> party" software using any encoding they wish.

No, I would state that doxygen requires tag files to be UTF-8 encoded, rather than supporting arbitrary
encodings or depending on INPUT_ENCODING. Tag files are files to link different projects, which could
have different INPUT_ENCODING settings.

> The quick and dirty fix for 2.) I did would probably need some revisitation by someone who knows about the
> memory overhead/sharing capabilities involved and can decide on a proper course of action. (Which is why
> I'm hesitant to post a [one-line] patch...)

Will do. I'm trying to get rid of QString as much as possible.

Regards,
  Dimitri

mkk1 | 24 Mar 2012 22:24

Re: [PATCH] VHDL Component Instantiation Fixes


Robert Abel wrote:
> 
> On 19.03.2012 20:27, Dimitri Van Heesch wrote:
>> Can you send me an example that I can use to test your patch?
> 
> Yes, please find an example here
> <https://sites.google.com/site/rawbdagslair/DoxygenTestProject.7z?attredirects=0&d=1>.
> My example includes the VHDL instantiation bug/graph creation bug and
> the XML tag file bug I posted more recently to the list.
> 
> You will have to run
> 
> old-doxygen DoxyfileB
> old-doxygen DoxyfileA
> 
> new-doxygen DoxyfileB
> new-doxygen DoxyfileA_Fix
> 
> srcA contains all a DFF (sub_component), a senseless inverted DFF
> (sub_sub_component) and a top level entity. Instantiating
> sub_sub_component in toplevel will hide sub_component in toplevel
> completely depending on the order the instantiations are processed by
> the parser. So you will see different results on each parse when you
> switch the locations of dffA, dffNotA and dffB.
> 
> On a side note, I just noticed that detail descriptions on
> instantiations don't work for some reason...
> 
> srcB is only for showcasing the XML tag file bug. It creates two groups
> with "special characters" (though they aren's /that/ special...). When
> you look inside docA, you will find them to be gibberish.
> 
>> I assume your maxlevel hack is used to see is a class is a direct
>> base class or sub class, right?
> Yes. It could be altered to be a boolean, either only direct
> "inheritance" or inheritance over all levels. That what would be needed
> for VHDL at least. However, I thought it might come in handy to have
> control over the depth of the recursion so the abort() can be avoided by
> other language parsers etc.
> 
> Regards,
> 
> Robert
> 

-- 
View this message in context: http://old.nabble.com/-PATCH--VHDL-Component-Instantiation-Fixes-tp33521670p33544693.html
Sent from the Doxygen - Development mailing list archive at Nabble.com.

mkk1 | 24 Mar 2012 22:37

Re: [PATCH] VHDL Component Instantiation Fixes


I modified doxygen and generated the HTML documentation for your
DoxygenTestProject example.
See http://www.2shared.com/file/hneieq4a/DoxygenTestProject.html
Can you verify the generated HTML documentation? If the documentation is
correct, I can post you a patch.

Martin

Robert Abel | 27 Mar 2012 19:40

Re: [PATCH] VHDL Component Instantiation Fixes

Hi Martin,

On 24.03.2012 22:37, mkk1 wrote:
> I modified doxygen and generated the html documention for your
> DoxygenTestProject example.
> see http://www.2shared.com/file/hneieq4a/DoxygenTestProject.html
> Can you verify the generated html documentation.
I'm not sure what you did, but it seems correct. However, your instances
are kind of out-of-order.
> If the documentation is
> correct, I can post you a patch.
Care to explain what you did differently from my patch?
Did you keep the list and re-add the instances somewhere later?

Regards

Robert
