PackageObjectFactory's Class.forName usage
Seth Pellegrino <seth.pellegrino <at> jivesoftware.com>
2013-06-05 18:09:10 GMT
Hello Checkstyle Developers,
Our team is using checkstyle across a fairly intensely modularized project (over 200 POMs and counting).
Our build performance is nothing to write home about, as you might expect, but I was surprised to discover
that checkstyle was accounting for about a third of our build time on my machine.
My interest piqued, I landed in PackageObjectFactory. We're leveraging ~60 checks across most of our
modules, and since the maven-checkstyle-plugin creates a new Checker for every module, we ended up there
a lot. If I may summarize the logic in this class, it seems to be behaving like so:
For each suffix in ("", "Check"):
    For each package in ("", ...):
        try Class.forName on package + module name + suffix
where there's usually a half-dozen (or more) packages that we'll end up scanning through in the worst case.
This algorithm seems to me like an abuse of the class loading infrastructure; misses are rarely cached in
class loaders, so each miss ends up scanning the class path. On average, then, every time we load a
check (the hot path through that class), we expect to scan the class path (num_packages * 3/2) times:
one full pass over the packages for the "" suffix, plus about half a pass for the "Check" suffix.
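
To make that estimate concrete, here is a small sketch that counts lookup attempts. It only models the search order described above; a Set of names stands in for the class path, and the class name, method name and package names are made up for illustration, not taken from Checkstyle.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the suffix/package search; not Checkstyle code. */
final class LookupCostSketch {
    /**
     * Returns how many candidate names are tried before the first hit,
     * or the total number of candidates if nothing matches.
     * classPath is a stand-in for what Class.forName would find.
     */
    static int attempts(List<String> packages, String name, Set<String> classPath) {
        int tried = 0;
        for (String suffix : Arrays.asList("", "Check")) {
            for (String pkg : packages) {
                tried++; // each candidate corresponds to one Class.forName call
                if (classPath.contains(pkg + name + suffix)) {
                    return tried;
                }
            }
        }
        return tried;
    }
}
```

With six packages, a check that lives in the fourth package under its "Check" name costs six misses for the "" suffix plus four tries for the "Check" suffix.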
As a short-term solution, I've implemented caching (critically, with negative caching) in the
PackageObjectFactory, and I've flipped the load order to look for Check classes first (as this is the more
common case). These changes are available on bitbucket: https://bitbucket.org/sethp_jive/checkstyle/commits/fa747d132c52f584c7a85ffadd7fc041c449e80e
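
For readers who don't want to click through, the idea of negative caching looks roughly like the sketch below. This is not the code from the commit above; the class and method names are hypothetical, and it assumes Java 8 or later for brevity.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper illustrating positive and negative caching of class lookups. */
final class CachingClassResolver {
    // Optional.empty() records a miss, so a missing name is never looked up twice.
    private final Map<String, Optional<Class<?>>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loaderCalls = new AtomicInteger(); // for illustration only
    private final ClassLoader loader;

    CachingClassResolver(ClassLoader loader) {
        this.loader = loader;
    }

    Optional<Class<?>> resolve(String className) {
        return cache.computeIfAbsent(className, this::load);
    }

    private Optional<Class<?>> load(String className) {
        loaderCalls.incrementAndGet();
        try {
            // initialize=false: we only need to locate the class here
            Class<?> found = Class.forName(className, false, loader);
            return Optional.<Class<?>>of(found);
        } catch (ClassNotFoundException e) {
            return Optional.empty(); // cache the miss as well
        }
    }

    int loaderCalls() {
        return loaderCalls.get();
    }
}
```

The point is the empty Optional: a repeated request for a name that does not exist is answered from the map rather than by another trip through the class loader.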
As a longer-term solution, it would be better to introduce an authoritative mapping between check name and
fully-qualified class name. Then PackageObjectFactory would be greatly simplified, both in code complexity
and in runtime profile: it would merely expand the requested module name if such a mapping exists and
then do a single Class.forName call. That way, the class loader's caching of existing classes gets leveraged
to the maximal extent, and the error case is the only case in which we regularly re-scan the whole class path
from beginning to end.
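
A rough sketch of what I mean, under the assumption that the mapping is a plain name-to-class-name table; how it would be populated (a properties file, a registry, etc.) is left open, and the names below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical resolver: expand the short name if mapped, then one Class.forName call. */
final class MappedModuleResolver {
    private final Map<String, String> nameToClass = new HashMap<>();

    void register(String shortName, String fullyQualifiedName) {
        nameToClass.put(shortName, fullyQualifiedName);
    }

    /** Unmapped names are passed through and treated as already fully qualified. */
    String expand(String name) {
        return nameToClass.getOrDefault(name, name);
    }

    Class<?> resolve(String name) {
        String className = expand(name);
        try {
            return Class.forName(className); // the single lookup
        } catch (ClassNotFoundException e) {
            // Only the error case pays for a full class path scan.
            throw new IllegalArgumentException("Unable to load module " + name
                    + " (tried " + className + ")", e);
        }
    }
}
```

A hit costs one map lookup and one Class.forName call, which the class loader can answer from its own cache of already-loaded classes on every later request.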
Please take a look and let me know what you think.