Ian Romanick | 1 Nov 2009 02:00

Re: Visable performance regression in sauerbraten game


Maxim Levitsky wrote:
> I tested the revision just before texformat-rework, and everything was fine.
> After that there are a few revisions that won't work, and then I notice a
> significant speed reduction and hiccups.
> 
> Tested with default settings + 800x600, Intel G965

Is there a specific test case or application that shows the regression?
Ian Romanick | 1 Nov 2009 02:09

Re: Initial version of GL_MESA_gpu_program3


Keith Whitwell wrote:
> On Mon, 2009-10-19 at 16:05 -0700, Ian Romanick wrote:
>> Version 2 of the extension spec has been posted:
>>
>> http://people.freedesktop.org/~idr/MESA_gpu_program3.txt
>>
>> Unless anyone has major comments or objections, I think it's time to
>> circulate this to down-stream users (e.g., Wine).  Who are the right
>> contacts?
> 
> Is the intention to fill the gap between where we are now and the NV
> program4 extensions, or to start out on a new MESA-specific path which
> would include later on a MESA_gpu_program4 extension?
> 
> The question is relevant because of things like condition-codes which
> are in the NV GPU4 extension.  
>   - If this is an intermediate step on the way to NV GPU4, then we
> should probably prefer condition-codes over predicates.
>   - If we expect to define a Mesa GPU4, then sticking with predicates is
> fine.

For right now, the intention is to target what shipping hardware that
Mesa supports can do. :)  R500 can't do NVIDIA-style condition codes.
We can fake it on 965, but in some cases a single instruction would be
expanded to a big pile of instructions.

Ian Romanick | 1 Nov 2009 02:13

Re: [PATCH] gallium: Add a PREDICATE register file.


michal wrote:
> Keith Whitwell pisze:
>> On Fri, 2009-10-30 at 11:24 -0700, michal wrote:
>>   
>>> +/*
>>> + * Currently, the following constraints apply.
>>> + *
>>> + * - PredSwizzleXYZW is either set to identity or replicate.
>>> + * - PredSrcIndex is 0.
>>> + */
>>>     
>> Michal,
>>
>> This is looking a lot better.  In terms of the above comment, is this
>> talking about the semantics of PIPE_CAP_GPU3 ?  Or is GPU3 supposed to
>> do full PredSwizzle/PredSrcIndex, just we haven't implemented it
>> somewhere (eg in tgsi_exec.c)?
>>
>> I'd think we want to either:
>> 	- remove fields from the token so that the comment isn't necessary,
>> 	- remove the comment and have GPU3 mean that the full semantics are
>> available
>> 	- come up with yet another cap bit to say whether or not full predicate
>> semantics are implemented by a particular driver.
>>
>> Needless to say I don't like the last option, so I guess that means we
>> need to decide now whether the full semantics in the token are in or
>> out.  
>>
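
The two constraints quoted in the patch comment can be written down as a small check. This is only a sketch: struct pred_token and its field names are hypothetical stand-ins for the predicate token fields (PredSwizzleXYZW, PredSrcIndex), and the encoding 0=X, 1=Y, 2=Z, 3=W is an assumption, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the predicate fields discussed above; the
 * real token layout in the patch may differ.
 * Assumed swizzle encoding: 0 = X, 1 = Y, 2 = Z, 3 = W. */
struct pred_token {
    unsigned swizzle[4];   /* PredSwizzleX .. PredSwizzleW */
    unsigned src_index;    /* PredSrcIndex */
};

/* Identity swizzle: each channel selects itself (xyzw). */
static bool pred_swizzle_is_identity(const struct pred_token *p)
{
    return p->swizzle[0] == 0 && p->swizzle[1] == 1 &&
           p->swizzle[2] == 2 && p->swizzle[3] == 3;
}

/* Replicate swizzle: all four channels select the same component. */
static bool pred_swizzle_is_replicate(const struct pred_token *p)
{
    return p->swizzle[0] == p->swizzle[1] &&
           p->swizzle[1] == p->swizzle[2] &&
           p->swizzle[2] == p->swizzle[3];
}

/* True if the token stays within the constraints listed in the comment:
 * swizzle is identity or replicate, and the source index is 0. */
static bool pred_token_is_supported(const struct pred_token *p)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < 4; i++) {
        if (p->swizzle[i] > 3)
            return false;   /* not a valid component selector */
    }
    return p->src_index == 0 &&
           (pred_swizzle_is_identity(p) || pred_swizzle_is_replicate(p));
}
```

If the full semantics end up in the token, a check like this would simply go away; if they stay out, the fields it tests would be removed instead.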

Marek Olšák | 1 Nov 2009 04:52

[PATCH] mesa/st: fix crash when drawing stencil pixels outside the drawbuffer

Hi,

the attached patch fixes a possible crash in function draw_stencil_pixels in mesa/state_tracker. I've also updated the list of formats we read from to prevent an assertion failing later in the code.

This makes glean/depthStencil work on r300g and softpipe.
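
The attached patch itself is not reproduced in this archive. As a rough illustration of the general idea only, one common way to avoid touching memory outside a drawbuffer is to clip the destination rectangle before writing. The names below (struct rect, clip_rect_to_buffer) are hypothetical and are not taken from the patch or from the state tracker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle type: origin plus size, in pixels. */
struct rect {
    int x, y, width, height;
};

/* Clip r in place to the buffer area [0, buf_w) x [0, buf_h).
 * Returns false if nothing of the rectangle remains visible. */
static bool clip_rect_to_buffer(struct rect *r, int buf_w, int buf_h)
{
    int x0, y0, x1, y1;

    assert(r != NULL);
    x0 = r->x;
    y0 = r->y;
    x1 = r->x + r->width;    /* exclusive right edge */
    y1 = r->y + r->height;   /* exclusive bottom edge */

    if (x0 < 0) x0 = 0;
    if (y0 < 0) y0 = 0;
    if (x1 > buf_w) x1 = buf_w;
    if (y1 > buf_h) y1 = buf_h;

    if (x1 <= x0 || y1 <= y0)
        return false;        /* fully outside: skip the draw */

    r->x = x0;
    r->y = y0;
    r->width = x1 - x0;
    r->height = y1 - y0;
    return true;
}
```

A real implementation would also have to offset the source pixel pointer by the amount clipped from the left and bottom edges, which this sketch leaves out.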

Best regards
Marek Olšák

------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Pierre Ossman | 1 Nov 2009 14:34

r600/r700 compiler future?

I'm looking at r600/r700 compiler with the ambition of filling in the
missing pieces. I've just read through the documentation and the basic
structure of the compiler, and I'm having a hard time understanding the
design choices of the code. Hopefully someone can fill me in on what
the plan is here.

The basic problem is that the compiler is a somewhat poor fit for the
hardware. The compiler is designed around vectors, whilst the hardware
works more in terms of individual elements (albeit with a whole bunch
of restrictions). This disparity means that the compiler in its current
form can be very inefficient. Worst case scenario is using 25% of the
hardware.

As an example, I've been looking at implementing the EXP op. A simple
implementation would use 4 instruction groups with the current
compiler, but the hardware should be capable of doing it with 2. An
optimised implementation would range in efficiency between 100% and 33%
of hardware capability. The common case would probably still be 50%.
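
The percentages above can be reproduced with simple arithmetic, assuming that "efficiency" means the minimum number of instruction groups the hardware would need divided by the number the compiler actually emits. That reading is an assumption, and group_efficiency_percent is a hypothetical helper, not part of the driver.

```c
#include <assert.h>

/* Efficiency in percent, truncated: how many of the emitted instruction
 * groups the hardware would have needed at minimum.
 * Assumes emitted_groups > 0 and min_groups <= emitted_groups. */
static unsigned group_efficiency_percent(unsigned min_groups,
                                         unsigned emitted_groups)
{
    assert(emitted_groups > 0);
    assert(min_groups <= emitted_groups);
    return (100u * min_groups) / emitted_groups;
}
```

With these definitions, the EXP case of 4 emitted groups against a minimum of 2 comes out at 50, and 1 useful group out of 4 gives the 25 quoted as the worst case.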

So what's the plan here? This kind of inefficiency must be a temporary
solution and not the final goal?

Rgds
-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by FRA, a
  Swedish intelligence agency. Make sure your server uses
  encryption for SMTP traffic and consider using PGP for
  end-to-end encryption.
------------------------------------------------------------------------------
Alex Deucher | 1 Nov 2009 17:06

Re: r600/r700 compiler future?

On Sun, Nov 1, 2009 at 9:34 AM, Pierre Ossman <pierre-list <at> ossman.eu> wrote:
> I'm looking at r600/r700 compiler with the ambition of filling in the
> missing pieces. I've just read through the documentation and the basic
> structure of the compiler, and I'm having a hard time understanding the
> design choices of the code. Hopefully someone can fill me in on what
> the plan is here.
>
> The basic problem is that the compiler is a somewhat poor fit for the
> hardware. The compiler is designed around vectors, whilst the hardware
> works more in terms of individual elements (albeit with a whole bunch
> of restrictions). This disparity means that the compiler in its current
> form can be very inefficient. Worst case scenario is using 25% of the
> hardware.
>
> As an example, I've been looking at implementing the EXP op. A simple
> implementation would use 4 instruction groups with the current
> compiler, but the hardware should be capable of doing it with 2. An
> optimised implementation would range in efficiency between 100% and 33%
> of hardware capability. The common case would probably still be 50%.
>
> So what's the plan here? This kind of inefficiency must be a temporary
> solution and not the final goal?

At this point the compiler is pretty much unoptimized; it was written
to make sure everything worked.  In most cases it just provides the
basic functionality, so optimizations are welcome.   I've cc'ed
Richard as he has written most of the shader code at this point.

Alex

------------------------------------------------------------------------------
Maxim Levitsky | 2 Nov 2009 13:57

Re: Visable performance regression in sauerbraten game

On Sat, 2009-10-31 at 18:00 -0700, Ian Romanick wrote:
> Maxim Levitsky wrote:
> > I tested the revision just before texformat-rework, and everything was fine.
> > After that there are a few revisions that won't work, and then I notice a
> > significant speed reduction and hiccups.
> > 
> > Tested with default settings + 800x600, Intel G965
> 
> Is there a specific test case or application that shows the regression?

To reproduce, just start the sauerbraten game.
I am currently playing the version in Ubuntu 9.10 (0.0.20090504.dfsg-1).

I tested 'Private stan sauer', part 2 (run and guns ||).
The configuration I used is attached.

With this regression, I see frequent hiccups that make the game almost
unplayable.

Without the regression I get about 18-24 FPS.

I bisected this bug (and successfully tested the master branch with this
commit reverted):

2c30ee9bd69ed606b984c051748a7cdb34905eeb is first bad commit
commit 2c30ee9bd69ed606b984c051748a7cdb34905eeb
Author: Eric Anholt <eric <at> anholt.net>
Date:   Fri Oct 30 13:20:13 2009 -0700

    i965: Fix BRW_WM_MAX_INSN to reflect current limits.

    Part of fixing bug #24355.

Best regards,
	Maxim Levitsky
Attachment (config.cfg): text/x-csrc, 10 KiB
Attachment (init.cfg): text/x-csrc, 240 bytes
------------------------------------------------------------------------------
Karl Schultz | 2 Nov 2009 15:24

Re: [Bug 24650] Visual Studio takes over 30 minutes to compile Mesa core

I've seen the older (2005) compilers take a long time to compile one of the files, image.c I think, in 32-bit Release mode.  I think that this source code file used a lot of macros that expanded out to a lot of code.  And this code's size or structure caused the optimizer to take a long time to build it.

I worked around it in the 2005 studio by turning off optimization just for that one file.  I have not noticed the problem with the 2008 tools.
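
The message does not say how optimization was turned off for that file; per-file project settings are one way. Another option, sketched here as an assumption rather than as the workaround actually used, is MSVC's `#pragma optimize` placed around the slow code. The SWAP2 macro and swap_bytes16 function are hypothetical placeholders for the macro-heavy code. VS 2005 reports _MSC_VER 1400 and VS 2008 reports 1500, so the guard leaves newer tools and other compilers untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Disable the optimizer only for MSVC releases older than VS 2008
 * (_MSC_VER 1500). Other compilers never see the pragma. */
#if defined(_MSC_VER) && _MSC_VER < 1500
#pragma optimize("", off)
#endif

/* Hypothetical placeholder for macro-heavy code that is slow to optimize. */
#define SWAP2(p)                      \
    do {                              \
        unsigned char t_ = (p)[0];    \
        (p)[0] = (p)[1];              \
        (p)[1] = t_;                  \
    } while (0)

/* Swap the two bytes of each 16-bit element in place. */
static void swap_bytes16(unsigned char *data, unsigned count)
{
    unsigned i;

    assert(data != NULL || count == 0);
    for (i = 0; i < count; i++)
        SWAP2(data + 2 * i);
}

/* Restore the optimization settings given on the command line. */
#if defined(_MSC_VER) && _MSC_VER < 1500
#pragma optimize("", on)
#endif
```

The pragma has to sit outside any function and applies from the next function definition onward, so it can also be limited to just the functions that trigger the slow path.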

On Wed, Oct 21, 2009 at 1:45 AM, <bugzilla-daemon <at> freedesktop.org> wrote:
--- Comment #5 from José Fonseca <jfonseca <at> vmware.com>  2009-10-21 01:45:33 PST ---
We're using the WinSDK compilers and we're seeing the slow compilation times
here too. I've never seen the corrupt .obj files, though -- it might be a bug
specific to 2005.

MSVC x64 compiler optimization algorithms appear to be inefficient. Certain
source files plus any optimization option cause it to take ages -- probably
trying to evaluate a combinatorial explosion of alternatives.

I couldn't detect anything wrong or fancy with the source files in question. It
is not even clear why some files are slower and others not. gcc compiles the
same code, both in x86 or x64, without any sweat.

I haven't tried VS 2008 yet.


--
Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
------------------------------------------------------------------------------
bugzilla-daemon | 2 Nov 2009 15:30

[Bug 24650] Visual Studio takes over 30 minutes to compile Mesa core

http://bugs.freedesktop.org/show_bug.cgi?id=24650

--- Comment #6 from Karl Schultz <Karl.W.Schultz <at> gmail.com>  2009-11-02 06:30:43 PST ---
I've seen the older (2005) compilers take a long time to compile one of the
files, image.c I think, in 32-bit Release mode.  I think that this source
code file used a lot of macros that expanded out to a lot of code.  And this
code's size or structure caused the optimizer to take a long time to build
it.

I worked around it in the 2005 studio by turning off optimization just for
that one file.  I have not noticed the problem with the 2008 tools.

On Wed, Oct 21, 2009 at 1:45 AM, <bugzilla-daemon <at> freedesktop.org> wrote:

> http://bugs.freedesktop.org/show_bug.cgi?id=24650
>
> --- Comment #5 from José Fonseca <jfonseca <at> vmware.com>  2009-10-21 01:45:33 PST ---
> We're using the WinSDK compilers and we're seeing the slow compilation times
> here too. I've never seen the corrupt .obj files, though -- it might be a bug
> specific to 2005.
>
> MSVC x64 compiler optimization algorithms appear to be inefficient. Certain
> source files plus any optimization option cause it to take ages -- probably
> trying to evaluate a combinatorial explosion of alternatives.
>
> I couldn't detect anything wrong or fancy with the source files in question. It
> is not even clear why some files are slower and others not. gcc compiles the
> same code, both in x86 or x64, without any sweat.
>
> I haven't tried VS 2008 yet.

--

-- 
Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
------------------------------------------------------------------------------
Li, RichardZ | 2 Nov 2009 16:35

Re: r600/r700 compiler future?

> -----Original Message-----
> From: Alex Deucher [mailto:alexdeucher <at> gmail.com]
> Sent: Sunday, November 01, 2009 11:06 AM
> To: Pierre Ossman
> Cc: mesa3d-dev <at> lists.sourceforge.net; Li, RichardZ
> Subject: Re: [Mesa3d-dev] r600/r700 compiler future?
> 
> On Sun, Nov 1, 2009 at 9:34 AM, Pierre Ossman <pierre-list <at> ossman.eu> wrote:
> > I'm looking at r600/r700 compiler with the ambition of filling in the
> > missing pieces. I've just read through the documentation and the basic
> > structure of the compiler, and I'm having a hard time understanding the
> > design choices of the code. Hopefully someone can fill me in on what
> > the plan is here.
> >
> > The basic problem is that the compiler is a somewhat poor fit for the
> > hardware. The compiler is designed around vectors, whilst the hardware
> > works more in terms of individual elements (albeit with a whole bunch
> > of restrictions). This disparity means that the compiler in its current
> > form can be very inefficient. Worst case scenario is using 25% of the
> > hardware.

Yeah, there are 5 micro-processors grouped into one vector processor,
which is intended to process an ALU instruction group. Each ALU instruction
group consists of 5 per-component instructions and, optionally, 2 constants.
ALU instruction groups form the clause.
There can of course be fewer per-component instructions in one ALU
instruction group. In any case, the last instruction should set
m_Word0.f.last = 1.

> >
> > As an example, I've been looking at implementing the EXP op. A simple
> > implementation would use 4 instruction groups with the current
> > compiler, but the hardware should be capable of doing it with 2. An
> > optimised implementation would range in efficiency between 100% and 33%
> > of hardware capability. The common case would probably still be 50%.
> >

Given the design of the r6/r7 ALU processor, we could emit only the
ALU component instructions that have an effect, using the write-mask to
select them. In the r6/r7 mesa dri driver, assemble_alu_instruction
assembles the ALU instructions; that is where the write-mask should be
consulted.
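
A minimal sketch of write-mask-driven emission, under stated assumptions: struct alu_inst and emit_alu_group are hypothetical and are not the driver's real data structures or its assemble_alu_instruction. Bit 0 of the mask is taken to mean x and bit 3 to mean w, and the `last` field stands in for m_Word0.f.last.

```c
#include <assert.h>

#define MAX_SLOTS 4   /* x, y, z, w channels covered by the write-mask */

/* Hypothetical per-component ALU instruction. */
struct alu_inst {
    unsigned channel;  /* 0 = x, 1 = y, 2 = z, 3 = w */
    unsigned last;     /* 1 on the final instruction of the group,
                          standing in for m_Word0.f.last */
};

/* Emit one per-component instruction for every channel enabled in
 * write_mask (bit 0 = x ... bit 3 = w), skipping disabled channels.
 * Returns the number of instructions written to out. */
static unsigned emit_alu_group(unsigned write_mask,
                               struct alu_inst out[MAX_SLOTS])
{
    unsigned n = 0;
    unsigned ch;

    assert((write_mask & ~0xFu) == 0);
    for (ch = 0; ch < MAX_SLOTS; ch++) {
        if (write_mask & (1u << ch)) {
            out[n].channel = ch;
            out[n].last = 0;
            n++;
        }
    }
    if (n > 0)
        out[n - 1].last = 1;   /* close the instruction group */
    return n;
}
```

For a mask of 0x5 (x and z) this emits two instructions instead of four, with the `last` flag set on the z instruction.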

> > So what's the plan here? This kind of inefficiency must be a temporary
> > solution and not the final goal?

Yes, you are right. When the compiler code was put there, as Alex said,
we wanted to get everything working. Emitting ALU instructions based
on the write-mask is certainly a must-do tune-up for the current compiler.
We are preparing the r6/r7 driver's capability to compile mesa glsl IL to
hw instructions. Originally I hoped to tune up the compiler after that.
Any optimizations are welcome, especially this one. I hope the
current code is only a starting point for us to work together on making
its graphics run better. :-)

> 
> At this point the compiler is pretty much unoptimized; it was written
> to make sure everything worked.  In most cases it just provides the
> basic functionality, so optimizations are welcome.   I've cc'ed
> Richard as he has written most of the shader code at this point.
> 
> Alex

------------------------------------------------------------------------------
