1 Feb 01:10 2009

> The GPU will read the 9 MB of pixel data from one location,
> then write
> it to another location, therefore using 18 MB of bandwidth.

Oh, so you were already assuming the surface was in video memory? Because I was thinking it was going from RAM
to VRAM, hence much slower.

> I really need to clean that junk up.

huahahahaha

> I'm not sure I see what you mean... You want to
> minimize the amount
> copied? If stuff is in video memory, using the hardware

yep.

> blitter to
> copy everything without bothering to optimize might
> actually be
> faster, as it is fewer commands than the detailed
> "instructions" that
> would be necessary to optimize that. Even when doing it in

I don't know... after all, each blit, at the lower level, is really a repeated memory copy for each
pixel (assuming a 32-bit depth on a 32-bit machine), so the fewer pixels the better.

> main memory
> with the CPU, your code being simpler and having fewer
> branches might


1 Feb 01:14 2009

> If you're gonna be using the GPU you probably want the
> VRAM if you can.  And yes, it still needs to go through the
> GPU.  DMA is a very
> fast way to copy, but that's all it does is copy
> memory.  It can't do the number
> crunching part.

I don't get the point of DMA then :( probably because I don't know what this number crunching is doing... but I
don't see the point of DMA if all the data still has to go through the GPU...

cheers,
AM.

--- On Fri, 30/1/09, Mason Wheeler <masonwheeler <at> yahoo.com> wrote:

> From: Mason Wheeler <masonwheeler <at> yahoo.com>
> Subject: Re: [SDL] DirectAccess to Video Memory [was: Hello, new member  reporting]
> To: sdl <at> lists.libsdl.org
> Date: Friday, 30 January 2009, 20:11
> >----- Original Message ----
>
> >From: Antonio Marcos <amcmr2003 <at> yahoo.com.br>
> >Subject: Re: [SDL] DirectAccess to Video Memory [was: Hello, new member  reporting]
> >
> >> >I still dont get why I need 18MB bandwidth for
> 9MB.. :)
> >> Because a blit is not a straight-up copy.  It
> requires


1 Feb 16:50 2009

On Sat, Jan 31, 2009 at 7:14 PM, Antonio Marcos <amcmr2003 <at> yahoo.com.br> wrote:

> I don't get the point of DMA then :( probably because I don't know what this number crunching is doing... but I
> don't see the point of DMA if all the data still has to go through the GPU...

DMA is to send data from the main RAM to the video RAM. So in the case
of this example, where we wanted to top out the capacity of the GPU,
we'd use DMA (well, by "we", I mean the video card driver) to put the
sprites *once* in video RAM, then use the GPU to blit it at various
locations on the screen in the fastest possible way.

As Mason Wheeler said, you can't do anything special with DMA, though,
no colorkeying, no nothing. If it can't be done with memcpy(), it
can't be done with DMA.

--
http://pphaneuf.livejournal.com/

1 Feb 18:28 2009

On Sat, Jan 31, 2009 at 7:10 PM, Antonio Marcos <amcmr2003 <at> yahoo.com.br> wrote:

> Oh, so you were already assuming the surface was in video memory? Because I was thinking it was going from
> RAM to VRAM, hence much slower.

Oh, yes! The I/O bus is *much* slower than the 51.2 GB/s (well, on a
normal computer, there's some crazy PCI Express v3.0 32x buses that
are extremely fast, but you then probably hit the limits of the main
memory bandwidth), so you don't even *think* about the video card
memory bandwidth when doing main RAM to video RAM transfers.

> I don't know... after all, each blit, at the lower level, is really a repeated memory copy for each
> pixel (assuming a 32-bit depth on a 32-bit machine), so the fewer pixels the better.

True, but when telling the video card to do VRAM to VRAM blits, that
has to go through a command buffer that is generally not optimized for
high volumes, so there's a cost to each command, and there's a
breaking point where you're better off just blitting the whole thing
(or at least, in some bigger chunks).

In principle you're right, though, fewer pixels is better.

Also, you were mentioning RLE earlier; if I'm not mistaken, most video
cards nowadays use texture compression, which usually gets a similar
effect. That's also one of the reasons why locking surfaces is
something you want to avoid: on those modern cards, it will
not only wait for the GPU to be done with the surface, but also needs to
uncompress it somewhere (all wasted if you were going to overwrite it
completely!), then recompress it when you unlock the surface. That
"somewhere" is also often main memory, so it also goes over the I/O
bus.

1 Feb 18:36 2009

On Sat, Jan 31, 2009 at 7:10 PM, Antonio Marcos <amcmr2003 <at> yahoo.com.br> wrote:

> actually it would have the same number of branches (if i understood correctly what you mean), when I do
> dirty rectangles i have to copy the BG, and then blit it on the previous position of the moving sprite. Then I
> blit the actual sprite at the new position. This last blit is already colorkeyed (otherwise I would blit
> only the sides that moved and exposed the BG, like I said previously), the idea is to do the same with the
> first blit. :)
>
> The catch though is that it's a little trickier, because I would have to use a 'negative mask' of the
> sprite... and the way I'm seeing it with SDL, I would need an actual blit of the whole BG rect to a new surface,
> then paste the mask to it, and only then pass it to the video card... and though this can save data going to the
> card, it could be done in a direct blit if I could 'modify' the colorkey code of SDL and use the 'negative
> mask' at the AND phase.. SDL could have a blit with a mask parameter :)

Just to put things in perspective:

One of my co-workers on the 3D team explained to me how the
depth-of-field effect in Crysis is done. They re-render every frame a
number of times, with slightly different camera locations, all pointed
at the same focal point, and blend the resulting multiple frames
together.

Think about it: the frame you see has not been rendered once, but
*many* times, and blended. Just the blending effect is probably more
work for the GPU than doing what you're saying the naive way, and it's
just a tiny part of every single frame in that game!

So don't worry too much, if everything is in video memory and you've
got an OpenGL renderer in the style of SDL 1.3, you need to be *very*
wasteful with 2D operations before you start to feel it.


2 Feb 01:58 2009

> As Mason Wheeler said, you can't do anything special
> with DMA, though,
> no colorkeying, no nothing. If it can't be done with
> memcpy(), it
> can't be done with DMA.

Yes, that's what I believed :) my intention is to get the image ready to display, and THEN DMA it to the card. But
yes, if the sprite is already in VRAM, great, it will make it "ready" (the number crunching) using the
GPU.. before that it's just loading time.. disk to RAM.. DMA to VRAM; the user will only see it once, probably
at the start of the game or level.. but I believe you were talking about RAM-to-VRAM bandwidth, not
VRAM-to-VRAM... well, sorry for any misunderstanding :)

--- On Sun, 1/2/09, Pierre Phaneuf <pphaneuf <at> gmail.com> wrote:

> From: Pierre Phaneuf <pphaneuf <at> gmail.com>
> Subject: Re: [SDL] DirectAccess to Video Memory [was: Hello, new member  reporting]
> To: amcmr2003 <at> yahoo.com.br, "A list for developers using the SDL library. (includes SDL-announce)" <sdl <at> lists.libsdl.org>
> Date: Sunday, 1 February 2009, 13:50
> On Sat, Jan 31, 2009 at 7:14 PM, Antonio Marcos
> <amcmr2003 <at> yahoo.com.br> wrote:
>
> > I dont get the point of DMA then :( probably because I
> dont know whats this number crunching doing... but I dont
> get whats the point of DMA if the whole data have still to
> go through the gpu...
>
> DMA is to send data from the main RAM to the video RAM. So
> in the case
> of this example, where we wanted to top out the capacity of
> the GPU,


2 Feb 02:34 2009

> What I mean by more branches is that for every pixel, you
> add a "should I get it from surface A or surface B". I
> was mistaken, though, it's no worse than a colorkeyed blit, with the
> added twist of the colorkey not just saying "don't blit this
> pixel", but rather "get this
> pixel from the background instead".

exactly! :)

> You don't really need a separate
> "negative mask", the colorkey could be used for
> this purpose, no?

..let me think... oh.. :) Indeed!! hheheheheheh thanks!! :) But yes, the colorkey is in the sprite
surface's pixels; SDL would still need the mask parameter :) (any ideas how to use a mask in SDL?)

> But if everything is in video memory, I wouldn't care.
> You could
> reblit the *entire* background, then blit *every* sprite on
> *every*
> frame, and you could still be close to 1000 frames per
> second on my
> GeForce 8800 (ok, maybe just 500, still plenty).

wow!

> I understand what you mean by the "two rectangles on
> each side", you
> were thinking of a moving sprite. With everything in video
> memory, I


2 Feb 02:46 2009

> Think about it: the frame you see has not been rendered
> once, but
> *many* times, and blended. Just the blending effect is
> probably more
> work for the GPU than doing what you're saying the
> naive way, and it's
> just a tiny part of every single frame in that game!

Hmpf.. there goes my optimizing-l33t-hacking-drives down the drain.. but they will be back with torches,
Pierre, be warned!

>
> So don't worry too much, if everything is in video
> memory and you've
> got an OpenGL renderer in the style of SDL 1.3, you need to
> be *very*
> wasteful with 2D operations before you start to feel it.
>

Wait a sec.. will I need to actually talk to OpenGL myself? Or will this be taken care of by SDL?

> how the depth-of-field effect in Crysis is done.

Awesome! Got any videos of this in action, so I can check it out? It sounds like the player would feel stoned or
something :) (sorry if this offends the programmers in any way haha, and there does seem to be at least a couple
of ways, lol)

> But if you SDL_LockSurface even *once* per frame,
> everything will


2 Feb 06:59 2009

### Problem with SDL-1.2.13 installation..

Hi All,

I am trying to setup QEMU on Windows XP. I have followed the below steps for the same:

================================================
1. Install MinGW-5.1.4.exe
   - Which MinGW package do you wish to install? --> Candidate
   - Select the type of install --> Minimal
2. Install MSYS-1.0.10.exe
3. Unpack msysCORE-1.0.11-20080826 and overwrite the files at c:\msys\1.0\
4. Execute C:\msys\1.0\postinstall\pi.bat and set up c:/MinGW. Now double-click the msys icon and check that it runs fine.
5. Install msysDTK-1.0.1.exe
6. Unpack bash-3.1-MSYS-1.0.11-snapshot.tar.bz2 and overwrite the files at c:\msys\1.0\
7. Unpack MSYS-1.0.11-20090105-dll.tar and overwrite the files at c:\msys\1.0\

Now run msys; the steps below are executed in the msys shell.

8. tar -C /MinGW -xzvf directx-devel.tar.tar
9. Install SDL-1.2.13.tar.tar:
   tar xzvf SDL-1.2.13.tar.tar
   cd SDL-1.2.13
   ./configure --prefix=/MinGW && make && make install
10. Install zlib-1.2.3.tar.gz:
   tar xzvf zlib-1.2.3.tar.gz
   cd zlib-1.2.3
   ./configure --prefix=/MinGW && make test && make install
================================================

Currently, I am facing a problem with the installation of SDL. When I run the command ./configure --prefix=/MinGW && make && make install in the SDL-1.2.13 folder, I get "sh.exe -- Application error" 6 or 7 times, followed by the error below:

"checking build system type... config.guess: cannot create a temporary directory in /tmp
configure: error: cannot guess build type; you must specify one"
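A guess at the cause, for anyone hitting this: config.guess fails when the MSYS root has no /tmp directory. Creating it is worth trying first; failing that, the build type can be named by hand (the triplet below is an assumption — check what your MinGW toolchain expects):

```shell
# Create the missing temp directory inside the MSYS root, then retry configure:
mkdir -p /tmp

# If config.guess still cannot detect the system, specify the build type explicitly:
# ./configure --prefix=/MinGW --build=i686-pc-mingw32
```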

Thanks
Prakash

_______________________________________________
SDL mailing list
SDL <at> lists.libsdl.org
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org

2 Feb 19:37 2009

On Sun, Feb 1, 2009 at 8:46 PM, Antonio Marcos <amcmr2003 <at> yahoo.com.br> wrote:

> Hmpf.. there goes my optimizing-l33t-hacking-drives down the drain.. but they will be back with
> torches, Pierre, be warned!

Yeah, it's a bit disappointing for the bit-twiddling aspects that
we're losing, but there's a whole new art of making the fixed set of
APIs of OpenGL (well, less fixed now that there's more and more
shader-style programmability) do the wacky things we want to do as
fast as possible.

> wait a sec.. will I need to actually talk to openGL myself? or this will be taken care of by SDL?

No, in SDL 1.3, the new API has a "renderer" that talks OpenGL behind the scenes.

>> how the depth-of-field effect in Crysis is done.
>
> Awesome! Got any videos of this in action, so I can check it out? It sounds like the player would feel stoned
> or something :) (sorry if this offends the programmers in any way haha, and there does seem to be at least a
> couple of ways, lol)

There's got to be some stuff on YouTube, I guess? There's a similar
effect in Call of Duty 4: when you bring up a weapon for more accurate
firing, it simulates the effect of your eyes focusing on the sights,
but I think it's faked with a bit of blurring, instead of being more
optically correct as in Crysis (that game really needs a powerful
system, needless to say!).

>> But if you SDL_LockSurface even *once* per frame,
>> everything will
>> memory (as per
>> the subject) is the *last* thing you want, really.
>
> ookaay.. got it :( back to DosBox then... :(

Well, if it's any comfort, with PCI Express, the "toilet" is much
faster than it used to be. But if you want to use the hardware
properly and kick real ass, you'll have to keep to the new SDL_Texture
API (rather than the old SDL_Surface API). You can still lock textures
with the new API, but there's a flag when creating the texture to say
whether it is "static" or "streamable", and you can only lock the
latter (this is so SDL knows when it's free to optimize the heck out
of stuff, so presumably, operations done with "static" textures have
better chances of being in the fast path).
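The static/streamable distinction can be sketched roughly like this. The names below are the texture API as it later stabilized in SDL 2.0 (SDL 1.3 snapshots were still moving, so treat the exact identifiers as an assumption):

```c
#include "SDL.h"

/* Sketch only: a "static" texture for sprites uploaded once, and a
 * "streaming" texture for pixels you intend to lock and rewrite.
 * API names are as they stabilized in SDL 2.0. */
void texture_sketch(SDL_Renderer *renderer)
{
    /* Upload-once sprite: SDL is free to keep it in the fast path. */
    SDL_Texture *sprite = SDL_CreateTexture(renderer,
        SDL_PIXELFORMAT_ARGB8888, SDL_TEXTUREACCESS_STATIC, 64, 64);

    /* Lockable texture: only this kind may be locked for direct access. */
    SDL_Texture *stream = SDL_CreateTexture(renderer,
        SDL_PIXELFORMAT_ARGB8888, SDL_TEXTUREACCESS_STREAMING, 64, 64);

    void *pixels;
    int pitch;
    if (SDL_LockTexture(stream, NULL, &pixels, &pitch) == 0) {
        /* ... write pixels here; the row stride is `pitch` bytes ... */
        SDL_UnlockTexture(stream);
    }

    SDL_DestroyTexture(sprite);
    SDL_DestroyTexture(stream);
}
```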

--
http://pphaneuf.livejournal.com/

