pouët.net

Larrabee

category: general [glöplog]
Anyone sort of excited about this idea of having a GPU with tons of x86-like cores?

added on the 2008-07-06 22:31:10 by _-_-__ _-_-__
if it was a gpu and not an hpc device :)
the final format will be much more like a tesla i guess - but i'd love to play with one anyway.
added on the 2008-07-06 23:29:16 by makc makc
playing with it would indeed be nice.. but i bet that the first generation(s) will suck performance-wise.. i also wonder how they want to support full OpenGL and DX (=stable!! drivers) in that timeframe..
added on the 2008-07-06 23:35:55 by toxie toxie
Probably by getting themselves delayed? ;)
added on the 2008-07-06 23:42:27 by _-_-__ _-_-__
I am curious to see what will be done with it and how much it will cost.
added on the 2008-07-07 01:27:15 by xernobyl xernobyl
I guess running the software renderer on one (or more if it's threaded well) of the cores would count, and would get them out of any delays :)

I'm interested to see what people will do with this kind of tech too. ATI and nvidia are also supporting on-GPU calculation, but I think it's not going to be popular until there's a common API for all of the various platforms. OpenCL or something similar perhaps?

I'll be amazed if there's a demo that only works on an intel GPU any time soon anyway :D
added on the 2008-07-07 02:02:13 by psonice psonice
x86 cores on their cards. it's a plan, to take over the world. don't you see it. i do, inside my eyes - i seeeeee!!
added on the 2008-07-07 09:10:55 by button button
Quote:
I guess running the software renderer on one (or more if it's threaded well) of the cores would count, and would get them out of any delays :)

sure, unless you count the delays waiting for frames to render in gl/dx apps :)

they've got an enormously steep uphill struggle to beat gpus with this at the gpu's game. they'll need to find applications for it where the flexibility of the cores over gpus adds enough value to get over the huge head start that gpus have in terms of pure brute force speed and bandwidth, not to mention the rate they're increasing that.
i'd quite like to play with one for raytracing or something. maybe some of my spu code would translate well.. :)
added on the 2008-07-07 10:09:27 by smash smash
Maybe they'll announce later on that their cores will not be as numerous as previously mentioned, and that they will be assisted by an old style GPU.. ala cell?
added on the 2008-07-07 10:12:35 by _-_-__ _-_-__
Revival of the software-renderer... that could be really great fun.
I see some VoxelSpace out there... :D

I've been waiting for this for a loooooong time...
added on the 2008-07-07 11:11:51 by doc^21o6 doc^21o6
intel whores :)
added on the 2008-07-07 12:24:19 by Zest Zest
@doc: actually id is currently working on a voxel "software" renderer!
added on the 2008-07-07 14:02:33 by toxie toxie
imho larrabee is only a glimpse at what the future may be in the next decade, as the retail price in 2009 or 2010 will be unaffordable for most people...
added on the 2008-07-07 15:02:34 by Zest Zest
they won't suck energy-wise neither....
added on the 2008-07-07 19:12:26 by thec thec
wouldn't it be cool if intel gave out larrabees for the next intel demo compo so that FAN rules it all! ;)
added on the 2008-07-07 21:33:25 by Zest Zest
it will be interesting to see what larrabee can do in terms of real raytracing (I don't include FAN in real raytracing), where a lot of bandwidth is needed. Like whether the main memory will be accessible from the cores or not, or only by manual dma transferring like in cell, or what.
added on the 2008-07-07 21:47:33 by iq iq
Anyone have an idea of how this will be marketed? I can see its use for some niche markets, but how about mainstream? Or is it not intended to be mainstream at all?

I.e. - if it's intended as a GPU / general accelerator, will it be powerful enough as a GPU to get enough interest? I can see it being good for accelerating video/3d/photoshop type apps, but are the likes of adobe on board? Just wondering where intel are going with it..

I can't see any of this kind of stuff taking off though, until there's either one dominant product (so far it's pretty split), or there's a single way of accessing all the options (like openCL?)
added on the 2008-07-08 00:43:36 by psonice psonice
Can't you just expose its capabilities via something like traditional opengl?
added on the 2008-07-08 07:10:40 by _-_-__ _-_-__
Larrabee is actually the thing that has excited me the most in years. :D

As for the technical details, the best source is probably the wikipedia article about it: http://en.wikipedia.org/wiki/Larrabee_(GPU) .

We know from yesterday's rumours that the cores will be in-order cores derived from the P54C (Pentium 1). However their architecture will be slightly tweaked, as each core comes with a 512-bit vector unit that can do operations on 16 floats at a time. Add to that 4 threads per core via a hyperthreading-like scheme (which is logical: if the cores are in-order and you want to feed a lot of ALUs, the easiest way is to decode more threads in parallel to get independent instructions that can be executed in parallel).

You end up with a massive 128 threads per chip able to perform operations on 16 floats at a time, with 1-cycle L1 cache access and 4-cycle shared L2 cache access (which is quite freaky).

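Just to spell out the arithmetic behind that (assuming the rumoured 32 cores, which isn't confirmed): 32 cores x 4 threads/core = 128 hardware threads, and 128 threads x 16 float lanes = 2048 single-precision operations potentially in flight at once.
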
As a result I guess we will move towards realtime radiosity via raytracing on such a chip sooner than we expect (I hope!). I'm a little worried about the 16-floats-at-a-time though, as it means rays will need to have lots of coherency (= hard to exploit in a GI scenario).

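To illustrate what i mean by coherency: a rough plain-C sketch of how a 16-wide packet would want its rays laid out (structure-of-arrays, one ray per lane; purely speculative, since the actual vector instructions/intrinsics aren't public):

#include <math.h>

#define PACKET_SIZE 16   /* one ray per SIMD lane */

typedef struct {
    float ox[PACKET_SIZE], oy[PACKET_SIZE], oz[PACKET_SIZE];  /* origins */
    float dx[PACKET_SIZE], dy[PACKET_SIZE], dz[PACKET_SIZE];  /* directions (normalised) */
    float t[PACKET_SIZE];                                     /* closest hit so far */
} ray_packet;

/* Intersect a whole packet against one sphere. A 16-wide vector unit would
   ideally turn this loop into a handful of instructions, but only when all
   16 rays actually want the same test -- that's the coherency problem for GI,
   where bounced rays scatter and every lane wants a different bit of the scene. */
void intersect_sphere(ray_packet *r, float cx, float cy, float cz, float rad)
{
    for (int i = 0; i < PACKET_SIZE; ++i) {
        float px = cx - r->ox[i], py = cy - r->oy[i], pz = cz - r->oz[i];
        float b = px * r->dx[i] + py * r->dy[i] + pz * r->dz[i];
        float d = b * b - (px * px + py * py + pz * pz) + rad * rad;
        if (d >= 0.0f) {
            float t = b - sqrtf(d);
            if (t > 0.0f && t < r->t[i])
                r->t[i] = t;
        }
    }
}
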
But yes, wait and see. By the time it comes out, with the delays and stuff, i'll adapt my raytracer for whatever modifications are required to implement a larrabee frontend.
added on the 2008-07-08 09:27:51 by nystep nystep
And the L2 cache has a latency of 10 cycles according to speculation. Sorry.
added on the 2008-07-08 09:31:08 by nystep nystep
Who needs radiosity when you can have path tracing or photon mapping anyway?

I can imagine Larrabee will only be the first step and will be a real success for cheap-ass huge scientific-computing clusters.
And I can't believe it will be based on the old Pentium; it's probably just the design of the cores that is as simple as the old P5 cores. It all sounds like a 32-core Cell with an x86 instruction set...
added on the 2008-07-08 09:53:32 by raer raer
rarefluid, i was mentioning radiosity via raytracing, if you reread my post. I'm thinking about path tracing of course. So before flaming me, read twice. But yes, it's fun to flame me. Go on if there is something else.

It will be much better than Cell because you will have access to 1gb of memory, without doing any wicked DMA shit, via a performant cache system. I'm not a fan of the x86 instruction set, but compatibility rocks.
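
Roughly the difference i mean, sketched in plain C -- the dma_get/dma_wait names are made up (just standing in for the SPU's mfc_get + tag wait, stubbed with memcpy so the sketch is self-contained), and the larrabee side assumes the cache rumours are true:

#include <stddef.h>
#include <string.h>

/* hypothetical stand-ins for the Cell's explicit MFC transfers */
static void dma_get(void *local, const void *remote, size_t bytes) { memcpy(local, remote, bytes); }
static void dma_wait(void) { /* would block on the DMA tag here */ }

/* Cell-style: the local store is tiny, so you stream chunks in by hand. */
float sum_cell_style(const float *remote, size_t n)
{
    enum { CHUNK = 4096 };
    static float ls[CHUNK];          /* pretend local store buffer */
    float acc = 0.0f;
    for (size_t i = 0; i < n; i += CHUNK) {
        size_t count = (n - i < CHUNK) ? (n - i) : CHUNK;
        dma_get(ls, remote + i, count * sizeof(float));
        dma_wait();                  /* real code would double-buffer instead of stalling */
        for (size_t j = 0; j < count; ++j)
            acc += ls[j];
    }
    return acc;
}

/* Larrabee-style (if the coherent-cache rumours hold): just read. */
float sum_cached_style(const float *mem, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += mem[i];               /* the cache hierarchy does the fetching */
    return acc;
}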
added on the 2008-07-08 10:01:32 by nystep nystep
knos: openGL would expose it for graphics stuff, but a chip like this can be used for a lot more (for us, I guess physics, raytracing etc. are the obvious ones, but I bet a lot of funky but un-thought-of stuff would be possible too).

You could do some of that with shaders, but it makes more sense to run code directly and not go through an API designed for something else. So far, nvidia is pushing CUDA, which runs on nvidia gpus only; I think ATI has something similar, but perhaps only for physics?

There's also a new standard called openCL which is designed for this sort of stuff and will hopefully be compatible with everything, but I guess won't be ready for a while yet. I know apple will support it as they're one of the founders of the group making it, but MS will no doubt want their own standard (I vaguely recall hearing of some DirectPhysics type thing, so perhaps it's already in the pipeline), so who knows what will happen on windows.

Anyone have an idea of how larrabee might compare with current GPUs for standard dx/opengl stuff, performance-wise?
added on the 2008-07-08 10:24:59 by psonice psonice
Quote:
it will use the standard x86 instruction set for its shader cores instead of a proprietary graphics-focused instruction set

?
mmhhh? I thought classic shaders have more powerful instruction sets, things like pow(), cross products, vector arithmetic, operating on quads, etc... (well, in high level languages, but this could potentially be implemented in hardware as well.) Can an "x86 instruction set" run that better?
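
To make it concrete -- and this is just my guess at how it would map -- something a shader writes as a single cross(a,b) presumably becomes a few muls and subs per lane in plain C, e.g.:

#define LANES 16

/* one cross product per SIMD lane, structure-of-arrays layout: just 6 muls
   and 3 subs per lane, so nothing an x86-ish core with a wide vector unit
   fundamentally couldn't do -- the question is throughput, not expressiveness */
void cross16(const float ax[LANES], const float ay[LANES], const float az[LANES],
             const float bx[LANES], const float by[LANES], const float bz[LANES],
             float cx[LANES], float cy[LANES], float cz[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        cx[i] = ay[i] * bz[i] - az[i] * by[i];
        cy[i] = az[i] * bx[i] - ax[i] * bz[i];
        cz[i] = ax[i] * by[i] - ay[i] * bx[i];
    }
}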
added on the 2008-07-08 10:51:31 by krabob krabob
psonice, to be totally honest, totally speculative, and based on totally unfounded personal feelings: my guess is that larrabee will perform worse than next generation gfx chips from ATI and NV in DirectX and OpenGL. :D
added on the 2008-07-08 11:15:32 by nystep nystep