Parallel Programming
#1
Posted 23 February 2007 - 12:13 PM
With the recent news about Intel showing off its Terascale 80-core chip, and the latest NVidia GPUs having 128 cores, all of which are becoming more general purpose by the week (so it seems), I decided to have a look at the parallel programming literature... seems like nobody has really found a convincing way yet of automatically parallelising an arbitrary program.
I've read about stuff like SDKs that provide pre-programmed parallelism, and transactional programming at a hardware level... although both sound useful, neither of these approaches seems to fit the bill when considering systems that may contain thousands or even millions of processing cores.
I've also read something somewhere that said progress in computing (at least from a software perspective) could grind to a halt five years from now if nobody can find a solution to this problem. There wouldn't seem to be much point in providing thousands of processing cores if nobody can figure out how to program them.
So what do you guys think about such things? Do you think it's insurmountable? Are most programs just inherently serial and can't be realistically parallelised? Or has someone got some cunning ideas?
#2
Posted 23 February 2007 - 01:15 PM
karligula said:
Err right? Where did you hear that? Probably a sloppy journalist.
They have 128 "Stream Processors". They are not equivalent to a core at all!
Even the groups of them (of which it has only 8) aren't equivalent to a separate core because of the lack of control.
The 80-core Intel chip isn't really usable now either, just a proof of concept.
I don't understand why you post these vague (shallow) questions. You'd get more insight reading a book, or looking on Wikipedia, or just reading the first few search results on Google.
#3
Posted 23 February 2007 - 02:06 PM
Dave, I think we're just here for different reasons, and if you don't like my questions then please just don't reply to them.
#4
Posted 23 February 2007 - 03:03 PM
Quote
So in order for you to fish out opinion, first you must 'seed' an opinion. True, you have 'provoked' some of us with your question(s), but what about you? What do you think is going to happen in 5 years with respect to parallel processing? Do you think such problems are insurmountable?
I don't think anything is insurmountable, given time. I don't think everything can be parallel-ized to the point that there would be 1 core per 1 binary instruction. I believe that we're moving in the right direction with distributed systems, among other ways to achieve parallelism.
#5
Posted 23 February 2007 - 03:48 PM
By shallow I mean you haven't shown any depth in understanding the subject. Not least with the "facts" you presented. The questions you ask aren't really debatable. They were solved a long time ago. Sure, 'average' programmers don't know how to do it, but they need to learn.
This isn't a new topic; highly parallel machines have existed for decades. They just weren't available on the desktop.
It's well known that it's difficult to achieve more than 25% efficiency on a parallel machine. The software effort to make things go faster often isn't worth it. By the time you've made it more efficient, you could have just bought a faster computer. So far this hasn't changed. It might in future.
Here's an interesting read about Valve's approach
#6
Posted 23 February 2007 - 03:56 PM
karligula said:
Nope, and I doubt this will ever (well, for a long time) be possible to do automatically for arbitrary programs... it would require too much 'intelligence' on the part of a tool. Programmers will have to write their programs for parallelism, not write them serially and depend on tools to make them parallel. It's kind of like writing a program in procedural C and wanting a tool to convert it to well-designed object-oriented code... not going to happen.
Quote
Many programs that exist today are indeed inherently serial. However, the problems that those programs solve frequently aren't. It's possible to write many, many things in a parallel way, but you may have to spend some time thinking about it to see how.
Quote
You hit the nail on the head here. Most programmers still aren't used to thinking parallel and this is one of the major obstacles to concurrent program development. I expect concurrent thinking will eventually be taught in intro CS courses and programming books as a basic concept like loops and functions. (And built-in language support for concurrency, like Java has to some extent, will go a long way toward making this easier.)
Of course...most computer users still just run one application at a time and use their computer for word processing, browsing the web, watching videos, and so forth. Multiprocessors are never going to be a great advantage here; they can help a bit, but in these cases the majority of the CPU time is spent waiting for the user anyway. (At least until we get AI operating systems capable of speech and gesture processing!)
#7
Posted 23 February 2007 - 05:43 PM
DonBerto... give a little get a little.... that's a good point well taken, I'll try to elucidate my thinking a bit more.
Right, so given an arbitrary program, what are the difficulties involved in having the compiler scan the code, figuring out which bits are dependent on others, which bits are independent and thus capable of being parallelised, and inserting sequence points to make sure everything stays synchronised?
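To make the dependency question concrete, here is a minimal C++ sketch of the two cases a compiler would have to tell apart. The function names `scale_all` and `running_total` are made up for illustration; they are not from any library.

```cpp
#include <cstddef>
#include <vector>

// Independent iterations: out[i] depends only on in[i], so the
// iterations could run in any order, or on different cores.
std::vector<int> scale_all(const std::vector<int>& in, int factor) {
    std::vector<int> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] * factor;
    }
    return out;
}

// Loop-carried dependency: each iteration reads the value of `sum`
// left behind by the previous one, so the iterations of this loop
// cannot simply be handed out to different cores as written.
std::vector<int> running_total(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int sum = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        sum += in[i];
        out[i] = sum;
    }
    return out;
}
```

The first loop is the easy case. The second can still be computed in parallel, but only by switching to a different algorithm (a parallel prefix sum), which is a much bigger ask of a tool than spotting that iterations don't touch each other's data.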
Would creating programs using a much finer granularity of objects help, where the objects can run independently and the cores would communicate the object messages? Could that be automatically synchronised?
Perhaps there might be language constructs to help, eg:
parallel for(x=0;x<1000;x++)
{
    parallel for(y=0;y<1000;y++)
    {
        dosomehorrendouslycomplicatedtask(x,y);
    }
}
which is a simple example that can obviously be done manually by launching threads but a language construct would make it simpler.
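For comparison, the manual version might look roughly like the C++ sketch below. `parallel_for_2d` is a hypothetical helper, not a standard or library function, and it assumes every call to the task is independent of the others.

```cpp
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs task(x, y) for every pair in
// [0, width) x [0, height), splitting the x range across threads.
// Assumes the individual task calls do not depend on each other.
void parallel_for_2d(int width, int height, unsigned num_threads,
                     const std::function<void(int, int)>& task) {
    if (num_threads == 0) num_threads = 1;  // guard against a zero thread count
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        // Worker t handles x = t, t + num_threads, t + 2 * num_threads, ...
        workers.emplace_back([=, &task] {
            const int step = static_cast<int>(num_threads);
            for (int x = static_cast<int>(t); x < width; x += step) {
                for (int y = 0; y < height; ++y) {
                    task(x, y);
                }
            }
        });
    }
    // Joining plays the role of the sequence point: the caller does not
    // continue until every worker has finished.
    for (std::thread& w : workers) w.join();
}
```

The loop above would then become something like `parallel_for_2d(1000, 1000, std::thread::hardware_concurrency(), dosomehorrendouslycomplicatedtask);`. Only the outer loop is split here, since 1000 rows is already plenty of work to share among a handful of cores.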
Anyway I'm writing this at the office, my brain is shutting down faster than Windows ever has and it's time for home... ;-)
#8
Posted 23 February 2007 - 11:36 PM
I wrote a couple of articles on multi-core for games, but focusing more on the current generation of 2-4 core CPUs.
http://cowboyprogram...ore-processors/
http://cowboyprogram...article-sytems/
#9
Posted 23 February 2007 - 11:55 PM
dave_ said:
Your 25% number is a little specious. Game engines generally process a large number of atomic objects, which makes them a great candidate for parallelization, especially in physics and graphics. AI is harder, but you can also pipeline the work when moving from 1 to 2 cores. You could easily get a 50% improvement at least on a game engine in less than a month's work.
#10
Posted 24 February 2007 - 01:28 AM
#11
Posted 24 February 2007 - 03:42 AM
karligula said:
parallel for(x=0;x<1000;x++)
{
    parallel for(y=0;y<1000;y++)
    {
        dosomehorrendouslycomplicatedtask(x,y);
    }
}
However, using this feature has a cost: the compiler needs ages to compile if you enable this optimization feature, and to get good results out of it you have to hint the compiler about the loop in various ways (minimum trip count, trip count is always a multiple of x, etc.).
Nils
#12
Posted 24 February 2007 - 04:56 PM
http://msdn.microsof...s/05/10/OpenMP/
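// Each x[i] is computed from y only, so the iterations are independent.
// Note that y[i+1] is read at i = size - 1, so y needs at least size + 1 elements.
#pragma omp parallel
{
    #pragma omp for
    for(int i = 1; i < size; ++i)
        x[i] = (y[i-1] + y[i+1])/2;
}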
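As a rough, self-contained variant of the snippet above: the sketch below wraps the same loop in a function. The name `average_neighbours` is made up, and it assumes the two endpoints should simply keep their original values so that nothing is read past the end of `y`. Built without OpenMP support the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <vector>

// Hypothetical wrapper around the loop above. Every interior element
// becomes the average of its two neighbours in y; the endpoints are
// copied through unchanged (an assumption, not from the article).
std::vector<double> average_neighbours(const std::vector<double>& y) {
    std::vector<double> x(y);  // start from a copy so endpoints are preserved
    const long n = static_cast<long>(y.size());
    // Iterations only read y and write distinct x[i], so they are independent.
    #pragma omp parallel for
    for (long i = 1; i < n - 1; ++i) {
        x[i] = (y[i - 1] + y[i + 1]) / 2;
    }
    return x;
}
```

With GCC or Clang the pragma only takes effect when compiling with `-fopenmp`.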
#13
Posted 25 February 2007 - 03:45 PM
#14
Posted 25 February 2007 - 06:10 PM
Cowboy Coder said:
That's 25% of max theoretical throughput.
Even the very fastest supercomputers in the world with specially crafted software only achieve around 80%.
I wasn't talking about 1-2 processors; I was talking about thousands, like the OP wanted to talk about.
#15
Posted 25 February 2007 - 06:33 PM
karligula said:
The trouble is those facts were what you used to formulate your hypothesis, so it's not just pedantry.
Well, I've done a few courses on parallel processing, even worked on projects (in a paid job) where the server was on a cluster and worked on PS2/PS3, Xbox360, so I do have a little insight.
The trouble is, this is a huge subject. What can I hope to say of any worth in a few sentences here?
There are many good books. I'd suggest you read one. To start with I'd recommend Tanenbaum, but it is a bit dry. It's a good reference when combined with some other forms/sources of learning.
GPGPU stuff is great... I seriously doubt they'll become fully programmable cores. If they were, why would you not just use a second main CPU? More likely the main CPU will gain special vector units, much as the FPU, a separate coprocessor in the 286/386 days, was later integrated into the CPU.
I'm sure they'll keep their specialist pipelines and complement traditional CPUs for many years to come.
However, what if we were to take it further and put an FPGA into a PC? Now that would be interesting. You could re-configure it for every app for a special pipeline and get some interesting parallel computation done.
#16
Posted 25 February 2007 - 07:44 PM
dave_ said:
That's 25% of max theoretical throughput.
Even the very fastest supercomputers in the world with specially crafted software only achieve around 80%.
I wasn't talking about 1-2 processors; I was talking about thousands, like the OP wanted to talk about.
Data center code is not the same as game code. And it's worth noting that even essentially single-core processor systems like the PS2 never get anywhere near their theoretical FLOPS throughput, mostly due to cache issues.
Of course there are problems trying to reach the maximum efficiency. It will vary by application. 25% just seems rather low - perhaps that is reflecting just attempting to scale an existing single threaded application. A rendering engine targeting a 1024+ core machine should be able to get way higher than that.
#17
Posted 26 February 2007 - 12:47 AM
dave_ said:
It looks to me like things are heading that way. You must know about CUDA right? And it wouldn't make them just another processor by any means. They're stream processors, with their own strengths and weaknesses.
dave_ said:
You mean other than MMX, 3DNow!, SSE, SSE2 and SSE3...?
#18
Posted 26 February 2007 - 09:03 AM
Razor said:
Razor said: