Different Types of Parallelism in HPDC part2
// March 19th, 2009 // HPDC
In Part 1 of this article we went over the basics, so read that first if you haven’t already. In this article, we will go into more detail on the different types of parallelism.
Parallelism types from Implicit to Explicit
- Instruction Level parallelism
- Compiler assisted parallelism
- Programmer guided, Compiler assisted parallelism
- Programmer guided, automatic multi-threading
- Multi-threaded programs (manually developed)
- Multi-process applications (manually developed)
Instruction Level parallelism
This is what the processor provides on its own: you compile ordinary serial code, and the microprocessor automatically extracts parallelism from it while it runs. The CPU tries to run independent instructions in parallel. This is a pure hardware solution. Future articles will discuss how instruction level parallelism is achieved and the challenges developers must conquer in order to fully utilize it.
Compiler assisted parallelism
Still in the implicit parallelism category, the compiler and the CPU work together to achieve parallelism. The compiler identifies instructions that can run concurrently while compiling and tags them in the generated opcodes. When the CPU executes the instructions, the tagged ones are run in parallel. The difference here is that the hardware overhead is decreased, because the software takes on some of the load of extracting parallelism. This requires a compiler that is highly optimized for the CPU, for example ICC, the Intel C++ Compiler.
Programmer guided, Compiler assisted parallelism
Here is where we start to lean a little toward explicit parallelism, although it’s still mostly implicit. This is just like compiler assisted parallelism, but the programmer helps the compiler by reorganizing parts of the code. I will be writing an article soon that deals with some cool code reorganization, which will improve your code big time! Usually, this is done through profile driven optimization: a profiler is a tool that shows you the bottlenecks in your code, so you know which parts are worth reorganizing.
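To make the profiling step concrete, here is a minimal sketch in Python using the standard library's cProfile and pstats modules. The function names (slow_sum_of_squares, fast_constant, workload, profile_workload) are made up for illustration and do not come from Part 1. The same idea applies to profilers for compiled languages.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Deliberately heavy loop: this is the bottleneck we expect to see.
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_constant():
    # Trivial function, included only for contrast in the report.
    return 42


def workload():
    fast_constant()
    return slow_sum_of_squares(200_000)


def profile_workload():
    """Run workload() under the profiler and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(workload)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive calls come first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_workload()
    print(report)
```

In the printed report, the rows with the largest cumulative time point at the code that is worth reorganizing first.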
Programmer guided, automatic multi-threading
This one falls more into explicit parallelism, but has some implicit parts as well. The idea here is to use a programming language or a compiler that is extended with special constructs, which the programmer uses to tag the areas of the code that can run concurrently. A good example of this is OpenMP. The reason it’s called automatic multi-threading is that the compiler handles the work of creating, synchronizing, and destroying threads for you. However, the gain is limited to shared-memory architectures.
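OpenMP itself targets C, C++, and Fortran, so the following is only a loose analogy in Python, not OpenMP. It uses ThreadPoolExecutor from the standard library: the programmer marks a loop as having independent iterations, and the library creates, joins, and tears down the threads. The names square and parallel_map are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def square(item):
    # Stand-in for the body of a loop whose iterations are independent.
    return item * item


def parallel_map(items, max_workers=4):
    """Apply square() to every item, letting the pool manage the threads."""
    # The 'with' block creates the worker threads, waits for all tasks
    # to finish, and shuts the threads down on exit.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() hands items to the workers and returns results in input order.
        return list(pool.map(square, items))


if __name__ == "__main__":
    print(parallel_map(range(6)))
```

Note that in standard CPython builds the global interpreter lock prevents threads from speeding up CPU-bound pure-Python code like this, so the sketch shows the programming model rather than a real speedup.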
Multi-threaded programs (manually developed)
Now we are getting into the fully explicit parallelism category. These programs are multi-threaded by hand. Most languages you use will have a way to create and use threads. If you have a multi-core machine, using multiple threads is the natural way to put the different cores to work. The programmer is responsible for the extra work of synchronizing critical sections and avoiding deadlocks and race conditions. For most programmers, it’s challenging to develop good multi-threaded programs.
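As a minimal sketch of what "synchronizing critical sections" means in practice, the Python example below starts several threads by hand and protects a shared counter with a lock. The function name count_with_threads and its parameters are made up for illustration.

```python
import threading


def count_with_threads(num_threads=4, increments_per_thread=10_000):
    """Increment a shared counter from several manually created threads."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments_per_thread):
            # Critical section: the read-modify-write on the shared counter
            # must not interleave with another thread's update.
            with lock:
                counter += 1

    # The programmer creates, starts, and joins every thread explicitly.
    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(count_with_threads())
```

With the lock in place, the result is always num_threads * increments_per_thread. Removing the lock turns the update into a race condition, which is exactly the kind of bug this category leaves for the programmer to prevent.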
Multi-process applications (manually developed)
This is the last type and the most difficult to get right. This is fully explicit parallelism, and the program runs as multiple processes. Typically, each process runs on a different computer (the computers are interconnected). Note that each process may contain multiple threads. Again, it’s up to the programmer to keep track of all the overhead of allocating resources and handling synchronization. This is currently the favored solution on most supercomputing clusters.
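The sketch below shows the multi-process pattern in Python on a single machine: the work is split into chunks, each chunk is handed to a separate process, and the partial results come back as messages on a queue. It assumes a Unix-like system, because it requests the "fork" start method. The names partial_sum and sum_of_squares_multiprocess are hypothetical. On a real cluster the processes would sit on different computers, and a message-passing library such as MPI would usually play the role of the queue.

```python
import multiprocessing as mp


def partial_sum(start, stop, results):
    # Each process computes its own chunk and sends the result back
    # as a message; processes share no memory.
    results.put(sum(i * i for i in range(start, stop)))


def sum_of_squares_multiprocess(n, num_procs=4):
    """Sum i*i for i in range(n), split across num_procs processes."""
    # Assumption: a Unix-like OS where the "fork" start method exists.
    ctx = mp.get_context("fork")
    results = ctx.Queue()

    chunk = (n + num_procs - 1) // num_procs  # ceiling division
    procs = []
    for rank in range(num_procs):
        start = rank * chunk
        stop = min(start + chunk, n)
        p = ctx.Process(target=partial_sum, args=(start, stop, results))
        p.start()
        procs.append(p)

    # Collect one message per process before joining, so that no child
    # blocks while trying to flush its result into the queue.
    total = sum(results.get() for _ in procs)
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(sum_of_squares_multiprocess(1_000))
```

Even in this small sketch the programmer decides how to partition the data, how results travel between processes, and when each process is cleaned up.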
That concludes this article. Remember, the idea of this article was to give you a little info on the different types. Future articles will go even further into some of these types and discuss how we can optimize our code to get some big improvements. Questions or comments are always welcome!



