<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Meseret Gebre &#187; HPDC</title>
	<atom:link href="http://meseretgebre.com/archives/category/hpdc/feed/" rel="self" type="application/rss+xml" />
	<link>http://meseretgebre.com</link>
	<description></description>
	<lastBuildDate>Wed, 16 Nov 2011 04:42:35 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Different Types of Parallelism in HPDC part2</title>
		<link>http://meseretgebre.com/archives/different-types-of-parallelism-in-hpdc-part2/</link>
		<comments>http://meseretgebre.com/archives/different-types-of-parallelism-in-hpdc-part2/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 03:56:56 +0000</pubDate>
		<dc:creator>mez</dc:creator>
				<category><![CDATA[HPDC]]></category>

		<guid isPermaLink="false">http://meseretgebre.com/?p=39</guid>
		<description><![CDATA[In Part1 of this article we went over the basics, so read that first if you haven&#8217;t already. In this article, we will go into more detail about the different types of parallelism. Parallelism types from Implicit to Explicit Instruction Level parallelism Compiler assisted parallelism Programmer guided, Compiler assisted parallelism Programmer guided, automatic multi-threading Multi-threaded [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://meseretgebre.com/?p=36">Part1</a> of this article we went over the basics, so read that first if you haven&#8217;t already. In this article, we will go into more detail about the different types of parallelism.</p>
<p><strong>Parallelism types from Implicit to Explicit </strong></p>
<ol>
<li> Instruction Level parallelism</li>
<li> Compiler assisted parallelism</li>
<li> Programmer guided, Compiler assisted parallelism</li>
<li> Programmer guided, automatic multi-threading</li>
<li> Multi-threaded programs (manually developed)</li>
<li> Multi-process applications (manually developed)</li>
</ol>
<p><strong>Instruction Level parallelism</strong><br />
This is what the processor itself provides. When you compile serial code, the microprocessor automatically extracts parallelism from it: the CPU tries to run independent instructions in parallel. This is a pure hardware solution. Future articles will discuss how instruction level parallelism is achieved and the challenges developers must conquer in order to fully utilize it.</p>
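<p>To picture what the processor is looking for, here is a minimal sketch in C. The function names sum_single and sum_dual are made up for this illustration and are not from any library. Both return the same total, but the second keeps two independent chains of additions that a superscalar CPU is free to overlap. Whether that shows up as a real speedup depends on your CPU and on what the compiler already does for you, so treat it as an illustration rather than a benchmark.</p>

```c
#include <stddef.h>

/* One accumulator: every addition depends on the previous one,
   so the CPU has little independent work to overlap. */
long sum_single(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Two accumulators: the even and odd chains do not depend on each
   other, so the hardware is free to execute them in parallel. */
long sum_dual(const int *a, size_t n)
{
    long even = 0, odd = 0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        even += a[i];
        odd  += a[i + 1];
    }
    if (i < n)              /* leftover element when n is odd */
        even += a[i];
    return even + odd;
}
```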
<p><strong>Compiler assisted parallelism</strong><br />
Still in the implicit parallelism category, the compiler and the CPU work together to achieve parallelism. The compiler identifies concurrent instructions while compiling and tags them with OP codes. When the CPU executes the instructions, the tagged ones are run in parallel. The difference here is that the hardware overhead is decreased because the software takes on some of the work of extracting parallelism. This requires a compiler that is highly optimized for the CPU, for example ICC, the Intel C++ Compiler.</p>
<p><strong>Programmer guided, Compiler assisted parallelism</strong><br />
Here is where we start to lean a little toward explicit parallelism, but it&#8217;s still mostly implicit. This is just like compiler assisted parallelism, except that the programmer helps the compiler by reorganizing parts of the code. I will be writing an article soon that deals with some cool code reorganization, which will improve your code big time! Usually, this is done through profile-driven optimization. Profilers are tools that can show you the bottlenecks in your code.</p>
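<p>As one small, hedged example of the programmer helping the compiler, the sketch below uses the C99 restrict qualifier. This is my own pick for illustration and not necessarily the reorganization the upcoming article will cover; the function names are hypothetical. Both functions compute the same thing, but in the second one the programmer promises that the two arrays never overlap, which gives the compiler more freedom to reorder or vectorize the loop.</p>

```c
#include <stddef.h>

/* Without restrict, the compiler must assume dst and src may overlap,
   which can stop it from reordering or vectorizing the loop. */
void scale_may_alias(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* restrict (C99) is the programmer's promise that the two arrays
   never overlap, so every iteration is independent of the others. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

<p>Whether the compiler actually takes advantage of the promise depends on the compiler and its optimization flags, so check with a profiler before and after.</p>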
<p><strong>Programmer guided, automatic multi-threading</strong><br />
This one falls more on the explicit side, but it has some implicit parts as well. The idea here is to use a programming language or a compiler that is extended with special constructs the programmer can use to tag the areas of code that are concurrent. A good example of this is OpenMP. It&#8217;s called automatic multi-threading because the compiler handles the work of creating, synchronizing, and destroying threads for you. However, the scope of the gain is limited to shared-memory architectures.</p>
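<p>Here is a minimal sketch of what that tagging can look like with OpenMP in C. The function name dot is just for illustration. The single pragma line is the only thing the programmer adds; the thread handling is left to the compiler and its runtime.</p>

```c
/* Dot product with an OpenMP work-sharing loop. The pragma marks the
   loop as concurrent; the compiler and runtime create, synchronize,
   and destroy the threads. reduction(+:sum) gives each thread its own
   partial sum and adds the partial sums together at the end. */
double dot(const double *x, const double *y, int n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}
```

<p>With GCC you would build this with the -fopenmp flag. Without that flag the pragma is ignored and the loop simply runs serially with the same result.</p>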
<p><strong>Multi-threaded programs (manually developed)</strong><br />
Now we are getting into the fully explicit parallelism category. These programs are multi-threaded by hand. Most languages you use will have a way to create and use threads. If you have a multi-core machine, using multiple threads is the natural way to put the different cores to work. The programmer is responsible for the extra overhead of synchronizing critical sections and avoiding deadlocks and race conditions. For most programmers, developing good multi-threaded programs is challenging.</p>
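<p>The sketch below shows the kind of bookkeeping the programmer takes on, assuming a POSIX system with pthreads. The names worker and run_counter_threads, and the thread and loop counts, are made up for this example. Several threads update one shared counter, and a mutex protects the critical section so the updates do not race.</p>

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS  10000

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds to the shared counter. The mutex guards the
   critical section; without it the updates would race. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts the threads, waits for all that started, and returns the
   final count, or -1 if a thread could not be created. */
long run_counter_threads(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    counter = 0;
    while (started < NUM_THREADS &&
           pthread_create(&threads[started], NULL, worker, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return started == NUM_THREADS ? counter : -1;
}
```

<p>On most systems you build this with the -pthread compiler flag. Taking the lock on every increment keeps the example short, but it also serializes the threads, which is exactly the kind of overhead you have to think about when writing threads by hand.</p>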
<p><strong>Multi-process applications (manually developed)</strong><br />
This is the last type and the most difficult to get right. This is fully explicit parallelism, and the program typically runs as multiple processes. Each process runs on a different computer (the computers are interconnected). Note that each process may contain multiple threads. Again, it&#8217;s up to the programmer to keep track of all the overhead of allocating resources and handling synchronization. This is currently the favored solution on most supercomputing clusters.</p>
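<p>A real cluster program would normally use a message-passing library such as MPI to talk between computers. To keep the example runnable on one machine, the sketch below is only a stand-in: it assumes a POSIX system and uses fork and a pipe, so both processes live on the same computer. The function name sum_two_processes is made up. The important part is the same as on a cluster: the processes do not share memory, so the partial result has to be sent back explicitly.</p>

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Splits the sum of a[0..n-1] across two processes. The child sums the
   second half and sends its result back through a pipe, because the
   two processes do not share memory. Returns 0 on success, -1 on error. */
int sum_two_processes(const int *a, int n, long *result)
{
    int fd[2];
    int half = n / 2;

    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {                       /* child process */
        long part = 0;
        close(fd[0]);
        for (int i = half; i < n; i++)
            part += a[i];
        ssize_t w = write(fd[1], &part, sizeof part);
        close(fd[1]);
        _exit(w == (ssize_t)sizeof part ? 0 : 1);
    }

    /* parent process */
    long mine = 0, theirs = 0;
    close(fd[1]);
    for (int i = 0; i < half; i++)
        mine += a[i];
    ssize_t r = read(fd[0], &theirs, sizeof theirs);
    close(fd[0]);
    waitpid(pid, NULL, 0);                /* reap the child */
    if (r != (ssize_t)sizeof theirs)
        return -1;
    *result = mine + theirs;
    return 0;
}
```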
<p>That concludes this article. Remember, the idea of this article was to give you a little info on the different types. Future articles will go even further into some of these types and discuss how we can optimize our code to get some big improvements. Questions or comments are always welcome!</p>
]]></content:encoded>
			<wfw:commentRss>http://meseretgebre.com/archives/different-types-of-parallelism-in-hpdc-part2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Different Types of Parallelism in HPDC part1</title>
		<link>http://meseretgebre.com/archives/different-types-of-parallelism-in-hpdc/</link>
		<comments>http://meseretgebre.com/archives/different-types-of-parallelism-in-hpdc/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 03:48:40 +0000</pubDate>
		<dc:creator>mez</dc:creator>
				<category><![CDATA[HPDC]]></category>

		<guid isPermaLink="false">http://meseretgebre.com/?p=36</guid>
		<description><![CDATA[There are many types of parallelism in HPDC, but in general the term is typically used to describe the idea of performing two or more tasks at the same time using different computational devices. These computational devices can be: A different computer that is networked or interconnected Another microprocessor on the same computer Another core on [...]]]></description>
			<content:encoded><![CDATA[<p>There are many types of parallelism in HPDC, but in general the term is typically used to describe the idea of performing two or more tasks at the same time using different computational devices. These computational devices can be:</p>
<ol>
<li> A different computer that is networked or interconnected</li>
<li> Another microprocessor on the same computer</li>
<li> Another core on a microprocessor</li>
<li> A separate ALU or part of an ALU on a processor</li>
</ol>
<p>This idea of parallelism is extended and we have different types of parallelism. Imagine a spectrum if you will. In this spectrum, one extreme end will be implicit parallelism and the other extreme is explicit parallelism.</p>
<p>Implicit parallelism in its basic form is parallelism that is automatically or semi-automatically realized, meaning the microprocessor and the compiler work together to extract parallelism. Note that the scope of what you gain is limited to a single processor or a single machine. Explicit parallelism is the other extreme, so parallelism here is realized manually. The programmer is responsible for developing the program to run in parallel. Typically, anything worthwhile requires considerable programming effort. However, the gain can be scoped to multiple machines or supercomputing clusters!</p>
<p>You might be asking, why is explicit parallelism so difficult to code? If you are not asking yourself this question, pretend you just did! It&#8217;s tough because most of the time the serial program that you created is hard to break up into multiple small chunks of code. Basically, parallelism strives to effectively utilize concurrency in code to reduce the overall computational time required to complete processing.</p>
<p>By concurrency, I am talking about the lack of dependency or relationship between two tasks. Identifying concurrency is what makes explicit parallelism so hard. If you understood everything up to this point, then you have the basics of parallelism and its challenges. <a href="http://meseretgebre.com/2009/03/different-types-of-parallelism-in-hpdc-part2/">Part2</a> of this article will discuss the different types of parallelism, the stuff between the two ends of the imaginary spectrum we described earlier.</p>
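<p>Here is a small sketch of that idea in C, with made-up function names. In the first loop no iteration depends on another, so the iterations are concurrent and could be handed to different cores. In the second loop every iteration needs the result of the one before it, so it cannot be split up as written.</p>

```c
#include <stddef.h>

/* Concurrent: iteration i only touches a[i], so the iterations have
   no dependency on one another and could run in any order. */
void scale_in_place(double *a, size_t n, double k)
{
    for (size_t i = 0; i < n; i++)
        a[i] *= k;
}

/* Not concurrent as written: iteration i needs the value that
   iteration i - 1 just produced (a loop-carried dependency). */
void running_total(double *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```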
]]></content:encoded>
			<wfw:commentRss>http://meseretgebre.com/archives/different-types-of-parallelism-in-hpdc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>High Performance Distributed Computing</title>
		<link>http://meseretgebre.com/archives/high-performance-distributed-computing/</link>
		<comments>http://meseretgebre.com/archives/high-performance-distributed-computing/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 03:12:22 +0000</pubDate>
		<dc:creator>mez</dc:creator>
				<category><![CDATA[HPDC]]></category>
		<category><![CDATA[Optimize]]></category>

		<guid isPermaLink="false">http://meseretgebre.com/?p=20</guid>
		<description><![CDATA[High performance distributed computing, also known as HPDC, is the effective development and use of software on distributed memory supercomputing clusters. Developing code that is optimized is a challenging task for many reasons. One issue is the semantic gap between the processor and the language you choose to develop with. The more abstract the language, [...]]]></description>
			<content:encoded><![CDATA[<p>High performance distributed computing, also known as HPDC, is the effective development and use of software on distributed memory supercomputing clusters. Developing code that is optimized is a challenging task for many reasons. One issue is the semantic gap between the processor and the language you choose to develop with. The more abstract the language, the less performance you realize from the processor. Also, the resources of a computing cluster are often underutilized. This is because it is challenging to extract the areas of code that can be run concurrently. Most of the time jobs are run in serial.</p>
<p>I have a big interest in HPDC, and this section of my blog is all about it. Most of my examples are in C unless stated otherwise. I hope you learn from this section, and don&#8217;t hesitate to ask questions!</p>
]]></content:encoded>
			<wfw:commentRss>http://meseretgebre.com/archives/high-performance-distributed-computing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bitwise Operators</title>
		<link>http://meseretgebre.com/archives/bitwise-operators/</link>
		<comments>http://meseretgebre.com/archives/bitwise-operators/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 02:48:21 +0000</pubDate>
		<dc:creator>mez</dc:creator>
				<category><![CDATA[HPDC]]></category>
		<category><![CDATA[Bitwise operators]]></category>
		<category><![CDATA[Optimize]]></category>

		<guid isPermaLink="false">http://meseretgebre.com/?p=8</guid>
		<description><![CDATA[Bitwise operators let you manipulate individual bits. Normally I would not bother writing about this, but I believe knowing about bitwise operations is an important asset when trying to optimize code, especially its memory consumption. Keep in mind that all bitwise operations in C only work on signed and unsigned integer primitive [...]]]></description>
			<content:encoded><![CDATA[<p>Bitwise operators let you manipulate individual bits. Normally I would not bother writing about this, but I believe knowing about bitwise operations is an important asset when trying to optimize code, especially its memory consumption. Keep in mind that all bitwise operations in C only work on signed and unsigned integer primitive data types, mainly:</p>
<ol>
<li>char</li>
<li>short</li>
<li>int</li>
<li>long</li>
</ol>
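<p>To make the memory point a bit more concrete, here is a minimal sketch that stores two small values (each in the range 0 to 15) in a single byte instead of two. The function names are hypothetical, and the code uses the shift, AND and OR operators that are explained later in this article.</p>

```c
/* Stores two values in the range 0..15 in a single byte:
   high goes into the top four bits, low into the bottom four. */
unsigned char pack_nibbles(unsigned char high, unsigned char low)
{
    return (unsigned char)(((high & 0x0F) << 4) | (low & 0x0F));
}

/* Reads the top four bits back out. */
unsigned char high_nibble(unsigned char packed)
{
    return (unsigned char)(packed >> 4);
}

/* Reads the bottom four bits back out. */
unsigned char low_nibble(unsigned char packed)
{
    return (unsigned char)(packed & 0x0F);
}
```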
<p><strong>Article outline</strong><br />
I&#8217;ll talk about each of the following operations and give some examples of how to use each.</p>
<ol>
<li>Bitwise NOT (~)</li>
<li>Bitwise AND (&amp;)</li>
<li>Bitwise OR (|)</li>
<li>Bitwise XOR (^)</li>
<li>Shift Left (&lt;&lt;)</li>
<li>Shift Right (&gt;&gt;)</li>
</ol>
<p><strong>Bitwise NOT (~)</strong><br />
This is a unary operator. It simply flips all 0s to 1s and all 1s to 0s. Here is a simple example:</p>
<pre class="brush: c">
unsigned char x = 10;   /*  x = 00001010 */
unsigned char y = ~x;   /* ~x = 11110101 */
</pre>
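<p>One thing to watch for in C is that ~ promotes a small type like unsigned char to int before flipping the bits, so the raw result is a negative int, not an 8-bit value. The sketch below (function names made up for illustration) shows the cast that brings the result back to 8 bits, plus a common use of NOT: building the inverse of a mask in order to clear bits.</p>

```c
/* ~ promotes its operand to int first, so ~x on an unsigned char is a
   negative int; the cast brings the result back to 8 bits. */
unsigned char invert8(unsigned char x)
{
    return (unsigned char)~x;
}

/* Clears the bits selected by mask: ~mask has 0s exactly where mask
   has 1s, so the AND keeps every other bit unchanged. */
unsigned char clear_bits(unsigned char value, unsigned char mask)
{
    return (unsigned char)(value & ~mask);
}
```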
<p><strong>Bitwise AND (&amp;)</strong><br />
This is a binary operator. Just follow the truth table: both input bits have to be 1 to get a 1 in the result, and anything else gives 0. Here is a simple example:</p>
<pre class="brush: c">
unsigned char x = (10 &amp; 8);
/* 10 = 00001010 */
/*  8 = 00001000 */
/*  x = 00001000 */
</pre>
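<p>A typical use of AND is to test whether a particular bit is set. The helper names below are just for illustration.</p>

```c
/* Returns 1 if bit number `bit` (0 = least significant) is set in
   value, otherwise 0. Assumes bit is smaller than the width of
   unsigned int. */
int bit_is_set(unsigned int value, unsigned int bit)
{
    return (value & (1u << bit)) != 0;
}

/* Odd numbers always have their lowest bit set. */
int is_odd(unsigned int value)
{
    return (int)(value & 1u);
}
```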
<p><strong>Bitwise OR (|)</strong><br />
This is a binary operator. Just follow the truth table: here, at least one of the two input bits has to be 1 to get a 1 in the result. Here is a simple example:</p>
<pre class="brush: c">
unsigned char x = (10 | 8);
/* 10 = 00001010 */
/*  8 = 00001000 */
/*  x = 00001010 */
</pre>
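<p>OR is the usual way to switch bits on. In this sketch the permission names and the grant function are invented for the example; each flag owns one bit, and OR combines them into a single byte.</p>

```c
/* Hypothetical permission flags, one bit each. */
enum {
    PERM_READ  = 1,   /* 00000001 */
    PERM_WRITE = 2,   /* 00000010 */
    PERM_EXEC  = 4    /* 00000100 */
};

/* OR turns on the requested bits and leaves the rest as they were. */
unsigned char grant(unsigned char perms, unsigned char extra)
{
    return (unsigned char)(perms | extra);
}
```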
<p style="text-align: left;"><strong>Bitwise XOR (^)</strong><br />
This is a binary operator. Just follow the truth table. Basically it flags all bits that are different. Here is a simple example:</p>
<pre class="brush: c">
unsigned char x = (10 ^ 15);
/* 10 = 00001010 */
/* 15 = 00001111 */
/*  x = 00000101 */
</pre>
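<p>Two small uses of XOR, sketched with made-up function names: flipping selected bits (applying the same mask twice gets you back where you started), and counting how many bit positions differ between two bytes.</p>

```c
/* XOR with a mask flips exactly the masked bits. */
unsigned char toggle_bits(unsigned char value, unsigned char mask)
{
    return (unsigned char)(value ^ mask);
}

/* Counts how many bit positions differ between a and b. */
int differing_bits(unsigned char a, unsigned char b)
{
    unsigned char diff = (unsigned char)(a ^ b);  /* 1 where inputs disagree */
    int count = 0;
    while (diff) {
        count += diff & 1;
        diff >>= 1;
    }
    return count;
}
```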
<p><strong>Shift Left (&lt;&lt;)</strong><br />
Keep in mind that when you do one of the shift operators the size of the data type does not change.<br />
A left shift moves the bits to the left, inserting a 0 in the rightmost bit for every shift. The leftmost bit is discarded. Each shift effectively multiplies the number by 2. Note that if you shift too far you could end up with a negative number. This is possible if the data type is signed and interpreted as 2&#8217;s complement, because a 1 can land in the sign bit (strictly speaking, the C standard leaves that case undefined, so prefer unsigned types when shifting). Here is a simple example:</p>
<pre class="brush: c">
unsigned char x = 4 &lt;&lt; 2;
/* 4 = 00000100 */
/* x = 00010000 (16) */
</pre>
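<p>The sketch below, with hypothetical function names, shows the multiply-by-a-power-of-two reading of the left shift, and what the fixed size of the data type means in practice: bits pushed past the top of an 8-bit value are lost.</p>

```c
/* x << n multiplies x by 2 to the power n, as long as the result
   still fits in the type. Assumes n is smaller than the bit width
   of unsigned int. */
unsigned int times_pow2(unsigned int x, unsigned int n)
{
    return x << n;
}

/* In an 8-bit type the bits shifted past the top are simply lost.
   Assumes n is at most 8. */
unsigned char shl8(unsigned char x, unsigned int n)
{
    return (unsigned char)(x << n);
}
```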
<p><strong>Shift Right (&gt;&gt;)</strong><br />
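A right shift moves the bits to the right, inserting a 0 in the leftmost bit for every shift. The rightmost bit is discarded. Each shift effectively divides the number by 2, dropping any remainder. Note that if you use too many shifts you will end up with zero. For negative signed values, whether 0s or copies of the sign bit are inserted is up to the compiler, so the description here applies to unsigned values. Here is a simple example:</p>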
<pre class="brush: c">
unsigned char x = 4 &gt;&gt; 2;
/* 4 = 00000100 */
/* x = 00000001 (1) */
</pre>
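<p>Besides dividing by powers of two, a right shift combined with AND is a handy way to pull one byte out of a larger value. This is a small sketch with invented function names, using the fixed-width types from stdint.h.</p>

```c
#include <stdint.h>

/* For unsigned values, x >> n divides x by 2 to the power n and drops
   the remainder. Assumes n is smaller than the bit width. */
unsigned int div_pow2(unsigned int x, unsigned int n)
{
    return x >> n;
}

/* Pulls byte number `index` (0 = lowest, 3 = highest) out of a 32-bit
   value: shift it down to the bottom, then mask off everything above. */
uint8_t byte_at(uint32_t value, unsigned int index)
{
    return (uint8_t)((value >> (8u * index)) & 0xFFu);
}
```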
<p>That wraps up the bitwise operators. If there is anything I forgot, let me know and I&#8217;ll make the updates. I hope you learned something from this read. Remember, these are operations that the processor can execute very fast. For example, if you ever need to multiply or divide by 2, it is best to use the shift operations. Not only will your code execute faster, but you will also save your processor some energy!</p>
]]></content:encoded>
			<wfw:commentRss>http://meseretgebre.com/archives/bitwise-operators/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

