Categories
Introduction to Parallel Programming Online Courses

Introduction to Parallel Programming – Week 2

After learning the GPU programming model and writing a basic CUDA program, Week 2 introduces some concepts for efficient algorithms.

Communication is the first issue: how do threads communicate efficiently? This is easier in some problems (e.g. Map) than in others (e.g. Scan or Sort). The memory read and write patterns of some of the key algorithm types are discussed. Input-to-output relationships:

  • Map: one-to-one
  • Gather: many-to-one
  • Stencil: several-to-one
  • Reduce: all-to-one
  • Scatter: one-to-many
  • Scan: all-to-all
  • Sort: all-to-all
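To make a few of these relationships concrete, here is a small sequential sketch using ordinary Scala collections. It only shows the shape of each pattern, not how a GPU would execute it; the object and value names are my own.

```scala
object AccessPatterns {
  val in: Vector[Int] = Vector(1, 2, 3, 4, 5)

  // Map: each output depends on exactly one input (one-to-one).
  val mapped: Vector[Int] = in.map(x => x * 2)

  // Stencil: each output reads a fixed neighbourhood of inputs (several-to-one).
  // Here each output is the sum of a window of three adjacent elements.
  val stencil: Vector[Int] = in.sliding(3).map(_.sum).toVector

  // Reduce: every input contributes to a single output (all-to-one).
  val reduced: Int = in.reduce(_ + _)

  // Scan: output i depends on all inputs up to and including i (running sum).
  val scanned: Vector[Int] = in.scanLeft(0)(_ + _).tail
}
```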

The basic problem of concurrent memory access was illustrated via a scatter example. With each input trying to write a third of its value to the neighboring elements, we can see that trouble will result from independent thread execution.

one-to-many memory writes will pose a problem with independent threads writing over the same memory locations
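A sequential sketch of the same idea, assuming a one-dimensional array where each element gives a third of its value to itself and its two neighbours. The scatter version has several loop iterations writing to the same output cell, which is exactly what would race if each iteration were an independent thread. The gather version computes the same result with one writer per output cell. Function names are hypothetical.

```scala
object ScatterVsGather {
  // Scatter view: element i adds a third of its value to out(i-1), out(i), out(i+1).
  // Several iterations write to the same out(j); as concurrent threads they would conflict.
  def scatter(xs: Vector[Double]): Vector[Double] = {
    val out = Array.fill(xs.length)(0.0)
    for (i <- xs.indices; j <- (i - 1) to (i + 1) if j >= 0 && j < xs.length)
      out(j) += xs(i) / 3
    out.toVector
  }

  // Gather view: output j reads its neighbours and is written exactly once.
  def gather(xs: Vector[Double]): Vector[Double] =
    xs.indices.toVector.map { j =>
      ((j - 1) to (j + 1))
        .filter(k => k >= 0 && k < xs.length)
        .map(k => xs(k) / 3)
        .sum
    }
}
```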

To overcome these barriers, threads must communicate in some way. Shared memory and synchronization points were described as tools for this job.

Some important information about CUDA and what it guarantees (and doesn’t guarantee) about thread execution was also touched on:

  • No guarantees about when and where thread blocks will run – this is an enabler for massive parallelism. It also brings some limitations: no assumptions can be made about which blocks will run on which SM, and there can be no direct communication between blocks.
  • Does guarantee – all threads in a block are guaranteed to run on the same SM at the same time.
  • Does guarantee – all blocks in a kernel finish before blocks from the next kernel run

A good source for some of the hardware diagrams and acronyms: www.cs.nyu.edu/manycores/cuda_many_cores.pdf

Finally, coalescing memory access was discussed as a very good practice for efficiency.

The good, the not so good, and the bad types of memory access
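One rough way to see why the access pattern matters is to count how many memory segments a group of threads touches in one read. The figures below (32 threads, 4-byte elements, 128-byte segments) and all names are assumptions for illustration, not numbers from the lectures.

```scala
object Coalescing {
  // Assumed figures for illustration only.
  val GroupSize = 32      // threads reading together
  val ElemBytes = 4       // size of one element
  val SegmentBytes = 128  // size of one memory segment

  // Number of distinct segments touched when thread t reads element t * stride.
  def segmentsTouched(stride: Int): Int =
    (0 until GroupSize)
      .map(t => (t * stride * ElemBytes) / SegmentBytes)
      .distinct
      .size
}
```

Under these assumptions, adjacent access (stride 1) touches a single segment, while a stride of 32 elements puts every thread in a different segment.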

Week 2 lectures

Week 2 Assignment

Categories
Introduction to Parallel Programming Online Courses

Introduction to Parallel Programming – Week 1

Introduction to Parallel Programming is the second MOOC that I signed up for. The emergence of parallel and distributed computing is not slowing down, and it seems that most developers are not accustomed to the very different train of thought that parallelism invokes. Most recent GPUs have 1024 simple compute units, each of which can run parallel threads. The general course overview by the course instructor, John Owens:
 

 
The first week started off pretty simple focussing on why GPUs, parallel programming and CUDA. I found the pace of the videos just right and much more engaging than other courses I have looked at.

The basics of CUDA:

 

CUDA programs are controlled by the host CPU and memory, and the libraries enable interaction with the GPU(s).

Week 1 lectures

Week 1 Assignment
On review, the assignment solution is sub-optimal but good enough to do the job.

Categories
Functional Programming - Scala

Functional Programming in Scala – Week 7

The final week of the course brought together all of the previous concepts and capabilities to leverage lazy evaluation.

Structural induction on trees can be conducted in a number of ways. Obviously they are all bound by logic, and I was somewhat confused by the level of detail in which this topic was covered, perhaps because the concept is not readily available in imperative languages.

Streams were the next topic and one that was used extensively in the week 7 assignment.

((1000 to 10000) filter isPrime)(1)

The above is very inefficient: it finds all prime numbers between 1000 and 10,000 but only uses the second element (index 1) of the result.

A good way to avoid evaluating all of the numbers in the range from 1000 to 10,000 is the .toStream function.

((1000 to 10000).toStream filter isPrime)(1)
Example of a stream implementation in Scala
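To see the difference, here is a small sketch that counts how many times isPrime is called in each version. The isPrime implementation and the counter are my own additions for illustration. Note that newer Scala versions (2.13 and later) deprecate Stream in favour of LazyList, although toStream still works there.

```scala
object LazyPrimes {
  var calls = 0 // counts how often isPrime is evaluated

  // Simple trial-division primality test (an assumed implementation).
  def isPrime(n: Int): Boolean = {
    calls += 1
    n > 1 && (2 to math.sqrt(n.toDouble).toInt).forall(d => n % d != 0)
  }

  // Strict: every number in the range is tested before indexing.
  // Returns (second prime, number of isPrime calls).
  def secondPrimeStrict(): (Int, Int) = {
    calls = 0
    val p = (1000 to 10000).filter(isPrime)(1)
    (p, calls)
  }

  // Lazy: the stream only tests numbers until the second prime is found.
  def secondPrimeLazy(): (Int, Int) = {
    calls = 0
    val p = (1000 to 10000).toStream.filter(isPrime)(1)
    (p, calls)
  }
}
```

Both versions return the same prime, but the strict one tests all 9001 numbers in the range while the lazy one stops shortly after 1000.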

Lazy evaluation was demonstrated next; its importance parallels that of streams.
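A minimal sketch of the three evaluation strategies, assuming a shared log to record when each definition is actually evaluated. The object and method names are hypothetical.

```scala
object LazyDemo {
  val log = scala.collection.mutable.ListBuffer.empty[String]

  def run(): Int = {
    log.clear()
    val x = { log += "x"; 1 }      // evaluated once, immediately
    lazy val y = { log += "y"; 2 } // evaluated once, on first use
    def z = { log += "z"; 3 }      // evaluated on every use
    z + y + x + z + y + x
  }
}
```

After run(), the log reads x, z, y, z: the val is evaluated at its definition, the lazy val only on its first use, and the def each time it is referenced.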

Week 7 lectures

Week 7 assignment source – first pass, lots of problems

Categories
Functional Programming - Scala

Functional Programming in Scala – Week 6

Week 6 was the second-last week of the course and focused on collections. For collections that will not generally be accessed linearly (head first, then tail), a Vector is a more apt choice than a List.

Determining the right collection type to use is not so difficult when they are arranged logically.

Using the Sequence classes allows the inheritance of useful operations:

Sequence subclasses inherit some useful operations
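The original figure is not reproduced here, so as a stand-in, this is a short sketch of some standard operations that Vector, List, Range and String all share. The selection is mine and may not match the figure exactly.

```scala
object SeqOps {
  val xs: Vector[Int] = Vector(3, 1, 4, 1, 5)
  val s: String = "Hello World" // Strings support the same operations

  val anyEven: Boolean = xs.exists(_ % 2 == 0)     // is there an even element?
  val allPositive: Boolean = xs.forall(_ > 0)      // are all elements positive?
  val pairs: Vector[(Int, Int)] = xs.zip(xs.tail)  // adjacent pairs
  val upper: String = s.filter(_.isUpper)          // keep upper-case letters
  val total: Int = xs.sum
  val largest: Int = xs.max
  val third: Int = xs(2)                           // indexed access on a Vector
  val appended: Vector[Int] = xs :+ 9              // append to a Vector
}
```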

Week 6 Lectures

Week 6 assignment source (again, this was a quick and dirty job and is far from full marks)

 

Categories
Functional Programming - Scala

Functional Programming in Scala – Week 5

Week 5 investigated lists, a ‘fundamental’ data structure in functional programming. The recursive nature of lists in Scala makes them quite different from arrays. This, together with the power of pattern matching and the recursive nature of functional programming, gives credibility to the ‘fundamental’ label.

Pattern matching on lists is a powerful tool
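As an illustration of pattern matching on the recursive structure of a list, here is a sketch of insertion sort. It is written from memory in the style of the course material, so treat the names and details as my own.

```scala
object ListPatterns {
  // Insert x into an already sorted list.
  def insert(x: Int, xs: List[Int]): List[Int] = xs match {
    case Nil     => List(x)
    case y :: ys => if (x <= y) x :: xs else y :: insert(x, ys)
  }

  // Insertion sort: sort the tail, then insert the head into it.
  def isort(xs: List[Int]): List[Int] = xs match {
    case Nil     => Nil
    case y :: ys => insert(y, isort(ys))
  }
}
```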

Sorting of lists was covered in good detail and in addition to list1.head and list1.tail more functions were revealed:

Some key functions for lists
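The figure itself is not reproduced here. As a stand-in, the sketch below exercises a handful of standard List methods beyond head and tail; the selection is mine.

```scala
object ListFunctions {
  val xs: List[Int] = List(1, 2, 3, 4)

  val len: Int = xs.length                 // number of elements
  val lst: Int = xs.last                   // last element
  val ini: List[Int] = xs.init             // everything except the last element
  val tk: List[Int] = xs.take(2)           // first two elements
  val dr: List[Int] = xs.drop(2)           // all but the first two elements
  val rev: List[Int] = xs.reverse
  val cat: List[Int] = xs ++ List(5, 6)    // concatenation
  val upd: List[Int] = xs.updated(0, 10)   // new list with index 0 replaced
  val idx: Int = xs.indexOf(3)             // index of first occurrence, or -1
  val has: Boolean = xs.contains(5)
}
```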

Week 5 Lectures

The Week 4 and Week 6 assignments were considered a heavier workload than the others, so there was no assignment for Week 5.

Categories
Functional Programming - Scala

Functional Programming in Scala – Week 4

Week 4 continued to delve into polymorphism with a focus on Types and Pattern Matching. Generalizing functions to accept multiple parameter types is a clear highlight of Scala.

Generic functions in Scala
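A small sketch of what a generic function looks like, assuming two hypothetical helpers: the type parameter T lets one definition work for any element type.

```scala
object Generics {
  // One definition serves every element type T.
  def singleton[T](elem: T): List[T] = List(elem)

  // Returns the n-th element of any list, or throws if n is out of range.
  def nth[T](n: Int, xs: List[T]): T =
    if (xs.isEmpty) throw new IndexOutOfBoundsException
    else if (n == 0) xs.head
    else nth(n - 1, xs.tail)
}
```

The type argument is usually inferred, so singleton(1) and singleton("a") both work without writing singleton[Int] or singleton[String].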

Pattern matching was a topic of week 4 that was used in every subsequent week. Case statements with pattern matching appear to be another staple of Scala.

Number(1) match {
  case Number(n) => n
  case Sum(e1, e2) => eval(e1) + eval(e2)
} + eval(Number(2))
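The snippet above is one intermediate step in evaluating eval(Sum(Number(1), Number(2))) and does not compile on its own. A self-contained sketch of the definitions it relies on might look like this (wrapped in an object of my own naming so it compiles as a unit):

```scala
object Expressions {
  sealed trait Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  // Evaluate an expression tree by matching on its shape.
  def eval(e: Expr): Int = e match {
    case Number(n)   => n
    case Sum(e1, e2) => eval(e1) + eval(e2)
  }
}
```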

Week 4 lectures

Week 4’s assignment of implementing the Huffman coding compression algorithm was a good practical example of Scala’s power. Note my source has a number of errors in it.

Week 4 Assignment source

Categories
Functional Programming - Scala

Functional Programming in Scala – Week 3

Week 3 looked at the object-oriented aspects of Scala. The evaluation of encapsulated data was the most interesting part of week 3.

The lectures went into details about some more of the ‘syntactic sugar’ of Scala.

Abstraction of classes in Scala was explained clearly in the lectures too.

The assignment for week 3 demonstrated Scala’s ability to implement sort, filtering, and union operations in just a few lines.

Polymorphism in Scala was also described. The use of types and traits was shown to enable different ‘flavors’ of functions.
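As a sketch of a class hierarchy with an abstract base and two implementations, here is a small immutable integer set stored as a binary tree. It is written from memory in the spirit of the lectures and is not my assignment code, so the details are assumptions.

```scala
object IntSets {
  abstract class IntSet {
    def incl(x: Int): IntSet
    def contains(x: Int): Boolean
    def union(other: IntSet): IntSet
  }

  // The empty set: a singleton object.
  object Empty extends IntSet {
    def incl(x: Int): IntSet = new NonEmpty(x, Empty, Empty)
    def contains(x: Int): Boolean = false
    def union(other: IntSet): IntSet = other
  }

  // A tree node: smaller elements go left, larger elements go right.
  class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
    def incl(x: Int): IntSet =
      if (x < elem) new NonEmpty(elem, left.incl(x), right)
      else if (x > elem) new NonEmpty(elem, left, right.incl(x))
      else this

    def contains(x: Int): Boolean =
      if (x < elem) left.contains(x)
      else if (x > elem) right.contains(x)
      else true

    def union(other: IntSet): IntSet =
      left.union(right.union(other)).incl(elem)
  }
}
```

Callers only see IntSet; which contains or union runs depends on whether the value is Empty or a NonEmpty node.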

Week 3 lectures

Week 3 assignment source

Class hierarchy in Scala

Categories
Functional Programming - Scala

Functional Programming in Scala – Week 2

Week 2 delved more into tail recursion but more broadly reviewed higher-order functions. Looking at perhaps the key characteristic of functional programming languages, we found that functions are treated as first-class values: like any other value, a function can be passed as a parameter and returned as a result. This becomes important later in the course and in real-world applications, where the resolution of values can be deferred until the optimal time.
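A minimal sketch of both directions, with names of my own choosing: sum takes a function as a parameter, and adder returns one.

```scala
object HigherOrder {
  // sum applies f to every integer in [a, b] and adds up the results.
  def sum(f: Int => Int, a: Int, b: Int): Int = {
    @scala.annotation.tailrec
    def loop(i: Int, acc: Int): Int =
      if (i > b) acc else loop(i + 1, acc + f(i))
    loop(a, 0)
  }

  // A function can also be returned as a result.
  def adder(n: Int): Int => Int = x => x + n
}
```

For example, HigherOrder.sum(x => x * x, 1, 3) adds 1, 4 and 9, and HigherOrder.adder(2) is a function that adds 2 to its argument.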

Another key point of functional programming was introduced: immutable values. How immutable values relate to distributed and parallel programming was touched on briefly. Odersky elaborates on this in the following presentation:

 

The concept of currying was also explained. In essence, a function of several parameters is rewritten as a chain of functions that each take one parameter list, which makes it easy to build specialized functions from general ones in the interest of simplifying and generalizing code. These concepts and some of the different function types are details that have not stuck with me very well since the course. I guess that happens when you don’t write a lot of code in a language and then leave it for a few months.
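A short sketch of currying, assuming a sum function with two parameter lists. Supplying only the first list gives back a specialized function. The names are mine.

```scala
object Currying {
  // Curried form: the first parameter list fixes f, the second supplies the bounds.
  def sum(f: Int => Int)(a: Int, b: Int): Int =
    if (a > b) 0 else f(a) + sum(f)(a + 1, b)

  // Applying only the first parameter list yields specialized functions.
  val sumSquares: (Int, Int) => Int = sum(x => x * x)
  val sumCubes: (Int, Int) => Int = sum(x => x * x * x)
}
```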

Week 2’s assignment was relatively straightforward and followed the contents of the week’s lectures.

Week 2 Assignment code

Week 2 Lectures 

 

Categories
Functional Programming - Scala

Functional Programming in Scala – Week 1

Started and completed this course in the second half of 2012. Thought revisiting the material and uploading the weekly assignments would be a good idea. Week 1 looked at basic functions and evaluations. The recursive nature of functional programming was alluded to, particularly in the assignment.

The basics of (x <- 0 to y) and other Scala language specifics such as scoping and structure can all be reviewed in the week’s source code.
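For reference, a tiny sketch of those two basics, with hypothetical function names: a generator in a for expression, and block scoping.

```scala
object ForBasics {
  // (x <- 0 to y) is a generator: x takes each value from 0 to y inclusive.
  def squaresUpTo(y: Int): IndexedSeq[Int] =
    for (x <- 0 to y) yield x * x

  // Names defined inside a block are only visible in that block.
  def scoped(): Int = {
    val x = 1
    val result = {
      val x = 10 // shadows the outer x inside this block only
      x * 2
    }
    result + x // the outer x is still 1 here
  }
}
```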

I signed up for this course after watching a presentation by Rich Hickey, the creator of Clojure (another functional language for the JVM).

http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey

Week 1 homework

Week 1 lecture videos: https://class.coursera.org/progfun-2012-001/lecture/8

One of the most important concepts I took from week 1 was that of tail recursion:

  /**
   * Exercise 1
   */
  def pascal(c: Int, r: Int): Int = {
    // Factorial via tail recursion; BigInt avoids Int overflow beyond 12!
    def tFactorial(number: Int): BigInt = {
      @scala.annotation.tailrec
      def tFactorialWithAccumulator(accumulator: BigInt, number: Int): BigInt = {
        // number <= 1 also terminates for 0, unlike the original number == 1 check
        if (number <= 1) accumulator
        else tFactorialWithAccumulator(accumulator * number, number - 1)
      }
      // start with an accumulator of 1
      tFactorialWithAccumulator(1, number)
    }

    // element value is calculated using r!/(c!(r-c)!)
    if (c == 0 || c == r || r == 0 || r == 1) 1
    else (tFactorial(r) / (tFactorial(c) * tFactorial(r - c))).toInt
  }