Week 6 of Grid Computing provided a look into clusters, specifically:
- Clusters vs Grids
- Benchmarking techniques
- Cluster and Grid programming environments
The discussion of Beowulf, one method of implementing a cluster, showed how clusters can differ from one another. A key characteristic of a cluster is its interconnect technology. Some of the options are:
- Fast Ethernet
- Gigabit Ethernet
- 10 Gigabit Ethernet
For a detailed list and comparison: http://en.wikipedia.org/wiki/List_of_device_bandwidths#Local_area_networks
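It is also worth noting that latency can become more important than bandwidth in many cluster networks. This depends on the programs running on the cluster: a program that exchanges many small messages is limited mostly by latency, while one that moves large blocks of data is limited mostly by bandwidth.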
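To make the latency versus bandwidth point concrete, here is a minimal sketch using the simple model "transfer time = latency + message size / bandwidth". The two links and their numbers are hypothetical values I picked for illustration, not measurements of any real interconnect, and the function name `transfer_time` is my own.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimate the time to send one message: fixed latency plus size / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Hypothetical links (illustrative numbers only, not measured values).
LINKS = {
    "fat_but_slow_to_start": {"latency_s": 100e-6, "bandwidth_bytes_per_s": 1.25e9},
    "thin_but_quick_to_start": {"latency_s": 10e-6, "bandwidth_bytes_per_s": 1.25e8},
}


def fastest_link(message_bytes):
    """Return the name of the link with the lowest estimated transfer time."""
    return min(LINKS, key=lambda name: transfer_time(message_bytes, **LINKS[name]))


if __name__ == "__main__":
    # Compare a tiny message with a large one under this model.
    for size in (64, 10_000_000):
        print(f"{size} bytes: {fastest_link(size)}")
```

Under this model the low-latency link wins for the 64-byte message even though it has a tenth of the bandwidth, and the high-bandwidth link wins for the 10 MB message. Real networks have more going on (contention, protocol overhead), so treat this as a rough guide only.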
Benchmarking was the second sub-topic: how do we determine the performance of a cluster?
- Floating point Operations Per Second [FLOPS]
- Benchmarking applications
- Application Run-times (wallclock) – most reliable
- Scalability tests (decreases in time using more CPUs)
- LINPACK (http://en.wikipedia.org/wiki/LINPACK)
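The scalability test in the list above can be summarised with two standard numbers: speedup (serial time divided by parallel time) and efficiency (speedup divided by the number of CPUs). The sketch below computes both from wallclock run-times. The run-times in `runtimes` are made-up values for illustration, not results from any cluster, and the function names are my own.

```python
def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the 1-CPU run."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_cpus):
    """Fraction of ideal (linear) speedup achieved on n_cpus."""
    return speedup(t_serial, t_parallel) / n_cpus


def scaling_table(runtimes):
    """Build (cpus, speedup, efficiency) rows from {cpus: wallclock seconds}.

    Assumes the dictionary contains a 1-CPU baseline run.
    """
    t1 = runtimes[1]
    return [(n, speedup(t1, t), efficiency(t1, t, n))
            for n, t in sorted(runtimes.items())]


# Hypothetical wallclock times in seconds (illustrative only).
runtimes = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}

if __name__ == "__main__":
    for n, s, e in scaling_table(runtimes):
        print(f"{n:>2} CPUs: speedup {s:.2f}, efficiency {e:.0%}")
```

Efficiency dropping as CPUs are added is the usual sign that communication or serial sections are starting to dominate the run.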
Programming environments came next:
- Parallel Virtual Machine [PVM]
- Message Passing Interface [MPI]
- MPICH-G2 (Grid enabled MPI)
- Linux Virtual Server [LVS]
In the tutorial we did some introductory work with MPI. It seems MPI is the most customizable environment. Still, the idea that cloud computing presents, of not having to think about the underlying physical layers, seems much more attractive to me than tweaking MPI. I am not sure whether PVM and LVS allocate resources autonomously; I will have to check this out.