United Kingdom, September 10, 2012
The Autumn Academy is targeted at people just starting a
PhD, though it is also highly suitable for more experienced
researchers from academia or industry. The Academy will take
attendees through Basic HPC Tools, Programming, Performance
Programming, Parallel Architectures, Shared-Variables
Parallelism, Message-Passing Parallelism and Practical
Parallel Programming.
Description:
The Autumn Academy will cover the following modules. In
addition there will be several small group activities
designed to span these modules, and a series of plenary
lectures from international experts that will illustrate HPC
research in action.
Programming Languages (C or Fortran)
Parallel courses will be run in both Fortran and C; students
will choose one stream. Teaching in other parts of the
Academy will illustrate examples with both languages.
Performance Programming
This module is designed to teach students to think about and
explore the factors that affect the performance of their
code, including the compiler, the operating system, the
hardware architecture, and the interplay between them.
Emphasis will be on the general principles of performance
measurement rather than on the specific packages being used.
The module also includes familiarisation with basic serial
debugging and profiling tools.
Parallel Architectures
This module will introduce shared-memory and
distributed-memory HPC architectures and programming models
and algorithms; basic parallel performance measurement and
Amdahl's law. It will also cover the concept of
communicating parallel processes: synchronous/asynchronous;
blocking/non-blocking. Basic concepts regarding data
dependencies and data sharing between threads and processes
will be introduced using pseudocode exercises and thought
experiments.
OpenMP
OpenMP model, initialisation, parallel regions and parallel
loops; shared and private data; loop scheduling and
synchronisation. By the end, students should be able to
modify and run example programs on a multi-core system, and
understand the performance characteristics of different loop
scheduling options.
MPI
MPI basics, point-to-point, synchronous and asynchronous
modes; non-blocking forms and collective communications;
data types. The course will illustrate how standard
communications patterns can be implemented in MPI and used
to parallelise simple array-based computations. By the end,
students should be able to modify and run an example program
on a distributed-memory system and understand its basic
performance characteristics.
Practical Programming
This will review the material from the entire course,
compare and contrast different programming approaches and
place the course in the wider context of computational
science as a research discipline. It will also outline other
important areas in parallel software design and development
that are beyond the scope of this initial academy. The
module will include: comparison of parallel models and their
suitability for different architectures; basics of parallel
program design; use of packages and libraries; the HPC
ecosystem in the UK, Europe and worldwide.
Numerical Analysis
Students will learn the fundamentals of algorithm design,
such as robustness, convergence, error testing and
computational complexity, and will also become aware of
overflow and underflow. Specific algorithms will be introduced for
linear systems and interpolation with polynomials, as well
as splines and numerical solutions to ODEs. A short outlook
on numerical methods for PDEs will be given.
Hardware/Software
An overview of how a computer operates at a hardware level
will be presented, with the intention of making it clearer
why certain programming techniques improve performance,
often dramatically. This overview covers pipelined CPUs,
cache memory, virtual memory, pages and TLBs, and the basic
architecture of parallel computers.