November 13, Tuesday
12:00 – 13:00
Fast Parallel Matrix Multiplication
Computer Science seminar
Lecturer: Oded Schwartz
Affiliation: EECS Department, UC Berkeley
Location: 202/37
Host: Dr. Aryeh Kontorovich
Faster algorithms can be obtained by minimizing communication, that is,
by reducing the amount of data moved across the memory hierarchy and
between processors.
Measured in time or energy, the communication costs of algorithms are
typically much higher than their arithmetic costs.
We have derived lower bounds on these communication costs by analyzing
geometric and expansion properties of algorithms' underlying
computation DAGs. These techniques (honored by the SIAG/Linear Algebra
Prize for 2009-2011 and the SPAA'11 Best Paper Award) have inspired
many new algorithms in cases where existing ones proved not to be
communication-optimal.
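
As a point of reference (the notation below is added here and is not
part of the original abstract): with M the size of each processor's
local memory and P the number of processors, the published bounds for
classical and Strassen-based matrix multiplication take roughly the
form

    \[
      W_{\text{classical}} = \Omega\!\left(\frac{n^{3}}{P\sqrt{M}}\right),
      \qquad
      W_{\text{Strassen}} = \Omega\!\left(\left(\frac{n}{\sqrt{M}}\right)^{\log_{2} 7} \cdot \frac{M}{P}\right).
    \]
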
Parallel matrix multiplication is one of the most studied fundamental
problems in parallel computing. We present a new communication-optimal
parallel algorithm based on Strassen's fast matrix multiplication.
It exhibits perfect strong scaling within the maximum possible range
of processor counts.
The algorithm asymptotically outperforms all known parallel matrix
multiplication algorithms, classical and Strassen-based alike. It also
demonstrates significant speedups in practice, as benchmarked on
several supercomputers (Cray XT4, Cray XE6, and IBM BG/P).
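
For readers unfamiliar with the starting point, a minimal sequential
sketch of Strassen's recursion in Python follows. It is illustrative
only: it is not the parallel, communication-avoiding algorithm of the
talk, it assumes n is a power of two, and the cutoff value is an
arbitrary choice.

    import numpy as np

    def strassen(A, B, cutoff=64):
        """Multiply square matrices A and B (n a power of two) using
        Strassen's recursion: 7 recursive multiplies instead of 8."""
        n = A.shape[0]
        if n <= cutoff:              # fall back to classical multiplication
            return A @ B
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        # Strassen's seven products
        M1 = strassen(A11 + A22, B11 + B22, cutoff)
        M2 = strassen(A21 + A22, B11, cutoff)
        M3 = strassen(A11, B12 - B22, cutoff)
        M4 = strassen(A22, B21 - B11, cutoff)
        M5 = strassen(A11 + A12, B22, cutoff)
        M6 = strassen(A21 - A11, B11 + B12, cutoff)
        M7 = strassen(A12 - A22, B21 + B22, cutoff)
        # Reassemble the four quadrants of C = A @ B
        C = np.empty_like(A)
        C[:h, :h] = M1 + M4 - M5 + M7
        C[:h, h:] = M3 + M5
        C[h:, :h] = M2 + M4
        C[h:, h:] = M1 - M2 + M3 + M6
        return C

    # Example check: for n = 256,
    #   A, B = np.random.rand(256, 256), np.random.rand(256, 256)
    #   np.allclose(strassen(A, B), A @ B)   # -> True

Each level of recursion replaces eight half-size multiplications with
seven, which yields the O(n^{log_2 7}) ~ O(n^{2.81}) arithmetic count
that the communication-optimal parallelization preserves.
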
Our parallelization approach is simple to implement and extends to
other algorithms. Both the lower bounds and the new algorithms have an
immediate impact on saving power and energy at the algorithmic level.
Based on joint work with
Grey Ballard, James Demmel, Olga Holtz, and Ben Lipshitz.