next up previous [pdf]

Next: Parallel multifrontal algorithms Up: Poulson et al.: Parallel Previous: Related work

Parallel sweeping preconditioner

The setup and application stages of the sweeping preconditioner (Algs. 0.0.4 and 0.0.5) essentially consist of $ m=O(n/\gamma)$ multifrontal factorizations and solves, respectively. The most important detail is that the subdomain factorizations can be performed in parallel, while the subdomain solves must happen sequentially.% latex2html id marker 3518
\setcounter{footnote}{8}\fnsymbol{footnote} When we also consider that each subdomain factorization requires $ O(\gamma^3 n^3)$ work, while subdomain solves only require $ O(\gamma^2 n^2 \log n)$ work, we see that, relative to the subdomain factorizations, subdomain solves must extract a factor of $ m=O(n/\gamma)$ more parallelism from a factor of $ O(\gamma n/\log n)$ less operations. We thus have a strong hint that, unless the subdomain solves are carefully handled, they will be the limiting factor in the scalability of the sweeping preconditioner.