AESOP home

Publications

Distributed Disk-based Solution Techniques for Large Markov Models

William J. Knottenbelt, Peter G. Harrison

Conference or Workshop Paper
NSMC'99, 3rd International Workshop on the Numerical Solution of Markov Chains
August, 1999
pp.58–75
ISBN 84-7733-512-5
Abstract

Very large Markov chains often arise from stochastic models of complex real-life systems. In this paper we investigate disk-based techniques for solving such Markov chains on a distributed-memory parallel computer. We focus on two scalable numerical methods, namely the Jacobi and Conjugate Gradient Squared (CGS) algorithms.

The critical bottleneck in these methods is the parallel sparse matrix-vector multiply operation. By exploiting the data locality typically found in automatically generated state-transition-rate matrices, we develop an efficient matrix-vector multiply kernel that is characterised by low per-processor memory usage, low communication cost, and good load balance. At a slightly higher memory cost, the kernel also allows for the overlapping of communication and computation.

We describe a distributed software architecture which makes use of two processes per node to allow for the overlapping of disk I/O with computation and communication. By embedding our matrix-vector multiply kernel in this architecture, we implement a high-performance disk-based solver on a 16-node Fujitsu AP3000 computer. We demonstrate results showing good speedups and the capability to analyse very large models with over 50 million states and over 500 million transitions.

PDF of full publication (1.4 megabytes)
(need help viewing PDF files?)
GZipped Postscript of full publication (541.7 kilobytes)
(need help viewing GZipped Postscript files?)

Information from pubs.doc.ic.ac.uk/distributed-disk-markov-nsmc.