A fast response time and a reliable service are important quality-of-service criteria for almost all computer and communication systems. Indeed, a system that does not meet its performance and dependability requirements - its system performability - is, in practical terms, as ineffectual as a system that does not meet its correctness requirements.
This proposal aims to develop a range of novel techniques and software tools for the performability analysis of massively parallel systems, that is, systems constructed from large groups of identical, interdependent, distributed components. The analysis of these systems is a difficult yet practically important problem in many diverse fields, including the biological sciences (e.g. the dynamics of insect colonies and the rate of spread of infections), the environmental sciences (e.g. population growth and crowd dynamics), economics (e.g. fluctuations of financial markets as a result of individual trader behaviour) and computer science (e.g. diffusion of computer worms and viruses, dynamic load modelling in computational Grids, peer-to-peer networks and mobile ad hoc networks).
From this wide range of possible application areas, we will focus on massively parallel computer-communication systems. Many of these, such as distributed mobile publish-subscribe architectures, peer-to-peer filesharing networks and network worm infestations, are having an increasing economic, social and technological impact on our society. Yet, while great strides have been made in our ability to describe these systems and their dynamic behaviour concisely by means of compositional modelling formalisms, much less progress has been made in our ability to analyse them quantitatively. This is because almost all traditional performance analysis techniques are based on the idea of (explicitly or implicitly) constructing and solving a Markov chain made up of all possible system behaviours or states. Because the number of states explodes combinatorially as the number of components increases, the number of states in a massively parallel system is typically far beyond the feasible limit of direct quantitative analysis (the current state of the art is of the order of 100 million states). This means that the only practical alternative for tackling such models is usually discrete-event simulation. However, for large models, even long-running simulations often suffer from low state-space coverage, which in turn leads to problems with the stability and accuracy of performance metrics (especially where rare events are not taken into account).
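To give a feel for the combinatorial growth described above, the following minimal sketch counts joint states under a deliberately simple assumption: each component has a fixed number of local states and every combination counts as a distinct system state. This is an upper bound, since interdependence between components can make some combinations unreachable. The function names and the choice of two or three local states are illustrative assumptions, not part of the proposal.

```python
def explicit_states(n_components: int, local_states: int) -> int:
    """Upper bound on joint states when each of n_components
    can independently occupy one of local_states local states."""
    return local_states ** n_components


def components_until_limit(local_states: int, limit: int = 100_000_000) -> int:
    """Smallest number of components whose joint state count exceeds `limit`.

    The default limit is the order of magnitude quoted in the text for
    direct analysis (100 million states)."""
    n = 1
    while explicit_states(n, local_states) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    for k in (2, 3):
        n = components_until_limit(k)
        print(f"{k} local states: {n} components give {explicit_states(n, k)} states")
```

Under this assumption, a few dozen components at most are enough to pass the 100 million mark, which is far short of the component counts found in massively parallel systems.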
An exciting recent development in the performance analysis of such massively parallel systems, when they are represented in a stochastic process algebra such as PEPA, is the use of a fluid approximation of the state space. We aim to develop this new paradigm significantly, in combination with compositional techniques, by exploring how fluid approximations can be captured precisely using ordinary and stochastic differential equations (ODEs and SDEs). This gives us the potential to explore the emergent behaviour of a massively parallel system from the discrete agent description of its underlying components.
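As a rough illustration of the fluid idea, the sketch below replaces the discrete count of components in each local state with a continuous quantity and integrates the resulting ODEs. It assumes a hypothetical population of identical components that switch independently between two local states A and B at constant rates, and it uses a plain forward Euler step to stay self-contained. The model, rates and function name are assumptions made for illustration; they are not derived from PEPA's semantics or from the techniques proposed here.

```python
def fluid_two_state(n_total: float, rate_ab: float, rate_ba: float,
                    t_end: float, dt: float = 0.001) -> tuple[float, float]:
    """Integrate dx_a/dt = -rate_ab*x_a + rate_ba*x_b and
    dx_b/dt = rate_ab*x_a - rate_ba*x_b with forward Euler.

    x_a and x_b are continuous approximations of the number of
    components in local states A and B (all start in A)."""
    x_a, x_b = float(n_total), 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Net flow from A to B over one time step.
        net = (rate_ab * x_a - rate_ba * x_b) * dt
        x_a -= net
        x_b += net
    return x_a, x_b


if __name__ == "__main__":
    # One million components: two equations, whatever the population size.
    x_a, x_b = fluid_two_state(n_total=1_000_000, rate_ab=2.0,
                               rate_ba=1.0, t_end=20.0)
    print(f"components in A: {x_a:.1f}, in B: {x_b:.1f}")
```

The point of the sketch is that the cost of the calculation depends on the number of local states rather than on the number of components, whereas an explicit Markov chain for the same population would need a state for every joint configuration.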