Tiberiu Chis, Peter G. Harrison

- Conference or Workshop Paper
- Imperial College Computing Student Workshop (ICCSW 2012)
- September, 2012
- Schloss Dagstuhl
**DOI**10.4230/OASIcs.ICCSW.2012.29- Abstract
There is an increasing demand for systems which handle higher density, additional loads as seen in storage workload modelling, where workloads can be characterized on-line. This paper aims to find a workload model which processes incoming data and then updates its parameters "on-the-fly." Essentially, this will be an incremental hidden Markov model (IncHMM) with an improved Baum-Welch algorithm. Thus, the benefit will be obtaining a parsimonious model which updates its encoded information whenever more real time workload data becomes available. To achieve this model, two new approximations of the Baum-Welch algorithm are defined, followed by training our model using discrete time series. This time series is transformed from a large network trace made up of I/O commands, into a partitioned binned trace, and then filtered through a K-means clustering algorithm to obtain an observation trace. The IncHMM, together with the observation trace, produces the required parameters to form a discrete Markov arrival process (MAP). Finally, we generate our own data trace (using the IncHMM parameters and a random distribution) and statistically compare it to the raw I/O trace, thus validating our model.

- PDF of full publication (394.7 kilobytes)
- (need help viewing PDF files?)

Information from pubs.doc.ic.ac.uk/improved-baum-welch.