# Publications

### Journal Papers

Allen, M., Williams, M. D., Pitt, M., Stein, K., Worthington, D. and **Suen, D**. (2013). Managing emergency hospital admissions and beds: the impact of day-to-day variation by hospital size. In submission.

### Abstract

**Objectives:**To compare the day-to-day variability of emergency admissions and the number of beds occupied by emergency patients across 30 different hospitals of varying sizes.

**Design:**Analysis of hospital admissions data augmented by mathematical modelling.

**Setting:**30 English hospitals with admissions ranging from ~15 to ~150 emergency admissions/day. 2011 data extracted from Hospital Episodes Statistics.

**Main outcome measures:**Emergency admissions and number of beds occupied by emergency admission patients: Mean, standard deviation, number of busy days (admissions >25% above average, or occupied beds >10% above average), peak load (95th percentile).

**Results:**The five smallest hospitals had 17% (range 14-20%) days when admissions were 25% greater than average. Peaks (95th percentile) in admission were 45% (range 42-49%) above average. 16% (range 12-23%) of days had bed usage 10% higher than average, and average bed use was 86% (range 84-89%) of peak (95th percentile) bed usage.

The five largest hospitals had 2% (range 1-2%) days when admissions were 25% greater than average. Peaks (95th percentile) in admission were 20% (range 19-22%) above average. 9% (range 8-12%) of days had bed usage 10% higher than average, and average bed use was 91% (range 89-93%) of peak (95th percentile) bed usage.

The results are compared with a modelled system based on queuing theory, which explains the variation in admissions and seems to provide a lower-bound for the variability in occupied beds.

**Conclusions:**Smaller hospitals with fewer admissions experience more frequent busy days, and larger peaks in admissions and number of beds occupied compared with larger hospitals. In order to provide a safe and effective environment, smaller hospitals require greater proportional spare capacity to absorb the increased frequency and severity of peaks in admissions and bed usage.

Bardwell, L., Eckley, I, **Fearnhead, P.**, Smith, S. and Spott, M. (2016).Most recent changepoint detection in panel data. Submitted.

### Abstract

We present a novel approach to detect sets of most recent changepoints (MRC) in panel data. A panel is made up of a number of univariate time series and our method is described firstly for the univariate case where finding the most recent changepoint (prior to the end of the data) is straightforward. We then extend this to panel data where a number of MRC's that affect disjoint subsets of the series are identified. Focusing on the most recent changepoints makes sense for forecasting and understanding recent behaviour of the panel and the specific subsets of series within it. The possibility that such changes will only be present in a, potentially small, subset of the series which many existing methodologies do not account for is a key consideration in this work. This involves pooling information across individual series of the panel to increase the power of detection. We develop a general model for this problem, and show how it is possible to accurately and efficiently perform inference. We present simulations showing that this approach is more accurate than other proposed method for analysing such data. Two real data sets are considered, regarding the number of events that occur over time in a telecommunications network and a data set from the field of Corporate finance where our method gives insights into the different subsets of series that change.

Bardwell, L. and **Fearnhead, P.** (2016).Bayesian detection of abnormal segments in multiple time series. To appear in Bayesian Analysis.

### Abstract

We present a novel Bayesian approach to analysing multiple time-series with the aim of detecting abnormal regions. These are regions where the properties of the data change from some normal or baseline behaviour. We allow for the possibility that such changes will only be present in a, potentially small, subset of the time-series. We develop a general model for this problem, and show how it is possible to accurately and efficiently perform Bayesian inference, based upon recursions that enable independent sampling from the posterior distribution. A motivating application for this problem comes from detecting copy number variation (CNVs), using data from multiple individuals. Pooling information across individuals can increase the power of detecting CNVs, but often a specific CNV will only be present in a small subset of the individuals. We evaluate the Bayesian method on both simulated and real CNV data, and give evidence that this approach is more accurate than a recently proposed method for analysing such data.

**Barnett, H.**, Geys, H., Jacobs, T and Jaki, T. (2015). Multiple Comparisons of Model Averaged Derived Parameters with Applications to the Comparison of Sampling Methods in Pharmacokinetic Studies. In Submission.

### Abstract

Pharmacokinetic (PK) studies aim to study how a compound is absorbed, distributed, metabolised and excreted (ADME). Concentration of the compound in the blood or plasma is measured at different time points after administration and pharmacokinetic parameters such as the area under the curve ($AUC$) or maximum concentration ($C_{max}$) are derived from the resulting concentration time profile. In this paper we want to compare different methods for collecting concentration measurements (traditional sampling versus microsampling) on the basis of the derived parameters. We adjust and evaluate an existing method for testing superiority of multiple derived parameters that accounts for model uncertainty. Two improvements to the method are introduced and assessed. We subsequently extend the approach to allow testing for equivalence. We motivate the methods through an illustrative example and evaluate the performance using simulations.

**Crespo Del Granado P.**, Pang Z. and Wallace S. W. (2016). Synergy of Smart grids and hybrid distributed generation on the value of energy storage, Applied Energy, forthcoming APEN_7540.

### Abstract

In smart grids, demand response and distributed energy systems aim to provide a higher degree of flexibility for load-shifting operations and the leverage to control intermittent wind supply. In this more dynamic energy system, deployment of energy storage at the site of consumption is envisioned to create synergies with the local distributed generation (DG) system. From a large end-user perspective, this paper contributes to the practical understanding of smart grids by modelling the impact of real-time pricing schemes (smart grids) on a hybrid DG system (mixed generation for heating and electricity loads) coupled with storage units. Specifically, we address: How does the portfolio of DG units affect the value of energy storage? and, what is the value of energy storage when assessing different designs of demand response for the end-user? To this end, we formulate a dynamic optimization model to represent a real-life urban community's energy system composed of a co-generation unit, gas boilers, electrical heaters and a wind turbine. We discuss the techno-economic benefits of complementing this end-user's energy system with storage units (thermal storage and battery devices). The paper analyses the storages policy strategies to simultaneously satisfy heat and electricity demand through the efficient use of DG units under demand response mechanisms. Results indicate that the storage units reduce energy costs by 7-10% in electricity and 3% in gas charges. In cases with a large DG capacity, the supply-demand mismatch increases, making storage more valuable.

**Crespo Del Granado P.**, Wallace S. W. and Pang Z. (2015). The impact of wind uncertainty on the strategic valuation of distributed electricity storage, Computational Management Science, Vol 13, 1, pp 5-27, DOI: 10.1007/s10287-015-0235-0.

### Abstract

The intermittent nature of wind energy generation has introduced a new degree of uncertainty to the tactical planning of energy systems. Short-term energy balancing decisions are no longer (fully) known, and it is this lack of knowledge that causes the need for strategic thinking. But despite this observation, strategic models are rarely set in an uncertain environment. And even if they are, the approach used is often inappropriate, based on some variant of scenario analysis—what-if analysis. In this paper we develop a deterministic strategic model for the valuation of electricity storage (a battery), and ask: “Though leaving out wind speed uncertainty clearly is a simplification, does it really matter for the valuation of storage?”. We answer this question by formulating a stochastic programming model, and compare its valuation to that of its deterministic counterpart. Both models capture the arbitrage value of storage, but only the stochastic model captures the battery value stemming from wind speed uncertainty. Is the difference important? The model is tested on a case from Lancaster University’s campus energy system where a wind turbine is installed. From our analysis, we conclude that considering wind speed uncertainty can increase the estimated value of storage with up to 50 % relative to a deterministic estimate. However, we also observe cases where wind speed uncertainty is insignificant for storage valuation.

**Crespo Del Granado P.**, Wallace S. W. and Pang Z. (2014). The value of electricity storage in domestic homes: A smart grid perspective. Energy Systems, Vol 5, 2, pp 211-232,DOI: 10.1007/s12667-013-0108.

### Abstract

About 7 % of the energy consumption in the UK presently comes from wind, but this is expected to grow to well over 20 %. This causes serious concerns about the ability of the energy system to balance supply and demand, as it is already very inflexible. Though each household is very small, in total they contribute substantially to the energy demand, and in particular to the peak demand. In this paper, we develop a bottom-up approach, focusing on the value of energy storage and renewable micro-generation in domestic homes. Specifically, we consider a connection to the grid, a boiler, a solar collector, a small wind turbine, a water tank, and a battery. We use the wholesale spot prices as proxies for the provision costs of gas and electricity. We focus on the predictable inter-temporal variations of energy demand, wind speed, and spot prices, and thus assume that these parameters, though deterministic, are time varying. The objective of the model is to minimize the total energy consumption cost, as seen from the grid, throughout a finite horizon. We conduct a numerical case study using a sample of real-life demand and weather data for some typical houses in the UK and recent spot price data. Our results show that a battery might have a significant contribution to energy cost savings, and shed new light on the design of distributed energy systems for a smart grid, especially when coupled with a wind turbine. The benefits do not depend on behavioural changes in the households.

**D. Nasiri** (2014). Correlation Between Speeds on a Congested Road Network in the City of London. Submitted for publication. In Submission.

### Abstract

This article uses real traffic data, from the City of London, to explore temporal and spatial correlations between travel speeds in a congested road network. It is shown that, in contrast with results found in other studies on non-congested networks, the first-order Markovian property does not hold for spatial correlations. Indeed, for our data set the spatial correlations are still significant for roads up to twenty links apart. On the other hand, if one analyses the correlations using principal component analysis, it turns out that only three components are needed to explain 89% of the temporal variation in speed, and only six components are needed to explain over 80% of the spatial variation.

Edwards, J, **Fearnhead, P**. and Glazebrook, K. (2016).‘On the Identification and Mitigation of Weaknesses in the Knowledge Gradient Policy for Multi-Armed Bandits’, Probability in the Engineering and Informational Sciences, doi: 10.1017/S0269964816000279.

### Abstract

The knowledge gradient (KG) policy was originally proposed for offline ranking and selection problems but has recently been adapted for use in online decision-making in general and multi-armed bandit problems (MABs) in particular. We study its use in a class of exponential family MABs and identify weaknesses, including a propensity to take actions which are dominated with respect to both exploitation and exploration. We propose variants of KG which avoid such errors. These new policies include an index heuristic, which deploys a KG approach to develop an approximation to the Gittins index. A numerical study shows this policy to perform well over a range of MABs including those for which index policies are not optimal. While KG does not take dominated actions when bandits are Gaussian, it fails to be index consistent and appears not to enjoy a performance advantage over competitor policies when arms are correlated to compensate for its greater computational demands.

**Fairbrother, J.**, Turner, A. and Wallace, S. (2015). Scenario generation for stochastic programs with tail risk measures. In Submission.

### Abstract

Tail risk measures such as Value-at-Risk and Conditional Value-at-Risk are used in stochastic programming to mitigate or reduce the probability of large losses. However, because tail risk measures only depend on the upper tail of a distribution, scenario generation for these problems is difficult as standard methods such as sampling will typically inadequately represent these areas. We present a problem-based approach to scenario generation for stochastic programs which use tail risk measures, and demonstrate this approach on a class of portfolio selection problems.

**Fairbrother, J.**, Turner, A. and Wallace, S. (2015). Scenario generation for portfolio selection problems with tail risk measure. In Submission.

### Abstract

Tail risk measures such as the conditional value-at-risk are useful in the context of portfolio selection for quantifying potential losses in worst cases. However, for scenario-based problems these are problematic: because the value of a tail risk measure only depends on a small subset of the support of the distribution of asset returns, traditional scenario based methods, which spread scenarios evenly across the whole support of the distribution, yield very unstable solutions unless we use a very large number scenarios. In this paper we propose a problem-driven scenario generation methodology for portfolio selection problems using a tail risk measure where the the asset returns have elliptical or near-elliptical distribution. Our approach in effect prioritizes the construction of scenarios in the areas of the distribution which correspond to the tail losses of feasible portfolios. The methodology is shown to work particularly well when the distribution of assets returns are positively correlated and heavy-tailed, and the performance is shown to improve as we tighten the constraints on feasible assets.

** Haynes, K.**, Eckley, I. A., and Fearnhead, P., (2015). Computationally efficient changepoint detection for a range of penalties. Journal of Computational and Graphical Statistics (to appear).

### Abstract

In the multiple changepoint setting, various search methods have been proposedwhich involve optimising either a constrained or penalised cost function over possiblenumbers and locations of changepoints using dynamic programming. Recent work inthe penalised optimisation setting has focussed on developing an exact pruning-basedapproach which, under certain conditions, is linear in the number of data points. Suchan approach naturally requires the specification of a penalty to avoid under/over- fitting.

Work has been undertaken to identify the appropriate penalty choice for data generatingprocesses with known distributional form, but in many applications the model assumedfor the data is not correct and these penalty choices are not always appropriate. Tothis end we present a method that enables us to find the solution path for all choiceof penalty values across a continuous range. This permits an evaluation of the varioussegmentations to identify a suitable penalty choice. The computational complexity ofthis approach can be linear in the number of data points and linear in the differencebetween the number of changepoints in the optimal segmentations for the smallest andlargest penalty values.

** Haynes, K.**, Fearnhead, P., and Eckley, I.A. (2015). A computationally efficient nonparametric approach for changepoint detection. To Appear in Statistics and Computing.

### Abstract

In this paper we build on an approach proposed by Zou et al. (2014) for nonparametric changepoint detection. This approach defines the best segmentation for a data set as the one which minimises a penalised cost function, with the cost function defined in term of minus a non-parametric log-likelihood for data within each segment. Minimising this cost function is possible using dynamic programming, but their algorithmhad a computational cost that is cubic in the length of the data set. To speed upcomputation, Zou et al. (2014) resorted to a screening procedure which means that theestimated segmentation is no longer guaranteed to be the global minimum of the costfunction. We show that the screening procedure adversely affects the accuracy of thechangepoint detection method, and show how a faster dynamic programming algorithm,Pruned Exact Linear Time, PELT (Killick et al., 2012), can be used to find the optimalsegmentation with a computational cost that can be close to linear in the amount ofdata. PELT requires a penalty to avoid under/overfitting the model which can have adetrimental effect on the quality of the detected changepoints. To overcome this issuewe use a relatively new method, Changepoints Over a Range of PenaltieS (CROPS)(Haynes et al., 2015), which finds all of the optimal segmentations for multiple penaltyvalues over a continuous range. We apply our method to detect changes in heart rateduring physical activity.

** Hofmeyr, D.** (2016) Clustering by minimum cut hyperplanes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016 (to appear)

### Abstract

Minimum normalised graph cuts are highly effective ways of partitioning unlabeled data, having been made popular by the success of spectral clustering. This work presents a novel method for learning hyperplane separators which minimise this graph cut objective, when data are embedded in Euclidean space. The optimisation problem associated with the proposed method can be formulated as a sequence of univariate subproblems, in which the optimal hyperplane orthogonal to a given vector is determined. These subproblems can be solved in log-linear time, by exploiting the trivial factorisation of the exponential function. Experimentation suggests that the empirical runtime of the overall algorithm is also log-linear in the number of data. Asymptotic properties of the minimum cut hyperplane, both for a finite sample, and for an increasing sample assumed to arise from an underlying probability distribution are discussed. In the finite sample case the minimum cut hyperplane converges to the maximum margin hyperplane as the scaling parameter is reduced to zero. Applying the proposed methodology, both for fixed scaling, and the large margin asymptotes, is shown to produce high quality clustering models in comparison with state-of-the-art clustering algorithms in experiments using a large collection of benchmark datasets.

** Hofmeyr, D.**, Pavlidis, N., Eckley, I., (2015). Divisive clustering of high dimensional data streams. Statistics and Computing, doi: 10.1007/s11222-015-9597-y.

### Abstract

Clustering streaming data is gaining importance as automatic data acquisition technologies are deployed in diverse applications. We propose a fully incremental projected divisive clustering method for high-dimensional data streams that is motivated by high density clustering. The method is capable of identifying clusters in arbitrary subspaces, estimating the numberof clusters, and detecting changes in the data distribution which necessitate a revision of the model. The empirical evaluation of the proposed method on numerous real and simulated datasets shows that it is scalable in dimension and number of clusters, is robust to noisy and irrelevant features, and is capable of handling a variety of types of non-stationarity.

**Hofmeyr, D.**, Pavlidis, N., Eckley, I., (2015)Minimum spectral connectivity projection pursuit for unsupervised classification. In Submission.

### Abstract

We study the problem of determining the optimal univariate subspace for maximising the separability of a binary partition of unlabeled data, as measured by spectral graph theory. This is achieved by finding projections which minimise the second eigenvalue of the Laplacian matrices of the projected data, which corresponds to a non-convex, non-smooth optimisation problem. We show that the optimal projection based on spectral connectivity converges to the vector normal to the maximum margin hyperplane through the data, as the scaling parameter is reduced to zero. This establishes a connection between connectivity as measured by spectral graph theory and maximal Euclidean separation. It also allows us to apply our methodology to the problem of finding large margin linear separators. The computational cost associated with each eigen-problem is quadratic in the number of data. To mitigate this problem, we propose an approximation method using microclusters with provable approximation error bounds. We evaluate the performance of the proposed method on simulated and publicly available data sets and find that it compares favourably with existing methods for projection pursuit and dimension reduction for unsupervised data partitioning.

**James, T.**, Glazebrook, K. D., Lin, K. (2014).Developing Effective Service Policies for Multiclass Queues with Abandonment: Asymptotic Optimality and Approximate Policy Improvement. INFORMS Journal of Computing, 28, 2, p. 251-264.

### Abstract

A single server is faced with impatient customers across multiple customer classes. Each customer has a random lifetime during which she is available for preemptive service. Should a customer be served before this lifetime expires, a reward is received; otherwise, the customer is lost from the system. The goal is to determine a service policy which maximises the long-run reward rate. Standard methods can only compute the optimal policy for problems of small size; hence, we seek effective heuristic approaches. We first develop an index policy which is asymptotically optimal as customer abandonment rates approach zero, if the system is stable without customer abandonment. We next consider extensions of the model and show how generalisations of the index policy retain the same asymptotic optimality. Finally, we develop an approximate policy improvement method which uses simulation to estimate bias functions at a set of carefully chosen states, which are then used to interpolate bias functions for all states. Policy improvement based on these approximate bias functions produces a heuristic policy which performs very well in an extensive set of numerical experiments.

**Kereszturi, M.**, Tawn, J. A. and Jonathan, P. (2015).Assessing extremal dependence of North Sea storm severity. Ocean Engineering, 118, 242-259.

### Abstract

Extreme value theory provides asymptotically motivated methods for modelling the occurrences of extreme values of oceanographic variables, such as significant wave height at a single location. Problems are often multivariate or spatial in nature, where interest lies in the risk associated with joint occurrences of rare events, e.g. the risk to multiple offshore structures from a storm event. There are two different classes of joint tail behaviour that have very different implications: asymptotic independence suggesting that extreme events are unlikely to occur together, and asymptotic dependence implying that extreme events can occur simultaneously. It is vital to have good diagnostics to identify the appropriate dependence class. If variables are asymptotically independent, incorrectly assuming an asymptotically dependent model can lead to overestimation of the joint risk of extreme events, and hence to higher than necessary design costs of offshore structures. We develop improved diagnostics for differentiating between these two classes, which leads to increased confidence in model selection. Application to samples of North Sea sea-state and storm-peak significant wave height suggests that tail dependence changes with wave direction and distance between spatial locations, and that in most cases asymptotic independence seems to be the more appropriate assumption.

**Kereszturi, M.**, Tawn, J. A. (2016).Properties of extremal dependence models built on bivariate max-linearity. To Appear in J. Multivariate Analysis.

### Abstract

Bivariate max-linear models provide a core building block for characterising bivariate max-stable distributions. The limiting distribution of marginally normalised componentwise maxima of bivariate max-linear models can be dependent (asymptotically dependent) or independent (asymptotically independent). However, for modelling bivariate extremes they have weaknesses in that they are exactly max-stable with no penultimate form of convergence to asymptotic dependence, and asymptotic independence arises if and only if the bivariate max-linear model is independent. In this work we present more realistic structures for describing bivariate extremes. We show that these models are built on bivariate max-linearity but are much more general. In particular, we present models that are dependent but asymptotically independent and others that are asymptotically dependent but have penultimate forms. We characterise the limiting behaviour of these models using two new different angular measures in a radial-angular representation that reveal more structure than existing measures.

Letchford, A.N. and **Nasiri, S.D.** (2015).The Steiner travelling salesman problem with correlated costs. Eur. J. Oper. Res., 245(1), 62–69.

### Abstract

The Steiner Travelling Salesman Problem (STSP) is a variant of the TSP that is suitable for instances defined on road networks. We consider an extension of the STSP in which the road traversal costs are both stochastic and correlated. This happens, for example, when vehicles are prone to delays due to rush hours, road works or accidents.Following the work of Markowitz on portfolio selection, we model this problem as a bi-objective mean-variance problem. Then, we show how to approximate the e?fficient frontier via integer programming. We also show how to exploit certain special structures in the correlation matrices. Computational results indicate that our approach is viable for instances with up to 100 nodes. It turns out that minimum variance tours can look rather different from minimum expected cost tours.

Letchford, A.N. and **Nasiri, S.D.** (2013).Pricing Routines for Vehicle Routing with Time Windows on Road Networks. Comput. & Oper. Res., 51, 331-337.

### Abstract

Several very effective exact algorithms have been developed for vehicle routing problems with time windows. Unfortunately, most of these algorithms cannot be applied to instances that are defined on road networks, because they implicitly assume that the cheapest path between two customers is equal to the quickest path. Garaix and co-authors proposed to tackle this issue by first storing alternative paths in an auxiliary multi-graph, and then using that multi-graph within a branch-and-price algorithm. We show that, if one works with the original road network rather than the multi-graph, then one can solve the pricing subproblem more quickly, in both theory and practice.

Letchford, A. N. & **Nasiri, S. D.** (2013).Compact formulations of the Steiner TSP and related problems. Eur. J. Oper. Res., 228, 83-92.

### Abstract

The Steiner Travelling Salesman Problem (STSP) is a variant of the TSP that is particularly suitable when routing on real-life road networks. The standard integer programming formulations of both the TSP and STSP have an exponential number of constraints. On the other hand, several compact formulations of the TSP, i.e., formulations of polynomial size, are known. In this paper, we adapt some of them to the STSP, and compare them both theoretically and computationally. It turns out that, just by putting the best of the formulations into the CPLEX branch-and-bound solver, one can solve instances with over 200 nodes. We also briefly discuss the adaptation of our formulationsto some related problems.

**Maidstone, R.**, Hocking, T., Rigaill, G., Fearnhead, P. (2014).On Optimal Multiple Changepoint Algorithms for Large Data. To Appear in Statistics and Computing.

### Abstract

There is an increasing need for algorithms that can accurately detect changepoints in long time-series, or equivalent, data. Many common approaches to detecting changepoints, for example based on penalised likelihood or minimum description length, can be formulated in terms of minimising a cost over segmentations. Dynamic programming methods exist to solve this minimisation problem exactly, but these tend to scale at least quadratically in the length of the time-series. Algorithms, such as Binary Segmentation, exist that have a computational cost that is close to linear in the length of the time-series, but these are not guaranteed to find the optimal segmentation. Recently pruning ideas have been suggested that can speed up the dynamic programming algorithms, whilst still being guaranteed to find true minimum of the cost function. Here we extend these pruning methods, and introduce two new algorithms for segmenting data, FPOP and SNIP. Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data. We evaluate the method at detecting Copy Number Variations and observe that FPOP has a computational cost that is competitive with that of Binary Segmentation.

**Maidstone, R.**, **Pickering, B.** (2013).Comment on "Multiscale Change-Point Inference" by K. Frick, A. Munk and H. Sieling.

### Abstract

Multiscale Change-Point Inference (Frick et al., 2014) introduces a novel estimator used for the detection of changepoints, known as the ‘Simultaneous Multiscale Change-Point Estimator’ (SMUCE). This comment piece, published alongside the original paper, explores the performance of SMUCE in the case of large datasets with an increasing number of changepoints. Comparisons are drawn with other leading changepoint detection methods.

Malory, S., Sherlock, C.(2016).Residual-Bridge Constructs for Conditioned Diffusions. Submitted.

### Abstract

We introduce a new residual-bridge proposal for approximately simulating conditioned diffusions. This proposal is formed by applying the modified diffusion bridge approximation of Durham and Gallant (2002) to the difference between the true diffusion and a second, approximate diffusion driven by the same Brownian motion, and can be viewed as a natural extension to recent work on residual-bridge constructs (Whitaker et al., 2016). This new proposal attempts to account for volatilities which are not constant and can therefore lead to gains in efficiency over the recently proposed residual-bridge constructs in situations where the volatility varies considerably, as is often the case for larger inter-observation times and for time-inhomogeneous volatilities. These potential gains in efficiencies are illustrated via a simulation study.

**Nemeth, C.**, Fearnhead, P. and Mihaylova, L. (2013).“Sequential Monte Carlo Methods for State and Parameter Estimation in Abruptly Changing Environments”. IEEE Transactions on Signal Processing, Vol. 62, No. 5, pp. 1245-1255.

### Abstract

This paper develops a novel sequential Monte Carlo (SMC) approach for joint state and parameter estimation that can deal eciently with abruptly changing parameters which is a common case when tracking ma- neuvering targets. The approach combines Bayesian methods for dealing with changepoints with methods for estimating static parameters within the SMC framework. The result is an approach which adaptively esti- mates the model parameters in accordance with changes to the target's trajectory. The developed approach is compared against the Interacting Multiple Model (IMM) lter for tracking a maneuvering target over a complex maneuvering scenario with nonlinear observations. In the IMM lter a large combination of models is required to account for unknown parameters. In contrast, the proposed approach circumvents the combina- torial complexity of applying multiple models in the IMM lter through Bayesian parameter estimation techniques. The developed approach is validated over complex maneuvering scenarios where both the system pa- rameters and measurement noise parameters are unknown. Accurate es- timation results are presented.

**Nemeth, C**., Fearnhead, P. and Mihaylova, L. (2016).Particle approximations of the score and observed information matrix for parameter estimation in state space models with linear computational cost. Journal of Computational and Graphical Statistics. 25, 1138-1157.

### Abstract

Poyiadjis et al. (2011) show how particle methods can be used to esti- mate both the score and the observed information matrix for state-space models. These methods either su er from a computational cost that is quadratic in the number of particles, or produce estimates whose variance increases quadratically with the amount of data. This paper introduces an alternative approach for estimating the score and information matrix, which has a computational cost that is linear in the number of particles. The method is derived using a combination of kernel density estimation to avoid the particle degeneracy that causes the quadratically increasing variance, and Rao-Blackwellisation. Crucially, we show the method is ro- bust to the choice of bandwidth within the kernel density estimation, as it has good asymptotic properties regardless of this choice. Our estimates of the score and observed information matrix can be used within both online and batch procedures for estimating parameters for state-space models. Empirical results show improved parameter estimates compared to exist- ing methods at a signi cantly reduced computational cost.

**Nemeth, C.**,Sherlock, C. and Fearnhead, P. (2016).Particle Metropolis-adjusted Langevin algorithms. Biometrika. 103, 701-717.

### Abstract

This paper proposes a new sampling scheme based on Langevin dynamics that is applicablewithin pseudo-marginal and particle Markov chain Monte Carlo algorithms. We investigate thisalgorithm’s theoretical properties under standard asymptotics, which correspond to an increasingdimension of the parameters, n. Our results show that the behaviour of the algorithm dependscrucially on how accurately one can estimate the gradient of the log-target. If the error in theestimate of the gradient is not sufficiently controlled as dimension increases, then asymptoticallythere will be no advantage over the simpler random-walk algorithm. However, if the error issufficiently well-behaved, then the optimal scaling of this algorithm will be O(n^{-1/6} ) comparedto O(n^{−1/2}) for the random-walk. Our theory also gives guidelines on how to tune the numberof Monte Carlo samples in the likelihood estimate and the proposal step-size.

Palvidis, N., **Hofmeyr, D.**, Tasoulis, S., (2015)Minimum density hyperplanes. Journal of Machine Learning Research, 17(156):1-33, 2016.

### Abstract

Associating distinct groups of objects (clusters) with contiguous regions of high probability density (high-density clusters), is central to many statistical and machine learning approaches to the classification of unlabelled data. We propose a novel hyperplane classifier for clustering and semi-supervised classification which is motivated by this objective. The proposed minimum density hyperplane minimises the integral of the empirical probability density function along it, thereby avoiding intersection with high density clusters. We show that the minimum density and the maximum margin hyperplanes are asymptotically equivalent, thus linking this approach to maximum margin clustering and semi-supervised support vector classifiers. We propose a projection pursuit formulation of the associated optimisation problem which allows us to find minimum density hyperplanes efficiently in practice, and evaluate its performance on a range of benchmark data sets. The proposed approach is found to be very competitive with state of the art methods for clustering and semi-supervised classification.

**Park, T.** Eckley, I. A. and Ombao H. C. (2013).Estimating time-evolving dependenc quantities between signals via multivariate locally stationary wavelet processes”, OEEE Transactions on Signal Processing, 62, 5240-5250.

### Abstract

We consider the problem of estimating cross-dependence in a collection of non-stationary signals.To this end we develop the multivariate locally stationary wavelet process which provides a time-scaledecomposition of the signals and thus naturally captures the time-evolving scale-specific dependencestructure in the signals. Under the proposed model, we rigorously define and estimate two forms ofdependence measures: wavelet coherence and wavelet partial coherence. These dependence measuresdiffer in a subtle, but important way. The former is a broad measure of dependence which may includeindirect associations (i.e. dependence between a pair of signals that is being driven by another signal).Conversely the wavelet partial coherence provides a measure of the direct linear association between apair of signals (i.e. decoupling the (linear) effect of other observed signals). Our novel time-scale waveletpartial coherence estimation scheme thus provides us with a mechanism to identify hidden relationshipswithin the network of non-stationary signals, as we demonstrate on electroencephalograms recorded ina visual-motor experiment.

**Park, T.**, Eckley, I. A., and Ombao, H. (2015).Dynamic classification of multivariate time series using the multivariate locally stationary wavelet model (under revision).

### Abstract

Methods for the supervised classification of signals generally aim to assign a signal to one class forits entire time span. In this paper we present an alternative formulation for multivariate signals wherethe class membership is permitted to change over time. Our aim therefore changes from classifyingthe signal as a whole to classifying the signal at each time point to one of a fixed number of knownclasses. We assume that each class is characterised by a different stationary generating process, thesignal as a whole will however be nonstationary due to class switching. To capture this nonstationaritywe use the recently proposed Multivariate Locally Stationary Wavelet model. To account for uncertaintyin class membership at each time point our goal is not to assign a definite class membership but ratherto calculate the probability of a signal belonging to a particular class. Under this framework we provesome asymptotic consistency results. This method is also shown to perform well when applied to bothsimulated and accelerometer data. In both cases our method is able to place a high probability on thecorrect class for the majority of time points.

Randell, D., **Turnbull, K.**, Ewans, K., and Jonathan, P. (2015).Bayesian inference for non-stationary marginal extremes. In Submission.

### Abstract

We propose a simple piecewise model for a sample of peaks-over-threshold, non-stationary with respect to multidimensional covariates, estimated using a carefully-designed computationally-efficient Bayesian inference. Model parameters are themselves parameterised as functions of covariates using penalised B-spline representations. This allows detailed characterisation of non-stationarity extreme environments. The approach gives similar inferences to a comparable frequentist penalised maximum likelihood method, but is computationally considerably more efficient and allows a more complete characterisation of uncertainty in a single modelling step. We use the model to quantify the joint directional and seasonal variation of storm peak significant wave height at a northern North Sea location, and estimate predictive directional-seasonal return value distributions necessary for the design and reliability assessment of marine and coastal structures.

Reeve, D.T., Randall, D., Ewans, K. C. and Jonathan, P. (2012).Uncertainty due to choice of measurement scale in Extreme Value modelling of North Sea Storm Severity. Ocean Engineering, 53, 164-176.

**Rohrbeck, C.**, Costain, D and Frigessi, A. (2016).Bayesian Spatial Monotonic Multiple Regression. In submission.

### Abstract

We consider monotonic, multiple regression for a set of contiguous regions (lattice data). The regression functions permissibly vary between regions and exhibit geographical structure. We develop new Bayesian non-parametric methodology which allows for both continuous and discontinuous functional shapes and which are estimated using marked point processes and reversible jump Markov Chain Monte Carlo techniques. Geographical dependency is incorporated by a flexible prior distribution; the parametrisation allows the dependency to vary with functional level. The approach is tuned using Bayesian global optimization and cross-validation. Estimates enable variable selection, threshold detection and prediction as well as the extrapolation of the regression function. Performance and flexibility of our approach is illustrated by simulation studies and an application to a Norwegian insurance data set.

**Rohrbeck, C.**, Eastoe, E. F., Frigessi, A., and Tawn, J. A.(2016).Extreme Value Modelling of Water-related Insurance Claims. Submitted.

### Abstract

This paper considers the dependence between weather events, e.g. rainfall or snow-melt, and the number of water-related property insurance claims. Weather events which cause severe damages are of general interest, decision makers want to take efficient actions against them while the insurance companies want to set adequate premiums. The modelling is challenging since the underlying dynamics vary across geographical regions due to differences in topology, construction designs and climate. We develop new methodology to improve the existing models which fail to model high numbers of claims. The statistical framework is based on both mixture and extremal mixture modelling, with the latter being based on a discretized generalized Pareto distribution. Furthermore, we propose a temporal clustering algorithm and derive new covariates which lead to a better understanding of the association between claims and weather events. The modelling of the claims, conditional on the locally observed weather events, both fits the marginal distributions well and captures the spatial dependence between locations. Our methodology is applied to three cities across Norway to demonstrate its benefits.

**Ross, E.**, Kirkbride, C., Shakya, S. and Owusu, G. (2015).Cross-trained workforce planning for service industries: The effects of temporal demand flexibility. In submission.

### Abstract

This paper develops a multi-period cross-trained workforce planning model with temporal demand flexibility. Temporal demand flexibility incorporates the option to utilise spare capacity by completing some work early. It also provides the capability to model the flow of incomplete work (or carryover ) across the planning horizon. Set in a proposed Aggregate Planning stage, the model permits the planning of large and complex workforces over a horizon of many months and provides a bridge between the traditional Tactical and Operational stages of workforce planning. The performance of the different levels of planning flexibility the model offers is evaluated in an industry motivated case study. An extensive numerical study, under various supply and demand characteristics, then broadens the application of the modelling of temporal demand flexibility and evaluates the value of cross-training as a supply strategy in this domain.

**Sharkey, P.** and Tawn, J.A. (2016).A Poisson process reparameterisation for Bayesian inference for extremes. Extremes (to appear).

### Abstract

In extreme value theory, excesses above a threshold are typically modelled using the Generalised Pareto (GP) distribution. However, the parameterisation of the GP is merely a special case of a Poisson process representation of threshold excesses. Unlike the GP, the parameters of the Poisson process model are invariant to choice of threshold, which makes it more suitable for covariate modelling and hence suggests that it may be the better parameterisation to use. In contrast, it has been found that the parameters are highly dependent, making estimation more difficult. In a likelihoodframework, numerical maximisation schemes tend to converge to local optima, with the consequence that parameter estimates are largely influenced by choice of starting values. For this reason, we choose to work in a Bayesian setting.

A common way of overcoming parameter dependence in a Markov Chain Monte Carlo (MCMC) procedure is to explore the parameter space using a dependent proposalvariance, though this requires a knowledge of the parameter dependence structure a priori. Even in this case, the dependence structure potentially varies in different regions of the parameter space,which would require different parameterisations of the proposal variance to be applied. This paper illustrates an approach to improving estimation and efficiency of the Poisson processmodel. Our method exploits the scaling factor in the Poisson process likelihood as a means of creating a near-orthogonal representation of the parameter space, ensuring more rapid convergence and improved sampling from the MCMC scheme, before transforming to the parameterisation of interest. This work is presented for use in both stationary and non-stationary modelling of threshold excesses.

Turner, L., Dimitrov, N. and **Fearnhead, P.** (2016).Bayes linear methods for large-scale network search. Submitted.

### Abstract

Consider the problem of searching a large set of items, such as emails, for a small set which are relevant to a given query. This can be implemented in a sequential manner whereby we use knowledge from earlier items that we have screened to help us choose future items in an informed way. Often the items we are searching have an underlying network structure: for example emails can be related to a network of participants, where an edge in the network relates to the presence of a communication between those two participants. Recent work by Dimitrov, Kress and Nevo has shown that using the information about the network structure together with a modelling assumption that relevant items and participants are likely to cluster together, can greatly increase the rate of screening relevant items. However their approach is computationally expensive and thus limited in applicability to small networks. Here we show how Bayes Linear methods provide a natural approach to modelling such data; that they output posterior summaries that are most relevant to heuristic policies for choosing future items; and that they can easily be applied to large-scale networks. Both on simulated data, and data from the Enron Corpus, Bayes Linear approaches are shown to be applicable to situations where the method of Dimitrov et al. is infeasible; and give substantially better performance than methods that ignore the network structure.

**Williamson, F.**, Jacko, P., Jaki, T. and Villar S. (2015).A Bayesian adaptive design for clinical trials in rare diseases. Computational Statistics & Data Analysis, DOI: 10.1016/j.csda.2016.09.006.

### Abstract

Development of treatments for rare diseases is challenging due to the limited number of patients available for participation. Learning about treatment effectiveness with a view to treat patients in the larger outside population, as in the traditional fixed randomised design, may not be a plausible goal. An alternative goal is to treat the patients within the trial as effectively as possible. Using the framework of finite-horizon Markov decision processes and dynamic programming (DP), a novel randomised response-adaptive design is proposed which maximises the total number of patient successes in the trial and penalises if a minimum number of patients are not recruited to each treatment arm. Several performance measures of the proposed design are evaluated and compared to alternative designs through extensive simulation studies using a recently published trial as motivation. For simplicity, a two-armed trial with binary endpoints and immediate responses is considered. Simulation results for the proposed design show that: (i) the percentage of patients allocated to the superior arm is much higher than in the traditional fixed randomised design; (ii) relative to the optimal DP design, the power is largely improved upon and (iii) it exhibits only a very small bias and mean squared error of the treatment effect estimator. Furthermore, this design is fully randomised which is an advantage from a practical point of view because it protects the trial against various sources of bias. As such, the proposed design addresses some of the key issues that have been suggested as preventing so-called bandit models from being implemented in clinical practice.

**Winter, H. C.** and Tawn, J. A. (2016).Modelling heatwaves in central France: a case study in extremal dependence. Journal of the Royal Statistical Society: Series C, 65(3), 345-365.

### Abstract

Heatwaves are a phenomena that have large social and economic consequences. Understanding and estimating the frequency of such events is of great importance to climate scientists and decision makers. Heatwaves are a type of extreme event which are by definition rare and as such there exists little data in the historical record to help planners. Extreme value theory is a general framework from which inference can be drawn from extreme events. When modelling heatwaves it is important to take into account the intensity and duration of events above a critical level as well as the interaction between both factors. Previous methods assume independence between the duration distribution and the critical level, a shortcoming that can lead to incorrect inferences. This paper characterises a novel method for analysing the temporal dependence of heatwaves with reference to observed temperatures from Orleans in central France, enabling the probability of observing heatwaves as severe as the 2003 European heatwave to be estimated with more accuracy.

**Winter, H. C.**, Tawn, J. A. and Brown, S. J. (2017).Detecting changing behaviour of heatwaves with climate change. Accepted for publication in Dynamics and Statistics of the Climate System.

### Abstract

The question of how extreme events will change with future human induced climate change has become very important in recent years. Heatwaves are a type of extreme event that are defined as a period of persistent hot temperatures. Extreme value theory provides a general framework from which inference can be drawn from extreme events. Previous extreme value studies have focused on how a certain critical level (often defined in terms of the return level) varies with a covariate, such as global mean temperature, without considering how dependence between extreme events might change. An increase in the level of extremal dependence with climate change could lead to longer and more severe heatwave events. Coupling this with any changes in the marginal behaviour of heatwaves could lead to drastic changes that will need to be mitigated for. This paper investigates marginal features and temporal dependence of heatwaves with reference to temperature values taken from a selection of 13 global climate models forced under the A2 climate scenario (large local growth with heavy reliance on fossil fuels). Unlike previous methods we quantify the likelihood of changes in the duration and severity of heatwave events with climate change.

**Winter, H. C.**, Tawn, J. A. and Brown, S. J. (2016).Modelling the effect of the El Niño-Southern Oscillation on extreme spatial temperature events over Australia. Accepted for publication in The Annals of Applied Statistics.

### Abstract

When assessing the risk posed by high temperatures, it is necessary to consider not only the temperature at separate sites but also how many sites are expected to be hot at the same time. Hot events that cover a large area have the potential to put a great strain on health services and cause devastation to agriculture, leading to high death tolls and much economic damage. Australia experienced a severe heatwave in 2009; 374 people died and Melbourne recorded its highest temperature since records began in 1859. One area of particular interest in climate science is the effect of large scale climatic phenomena, such as the El Niño-Southern Oscillation (ENSO), on extreme temperatures. Here, we develop a framework based upon extreme value theory to estimate the effect of ENSO on extreme temperatures across Australia. This approach permits us to estimate the change in temperatures with ENSO at important sites, such as Melbourne, and also whether we are more likely to observe hot temperatures over a larger spatial extent during a particular phase of ENSO. To this end, we design a set of measures that can be used to effectively summarise all important aspects of an extreme temperature event. These measures are estimated using our extreme value framework and we validate whether we can accurately replicate the 2009 Australian heatwave, before using the model to estimate the probability of having a more severe event than was observed.

**Winter, H. C.**, Tawn, J. A. (2016). kth-order Markov models for assessing heatwave risks. Extremes. doi:10.1007/s10687-016-0275-z.

### Abstract

Heatwaves are most commonly defined as a set of hot days and nights that cause a marked short-term increase in excess mortality. As such obtaining accurate estimates of the probability of an event lasting many days is important. Previous studies have often made a first-order Markov assumption to simplify the extreme value models that are fitted. A first-order Markov assumption does not capture the prevailing conditions, i.e. whether the previous days have seen an increase or decrease in temperatures. This paper provides a kth-order Markov model framework for incorporating higher-order information when evaluating the probabilistic characteristics of heatwave events using a conditional approach developed for multivariate extremes; an approach that encompasses both asymptotic dependence and asymptotic independence. We provide novel methods for the selection of the order Markov process that are based upon the structure of the extremes. The methods improve upon standard time-series methods for this selection when the extremal structure is less complicated than the structure in the body of the distribution or vice versa. Finally, extremal quantities, such as the probability of a heatwave event lasting a certain number of days, are estimated under the new framework for observed temperatures at Orleans in central France. These estimates are compared with results obtained under a first-order Markov assumption, which are shown to underestimate the probability of long and potentially devastating heatwave events like the European 2003 heatwave event.

Yates, K., Pavlidis, N. and Hofmeyr, D. (2016).Density clustering for mixed and high-dimensional data. Submitted.

### Abstract

We propose an approach to perform clusteringof high-dimensional datasets containing diversedata types, which is able to identify high-density clustersin arbitrarily oriented subspaces. We first transformthe data to obtain an appropriate continuous representation,and then perform projection pursuit to locatelow-density linear cluster separators. Combiningmultiple separators we obtain divisive and partitioningclustering algorithms that can estimate the number ofclusters. The resulting clusters concentrate around themodes of the estimated density of the transformed data.Through extensive simulation and empirical evaluation,we show that the proposed algorithms produce consistentlyhigh quality results across simulated and realworlddatasets with different characteristics, and thattheir performance is competitive with state-of-the-artalgorithms.

### Conference Papers

Briggs, K., **Fairbrother, J.** (2014). Enhanced algorithms for TV whitespace power maximization.

### Abstract

The secondary use of television spectrum (TV whitespace) by low-power radio devices is now areality in the UK and the USA. The UK regulator Ofcom has released rules for computation ofmaximum allowed transmit power. These rules, however, allow some leeway for different algo-rithms for internal steps, and one would like to exploit this leeway to the advantage of secondaryusers, while still protecting the primary user to the required degree. We describe here two tech-niques implemented in a current BT software product, which additionally allows a compute-on-demand mode of operation, in distinction to so-called TV whitespace databases which merely lookup pre-computed tables.

**Davies, R.**, Mihaylova, L., Pavlidis, N. and Eckley, I.A. (2013). The effect of recovery algorithms on compressive sensing background subtraction” In Sensor Data Fusion: Trends, Solutions, Applications (SDF), 2013 Workshop on (pp. 1-6). IEEE.

**Hofmeyr, D.** (2016) On the topology of genetic algorithms. Proc. IJCAI (2016).

### Abstract

Genetic algorithms are stochastic search heuristics which are popular for their broad applicabil- ity, especially in combinatorial search problems. The search mechanism relies on an abstraction of genetic evolution and selection as seen in na- ture. This paper introduces a topological structure for the search space which is consistent with existing theory and practice for genetic algorithms, namely forma analysis. A notion of convexity is de- fined within this context and connections between this definition and forma analysis are established. This framework provides an alternative perspective on the exploitation/exploration dilemma as well as population convergence, which relates directly to the genetic operators employed to drive the evolu- tion process. It also provides a different interpretation of design constraints associated with genetic algorithm implementations. The intention is to provide a different analytical perspective for genetic algorithms, and to establish a connection with exact search methods through the concept of convexity with the hope that some theoretical and method- ological ideas can be ported between the two, thus ultimately improving the methodological aspects of genetic algorithms.

### Abstract

Background subtraction is a key method required to aid processing surveillance videos. Current methods require storing each pixel of every video frame, which can be wasteful as most of this information refers to the uninteresting background. Compressive sensing can offer an efficient solution by using the fact that foreground is often sparse in the spatial domain. By making this assumption and applying a specific recovery algorithm to a trained background, it is possible to reconstruct the foreground, using only a low dimensional representation of the difference between the current frame and the estimated background scene. Although new compressive sensing background subtraction algorithms are being created, no study has been made of the effect of recovery algorithms on performance of background subtraction. This is considered by applying both Basis Pursuit and Orthogonal Matching Pursuit (OMP) to a standard test video, and comparing their accuracy.

**Hofmeyr, D.** And Pavlidis, N. (2015)Maximum Clusterability Divisive Clustering. Proc. IEEE SSCI (CIDM)

The notion of clusterability is often used to determine how strong the cluster structure within a set of data is, as well as to assess the quality of a clustering model. In multivariate applications, however, the clusterability of a data set can be obscured by irrelevant or noisy features. We study the problem of finding low dimensional projections which maximise the clusterability of a data set. In particular, we seek low dimensional representations of the data which maximise the quality of a binary partition. We use this bi-partitioning recursively to generate high quality clustering models. We illustrate the improvement over standard dimension reduction and clustering techniques, and evaluate our method in experiments on real and simulated data sets.

**Hofmeyr, D.** And Pavlidis, N. (2015)Semi-supervised Spectral Connectivity Projection Pursuit. Proc. PRASA RobMech Int. Conf.

We propose a projection pursuit method based on semi-supervised spectral connectivity. The projection index is given by the second eigenvalue of the graph Laplacian of the projected data. An incomplete label set is used to modify pairwise similarities between data in such a way that penalises projections which do not admit a separation of the classes (within the training data). We show that the global optimum of the proposed problem converges to the Transductive Support Vector Machine solution, as the scaling parameter is reduced to zero. We evaluate the performance of the proposed method on benchmark data sets.

**Morgan, L. E.**, Nelson, B. L., Worthington, D. and Titman, A. (2016).“Input Uncertainty Quantification for Simulation Models with Piecewise-Constant Non-Stationary Poisson Arrival Processes” The Winter Simulation Conference 2016

### Abstract

Input uncertainty (IU) is the outcome of driving simulation models using input distributions estimated by finite amounts of real-world data. Methods have been presented for quantifying IU when stationary input distributions are used. In this paper we extend upon this work and provide two methods for quantifying IU in simulation models driven by piecewise-constant non-stationary Poisson arrival processes. Numerical evaluation and illustrations of the methods are provided and indicate that the methods perform well.

**Nemeth, C.**, Fearnhead, P., Mihaylova, L. and Vorley, D. (2012).“Particle learning for state and parameter estimation,” 9Th IET Data Fusion and Target Tracking Conference (DF&TT 2012), London U.K.

### Abstract

This paper presents an approach for online parameter estimation within particle lters. Current research has mainly been focused towards the estimation of static parameters. However, in scenarios of target maneuver- ability, it is often necessary to update the parameters of the model to meet the changing conditions of the target. The novel aspect of the proposed approach lies in the estimation of non-static parameters which change at some unknown point in time. Our parameter estimation is updated using changepoint analysis, where a changepoint is identi ed when a signi cant change occurs in the observations of the system, such as changes in direction or velocity.

**Nemeth, C.**, Fearnhead, P., Mihaylova, L. and Vorley, D. (2012).“Bearings-Only Tracking with Particle Filtering for Joint Parameter and State Estimation,” 15Th International Conference on Information Fusion, pp. 824-831.

### Abstract

This paper considers the problem of bearings only tracking of manoeuvring targets. A learning particle filtering algorithm is proposed which can estimate both the unknown target states and unknown model parameters. The algorithm performance is validated and tested over a challenging scenario with abrupt manoeuvres. A comparison of the proposed algorithm with the Interacting Multiple Model (IMM) filter is presented. The learning particle filter has shown accurate estimation results and improved accuracy compared with the IMM filter.

Pike-Burke, C., Grünewälder, S. (2017)."Optimistic Planning for the Stochastic Knapsack Problem," in Proceedings of 20th International Conference on Artificial Intelligence and Statistics

### Abstract

The stochastic knapsack problem is a stochastic resource allocation problem that arises frequently and yet is exceptionally hard to solve. We derive and study an optimistic planning algorithm specifically designed for the stochastic knapsack problem. Unlike other optimistic planning algorithms for MDPs, our algorithm, OpStoK, avoids the use of discounting and is adaptive to the amount of resources available. We achieve this behavior by means of a concentration inequality that simultaneously applies to capacity and reward estimates. Crucially, we are able to guarantee that the aforementioned confidence regions hold collectively over all time steps by an application of Doob's inequality. We demonstrate that the method returns an \epsilon-optimal solution to the stochastic knapsack problem with high probability. To the best of our knowledge, our algorithm is the first which provides such guarantees for the stochastic knapsack problem. Furthermore, our algorithm is an anytime algorithm and will return a good solution even if stopped prematurely. This is particularly important given the difficulty of the problem. We also provide theoretical conditions to guarantee OpStoK does not expand all policies and demonstrate favorable performance in a simple experimental setting.

Randell, D., **Zanini, E.**, Vogel, M., Ewans, K., Jonathan, P. (2014).“Omnidirectional Return values for Storm Severity from Directional Extreme Values Models: The Effect of Physical Environment and Sample Size. Proceedings of 33nd International Conference on Ocean, Offshore and Arctic Engineering, San Francisco OMAE2014-23156.

### Abstract

Ewans and Jonathan [2008] shows that characteristics of extreme storm severity in the northern North Sea vary with storm direction. Jonathan et al. [2008] demonstrates, when directional effects are present, that omnidirectional return values should be estimated using a directional extreme value model. Omnidirectional return values so calculated are different in general to those estimated using a model which incorrectly assumes stationarity with respect to direction. The extent of directional variability of extreme storm severity depends on a number of physical factors, including fetch variability. Our ability to assess directional variability of extreme value parameters and return values also improves with increasing sample size in general. In this work, we estimate directional extreme value models for samples of hind-cast storm peak significant wave height from locations in ocean basins worldwide, for a range of physical environments, sample sizes and periods of observation. At each location, we compare distributions of omnidirectional 100-year return values estimated using a directional model, to those (incorrectly) estimated assuming stationarity. The directional model for peaks over threshold of storm peak significant wave height is estimated using a non-homogeneous point process model as outlined in Randell et al. [2013]. Directional models for extreme value threshold (using quantile regression), rate of occurrence of threshold exceedances (using a Poisson model), and size of exceedances (using a generalised Pareto model) are estimated. Model parameters are described as smooth functions of direction using periodic Bsplines. Parameter estimation is performed using maximum likelihood estimation penalised for parameter roughness. A bootstrap re-sampling procedure, encompassing all inference steps, quantifies uncertainties in, and dependence structure of, parameter estimates and omnidirectional return values.

Yates, K. and Pavlidis, N. (2016).Minimum density hyperplanes in the feature space. Advances in High Dimensional Big Data" workshop, IEEE international conference on Big Data, Washington.

### Abstract

We introduce a kernel formulation of the recentlyproposed minimum density hyperplane approach to clustering.This enables the identification of clusters that are not linearlyseparable in the input space by mapping them to a potentiallyvery high-dimensional feature space. This mapping alsoextends the applicability of the minimum density hyperplaneto datasets whose features are not necessarily continuous.The location of minimum density hyperplanes involves theoptimisation of a non-convex objective function. In the featurespace, the dimensionality of the optimisation problem is n,where n is the number of observations. We further proposean approximation method that can substantially reduce thedimensionality of the search space, avoiding searching overdimensions which are unlikely to contain useful informationfor clustering. Experimental results suggest that the proposedapproach is capable of producing high quality partitions acrossa number of benchmark datasets.