Real-time Bayesian analysis


Currently our memcached cluster shields our database from heavy load. If the cache were to go down, it would take our application down with it. With a cold cache, we could avoid this downtime by switching into a "safe mode", where $1\%$ of cache-miss requests go to the master and warm the cache. The remaining requests go to read replicas, serve potentially stale data, and aren't used to populate the cache (to avoid stale sets).

I want to create a model that can recognize when the cache is likely in a cold state. I imagine I could track the cache hit rate over a period of time and get a good idea of its mean and variance. Given a prior on how frequently I expect to be in a cold state (rarely) and an observed hit rate, it seems I could form a posterior distribution over how likely we are to be in a cold state, and switch into "safe mode" past some threshold value.
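For a single observed hit rate, that posterior is just Bayes' rule over two hypotheses. A minimal sketch in Python, where the Gaussian likelihoods and every numeric parameter (prior, means, standard deviations) are illustrative assumptions rather than measured values:

```python
import math

def cold_posterior(hit_rate, prior_cold=0.001,
                   warm_mean=0.95, warm_sd=0.02,
                   cold_mean=0.30, cold_sd=0.10):
    """Posterior P(cold | observed hit rate) via Bayes' rule.

    The Gaussian likelihoods and all parameter values here are
    illustrative assumptions, not measurements from a real cache.
    """
    def normal_pdf(x, mu, sd):
        return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

    like_cold = normal_pdf(hit_rate, cold_mean, cold_sd)
    like_warm = normal_pdf(hit_rate, warm_mean, warm_sd)
    num = like_cold * prior_cold
    return num / (num + like_warm * (1 - prior_cold))
```

With these made-up parameters, a hit rate near the warm mean yields a posterior near zero, while a hit rate near the cold mean overwhelms even a very small prior.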

What would be the best way to measure hit rate? I could somewhat arbitrarily set a window of, say, $1$ second or $30$ seconds and measure the hit rate in that interval. But that raises the issue of finding the best window: smaller windows have larger variance, while too large a window means the model doesn't trigger "safe mode" quickly enough. What I really want is a sliding window in which more recent hits weigh more heavily in the model's decisions, but all previous hits are taken into account as well. What would be the best way to approach this?
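One standard realization of that idea is an exponentially weighted moving average (EWMA): every past request still contributes, but with geometrically decaying weight. A sketch, where the smoothing factor `alpha` and the optimistic initial value are illustrative assumptions:

```python
class EwmaHitRate:
    """Exponentially weighted estimate of the cache hit rate.

    alpha sets the effective window: recent requests dominate, but every
    past request contributes with geometrically decaying weight.
    alpha = 0.01 is an illustrative choice, roughly comparable to an
    effective window of ~1/alpha = 100 requests.
    """
    def __init__(self, alpha=0.01, initial=1.0):
        self.alpha = alpha
        self.rate = initial  # start optimistic: assume a warm cache

    def record(self, hit):
        # hit is True for a cache hit, False for a miss
        self.rate += self.alpha * ((1.0 if hit else 0.0) - self.rate)
        return self.rate
```

The decision rule would then compare `rate` against a threshold after each request; `alpha` trades responsiveness against variance in much the same way the window size does.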


There are probably a lot of ways to do this. You could use a Hidden Markov Model (HMM): let $X_t$ be the observed value at time point $t$, with $X_t \mid s_t \sim D_{s_t}$, where $s_t$ is a latent state indicating whether the observation comes from the normal or the cold state. That is, $s_t \in \{C, N\}$, with $C$ the cold state and $N$ the normal state.

Let $K \in \mathbb{Z}^+$ denote the order of the HMM.

Then $P[s_t = s \mid s_{t-1}, \ldots, s_{t-K}]$ gives the transition probabilities from each of the previous $K$ states to a new state $s$. In many cases people work with first-order HMMs, so $K = 1$, which gives $P[s_t = s \mid s_{t-1} = r] = Q_{rs}$, with $Q$ a $2 \times 2$ matrix containing the transition probabilities from each state to itself/the other state.
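For the real-time use case in the question, the filtered probability $P[s_t \mid X_1, \ldots, X_t]$ can be updated recursively with one step of the HMM forward algorithm per observation. A sketch in Python (shown here only for illustration; all numeric values are assumptions):

```python
def forward_step(prev, Q, like):
    """One step of the HMM forward filter: update P(s_t | x_1..t).

    prev: dict state -> P(s_{t-1} = state | x_1..t-1)
    Q:    dict (r, s) -> transition probability P[s_t = s | s_{t-1} = r]
    like: dict state -> likelihood of the new observation under that state
    """
    states = list(prev)
    # Predict: propagate last step's posterior through the transition matrix
    pred = {s: sum(prev[r] * Q[(r, s)] for r in states) for s in states}
    # Update: weight by the new observation's likelihood, then normalize
    unnorm = {s: like[s] * pred[s] for s in states}
    z = sum(unnorm.values())
    return {s: unnorm[s] / z for s in states}
```

Each request (or each batch of requests) triggers one such update, so the posterior over cold vs. normal is available continuously rather than only after a full refit.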

We can complete the model by putting a $\textrm{Dir}(\alpha)$ prior on the rows of $Q$, and putting priors on the emission distributions $D_C \sim \pi_C$ and $D_N \sim \pi_N$.

If you fit this using MCMC, you can approximate the probability of the state $s_t$ at each time point by the proportion of samples in which $s_t = s$, for $s \in \{C, N\}$.

You can fit this quickly in R, even for large data sets, using the HiddenMarkov package.

# Load the package
library("HiddenMarkov")
# Build a model of class "dthmm" with Gamma emission distributions, using simulated data
# The transition matrix for the simulated data
Pi <- matrix(c(0.8, 0.2, 0.3, 0.7), byrow = TRUE, nrow = 2)
# The number of samples in the sequence of observations
n <- 1000
# Generate the synthetic data
my.data <- dthmm(NULL, Pi, c(0.5, 0.5), "gamma", pm = list(shape = c(4, 0.1)),
                 pn = list(rate = c(rep(0.5, n/2), rep(1, n/2))))
my.data <- simulate(my.data, nsim = n)
# Fit using the Baum-Welch algorithm
y <- BaumWelch(my.data)

The list object y contains the posterior probabilities of each of the two states at each time point. You can refit fairly quickly after obtaining a new data point, which gives the posterior probability that the new data point is in each of the two states.

EDIT: Fixed a mistake in the example code that prevented it from running.