What are some different approaches to scoring an old Stock price prediction?


I want to compare the predictions of 2 different stock analysts and would like to assign a score to each prediction made by each of them, in a way that lets the two be compared to decide who is the better analyst.

The factors to take into account would be:

  1. How early the prediction was made; a correct prediction of a certain stock price being at $100 is worth more if it was made a year before the target date than if it was made a month before.

  2. How close the prediction was to the actual price; someone predicting $110 would score higher than someone predicting $120.

When we mix both factors, someone predicting $120 a year earlier scores higher than someone predicting $110 a month before.

I understand how I could do this to compare the two analysts on one single stock, by assigning different weights to each factor, but what should I do if I'm comparing stocks that cost anywhere from a couple of pennies to $100,000?

I'm looking for anything that can point me in the right direction.



BEST ANSWER

As some of the other answers/comments have pointed out, there are plenty of ways you could mix the two criteria to satisfy both earlier predictions being better and more accurate predictions being better.

If $t$ represents the length of time between the prediction and the event (how early it is) and $d$ represents the distance between the prediction and the actual value obtained, then we want to maximize $t$ and minimize $d$, and any score $s(t,d)$ with $\frac{\partial s}{\partial t}>0$ and $\frac{\partial s}{\partial d}<0$ satisfies the qualitative properties you want. For the record, there are a LOT of functions satisfying those properties.
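For instance, here's one minimal sketch of such a score. The functional form and the weight `lam` are arbitrary illustrative choices, not canonical: $\partial s/\partial t = 1/(1+t) > 0$ and $\partial s/\partial d = -\lambda < 0$, so it satisfies both qualitative properties.

```python
import math

def score(t: float, d: float, lam: float = 2.0) -> float:
    """Toy score: increases with lead time t (e.g. in years), decreases
    with normalized prediction error d. `lam` trades accuracy against
    earliness and is an arbitrary choice, not prescribed by the question."""
    return math.log1p(t) - lam * d
```

With this form, a prediction made a year out beats the same-accuracy prediction made a month out, and at equal lead times the smaller error wins.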

In practice, you probably want a score that carries more meaning. Scores like profit (or profit less opportunity cost) are impossible because of your requirement that some more profitable predictions receive lower scores (e.g. \$120 instead of \$110 for a \$100 security). Since your focus seems to be on evaluating the accuracy of the analyst, you might try comparing their predictions to the best possible predictions.

For each prediction the analysts made, you know the lead time and the distance from the true price (which should probably be normalized, e.g. as a percentage error, to allow comparison across securities). Plot each prediction as a point, take the convex hull of those points, and consider only the lower-right portion of that hull: the predictions with large lead times and small errors. See below on simple simulated data.

[Figure: Pareto front on simulated prediction data]

Each prediction is represented by one of those points, and the best predictions are those with large times and low distances represented by the black line. Your score for any given prediction is simply how far it is from the black line.
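A rough sketch of that scoring scheme (function names are mine, and I use the Euclidean distance to the nearest front point as a simple stand-in for distance to the black line):

```python
import numpy as np

def pareto_front(points):
    """Keep only Pareto-optimal predictions: a point (t, d) is dominated
    if some other prediction has lead time >= t AND error <= d, with at
    least one strict inequality."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, (t, d) in enumerate(pts):
        dominated = any(
            t2 >= t and d2 <= d and (t2 > t or d2 < d)
            for j, (t2, d2) in enumerate(pts) if j != i
        )
        if not dominated:
            keep.append(i)
    return pts[keep]

def distance_to_front(point, front):
    """Score = Euclidean distance to the nearest Pareto-optimal point
    (0 for predictions on the front; larger is worse)."""
    diffs = front - np.asarray(point, dtype=float)
    return float(np.min(np.linalg.norm(diffs, axis=1)))
```

If lead times and errors live on very different scales, you'd want to rescale both axes before measuring distance; otherwise one axis dominates the score.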

This doesn't quite agree with your first constraint, but it arguably better captures the information you care about. Since you're penalizing some predictions that would yield high profits, business impact is not the dominating factor; you explicitly mentioned that you're trying to rank analysts. If that ranking is only internal, comparing their performance to the best possible performance they could hope for is a logical, interpretable metric.

In most industries, boiling people down to a single aggregate metric causes problems. In software it causes people to write meaningless comments to have lots of lines of code, to split small problems into tiny ones to have more "commits" to make it seem like more work is being done, and so on. Combining multivariate problems into single variable problems causes information to be lost; some good performers will be rated as bad and vice versa. That isn't even counting the HR nightmare and morale drop that accompany draconian, numbers-driven policies. Knowing their performance is valuable, but be very careful how that information is used and communicated.

ANSWER

How about this approach: fit the parameters of a stochastic process, such as geometric Brownian motion (GBM), to the history of the stock price in question. GBM is used to model stock prices in the Black-Scholes model and is the most widely used model of stock price behavior. GBM has two parameters, $\mu$ and $\sigma$, representing the percentage drift and percentage volatility; both can easily be estimated (through maximum likelihood) from historical data.
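The MLE is simple because GBM log returns are i.i.d. normal with mean $(\mu - \sigma^2/2)\,\Delta t$ and variance $\sigma^2\,\Delta t$. A sketch (function name is mine):

```python
import numpy as np

def fit_gbm(prices, dt):
    """MLE of GBM parameters from a price series sampled every `dt` years.
    Under GBM, log returns are i.i.d. Normal((mu - sigma^2/2)*dt, sigma^2*dt)."""
    r = np.diff(np.log(prices))
    sigma2 = r.var() / dt            # MLE uses the biased (1/n) variance
    mu = r.mean() / dt + sigma2 / 2  # invert the log-return mean formula
    return float(mu), float(np.sqrt(sigma2))
```

Note that $\sigma$ is estimated quite precisely from daily data, while the drift estimate is notoriously noisy unless the sample spans many years.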

  1. Fit a GBM process to the stock price data from a year ago to today. Find the probability density of such a GBM process ending at \$120 today. (Notice the density will not be entirely concentrated at \$100 because this is a stochastic process: re-running it will yield different outcomes even if the parameters are unchanged. A stock price with higher volatility will have a correspondingly more spread-out price distribution today. Moreover, this spread increases the longer the GBM random walk is run, again reflecting the fact that it's easier to stay within an error margin over a shorter time frame.)

  2. Fit a GBM process to the stock price data from a month ago to today. Find the probability density of such a GBM process ending at $110 today.
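Both steps need the terminal-price density of a fitted GBM, which is lognormal: $\ln S_T \sim \mathcal{N}\!\left(\ln S_0 + (\mu - \sigma^2/2)T,\ \sigma^2 T\right)$. A sketch (function name is mine):

```python
import math

def gbm_density(s_T, s_0, mu, sigma, T):
    """Lognormal density of the GBM terminal price S_T given S_0:
    ln S_T ~ Normal(ln s_0 + (mu - sigma^2/2)*T, sigma^2 * T)."""
    m = math.log(s_0) + (mu - 0.5 * sigma**2) * T
    v = sigma**2 * T
    x = math.log(s_T)
    return math.exp(-(x - m) ** 2 / (2 * v)) / (s_T * math.sqrt(2 * math.pi * v))
```

Because the variance $\sigma^2 T$ grows with the horizon, the year-ahead density is flatter than the month-ahead one, which is exactly the spreading effect described in step 1.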

Compare analysts 1 and 2 on the basis of the probability densities obtained from the above fitted stochastic processes, weighting them according to the entropy of the GBM-induced present-price distribution (which is higher for the more spread-out distribution, so it's not unfairly penalized). These are the densities of them being "correct" if history were "re-run", assuming such a history can be approximately modelled by a GBM process.
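The entropy weight has a closed form, since the GBM terminal price is lognormal with log-parameters $m = \ln S_0 + (\mu - \sigma^2/2)T$ and $v = \sigma^2 T$: the differential entropy is $H = m + \tfrac{1}{2}\ln(2\pi e\, v)$. A sketch (function name is mine):

```python
import math

def lognormal_entropy(s_0, mu, sigma, T):
    """Differential entropy of the GBM terminal-price distribution,
    Lognormal(m, v) with m = ln s_0 + (mu - sigma^2/2)*T and v = sigma^2*T:
    H = m + 0.5 * ln(2*pi*e*v)."""
    m = math.log(s_0) + (mu - 0.5 * sigma**2) * T
    v = sigma**2 * T
    return m + 0.5 * math.log(2 * math.pi * math.e * v)
```

The entropy grows with the horizon $T$, so a year-ahead prediction's inevitably lower density gets compensated rather than punished.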

Here's an example of what samples from a GBM process with a given drift and volatility look like:

[Figure: sample paths from a GBM process]
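If you want to reproduce such a plot, GBM paths can be sampled exactly on a time grid via the log-Euler scheme (a sketch; names and parameters are illustrative):

```python
import numpy as np

def simulate_gbm(s_0, mu, sigma, T, n_steps, n_paths, seed=0):
    """Sample GBM paths on an evenly spaced grid. The log-increments are
    exact, so there is no discretization error at the grid points."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.cumsum(increments, axis=1)
    # Prepend the starting point so each path begins at s_0.
    return s_0 * np.exp(np.hstack([np.zeros((n_paths, 1)), log_paths]))
```

Plotting a few rows of the returned array against the time grid gives the kind of fan-of-paths picture shown here.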

EDIT: It occurred to me that it makes more sense to do the above, but backwards. Have each analyst decide what they think $\mu$ and $\sigma$ are. Their belief about $\mu$ can be obtained from their prediction of what the price will be today, i.e. just pick the $\mu$ which maximizes the probability density of their predicted price:

\begin{align*} \mu_1 &= \operatorname*{argmax}_{\mu_1 \in \mathbb{R}^+} \mathrm{P}(S_\text{today} = \text{price prediction 1} \mid S_\text{a year ago}, S \sim \text{GBM}(\mu_1, \sigma_1)) \\ \mu_2 &= \operatorname*{argmax}_{\mu_2 \in \mathbb{R}^+} \mathrm{P}(S_\text{today} = \text{price prediction 2} \mid S_\text{a month ago}, S \sim \text{GBM}(\mu_2, \sigma_2)) \end{align*}

Alternatively it could represent what they think is the expected price: \begin{align*} \mu_1 &: \mathrm{E}(S_\text{today} \mid S_\text{a year ago}, S \sim \text{GBM}(\mu_1, \sigma_1)) = \text{price prediction 1} \\ \mu_2 &: \mathrm{E}(S_\text{today} \mid S_\text{a month ago}, S \sim \text{GBM}(\mu_2, \sigma_2)) = \text{price prediction 2} \end{align*}

Then find the probability density of the price that actually obtains today under each analyst's beliefs, i.e.

\begin{align*} L_1 &= \mathrm{P}(S_\text{today} = \$100 \mid S_\text{a year ago}, S \sim \text{GBM}(\mu_1, \sigma_1)) \\ L_2 &= \mathrm{P}(S_\text{today} = \$100 \mid S_\text{a month ago}, S \sim \text{GBM}(\mu_2, \sigma_2)) \end{align*}
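Under the expected-price reading, this is easy to compute in closed form: for GBM, $\mathrm{E}[S_T] = S_0\, e^{\mu T}$, so each analyst's prediction pins down an implied drift, and the score is the lognormal density of the realized price under that drift. A sketch (function names are mine; $\sigma$ is assumed fitted from history, as in the original scheme):

```python
import math

def implied_mu(s_0, predicted, T):
    """Drift implied by reading the analyst's number as E[S_T]:
    E[S_T] = s_0 * exp(mu*T)  =>  mu = ln(predicted / s_0) / T."""
    return math.log(predicted / s_0) / T

def likelihood(actual, s_0, predicted, sigma, T):
    """Lognormal density of the realized price under the analyst's
    implied drift. This is the L_i score from the equations above."""
    mu = implied_mu(s_0, predicted, T)
    m = math.log(s_0) + (mu - 0.5 * sigma**2) * T
    v = sigma**2 * T
    x = math.log(actual)
    return math.exp(-(x - m) ** 2 / (2 * v)) / (actual * math.sqrt(2 * math.pi * v))
```

Note how this scheme automatically balances the question's two criteria: a prediction closer to the actual price scores a higher density, while the year-ahead analyst's wider predictive distribution ($v = \sigma^2 T$ larger) flattens the penalty for a given miss.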

$L_i$ is the likelihood of analyst $i$'s beliefs given the actual outcome, or equivalently, the probability of the actual outcome given each analyst $i$'s beliefs. Use $L_i$ as the score for analyst $i$.

Implicitly what you're asking each analyst to do is to give you the parameters for a distribution reflecting what they think today's stock price will be, and you're rewarding them with the likelihood of the parameters they gave you.