Conditions for applying Gradient Descent to 'learning' a function in MCTS


My problem setup is the following: states in MCTS are described by a sequence of features that take values from the set {0, 1}. For every state I get a certain real value. I want to 'learn' a function that would give me an estimate of this value for any given state.

I wanted to use Gradient Descent for that purpose, but I came to realize that it is not the right way to do it, because I'm actually working with a discrete function: $$f(s) \in \mathbb{R}, \quad s \in \{0,1\}^n$$

Since the function's domain is discrete rather than continuous, I cannot take the derivative of $f$.

So my question would be: am I right? If not, why not? If yes, could you point me to some references (not wiki links) that back up what I stated here? I need to present this to my mentor.

Thank you!


The general approach used in these circumstances is, rather than trying to estimate the response function directly, to estimate a continuous function that represents the probability of the response function taking the value 1. A good choice is a function that is monotonically increasing with range between 0 and 1; the most common choice is the logistic function, $f(x) = \frac{1}{1+e^{-x}}$.
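As a minimal sketch of this approach: the snippet below fits a logistic model to binary feature vectors by gradient descent on the log-loss. The function names, learning rate, and toy data are all illustrative assumptions, not part of the question; the point is that the gradient is taken with respect to the continuous weights, so the discreteness of the inputs is not an obstacle.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by gradient descent on the log-loss.

    X has shape (n_samples, n_features) with entries in {0, 1};
    y has shape (n_samples,) with entries in {0, 1}.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)          # predicted probabilities
        grad_w = X.T @ (p - y) / n_samples
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data (hypothetical): the label is 1 iff the first feature is set.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 0.0, 1.0, 1.0])

w, b = train_logistic(X, y)
probs = sigmoid(X @ w + b)
```

Note that the loss is differentiable in $(w, b)$ even though each $s \in \{0,1\}^n$ is discrete, which is the crux of the answer to the original question.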

To then get results for your response function, take the model's output: if it is above some threshold value, return 1; below it, return 0. The simplest threshold is 0.5, but you could also tune it on your training data to get a suitable balance of Type I and Type II errors.
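The thresholding step might look like the following sketch; the probability values and the alternative threshold of 0.6 are made-up illustrations of how raising the threshold trades false positives for false negatives.

```python
import numpy as np

def classify(probs, threshold=0.5):
    """Map predicted probabilities to 0/1 responses via a tunable threshold."""
    return (np.asarray(probs) >= threshold).astype(int)

# Hypothetical predicted probabilities from the logistic model.
probs = np.array([0.1, 0.45, 0.55, 0.9])

print(classify(probs))                  # default threshold 0.5 -> [0 0 1 1]
print(classify(probs, threshold=0.6))   # stricter threshold   -> [0 0 0 1]
```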