Math Notation understanding help?

I'm trying to understand the following notation and I'm simply too new to more advanced mathematics, especially notation.

The purpose of this is for a reinforcement learning modification to utilize intrinsic rewards based on curiosity.

I know the left hand side of equals is the intrinsic reward for a specific time step. I believe η is a chosen value for scaling purposes. $\hat{\phi}(s_{t+1})$ is the result of a prediction of a future state while $\phi(s_{t+1})$ is the true value of the future state.

I am unsure what the bars || mean, and I'm also unsure about the superscript and subscript 2 on the end in this context. Even once I know what they refer to, I'm unsure of the order of operations intended with notation like this.

Any help at all would be greatly appreciated, but my hope is to get a full explanation of each value's meaning and a more simplified algebraic representation (ideally spelled out as plainly as possible).

This is taken from the following paper: https://pathak22.github.io/noreward-rl/resources/icml17.pdf

[Image: Intrinsic reward in RL equation]

There is 1 best solution below

On BEST ANSWER

The bars denote taking a norm. You have seen some norms before, I'm sure, such as absolute value, "length", or "max". They satisfy certain nice properties such as the triangle inequality. Read more on the linked page.

The subscript here tells us which norm we are talking about, specifically the Euclidean norm. This is the norm that most people are used to when talking about distance. Given a vector $v = [v_1,v_2,\dots,v_n]$, its Euclidean norm is $\|v\|_2 = \sqrt{\sum\limits_{i=1}^n v_i^2}$.
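To make the definition concrete, here is a minimal sketch in plain Python. The function name `euclidean_norm` and the example vector are just illustrative choices of mine, not anything from the paper.

```python
import math

def euclidean_norm(v):
    """Square each component, add the squares up, then take the square root."""
    return math.sqrt(sum(x * x for x in v))

# Illustrative vector: 3^2 + 4^2 = 25, and sqrt(25) = 5
print(euclidean_norm([3.0, 4.0]))  # 5.0
```

This is the familiar "length" of the vector, the same number you would get from the Pythagorean theorem in two dimensions.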

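The superscript here is a usual power. Squaring the norm effectively takes away the square root, so $\|v\|_2^2 = \sum\limits_{i=1}^n v_i^2$, which is just the sum of the squared components.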
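As for the order of operations, here is a rough sketch in plain Python. I am assuming the equation has the form $\frac{\eta}{2}\|\hat{\phi}(s_{t+1}) - \phi(s_{t+1})\|_2^2$, which is my reading of the paper, so check the factor of $\frac{1}{2}$ against the original. The function name, the value of `eta`, and the example vectors are all made up for illustration.

```python
def intrinsic_reward(eta, predicted, actual):
    """Sketch of eta/2 times the squared Euclidean distance between two vectors.

    Assumes `predicted` and `actual` are equal-length lists of floats standing
    in for the predicted and true feature vectors of the next state.
    """
    # 1. Subtract component by component (the expression inside the bars).
    diffs = [p - a for p, a in zip(predicted, actual)]
    # 2. Square each difference and add them up. The square root from the
    #    norm and the power of 2 cancel, so no square root is needed.
    squared_norm = sum(d * d for d in diffs)
    # 3. Scale the result (the 1/2 is an assumption about the equation's form).
    return (eta / 2) * squared_norm

predicted = [1.0, 2.0, 3.0]  # hypothetical predicted features
actual = [1.0, 0.0, 4.0]     # hypothetical true features
# Differences are 0, 2, -1; squares sum to 5; (0.5 / 2) * 5 = 1.25
print(intrinsic_reward(0.5, predicted, actual))  # 1.25
```

So in plain terms: the worse the prediction, the larger the differences, and the larger the reward.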

Depending on context, $\|\cdot \|_2$ might instead refer to the $\ell_2$ norm or the $L_2$ norm, both of which act similarly to the Euclidean norm, just for different spaces. See Lp Spaces.