How does α affect the solution path in the elastic net?


The following quiz is a rough translation (with minor modifications) of Quiz No. 10-2 of the 2018 "semi-first grade" exam of the Japan Statistical Society Certificate (see Ref (1)). According to the official answer, the correct answer is (A) (see Fig (A) below). However, I cannot see how to arrive at it.

My question: How can one arrive at the correct answer to the quiz below? How does α affect the solution path in the elastic net? Which α corresponds to each of Figures (B), (C), and (D)? (According to the official answer, Fig (A) corresponds to α = 0.5, but nothing is said about which α the other figures correspond to.)


I noticed the following features, which might be the key to telling the values of α apart. However, I still cannot see how to arrive at the correct answer.

  • As λ gets larger, the parameter estimates get smaller (Ref (2)).
  • Only Figure (C) has a very different shape. It is roughly symmetric about the line y = 0 (hereinafter, the x-axis); the others are not. In Figs (A), (B), and (D), the "bundle of curves" (see the green box and the orange box in Fig (A)') is asymmetric about the x-axis: the negative part (orange box in Fig (A)') converges faster than the positive part (green box).
  • At first glance, Figure (C) seems to converge faster than the others, but in fact it converges the slowest. (Note that in Figures (B)-(D) the x-axis starts at 2, whereas in (A) it starts at 4.) The point where the "bundle of curves" converges to 0 is around x=4 for (B), around x=4.5 for (D), and around x=5 for (A).

Fig (A)': an annotated version of Fig (A) shown below.

Quiz:
In Japan, if you make a donation to a local government through the program called "Furusato Tax Payment" (hometown tax donation program), you can receive a return gift. A total of 1,741 local governments accept donations under this program. Return gifts are classified into 166 categories, and the number of categories offered varies by local government. For example, some local governments offer three of the 166 categories, while others offer only one.

The following model expresses the amount of donations collected by each local government under this program; it is referred to below as (Formula 1).
[Image: Formula 1]

Here,

  • ${y}_{i}$ is the response variable: the amount of donations collected by the $i$th local government.
  • ${x}_{1,i}$ is an explanatory variable: the population of the $i$th local government.
  • ${x}_{2,i}$ is an explanatory variable: the number of return gifts offered by the $i$th local government.
  • ${x}_{k,i}$ ($k=3,4,\dots,166+3-1$) are dummy explanatory variables: if the $i$th local government offers the $l$th ($l=1,2,\dots,166$) return gift, then ${x}_{l+3-1,i}=1$; otherwise ${x}_{l+3-1,i}=0$.
  • ${u}_{i}$ is the error term.

All explanatory variables are standardized to mean 0 and variance 1.

Regression coefficients were estimated by elastic net regression, which minimizes the following expression, referred to below as (Formula 2).

[Image: Formula 2]

Figs (A) to (D) below are solution paths in which the estimated regression coefficients are plotted against log(λ). The numbers along the top of each graph give the number of explanatory variables with nonzero regression coefficients. Each of Figs (A) to (D) corresponds to one of α = 0, 0.5, 0.7, and 1.


Select the best answer from Figures (A) to (D) as the solution path when α = 0.5.
Figs. (A)-(D).

References:
(Ref. 1) Quiz No. 10-2 of the 2018 "semi-first grade" exam of the Japan Statistical Society Certificate (in Japanese). The linked file is an excerpt of only the part related to this quiz. Link
(Ref. 2) StatQuest with Josh Starmer, "Regularization Part 3: Elastic Net Regression".

P.S. I'm not very good at English, so I apologize if anything I wrote is impolite or unclear.

Best answer

Elastic net is a linear model with both $L^1$ and $L^2$ regularization. In essence the loss is of the form

$$ \|Y-X\beta\|_2^2 + \lambda \Big( (1-\alpha)\|\beta\|_2^2 + \alpha\|\beta\|_1\Big) $$

where $\alpha$ controls the relative weighting of the $L^1$ and $L^2$ terms. For $\alpha=0$ there is only $L^2$ regularization, and for $\alpha=1$ there is only $L^1$ regularization. The important fact to know is that

$L^1$ regularization sparsifies the solution
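
To see why, here is a sketch under a simplifying assumption that is not part of the quiz: suppose the columns of $X$ are orthonormal ($X^\top X = I$) and write $z = X^\top Y$ for the least-squares estimate. The loss above then splits into one term per coefficient, $\beta_j^2 - 2 z_j \beta_j + \lambda(1-\alpha)\beta_j^2 + \lambda\alpha|\beta_j|$ (plus a constant), and minimizing each term separately gives

$$ \hat\beta_j = \frac{\operatorname{sign}(z_j)\,\max\big(|z_j| - \tfrac{\lambda\alpha}{2},\, 0\big)}{1 + \lambda(1-\alpha)} $$

With $\alpha=0$ the threshold $\lambda\alpha/2$ vanishes, so every coefficient is only rescaled by $1/(1+\lambda)$ and never hits zero unless $z_j=0$. With $\alpha>0$, any coefficient with $|z_j|\le\lambda\alpha/2$ is set exactly to zero, and at a fixed $\lambda$ that threshold grows with $\alpha$. For a general (non-orthonormal) design this closed form does not hold, but the same soft-thresholding step appears inside coordinate descent.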

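That is, some coefficients will be exactly equal to zero, and at a given $\lambda$ a larger $\alpha$ (more weight on the $L^1$ term) generally zeroes out more of them. Our choices are $\alpha\in\{0, 0.5, 0.7, 1\}$. Hence we can conclude: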

  • (A) must be $\alpha=0.5$ because it has the 3rd strongest sparsification (127 non-zero parameters at $\log\lambda=2$)
  • (B) must be $\alpha=1$ because it has the strongest sparsification (98 non-zero parameters at $\log\lambda=2$)
  • (C) must be $\alpha=0$ because there is no sparsification
  • (D) must be $\alpha=0.7$ because it has the 2nd strongest sparsification (109 non-zero parameters at $\log\lambda=2$)
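
If you want to check this ordering numerically, here is a minimal sketch in plain Python. It runs coordinate descent on the loss written above and counts the non-zero coefficients at one fixed $\lambda$ for each candidate $\alpha$. Everything in it is an assumption for illustration: the synthetic data, the sizes `n` and `p`, the value of `lam`, and the helper names `soft_threshold` and `elastic_net` are hypothetical and have nothing to do with the actual donation data or the software used for the quiz figures.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for
    ||y - X b||_2^2 + lam * ((1 - alpha) * ||b||_2^2 + alpha * ||b||_1)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of column j with the residual that excludes beta_j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha / 2) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

# Hypothetical synthetic data (NOT the quiz's donation data).
random.seed(0)
n, p = 60, 20
beta_true = [3.0 * (-0.7) ** j for j in range(p)]  # decaying sizes, alternating signs
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [sum(X[i][j] * beta_true[j] for j in range(p)) + random.gauss(0.0, 1.0)
     for i in range(n)]

lam = 40.0  # one fixed, arbitrarily chosen penalty strength
counts = {}
for alpha in (0.0, 0.5, 0.7, 1.0):
    beta = elastic_net(X, y, lam, alpha)
    counts[alpha] = sum(b != 0.0 for b in beta)
    print(f"alpha={alpha}: {counts[alpha]} non-zero coefficients")
```

With $\alpha=0$ all `p` coefficients should stay non-zero, and one would expect the count to shrink as $\alpha$ moves toward 1, which is the same pattern used above to match the figures. The exact counts depend on the random data and on `lam`, so treat them as illustrative only.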