Markowitz Model Invests in Everything

102 Views Asked by At

I'm trying to create portfolios using real life stock market data from the past, to test real life performance of Markowitz portfolio optimization model. Solving through CVX, however, gives me the most weird portfolios where the it invests in everything and the resulting efficient frontier is not a curve, but a zigzag line.

edit: Here's the input data:

0.770385819 1.041085387 0.459646675 1.407991775 1.032964003 1.015192832 1.761284058 0.432529846 1.196856508 1.130865217 0.948489011 1.130973958 1.021932454 1.119256276 1.090071648 1.070024263 0.898895476 0.94323697  1.07892532  0.969909847 1.427561837 0.952085375 0.911283811 1.218244825 0.732480413 1.25540862  1.367107712 0.931140812 1.185658681 0.868207764 1.014347682 1.478019669 1.86267166  1.007992203 1.132348209 0.951148514 1.008848925 0.835203173 1.568876745 0.814207457 1.008956842 0.924991843 1.091709856 0.459646675 1.251518105 1.241371696 0.902728017 0.851700526 1.23548996  1.340598122;
    1.563231801 1.160694749 1.295086356 1.102851533 1.658395418 1.441289332 1.131956592 0.441946506 1.429396335 1.168607239 1.112126107 1.079003599 1.097335615 1.34966378  1.186228482 1.619232426 1.043895898 1.258081655 1.442456662 1.071825205 0.977722772 1.149757902 1.160316757 1.245202657 1.483468401 1.207680015 1.428717458 1.468165592 1.115104134 1.504980529 1.179700146 1.137373204 2.054959786 1.325712628 0.885426498 1.300327201 1.358178726 1.392186705 2.588573659 2.686215908 1.016398298 1.331078306 1.187699367 1.295086356 1.234122963 1.300353826 1.418989068 1.504121584 1.361842357 2.410075523;
    2.833927266 1.074222841 1.149776806 2.5667894   1.188622645 1.27342918  1.120995734 2.557863666 1.56851897  1.299325173 1.20084143  1.09924957  1.212777332 1.190550724 1.715039578 1.689773913 1.227945996 0.810509236 1.202735818 1.102883208 1.491139241 1.081009384 0.808332936 1.069644246 1.266890108 1.239471924 1.172905087 1.102346128 1.137121269 1.139571095 1.179716604 1.191004623 1.213307241 1.209411945 1.21512834  1.065562602 1.060866403 0.983125018 0.464863672 1.66855076  1.111080906 1.196453819 1.161552216 1.149776806 1.27028591  1.618204209 1.252353924 1.262311919 1.390691261 0.737679682;
    1.503269518 1.64103917  1.32772227  1.01086575  2.476334985 1.435918879 0.826643773 0.767703937 1.206691156 0.98326345  0.965173507 1.600943419 1.14800529  1.137910023 1.035384615 1.052234414 0.916417592 0.910925845 0.963225503 0.888894562 1.405772496 1.155   0.875774931 1.455537213 1.270564643 1.245505029 1.205900058 1.162396583 0.879163835 1.954987817 1.080998944 0.875943724 0.849193548 2.86717459  0.821654933 1.14574944  1.167947175 1.208396908 1.480125546 1.344855794 1.198409593 1.442598391 1.09413377  1.32772227  1.286854078 1.282799117 1.055728237 1.046595974 3.609028312 0.972209445;
    2.027605491 0.975980961 1.523751475 0.60734791  0.954704656 0.948747816 1.152948392 0.555557551 1.327389277 2.006589554 1.130072166 0.988036542 0.974592813 1.085481069 1.51589896  0.770266552 1.075550379 1.17000843  2.37991297  1.374691252 0.963768116 1.033065094 1.207308029 1.03158237  1.27473305  1.323322811 0.995666963 1.701634761 1.197580575 1.796410512 1.34454464  1.414615372 0.908515353 1.074479273 1.892912954 1.215585858 1.260530805 1.946122565 1.054051499 0.464658358 1.258692039 1.415613126 1.363021507 1.523751475 1.224722064 1.193179055 1.770201997 1.155290681 0.999299308 2.09035457;

This is the CVX model:

RR = [1.15:0.01:1.21];

for i=1:length(RR)
    R = RR(i);
    cvx_begin quiet;
    variable x(length(mu))
    minimize (x'*S*x)
    subject to
     mu'*x >= R;
     e'*x == 1.0000;
     x>= 0;
      x<=0.33
    cvx_end

    xx = [xx x];
    rr = [rr mu'*x];
    stdv = [stdv sqrt(x'*S*x)];

end

And here's the output:

   Calling SDPT3 4.0: 57 variables, 7 equality constraints
------------------------------------------------------------

 num. of constraints =  7
 dim. of socp   var  =  6,   num. of socp blk  =  1
 dim. of linear var  = 51
*******************************************************************
   SDPT3: Infeasible path-following algorithms
*******************************************************************
 version  predcorr  gam  expon  scale_data
    NT      1      0.000   1        0    
it pstep dstep pinfeas dinfeas  gap      prim-obj      dual-obj    cputime
-------------------------------------------------------------------
 0|0.000|0.000|3.9e+02|1.7e+01|7.3e+03| 1.228178e+01  0.000000e+00| 0:0:00| chol  1  1 
 1|0.947|1.000|2.1e+01|1.0e-01|4.2e+02| 4.638828e+00 -1.233840e+01| 0:0:00| chol  1  1 
 2|0.977|1.000|4.6e-01|1.0e-02|1.9e+01| 7.587821e-02 -9.896742e+00| 0:0:00| chol  1  1 
 3|1.000|1.000|1.8e-07|1.0e-03|2.5e+00| 2.008947e-02 -2.496417e+00| 0:0:00| chol  1  1 
 4|0.961|0.948|1.0e-07|1.5e-04|1.3e-01| 1.434256e-02 -1.174981e-01| 0:0:00| chol  1  1 
 5|0.972|0.422|1.7e-08|8.9e-05|1.1e-01| 5.465703e-03 -1.043929e-01| 0:0:00| chol  1  1 
 6|1.000|0.443|1.0e-08|5.0e-05|8.7e-02| 6.486070e-03 -8.040863e-02| 0:0:00| chol  1  1 
 7|0.684|0.739|3.9e-09|1.3e-05|4.3e-02| 4.102134e-03 -3.888568e-02| 0:0:00| chol  1  1 
 8|0.949|1.000|2.6e-10|1.1e-08|1.0e-02| 3.037969e-04 -1.007127e-02| 0:0:00| chol  1  1 
 9|1.000|1.000|8.5e-11|1.1e-09|2.5e-03| 9.298042e-05 -2.408280e-03| 0:0:00| chol  1  1 
10|0.979|0.985|1.8e-12|1.3e-10|1.6e-04| 2.714452e-06 -1.545940e-04| 0:0:00| chol  1  1 
11|0.985|0.985|2.5e-14|1.3e-11|2.4e-06| 4.039800e-08 -2.340381e-06| 0:0:00| chol  1  1 
12|0.995|0.995|5.9e-15|1.1e-12|3.9e-08| 4.608065e-10 -3.805504e-08| 0:0:00| chol  1  1 
13|1.000|0.997|4.6e-15|1.0e-12|7.4e-10| 7.917000e-12 -7.267619e-10| 0:0:00|
  stop: max(relative gap, infeasibilities) < 1.49e-08
-------------------------------------------------------------------
 number of iterations   = 13
 primal objective value =  7.91700039e-12
 dual   objective value = -7.26761875e-10
 gap := trace(XZ)       = 7.36e-10
 relative gap           = 7.36e-10
 actual relative gap    = 7.35e-10
 rel. primal infeas (scaled problem)   = 4.59e-15
 rel. dual     "        "       "      = 1.00e-12
 rel. primal infeas (unscaled problem) = 0.00e+00
 rel. dual     "        "       "      = 0.00e+00
 norm(X), norm(y), norm(Z) = 7.4e-01, 6.1e-09, 3.2e+00
 norm(A), norm(b), norm(C) = 1.3e+01, 2.9e+00, 4.2e+00
 Total CPU time (secs)  = 0.37  
 CPU time per iteration = 0.03  
 termination code       =  0
 DIMACS: 5.9e-15  0.0e+00  1.3e-12  0.0e+00  7.3e-10  7.4e-10
-------------------------------------------------------------------

------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): +7.91702e-12

     3

    0.0086
    0.0143
    0.0073
    0.0186
    0.0125
    0.0115
    0.1894
    0.0055
    0.0197
    0.0269
    0.0117
    0.0179
    0.0130
    0.0136
    0.0198
    0.0095
    0.0109
    0.0098
    0.0208
    0.0130
    0.0534
    0.0115
    0.0100
    0.0182
    0.0087
    0.0259
    0.0193
    0.0123
    0.0169
    0.0148
    0.0143
    0.0485
    0.0204
    0.0178
    0.0270
    0.0114
    0.0122
    0.0119
    0.0077
    0.0047
    0.0155
    0.0133
    0.0165
    0.0073
    0.0237
    0.0237
    0.0124
    0.0089
    0.0472
    0.0101

Is there a reason for this? I checked almost everything to find a connection, but I failed. Am I missing something here?

1

There are 1 best solutions below

1
On BEST ANSWER

As per my comments, having more stocks than observed returns per stocks results in a singular covariance matrix if this matrix is estimated as follows:

Let $r_t$ be a vector of $N$ returns at time $t$. If we estimate $\mathbf{E}[r_tr_t']$ by $S:=\frac{1}{T}\sum_{t=1}^T r_tr_t'.$ Then any column $i$ of $S$ is $\frac{1}{T}\sum_{t=1}^T r_tr_{i,t}$ with $r_{i,t}$ being the $i$th element of $r_t$. This means that column $i$ is a linear combination of $T$ vectors. Moreover, all columns are linear combinations of the same $T$ vectors. Hence, the rank of $S$ cannot exceed $min(T,N)$. And so, if $T<N$ this results in a singular covariance matrix.

This potentially transforms a "minimization of a quadratic form"-problem into a (partly) linear optimization problem, probably resulting in a (piecewise) linear Markovitz frontier instead of the usual curved once and which may have multiple solutions.

Sidenote, $S$ is the covariance matrix for demeaned return vectors. However, the rank argument holds true for non demeaned returns.