I am analyzing different mutual fund schemes. Each scheme holds many funds in its portfolio, and I want to analyse the overlap (of funds) between these schemes. I can find the overlap of scheme1 with scheme2 as $|A \cap B| / |A|$, where $A$ is the set of funds in scheme1 and $B$ is the set of funds in scheme2. Is there a way to find the average overlap between two schemes?
Suppose this is my data (pandas DataFrames). scheme1 looks like this:
funds Qty Value Asset%
fund_0 q_0 v_0 p_0
fund_1 q_1 v_1 p_1
fund_2 q_2 v_2 p_2
fund_3 q_3 v_3 p_3
fund_4 q_4 v_4 p_4
fund_5 q_5 v_5 p_5
fund_6 q_6 v_6 p_6
scheme2 looks like this:
funds Qty Value Asset%
fund_0 q_0_2 v_0_2 p_0_2
fund_2 q_2_2 v_1_2 p_1_2
fund_5 q_5_2 v_6_2 p_6_2
fund_6 q_2 v_2 p_2
fund_7 q_3 v_3 p_3
fund_8 q_4 v_4 p_4
fund_9 q_5 v_5 p_5
fund_10 q_01 v_98 p_59
In this case, the funds common to both schemes are fund_0, fund_2, fund_5 and fund_6, so I can find the overlap of each scheme with respect to the other. Now I have some questions:
- Is there anything like an average overlap that would make sense in the real world?
- Right now I am calculating the overlap from the number of common funds in each scheme. So the % overlap for scheme 1 above would be 100 * 4 / 7, 7 being the total no. of funds in scheme 1. Is there a more meaningful metric for this?
- What would the overlap of portfolios or schemes mean when comparing 3 or more schemes?
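For concreteness, here is a minimal sketch of the count-based calculation described above. It assumes each scheme is a pandas DataFrame with a `funds` column; the two DataFrames are hypothetical stand-ins that carry only the fund names from the tables, and the helper name `count_overlap_pct` is made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the two schemes; only the fund names are used,
# the Qty / Value / Asset% columns are left out.
scheme1 = pd.DataFrame({"funds": [f"fund_{i}" for i in range(7)]})
scheme2 = pd.DataFrame({"funds": ["fund_0", "fund_2", "fund_5", "fund_6",
                                  "fund_7", "fund_8", "fund_9", "fund_10"]})


def count_overlap_pct(a: pd.DataFrame, b: pd.DataFrame) -> float:
    """Percentage of the funds in `a` that also appear in `b` (count-based)."""
    funds_a = set(a["funds"])
    funds_b = set(b["funds"])
    return 100 * len(funds_a & funds_b) / len(funds_a)


# Funds held by both schemes.
common = sorted(set(scheme1["funds"]) & set(scheme2["funds"]))

overlap_1_in_2 = count_overlap_pct(scheme1, scheme2)  # 100 * 4 / 7
overlap_2_in_1 = count_overlap_pct(scheme2, scheme1)  # 100 * 4 / 8
```

The measure is directional: the same four common funds give 100 * 4 / 7 from scheme 1's side and 100 * 4 / 8 from scheme 2's side, and it ignores how much of each scheme's assets sits in those funds.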