Is it possible to convert the output of an equation into an unordered set?


I am a total novice in algebra, so I need help with what I am trying to do. I have built the following equation:

and these three statements:

Here is the process I tried to describe:

B is a family set of length 21, containing rounded values of looking like

K is an index set (of B) looking like that:

is the mean of means (only if k = 1; otherwise see below) of Euclidean distances calculated on the n variables between n' records of and h (both randomly sampled from a dataset); if I made no mistakes, this is done by this part of the equation:

This part: aims to say that the operation done in the brackets [...] is bootstrapped times and the mean of the outputs is calculated, to finally obtain the mean of means of means (if k > 1).

Could you please tell me if this logic sounds right?

And finally, if I want to consider as a family set containing all results from the bootstrap, is it OK to write:

One of the questions I did not manage to answer is whether, e.g. for k = 3, the output is only the one computed with or whether it is (which is not what I have done in R...).

The other question is that the output of this computation will certainly not be ordered (the values may not be increasing); is that a problem?

Many thanks for your help, and


There is 1 best solution below


Following @gandalf61's recommendations, I propose the same process in separate steps:

I set these three statements:

B is a family set of length 21, containing rounded values of looking like . The values in this set are the numbers of iterations to run on the whole process (at the beginning, input records are randomly sampled, and at the end we compute the mean over the total number of iterations, so the mean for B = 1 up to B = 12089).

K is an index set (of B) looking like that:

The whole process aims at computing the mean distance between n' records of and n' records of h, which are randomly sampled at the beginning (from a big dataset); this operation is bootstrapped B times and the mean of the B outputs has to be calculated. n is the number of dimensions (quantitative variables), whose values are used to calculate the Euclidean distances.

So first, the Euclidean distance d is calculated between record and record h:

Then the mean Euclidean distance D is calculated between all n' records of and h:
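To make these two steps concrete, here is a minimal sketch in Python (not my R code). It assumes that record i of the first sample is paired with record i of h, which is only one possible reading of the step above; the function names `euclidean` and `mean_distance` and the toy records are made up for illustration.

```python
import math
from statistics import mean

def euclidean(x, h):
    """Euclidean distance d between two records described by the same n variables."""
    return math.sqrt(sum((xi - hi) ** 2 for xi, hi in zip(x, h)))

def mean_distance(xs, hs):
    """Mean distance D over n' records, pairing xs[i] with hs[i] (assumption)."""
    return mean(euclidean(x, h) for x, h in zip(xs, hs))

# Toy data: n' = 2 records, n = 2 variables (hypothetical values)
xs = [(0.0, 0.0), (1.0, 1.0)]
hs = [(3.0, 4.0), (1.0, 2.0)]

print(mean_distance(xs, hs))  # distances are 5.0 and 1.0, so this prints 3.0
```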

These steps have to be bootstrapped B times, and the output for each value of B has to be stored in a vector/set(?). So I think there is a mistake here in my equation. E.g. I need to store the result for B = 1 (k = 1), B = 7 (k = 5), etc., to assess the effect of increasing the number of iterations on the linearity of the results (but I don't need to store the results of B = 1 + B = 2 + ... + B = 12089).
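A rough sketch of the storage I have in mind, again in Python and under assumptions: the values in `b_values` are hypothetical stand-ins for the 21 values of B, the data are random points, sampling is done without replacement, and the helper names are invented. One result is kept per value of B, keyed by that value, so nothing is summed across different values of B.

```python
import math
import random
from statistics import mean

def mean_distance(xs, hs):
    """Mean Euclidean distance, pairing xs[i] with hs[i] (assumption)."""
    return mean(math.dist(x, h) for x, h in zip(xs, hs))

def bootstrap_mean(data_x, data_h, n_prime, b, rng):
    """Resample n' records from each dataset b times; return the mean of the b outputs."""
    outputs = []
    for _ in range(b):
        xs = rng.sample(data_x, n_prime)  # sampling without replacement (assumption)
        hs = rng.sample(data_h, n_prime)
        outputs.append(mean_distance(xs, hs))
    return mean(outputs)

rng = random.Random(42)  # fixed seed so the run is repeatable
data_x = [(rng.random(), rng.random()) for _ in range(200)]  # hypothetical dataset
data_h = [(rng.random(), rng.random()) for _ in range(200)]  # hypothetical dataset

b_values = [1, 2, 5, 10, 50]  # hypothetical; stands in for the 21 values of B

# One stored result per value of B; the values of M need not be increasing.
M = {b: bootstrap_mean(data_x, data_h, 10, b, rng) for b in b_values}

for b, m in M.items():
    print(b, round(m, 4))
```

Because each result is looked up by its value of B, it does not matter whether the stored means go up or down as B grows.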

Here is what I propose to explain this part of the process. Could you tell me if I am right? I wonder if I can build the results M like that, because the outputs of the bootstraps B may not be ordered, since they are means (of means of means); e.g. M(B = 7) > M(B = 12089) is entirely possible.
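One possible way to write this, assuming the results are treated as a family indexed by K rather than as a sorted list of values (the symbols $M_k$, $B_k$ and $D^{(b)}$ are notation introduced here for illustration, with $D^{(b)}$ the mean distance obtained in the b-th repetition):

$$
M = (M_k)_{k \in K}, \qquad M_k = \frac{1}{B_k} \sum_{b=1}^{B_k} D^{(b)}
$$

An indexed family only records which value belongs to which index k. It puts no condition on the order of the values themselves, so having $M_k > M_{k'}$ for some $k < k'$ would not contradict this notation.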

Many thanks for your help !