I hope this is the right place to ask this question; if not, apologies.
I am trying to calculate the prevailing wind from a data set that includes a direction value and a wind speed value that is given as a range of values. I have no data about how the wind strength changed.
I also want to bound-check my calculations: below a given value the wind strength is negligible, and above a given value the wind is too dangerous.
I define the prevailing wind value as the most common direction value and the prevailing wind strength as the most frequent value for wind strength in a given direction.
For example, take a data set that includes wind directions between 0 and 45 degrees, with wind strengths grouped into ranges measured in metres per second. Each cell is the number of observations for that direction and speed range:
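| Direction (degrees) | 1-2 m/s | 3-4 m/s | 5-6 m/s | 8-10 m/s | 12-15 m/s |
|---|---|---|---|---|---|
| 0 | 1 | 2 | 0 | 0 | 0 |
| 15 | 0 | 3 | 5 | 0 | 0 |
| 30 | 1 | 2 | 2 | 0 | 0 |
| 45 | 0 | 1 | 0 | 1 | 0 |
| 60 | 0 | 1 | 0 | 1 | 0 |
| 75 | 0 | 1 | 0 | 0 | 1 |
| 90 | 0 | 3 | 0 | 0 | 0 |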
Assuming boundary values that exclude all wind speeds below 2 m/s and above 9 m/s, the most frequent wind direction is 15 degrees, and its most common speed range is 5-6 m/s, i.e. a value of 5.5 m/s.
Is there a better statistical or probabilistic way of calculating this value? Simply searching for a pattern feels wrong.
It is theoretically possible to have a data set where the wind blows equally often in opposite directions, e.g. 50% of the time in direction 0 and 50% of the time in direction 180, with the net effect that the prevailing wind should have a strength of zero.
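To make the cancellation concrete, here is a minimal sketch, assuming each observation is treated as a vector of length equal to its speed and that all observations are weighted equally. The function name `resultant` and the mathematical angle convention (counter-clockwise from the x-axis) are my own assumptions; a compass convention would change the reported direction but not the resultant strength.

```python
import math

def resultant(observations):
    """Return (speed, direction_deg) of the mean wind vector.

    observations: list of (direction_deg, speed) pairs, equally weighted.
    """
    n = len(observations)
    # Average the x and y components of every observation.
    x = sum(s * math.cos(math.radians(d)) for d, s in observations) / n
    y = sum(s * math.sin(math.radians(d)) for d, s in observations) / n
    speed = math.hypot(x, y)
    # The direction is not meaningful when the speed is (numerically) zero.
    direction = math.degrees(math.atan2(y, x)) % 360
    return speed, direction

# Half the observations at 0 degrees, half at 180 degrees, same speed.
speed, _ = resultant([(0, 5.0), (180, 5.0)])
print(speed)  # zero up to floating-point rounding
```

With this definition the two opposing winds cancel, whereas the "most common direction" definition would report a tie between 0 and 180.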
Your approach seems reasonable to me, provided the definition of prevailing wind direction is not up for modification. The threshold criterion $2 \leq \text{wind speed} \leq 9$ would be a data processing step, so you can simply set the counts for speeds outside that range to zero. For determining the prevailing direction, summing across the columns in each row and reporting the direction(s) with the maximum total will deal with the tie problem you mention.
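The steps above could be sketched as follows. This is only an illustration under stated assumptions: the function name `prevailing` is made up, a speed bin is kept only if it lies entirely inside the allowed range (so the 1-2 and 8-10 bins are dropped for limits of 2 and 9), and the bin midpoint stands in for the speed value, as in your 5.5 m/s example.

```python
# Speed bins (m/s) and observation counts per direction (degrees),
# copied from the table in the question.
bins = [(1, 2), (3, 4), (5, 6), (8, 10), (12, 15)]
counts = {
    0:  [1, 2, 0, 0, 0],
    15: [0, 3, 5, 0, 0],
    30: [1, 2, 2, 0, 0],
    45: [0, 1, 0, 1, 0],
    60: [0, 1, 0, 1, 0],
    75: [0, 1, 0, 0, 1],
    90: [0, 3, 0, 0, 0],
}

def prevailing(counts, bins, lo=2.0, hi=9.0):
    """Map each prevailing direction to its most frequent speed(s)."""
    # Assumption: keep a bin only if it lies entirely within [lo, hi].
    keep = [lo <= a and b <= hi for a, b in bins]
    # Data processing step: zero out counts in excluded bins.
    filtered = {d: [c if k else 0 for c, k in zip(row, keep)]
                for d, row in counts.items()}
    # Sum across the columns in each row.
    totals = {d: sum(row) for d, row in filtered.items()}
    best = max(totals.values())
    # Report every direction that reaches the maximum, so ties are kept.
    result = {}
    for d, total in totals.items():
        if total == best:
            row = filtered[d]
            top = max(row)
            # Bin midpoints of the most frequent speed bin(s).
            result[d] = [(a + b) / 2 for (a, b), c in zip(bins, row) if c == top]
    return result

print(prevailing(counts, bins))  # {15: [5.5]}
```

Returning a dictionary rather than a single value means that tied directions, or tied speed bins within a direction, are all reported instead of one being picked arbitrarily.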