The best way to graph a lot of data


I have a lot of feature vectors of the form v1 = [x0, x1, x2, x3, x4], where x0, x1, and x2 are binary (either 0 or 1), and x3 and x4 can take values from 0 up to 9.

I have a lot of vectors, enough to cover the entire space and more. Each vector corresponds to a class, so for example:

v1 = [0 , 1 , 1 , 5 , 9] corresponds to class A

The same feature vector can correspond to different classes, so another instance of v1 in the example above can correspond to class B. I want a graph (or graphs) that represents the probability that a given feature vector belongs to a given class. What is the best way to do this? I thought about converting the feature vector into a single number and plotting that number on the x axis, but it would be hard to read information off such a graph.


There are 1 best solutions below

On BEST ANSWER

I'm not sure this is really a math question, but here's what I think. It's quite a complex situation, because you have a (nearly) $5$-dimensional vector and you also want to show, for each vector, the probability that it belongs to a certain class. I'll assume that you already have these probabilities and that you have graphing capabilities.
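If the probabilities are not available yet, one plausible way to get them is to count how often each class appears for each distinct vector. The sketch below assumes the data comes as `(vector, label)` pairs; that input format and the function name `estimate_class_probabilities` are made up for illustration.

```python
from collections import Counter, defaultdict


def estimate_class_probabilities(samples):
    """Map each feature vector to {class: relative frequency}.

    `samples` is an iterable of (vector, label) pairs. This input format
    is an assumption made for the sketch, not something from the question.
    """
    counts = defaultdict(Counter)
    for vector, label in samples:
        counts[tuple(vector)][label] += 1

    probabilities = {}
    for vector, label_counts in counts.items():
        total = sum(label_counts.values())
        probabilities[vector] = {
            label: n / total for label, n in label_counts.items()
        }
    return probabilities


# Made-up sample data: the same vector appears with two different classes.
samples = [
    ([0, 1, 1, 5, 9], "A"),
    ([0, 1, 1, 5, 9], "B"),
    ([0, 1, 1, 5, 9], "A"),
    ([1, 0, 0, 2, 3], "B"),
]
probs = estimate_class_probabilities(samples)
print(probs[(0, 1, 1, 5, 9)])
```

Relative frequencies are only one possible estimate; with few samples per vector you may prefer some smoothing.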

One solution is to take advantage of the binary components of your vectors, that is, $x_0$, $x_1$ and $x_2$. Since there are three of them, you can identify each triplet with a binary number running from $0=000_2$ to $7=111_2$. This is just a grouping criterion for showing in lower dimension what actually lives in $5$-D; there is no real math here.

In this way you can work on $8$ objects separately, each one representing the group of vectors that share the same three binary coordinates.
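As a rough sketch of this grouping, the triplet can be turned into a group number from $0$ to $7$. Reading $x_0$ as the most significant bit is my own choice here; any fixed bit order works. The names `group_index` and `split_into_groups` are hypothetical.

```python
def group_index(vector):
    """Read (x0, x1, x2) as a 3-bit binary number, x0 being the high bit."""
    x0, x1, x2 = vector[:3]
    return 4 * x0 + 2 * x1 + x2


def split_into_groups(vectors):
    """Collect the remaining (x3, x4) pairs under each of the 8 group numbers."""
    groups = {g: [] for g in range(8)}
    for v in vectors:
        groups[group_index(v)].append(tuple(v[3:]))
    return groups


groups = split_into_groups([[0, 1, 1, 5, 9], [1, 1, 1, 0, 0], [0, 1, 1, 2, 4]])
print(groups[3])  # the two vectors whose binary part is 011
```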

Now, each of these groups still has $2$ free coordinates, each running from $0$ to $9$. The first thing that crossed my mind is that such a group can be represented by a $10\times 10$ square made of $100$ points in total.

So the setting is this: you have $8$ squares sitting next to one another (or arranged however you wish), each made of $100$ points. Each point is one of your feature vectors.
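To make the layout concrete, here is a minimal sketch that maps a vector to a point on the floor, assuming the $8$ squares are placed in a row along the $x$ axis with a small gap between them. The row arrangement, the `gap` parameter and the name `floor_position` are assumptions; any other arrangement would do.

```python
def floor_position(vector, gap=2):
    """Return the (x, y) floor coordinates of a feature vector.

    The binary part picks one of 8 squares laid out left to right;
    x3 and x4 give the position inside that 10 x 10 square.
    """
    x0, x1, x2, x3, x4 = vector
    group = 4 * x0 + 2 * x1 + x2      # which square, 0..7
    x = group * (10 + gap) + x3       # shift right by whole squares plus gaps
    y = x4
    return x, y


print(floor_position((0, 1, 1, 5, 9)))
```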

For the probabilities, suppose you have $n$ different classes $A_1,\dots,A_n$. You could assign a different color to each class, and then proceed like this: from each point on the squares, sketch a segment orthogonal to the square and passing through that point. Divide it into $n$ equal parts and color the $k$-th part with the color of the $k$-th class. You can change the saturation/brightness of each part according to the probability that the given vector belongs to that class (brighter color = higher probability).

In my head the final result would look something like this: $8$ flat squares lying near each other in the $xy$ plane (the floor). Each square is really a grid made of $100$ little squares. Each little square emits a colored line segment in the $z$-direction. Since each class's color sits at the same height on every segment, you can "see" every class at the same time, because you have $n$ color layers which are somewhat transparent. There will be brighter spots where the probability is high.
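The picture described above could be sketched roughly as follows, assuming matplotlib (3.2 or newer, for `projection="3d"` without extra imports) is available and that the probabilities are stored as a dict from vector to `{class: probability}`. I use the line's `alpha` (opacity) as a stand-in for saturation/brightness, which is a simplification. The function names, the `gap` parameter and the example data are all made up.

```python
def column_segments(probs, classes, gap=2):
    """List (x, y, z_low, z_high, class_name, probability) for every segment.

    Each vector gets one vertical column split into len(classes) unit-length
    parts; the k-th part belongs to the k-th class.
    """
    segments = []
    for (x0, x1, x2, x3, x4), class_probs in probs.items():
        group = 4 * x0 + 2 * x1 + x2      # which of the 8 squares
        x = group * (10 + gap) + x3       # squares laid out along the x axis
        y = x4
        for k, name in enumerate(classes):
            p = class_probs.get(name, 0.0)
            segments.append((x, y, k, k + 1, name, p))
    return segments


def draw_columns(probs, classes, colors, gap=2):
    """Draw the columns in 3D; opacity encodes the probability."""
    # Imported here so column_segments works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    for x, y, z_low, z_high, name, p in column_segments(probs, classes, gap):
        if p > 0:  # skip fully transparent parts
            ax.plot([x, x], [y, y], [z_low, z_high],
                    color=colors[name], alpha=p, linewidth=3)
    ax.set_xlabel("group offset + x3")
    ax.set_ylabel("x4")
    ax.set_zlabel("class layer")
    return fig


# Made-up probabilities for two vectors and two classes.
example_probs = {
    (0, 1, 1, 5, 9): {"A": 0.75, "B": 0.25},
    (1, 0, 0, 2, 3): {"B": 1.0},
}
example_segments = column_segments(example_probs, ["A", "B"])
```

To actually render it, something like `draw_columns(example_probs, ["A", "B"], {"A": "tab:red", "B": "tab:blue"}).savefig("columns.png")` should work; the file name is arbitrary.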

Maybe it's really too complicated, but it could give you some ideas.