Equations for Classification & Probability Problem


There are 4 containers (classes) for keeping balls of different colors (red, green, blue, orange). We know that Container A is for red balls because 80% of the balls it contains are red. Likewise, B is for green balls (90%) and D is for orange balls (70%). This is a simple illustration of a classification problem that I'm working on.

My question: What kind of inference method can I use to show that Container C is for keeping blue balls? I need help forming the equations, preferably at the level of an academic paper.


Best answer

I'm not sure one can deduce the meaning of "is for keeping" from one example. You have tracked the color proportions within each box. In your example, though, it seems better to track how the balls of each color are distributed across the boxes:

Of the 9 red balls, the box holding the largest share is A, with 4.

Of the 7 green balls, the box holding the largest share is B, with 4.

Of the 5 blue balls, the box holding the largest share is C, with 2.

Of the 7 orange balls, the box holding the largest share is D, with 4.

In this particular example, that seems a better way to attain your objective of connecting colors with boxes. Of course, this method can run into trouble in various ways, perhaps most obviously when the number of balls of a particular color is divisible by 4: the balls can then be split evenly among the four boxes, so that no box has the largest share. And with so few balls, cases must sometimes arise in which there is no sensible one-to-one connection.
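The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the count table is invented (it reproduces the per-color totals and maxima quoted in this answer, but the remaining entries are made up and do not match the percentages in the question), and the names `counts`, `box_given_color`, and `assign_boxes` are hypothetical.

```python
from collections import defaultdict

# Hypothetical counts[box][color]. Only the per-color totals (9, 7, 5, 7)
# and the largest entries (A=4, B=4, C=2, D=4) come from the answer;
# every other number is invented for illustration.
counts = {
    "A": {"red": 4, "green": 1, "blue": 1, "orange": 1},
    "B": {"red": 2, "green": 4, "blue": 1, "orange": 1},
    "C": {"red": 2, "green": 1, "blue": 2, "orange": 1},
    "D": {"red": 1, "green": 1, "blue": 1, "orange": 4},
}


def box_given_color(counts):
    """Return the share of each color found in each box: {color: {box: share}}."""
    totals = defaultdict(int)
    for by_color in counts.values():
        for color, n in by_color.items():
            totals[color] += n
    return {
        color: {box: counts[box].get(color, 0) / total for box in counts}
        for color, total in totals.items()
        if total > 0
    }


def assign_boxes(counts):
    """Map each color to the box with the largest share, or None on a tie."""
    result = {}
    for color, shares in box_given_color(counts).items():
        best = max(shares.values())
        winners = [box for box, share in shares.items() if share == best]
        result[color] = winners[0] if len(winners) == 1 else None
    return result


assignment = assign_boxes(counts)
print(assignment)
```

Returning `None` on a tie is one possible design choice; it makes the failure case mentioned above (an even split across boxes) visible instead of silently picking a box.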

In some sense, what I have done has the flavor of Bayes' theorem: I have moved from P(color|box) to P(box|color).
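One way to write this down, as a sketch rather than a definitive formulation: let $n_{b,c}$ be the number of balls of color $c$ in box $b$, let $n_b = \sum_{c} n_{b,c}$, and let $N = \sum_{b} n_b$. These symbols are my own notation, and the model assumes a ball is drawn uniformly at random from all $N$ balls. Then

$$P(c \mid b) = \frac{n_{b,c}}{n_b}, \qquad P(b) = \frac{n_b}{N},$$

and Bayes' theorem gives

$$P(b \mid c) = \frac{P(c \mid b)\,P(b)}{\sum_{b'} P(c \mid b')\,P(b')} = \frac{n_{b,c}}{\sum_{b'} n_{b',c}}.$$

The rule used above assigns each color to the box that maximizes this quantity:

$$\hat{b}(c) = \arg\max_{b} P(b \mid c).$$

For the blue balls in this example, $P(\mathrm{C} \mid \text{blue}) = 2/5$, which is the largest value among the four boxes, so $\hat{b}(\text{blue}) = \mathrm{C}$.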