I am not a math whiz, so I don’t have the right vocabulary to ask this correctly — please bear with me.
Suppose we have a committee of humans and an AI, each of which has an “algorithm” for recommending a teacher to teach some class. Both use past performance data, plus whatever else they see fit, to predict performance outcomes.

The committee suggests Joe; the AI suggests Susie.
The school principal ultimately has to make the decision based on one of these recommendations (either Joe or Susie).
And let’s assume that teacher performance can be uniformly measured.
The principal picks Susie over Joe... and Susie ends up performing at level X.
Question: if you were selling this AI to principals, how could you prove (or fail to prove) that Joe would have performed at something less than level X?
In other words, how can you measure or benchmark against an outcome that can never take place once you pick its alternative?
I don’t need a mathematical answer... but how does one work through this quantitatively?
I hope the question is clear. Thanks in advance for your guidance.
Bonus: I have to imagine this problem has a name. If you could point me to the topic name, that would be great too, so I can study it further.