Calculating Inflection Point in Real World Data

I can think of a few ways to get a good grasp of this problem from an experimental perspective, but I'm looking for a more general explanation that can be applied to different situations. If this question belongs somewhere else, please let me know.

I'll try to explain the real-world analogy before attempting a formal description. Say I'm investigating how many doctor visits a person has in a year, and I have a healthy sample size. After some quick observations, I form a hypothesis. There seem to be three categories of patients: the vast majority make one or two visits in a year and presumably solve their problem (they just wanted a check-up, so I count completing the visit as solving their problem). The next 'category' sees the doctor a moderate number of times (in the range of 3-10), and the remainder see the doctor many more times; this group is less well defined, but anywhere from 11-350 times. The reasons behind these visits are a topic for another place, but we notice there is some distinction between the groups. Is there a way to make a statistical inference about the boundaries of these groups? Is there any mathematics behind this, or would it always be completely subjective?

It appears to be not quite as simple as percentiles, because there could be 70% in the first group, 25% in the next, and 5% in the last. I feel like I could just adjust these cut-offs to better "fit", but is there a way to find the inflection point where the groups are most clearly defined?

There is 1 best solution below

Sounds like you want something like k-means clustering. See here.

Roughly speaking, this technique will allow you to partition your observations into three sets in such a way that "most" observations are "close" to the "centers" of these sets.
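
To make the idea concrete, here is a minimal sketch of k-means (Lloyd's algorithm) on one-dimensional data, written in plain Python. The function name `kmeans_1d`, the way the starting centers are chosen, and the sample `visits` list are all assumptions made up for illustration; they are not from the question's actual data. The group boundaries are taken as the midpoints between neighbouring cluster centers, since in one dimension each observation is assigned to whichever center is nearest.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm for 1-D data.

    Returns (centers, boundaries): the sorted cluster means and the
    midpoints between neighbouring means, which act as group cut-offs.
    """
    unique = sorted(set(values))
    if len(unique) < k:
        raise ValueError("need at least k distinct values")

    # Assumed initialisation: spread the starting centers across the
    # distinct observed values so that no two centers coincide.
    centers = [float(unique[int((i + 0.5) * len(unique) / k)]) for i in range(k)]

    for _ in range(max_iter):
        # Assignment step: each observation joins its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)

        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:  # nothing moved, so we have converged
            break
        centers = new_centers

    centers.sort()
    boundaries = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
    return centers, boundaries


# Hypothetical yearly visit counts, loosely shaped like the question's story.
visits = (
    [1] * 40 + [2] * 30
    + [3] * 8 + [4] * 6 + [5] * 4 + [6] * 3 + [8] * 2 + [10] * 2
    + [15, 40, 90, 200, 350]
)

centers, boundaries = kmeans_1d(visits, k=3)
print("centers:", centers)
print("boundaries:", boundaries)
```

Note that the number of groups (`k=3`) is an input here, not something the method infers. With a long tail like 11-350 visits, the few very large counts can pull the centers toward themselves, so it may be worth also trying the same procedure on the logarithm of the counts and comparing the resulting boundaries.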