Motivating Example:
I have a procedural generation system that makes shapes. Sometimes the algorithm makes a "good" shape and sometimes a "bad" shape, depending on the parameters it's given.
For example, when creating leaves:
This one is "good" (purely a subjective notion)
whereas this one is "bad" because the pieces are not connected:
The parameters input to the above leaf generator are of varied types: booleans, integers, fractions; perhaps 10 parameters in total.
I don't have an analytical way of knowing exactly which set of parameters might result in "bad" results.
The Goal
I want to have some way to characterize the ranges of "good" vs "bad" parameters, trained by some data points, and generate new parameter sets from that characterization.
I imagine something like this:
- I run through ~100 random sets of parameters manually
- For each one, I produce a "good" or "bad" annotation (just using my eyes)
- I run <algorithm 1> to build a description of the "good" regions in that 10-dimensional space.
- Later, I can run <algorithm 2> with the above description as input. It will produce some guess about a random "good" set of parameters.
In this way, I hope to maximize the amount of good shapes produced vs. bad ones.
I'm ultimately trying to find good choices of <algorithm 1> and <algorithm 2>.
Note: <algorithm 1> can be slow, as it is run offline. But <algorithm 2> must be fast, since it is used at runtime to create new shapes.
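To make the two-algorithm split concrete, here is a minimal interface sketch. The names (`build_model`, `sample_good_params`) and the jitter-a-known-good-point strategy are placeholders I made up for illustration; any of the ideas below would slot in behind this interface.

```python
# Hypothetical interface for the offline/online split described above.
# build_model = <algorithm 1> (slow, offline); sample_good_params =
# <algorithm 2> (fast, runtime). The "model" here is trivially just
# the list of good training points.
import random

def build_model(labeled_points):
    """labeled_points: list of (params, is_good) pairs, where params
    is a tuple of 10 numbers. Keep only the good points."""
    return [p for p, good in labeled_points if good]

def sample_good_params(model, jitter=0.05):
    """Perturb a random known-good point slightly, assuming the
    'good' regions are locally contiguous around training points."""
    base = random.choice(model)
    return tuple(x + random.uniform(-jitter, jitter) for x in base)

# Usage: train once offline, then sample cheaply at runtime.
training = [((0.1,) * 10, True), ((0.9,) * 10, False)]
model = build_model(training)
params = sample_good_params(model)
```

Even this naive version satisfies the speed constraint: `sample_good_params` is O(1) per call, while all the work lives in the offline step.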
Ideas So Far
I have thought about this some, and have a few ideas. I am looking for more.
- Treat the "parameter space" as an N-dimensional volume. I could create a Voronoi diagram from the training points, annotating each cell as good/bad. To generate new parameters, I could choose a random "good" cell and then a random point inside it. (I'm not sure of a good way to find a random point inside a 10-dimensional Voronoi cell, though.)
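One possible way around the "random point in a 10-D Voronoi cell" problem: you never need the cell's geometry explicitly, because a point lies in a seed's Voronoi cell exactly when that seed is its nearest neighbor. So you can rejection-sample: jitter a good seed and accept the candidate once its nearest training point is still good. A minimal numpy sketch (the function name and the Gaussian jitter scale `sigma` are my own assumptions):

```python
# Sample from a "good" Voronoi cell without building the diagram:
# accept a candidate iff its nearest training point is labeled good,
# which is equivalent to it lying in a good cell.
import numpy as np

rng = np.random.default_rng(0)

def sample_good_cell(points, labels, sigma=0.1, max_tries=1000):
    """points: (n, d) array of training parameter sets.
    labels: boolean array of length n (True = good)."""
    good_idx = np.flatnonzero(labels)
    for _ in range(max_tries):
        seed = points[rng.choice(good_idx)]
        candidate = seed + rng.normal(0.0, sigma, size=points.shape[1])
        # Nearest training point determines which Voronoi cell we hit.
        nearest = np.argmin(np.linalg.norm(points - candidate, axis=1))
        if labels[nearest]:
            return candidate
    raise RuntimeError("no good candidate found")

# Tiny 10-D example: one good and one bad training point.
pts = np.vstack([np.zeros(10), np.ones(10)])
lbl = np.array([True, False])
p = sample_good_cell(pts, lbl)
```

With only ~100 training points the nearest-neighbor check is a handful of distance computations, so this stays fast enough for runtime use; a k-d tree (e.g. `scipy.spatial.cKDTree`) could replace the brute-force `argmin` if the point count grows.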
- Similar to above, but discretize space with a grid (10D voxels). Mark grid cells as good/bad based on distance from the training points. This seems like a lot of data to process, given that I have few training points and high dimensionality.
- Treat this as a machine learning problem; it is essentially a binary classification task (good/bad). I could generate random N-dimensional points and stop when the classifier says "good". This seems like it could be slow, and I want the "find new interesting parameter sets" part to be fast.
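For what it's worth, the classifier-plus-rejection-sampling loop from the third idea is only slow when "good" regions occupy a small fraction of the sampled box, since the expected number of tries is 1/(good fraction). A sketch, using a 1-nearest-neighbor rule as a stand-in classifier (a trained scikit-learn model or similar would slot into `predict_good` the same way; all names here are illustrative):

```python
# Idea 3 as code: train any binary classifier offline, then draw
# uniform candidates at runtime until one is classified "good".
import numpy as np

rng = np.random.default_rng(1)

def predict_good(train_x, train_y, x):
    """Stand-in classifier: label of the nearest training point."""
    return train_y[np.argmin(np.linalg.norm(train_x - x, axis=1))]

def sample_until_good(train_x, train_y, lo, hi, max_tries=10000):
    """Rejection-sample uniformly in the box [lo, hi]^d. Fast when
    the 'good' region is not a tiny fraction of the box."""
    for _ in range(max_tries):
        x = rng.uniform(lo, hi, size=train_x.shape[1])
        if predict_good(train_x, train_y, x):
            return x
    raise RuntimeError("good region too small for rejection sampling")

train_x = np.vstack([np.full(10, 0.2), np.full(10, 0.8)])
train_y = np.array([True, False])
x = sample_until_good(train_x, train_y, 0.0, 1.0)
```

If rejection sampling does prove too slow, sampling near known-good points (as in the Voronoi idea) and filtering through the classifier combines the strengths of both approaches.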
Any other ideas?
Also, any other areas of math/statistics/etc you can recommend for this sort of problem, that I can research? I don't always have the right terminology to get good search results.

