My dataset consists of 799 images, each associated with a real number. I split the dataset randomly into 3 parts (80-10-10 split) and trained my model on Split1 only. The target values follow a Gaussian distribution.
I trained a CNN to output a real number when an image is fed to it. These are my $R^2$ values and residual plots for each dataset split.
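For reference, a random 80-10-10 split of this kind could be sketched as follows. This is only an illustration: the helper name `split_indices`, the seed, and the rounding of the split sizes are assumptions, not necessarily what was used here.

```python
import numpy as np

def split_indices(n_samples, fractions=(0.8, 0.1, 0.1), seed=0):
    """Randomly partition range(n_samples) into three disjoint index arrays.

    Hypothetical helper: the fractions follow the 80-10-10 split described
    above; the seed and the rounding rule are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)          # shuffled indices 0..n_samples-1
    n_first = int(fractions[0] * n_samples)    # size of the training part
    n_second = int(fractions[1] * n_samples)   # size of the second part
    first = perm[:n_first]
    second = perm[n_first:n_first + n_second]
    third = perm[n_first + n_second:]          # whatever remains
    return first, second, third

split1, split2, split3 = split_indices(799)
print(len(split1), len(split2), len(split3))
```

With only 799 images, the two held-out parts contain roughly 80 images each, so any metric computed on them is based on a fairly small sample.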
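To make clear what is being reported, here is a minimal sketch of how $R^2$ and the residuals can be computed for one split. It uses the standard definition $R^2 = 1 - SS_{res}/SS_{tot}$; the function names and the small example arrays are made up for illustration and are not my actual data or predictions.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

def residuals(y_true, y_pred):
    """Residuals defined as true value minus predicted value."""
    return np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)

# Illustrative values only (not the real targets or model outputs).
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
print(r2_score(y_true, y_pred))
print(residuals(y_true, y_pred))
```

A residual plot is then a scatter of these residuals against the predicted (or true) values, one plot per split.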
Split1:
$R^2$ = 0.9857076788979228
Residual plot:
Split2:
$R^2$ = 0.7396153728584336
Residual plot:
Split3:
$R^2$ = 0.7290908907070102
Residual plot:
My model seems to be overfitting: the $R^2$ value on Split1, which I trained on, is much higher than on Split2 and Split3. I don't understand why the $R^2$ values of Split2 and Split3 are nearly the same and why their residual plots look so similar.
What should my next step be? Should I reduce the model complexity?


