I am writing an application to predict a user's official exam score given at least one tuple, where each tuple represents the result of a practice exam:
(score, exam_name, date_taken[optional])
Users submit practice exam scores and, later, their official exam score (date required); not all users take all available practice exams. The dataset looks like this:
$$ \begin{bmatrix} a & b & c & d & e & f\\ g\\ h & i & & j\\ k & l & m &&n \end{bmatrix} \begin{bmatrix} (actual,date)\\ (actual,date)\\ (actual,date)\\ (null,date) \end{bmatrix} $$
Each row represents a user. Each column in the first matrix is a particular practice exam; the second matrix holds the official score (actual) and the date taken. Each element in the first matrix is a tuple of the form described above. Users complete an arbitrary combination of practice exams.
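To make the layout concrete, here is a minimal sketch of how the two matrices could be held in memory, assuming NumPy and using NaN for a practice exam a user did not take. The scores are made-up placeholders, the dates are left out for brevity, and the names `practice`, `official`, `observed`, and `to_predict` are only illustrative.

```python
import numpy as np

nan = np.nan

# Rows = users, columns = practice exams; NaN marks an exam the user skipped.
# All numbers are made-up placeholders that mirror the sparsity pattern above.
practice = np.array([
    [505, 508, 510, 507, 511, 512],   # user 1: all six exams
    [498, nan, nan, nan, nan, nan],   # user 2: one exam
    [515, 517, nan, 516, nan, nan],   # user 3: three exams
    [509, 510, 512, nan, 513, nan],   # user 4: four exams
])

# Official scores; NaN means the user has not reported one yet.
official = np.array([511, 500, 518, nan])

observed = ~np.isnan(practice)     # mask of practice exams actually taken
to_predict = np.isnan(official)    # users who still need a prediction

print(observed.sum(axis=1))        # number of practice exams per user
print(np.flatnonzero(to_predict))  # row indices of users awaiting a prediction
```

The mask `observed` is what any model would have to respect, since an empty cell means "not taken" rather than a score of zero.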
The user represented by the 4th row has entered four events (practice exams) and the date he plans to take the official exam, but not his actual score; as such, we want to provide him with a prediction.
What model(s) might be suited for this problem? Is this question too broad?
I'm not sure how to approach this problem. I have a hunch that naïve Bayes might be best because of the "incompleteness" of the matrix, and because I want to provide a prediction to the user after any input of data (e.g., a user who inputs a single tuple with no date, (512, mcat_24), should still receive a prediction).
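To illustrate the "prediction after any input" requirement, here is a rough baseline sketch. It is not naïve Bayes: it fits one straight line per practice exam (official score against that exam's score) and averages the predictions over whichever exams a user has supplied. The data is synthetic and generated only for the example, dates are ignored, and `fit_per_exam` and `predict` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not real scores): 200 users, 6 exams.
n_users, n_exams = 200, 6
ability = rng.normal(505, 8, size=n_users)
practice = ability[:, None] + rng.normal(0, 3, size=(n_users, n_exams))
official = ability + rng.normal(0, 2, size=n_users)

# Hide about half of the practice scores to mimic users skipping exams.
practice[rng.random(practice.shape) < 0.5] = np.nan


def fit_per_exam(practice, official):
    """Fit official ~ slope * practice + intercept separately for each exam."""
    coefs = []
    for j in range(practice.shape[1]):
        seen = ~np.isnan(practice[:, j])  # users who took exam j
        coefs.append(np.polyfit(practice[seen, j], official[seen], deg=1))
    return np.array(coefs)  # shape (n_exams, 2): slope, intercept


def predict(coefs, scores):
    """Average the per-exam predictions; `scores` maps column index -> score."""
    preds = [coefs[j, 0] * s + coefs[j, 1] for j, s in scores.items()]
    return float(np.mean(preds))


coefs = fit_per_exam(practice, official)

print(predict(coefs, {0: 512}))                          # a single practice score
print(predict(coefs, {0: 509, 1: 510, 2: 512, 4: 513}))  # four practice scores
```

Because each exam gets its own fit, a missing exam simply drops out of the average, so a user with a single score still gets an answer. The cost is that the baseline ignores correlations between exams and any effect of the dates.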
I've also considered a neural network, where each layer processes a tuple. My main difficulty is that I don't even know how to describe this problem.