How to find a formula for a sequence of slightly inaccurate numbers


I'm in the process of reverse engineering a music file format from an old computer game. It uses numbers from 0 to 127 to represent note frequencies; however, I need to convert these numbers to a different unit (in this case millihertz) in order to use them.

Unfortunately, for various technical reasons, the output frequency values I am working with are slightly inaccurate due to rounding. I am not sure how to take this into account when trying to find a formula to fit the sequence, as most explanations I can understand assume the values are precise.

Could someone please help me find a formula to fit this sequence? Here are some values. The first column is the input note number, and the second column is the output frequency in millihertz.

  0 16262
  1 17258
  2 18301
  3 19391
  4 20529
  5 21762
  6 23042
 24 51774
 50 146410
 80 520403
 98 1171287
127 1796378

I am reasonably certain the formula to calculate these is quite simple (given the game would not want to waste too much processing power on the music), but I'm being thwarted by the inaccuracies in the above list of numbers.

EDIT:

Here are some more values after some of the discussion below:

 73 439232
 89 878465
126 1772103

EDIT2:

Here are some more values as requested. It looks like values are indeed invalid where the lower four bits of the note number are >= 12.

  7 24417
  8 25887
  9 27452
 10 29064
 11 30818
 12 13654  // possible invalid note
 15 56136  // possible invalid note
 16 32525

There are 3 best solutions below

1
On BEST ANSWER

Actually, the answer is simple enough in hindsight. The indices consist of two 4-bit numbers. The lower four bits encode the note from C to B, the upper four bits encode the octave. I'm leaving the rest of the answer just because it documents the process of arriving at that result; you won't need any of that.
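A minimal sketch of that decoding in Python, under two assumptions that are mine rather than anything confirmed about the game: the tuning is equal-tempered, and the measured value for index 0 (16262 mHz) serves as the base frequency. The names `BASE_MHZ` and `note_to_millihertz` are made up for illustration.

```python
BASE_MHZ = 16262  # measured value for index 0 in the question (assumed base C)

def note_to_millihertz(index: int) -> int:
    """Convert a 0-127 note index to an approximate frequency in millihertz."""
    if not 0 <= index <= 127:
        raise ValueError(f"index out of range: {index}")
    # High nibble = octave, low nibble = note within the octave (C..B).
    octave, note = divmod(index, 16)
    if note >= 12:
        raise ValueError(f"low nibble {note} is not a valid note (0-11)")
    semitones = 12 * octave + note
    # Equal temperament: each semitone multiplies the frequency by 2**(1/12).
    return round(BASE_MHZ * 2 ** (semitones / 12))
```

Since the measured values carry rounding errors, this only approximates them; for example, index 16 gives 32524 here against a measured 32525.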


Much of the world's music, almost all Western music, and practically all computer game music is based on semitones. Two frequencies that form a semitone interval differ by a factor of $\sqrt[12]2\approx1.06$.

In your data, the interval between any two of the first seven frequencies is a semitone; the interval between notes $6$ and $24$ is $12\log_2(51774/23042)\approx14$ semitones, and so on, the next three intervals comprising $18$, $22$ and $14$ semitones, respectively. The last one doesn't come out close to an integral number of semitones; perhaps there's a typo in the last frequency?
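For anyone who wants to reproduce these interval sizes, here is a short Python sketch. It uses only the twelve values from the original table in the question; `semitones_between` is just an illustrative helper name.

```python
import math

# (note index, frequency in millihertz) pairs from the question
measured = [(0, 16262), (1, 17258), (2, 18301), (3, 19391), (4, 20529),
            (5, 21762), (6, 23042), (24, 51774), (50, 146410),
            (80, 520403), (98, 1171287), (127, 1796378)]

def semitones_between(f_low, f_high):
    """Size of the interval between two frequencies, in semitones."""
    return 12 * math.log2(f_high / f_low)

# Interval between each pair of consecutive table entries.
intervals = [
    (i0, i1, semitones_between(f0, f1))
    for (i0, f0), (i1, f1) in zip(measured, measured[1:])
]

for i0, i1, st in intervals:
    print(f"{i0:3d} -> {i1:3d}: {st:6.2f} semitones")
```

Every interval except the last one (98 to 127) should land close to a whole number of semitones.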

[Edit in response to comment:]

Here's a correspondence between indices and semitones of the values you've given, except the upper two that seem to be less exact:

$$ \begin{array}{l|ccccccccccccc} \text{index}&0&1&2&3&4&5&6&24&50&73&80&89&98\\\hline \text{semitone}&0&1&2&3&4&5&6&20&38&57&60&69&74 \end{array} $$

Here's a fit based on that correspondence. The fit is as exact as you could wish for, given that you know there are rounding errors in your data. The $R^2$ value is $0.999999$, the resulting function for mapping semitones (not indices) to frequencies is $16273.4\mathrm e^{0.057792x}$, and the coefficient $0.057792$ in the exponent is approximately $\frac1{12}\log2\approx0.057762$, as it should be. Thus, if you can complete that correspondence between indices and semitones, you can get the frequencies to a good accuracy. However, it's not obvious from the values you've provided so far how to complete this correspondence – at the lower end of the spectrum the indices appear to correspond directly to semitones, but then later they seem to move in irregular steps, with $18$ indices corresponding to $14$ semitones, then $26$ to $18$, $23$ to $19$, $7$ to $3$, $9$ to $9$ and $9$ to $5$. You'll probably have to measure more values to make sense of that – if you don't want to measure too many, you could start with some between $73$ and $80$, corresponding to semitones $57$ to $60$, since that would allow you to see whether two indices map to the same semitone or whether there are fractional semitone steps in between.
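A fit of this kind can be approximated with an ordinary least-squares line through the logarithms of the frequencies. The sketch below is not the exact procedure used for the numbers quoted above (a log-linear fit and a direct exponential fit can differ slightly in the last digits); it only illustrates that the slope comes out near $\frac1{12}\log2$. The semitone numbers are taken from the correspondence table above.

```python
import math

# (semitone, frequency in millihertz), using the index-to-semitone table
points = [(0, 16262), (1, 17258), (2, 18301), (3, 19391), (4, 20529),
          (5, 21762), (6, 23042), (20, 51774), (38, 146410), (57, 439232),
          (60, 520403), (69, 878465), (74, 1171287)]

xs = [s for s, _ in points]
ys = [math.log(f) for _, f in points]  # fit a straight line to ln(frequency)
n = len(points)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least-squares slope and intercept.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
amplitude = math.exp(intercept)  # frequency at semitone 0, in millihertz

print(f"f(x) ~ {amplitude:.1f} * exp({slope:.6f} * x)")
print(f"ln(2)/12 = {math.log(2) / 12:.6f}")
```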

I don't know whether it's a coincidence, but your index $73$, corresponding to semitone $57$, appears to correspond to your A; at $439.2$ Hz, its frequency is slightly below $440$ Hz, and all the other frequencies are "too low" (with respect to the $440$ Hz pitch standard) by a similar factor.

10
On

It may seem a good guess to assume that these are standard frequencies of musical notes from the musical scale. To have a formula that is as simple as possible, we shall assume the equal-tempered scale, where each note differs from the base note $a=440\,\text{Hz}$ by a factor of $(\sqrt[12]{2})^k$ for some integer $k$. In particular, we have a factor of $2$ for every twelve steps. Unfortunately, this does not match your given data at all. For example, 127 seems to be slightly above $a''$ (i.e. $a$ plus $24.35$ half tones) and 98 seems to be slightly below $d''$ (i.e. $a$ plus $16.95$ half tones). The index difference of $29$ seems too big for roughly seven half-tone steps. Apparently whatever process ultimately converts the indices to an audible sound has some built-in nonlinearity.

EDIT: After dropping the value for 127 and using Wolfram Alpha, I suggest $$f(k)\approx\exp(0.0430399 k+9.74885)$$ as a simple fit. Observe that this corresponds to a musical scale having $\approx16$ instead of $12$ steps per octave, so four steps correspond to three conventional half-tone steps. This would allow producing, for example, C, D#, F#, A somewhat accurately, but not the others. Then again, the 12 step scale is not the answer to all questions.
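To spell out where the $\approx16$ comes from: the number $n$ of index steps needed to double the frequency satisfies

$$ e^{0.0430399\,n}=2 \quad\Longleftrightarrow\quad n=\frac{\ln 2}{0.0430399}\approx\frac{0.693147}{0.0430399}\approx16.1 . $$

Four such steps then span $4\cdot\frac{12}{16.1}\approx2.98$ conventional half tones, which is the "four steps to three half tones" relation mentioned above.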

5
On

I think I've cracked it. Take your index $i$ $(0 \le i \le 127)$, and express it in hex. Then the high nibble $\lfloor i / 16\rfloor$ is the octave, and the low nibble $i$ mod $16$ is the number of semitones (0-11) above C in that octave. If this is right, then indexes with a low nibble greater than 11 (hex 'B') are simply invalid (and presumably they are never used). This would include the anomalous 126 and 127.

Index 0 represents the C four octaves below middle C.

This doesn't explain all your figures to the last digit, but it will certainly be good enough for your average game player. Perhaps these figures are not well-tempered at all, but optimised for a certain key. I might try to find which key that is if I find myself with some free time :-)
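To see how far the measured figures stray from this scheme, one can express each deviation in cents (hundredths of a semitone). The sketch below assumes equal temperament and uses the measured value for index 0 as the base; both are my assumptions, and the helper names are illustrative. The data are the valid-looking values posted in the question.

```python
import math

BASE_MHZ = 16262  # index 0 from the question, assumed to be the reference C

# Measured (index -> millihertz) values from the question, valid notes only.
measured = {7: 24417, 8: 25887, 9: 27452, 10: 29064, 11: 30818, 16: 32525,
            24: 51774, 50: 146410, 73: 439232, 80: 520403, 89: 878465,
            98: 1171287}

def predicted_mhz(index):
    """Equal-tempered prediction: high nibble = octave, low nibble = note."""
    octave, note = divmod(index, 16)
    return BASE_MHZ * 2 ** ((12 * octave + note) / 12)

def cents_off(index, freq):
    """Deviation of a measured frequency from the prediction, in cents."""
    return 1200 * math.log2(freq / predicted_mhz(index))

deviations = {i: cents_off(i, f) for i, f in measured.items()}

for i, c in sorted(deviations.items()):
    print(f"index {i:3d} (0x{i:02X}): {c:+6.2f} cents")
```

For these values the deviations should stay below about ten cents, and indices with the same low nibble seem to deviate by similar amounts, which would be consistent with a per-note table that is simply doubled for each octave.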

With this in mind, can you post the frequencies for 7, 8, 9, 10, and 11, please?