Stephen Hazel suggested some dimensions: time, pitch, velocity of the note-down event, current root note of the chord, chord type (major/minor/7th/etc.), pan of the mix, volume of the mix, and the sustain (hold) pedal. Vi Hart has popularized orbifolds with notes. These descriptions are very limited, because sounds and music are characterized by great variety, which suggests treating the actions as stochastic. If you can create a mapping $f:\mathbb R^n\to \mathbb R$ from some of these dimensions, you could analyze the function $f$ with time-series models such as SARIMA.
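To make the mapping idea concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption for illustration: the `NoteEvent` fields, the weighted-sum form of `f`, its weights, and the toy four-note pattern are all hypothetical. It is also not a full SARIMA fit. It only applies a seasonal difference and estimates a single AR(1) coefficient by least squares, which is a heavily reduced stand-in for the seasonal and autoregressive parts of such a model.

```python
from dataclasses import dataclass


@dataclass
class NoteEvent:
    """Hypothetical note-down event with three of the suggested dimensions."""
    time: float    # onset position, in beats
    pitch: int     # MIDI note number, 0-127
    velocity: int  # MIDI note-on velocity, 0-127


def f(event, w_pitch=1.0, w_velocity=0.1):
    """Assumed mapping R^n -> R: a weighted sum of pitch and velocity."""
    return w_pitch * event.pitch + w_velocity * event.velocity


def seasonal_difference(series, period):
    """Subtract the value one period earlier to remove a repeating pattern."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + noise (no intercept)."""
    num = sum(cur * prev for cur, prev in zip(series[1:], series[:-1]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den if den else 0.0


# Toy input: a four-note pattern repeated four times, transposed up one
# semitone on each repetition, all at the same velocity.
pattern = [60, 64, 67, 72]
events = [
    NoteEvent(time=float(i), pitch=pattern[i % 4] + i // 4, velocity=80)
    for i in range(16)
]

series = [f(e) for e in events]                 # the scalar time series
diffed = seasonal_difference(series, period=4)  # remove the 4-note repetition
phi = fit_ar1(diffed)                           # persistence of what remains
```

The sketch already shows one drawback of collapsing to a scalar: different (pitch, velocity) pairs can map to the same value of $f$, so the series cannot tell them apart, and the chosen weights decide which dimension dominates.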
I am not looking for a perfect model to describe this kind of creative activity; I am interested in different models and their drawbacks.
What kinds of models are used to describe audio material such as music or sounds? How should you select their dimensions? When should you select them, and when not? What are their limitations?