How to use Support Vector Machines with mixed data?


I have a dataset regarding student records with a mix of continuous, discrete & categorical data - the categorical data takes both nominal and ordinal forms.

Ex: Continuous - GPA
Ex: Discrete - Age when admitted
Ex: Categorical (ordinal) - Admit class (late freshman, early junior, etc.)
Ex: Categorical (nominal) - Admit description (transfer, GED, etc.)

What is the best way to apply SVMs to this dataset? The options seem to be:
1) Convert discrete and categorical values to continuous, real values before applying SVMs.
2) Use an appropriate similarity-based kernel for mixed data, but I cannot find one. Does it even exist?

There are 1 best solutions below

You can convert categorical values to binary features. For example, if car color is a categorical value that can be Green, Blue, or Red, it can be converted to three binary features: feature 1 (Green?), feature 2 (Blue?), and feature 3 (Red?).

A green car would be represented as [1,0,0], a blue car as [0,1,0], and so on.
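The binary encoding described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a definitive implementation: the helper names `one_hot` and `encode_row`, and the numeric `mileage` column, are hypothetical and introduced purely for the example. The category order (Green, Blue, Red) follows the order used in the answer.

```python
def one_hot(value, categories):
    """Return a list with 1 at the position of `value` and 0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


# The order of this list fixes which position each color occupies.
COLORS = ["Green", "Blue", "Red"]

green = one_hot("Green", COLORS)
blue = one_hot("Blue", COLORS)
red = one_hot("Red", COLORS)


def encode_row(mileage, color):
    """Hypothetical mixed record: one numeric column followed by the
    binary features for the nominal color column."""
    return [float(mileage)] + one_hot(color, COLORS)


row = encode_row(3.5, "Blue")
```

Each encoded row is then an ordinary real-valued vector, so it can be passed to an SVM alongside the continuous columns. Scaling the numeric columns first is usually a good idea, since SVMs are sensitive to feature ranges.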