I have a question about applying cluster analysis to time series. I am generally familiar with static cluster analysis. However, I am supposed to cluster a fairly large number of firms ($n=3000$) based on their hourly electricity consumption over a period of $5$ years, i.e. I have the delivered quantities of electricity $y_1, y_2,\dotsc,y_5$. The aim is also to show how individual firms move into and between the clusters over time.
Do you perhaps have an idea?
Thank you very much!
For a cluster analysis, you could compute statistical properties of each company's consumption over a given period of time. These properties could be the mean, the variance, the peak load, or more sophisticated measures of power consumption.
You could separate the five years into 60 months and characterize every company for every month with a vector of such statistical properties. You then perform an ordinary cluster analysis for every month. For a start, you could assume that the companies should be assigned to one of two or three clusters.
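The monthly feature vectors described above could be built roughly as follows. This is only a sketch: it assumes the hourly loads sit in a pandas DataFrame with a datetime index and one column per company, and the helper name `monthly_features`, the column names, and the synthetic data are all made up for illustration.

```python
import numpy as np
import pandas as pd


def monthly_features(load: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (month, firm) with mean, variance and peak load."""
    # Reshape from wide (one column per firm) to long (one row per reading).
    long = (
        load.rename_axis("time")
        .reset_index()
        .melt(id_vars="time", var_name="firm", value_name="load")
    )
    long["month"] = long["time"].dt.to_period("M")
    return long.groupby(["month", "firm"])["load"].agg(
        mean="mean", var="var", peak="max"
    )


# Synthetic stand-in data (NOT real consumption): 4 firms, Jan-Mar 2021, hourly.
rng = np.random.default_rng(0)
hours = pd.Timestamp("2021-01-01") + pd.to_timedelta(np.arange(24 * 90), unit="h")
load = pd.DataFrame(
    rng.gamma(shape=2.0, scale=5.0, size=(len(hours), 4)),
    index=hours,
    columns=["firm_a", "firm_b", "firm_c", "firm_d"],
)

features = monthly_features(load)
print(features.head())
```

Selecting a single month from the result, for example with `features.xs(pd.Period("2021-02"), level="month")`, gives the firms-by-features matrix that the clustering for that month would work on.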
You could try k-means clustering. Principal component analysis could help to visualize the statistical properties of the vectors and to devise a suitable separation/clustering criterion.
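As a rough illustration of this step, the sketch below standardizes one month's feature matrix, runs k-means, and projects the firms onto two principal components for plotting. It assumes scikit-learn is available; the function name `cluster_month`, the choice of three clusters, and the synthetic feature values are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def cluster_month(X: np.ndarray, k: int = 3, seed: int = 0):
    """Cluster one month's feature matrix; return labels and 2-D PCA coordinates."""
    Z = StandardScaler().fit_transform(X)  # put mean/variance/peak on one scale
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Z)
    coords = PCA(n_components=2).fit_transform(Z)  # coordinates for a scatter plot
    return labels, coords


# Synthetic features (mean, variance, peak) for 60 firms in three made-up groups.
rng = np.random.default_rng(1)
centers = np.array([[10.0, 1.0, 15.0], [50.0, 5.0, 70.0], [200.0, 400.0, 400.0]])
X = np.repeat(centers, 20, axis=0) * (1 + 0.02 * rng.standard_normal((60, 3)))

labels, coords = cluster_month(X, k=3)
print(np.bincount(labels))  # number of firms per cluster
```

One caveat when repeating this for every month: k-means label numbers are arbitrary, so cluster 0 in one month need not correspond to cluster 0 in the next. To follow firms between clusters over time, you would probably either fit the model once on the pooled monthly vectors and only assign labels per month, or match the monthly centroids to each other.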
A first distinction could be whether a company is a large or a small consumer, and whether its consumption is fluctuating or steady. Combining these two distinctions would result in four clusters.
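A very simple way to get such a four-way split, before running any clustering algorithm, is a rule-based assignment. The sketch below assumes that "large" means above-median mean load and "fluctuating" means above-median coefficient of variation; both thresholds and the function name `quadrant_labels` are assumptions chosen for illustration, not a recommendation.

```python
import numpy as np


def quadrant_labels(mean_load, cv):
    """Label each firm by size (mean load) and variability (coefficient of variation)."""
    mean_load = np.asarray(mean_load, dtype=float)
    cv = np.asarray(cv, dtype=float)
    large = mean_load > np.median(mean_load)  # assumed cut-off: the median
    fluctuating = cv > np.median(cv)          # assumed cut-off: the median
    names = np.array(
        ["small/steady", "small/fluctuating", "large/steady", "large/fluctuating"]
    )
    return names[2 * large.astype(int) + fluctuating.astype(int)]


# Four made-up firms, one per quadrant.
groups = quadrant_labels(mean_load=[1.0, 2.0, 10.0, 20.0], cv=[0.1, 0.9, 0.2, 0.8])
print(groups)
```

Comparing these rule-based groups with the k-means result can be a quick sanity check on whether the clusters reflect size and variability or something else.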
As power consumption typically varies in a seasonal fashion, you could try to identify seasonal power consumers and non-seasonal ones.
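One possible way to quantify how seasonal a firm is: fit a single annual sine/cosine pair to its monthly mean consumption and use the share of variance explained as a feature. This is just one of several options, and the function name `seasonal_strength`, the 12-month period, and the synthetic series are assumptions for the example.

```python
import numpy as np


def seasonal_strength(monthly_mean) -> float:
    """R^2 of a least-squares fit of one annual sine/cosine pair to monthly means."""
    y = np.asarray(monthly_mean, dtype=float)
    t = np.arange(len(y))
    A = np.column_stack(
        [np.ones(len(y)), np.sin(2 * np.pi * t / 12), np.cos(2 * np.pi * t / 12)]
    )
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    centered = y - y.mean()
    ss_tot = float(centered @ centered)
    if ss_tot == 0.0:  # a perfectly flat series has no seasonal component
        return 0.0
    return 1.0 - float(resid @ resid) / ss_tot


# Synthetic monthly means over 60 months: one seasonal firm, one non-seasonal firm.
rng = np.random.default_rng(2)
months = np.arange(60)
seasonal_firm = 100 + 30 * np.cos(2 * np.pi * months / 12) + rng.normal(0, 2, 60)
flat_firm = 100 + rng.normal(0, 2, 60)

s_seasonal = seasonal_strength(seasonal_firm)
s_flat = seasonal_strength(flat_firm)
print(round(s_seasonal, 3), round(s_flat, 3))
```

A value close to 1 suggests a strongly seasonal consumer, a value close to 0 a non-seasonal one. Note that this score needs at least a full year of data, so it describes a firm over a longer window rather than a single month.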