I have sales data and I want to cluster it based on similarity. By similarity I mean the pattern of peaks, valleys, and slopes.
The items differ in scale on both the time and sales axes: some span a very short time and some a very long one, and some have very high sales and some very low. Basically, I don't need to care about differences in magnitude or duration. An item with a short duration or a low sales peak should fall in the same cluster as items with a larger magnitude or longer duration, as long as its pattern is similar.

What I have read about so far is DTW. However, I don't think it really fits my problem, where the curves differ in both duration and sales magnitude. My understanding is that DTW can handle the time dimension, but if two curves have different magnitudes, the distance between them will be very large.
My approach so far is to scale both dimensions to 0-1 so that every curve has the same extent on both axes. I then run into a problem with DTW because the discrete points on each curve differ. Is this the right approach? If not, what should I do? Any direction would be helpful.
I'm using R to do this.
An initial attempt to find similarities may go something like this:
Standardize each time series - for each sales value, subtract that series' mean sales and divide by that series' standard deviation. Doing this will help alleviate some of the sales scale differences across the time series.
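As a rough sketch of the standardization step in R, assuming each item's sales are held as a numeric vector in a list (the `sales_list` data and the `standardize` helper are made up for illustration):

```r
# Hypothetical example data: one numeric sales vector per item, of varying length.
sales_list <- list(
  item_a = c(10, 20, 40, 30, 15),
  item_b = c(100, 210, 390, 310, 160, 140, 120),
  item_c = c(5, 4, 3, 2, 1, 2)
)

# Z-score one series: subtract its mean, divide by its standard deviation.
standardize <- function(x) {
  s <- sd(x)
  if (is.na(s) || s == 0) {
    # A constant or single-point series has no shape to compare; return zeros.
    return(rep(0, length(x)))
  }
  (x - mean(x)) / s
}

# Standardize every series independently of the others.
z_list <- lapply(sales_list, standardize)
```

After this, every series has mean 0 and standard deviation 1, so a low-selling item and a high-selling item with the same shape end up with the same values.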
Fit a polynomial - maybe start with degree 0 (the arithmetic mean of the series), and progress through degrees 1, 2, 3, ... - and save the estimated coefficients for each series. These estimated coefficient values will be the data you use to determine similarity between time series. The key in choosing which polynomial degree to use is to balance generality (manifested as a tendency towards using lower degree polynomials) versus capturing features in the data (manifested as a tendency towards using higher degree polynomials). Note that the highest-degree polynomial you can use is constrained by the length (i.e., how many time indices) of the shortest time series: fitting a polynomial of degree d requires at least d + 1 points. You may find cubic splines useful as well.
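One possible way to do the fitting in R is sketched below. It uses `lm()` with `poly(..., raw = TRUE)`. Two things here are my own assumptions rather than part of the steps above: the time index of each series is rescaled to [0, 1] so that series of different lengths share a common axis, and degree 3 is just a placeholder. `sales_list` and `poly_coefs` are hypothetical names.

```r
# Hypothetical example data: one numeric sales vector per item, of varying length.
sales_list <- list(
  item_a = c(10, 20, 40, 30, 15),
  item_b = c(100, 210, 390, 310, 160, 140, 120),
  item_c = c(5, 4, 3, 2, 1, 2)
)

# Fit a polynomial of the given degree to one standardized series and
# return its coefficients (intercept first).
poly_coefs <- function(x, degree = 3) {
  stopifnot(length(x) > degree)           # need at least degree + 1 points
  y <- as.numeric(scale(x))               # z-score the sales values
  u <- seq(0, 1, length.out = length(x))  # assumption: rescale time to [0, 1]
  fit <- lm(y ~ poly(u, degree, raw = TRUE))
  unname(coef(fit))
}

degree <- 3  # placeholder; try several degrees and compare the results
coef_mat <- t(sapply(sales_list, poly_coefs, degree = degree))
colnames(coef_mat) <- paste0("b", 0:degree)
coef_mat     # one row per series, one column per coefficient
```

Raw (non-orthogonal) polynomials are used so that the coefficients mean the same thing for every series; the orthogonal basis that `poly()` builds by default depends on each series' own time grid.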
Decide upon a distance function - Euclidean is simple and worth an initial try - and compute the pairwise Euclidean distances between time series where the estimated coefficients of each time series are the input vectors used.
Make sure to exclude any results of a time series' pairwise distance with itself (which will have a distance of 0). Without loss of generality, if time series B has the lowest distance from time series A of all the time series compared with A, then time series B is most 'similar' to A from the perspective of A.
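The distance and nearest-neighbour steps might look something like this in R. The coefficient matrix is invented for illustration, and the hierarchical clustering at the end is only one option I am assuming for turning the distances into clusters, not the only way to do it:

```r
# Hypothetical coefficient matrix: one row per series, one column per coefficient.
coef_mat <- rbind(
  A = c(-1.2,  6.0, -5.0),
  B = c(-1.1,  5.8, -4.9),
  C = c( 1.3, -2.5,  0.4),
  D = c( 1.0, -2.0,  0.1)
)

# Pairwise Euclidean distances between the coefficient vectors.
d_obj <- dist(coef_mat, method = "euclidean")
d <- as.matrix(d_obj)

# Exclude each series' distance to itself before looking for the minimum.
diag(d) <- Inf

# For each series, the name of the series with the smallest distance to it.
nearest <- rownames(d)[apply(d, 1, which.min)]
names(nearest) <- rownames(d)
nearest

# Optional: group the series with hierarchical clustering on the same distances.
# k = 2 is an arbitrary choice for this toy data.
clusters <- cutree(hclust(d_obj), k = 2)
clusters
```

Note that 'most similar' is not symmetric in general: B can be the closest series to A without A being the closest series to B.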