I'm currently studying a new quantitative unit in linguistics, the so-called motif. A motif is an ascending (or descending) sequence of quantitative linguistic properties. To give a better intuition, here's an example:
1. I did not inhale and never tried it again
2. (1 3 3 6) (3 5 5) (2 5)
The example sentence is (1), and its representation in l-motifs is (2). An l-motif is a sequence of lengths, in this case the lengths of the words counted in letters. There are also f-motifs, which are motifs of frequencies, e.g. the frequency of each word in a corpus. Motifs can in turn be taken from motifs: for example, it is possible to count the length of each l-motif. The first length in (2) would be 4. On these numbers it is again possible to find ascending sequences, which again form motifs.
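To make the segmentation concrete, here is a minimal Python sketch of how I compute the l-motifs of the example. It rests on two assumptions of mine: "ascending" is read as non-decreasing (because (1 3 3 6) contains two equal neighbours), and words are obtained by splitting on whitespace with no punctuation handling. The function names `segment_motifs` and `word_lengths` are only illustrative.

```python
def segment_motifs(values):
    """Split a sequence of numbers into maximal non-decreasing runs."""
    motifs = []
    for v in values:
        if motifs and v >= motifs[-1][-1]:
            # The value does not break the ascent, so extend the current motif.
            motifs[-1].append(v)
        else:
            # The value is smaller than its predecessor, so a new motif starts.
            motifs.append([v])
    return [tuple(m) for m in motifs]


def word_lengths(sentence):
    """Word lengths in letters (assumption: whitespace-separated words)."""
    return [len(word) for word in sentence.split()]


sentence = "I did not inhale and never tried it again"

l_motifs = segment_motifs(word_lengths(sentence))
print(l_motifs)            # [(1, 3, 3, 6), (3, 5, 5), (2, 5)]

# Motifs taken from motifs: the lengths of the l-motifs are numbers again,
# so they can be segmented in the same way.
motif_lengths = [len(m) for m in l_motifs]
print(motif_lengths)       # [4, 3, 2]
print(segment_motifs(motif_lengths))  # [(4,), (3,), (2,)]
```

In this example the motif lengths 4, 3, 2 are strictly descending, so under the non-decreasing reading each of them forms a motif of its own.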
I want to formalize this "unit" and its properties for further computations. I thought of modelling it as a "Motifs Category". If I understand the meaning of categories correctly, there are four properties that must hold:
- there need to be objects (in this case the different motifs and, of course, the abstract object of a text, which is "transformed" into a motif)
- there need to be morphisms (in this case the mapping from a text to a motif structure and from one motif structure to another motif structure, either l- or f-)
- there needs to be an identity morphism (which is obvious, I think)
- there needs to be a composition (I can build an f-motif structure from an l-motif structure from an f-motif structure, etc.)
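To show how I read the last two properties, here is a small Python sketch in which morphisms are modelled as plain functions. This is only my own reading, not an established formalization of motifs. `compose`, `identity`, `double` and `total` are hypothetical stand-ins, and the checks run on a single input, so they illustrate the laws rather than prove them. The sketch also checks associativity, which, as far as I know, the usual definition of a category requires of composition as well.

```python
def compose(g, f):
    """Return g ∘ f: apply f first, then g (f's output must fit g's input)."""
    return lambda x: g(f(x))


def identity(x):
    """Identity morphism: returns its argument unchanged."""
    return x


# Hypothetical stand-ins for morphisms, chosen only for illustration.
def word_lengths(text):
    return [len(word) for word in text.split()]


def double(numbers):
    return [2 * n for n in numbers]


def total(numbers):
    return sum(numbers)


text = "I did not inhale"

# Identity laws: composing with the identity changes nothing.
assert compose(word_lengths, identity)(text) == word_lengths(text)
assert compose(identity, word_lengths)(text) == word_lengths(text)

# Associativity: the bracketing of a chain of compositions does not matter.
lhs = compose(total, compose(double, word_lengths))
rhs = compose(compose(total, double), word_lengths)
assert lhs(text) == rhs(text)
```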
My questions now are:
- Are the assumptions on categories right?
- Is this really the right intuition for modelling a "Motifs Category"?
- If both of the above hold: what would be the right way to write it down?
Here's what I've got so far:
$T :$ texts
$L :$ l-motifs
$F :$ f-motifs
$\phi_{word}^{L} : T \to L$
which "transforms" or "maps" (I'm not sure of the right term here) a text to an l-motif structure by word length
$\phi_{word}^{F} : T \to F$
same as above, but to an f-motif structure
$\phi_{word}^{F} \circ \phi_{word}^{L} = \phi_{word}^{FL}$
which is the l-motif structure of an f-motif structure of a text
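To illustrate what I mean by this composition, here is a minimal Python sketch under several assumptions of mine. The word frequencies come from a made-up toy corpus (`toy_corpus` is hypothetical, not real data), "ascending" is read as non-decreasing, and the second step, `l_of_motifs`, takes a motif structure rather than a text as its input. I am not sure that last point matches my notation above, where both maps start from $T$.

```python
from collections import Counter


def segment_motifs(values):
    """Split a sequence of numbers into maximal non-decreasing runs."""
    motifs = []
    for v in values:
        if motifs and v >= motifs[-1][-1]:
            motifs[-1].append(v)
        else:
            motifs.append([v])
    return [tuple(m) for m in motifs]


def phi_word_F(text, freq):
    """Text -> f-motif structure, using word frequencies from `freq`."""
    return segment_motifs([freq[word.lower()] for word in text.split()])


def l_of_motifs(motifs):
    """Motif structure -> l-motif structure built on the motif lengths."""
    return segment_motifs([len(m) for m in motifs])


# Hypothetical toy corpus; a Counter returns 0 for words it has not seen.
toy_corpus = "I did it and I did not do it again and again"
freq = Counter(toy_corpus.lower().split())

text = "I did not inhale and never tried it again"

f_motifs = phi_word_F(text, freq)
print(f_motifs)   # [(2, 2), (1,), (0, 2), (0, 0, 2, 2)]

# "l-motif structure of an f-motif structure of a text":
# first the f-motifs of the text, then the l-motifs of their lengths.
phi_FL = l_of_motifs(f_motifs)
print(phi_FL)     # [(2,), (1, 2, 4)]
```

In this sketch the two steps compose only because the output type of the first step (a motif structure) is the input type of the second.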