A common method for linking language with psychological variables involves counting words that belong to manually created categories. For each individual, one counts how often words from a given category are used and expresses this as the percentage of the participant's words that come from that category:

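$$p(\text{category} \mid \text{subject}) = \frac{\sum_{\text{word} \in \text{category}} \operatorname{freq}(\text{word}, \text{subject})}{\sum_{\text{word} \in \operatorname{vocab}(\text{subject})} \operatorname{freq}(\text{word}, \text{subject})}$$

where $\operatorname{freq}(\text{word}, \text{subject})$ is the number of times the participant mentions the word and $\operatorname{vocab}(\text{subject})$ is the set of all words mentioned by the subject.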
I currently have 5 categories, each containing some words, and 100 texts (900 words). I am trying to find out how many words from each category were used in the 100 texts using the above equation.
The above equation gives the probability that a particular subject's text belongs to a particular category.
It is computed by dividing the summed frequency, within the subject's text, of every word in the category by the summed frequency of every word in the subject's text (that is, the total number of words in the text).
For example, suppose the category "Sports" contains the words cricket, volleyball, and football,
and the text is "Cricket football game football".
Then, counting every frequency within the text and ignoring case, the probability that the category of the text is "Sports" is:
p(Sports | text) = (freq(cricket) + freq(volleyball) + freq(football)) / (freq(cricket) + freq(football) + freq(game))
= (1 + 0 + 2) / (1 + 2 + 1)
= 3/4
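The worked example can be checked with a few lines of Python. This is only a minimal sketch: the function name `category_proportion` is made up for illustration, and it assumes that lowercasing and splitting on whitespace is an acceptable way to tokenize, which real text with punctuation will probably not satisfy.

```python
from collections import Counter


def category_proportion(text, category_words):
    """Return the share of tokens in `text` that belong to `category_words`."""
    # Assumption: lowercase + whitespace split is a good enough tokenizer here.
    tokens = text.lower().split()
    if not tokens:
        return 0.0  # avoid dividing by zero for an empty text
    counts = Counter(tokens)
    # Lowercase the category words too, so "Cricket" matches "cricket".
    category = {word.lower() for word in category_words}
    # Numerator: summed frequency of every category word in the text.
    hits = sum(counts[word] for word in category)
    # Denominator: summed frequency of all words, i.e. the token count.
    return hits / len(tokens)


sports = ["cricket", "volleyball", "football"]
print(category_proportion("Cricket football game football", sports))  # 0.75
```

The denominator is simply the number of tokens, because summing the frequency of every distinct word in the text gives the text length.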
Calculate this probability for every category and every subject. The category with the highest probability for a subject is the category assigned to that subject.
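Scoring every category for every text and keeping the best one could look roughly like the sketch below. The names `score_categories` and `best_category` are hypothetical, the "Food" category and the second text are invented sample data, and the tokenizer is again a plain lowercase whitespace split. Returning `None` when no category word occurs is a design choice of this sketch, not something the equation prescribes.

```python
from collections import Counter


def score_categories(text, categories):
    """Map each category name to the share of tokens in `text` it covers."""
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    scores = {}
    for name, words in categories.items():
        vocab = {word.lower() for word in words}  # de-duplicate, ignore case
        hits = sum(counts[word] for word in vocab)
        scores[name] = hits / total if total else 0.0
    return scores


def best_category(text, categories):
    """Return the highest-scoring category, or None if nothing matched."""
    scores = score_categories(text, categories)
    if not scores:
        return None
    best = max(scores, key=scores.get)  # ties go to the first category listed
    return best if scores[best] > 0 else None


# Hypothetical sample data: only "Sports" comes from the example above.
categories = {
    "Sports": ["cricket", "volleyball", "football"],
    "Food": ["pizza", "bread", "rice"],
}
texts = ["Cricket football game football", "I ate pizza and rice"]

for text in texts:
    print(text, "->", score_categories(text, categories), best_category(text, categories))
```

With 5 categories and 100 texts, the same loop would produce one score per category per text; the raw word counts per category are the numerators (`hits`) if those are needed instead of proportions.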