Hello, this question is about Exercise 1.3.2 on page 15 of the book Mining of Massive Datasets (http://infolab.stanford.edu/~ullman/mmds/ch1.pdf).
My solution is as follows: there are $10$ million documents and the word occurs in $320$ of them, so Inverse Document Frequency $= \log(10*10^{6}/320)$.
Now, as per the question:
case a) if the word appears once, then $TF=1/15$ (it is given that $15$ is the maximum number of occurrences of any word in the document)
case b) $TF = 5/15$, as the word appears $5$ times (with the same maximum of $15$ occurrences)
so for case a) $TF.IDF$ score $= \log(10^{7}/320)*(1/15)$
and for case b) $TF.IDF$ score $= \log(10^{7}/320)*(5/15)$
Is this solution correct? I just want to understand if I have understood the concept correctly or not.
You're on the right path. According to the book's definition of $IDF$, $IDF_i=\log_2 (N/n_i)$, so the logarithm is base 2. Here $N/n_i = 10^{7}/320 = 31250$ and $\log_2 31250 \approx 14.93$, so your answers should be
Case A: $TF.IDF \text { score} = \log_2 (10^{7}/320) * (1/15) \approx 14.93/15 \approx 0.995$
Case B: $TF.IDF \text { score} = \log_2 (10^{7}/320) * (5/15) \approx 14.93/3 \approx 4.98$
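
If you want to check the arithmetic numerically, here is a minimal Python sketch. It assumes the base-2 definition $IDF_i=\log_2 (N/n_i)$ and the term frequency normalized by the maximum count in the document; the function name `tf_idf` and its parameter names are made up for illustration and are not from the book.

```python
import math

def tf_idf(count, max_count, n_docs, docs_with_term):
    """Hypothetical helper: TF.IDF with max-normalized TF and base-2 IDF."""
    tf = count / max_count                     # term frequency, e.g. 1/15
    idf = math.log2(n_docs / docs_with_term)   # log2(10^7 / 320) = log2(31250)
    return tf * idf

# Numbers from the exercise: 10 million documents, word in 320 of them,
# and 15 as the maximum number of occurrences of any word in the document.
score_a = tf_idf(1, 15, 10_000_000, 320)
score_b = tf_idf(5, 15, 10_000_000, 320)

print(round(score_a, 3), round(score_b, 3))  # prints: 0.995 4.977
```

Case B comes out exactly five times Case A, as expected, since only the $TF$ factor changes between the two cases.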