quangngoc When TF (term frequency) is used, each word is weighted by how often it appears, so the highest weights go to stop words such as "the", "a", "and", etc., which occur everywhere. As a result, most of the output is skewed towards the stop words. When TF-IDF is used, each term frequency is multiplied by an inverse document frequency factor, so the output is much less affected by the most common words in the corpus. Words that appear in many documents have their weights scaled down, and words that appear in only a few documents have their weights scaled up.
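To make the difference concrete, here is a minimal sketch in plain Python. The three-sentence corpus and the helper names (`tf`, `idf`, `tf_idf`) are made up for illustration, and it uses the textbook form idf = log(N / df); real libraries often use smoothed variants, so their exact numbers will differ.

```python
import math
from collections import Counter

# Toy corpus (an assumption for illustration only).
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang a song",
]
tokenized = [d.split() for d in docs]

def tf(tokens):
    """Term frequency: count of each word divided by document length."""
    counts = Counter(tokens)
    total = len(tokens)
    return {w: c / total for w, c in counts.items()}

def idf(corpus):
    """Inverse document frequency: log(N / number of docs containing the word)."""
    n = len(corpus)
    df = Counter(w for tokens in corpus for w in set(tokens))
    return {w: math.log(n / d) for w, d in df.items()}

def tf_idf(tokens, idf_weights):
    """TF-IDF: term frequency scaled by the word's IDF weight."""
    return {w: v * idf_weights[w] for w, v in tf(tokens).items()}

idf_weights = idf(tokenized)
tf_doc0 = tf(tokenized[0])
tfidf_doc0 = tf_idf(tokenized[0], idf_weights)

# Highest-weighted word in the first document under each scheme.
top_tf = max(tf_doc0, key=tf_doc0.get)
top_tfidf = max(tfidf_doc0, key=tfidf_doc0.get)

print("top word by TF:    ", top_tf)
print("top word by TF-IDF:", top_tfidf)
print("TF-IDF weight of 'the':", tfidf_doc0["the"])
```

With this corpus, "the" has the highest TF in the first document, but because it appears in all three documents its IDF is log(3/3) = 0, so its TF-IDF weight drops to zero and a word unique to that document ranks first instead.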