This filter counts the number of words based on spaces, so it’s not perfect (to be more accurate I should use a wikify widget as well, but I don’t want to crash my browser…), and it only counts words in the text field.
I wasn’t able to calculate the number of characters: too many arguments! You can try it yourself if you want (filter taken from PhantomYdn)
Whoah, that is great! Very interesting. It does not matter if it is not 100% correct. It is indicative in exactly the right way. 163K words is a lot already! I wonder if @jeremyruston himself grasps how much he has written? 163K is a lot for a minimalist!
Thank you for having the interest to do that! Fun & indirectly informative.
As a point of comparison, I copied all the text from https://tiddlywiki.com/alltiddlers.html and pasted it into a blank word processing document, which reported a total of 85,148 words.
Ha, very interesting! It is a kinda obscure thing but I think it is still intellectually interesting. TBH, I am actually quite taken by your writing style which I think is unusually compact.
It would be great if in TW we could natively, occasionally, get such figures?
@jeremyruston your point of reference might contain many double counts in form of transcluded content. The OT asks for written (as opposed to displayed) words and characters. How much difference might there be?
I suspect that the estimate from the word processor would be fairly conservative in terms of not counting punctuation as words, and that that would be significant given the large amount of non-prose content. Perhaps also interesting to experiment with the concatenated source of all tiddlers too.
Personally I think most code tiddlers would not really be counting true words. Only textual tiddlers are relevant in many ways. There is a way to use splitregexp with a \w pattern to retrieve words that is more effective because it recognises new lines, punctuation and spaces etc…
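To make the difference concrete, here is a rough Python sketch (not TiddlyWiki filter syntax) comparing the two counting approaches mentioned in this thread. The function names and the sample string are made up for illustration, and the regex version only approximates what a splitregexp filter with \w would do.

```python
import re

def count_by_spaces(text: str) -> int:
    """Count chunks separated by spaces only; empty chunks are dropped."""
    return len([chunk for chunk in text.split(" ") if chunk])

def count_by_word_regex(text: str) -> int:
    """Count runs of word characters, so newlines and punctuation act as separators."""
    return len(re.findall(r"\w+", text))

# Hypothetical sample: three words on separate lines, a stray dash, one more word.
sample = "alpha\nbeta\ngamma - delta"

print(count_by_spaces(sample))      # newline-joined words collapse into one chunk, the dash counts
print(count_by_word_regex(sample))  # each word counted once, the dash ignored
```

Space splitting both undercounts (words joined by newlines) and overcounts (lone punctuation), which may explain part of the gap between the two totals reported above.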
I think the error comes from all the punctuation symbols incorrectly counted + tiddlers with code I shouldn’t count. Wikify should help with that but it’s too much text, maybe it would be possible with several “loops”…?
@TW_Tones \w does not account for accented words but in this specific case since the words are in English this shouldn’t be an issue I guess…
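A small Python sketch of the accented-word issue, for anyone curious. It assumes the filter's regex behaves like a JavaScript regex, where \w only matches A-Z, a-z, 0-9 and underscore; Python's \w is Unicode-aware by default, so the re.ASCII flag is used here to imitate that behaviour. The sample text is made up.

```python
import re

# "café déjà vu", written with escapes so the accented letters are precomposed.
sample = "caf\u00e9 d\u00e9j\u00e0 vu"

# ASCII-only \w (imitating JavaScript): accented letters act as separators.
ascii_tokens = re.findall(r"\w+", sample, flags=re.ASCII)

# Unicode-aware \w (Python's default for str patterns): accented letters stay in the word.
unicode_tokens = re.findall(r"\w+", sample)

print(ascii_tokens)    # the accented words are broken into fragments
print(unicode_tokens)  # three whole words
```

So with an ASCII-only \w, accented text would be overcounted rather than missed, which fits the point that it hardly matters for mostly English content.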