Performance And Sorting
Out of interest does anyone know much about the performance aspects of sorting ( tiddlers with respect to date modified or date created ? ) in the Tiddlywiki internals?
Since sorting is generally quite an overhead with long lists.
I have been looking at the code for my “All tiddlers” pill button that has code as follows…
<$macrocall $name="tag-pill" tag="All tiddlers" element-tag="$button" actions="""<$wikify name="filteredList" text="<$list filter='[!is[system]!tag[z hidden]!sort<stype>limit<limit>]'><$text text='[['/><$text text={{!!title}}/><$text text=']] '/></$list>"><$action-setfield $tiddler="$:/StoryList" list=<<filteredList>>/></$wikify>"""/>
So for the “All tiddlers” tag pill button I screen out system tiddlers and tiddlers with tag z hidden but after that I am forced to order the entire list ( which in my case is considerably over one thousand tiddlers ) but then once ordered I will display only a small number - say 30 tiddkers according to the value currently set by the slider which is used specifically to limit the number of tiddlers displayed to avoid the user being overwhelmed by sheer volume - so one of those unfortunate situations where your interest is only 30 items but you have to work hard on ALL items to decide which 30 items should appear.
So in typical use the above button broadly equates to “Show the the most recently modified 30 tiddlers” or “Show me the least recently modified 30 tiddlers” but to do that it is of course necessary to sort a very large number of tiddlers.
I find myself wondering if anyone had looked into performance here? If there are currently no special measures and if sorting performance was found to be significant I find myself wondering whether a background task might periodically order all non-system tiddlers and cache the result in some manner. The cache might serve as a quick look up?
When a tiddler was created, deleted or modified the cache would require notifying - in most cases the cache could be updated much more efficiently than it might be recalculated from scratch.
The cache might remember the previous sort call it was created to serve - for instance if the previous call was !sort[title] then the cache would remember title and the ! indicating a reverse search.
Since the sort call to create the cache would be exhaustive ( for all non-system tiddlers ) it would serve for any subsequent call that called sort with the same sort parameters – in this case sort by title - actually it would serve both reverse and non-reversed sorts so really it hinges only on the parameter ( title, modified, created etc ).
The benefit would only be for consecutive sorts that had the same argument - for instance consecutive sorts on ‘date modified’ or consecutive sorts on ‘title’ - the cache would need to be destroyed and recalculated if the next sort was of a different type.
I think this is why the tag pill “ALL” stands out as a performance bottle neck for @arunnbabu81 because similar tag pills like Random or Untagged have additional filters that will usually greatly reduce the length of the list that makes it through to the sort operator.
Anyway probably complete and utter rubbish blue sky thinking as far as application in Tiddlywiki goes I am just thinking speculatively - this kind of thinking harks back to my days writing software for solid modelling CADCAM systems where the overhead of complex geometric algorithms was considerable - we ended up with quite a lot of caching code to claw performance gains where we could.