How to sort search results by number of occurrences

Dear friends

For example, when I type ‘sincostan’ in the search box, I get exactly one match in normal mode, but I think the term appears four times in the text.

Any reply would be greatly appreciated

It looks like you’re using filter search (the fourth tab) rather than standard search (the first tab). Filter search expects filter syntax, so it’s treating “sincostan” as a tiddler title — the simplest kind of filter run. In this case, there is no tiddler with the title “sincostan”, so your single result is a link to that missing tiddler, as indicated by the italics. However, if you search for the same text in the standard search tab, you’ll get a link to the tiddler that contains this word in its text.

If this isn’t your desired behavior, could you say more about the sort of results you’re looking for? Do you want four identical links to the same tiddler…?

For example, ‘1234567890’ appears only in these two tiddlers, so standard search returns two results, but the string occurs 6 times in total across them. How can I get the search to report 6?

If you want a single number as your result, representing the total number of times that string appears in your tiddlers, you could try this filter in the Filter Search tab:

[!is[system]search:text[1234567890]] :map[get[text]split[1234567890]count[]subtract[1]] +[sum[]]

If you want the search results to display four links to “新条目” and two links to “新条目 1”, one for each time that string appears in their text fields… I’m not sure how to do this with a filter search, and I’m not sure why you’d want to. This isn’t the way search results typically function: if you google something, for instance, you’ll get one result per page that matches your search term, regardless of how many times that term appears on that page.

Thank you very much. I can use this metric to measure the level of nonsense across my tiddlers: “的” is roughly equivalent to the English “of”, and the less often it appears in the body of a tiddler, the better.

It occurred to me that you might like to sort your standard search results based on number of occurrences of the search term in the text field. Here’s a quick package you can import into your wiki to add another tab to your Standard search results: (UPDATED as of 2/3) counted search results v3.json (1.2 KB)

Demo:

This may make it easier to find the tiddlers that contain the most “nonsense”!

It seems this happens inside a code block: there, “hello” is not counted and the reported count is zero.

Oh, good catch. Here’s why this is happening:

To find the number of instances of the search term in the text, I split the text field at that term. For instance, if I split the following example with split[text]

sample text more text more text.

I’d get this…

sample
more
more
.

… for a total count of 4 (because we’re counting the pieces that came before and after “text”). This means we have to subtract 1 to get the number of times “text” appears.
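
The split-and-subtract idea can be tried outside TiddlyWiki. What follows is a minimal Python sketch, not TiddlyWiki code: `count_occurrences` is a hypothetical helper name, and Python's `str.split` stands in for the `split` operator on the assumption that both break a string at every literal, case-sensitive occurrence of the term.

```python
def count_occurrences(text: str, term: str) -> int:
    """Count non-overlapping occurrences of term: split, then subtract 1."""
    if not term:
        # Python refuses to split on an empty separator, so treat it as no match.
        return 0
    pieces = text.split(term)
    # N occurrences cut the text into N + 1 pieces.
    return len(pieces) - 1


sample = "sample text more text more text."
pieces = sample.split("text")      # ['sample ', ' more ', ' more ', '.']
count = count_occurrences(sample, "text")
print(len(pieces), count)          # prints "4 3"
```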

However, split[text] is looking for the literal, case-sensitive string “text”, which means it will ignore “Text” or “TEXT” or “tExT”.

  • In your example, we can see three instances of “Hello”, but you searched for “hello”, which does not appear in the “Theorem prover” tiddler. Since my filter can’t split this tiddler’s text at “hello”, it returns the text as a single piece, for a count of 1. [[1]subtract[1]] gives us 0, which is the count that you see in the search results.

So why does “Theorem prover” show up in the results at all? That’s because the search operator is not case-sensitive by default: it will treat “hello” and “Hello” as the same word.

Solutions:

  • If you don’t want to see results that only contain “Hello” when you search for “hello”, you can change the search filter used by this custom tab. In the tiddler $:/my/ui/CountedSearchResultList (part of the package I gave you), modify the second-search-filter field: change [!is[system]search:text<userInput>] to [!is[system]search:text:literal,casesensitive<userInput>].
  • Alternatively, if you want to treat “Hello” and “hello” as the same word, you can modify the filters in $:/my/ui/CountedSearchResultList and $:/my/ui/CountedListItemTemplate to use splitregexp:i<userInput> instead of split<userInput>. In my opinion, this is a better solution, so here’s an updated package that makes these changes: (UPDATED as of 2/3) counted search results v3.json (1.2 KB)
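
The difference between the two splitting strategies can be sketched in Python (an illustration only, not TiddlyWiki code). It assumes that Python's `str.split` behaves like a literal, case-sensitive `split`, and that `re.split` with `re.IGNORECASE` is a rough analogue of `splitregexp:i`; the sample text is made up for the example.

```python
import re

# Made-up sample with three capitalised instances, as in the screenshot case.
text = "Hello world. Hello again. Hello!"

# Literal, case-sensitive split: "hello" never appears, so the text
# stays in one piece and the count comes out as 1 - 1 = 0.
case_sensitive_count = len(text.split("hello")) - 1

# Case-insensitive regular-expression split: "Hello" now matches "hello".
case_insensitive_count = len(re.split("hello", text, flags=re.IGNORECASE)) - 1

print(case_sensitive_count, case_insensitive_count)  # prints "0 3"
```

Note that the search term is used directly as a pattern here, which is only safe because "hello" contains no characters with a special meaning in regular expressions.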

Thank you again. It worked perfectly, and the code behaves as expected.

Dear friend, I found an edge case: searching for “$” reports a count of zero for TiddlyWiki system tags, while searching for the rest of the tag does not have this problem.

Hello @etardiff,

What could be the issue in my wiki that makes the results sort by title instead of by hits?
(I used counted search results - caseinsensitive.json)

Thanks for any feedback.

Good catch, thank you! This is happening because $ has a special meaning in regex. Here’s an updated version that ought to be regex-safe: counted search results v3.json (1.2 KB)

As a general note, this updated code uses functions, which means it will not work in TW versions older than 5.3.0. Let me know if this is a problem for you.
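
Why escaping matters can be shown with a small Python sketch (again an illustration, not the package's code). `count_term` is a hypothetical helper, `re.escape` is assumed to play the role that `escaperegexp[]` plays in the filter, and Python's regex engine differs from JavaScript's in some details, so only the general principle carries over.

```python
import re


def count_term(text: str, term: str) -> int:
    """Case-insensitive count of a literal term, with regex metacharacters escaped."""
    if not term:
        return 0
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return len(pattern.split(text)) - 1


# Made-up tiddler text containing two system tags.
text = "Tags: $:/tags/Macro and $:/tags/Stylesheet"

# Unescaped, "$" is an end-of-string anchor: it matches an empty
# position at the end of the text, not the "$" character itself.
anchor_match = re.search("$", text)
print(repr(anchor_match.group()), anchor_match.start() == len(text))  # prints "'' True"

# Escaped, "$" is treated as an ordinary character.
literal_count = count_term(text, "$")
tag_count = count_term(text, "$:/TAGS")
print(literal_count, tag_count)  # prints "2 2"
```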

I confess I’m not sure what’s happening there! I just tested (with the v3 code shared above) and I can’t reproduce your results…

With v3 I see the same issue: it is still not sorting in my wiki
(running v5.3.6)

I’m sorry, I don’t know how to troubleshoot if I can’t reproduce the behavior.

I see.

Does anybody know whether the result list (always sorted by title) is based on another core tiddler?

To make this alternate tab, $:/my/ui/CountedSearchResultList, I cloned the default search results ($:/core/ui/DefaultSearchResultList) and modified it:

\define searchResultList()
\function regexpInput() [<userInput>escaperegexp[]]
\function by.term.count() [{!!text}splitregexp:i<regexpInput>count[]subtract[1]]
\whitespace trim
//<small>Text matches</small>//

<$list filter="[<userInput>minlength[1]]" variable="ignore">
<$list filter={{{ [<configTiddler>get[second-search-filter]] }}} emptyMessage={{$:/language/Search/Matches/NoResult}}>
<span class={{{[<currentTiddler>addsuffix[-secondaryList]] -[<searchListState>get[text]] +[then[]else[tc-list-item-selected]] }}}>
<$transclude tiddler="$:/my/ui/CountedListItemTemplate"/>
</span>
</$list>
</$list>

\end
<<searchResultList>>
  • <<configTiddler>> = the current tiddler, $:/my/ui/CountedSearchResultList
  • As in the default search results, the $list filter is stored in the field second-search-filter.
    • In the default results, second-search-filter: [!is[system]search<userInput>sort[title]limit[250]]
    • In my custom results, second-search-filter: [!is[system]search:text:regexp<regexpInput>] :sort:number:reverse[by.term.count[]] +[limit[250]]
  • The filter run :sort:number:reverse[by.term.count[]] is responsible for the sorting (and itself relies on the function by.term.count, defined in this tiddler).
    • This function is also responsible for displaying the result counts in the list template $:/my/ui/CountedListItemTemplate.
    • Like the default results, this custom tab also uses the variable <<userInput>>, which contains the input string of the search box. Otherwise, the sorting does not depend on any external tiddlers.
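
The overall logic of the custom filter (match, count, sort by count in descending order, limit) can be approximated in Python. This is a sketch under assumptions, not a translation of the TiddlyWiki internals: `counted_results` and the `wiki` dictionary are hypothetical, and keeping only titles with a count above zero stands in for the search step.

```python
import re
from typing import Dict, List, Tuple


def counted_results(tiddlers: Dict[str, str], term: str, limit: int = 250) -> List[Tuple[str, int]]:
    """Return (title, count) pairs, highest count first, capped at limit."""
    if not term:
        return []
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    # Count per tiddler with the split-and-subtract approach.
    counts = [(title, len(pattern.split(text)) - 1) for title, text in tiddlers.items()]
    # Keep only tiddlers whose text contains the term at least once.
    matches = [(title, n) for title, n in counts if n > 0]
    # Numeric sort on the count, reversed so the largest comes first.
    matches.sort(key=lambda item: item[1], reverse=True)
    return matches[:limit]


# Hypothetical wiki: title -> text field.
wiki = {"Alpha": "of", "Beta": "of of Of", "Gamma": "nothing here", "Delta": "OF of"}
results = counted_results(wiki, "of")
print(results)  # prints "[('Beta', 3), ('Delta', 2), ('Alpha', 1)]"
```

If every count came out as 0 or identical, the sort would have nothing to distinguish the results by, and the list would stay in its incoming order.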

Given all that, I’m still not sure what could be going wrong for you, @StS. I hope this breakdown may help someone else diagnose the issue.


Last minute guess: Are you using lazy loading?

I don’t use it myself, so I’m not very familiar with potential pitfalls. But if the filter doesn’t have access to the search results’ text fields, it wouldn’t have any content to split/count, so my sorting wouldn’t produce any meaningful results. In this case, you’d see an unsorted list (i.e. presumably by title).

Thanks for your effort, @etardiff. I cloned it, but it is still not sorting:

PS: I’m not using lazy loading…

Great, thanks a lot! With this version I can search for “$:/tag” without the zero-count error, and I can see how many times specific system tags appear in each tiddler!