Most relevant results shown first in search results

I’m looking for ideas on how to solve this one. When I search for a phrase, I’d like to see the most relevant results shown first. To me, these are the results where my search text is in the title (or caption), with the top results being those entries where my search text is closest to the beginning of the title. The further into the title the search text appears, the further down the list the entry should appear. When I search for “Windows” and 176 results are returned, I currently have to scroll down through the list to find the results that begin with “Windows”.

Would this suggest a third group in the results? Could a new search operator, say “rsort”, do this?

Thoughts?

I think this is worthy of investigation. I suppose the problem is that it first searches for “windows” and then sorts the results alphabetically. It is useful to scan a list of results in alphabetical order.

Perhaps first list the items that have the search string as a prefix, if any, then the balance in alphabetical order.

Unless it is obvious, however, it may get confusing when what you search for is not at the beginning.

Interesting line of enquiry though.

I could imagine using a sortsub filter like:

\define number-of-chars-before-search-term() [split<search-term>first[]length[]]

<$list filter="[search:title<search-term>] +[sortsub:number<number-of-chars-before-search-term>]" />

and thus sort according to the number of characters that appear before the search text in the title, as shown. Here I assumed that the search term is defined in the search-term variable, but this could as well be a transclusion or something else – you have to change it in both places where it’s used, though.

Note that split is case sensitive, whereas your search might not be. You could transform the title and search-term to lowercase before doing the split with a bit more code if necessary (define additional variables around the $list using the lowercase filter operator).
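
For anyone who wants to see the ranking logic outside of filter syntax, here is a minimal JavaScript sketch of the same idea with the lowercase step applied. The function name sortByTermPosition and the sample titles are made up for illustration; this is not TiddlyWiki code, just the comparison that the sortsub approach is meant to perform.

```javascript
// Hypothetical helper, not part of TiddlyWiki: rank titles by how many
// characters precede the search term, ignoring case.
function sortByTermPosition(titles, searchTerm) {
  const needle = searchTerm.toLowerCase();
  return titles
    // Compute each position once, on a lowercased copy of the title.
    .map((title) => ({ title, pos: title.toLowerCase().indexOf(needle) }))
    // Keep only titles that actually contain the term.
    .filter((entry) => entry.pos !== -1)
    // Fewer characters before the term means a higher rank.
    .sort((a, b) => a.pos - b.pos)
    .map((entry) => entry.title);
}

const sample = ["Installing Windows", "Windows Services", "My windows notes"];
console.log(sortByTermPosition(sample, "windows"));
```

Titles with the same position keep their incoming order, so an alphabetical sort applied beforehand would survive as the tie-breaker.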

I’m pretty sure that @EricShulman could come up with a more elegant solution, but it’s a start. :wink:

Have a nice day
Yaisog

PS: I use a sortsub that sorts by the length of the title, using a very similar approach. To me, the more relevant results are often those where the search term occupies a larger fraction of the title, hence those with the shortest titles for a given search.

In my notes setup results are ordered as follows:

  • All title prefix matches (ignoring stop words and shortest first)
  • All other title matches
  • Matches in other fields
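
The three tiers above could be sketched roughly like this in JavaScript. This is only an illustration under my own assumptions: the stop-word list, the { title, text } record shape, and the function names are hypothetical and are not the filters from the notes setup described above.

```javascript
// Hypothetical stop-word list; a real setup would likely use a longer one.
const STOP_WORDS = new Set(["a", "an", "the"]);

// Lowercase the title and drop any leading stop words.
function stripLeadingStopWords(title) {
  const words = title.toLowerCase().split(/\s+/).filter(Boolean);
  while (words.length > 1 && STOP_WORDS.has(words[0])) {
    words.shift();
  }
  return words.join(" ");
}

// Assumes each tiddler is a plain object with a title and an optional text field.
function tieredResults(tiddlers, searchTerm) {
  const needle = searchTerm.toLowerCase();
  const prefix = [];
  const otherTitle = [];
  const otherFields = [];
  for (const tiddler of tiddlers) {
    if (stripLeadingStopWords(tiddler.title).startsWith(needle)) {
      prefix.push(tiddler.title);       // tier 1: title prefix match
    } else if (tiddler.title.toLowerCase().includes(needle)) {
      otherTitle.push(tiddler.title);   // tier 2: match elsewhere in the title
    } else if ((tiddler.text || "").toLowerCase().includes(needle)) {
      otherFields.push(tiddler.title);  // tier 3: match in another field
    }
  }
  prefix.sort((a, b) => a.length - b.length); // shortest titles first
  return [...prefix, ...otherTitle, ...otherFields];
}
```

Only the text field stands in for "other fields" here; extending the third tier to more fields would mean checking each of them in the last branch.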

I will have a look later to see if any of the filters used are easily transferrable for more general usage.

Without any experience in implementing custom filter operators, I took a stab at this approach. I implemented two tiddlers for testing:

  • implemented a JavaScript custom filter operator tiddler
    – operator is rsort
    – operand is a search string
    – e.g. rsort[Windows Services]
    – multiple operand strings/words (as in the example above) are processed separately.
    – the operand is case insensitive
    – sort order is based on the character position of each operand word in the title field. The closer each operand word is to the beginning of the title, the higher that tiddler ranks.
  • implemented a dedicated DefaultSearchResultList tiddler tagged: $:/tags/SearchResults for Relevant search results
    – this tiddler adds a tab control labeled “Relevant” to the Standard tab of the $:/AdvancedSearch tiddler that makes use of the rsort operator
    – in this way, we can compare the results between the core’s existing “List” search results and the “Relevant” results
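
To make the ranking rule concrete, here is a minimal standalone sketch of how such a sort might work. It is not the code from the linked tiddler: the function name rsortTitles is hypothetical, and summing the per-word positions and dropping titles that lack a word are my own assumptions about one reasonable way to combine multiple operand words.

```javascript
// Hypothetical stand-in for the rsort logic described above; the operator in
// the linked tiddler may score titles differently.
function rsortTitles(titles, operand) {
  // Split the operand into words and compare case-insensitively.
  const words = operand.toLowerCase().split(/\s+/).filter(Boolean);
  const scored = [];
  for (const title of titles) {
    const lower = title.toLowerCase();
    const positions = words.map((word) => lower.indexOf(word));
    // Assumption: a title missing any operand word is left out.
    if (positions.some((pos) => pos === -1)) {
      continue;
    }
    // Assumption: the score is the sum of the positions of all words.
    const score = positions.reduce((sum, pos) => sum + pos, 0);
    scored.push({ title, score });
  }
  // Lower score means the words sit closer to the start of the title.
  scored.sort((a, b) => a.score - b.score);
  return scored.map((entry) => entry.title);
}

console.log(rsortTitles(
  ["Managing Windows Services", "Windows Services", "Services on Windows"],
  "Windows Services"
));
```

Inside TiddlyWiki this logic would live in a filter operator module rather than a plain function, but the scoring step would be the same.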

Here are links to the 2 tiddlers identified above (.json files):

http://craigsturgeon.com/mk/example/rsort/$__plugins_cls_mk_modules_filters_rsort.js.json

http://craigsturgeon.com/mk/example/rsort/$__plugins_cls_mk_ui_DefaultSearchResultList.json

This can also be tested using my test “Memory Keeper” system of the “Churchill Family History” at:

http://craigsturgeon.com/mk/example/Churchill-Example.html

My own response/comments to what I have so far:

  • Relevant mechanism needs to optionally support the caption field.
  • Any benefit gained from rsort seems to depend on the tiddler titles in the system.
  • I suspect the JavaScript code I implemented to perform the rsort is not the most efficient. Feel free to comment, change, or just put me down for any poor logic :slight_smile:
  • I’d like to see this optionally implemented in @telmiger’s simple-search plugin (I have not worked out how to do that yet!).
  • Regarding the point above (optionally implemented in simple-search), I would like to have a configurable mechanism, such as a reference to a tiddler that contains an “order by”. Such a tiddler might be empty when no further ordering is required, but populated with something like +[rsort<userInput>] when I want to use a custom “order by”.

Thoughts?

rsort doesn’t show results when the search matches a non-title field, e.g.

[!is[system]search:title,pinyin[lixi]sort[title]rsort[lix]limit[250]]

Here I simply fixed it by adding the missing tiddlers to the results.

$__plugins_cls_mk_modules_filters_rsort.js.json (1.9 KB)
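
The idea behind the fix can be sketched as follows. This is a simplified illustration with a hypothetical function name and a single-word operand, not the code in the attached file: titles that do not contain the operand are appended after the ranked ones instead of being dropped.

```javascript
// Hypothetical sketch: rank titles that contain the term, then append the
// rest (for example, tiddlers matched via another field) in their input order.
function rankKeepingUnmatched(titles, searchTerm) {
  const needle = searchTerm.toLowerCase();
  const matched = [];
  const unmatched = [];
  for (const title of titles) {
    const pos = title.toLowerCase().indexOf(needle);
    if (pos === -1) {
      unmatched.push(title);        // no match in the title: keep for later
    } else {
      matched.push({ title, pos }); // match in the title: rank by position
    }
  }
  matched.sort((a, b) => a.pos - b.pos);
  return [...matched.map((entry) => entry.title), ...unmatched];
}
```

With this shape, a filter that searches both title and pinyin fields would still show every tiddler the search step found, with the title matches ranked first.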