Sorry, it’s not a solution. Just some thoughts about the “rateable” structure that the content at tiddlywiki.com already has … It’s an “opinionated” view of that structure.
eg: User titles may have a completely different structure than the one we use atm at tw-com.
Using “search term count” in “tags” and the “body” text may be a bit more generic.
I also think it’s a bit of a chicken / egg problem here. User titles are basically “free form”.
If there were a “rating” algorithm that gives “better” results, then user-created titles may (will) change to fit the way the algorithm works. …
Since the algorithm is created by an “opinionated programmer”, there is a “flip side of the coin”. … Users will be “forced” to adopt this opinion, for good or bad.
I was thinking about something like this:
- rating point results add up (cumulative)
- “combined” words are split on “word boundary”
- eg: ActionSetFieldWidget … action set field widget
- eg: my/title/looks/like/this … my title looks like this (so the “boundary” may be configurable)
- searchTerm rating points = 100 / position-in-title
- so word #1 = 100 points, word #3 = 33 points … and so on
- “stop words” like: at, in, on, and, or … may be counted in phase 1, but removed in phase 2
- searchTerm in tags gets 50 points
- “full” and “partial” match may be considered in phase 2
- searchTerm in “defined field” gets 50 points
- This gives us the possibility to use fields like “keywords” instead of tags to improve rating
- searchTerm as “full word match” in text: rating points = number of occurrences * 10
- To make it faster, only the first X words (eg) may be used
- X may be 100 by default. If set to 0 … use all words. …
- search terms shorter than Y may be ignored
- Y = 5 as default
- This gives us the possibility that “full text search” kicks in later in the “process”
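
To make the title part of the list a bit more concrete, here is a minimal Python sketch of the word-boundary split and the “100 / position-in-title” rule. The names `split_title` and `title_points`, the default boundary pattern (whitespace and `/`) and the integer division are assumptions for illustration only, not existing TiddlyWiki code. Stop words and partial matches are left out.

```python
import re

# Assumed default "boundary": whitespace and "/" (meant to be configurable).
DEFAULT_BOUNDARY = r"[\s/]+"


def split_title(title, boundary=DEFAULT_BOUNDARY):
    """Split a title into lowercase words.

    CamelCase humps are separated first, then the result is split
    on the configurable boundary pattern.
    """
    # Insert a space at every lower/digit -> upper transition:
    # "ActionSetFieldWidget" becomes "Action Set Field Widget".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", title)
    return [word.lower() for word in re.split(boundary, spaced) if word]


def title_points(term, title, boundary=DEFAULT_BOUNDARY):
    """Return 100 / position for the first full-word match, else 0."""
    term = term.lower()
    for position, word in enumerate(split_title(title, boundary), start=1):
        if word == term:
            # Integer division, so word #3 gives 33 points.
            return 100 // position
    return 0


print(split_title("ActionSetFieldWidget"))
print(split_title("my/title/looks/like/this"))
print(title_points("field", "ActionSetFieldWidget"))
```

With these assumptions “action” in ActionSetFieldWidget rates 100 and “field” rates 33, which matches the numbers in the list.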
The numbers may be tweaked or may be configurable, to give the user more control.
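
A possible way to keep the numbers configurable is to collect them in one settings object. The sketch below covers only the tags, “defined field” and text parts of the rating. Everything in it is hypothetical: the `RatingConfig` and `rate_tiddler` names, the tiddler as a plain dict, and the reading that terms shorter than Y are skipped for the text search only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RatingConfig:
    tag_points: int = 50          # searchTerm found in tags
    field_points: int = 50        # searchTerm found in the defined field
    text_points: int = 10         # per full-word occurrence in the text
    max_text_words: int = 100     # X: 0 means "use all words"
    min_term_length: int = 5      # Y: shorter terms skip the text search
    rated_field: str = "keywords"


def rate_tiddler(term, tiddler, cfg=RatingConfig()):
    """Add up (cumulative) rating points for tags, field and text."""
    term = term.lower()
    points = 0

    # Full match against any tag.
    if term in (tag.lower() for tag in tiddler.get("tags", [])):
        points += cfg.tag_points

    # Full-word match inside the configured field, eg "keywords".
    if term in tiddler.get(cfg.rated_field, "").lower().split():
        points += cfg.field_points

    # Full-word matches in the text, limited to the first X words.
    if len(term) >= cfg.min_term_length:
        words = re.findall(r"\w+", tiddler.get("text", "").lower())
        if cfg.max_text_words:
            words = words[:cfg.max_text_words]
        points += words.count(term) * cfg.text_points

    return points


example = {
    "tags": ["Widgets"],
    "keywords": "action field",
    "text": "The widget sets a field. Another field here.",
}
print(rate_tiddler("field", example))
print(rate_tiddler("field", example, RatingConfig(max_text_words=5)))
```

A user could then pass a different `RatingConfig` to tweak any of the numbers without touching the rating code.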
I think this algorithm is already very complicated. … But since the existing search already does 2 full iterations to display title and full-text results, the performance should still be OK.
No promises, just some brainstorming
-mario