Fuzzy search / match implementation?

I’m curious whether anyone out there has implemented a fuzzy search for TiddlyWiki. By fuzzy I mean tolerating things like 1-character spelling mistakes. I didn’t see anything in the toolmap yet.

It would be handy for many things; for me specifically, in the link-fixer component of my bullets, for instance. I’ve played with this in other languages but not Javascript (which is beyond my ability at present), though there appear to be many libraries available (such as those listed here: regex - Javascript fuzzy search that makes sense - Stack Overflow)

From an implementation standpoint it would make sense for this to be another flag on the normal search operator, like regex etc., but I’m not sure if that’s hackable. I wouldn’t be surprised if this never became part of “core”, since the algorithm probably needs a sizable amount of code. Still, it would be cool to be able to combine the ‘fuzzy’ bit with other flags. As a side note, I’m looking forward to seeing the ANY or OR enhancement mentioned here by @pmario ( [tw5] appending to a ‘list variable’ through cycles - Google Group (Read Only) - Talk TW (tiddlywiki.org))


There is a demo with an old version of fuse.js TwFusejs Plugin — Fuse.js fuzzy searching for TiddlyWiki


That’s a different thing. It will be an advantage, if you want to search in tags or keyword-fields, when you don’t know the exact keyword to try. So you can try different ones and get a result if any or “some” words match

The flexsearch library seems to be interesting. GitHub - nextapps-de/flexsearch: Next-Generation full text search library for Browser and Node.js

But the powerful stuff for fuzzy-search libs is how good the stemmers are and the availability for different languages.


This is actually pretty nice. It’s very simple, but compact. I thought it worth mentioning that the filter operator that comes with the package happens to have a built-in capability for doing the “OR” vs. “AND” thing that was asked for in the other thread. Might be interesting for @CarloGgi (not sure if he’s a user here though)

After playing with the fuse plugin for a while, I unfortunately found that it didn’t work for some fairly simple things and I couldn’t debug it. It also searched the title, tags, and text fields, and there didn’t appear to be a way to change that. It is nice and light (55KB), but I remembered that I’d built something in VBA for Excel that was really unsophisticated but worked for me, so… I ported it to a JS filter operator and thought I’d share back.

I will again emphasize that it’s not a standard algorithm, but along the lines of my stupid-autocomplete (aka good enough autocomplete), I’ll consider this “good enough fuzzy search”. Note that this implementation is particularly geared towards things like names, so other purposes would benefit from tuning to those purposes (like how important the ordering of letters or words is for instance).

Here’s how it works in action (using the filter search in Advanced Search as the demo area). Note that the amount of tolerable “fuzziness” is hard-coded, so it can only be adjusted by editing the code. I don’t know how to make it adjustable from outside the macro, though in theory I could reverse-engineer another filter operator that takes more than one parameter to see how a value is passed. It’s currently set to an 80% match using my own homebrew algorithm.
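For readers who want to experiment with the idea of a percentage match threshold, here is a minimal sketch. It does not reproduce the homebrew algorithm from the attachment; it swaps in plain Levenshtein edit distance, and the function names (`levenshtein`, `similarity`, `fuzzyMatch`) are made up for illustration.

```javascript
// Classic Levenshtein edit distance, computed with a single rolling row.
// NOTE: this is a stand-in, not the homebrew scoring from the post.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Similarity in the range 0..1, ignoring case; 1 means identical.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  return 1 - levenshtein(a.toLowerCase(), b.toLowerCase()) / longest;
}

// Keep only the titles that reach the threshold (0.8 mirrors the 80% above).
function fuzzyMatch(titles, query, threshold = 0.8) {
  return titles.filter((title) => similarity(title, query) >= threshold);
}
```

With this stand-in, `fuzzyMatch(["Jonathan Smith", "Jane Doe"], "Jonathon Smith")` keeps only "Jonathan Smith", because a single wrong letter in a 14-character name still scores well above 80%. Short names are less forgiving: one wrong letter in a 4-letter name already drops the score to 75%.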

I start out showing that I populated some names and then show the standard search operator to compare to fuzzy
[Animated GIF: fuzzy]

$__stobot_filters_fuzzy.js.json (1.2 KB)

I like that it’s very small (1.23KB, which is about 50x smaller than the “small-ish” Fuse at 58.1KB) and seems to be relatively quick. Funnily enough, the demo .gif is 250KB (200x the size).

I hope someone else finds this useful. If anyone who actually understands Javascript finds an error or optimization in the code - let me know! Thanks.


Actually, if any developer can point me towards returning results in order from highest matching score to lowest, that’d be super-helpful! It’s currently implemented to generate a matching score and keep anything that meets a hard-coded 80% match rate, but the results are returned in no particular order. As I want to plug this into “auto-complete”, having the best matching value first is going to be pretty important.
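One possible way to get best-first ordering, sketched in plain Javascript: compute each score once, drop anything under the threshold, then sort descending. The names `rankByScore` and `positionalScore` are hypothetical, and `positionalScore` is only a toy scorer standing in for whatever scoring function is actually used.

```javascript
// Toy scorer (an assumption, not the scoring from the post): the share of
// positions where both strings have the same character, ignoring case.
function positionalScore(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  let same = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i].toLowerCase() === b[i].toLowerCase()) same++;
  }
  return same / longest;
}

// Score every title once, keep those at or above the threshold,
// and return the titles ordered from best match to worst.
function rankByScore(titles, query, scoreFn, threshold = 0.8) {
  return titles
    .map((title) => ({ title, score: scoreFn(title, query) }))
    .filter((entry) => entry.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.title);
}
```

Storing the score next to the title avoids recomputing it inside the sort comparator, which would otherwise run the scorer many times per title.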

I have such a feature in my modified command palette plugin, using fuse.js: tiddlywiki-plugins/pinyin-fuzzy-search.ts at 7094eda068547eec0f4e5c7cd7be31765a4d9476 · tiddly-gittly/tiddlywiki-plugins · GitHub

Actually, I just prepare the data and call:

const result = fuse.search(input);
return result.reverse();

And I use it like tiddlywiki-plugins/CommandPaletteWidget.ts at 7094eda068547eec0f4e5c7cd7be31765a4d9476 · tiddly-gittly/tiddlywiki-plugins · GitHub

Thanks for replying @linonetwo, but since I don’t really understand Javascript yet, the best I can do is find something 90% of the way there and change some words, so what you sent is unfortunately over my head at the moment. I did have to abandon fuse because of the results I was seeing, so I’m really just trying to give my macro above the additional ability to return results in order of match percentage.

I started trying to adapt .../sort.js but am now thinking .../sortsub.js is probably a better fit - I just need to swap in my scoring system above in place of the filter input that sortsub usually uses.

I’ll keep tinkering with it - if anyone thinks this is a dead end and I can’t get this done, please warn me 🙂

Actually I think I figured it out - adapted .../sortsub.js for the most part. Seems to be working at least though I should warn anyone else who tries it that I only partially understand how the sorting bit works. I’m very happy to have made it this far!

$__stobot_filters_fuzzy.js.json (1.8 KB)
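For anyone following along without opening the attachment, here is a rough sketch of the general shape such an operator could take. It is not the attached code. It assumes the usual TiddlyWiki filter operator signature `(source, operator, options)`, where `source` calls back with `(tiddler, title)` for each input title. The scorer is a toy stand-in, and reading the threshold from the operator suffix (as in `[fuzzy:90[...]]`) is a hypothetical convention for illustration only.

```javascript
// Toy stand-in scorer (an assumption): share of positions where both
// strings have the same character, ignoring case.
function score(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  let same = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i].toLowerCase() === b[i].toLowerCase()) same++;
  }
  return same / longest;
}

// Filter operator shape: receives the incoming titles via `source` and
// returns an array of titles, here ordered from best match to worst.
function fuzzyOperator(source, operator, options) {
  const query = operator.operand || "";
  // Hypothetical: a numeric suffix sets the threshold in percent; default 80.
  const percent = parseInt(operator.suffix, 10);
  const threshold = (Number.isNaN(percent) ? 80 : percent) / 100;
  const scored = [];
  source(function (tiddler, title) {
    const s = score(title, query);
    if (s >= threshold) scored.push({ title: title, score: s });
  });
  scored.sort((x, y) => y.score - x.score);
  return scored.map((entry) => entry.title);
}

// In a real module of type "filteroperator" this would be exported as:
// exports.fuzzy = fuzzyOperator;
```

Under these assumptions, a filter such as `[all[tiddlers]fuzzy:90[Stobot]]` would use a 90% threshold, while `[all[tiddlers]fuzzy[Stobot]]` would fall back to 80%.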