[tw5] Freelink and non-english tiddler title

Hi all,

Freelinks does not recognize tiddler titles that contain non-English (Unicode) characters.
I can confirm it works as described only for English titles.

TiddlyDesktop version 0.14
TiddlyWiki version 5.1.22

Does anyone have a workaround for this issue?

thanks

Hi keSh,

I had a look at the source code. There is a problem with the regexp that is used. We will need to create a JavaScript fix.
There is no workaround for this problem.

I did test “asdf sdfö” where “asdf” is recognized, but “sdfö” is not.

The plugin uses \b to detect “word boundaries”, but in JS \b treats only [a-zA-Z0-9_] as word characters, so a letter like “ö” does not count as part of a word. … That’s the problem.
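The behaviour can be reproduced outside TiddlyWiki. The snippet below is only a minimal sketch, not the plugin's code: `matchesWithWordBoundary` is a hypothetical helper that wraps a title in \b the way the description above suggests.

```javascript
// In JavaScript, \b treats only [A-Za-z0-9_] as word characters,
// so "ö" is seen as a non-word character.
const text = "asdf sdfö";

// Hypothetical helper (not from the plugin): does `title` match as a
// whole word in `input` when \b is used for the boundaries?
function matchesWithWordBoundary(title, input) {
  // Escape regexp metacharacters in the title.
  const escaped = title.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("\\b" + escaped + "\\b").test(input);
}

console.log(matchesWithWordBoundary("asdf", text)); // true
console.log(matchesWithWordBoundary("sdfö", text)); // false
```

The second call fails because “ö” and the end of the string are both non-word positions for \b, so no boundary is found after “ö”.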

-mario


Hi Mario.

You may have already seen this discussion on Stack Overflow: “Javascript RegExp + Word boundaries + unicode characters”.
I don’t know enough JavaScript to try it out.

thanks again.
-keSh

Look-behind is not supported by Safari.

-m

I’m desperately trying to get this to work. Non-English tiddler titles are essential for my project.
I wonder if it will work with other browsers (Chrome, Edge, Firefox). I’m not restricted to Safari or TiddlyDesktop.

-keSh

I have no idea if this would technically work.

Even if it technically worked, I have no idea whether or not it would be crap (pardon my language!)

What if you kept tiddler titles in English, but then had non-English titles with Unicode characters as aliases?

That aside, would the pending TiddlyWiki version 5.2.0 fix things?


Hi,

I created an issue on GitHub: https://github.com/Jermolene/TiddlyWiki5/issues/6029 …

There is no workaround, and using another browser won’t help either. …

Even if we find a regexp pattern that works, there is some risk that it will be way too slow with a larger number of tiddlers.

-mario

I’m following the issue at GitHub.

Any updates?

-kesh

Hi,
Jeremy has changed the core regexp construction a little bit, but I didn’t find a performant way to use a Unicode regexp that JavaScript understands. … Maybe I missed something. …
So no improvements atm.
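For reference, one Unicode-aware pattern that current JavaScript engines do accept uses property escapes (\p{…}) together with the "u" flag. The sketch below is an assumption-laden illustration, not the core's regexp construction: `unicodeTitleRegExp` and `WORD` are hypothetical names, it needs look-behind support, and it says nothing about whether this would be fast enough with many tiddlers.

```javascript
// Characters treated as part of a word here: letters, combining marks,
// digits and underscore. Marks are included because scripts such as
// Kannada use them inside words.
const WORD = "[\\p{L}\\p{M}\\p{N}_]";

// Hypothetical helper, not the TiddlyWiki core implementation.
function unicodeTitleRegExp(titles) {
  // Escape regexp metacharacters in each title.
  const escaped = titles.map((t) => t.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  // The "u" flag is required for \p{...} to be understood.
  return new RegExp(
    "(?<!" + WORD + ")(?:" + escaped.join("|") + ")(?!" + WORD + ")",
    "gu"
  );
}

const unicodeRe = unicodeTitleRegExp(["asdf", "sdfö"]);
console.log("asdf sdfö".match(unicodeRe)); // ["asdf", "sdfö"]
console.log("xsdföx".match(unicodeRe));    // null
```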

-mario

With the link edit tool, it’s pretty easy to insert the actual link to a tiddler. Freelinks depends on you matching an exact title, which can be difficult even with English titles.

@Mark S.

The advantage of freelinks shows in a context where you can’t modify the content but still want to have links, e.g. homework tiddlers submitted by students. Adding links to those tiddlers would mean modifying the original work in an unreasonable way. It wouldn’t be the student’s work anymore.

-m

That is exactly my use case. Only difference is that tiddler content and titles are not in English.
Thanks for looking into the issue.

-kesh

What language or character set are you targeting?

It seems like a change to one line of code in the plugin can accommodate PMario’s example. But not sure whether that would work for other languages.

var regexpStr = "(?<=^|[^a-zA-Zö])(?:" + reparts.join("|") + ")(?=$|[^a-zA-Zö])";

Yes, it depends on the browser supporting look-behind.
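The one-line change can be tried on its own before touching the plugin. This is only a sketch under assumptions: the `reparts` array below is a hypothetical stand-in for the plugin's list of escaped titles, `wordChars` is a name introduced here, and the engine must support look-behind.

```javascript
// Hypothetical stand-in for the plugin's list of escaped tiddler titles.
const reparts = ["asdf", "sdfö"];

// Characters treated as part of a word (Latin letters plus "ö",
// as in the line above). Digits and "_" are NOT included here.
const wordChars = "a-zA-Zö";

const regexpStr =
  "(?<=^|[^" + wordChars + "])(?:" + reparts.join("|") + ")(?=$|[^" + wordChars + "])";
const regexp = new RegExp(regexpStr, "g");

// The titles match when they stand alone, but not inside "asdfsdfö".
const found = "asdf sdfö, asdfsdfö".match(regexp);
console.log(found); // ["asdf", "sdfö"]
```

Each additional language needs its letters added to the character class, which is why this is not a general solution.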

@Mark S. Thanks,

I’m using the South Indian script Kannada: U+0C80–U+0CFF.
How do I include a Unicode range in regexpStr?

Browser is whatever engine comes with TiddlyDesktop on macOS.
I can change to a more suitable browser if needed to support this feature.

Absolutely make a backup before unzipping, importing, saving, and reloading with the attached file. I am definitely out of my depth! You will probably need a modern (e.g. Chrome, Firefox) browser. This is not a universal solution (only adds Kannada), and there may be tweaks that need to be done (or actually undone) re case folding.

But it did appear to work – you can see the faint Kannada characters in this screen grab:

image.png

(Attachment $__core_modules_widgets_text.js.zip is missing)
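Since the attachment is not available here, the following is only a guess at the general idea, not the contents of the attached file: a Unicode block can be written inside a character class as a \uXXXX-\uXXXX range. `buildTitleRegExp` and `kannadaWordChars` are hypothetical names, and look-behind support is assumed.

```javascript
// Hypothetical helper: build a whole-word matcher for the given titles.
// `wordChars` is the body of a character class listing what counts as
// part of a word.
function buildTitleRegExp(titles, wordChars) {
  // Escape regexp metacharacters in each title.
  const escaped = titles.map((t) => t.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  return new RegExp(
    "(?<=^|[^" + wordChars + "])(?:" + escaped.join("|") + ")(?=$|[^" + wordChars + "])",
    "g"
  );
}

// Latin letters plus the Kannada block, U+0C80 to U+0CFF.
const kannadaWordChars = "a-zA-Z\\u0C80-\\u0CFF";

// The word "Kannada" written in Kannada script, spelled with escapes.
const title = "\u0C95\u0CA8\u0CCD\u0CA8\u0CA1";
const kannadaRe = buildTitleRegExp([title], kannadaWordChars);

// Title separated by spaces from other Kannada letters: one match.
console.log(("\u0C87 " + title + " \u0C87").match(kannadaRe));
// Title embedded in a longer run of Kannada letters: null.
console.log(("\u0C87" + title + "\u0C87").match(kannadaRe));
```

Case folding is left out of this sketch entirely.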

@Mark S.

Wanted to report back that it works for my purposes.
I can now continue with my project which has been on hold for over 2 years.

You are a godsend. Your efforts are much appreciated.