How to Solve Regex Issues

@CodaCoder
Well, this is Nondestructive Search (NDS) :wink:
Nothing will be destroyed! But it takes time for the average user to learn how to use regex!

My warning is not about NDS. You can easily produce a search that takes a long time (a lifetime?) to complete, effectively crashing the wiki.

To help prevent that, when regex is selected, I disable live searching and add a Go! button.

@Mohammad I only now got a look at this and it’s very nice.

Regex is a big investment in time to learn, but it seems to me there are possibly no more than 50 basic regex patterns that would provide substantial value to TiddlyWiki users, with or without regex experience. So I wonder if, as a result of this plugin and others with regex options, we could build a library of regex patterns, like the filters we collect in Advanced Search > Filters > dropdown. Add a one-sentence explanation to each and users will be on the way to learning regex from a TiddlyWiki perspective, if not a global regex understanding.
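
To make the library idea concrete, here is a minimal sketch of what one entry might look like. Everything in it is an assumption for illustration: the `PatternEntry` shape, the `brokenEntries` helper and the three sample patterns are hypothetical and do not come from TiddlyWiki or the plugin.

```typescript
// Hypothetical shape for one library entry (not a TiddlyWiki structure).
interface PatternEntry {
  name: string;
  pattern: string; // regex source, as a user would type it
  description: string; // the one-sentence explanation
  example: string; // a title the pattern is expected to match
}

// Illustrative entries only.
const patternLibrary: PatternEntry[] = [
  { name: "starts-with-digit", pattern: "^\\d", description: "Titles that begin with a digit.", example: "2021 notes" },
  { name: "iso-date", pattern: "\\d{4}-\\d{2}-\\d{2}", description: "Text containing a date such as 2021-03-15.", example: "Journal 2021-03-15" },
  { name: "ends-with-question", pattern: "\\?$", description: "Titles that end with a question mark.", example: "What is regex?" },
];

// Returns the names of entries that fail to compile or fail to match their own example.
function brokenEntries(library: PatternEntry[]): string[] {
  return library
    .filter((entry) => {
      try {
        return !new RegExp(entry.pattern).test(entry.example);
      } catch {
        return true; // the pattern does not even compile
      }
    })
    .map((entry) => entry.name);
}
```

Keeping an example next to each pattern means the library can check itself, and the example doubles as documentation for someone who cannot read the regex yet.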

Yes, that’s a problem.
It would be great if the pattern could be validated before displaying results.
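
A syntax check before running the search could be sketched roughly like this. It relies only on the standard `RegExp` constructor throwing on an invalid pattern; the `validatePattern` name and the `ValidationResult` type are hypothetical, and this is not how the plugin (which is pure wikitext) does anything.

```typescript
// Hypothetical result type: either a compiled regex or an error message.
type ValidationResult =
  | { ok: true; regex: RegExp }
  | { ok: false; message: string };

// Try to compile the pattern; report the engine's error instead of throwing.
function validatePattern(source: string, flags: string = ""): ValidationResult {
  try {
    return { ok: true, regex: new RegExp(source, flags) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, message };
  }
}
```

Note that this only catches patterns that are syntactically broken, such as an unclosed group. A pattern that compiles fine can still be very slow, so this would not replace a guard against runaway searches.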

@TW_Tones
Thank you, having predefined patterns is absolutely a must.

To put it mildly.

Please, take it seriously and put a guard in place (see my previous) before your good intentions cause someone to lose work:

@jeremyruston - do you think using regex in advanced search can make TiddlyWiki crash?

@CodaCoder - I believe the above GitHub issue has been resolved, and it is not related to advanced search; it is related to the way TiddlyWiki uses regexp! Advanced search just produces titles, and they can be limited to 250 or fewer, so such searches are far from crashing TiddlyWiki!

Please note that Advanced Search in Fields is a plain wikitext plugin and does not add any JS or bring any new regex tools into TiddlyWiki!

Ciao Mohammad

I looked at it. You know I’m a great fan of regular expressions, and it is the one area where, unlike everything else, I actually have some competence. So it is the regex aspect that I’m most interested in. I think, overall, it is a great idea! But for a really useful, safe implementation, I have a couple of initial comments that I hope will be relevant.

  • Regarding saving the filters. The existing default system is really only useful if you have a limited number of saved filters. If you thought that, eventually, there might be, say, 50 regex-heavy filters then, to be usable, one would need to change the filter dropdown so it had a cascading menu system, i.e. with sub-sections for different groups of filters.

UPDATE: I need to rephrase this: the way the tool works at the moment, there is no dropdown. Later.

  • Regarding the issue of guarding against “catastrophic backtracking”, as well as half-completed regex, I very much agree with @CodaCoder. Meaning, for regex-heavy filters, it is best if there is a Go! button. The end-user would enter the full regex before applying it. The dynamic nature of TW, with an incomplete regex running, would potentially cause more issues than anything else. I do realise this makes the implementation much more complex, but I think that without it users, especially those not so familiar with regex, could get in a pickle! :smiley:

Right. But it is still possible to lock the wiki up with a malformed regex. I think the point is to guard against live application before the regex is fully complete, i.e. a Go! button. Though I’d have to test that to be sure.
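
For anyone who has not seen catastrophic backtracking in action, here is a small sketch that can be run outside TW. The pattern with nested quantifiers is a well-known textbook example, not something taken from the plugin, and `timeMatch` is a hypothetical helper. The input lengths are kept deliberately small so that nothing locks up.

```typescript
// Nested quantifiers: a classic pattern prone to catastrophic backtracking.
const risky = /^(a+)+$/;

// Hypothetical helper: run one match and report how long it took.
function timeMatch(regex: RegExp, input: string): { matched: boolean; ms: number } {
  const start = Date.now();
  const matched = regex.test(input);
  return { matched, ms: Date.now() - start };
}

// The trailing "!" makes the match fail, so a backtracking engine
// has to try every way of splitting the run of "a" characters.
for (const n of [10, 14, 18, 22]) {
  const input = "a".repeat(n) + "!";
  const { matched, ms } = timeMatch(risky, input);
  console.log(`n=${n} matched=${matched} took ${ms} ms`);
}
```

In a backtracking engine each extra `a` roughly doubles the work for this pattern, so an input that looks harmless at 20 characters can become effectively endless at 40. The timings will differ between browsers and machines, which is why none are quoted here.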

Just FYI, as far as I can see, TW is interacting with the JS regex engine and serious error protection is JS based. And I don’t know quite what browsers are doing on that (by the way, the JS regex engines do differ slightly between browsers). Personally, to test difficult regex, I use the commercial software RegexBuddy, which analyses one’s regex to isolate issues with scope, lookbehind and lookahead.

But, in practice, I think for most use cases in TW it isn’t likely to be much of an issue.

Ciao @Mohammad

This is just a side comment.

I asked an Italian programmer friend to look at TW and give me his first impression of it.

He commented: “It is a very sophisticated self-modifying regex machine with bells and whistles.”

It isn’t quite that, but I thought his comment had quite a lot of truth in it. :smiley:

10 posts were split to a new topic: Advanced Search in Fields v0.2.0

I take the blame for this - I’m somehow failing to get the point across.

Forget all your “static HTML” bits and pieces, they are not the issue. Or, at least, they are not the lion’s share of the problem. Browsers can spit out spans and divs and even tables without even breaking a sweat.

Not so events!

Attaching events to elements takes – relatively speaking – quite a long time. When you have many hundreds, or even thousands to wire up, then you will start to see performance issues.

Now… on to TiddlyWiki.

TW is a dynamic presentation environment. ANY change you make, practically any move you make, will cause the TW internals to reconsider and perhaps even do a rebuild of the DOM. Your job, as a seasoned TW dev, is to aid TW in that task.

If your code requires 10,000 (ten thousand) divs, pfft – easy.

If your code requires 10,000 divs containing event handlers, oh dear… hmmm… uh, perhaps I’ll go get a coffee…

:yawning_face:

I don’t understand what the resistance is, but for any seasoned TW dev, this should be your mantra going forward:

Never write another $list of links/buttons and not consider using $eventcatcher
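
The difference can be sketched without a browser. What follows is a toy model, not TW code and not the real DOM: `FakeNode` and `buildList` are hypothetical stand-ins that only imitate a click bubbling from a child up to its parent. In a real page the delegated version corresponds to a single `addEventListener` call on the parent that inspects `event.target`, which is the approach $eventcatcher takes.

```typescript
// Minimal stand-in for a DOM node; hypothetical, for illustration only.
class FakeNode {
  label: string;
  parent: FakeNode | null = null;
  handlers: Array<(target: FakeNode) => void> = [];

  constructor(label: string) {
    this.label = label;
  }

  append(child: FakeNode): FakeNode {
    child.parent = this;
    return child;
  }

  // Simulate a click that bubbles from this node up through its ancestors.
  click(): void {
    for (let node: FakeNode | null = this; node !== null; node = node.parent) {
      node.handlers.forEach((handler) => handler(this));
    }
  }
}

// Build a list of items, wiring clicks either per item or once on the parent.
function buildList(count: number, delegated: boolean, onClick: (label: string) => void) {
  const root = new FakeNode("root");
  const items: FakeNode[] = [];
  let attached = 0;
  for (let i = 0; i < count; i++) {
    const item = root.append(new FakeNode(`item-${i}`));
    if (!delegated) {
      item.handlers.push((target) => onClick(target.label)); // one handler per item
      attached++;
    }
    items.push(item);
  }
  if (delegated) {
    root.handlers.push((target) => onClick(target.label)); // one handler in total
    attached++;
  }
  return { items, attached };
}
```

With 10,000 items the first variant wires up 10,000 handlers and the second wires up one, yet a click on any item reaches `onClick` with the same label either way.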

Actually, I do understand the resistance -- you've invested in code that mostly works. Changing it is painful. Suck it up... "it's only code"

:wink:

It’s not clear what you’re referring to, but I hope you’re not comparing a list of event-handler laden links with a single event handler attached to a parent div ($eventcatcher). If so, your definition of “very fast” is very different to mine.

I’m going to stop now.

:sweat_smile:

There is no resistance at all! I really welcome any improvement! The only resistance I have shown so far is in going with pure wikitext instead of JavaScript! The reason is/was to let more people understand your code and develop/extend the TW ecosystem!

So please, when you read the Search plugin code, kindly let me know where we can improve the performance! Note that Shiraz is a separate plugin and it can be improved too!

I have developed simpler dynamic tables in other plugins, and I can consider using them instead of Shiraz if there is room for improvement!

In the dynamic table there are many buttons/links, and I have read the post by @saqimtiaz on improving the performance by using $eventcatcher!

Thank you for your time and comments on the Search plugin!

I’m sorry, but I really don’t have that kind of time right now.

I appreciate that and that’s an admirable stance. I wish I too wrote TW code with an eye on re-use. But that’s my failing. Sorry. The best I can do is “share with myself” through the bundler plugin. Again, I’m time limited so for now, at least, it has to be that way.

Did I read somewhere you have an engineering background? If so, take my input as sharing known facts with you, some of them measured (by others, not me). None of what I gave you above is opinion, just what is known.

Anyway, I MUST get back to “real” work :frowning:

Thank you anyway!

Good luck!

I leave this here and hope others interested in the search tools find useful points!

That, BTW, is pretty much a definition of the human brain. :brain: :wink:

To help prevent that, when regex is selected, I disable live searching and add a Go! button.

I do believe TW users should not be able to crash their wiki/browser when using regexp, I experienced it last week and nearly lost a few hours of work. But in my case, the Go button wouldn’t have helped, for I thought my regexp was legit, and I would have hit the button!

Maybe the solution should be coded in the core, with some kind of “watchdog” monitoring the filters interpreter and “killing” it when a filter processing takes too long? Just an idea, I don’t even know if this makes sense :sweat_smile:
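
One way to picture a watchdog is a time budget that is checked between tiddlers. This is only a sketch under assumptions: `filterWithBudget` and `BudgetedResult` are hypothetical names, nothing like this is claimed to exist in the TW core, and the clock is passed in as a parameter purely so the behaviour can be tested.

```typescript
interface BudgetedResult {
  matches: string[];
  timedOut: boolean;
}

// Cooperative watchdog: stop scanning once the time budget is used up.
// Pass a regex without the "g" flag, since test() on a global regex keeps state.
function filterWithBudget(
  titles: string[],
  regex: RegExp,
  budgetMs: number,
  now: () => number = Date.now
): BudgetedResult {
  const deadline = now() + budgetMs;
  const matches: string[] = [];
  for (const title of titles) {
    if (now() > deadline) {
      return { matches, timedOut: true }; // give up, keep what was found so far
    }
    if (regex.test(title)) matches.push(title);
  }
  return { matches, timedOut: false };
}
```

The big limitation is that the check only runs between titles. If a single `regex.test` call goes into catastrophic backtracking, nothing on the same thread can interrupt it, so a true “kill” would need the matching to run somewhere that can be terminated from outside, such as a Web Worker.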

Fred

That’s very true, of course. But you would get a chance to take one last look. And also, your regex wouldn’t be executed before it is finished being written, which was my overriding concern.

Rather than a “go” button, which may still not prevent problems, why not have a limit applied? Say 150 tiddlers. Then if a user is sure they want to see all the results, they can select “See all tiddlers – this may slow your machine down.” This way users can use the interface normally 90% of the time, be protected from mistakes, verify that a filter works, and still have complete usability when they need it.
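
A rough sketch of that idea, assuming a plain list of titles and a hypothetical `searchWithLimit` helper (not part of the plugin): stop collecting once the limit is reached and tell the caller that more results exist, so the interface can offer the “See all tiddlers” option.

```typescript
interface LimitedResult {
  shown: string[];
  truncated: boolean; // true when at least one more match was left out
}

// Collect matching titles up to a limit; the default of 150 follows the suggestion above.
function searchWithLimit(titles: string[], regex: RegExp, limit: number = 150): LimitedResult {
  const shown: string[] = [];
  for (const title of titles) {
    if (!regex.test(title)) continue;
    if (shown.length === limit) {
      return { shown, truncated: true };
    }
    shown.push(title);
  }
  return { shown, truncated: false };
}
```

Passing `Infinity` as the limit gives the “see all” behaviour. A cap like this bounds how much gets rendered, but it does not bound the time spent on a single slow match, so it would complement a Go! button rather than replace it.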