How to apply a regex only on the normal text in the tiddler body?

talha131 · October 18, 2022, 1:03pm

What am I trying to do?

I am writing a regex that can extract transcluded tiddlers and show them in the Shiraz Node-Explorer table. Say a tiddler has the following body:

Some random text

Here is a transcluded tiddler.

{{hello 0}}

Here is the same tiddler transcluded again. We do not want to show duplicate tiddlers. 

{{hello 0}}

Another example of inline transclusion {{hello 1}}.

Example of inline transclusion with a template {{hello 1||$:/talha131/Template/NavigateTranscludedTiddler}}.

Then the regex should extract these two tiddlers:

hello 0
hello 1

What have I done so far?

This is the filter that I came up with. It extracts the transcluded tiddlers from the body.

[all[current]get[text]splitregexp[\n]search::regexp[{{.+?}}]search-replace:g:regexp[.*{{(.+?)\|\|.+}}.*],[$1]search-replace:g:regexp[.*{{(.+?)}}.*],[$1]unique[]]

And it works fine.
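For readers who want to trace the filter step by step, here is a rough Python sketch of the same pipeline. It is only an illustration outside TiddlyWiki: the function name `transcluded_titles` and the sample text are made up for this example, and the regexes are the ones from the filter with the braces escaped.

```python
import re

def transcluded_titles(text):
    """Illustrative Python mirror of the filter steps (not TiddlyWiki code)."""
    titles = []
    for line in text.split("\n"):                 # splitregexp[\n]
        if not re.search(r"\{\{.+?\}\}", line):   # search::regexp[{{.+?}}]
            continue
        # Template form first: {{title||template}} becomes title
        line = re.sub(r".*\{\{(.+?)\|\|.+\}\}.*", r"\1", line)
        # Plain form: {{title}} becomes title
        line = re.sub(r".*\{\{(.+?)\}\}.*", r"\1", line)
        if line not in titles:                    # unique[]
            titles.append(line)
    return titles

sample = "\n".join([
    "Some random text",
    "{{hello 0}}",
    "{{hello 0}}",
    "Another example of inline transclusion {{hello 1}}.",
    "With a template {{hello 1||$:/talha131/Template/NavigateTranscludedTiddler}}.",
])
print(transcluded_titles(sample))  # ['hello 0', 'hello 1']
```

One thing the sketch makes visible: because the leading `.*` is greedy, a line that holds two different transclusions yields only the last one.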

What’s the issue?

I need a way to exclude transclusion syntax when it appears inside pre-formatted text or inside some widgets.

For example, it should ignore:

{{hello 1}}

because it is inside a code block.

It should also ignore {{ [<splitme>] }} in the following:

<$list filter="[[hello 1]regexp:text[{{(.+?)}}]]" variable="splitme">
<$text text={{{ [<splitme>] }}}/><br/>
</$list>

That is not a transclusion, although it matches the syntax.

How can I solve this issue? If there were a filter that returned only the normal text of the tiddler body, i.e. with code blocks, widgets, and macros removed, I could apply the regex to the leftover text.

Mark_S · October 18, 2022, 2:01pm

Check out the search-replace operator. You can replace the chunks you don’t want to process with blanks (or whatever).

jypre · October 18, 2022, 4:20pm

Your problem has me thinking it would be cool to have a flex- or bison-like tool in TiddlyWiki. It could also help TiddlyWiki itself with the way it handles its markup.

I know of Jison, which is a Bison for JavaScript, but it is apparently not used within TiddlyWiki.

talha131 · October 18, 2022, 5:37pm

I solved the issue with the following patterns.

\define widget-start-pattern() <\$.+>
\define widget-end-pattern() </\$.+>
\define code-pattern() `[\s\S]+`

And then the filter becomes:

[all[current]get[text]search-replace:g:regexp<widget-start-pattern>,[]search-replace:g:regexp<widget-end-pattern>,[]search-replace:g:regexp<code-pattern>,[]splitregexp[\n]search::regexp[{{.+?}}]search-replace:g:regexp[.*{{(.+?)\|\|.+}}.*],[$1]search-replace:g:regexp[.*{{(.+?)}}.*],[$1]unique[]]
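The effect of the three stripping steps can be checked outside TiddlyWiki with a small Python sketch. The three patterns are copied from the post; the function name `strip_non_prose`, the sample text, and the `NARROW_CODE` pattern are assumptions made up for this illustration and have not been tried in a TiddlyWiki filter.

```python
import re

# The three patterns from the post, unchanged.
WIDGET_START = r"<\$.+>"
WIDGET_END = r"</\$.+>"
CODE = r"`[\s\S]+`"

def strip_non_prose(text, code_pattern=CODE):
    """Blank out widget tags and backtick code, in the same order as the filter."""
    for pattern in (WIDGET_START, WIDGET_END, code_pattern):
        text = re.sub(pattern, "", text)
    return text

def find_titles(text):
    """Simplified stand-in for the extraction part of the filter."""
    return re.findall(r"\{\{(.+?)\}\}", text)

sample = "\n".join([
    "Intro {{hello 0}}",
    "```",
    "{{hello 1}}",
    "```",
    '<$list filter="[[hello 1]regexp:text[{{(.+?)}}]]" variable="splitme">',
    "<$text text={{{ [<splitme>] }}}/><br/>",
    "</$list>",
])
print(find_titles(strip_non_prose(sample)))  # ['hello 0']

# Caveat: the code pattern is greedy, so it spans from the first backtick
# to the last one and also drops normal text between two code spans.
inline = "Use `a` then {{hello 2}} then `b`."
print(find_titles(strip_non_prose(inline)))  # []

# Hypothetical narrower pattern (not from the post): fenced blocks first,
# then single-line inline code.
NARROW_CODE = r"```[\s\S]*?```|`[^`\n]+`"
print(find_titles(strip_non_prose(inline, NARROW_CODE)))  # ['hello 2']
```

Whether the greedy behaviour matters depends on the tiddlers involved. If a tiddler never has a transclusion sitting between two code spans, the original patterns are enough.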