A Simpler Solution for the Find Macro in Modern TiddlyWiki 5.3.5

The find macro is a powerful wikitext macro that finds a substring enclosed by two delimiters (begin: the delimiter marking the start of the snippet, and end: the delimiter marking the end of the snippet).
It extracts text snippets and can be used for many applications such as footnotes, abbreviation lists, references (bibliography), partial transclusion and much more.

demo: https://kookma.github.io/find-macro/
code: https://github.com/kookma/find-macro

See full description here: https://kookma.github.io/find-macro/#Home:Home%20Acknowledgement%20ChangeLog

With the many developments and improvements in TiddlyWiki since 2018 (when find was written), especially TiddlyWiki 5.3.0+, I am looking for a wikitext solution that is simpler to understand and easier to maintain:

  1. to find a substring of a given string enclosed by begin and end delimiters
  2. to return the result (so one can further manipulate it or display it)
  3. to be able to find the first occurrence or all occurrences.
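As a language-neutral reference for what the three requirements ask for, here is a minimal Python sketch. It is not the find macro's implementation; the function name `find_between` and the `first_only` parameter are hypothetical names chosen for illustration.

```python
import re

def find_between(text, begin, end, first_only=False):
    """Return the substrings of `text` enclosed by `begin` ... `end`.

    Hypothetical helper: both delimiters are treated as literal strings,
    and the match is lazy, so each snippet stops at the nearest `end`.
    """
    pattern = re.compile(
        re.escape(begin) + r"(.*?)" + re.escape(end),
        re.DOTALL,  # let the snippet span several lines
    )
    matches = pattern.findall(text)
    # Requirement 3: first occurrence only, or all occurrences.
    return matches[:1] if first_only else matches

# Requirement 2: the result is returned as a list, so the caller can
# display it or manipulate it further.
print(find_between("a [[x]] b [[y]]", "[[", "]]"))
print(find_between("a [[x]] b [[y]]", "[[", "]]", first_only=True))
```

Returning a list in both cases keeps the "first" and "all" modes interchangeable for the caller.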

This is a good idea @Mohammad. I was looking at the current macro and wonder whether you could first annotate it further or write it out as pseudocode?

This would allow others to more easily consider the design.

  • I may look at doing this for myself

I am keen to make use of this in a range of applications. That may be scope creep; however, your use of the macro name “find” suggests a general and comprehensive solution. It is also interesting to ask ourselves how find differs from search. The term “find section” or “find block” comes to mind.

Notes

  • A recent discussion about collapsible headings and your section editor come to mind.
  • I have strong skills with functions and volunteer to help
  • The output macro could be tested for its existence using getvariable or has title so either a macro or tiddler could be used for the output template.
  • I would like to be able to apply this to rendered HTML or plain text so that, for example, we may extract <p></p> and other tags
  • Does find handle nested pairs of begin/end?
  • Widgets usually result in output and functions in a string within a variable so I wonder if we could produce a widget and function version of find section.
    • The advantage of a widget is it accepts strings, variables and transclusions as the parameter values.
    • A widget could be designed to find within its body content.
  • The function version will need nested functions which @saqimtiaz had mastered in an example I can’t find.
  • Perhaps we could have an option to include or exclude the find delimiters, wikified or not-wikified.
  • A way to set the end delimiter to a regular expression such as \n or \n\n to extract sentences, e.g. \function fname() filter
    • What about a pragma on a single line or with an \end or an \end functionname
  • Using the parameters widget may help
  • The text could become input
  • The output could be a tiddler but would need a trigger
  • Output could be presented inside a button to allow click to save.
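On the question of nested begin/end pairs: whether the existing macro handles them is not stated here. A lazy regular expression stops at the first end delimiter, so nesting generally needs a depth counter. The sketch below is one way to do that in Python, purely as an illustration; `find_nested` is a hypothetical name, and it assumes the begin and end delimiters differ.

```python
def find_nested(text, begin, end):
    """Return the outermost begin ... end snippets, honouring nesting.

    Hypothetical helper. Assumes `begin` and `end` are different,
    non-empty strings; unmatched delimiters are ignored.
    """
    results = []
    depth = 0            # how many begin delimiters are currently open
    snippet_start = 0    # index just after the outermost begin
    i = 0
    while i < len(text):
        if text.startswith(begin, i):
            if depth == 0:
                snippet_start = i + len(begin)
            depth += 1
            i += len(begin)
        elif depth > 0 and text.startswith(end, i):
            depth -= 1
            if depth == 0:
                results.append(text[snippet_start:i])
            i += len(end)
        else:
            i += 1
    return results

print(find_nested("a <<b <<c>> d>> e <<f>>", "<<", ">>"))
```

The inner pair stays inside the outer snippet, so the same function could be applied again to the result to reach deeper levels.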

I gave it a try using regexp:

\procedure linefeed() \n
<$let t="""
This is a ##test1%% for a #search1%% and #replace1%% in ##text1.
This%% is a ##test2% for a #sea#rch2% and%% #replace2% in #text2%.
This is a ##test3% for a #sea##rch3%% and #replace3% in #text3%.
"""
         start="##"
         end="%%"
         st={{{[<start>] +[escaperegexp[]]}}}
         e={{{[<end>] +[escaperegexp[]]}}}
        searchString=`.*?$(st)$(.*?)$(e)$`
>

SearchString: <<searchString>> <br>
Text: <<t>> <br><br>
Output:<br>
<$list filter="[<t>search-replace:g:regexp<linefeed>,[]search-replace:g:regexp<searchString>,[$1┋]search[┋]split[┋]butlast[]]">
<<currentTiddler>><br>
</$list>
</$let>

----

SearchString: .*?##(.*?)%%
Text: This is a ##test1%% for a #search1%% and #replace1%% in ##text1. This%% is a ##test2% for a #sea#rch2% and%% #replace2% in #text2%. This is a ##test3% for a #sea##rch3%% and #replace3% in #text3%.

Output:
test1
text1.This
test2% for a #sea#rch2% and
test3% for a #sea##rch3
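To make the steps of the filter easier to follow, here is the same pipeline transcribed into Python as a rough sketch: remove linefeeds, replace every match with its captured group plus the ┋ marker, split on the marker, and drop the unmatched tail. The function name `find_all` is hypothetical, and Python's `re` module stands in for the JavaScript regular expressions TiddlyWiki uses.

```python
import re

SEPARATOR = "┋"  # rare character used as a temporary marker, as in the filter

def find_all(text, start, end):
    # Step 1: remove linefeeds so "." can run across former line breaks.
    flat = text.replace("\n", "")
    # Step 2: build ".*?<start>(.*?)<end>" with both delimiters escaped.
    pattern = ".*?" + re.escape(start) + "(.*?)" + re.escape(end)
    # Step 3: replace each match by the captured snippet plus the marker.
    marked = re.sub(pattern, lambda m: m.group(1) + SEPARATOR, flat)
    # Step 4: like search[┋], return nothing when no marker was produced.
    if SEPARATOR not in marked:
        return []
    # Step 5: split on the marker and drop the unmatched tail (butlast).
    return marked.split(SEPARATOR)[:-1]

sample = """
This is a ##test1%% for a #search1%% and #replace1%% in ##text1.
This%% is a ##test2% for a #sea#rch2% and%% #replace2% in #text2%.
This is a ##test3% for a #sea##rch3%% and #replace3% in #text3%.
"""

for snippet in find_all(sample, "##", "%%"):
    print(snippet)
```

The second snippet, "text1.This", shows the side effect of step 1: the line break between "text1." and "This" is gone from the result.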

Thank you @jacng. Your solution in the form of a procedure is given below.
To handle the special characters used in regular expressions, I changed the code a little; note the use of escaperegexp.

\procedure linefeed() \n
\procedure findp(text, start, end)
<$let searchString=`.*?${[<start>escaperegexp[]]}$(.*?)${[<end>escaperegexp[]]}$`  >

SearchString: <<searchString>> <br>

Text: <$text text=<<text>> /><br><br>

Output:<br>

<$list filter="[<text>search-replace:g:regexp<linefeed>,[]search-replace:g:regexp<searchString>,[$1┋]search[┋]split[┋]butlast[]]">
<<currentTiddler>><br>
</$list>
</$let>
\end

Why have the linefeeds been removed? Is it possible to keep them and use the multiline flag instead?

That was my first thought as well when I started working on this. The ‘m’ flag means multiline search, which is what’s needed here. I checked my wikitext code to see how I had used it previously, but my old code removed linefeeds to do a multiline search instead of using the ‘m’ flag?! I can’t recall exactly what my difficulty was then :sweat_smile:; perhaps you can get that to work.
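One possible explanation for that difficulty, offered as a hint rather than a diagnosis: in both JavaScript and Python regular expressions, the multiline flag only changes where `^` and `$` match. It does not let `.` cross a line break; that is the job of the dot-all flag (`s` in JavaScript, `re.DOTALL` in Python). The Python sketch below shows the difference. Whether the `search-replace` operator passes an `s` flag through in a given TiddlyWiki version is an assumption to verify; a character class such as `[\s\S]*?` matches across lines without any flag.

```python
import re

text = "before ##first\nsecond%% after"
pattern = r"##(.*?)%%"

# No flags: "." stops at the line break, so nothing is found.
print(re.findall(pattern, text))

# Multiline flag: only ^ and $ change meaning, "." still stops at "\n".
print(re.findall(pattern, text, re.MULTILINE))

# Dot-all flag: "." also matches "\n", so the snippet keeps its linefeed.
print(re.findall(pattern, text, re.DOTALL))

# Flag-free alternative: [\s\S] matches any character including "\n".
print(re.findall(r"##([\s\S]*?)%%", text))
```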

Oh, if linefeeds are removed and you need the exact found text, you may want to change the linefeeds to some rare character instead and restore them later in the found text. I didn’t need to do this in my previous code, so I didn’t think of restoring them.
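The placeholder idea can be sketched as follows in Python. The name `find_keep_linefeeds` and the choice of U+2424 as the placeholder are assumptions made for illustration; the approach only works if the placeholder never occurs in the original text.

```python
import re

def find_keep_linefeeds(text, start, end, placeholder="\u2424"):
    """Find start ... end snippets while preserving their linefeeds.

    Hypothetical helper. Assumes `placeholder` does not occur in `text`.
    """
    # Swap linefeeds for a rare character so "." can match across them.
    flat = text.replace("\n", placeholder)
    pattern = re.escape(start) + "(.*?)" + re.escape(end)
    # Restore the linefeeds inside each snippet that was found.
    return [m.replace(placeholder, "\n") for m in re.findall(pattern, flat)]

print(find_keep_linefeeds("a ##line one\nline two%% b", "##", "%%"))
```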

Hi @jacng
Thank you for the hint! I will experiment with keeping the linefeeds. In some cases, like extracting all macros in a bunch of tiddlers, you need to keep the text structure (including linefeeds).

2 posts were split to a new topic: How to Extract Delimited Substrings from a Longer String