I want to extract text from macro calls in a tiddler with a regex filter. The text looks like this:
Text...
...<<arbitraryMacro """The first text I want""" "multiple" "other" "SameID" >>
...text ..
.<<arbitraryMacro """A text I don't want""" "multiple" "other" "otherID" >>
...text ...
<<anotherarbitraryMacro """the other text I want""" "multiple" "other" "SameID">>...text...
So I would add an extractID variable to the macros, and I want the text inside the triple quotes (""") of every macro call that has the same extractID.
This may get more complicated if your """target text""" itself can contain <<, but here’s a quick attempt to get you started:
\function extract.id(id) [split[<<]search:title:regexp<id>] :map[split["""]nth[2]]
\define testing()
Text...
...<<arbitraryMacro """The first text I want""" "multiple" "other" "SameID" >>
...text ..
.<<arbitraryMacro """A text I don't want""" "multiple" "other" "otherID" >>
...text ...
<<anotherarbitraryMacro """the other text I want""" "multiple" "other" "SameID">>...text...
\end
{{{ [<testing>extract.id[SameID]] }}}
\function extract.id(id) is the functional bit here. I stuck your sample text in a macro for ease of testing, but you could also use [{Transcluded Tiddler}extract.id[SameID]] or [{!!field-name}extract.id[SameID]].
Are you familiar with functions as custom filter operators, generally speaking? If not: a function is essentially a shortcut for a filter that you can use either as a variable or as a segment of a larger filter. Breaking this one down piece by piece:
id = a named parameter that we can invoke within the function definition using the corresponding variable <id>. When you use the function as a filter operator, you can supply parameters in the order they appear in the \function pragma, and just like a normal filter operator, each parameter can be a [literal value], a {!!transclusion}, or a <variable>. So if you’ve already defined the id you want to extract in a given context, e.g. <$let findID="SameID">, you could use [<input>extract.id<findID>] (variable names are case-sensitive, so the spelling has to match the one in $let).
split[<<]: splits the input value (in my example, that’s <<testing>>) into separate chunks by breaking it at each instance of <<. This gives us four separate strings, which I’ve copied here, one per line:
Text... ...
arbitraryMacro """The first text I want""" "multiple" "other" "SameID" >> ...text .. .
arbitraryMacro """A text I don't want""" "multiple" "other" "otherID" >> ...text ...
anotherarbitraryMacro """the other text I want""" "multiple" "other" "SameID">>...text...
Of course, this only works because all your macrocalls start with <<. If you were also using <$macrocall $name="arbitraryMacro" ... /> or <$transclude $variable="arbitraryMacro" ... />, this would get more complex; we’d have to use splitregexp with a regex that covered all possible cases. But if your macro format is consistent, split[<<] will do.
search:title:regexp<id> — this filters the results of split[<<] down to only those lines that contain your desired ID:
arbitraryMacro """The first text I want""" "multiple" "other" "SameID" >> ...text .. .
anotherarbitraryMacro """the other text I want""" "multiple" "other" "SameID">>...text...
Now to retrieve only the text within triple quotes, we can use :map to apply the same transformative filter to each result of the previous step. Each line gets split at """, so the first line gives us, for instance
arbitraryMacro
The first text I want
"multiple" "other" "SameID" >> ...text .. .
And in each case, the second string contains the text you actually want, so we can use nth[2] to return only that line.
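If it helps to see the same three steps outside of filter syntax, here is a rough JavaScript sketch of the logic. It only illustrates the idea and is not how TiddlyWiki implements these operators. The name extractById is made up for this example, and the case-insensitive match is an assumption about the search operator's default behaviour.

```javascript
// Hypothetical helper mirroring:
//   [split[<<]search:title:regexp<id>] :map[split["""]nth[2]]
function extractById(text, id) {
  // Assumption: match case-insensitively, as the search operator does by default.
  const idPattern = new RegExp(id, "i");
  return text
    .split("<<")                               // split[<<]
    .filter((chunk) => idPattern.test(chunk))  // search:title:regexp<id>
    .map((chunk) => chunk.split('"""')[1])     // split["""]nth[2] (nth counts from 1)
    .filter((hit) => hit !== undefined);       // chunks without triple quotes yield nothing
}

const sample = `Text...
...<<arbitraryMacro """The first text I want""" "multiple" "other" "SameID" >>
...text ..
.<<arbitraryMacro """A text I don't want""" "multiple" "other" "otherID" >>
...text ...
<<anotherarbitraryMacro """the other text I want""" "multiple" "other" "SameID">>...text...`;

console.log(extractById(sample, "SameID"));
```

With the sample text, this should log the two SameID texts and leave out the otherID one, matching what the filter returns.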
Do you want the numbers as part of your rendered tiddler or as plain text? If you want a rendered list, you could use <<list-links "[<testing>extract.id[SameID]]" type:"ol">> with some CSS to add “linebreaks”, I think…
If you want the numbers and linebreaks as part of the text, you’ll need a couple of supplementary functions to splice them in, plus <pre><$text ... /></pre> to render the linebreaks properly:
\function lbr() [charcode[10]]
\function extract.id(id) [split[<<]search:title:regexp<id>] :map[split["""]nth[2]]
\function join.results() [all[]] :map[<index>add[1]addsuffix[. ]addsuffix<lbr>addsuffix<currentTiddler>] +[join<lbr>]
\define testing()
Text...
...<<arbitraryMacro """The first text I want""" "multiple" "other" "SameID" >>
...text ..
.<<arbitraryMacro """A text I don't want""" "multiple" "other" "otherID" >>
...text ...
<<anotherarbitraryMacro """the other text I want""" "multiple" "other" "SameID">>...text...
\end
<pre><$text text={{{ [<testing>extract.id[SameID]join.results[]] }}} /></pre>
This should give you
1.
The first text I want
2.
the other text I want
lbr: [charcode[10]] = a line break
join.results:
[all[]] :map[...] = for each input value, apply the following filter:
<index> = a variable defined by :map = the “number” of the string being processed, starting at 0
add[1] = because we want to start at 1 rather than 0
addsuffix[. ]addsuffix<lbr> = adds the period and the following linebreak
addsuffix<currentTiddler> = adds the text string extracted by extract.id. Within the context of :map, <<currentTiddler>> is redefined to refer to the current input value.
+[join<lbr>] = join each result of the previous filter run into a single string, separated by linebreaks.
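The numbering-and-joining step can be sketched in JavaScript like this. Again, this is only an illustration of what join.results does to its input, with joinResults as a made-up name; the callback's index plays the role of <index> and item plays the role of <currentTiddler>.

```javascript
// Hypothetical stand-in for join.results: number each item, then join with linebreaks.
function joinResults(items) {
  const lbr = String.fromCharCode(10); // same idea as [charcode[10]]
  return items
    // <index>add[1]addsuffix[. ]addsuffix<lbr>addsuffix<currentTiddler>
    .map((item, index) => `${index + 1}. ${lbr}${item}`)
    // +[join<lbr>]
    .join(lbr);
}

console.log(joinResults(["The first text I want", "the other text I want"]));
```

Note that addsuffix[. ] leaves a trailing space after each period, which the sketch reproduces; it is invisible in the rendered output.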
It doesn’t — this is just a supplemental variable definition replacing the static [<<] in split[<<]. Since there’s some conflicting overlap between regex and filter bracketing, I like to use \define or $let to define regex strings as variables that can be easily reused in any filter context. So splitregexp<outerMacro> is equivalent to splitregexp[(?<!\<\<.+?"""(.+?)?)<<], but (hopefully) a little easier to parse. And splitregexp[(?<!\<\<.+?"""(.+?)?)<<] ought to split at every << that is not preceded by the following:
<<anyCharacters """other optional characters
That is, “split at every << unless it’s already inside a set of triple quotes in a macrocall.”
I didn’t test this extensively, so there may be other exceptions this regexp doesn’t account for… but it should properly account for target text that includes macros like <<ref "Word" "explanation">>.
The rest of extract.id is unchanged, so you’d use it the same way you’d use the previous version.
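To see what the lookbehind does, here is a small JavaScript sketch that uses the same pattern for the split. It is an illustration under a few assumptions: extractByIdNested and the sample text are made up, the match is case-insensitive, and the inner group is written as non-capturing, (?:.+?)?, because JavaScript's String.prototype.split inserts the values of capturing groups into its result array. It also needs a JavaScript engine with lookbehind support.

```javascript
// The pattern discussed above, with the inner group made non-capturing for split().
const outerMacro = /(?<!<<.+?"""(?:.+?)?)<</;

// Hypothetical variant of the extraction logic that splits on the regex
// instead of on every literal "<<".
function extractByIdNested(text, id) {
  const idPattern = new RegExp(id, "i"); // assumption: case-insensitive match
  return text
    .split(outerMacro)                         // skip "<<" that follows <<...""" on the same line
    .filter((chunk) => idPattern.test(chunk))  // keep chunks containing the id
    .map((chunk) => chunk.split('"""')[1])     // text between the first pair of triple quotes
    .filter((hit) => hit !== undefined);
}

// Made-up sample: the target text itself contains a macro call.
const nested = [
  "Text...",
  '<<arbitraryMacro """see <<ref "Word" "explanation">> here""" "multiple" "other" "SameID">>',
  "...text...",
  '<<arbitraryMacro """skip me""" "multiple" "other" "otherID">>',
].join("\n");

console.log(extractByIdNested(nested, "SameID"));
```

Because . does not match a line break here, the lookbehind only looks back along the current line. One consequence is that a second macro call on the same line as an earlier one would not be split off, so this sketch keeps each macro call on its own line.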