Parse text until line contains specific string

I’d like to parse a text line by line, searching for a given string.
This could be done with the search operator, but I also want to know which line the match occurs on.
The text can contain comments on arbitrary lines, too.
Comment lines should not be counted when numbering the matching line.

I think I need some recursive function or procedure for parsing the text line by line:
Can a “global” counter variable be implemented for the line numbers?
Or a marker that remembers the last position found, so that it can be used as the start position in another run?

# Tiddler title: SampleData
#################################
# First line: comment
# Second line: comment
Some text;separated by;semicolon
# Another line, another comment
More text;separated by;semicolon
More text;follows

My first thought for stripping out the comment lines:

\function new.line() [charcode[10]]
\function get.text() [{SampleData}split<new.line>] :map[<currentTiddler>search::anchored[#]count[]compare:integer:gt[0]then[comment]else[text]]

How do I call get.text recursively until a text line is found?

Thank you in advance for your comments!

This was a fun puzzle, thank you! You didn’t say what format you needed the results in, so I’ll start with the easiest way to do it, which is with nested lists. I’m assuming that each “comment” line starts with #, and no non-comment lines do.

<$let search="semicolon">
	"<<search>>" appears in line(s)
<$list filter="[{SampleData}splitregexp[\n]!prefix[#]]" variable="line" counter="num">
	<$list filter="[<line>search:title<search>]"><<num>></$list>
</$list>
</$let>

The $list widget lets us define a counter variable, which makes it easy to run the search on each line and display only the numbers of the lines where the search term appears.
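For anyone who wants to check the counting logic outside TiddlyWiki, here is a rough Python model of the nested-list idea. It is only a sketch: `matching_line_numbers` and `SAMPLE` are names made up for this illustration, and it assumes a single-word search term matched case-insensitively.

```python
# Hypothetical stand-in for the SampleData tiddler text.
SAMPLE = "\n".join([
    "# First line: comment",
    "Some text;separated by;semicolon",
    "# Another line, another comment",
    "More text;separated by;semicolon",
    "More text;follows",
])


def matching_line_numbers(text, term):
    """Return 1-based numbers of the non-comment lines that contain term."""
    # Drop comment lines first so they never use up a line number
    # (the role of !prefix[#] in the outer list).
    lines = [ln for ln in text.split("\n") if not ln.startswith("#")]
    # enumerate(..., start=1) plays the role of counter="num".
    return [num for num, ln in enumerate(lines, start=1)
            if term.lower() in ln.lower()]


print(matching_line_numbers(SAMPLE, "semicolon"))  # [1, 2]
```

Filtering before counting is the design choice that keeps comments out of the numbering.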

But if you do want to do it with a function, here’s another option:

\function text.line(tiddler) [<tiddler>get[text]splitregexp[\n]!prefix[#]]
\function search.line(tiddler, searchTerm)
[text.line<tiddler>search:title<searchTerm>]
:map[text.line<tiddler>allbefore:include<currentTiddler>count[]!match[0]]
\end

"semicolon" appears in lines {{{ [search.line[SampleData],[semicolon]] +[join[, ]] }}}
<$list filter="[search.line[SampleData],[semicolon]]" variable="num">

<<num>> {{{ [text.line[SampleData]nth<num>] }}}
</$list>
  • I used <tiddler>get[text] here, but if you want to preserve the {SampleData} format you could do something like this instead:
\function text.line(text) [<text>splitregexp[\n]!is[blank]!prefix[#]]

[text.line{SampleData}]
  • You’ll notice search.line uses text.line<tiddler> twice; this means that you could, in theory, do the whole thing with a single hideous function…
\function search.line(tiddler, searchTerm)
[<tiddler>get[text]splitregexp[\n]!prefix[#]search:title<searchTerm>]
:map[<tiddler>get[text]splitregexp[\n]!prefix[#]allbefore:include<currentTiddler>count[]!match[0]]
\end

… but I think splitting it makes it neater and easier to read. :slight_smile:
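To make the counting trick in search.line a little more concrete, here is a small Python model of it. This is a sketch under assumptions, not TiddlyWiki code: `text_line`, `all_before_inclusive` and `search_line` are illustrative names, and the matching is a simple case-insensitive substring test.

```python
def text_line(text):
    """Non-comment lines of the text, in order."""
    return [ln for ln in text.split("\n") if not ln.startswith("#")]


def all_before_inclusive(items, marker):
    """Items up to and including the first occurrence of marker.

    Returns an empty list when marker is absent.
    """
    if marker not in items:
        return []
    return items[: items.index(marker) + 1]


def search_line(text, term):
    """Line number of each hit = how many lines come before it, itself included."""
    lines = text_line(text)
    hits = [ln for ln in lines if term.lower() in ln.lower()]
    counts = [len(all_before_inclusive(lines, hit)) for hit in hits]
    # Mirror the final !match[0] step: discard empty counts.
    return [n for n in counts if n != 0]
```

In this model, two identical lines would both report the number of the first one, so the sketch assumes the data lines are distinct.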

Here’s my solution:

data tiddler:<br/><$edit-text field="data"/><br/>
find:<br/><$edit-text field="find"/><br/>

<$let find={{!!find}}>
<$let result={{{ [{!!data}get[text]splitregexp[\n]!prefix[#]] :map[search<find>] +[!match[]] }}}>
<<result>>
</$let>
</$let>

Notes:

  • For testing purposes, we start by getting the name of the data tiddler and the desired search text, which we store in fields {{!!data}} and {{!!find}} of the current tiddler.
  • Next, we get the value of the search text as a variable named find so we can refer to it inside the :map[...] filter.
  • Then, we use a $let widget to invoke a filtered transclusion that finds any matching lines, and stores the output in a variable named result.
  • Within the filtered transclusion syntax:
    • The first filter run, [{!!data}get[text]splitregexp[\n]!prefix[#]], extracts the lines from the data tiddler, discarding any lines that start with #
    • The second filter run, :map[search<find>], returns any line that contains the desired search text. Note that if a line doesn’t contain the search text, the map filter run returns a blank value.
    • The third filter run, +[!match[]] eliminates any blank values returned by the :map filter run.
  • Note that the $let assignment automatically uses only the FIRST matching item from the filtered transclusion parameter, so all that remains is to display the variable <<result>>.
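The three filter runs described in the notes can be modelled in a few lines of Python. This is only an illustrative sketch: `first_match` is a made-up name, and the matching is assumed to be a case-insensitive substring test on a single-word term.

```python
def first_match(text, term):
    """Model of the three filter runs, keeping only the first result."""
    # Run 1: split into lines and discard comment lines.
    lines = [ln for ln in text.split("\n") if not ln.startswith("#")]
    # Run 2 (:map): each line maps to itself on a hit, or to a blank value.
    mapped = [ln if term.lower() in ln.lower() else "" for ln in lines]
    # Run 3: eliminate the blank values left behind by the map step.
    kept = [ln for ln in mapped if ln != ""]
    # The variable assignment keeps only the first item, if there is one.
    return kept[0] if kept else ""
```

The map step always yields one value per input line, which is why a separate step is needed to remove the blanks.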

In addition to the above, we can also use the :map[...] filter run to add formatting to the output by replacing:

:map[search<find>]

with

:map[search<find>then<index>add[1]addprefix[Line ]addsuffix[ = ]addsuffix<currentTiddler>]
  • <index> is the numeric index of the current list item. Note that <index> is 0-based, so we use add[1] to produce line numbers that are 1-based.
  • We then add “Line ” as a prefix, and “ = line text” as a suffix to produce the desired output.

The final output would look like:

Line # = text
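The 0-based index and the add[1] step can be sketched in Python like this. `labelled_matches` is a hypothetical name used only for the illustration, and the search is assumed to be a case-insensitive substring test.

```python
def labelled_matches(text, term):
    """Return 'Line N = text' for every non-comment line containing term."""
    lines = [ln for ln in text.split("\n") if not ln.startswith("#")]
    out = []
    # enumerate() starts at 0, just like <index>, so add 1 for display.
    for index, ln in enumerate(lines):
        if term.lower() in ln.lower():
            out.append(f"Line {index + 1} = {ln}")
    return out


sample = "# comment\nSome text;separated by;semicolon\nMore text;follows"
print(labelled_matches(sample, "follows"))  # ['Line 2 = More text;follows']
```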

enjoy,
-e

Thank you, @etardiff & @EricShulman, for two nice solutions.
I’m currently playing around with your suggestions and trying to find out which one suits me better.
I’m going to post another reply as soon as I have worked it out :grinning:
Thanks!

In case it’s of any use, I wound up doing the following as a fun exercise:

\define Search(tid val filter)

<$wikify name="result" text = """
<$list filter="[[$tid$]get[text]splitregexp[\n]]" counter="num">
<$list filter="[<currentTiddler>!prefix[#]]">
<<num>>🟠🟠🟠<$text text={{!!title}}/>\n
</$list>
</$list>
""">
<$list filter="[<result>split[\n]regexp[$val$]$filter$]">
<$text text={{{ [{!!title}split[🟠🟠🟠]nth[1]] }}}/><br>
</$list>
</$wikify>
\end


!! All:

<<Search "New Tiddler" "separated">>

!! First

<<Search "New Tiddler" "separated" "first[]">>

!!Last

<<Search "New Tiddler" "separated" "last[]">>

!!Second

<<Search "New Tiddler" "separated" "nth[2]">>

The input text:

image

The results:

image

BTW, as-is, you can pass along other subfilters, like:

!!Count of lines that have the search value

<<Search "New Tiddler" "separated" "count[]">>

image
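The idea of passing a trailing subfilter can be modelled in Python by handing an optional selector function to the search. This is a loose sketch, not a translation of the macro: `search` and the lambdas are illustrative names, the pattern is matched against the line text only, and, as I read the macro, its counter runs over every line, comments included, so the model numbers lines the same way.

```python
import re


def search(text, pattern, select=None):
    """Numbers of non-comment lines matching pattern, optionally post-filtered."""
    # Number every line first (comments included), then skip the comments.
    numbered = [(num, ln) for num, ln in enumerate(text.split("\n"), start=1)
                if not ln.startswith("#")]
    hits = [num for num, ln in numbered if re.search(pattern, ln)]
    # The selector stands in for the trailing subfilter (first[], last[], ...).
    return select(hits) if select else hits


first = lambda hits: hits[:1]        # like first[]
last = lambda hits: hits[-1:]        # like last[]
second = lambda hits: hits[1:2]      # like nth[2]
count = lambda hits: [len(hits)]     # like count[]
```

Keeping the selector separate from the search is what lets one definition serve all of the calls above.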

I’m very sorry for keeping you waiting. Unfortunately, I had some rather busy weeks both at work and at home.

Thank you all for your nice solutions (including @Charlie_Veniot :slight_smile:), which helped me a lot in meeting my requirements.
I ended up combining the first two solutions, which resulted in:

\function text.line(tiddler) [<tiddler>get[text]splitregexp[\n]!prefix[#]]
\function search.line(tiddler, searchTerm)
[text.line<tiddler>search:title<searchTerm>]
:map[text.line<tiddler>allbefore:include<currentTiddler>count[]!match[0]]
\end
<style>
  .markup {
    padding: 0px 3px 0px 3px;
    border-color: cyan;
    color: black;
    background: lightblue;
  }
  .small {
    border: 0;
    height: 0;
    border-bottom: 1px solid #ddddee;
  }
</style>

data tiddler:<br/><$edit-text field="data"/><br/>
find:<br/><$edit-text field="find"/><br/>

<% if [{!!find}!match[]] %>
''//{{!!find}}//'' appears in lines <$text text={{{ [search.line{!!data},{!!find}] +[join[, ]] }}} />
<hr class="small">

<$list filter="[search.line{!!data},{!!find}]" variable="num">
<$let strLine={{{ [text.line{!!data}nth<num>] }}} parts={{{ [<strLine>split{!!find}count[]] }}} >
<<num>>:
<!-- <<num>>: <<strLine>> [<<parts>>] -->
<% if [<parts>compare:integer:gt[2]] %>
<$list filter="[<strLine>split{!!find}]" variable="content" join='<span class="markup">{{!!find}}</span>'><<content>></$list>
<% else %>
<$let strLeft={{{ [<strLine>split{!!find}nth[1]] }}} strRight={{{ [<strLine>split{!!find}nth[2]] }}}>
<<strLeft>><span class="markup">{{!!find}}</span><<strRight>>
</$let>
<% endif %>
<br>
</$let>
</$list>
<% else %>
Enter search term to start!
<% endif %>

Output:

image
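The highlighting step boils down to splitting the line on the search term and joining the pieces again with the marked-up term in between. Here is a minimal Python sketch of that idea; `highlight` is a made-up name, and the `markup` class is taken from the style block above.

```python
def highlight(line, term):
    """Wrap every occurrence of term in a span with class 'markup'."""
    if not term:
        # Nothing to highlight (and str.split rejects an empty separator).
        return line
    marker = f'<span class="markup">{term}</span>'
    # Splitting on the term and re-joining with the marker handles
    # one occurrence and several occurrences in the same way.
    return marker.join(line.split(term))


print(highlight("More text;follows", "text"))
# More <span class="markup">text</span>;follows
```

In this model str.split is case-sensitive, so a term typed in a different case than the text would be left unhighlighted.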

Thank you also for your detailed explanations; the allbefore operator was new to me.
I didn’t know that I could use an undefined variable inside a filter expression. Interesting!

Can this method be used to create context-search functionality like that seen in the Context Plugin (search with context)?