Need a regexp to extract the number from a string

I would like a regexp that extracts the number from an arbitrary string. The string can have characters before or after the number, but we can assume there is only one segment of digits forming the number.

arbitrarystring123123
1234 arbitrary string1234
Arbi12345trary String!12345

I’ve googled around and quite a few places propose simply \d+ (e.g. here) or the slightly more complex /(\d+)/ (e.g. here). Or are those really the same?

Assuming the above regexp is correct, I guess I just don’t know how to put it into TW. Here is an attempt, but the output is the full string1234. And if I put in the more complex expression above, the output is nothing:

<$let reg="\d+">
{{{ [[string1234]regexp<reg>] }}}
</$let>

I’ve also tested removing \ and brackets, etc, but nope.
What do I misunderstand?

Thank you!

The main misunderstanding is that regexp is a search operator: it passes through the input titles that match the pattern and doesn’t modify them. In fact, you could do the same thing (i.e., return an input string if it contains at least one digit) with the extended search syntax: search:title:regexp<reg>.

I’d handle this with the search-replace operator instead. Luckily, it also comes with a regexp flag!

<$let reg="[^\d]">
{{{ [[string1234]search-replace:g:regexp<reg>,[]] }}}
</$let>

Here I’m using [^\d] to indicate any character that’s not a digit. This is a general regex pattern: [^abc] will match anything that’s not a, b, or c, and \d refers to any digit, as you’ve already discovered. We don’t need the + in this case because we’ll be replacing each individual non-digit with an empty string (the empty [] at the end of the filter).

We also need the global flag :g : [[string1234]search-replace::regexp<reg>,[]] will only get you to tring1234.
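
The first-match versus global distinction is not specific to TiddlyWiki. As a rough command-line analogy (this is sed, not wikitext, and the function names are made up for illustration), [^0-9] plays the role of [^\d] and the trailing g plays the role of the :g flag:

```shell
# Analogy only: sed on the command line, not a TiddlyWiki filter.
# strip_first and strip_all are hypothetical helper names.
strip_first() { sed 's/[^0-9]//'; }    # no g flag: only the first non-digit is removed
strip_all()   { sed 's/[^0-9]//g'; }   # g flag: every non-digit is removed

printf 'string1234\n' | strip_first    # prints tring1234
printf 'string1234\n' | strip_all      # prints 1234
```

Without the global flag, the replacement stops after the leading "s", which is the same behaviour described above for search-replace.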

What happens when there are two or more numbers in the string?

For example: “abc123def456xyz”

If the regexp pattern does not contain any square brackets, you can avoid the need for a separate variable to contain the pattern. Fortunately, for this specific use case, the regexp pattern “\D” (note the capitalization!) means “match any NON-numeric character”.

Thus, to remove non-numeric characters, you can simply write:

{{{ [[string1234]search-replace:g:regexp[\D],[]] }}}

-e

Unsure if this helps - but this is how I would do it:

% head -1 MyFile | tr -d '[:cntrl:]' | tr -dc '[:alnum:]\n\r' | tr -d '[a-zA-Z]'
% head -2 MyFile | tr -d '[:cntrl:]' | tr -dc '[:alnum:]\n\r' | tr -d '[a-zA-Z]'
% head -3 MyFile | tr -d '[:cntrl:]' | tr -dc '[:alnum:]\n\r' | tr -d '[a-zA-Z]'
% head -4 MyFile | tr -d '[:cntrl:]' | tr -dc '[:alnum:]\n\r' | tr -d '[a-zA-Z]'

MyFile: (has all these tests)

arbitrarystring123
1234 arbitrary string
Arbi12345trary String! 
abc123def456xyz

Output: (all lines)

% cat MyFile | tr -d '[:cntrl:]' | tr -dc '[:alnum:]\n\r' | tr -d '[a-zA-Z]'
123123412345123456
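
A note on the output above: the digits from all four lines run together because the first tr step deletes control characters, and the newline is one of them. A minimal sketch that keeps one result per line, assuming the same MyFile contents (digits_per_line is a hypothetical helper name, not part of the original commands):

```shell
# Delete every character that is neither a digit nor a newline,
# so each input line keeps its own line in the output.
digits_per_line() { tr -cd '0-9\n'; }

printf 'arbitrarystring123\n1234 arbitrary string\nabc123def456xyz\n' | digits_per_line
# 123
# 1234
# 123456
```

Note that the last line still merges 123 and 456 into 123456, so this does not address strings that contain two separate numbers.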

Aha! I now realize I’ve naively thought that regexp more or less is some kind of magic spell that itself “does” things. Your information makes sense.

Appreciated!

Fair question, but my use case is unlikely to do so.

Even better! Thanks Eric!

Thanks, but I have no idea what that means. I’m asking what to put in a wikitext filter. Could you show what you do with the above in a wikitext filter?

If the scenario can happen and such a thing can cause headaches and can’t be easily identified visually, you’ll probably want to set up some kind of check and flag.

I’m looking through the docs to see how this translates to a wikitext filter…

As for what I posted means:

% head -1 MyFile  <--- Grab the first line in the MyFile - using a command line (my case Mac Terminal)

The line is:
arbitrarystring123

Then, send the output (the line above) to the “translate” command (tr), which in this case deletes (-d) anything on that line that matches a control character, including the trailing newline:

| tr -d '[:cntrl:]' 

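Then delete everything that is not alphanumeric, a newline, or a carriage return (-c takes the complement of the set, so spaces and punctuation are removed):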

| tr -dc '[:alnum:]\n\r'

Then take the output of the above and delete (-d) the letters, which leaves only the digits:

| tr -d '[a-zA-Z]'

The result is just numbers.

123
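
The three stages above can also be traced one at a time. This is only a sketch: the function names are hypothetical wrappers around the same tr calls, and the sample line is the third one from MyFile:

```shell
# Hypothetical wrappers, one per stage of the pipeline described above.
strip_control() { tr -d '[:cntrl:]'; }        # drops control characters, including the newline
keep_alnum()    { tr -dc '[:alnum:]\n\r'; }   # drops everything except letters, digits, \n and \r
strip_letters() { tr -d '[a-zA-Z]'; }         # drops letters (and literal [ ] characters)

printf 'Arbi12345trary String!\n' | strip_control | keep_alnum
# Arbi12345traryString   (space and "!" are gone, no trailing newline)

printf 'Arbi12345trary String!\n' | strip_control | keep_alnum | strip_letters
# 12345
```

Because the newline is removed in the first stage, the result is printed without a line break after it.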

I am reading up - but Eric (I think) already provided the solution - I saw it after I posted.

Your answer does not work in TiddlyWiki, and you are not providing the context in which it does work. @TwN00b I believe you posted a ChatGPT answer in the past that was untrue; what are you doing here?

You must take into account the people reading this, not only how you understand things.

“what are you doing here?” seems like an overly harsh comment! I see no indication that this was an LLM AI-generated (aka “ChatGPT”) response. My take is that @TwN00b simply misunderstood the question and provided what appears to be a reasonable Linux-style command-line solution rather than a TiddlyWiki wikitext filter solution.

-e

Hi @TW_Tones - I am still looking for the TW way to do what I posted - since I am still getting my feet wet in this.

As for the post -
I was merely posting the way I would solve this - in hopes it might help nudge something loose in our collective brains - for the solution.

Hence, the way I would solve this (non-TW way) - is using Bash - aka a Mac terminal (iTerm2 in this case):

  • All the commands are commands you run in the terminal
  • Each has a built-in help – using the man command
  • The sequence in bash is pretty simple using unix pipes. I think pipes might be like the "trains, or tracks, or stations" in TW? Unsure.

For ChatGPT - nah - I won’t use it - also - I think the team frowns on it in TiddlyTalk - - some post recently about that.

  • I think you made that assumption - - just like I did for bash and the process I would follow - - no worries

As for what I’m doing - well - it’s simple - and to be clear

  • Just trying to ‘give back’ - since soooo many devs have helped me in the past
  • I assume you meant it with no disrespect - - since I will never take it in the mouth,…

I admit that I made the mistake of assuming the devs understood bash or what my intent was - that’s on me as well - as I was trying to explain that this is the flow I would follow.

Even if I criticise, I do not mean disrespect. I simply want communication here to be helpful and thoughtful of the audience. If too much content exists that not many people understand or can rely on, we will lose participation, especially among our large cohort of newer users.

Thanks for your explanation and desire to give back. Perhaps keep the audience in mind when replying, and give this contextual information along with your reply so people can determine whether they should know, learn, or skip what you are saying. Especially if it’s a non-TiddlyWiki way.

You may have noticed this is a unique and effective forum for users of many skill levels, this is not by accident, we need to support its culture.

Thanks for your considered response.