Regular expression to split tags field

I am trying to split a tags field into individual tags with R, e.g.

tags: [[tag1 & 2]] tag2 [tag] [[[[tag 4]]]]

into

c("tag1 & 2", "tag2", "[tag]", "[[tag 4]]")

What regular expression does TiddlyWiki use in its JavaScript to split it?

Sorry, I could not find it in the source code.

The current implementation uses this, but I imagine there’s something simpler for most use-cases:

/(?:^|[^\S\xA0])(?:\[\[(.*?)\]\])(?=[^\S\xA0]|$)|([\S\xA0]+)/mg
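To make the two capture groups concrete, here is a minimal JavaScript sketch of how that pattern could be applied. The helper name `splitTags` and the sample value (modeled on the question's example) are my own assumptions for illustration; this is not a copy of the TiddlyWiki source, which may do more than this.

```javascript
// Hypothetical helper (not a TiddlyWiki API): split a list-field value into titles.
function splitTags(fieldValue) {
  // The pattern quoted above. Group 1 captures a [[bracketed]] title,
  // group 2 captures a bare run of non-whitespace characters.
  const pattern = /(?:^|[^\S\xA0])(?:\[\[(.*?)\]\])(?=[^\S\xA0]|$)|([\S\xA0]+)/mg;
  const tags = [];
  for (const match of fieldValue.matchAll(pattern)) {
    // Exactly one of the two groups is set for each match.
    const tag = match[1] !== undefined ? match[1] : match[2];
    // Skip empty titles such as "[[]]".
    if (tag) tags.push(tag);
  }
  return tags;
}

// Sample value modeled on the question (an assumption, not real data).
const sample = "[[tag1 & 2]] tag2 [tag] [[[[tag 4]]]]";
console.log(splitTags(sample));
// yields the four titles: "tag1 & 2", "tag2", "[tag]", "[[tag 4]]"
```

The same idea should carry over to R by running the pattern with a Perl-compatible engine and taking whichever group matched, though I have not tested that here.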

First, you do not need regular expressions to split tags: there are core filter operators specifically for tags. Beyond that, a tags field is what we call a list field, and additional operators such as list and enlist can be used on it.

For what purpose do you want the tags in the following format?

("tag1 & 2", "tag2", "[tag]", "[[tag 4]]")
  • This can be constructed but what do you intend to do with the result?
  • Are you in fact writing a JavaScript solution?
  • Why not use TiddlyWiki Script?

For example:

\function tag-array() [all[current]tags[]addprefix["]addsuffix["]] +[join[, ]] +[addprefix[(]]  +[addsuffix[)]] 

<<tag-array>>

returns ("a", "b", "c", "tag with spaces")

I’m assuming that "with R" in the original question means that we’re doing this in the R programming language, and hence do not have access to JS/TiddlyWiki tools, but do have a regular expression engine. And

c("tag1 & 2", "tag2", "[tag]", "[[tag 4]]")

is the list format for R.


Thanks @Scott_Sauyet for your suggestions.

Yes, I am trying to use the web API to process tiddlers in R.

This simplification works for me:

/(?:^|[^\S])(?:\[\[(.*?)\]\])(?=[^\S]|$)|(\S+)/mg

I don’t know why the built-in one handles non-breaking spaces (\xA0), but I assume there’s a good reason. However, I never have them in my list fields, so for my own uses I would skip them. That’s the only change here: I removed the handling for them. It’s at least slightly simpler.
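One plausible reason for the \xA0 handling, at least in JavaScript: there `\s` matches the non-breaking space (U+00A0), so without the special case a title containing one would be split in two. The sketch below compares the two patterns from this thread. The helper name `collect` and both sample values are assumptions made up for illustration. Whether `\s` matches U+00A0 in R depends on the engine and options in use, so it is worth checking there separately.

```javascript
// The two patterns from this thread.
const builtIn = /(?:^|[^\S\xA0])(?:\[\[(.*?)\]\])(?=[^\S\xA0]|$)|([\S\xA0]+)/mg;
const simplified = /(?:^|[^\S])(?:\[\[(.*?)\]\])(?=[^\S]|$)|(\S+)/mg;

// Hypothetical helper: gather whichever capture group matched.
function collect(pattern, value) {
  const out = [];
  for (const m of value.matchAll(pattern)) {
    const tag = m[1] !== undefined ? m[1] : m[2];
    if (tag) out.push(tag);
  }
  return out;
}

// Made-up sample values: one ordinary, one containing a non-breaking space.
const plain = "[[tag1 & 2]] tag2 [tag] [[[[tag 4]]]]";
const withNbsp = "foo\u00A0bar baz";

// On ordinary input both patterns give the same four titles.
console.log(collect(builtIn, plain));
console.log(collect(simplified, plain));

// With U+00A0 they differ in JavaScript:
console.log(collect(builtIn, withNbsp));    // two titles: "foo\u00A0bar" and "baz"
console.log(collect(simplified, withNbsp)); // three titles: "foo", "bar", "baz"
```

So the simplification is safe as long as list fields never contain non-breaking spaces, which matches the use described above.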


Thanks for your help