I am trying to split a tags field into individual tags with R, e.g.
tags: [[tag 1 & 2]] tag2 [tag] [[[[tag 4]]]]
into
c("tag1 & 2", "tag2", "[tag]", "[[tag 4]]")
What regular expression does TiddlyWiki's JavaScript use to split it?
Sorry, I could not find it in the source code.
The current implementation uses this, but I imagine there’s something simpler for most use-cases:
/(?:^|[^\S\xA0])(?:\[\[(.*?)\]\])(?=[^\S\xA0]|$)|([\S\xA0]+)/mg
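To make it concrete how a pattern like this gets applied, here is a minimal sketch in JavaScript. The helper name splitTags is made up for illustration and is not a TiddlyWiki function; I believe the core does something similar in its string-array parsing utility, but treat that as an assumption. Each match yields either the bracketed capture (group 1) or the bare-word capture (group 2), and duplicates are skipped.

```javascript
// Hypothetical helper, not part of TiddlyWiki: applies the pattern quoted above
// to a list-field string and collects the individual titles.
function splitTags(value) {
  const memberRegExp = /(?:^|[^\S\xA0])(?:\[\[(.*?)\]\])(?=[^\S\xA0]|$)|([\S\xA0]+)/mg;
  const results = [];
  for (const match of value.matchAll(memberRegExp)) {
    // Group 1: text inside [[...]]; group 2: a run of non-whitespace (or \xA0).
    const item = match[1] || match[2];
    // Skip empty matches and duplicates (assumed behaviour for this sketch).
    if (item !== undefined && !results.includes(item)) {
      results.push(item);
    }
  }
  return results;
}

console.log(splitTags("[[tag 1 & 2]] tag2 [tag] [[[[tag 4]]]]"));
// [ 'tag 1 & 2', 'tag2', '[tag]', '[[tag 4]]' ]
```

The lazy `(.*?)` together with the lookahead is what lets `[[[[tag 4]]]]` come out as `[[tag 4]]`: the closing `]]` only counts when it is followed by whitespace or the end of the string.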
First, you do not need regular expressions to split tags; there are core filter operators for tags specifically. Beyond that, a tags field is what we call a list field, and there are additional operators, such as list and enlist, that can be used on it.
Can you explain why you want the tags in the following format?
("tag1 & 2", "tag2", "[tag]", "[[tag 4]]")
- This can be constructed but what do you intend to do with the result?
- Are you in fact writing a JavaScript solution?
- Why not use TiddlyWiki Script?
For example:
\function tag-array() [all[current]tags[]addprefix["]addsuffix["]] +[join[, ]] +[addprefix[(]] +[addsuffix[)]]
<<tag-array>>
returns ("a", "b", "c", "tag with spaces")
I'm assuming that "with R" in the original question means that we're doing this in the R programming language, and hence do not have access to JS/TiddlyWiki tools, but do have a regular expression engine. And
c("tag1 & 2", "tag2", "[tag]", "[[tag 4]]")
is the list format for R.
Thanks @Scott_Sauyet for your suggestions.
Yes, I am trying to use the web API to process tiddlers in R.
This simplification works for me:
/(?:^|[^\S])(?:\[\[(.*?)\]\])(?=[^\S]|$)|(\S+)/mg
I don't know why the built-in one handles non-breaking spaces (\xA0), but I assume there's a good reason. However, I never have them in my list fields, so for my own uses I would skip them. That's the only change here: I removed the handling for them, which makes the pattern at least slightly simpler.
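To show where the two patterns agree and where they differ, here is a small JavaScript sketch. The helper splitWith and the constant names are made up for illustration. It assumes JavaScript regex semantics, where \s includes the non-breaking space, so the simplified pattern treats \xA0 as a separator while the built-in one keeps it inside a title.

```javascript
// Hypothetical names for this sketch; neither constant nor helper exists in TiddlyWiki.
const builtInPattern = /(?:^|[^\S\xA0])(?:\[\[(.*?)\]\])(?=[^\S\xA0]|$)|([\S\xA0]+)/mg;
const simplifiedPattern = /(?:^|[^\S])(?:\[\[(.*?)\]\])(?=[^\S]|$)|(\S+)/mg;

// Collect group 1 (bracketed title) or group 2 (bare title) from every match.
function splitWith(pattern, value) {
  const results = [];
  for (const match of value.matchAll(pattern)) {
    const item = match[1] || match[2];
    if (item !== undefined) {
      results.push(item);
    }
  }
  return results;
}

const plain = "[[tag 1 & 2]] tag2 [tag] [[[[tag 4]]]]";
const withNbsp = "a\u00A0b c"; // "a", a non-breaking space, "b", a normal space, "c"

console.log(splitWith(builtInPattern, plain));      // same four tags from both patterns
console.log(splitWith(simplifiedPattern, plain));
console.log(splitWith(builtInPattern, withNbsp));   // two items: "a\u00A0b" and "c"
console.log(splitWith(simplifiedPattern, withNbsp)); // three items: "a", "b", "c"
```

So as long as no non-breaking spaces appear in a list field, the two patterns should split it the same way. An R port would need its own check, since whether \s covers \xA0 depends on the regex engine and its settings.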