For another piece of work I have come across a problem that I believe someone with better regex skills than myself may be able to solve.
It seems to me that the larger problem of parsing key/value pairs can be solved once the smaller problem of how to split a string into separate pairs is solved.
First example number="1",streetaddress="my street",city="my town",postcode=2222,"key name"="key value"
In this example, the commas make it possible to split the string into five separate key/value pairs and then handle them.
However, this means we cannot use a comma "," inside the keys or values.
What if the input does not include commas?
Second example number="1" streetaddress="my street" city="my town" postcode=2222 "key name"="key value"
In this example we can read the key/value pairs as a list, but how can we parse the string into separate key/value pairs?
Third example; number='1' streetaddress="my street there is a 'object' waiting" city="my town" postcode=2222 "key name"="key value" keyname=""" This includes "double quote" in the value""" mykey=45
This introduces examples of the different quote rules as documented, again without the commas. How can we split this into key/value pairs?
Note that TiddlyWiki can already parse a range of key/value pairs passed into widgets and macros, and the new set multiple variables widget can take a list of keys plus a list of values to convert key/value pairs into variables.
The problem is the initial splitting of a list of key/value pairs into separate pairs, given the various alternative ways to "quote a value" or, for that matter, to quote a key name.
It is also a little difficult to then make use of the results once the list has been split successfully.
I would appreciate any ideas, especially if a split operator such as splitregexp<myexp> could do this for all cases in general,
or at least based only on double quotes around the values,
or perhaps a tool to programmatically insert the appropriate commas.
There may even be a hack that uses the similar behaviour already in TiddlyWiki to achieve the full outcome.
For example, most widgets accept and validly process such key/value pairs, often referred to as a "hashmap of variables".
First example: [<string>split["]split[=]trim[,]!is[blank]trim[ ]] will clean up the input string so that every key is followed by its value. Then you can get a value by its "index": +[first[8]last[1]] or +[nth[8]] will output the value for postcode, since postcode is the 4th pair and the value of the n-th pair sits at position 2n (keys are at odd positions, values at even ones). Alternatively, if you know the key name: +[after[postcode]] will output 2222. You might get false results if a value is identical to one of the keys, and I'm not sure this will work if you have empty values (ideally you should use a default value such as N.A to prevent issues).
This seems to work with your second example too, but not the third. Maybe you could use a macro to concatenate your string inside a let widget? If the commas don't break the let widget, this would allow you to retrieve each key as a variable.
EDIT: tested it, it does allow commas, but a variable name cannot be quoted. If you find a way to clean up the keys then you could use this method, but honestly the best way would be to have a standardized input string, rather than trying to find a regex or filter expression for every possible case.
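To see why the flat key, value, key, value list works for the first example and why empty values are a concern, here is a rough Python stand-in for the filter steps. It is only a sketch: flatten_pairs and value_after are hypothetical helper names, and the Python string methods only approximate what the TiddlyWiki operators do (for instance, it ignores any de-duplication a filter run may apply).

```python
from typing import List, Optional


def flatten_pairs(text: str) -> List[str]:
    """Approximate split["]split[=]trim[,]!is[blank]trim[ ] in plain Python."""
    items = [text]
    for delimiter in ('"', '='):
        # Each split step breaks every current item on the delimiter.
        items = [part for item in items for part in item.split(delimiter)]
    items = [item.strip(',') for item in items]      # stand-in for trim[,]
    items = [item for item in items if item != '']   # stand-in for !is[blank]
    return [item.strip(' ') for item in items]       # stand-in for trim[ ]


def value_after(items: List[str], key: str) -> Optional[str]:
    """Stand-in for after[key]: the item that follows the first match."""
    if key in items and items.index(key) + 1 < len(items):
        return items[items.index(key) + 1]
    return None


first_example = ('number="1",streetaddress="my street",city="my town",'
                 'postcode=2222,"key name"="key value"')
items = flatten_pairs(first_example)

# An empty value disappears in the "drop blanks" step, so the list
# no longer alternates cleanly between keys and values.
broken = flatten_pairs(first_example + ',null=""')
```

In this sketch items[7] (the 8th item) and value_after(items, "postcode") both give 2222, while broken ends with a key that has no value after it, which is the odd-length list that throws the positional lookups out.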
Thanks @telumire, your approach is working as expected with commas and with the second case. Unfortunately, the third case is ultimately the one I need a solution for.
Unfortunately, adding null="" to the string throws these methods out. But it is close.
Verbose, but it keeps things simple and easy to handle with plain filter operators.
number:::1;;;streetaddress:::my street there is a 'object' waiting;;;city:::my town;;;postcode:::2222;;;key name:::key value;;;keyname:::This includes "double quote" in the value;;;mykey:::45
Of course, that doesn't work so well if any of the values have ::: or ;;;
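As an illustration of how little parsing this delimiter format needs, here is a minimal Python sketch. The function name parse_delimited and its parameter names are made up for this example; in TiddlyWiki itself the same two-level split would be done with plain split operators.

```python
from typing import Dict


def parse_delimited(text: str, pair_sep: str = ";;;", kv_sep: str = ":::") -> Dict[str, str]:
    """Split on the pair separator, then split each pair once on the key/value separator."""
    pairs = {}
    for chunk in text.split(pair_sep):
        key, found, value = chunk.partition(kv_sep)
        if not found:
            # A chunk without ":::" usually means a value contained ";;;".
            raise ValueError("no key/value separator in: " + repr(chunk))
        pairs[key] = value
    return pairs


sample = ("number:::1;;;streetaddress:::my street there is a 'object' waiting;;;"
          "city:::my town;;;postcode:::2222;;;key name:::key value;;;"
          'keyname:::This includes "double quote" in the value;;;mykey:::45')
parsed = parse_delimited(sample)
```

Quotes inside values need no special handling here, which is the point of the format. The ValueError branch shows the weakness already mentioned: a value that itself contains ;;; produces a fragment with no key.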
If you do want to go the regexp route, I strongly recommend https://regex101.com/ for testing your regexps. This has helped me often to debug my regular expressions in the past.
Make sure to select the flavor "ECMAScript" on the left-hand side when you use advanced operators, to stay compatible with the syntax TW uses.
Have a nice day
Yaisog
PS: For your third example, you could use a hierarchy of delimiters, e.g. split everything at """, then all the parts at ", then at ', and the rest at =. After that, proceed as in @telumire's answer. After each split, work only with items that still contain a = so you don't split e.g. my street there is a 'object' waiting further (this check needs to be put into the regexp I think as TW has no foreach-like functionality).
With that you cannot have something like key='Value with "quoted" text', though, where the hierarchy is not respected.
Parsing text using wikitext filters is probably not advisable. You'll be forever debugging regexps when something unexpected comes along.
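For comparison, outside wikitext filters the third example can be handled by matching whole pairs instead of splitting, with the quote styles tried in order (""" first, then ", then ', then an unquoted run). The following is only a sketch under that assumption, written in Python; PAIR and parse_pairs are hypothetical names, and text that does not look like a pair is silently skipped rather than reported.

```python
import re

# One key, "=", one value. Alternatives are tried left to right, so the
# triple-double-quote form has to come before the plain double-quote form.
PAIR = re.compile(r'''
    (?: "([^"]*)" | '([^']*)' | ([^\s=,"']+) )                 # key: quoted or bare
    \s* = \s*
    (?: """(.*?)""" | "([^"]*)" | '([^']*)' | ([^\s,"']+) )    # value: four quote rules
''', re.VERBOSE | re.DOTALL)


def parse_pairs(text):
    """Return (key, value) tuples in the order they appear in text."""
    pairs = []
    for match in PAIR.finditer(text):
        # Exactly one key group and one value group take part in each match.
        key = next(g for g in match.group(1, 2, 3) if g is not None)
        value = next(g for g in match.group(4, 5, 6, 7) if g is not None)
        pairs.append((key, value))
    return pairs


third_example = '''number='1' streetaddress="my street there is a 'object' waiting" city="my town" postcode=2222 "key name"="key value" keyname=""" This includes "double quote" in the value""" mykey=45'''
result = parse_pairs(third_example)
```

Because each quote style only ends at its own closing delimiter, an input such as key='Value with "quoted" text' or null="" also comes out as a single pair in this sketch. Apart from the verbose layout, the pattern only uses groups, character classes and a lazy quantifier, so it is worth testing an ECMAScript version on regex101 before relying on it in TW.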