Analysing the content type within a tiddler or field

Folks,

I am revisiting a solution I developed in the past that automatically locates $:/config tiddlers and provides a selection option. Many config tiddlers contain a simple value such as yes/no, so it is easy to pair these up and offer a selection.

  • However, it would be nice to detect whether the config tiddler contains only a number, only text or alphanumerics, a date stamp, or a color value, to name a few.
  • In cases where there are only a few possible values, I can set a config-values-filter in the config tiddler, or even provide a more advanced selection tool.

Question

Is there a method by which we could give a string and have a classification of the content returned?

This would help not only with this solution but also with cleaning data or testing user input.

  • I am aware we can limit the input using the edit-text widget but in this case I want to act on existing string values.

I imagine there are JavaScript functions or regular expressions capable of doing this already, but a filter operator that returns one of a number of classifications would be good, including those already known by TiddlyWiki.

  • Along with those below
  • Date stamp - TiddlyWiki serial format (all digits, correct length)
    • unixtime
  • Color value - named or prefixed # etc…
  • perhaps even a filter if it contains [ or a transclusion if it starts with {{
  • even an existing tiddler, a missing tiddler, or a valid title.
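
To make the list above concrete, here is a minimal JavaScript sketch of an ordered classifier. The function name `classifyValue`, the labels, the 10-digit rule for unixtime and the tiny named-color set are all assumptions for illustration, not anything TiddlyWiki provides. Checking whether a title exists or is missing needs access to the wiki store, so it is left out.

```javascript
// Hypothetical subset of CSS named colors, for illustration only.
const NAMED_COLORS = new Set(["red", "green", "blue", "black", "white"]);

// Ordered [label, test] pairs: the first test that passes decides the label,
// so more specific patterns must come before more general ones.
const CLASSIFIERS = [
  ["tw-timestamp", (s) => /^[0-9]{17}$/.test(s)], // YYYYMMDDHHMMSSmmm
  ["unixtime", (s) => /^[0-9]{10}$/.test(s)],     // assumption: seconds, 10 digits
  ["numeric", (s) => /^[0-9]+$/.test(s)],
  ["color", (s) =>
    /^#(?:[a-fA-F0-9]{3}|[a-fA-F0-9]{6})$/.test(s) ||
    NAMED_COLORS.has(s.toLowerCase())],
  ["transclusion", (s) => s.startsWith("{{")],
  ["filter", (s) => s.includes("[")],
  ["alphabetic", (s) => /^[A-Za-z]+$/.test(s)],
  ["alphanumeric", (s) => /^[A-Za-z0-9]+$/.test(s)],
];

// Return the label of the first matching classifier, or "special".
function classifyValue(s) {
  for (const [label, test] of CLASSIFIERS) {
    if (test(s)) return label;
  }
  return "special";
}

console.log(classifyValue("20240115000000000"));
console.log(classifyValue("{{MyTiddler}}"));
```

The order matters: a 17-digit string is also numeric, and "red" is also alphabetic, so the narrower tests are listed first.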

The following JS is suggested by ChatGPT

function classifyString() {
    var inputText = document.getElementById('textInput').value;
    var result = document.getElementById('result');

    if (/^[A-Za-z]+$/.test(inputText)) {
        result.textContent = 'The string contains only alphabetical characters.';
    } else if (/^[0-9]+$/.test(inputText)) {
        result.textContent = 'The string contains only numeric characters.';
    } else if (/^[A-Za-z0-9]+$/.test(inputText)) {
        result.textContent = 'The string is alphanumeric.';
    } else {
        result.textContent = 'The string contains characters other than just letters and numbers.';
    }
}

What do you think?

As you can see above, this is determined by regular expressions.

  • Revisiting Mohamad's resource http://tw-regexp.tiddlyspot.com/ I am building a function to return the various classifications.
  • Ultimately this may be better as a JavaScript operator; however, I can finalise the custom operators first.
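
For anyone curious what the JavaScript operator route might look like, here is a rough sketch written in the shape TiddlyWiki uses for filter operators: a function taking `(source, operator, options)` and returning an array of results. In a real wiki it would live in a module tiddler with `module-type: filteroperator` and be assigned to `exports.classify`; the operator name, the labels and the `makeSource` stand-in are assumptions so the snippet can run outside a wiki.

```javascript
// Classification rules, checked in order; the first match wins.
const RULES = [
  ["tw-timestamp", /^[0-9]{17}$/],
  ["numeric", /^[0-9]+$/],
  ["alphabetic", /^[a-zA-Z]+$/],
  ["alphanumeric", /^[-a-zA-Z0-9_]+$/],
  ["color", /^#(?:[a-fA-F0-9]{3}|[a-fA-F0-9]{6})$/],
];

// Operator body: emits one classification label per input title.
// In TiddlyWiki this would be: exports.classify = function(source, operator, options) {...}
function classifyOperator(source, operator, options) {
  const results = [];
  // `source` is an iterator function that calls back once per input title.
  source(function (tiddler, title) {
    const rule = RULES.find(([, pattern]) => pattern.test(title));
    results.push(rule ? rule[0] : "special");
  });
  return results;
}

// Hypothetical stand-in for the iterator TiddlyWiki passes to an operator.
function makeSource(titles) {
  return function (callback) {
    titles.forEach((title) => callback(undefined, title));
  };
}

console.log(classifyOperator(makeSource(["abc", "1234", "#fff"]), {}, {}));
```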

[Update]

If anyone is interested, this is my work in progress. Paste it into a tiddler on tiddlywiki.com to see it working.

\define numeric-regexp() ^[0-9]+$
\define tw-timestamp-regexp() ^[0-9]{17}$
\define alphabetic-regexp() ^[a-zA-Z]+$
\define alphanumeric-regexp() ^[-a-zA-Z0-9_]+$
\define color-code-regexp() ^#(?:[a-fA-F0-9]{3}|[a-fA-F0-9]{6})$
\define hex-regexp() ^[a-fA-F0-9]+$
\function classify(string) [<string>regexp<tw-timestamp-regexp>then[tw-timestamp]] :else[<string>regexp<numeric-regexp>then[numeric]] :else[<string>regexp<alphabetic-regexp>then[alphabetic]] :else[<string>regexp<alphanumeric-regexp>then[alphanumeric]] :else[<string>regexp<color-code-regexp>then[color]else[special]]
\function string.is(string set) [<set>addsuffix[-regexp]getvariable[]] :map[<string>regexp<currentTiddler>]

;Classify function returns the first format it matches or special
# `<<classify "abc">>` <<classify "abc">>
# `<<classify "1234">>` <<classify "1234">>
# `<<classify "a1b23c">>` <<classify "a1b23c">>
# `<<classify "#">>` <<classify "#">>
# `<<classify "#ffffff">>` <<classify "#ffffff">>
# `<<classify "20240115000000000">>` <<classify "20240115000000000">>
# add test valid tiddlername not `| [ ] { }`

;String.is returns the value if it matches a named type
# `<<string.is "abc" alphabetic>>` <<string.is "abc" alphabetic>>
# `<<string.is "3" numeric>>` '<<string.is "3" numeric>>'
##  `<<string.is "a" numeric>>` '<<string.is "a" numeric>>'
# `{{{ [string.is[#00ff00],[color-code]] }}}` {{{ [string.is[#00ff00],[color-code]] }}}
# `<<string.is "test" alphanumeric>>` <<string.is "test" alphanumeric>>
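
For comparison, the same "look up a named pattern, return the value if it matches" idea can be sketched in plain JavaScript. The `PATTERNS` table mirrors the `-regexp` definitions above; the function name `stringIs` and the choice to return an empty string for an unknown type name are assumptions of this sketch.

```javascript
// Named patterns, mirroring the *-regexp definitions in the wikitext.
const PATTERNS = {
  "numeric": /^[0-9]+$/,
  "tw-timestamp": /^[0-9]{17}$/,
  "alphabetic": /^[a-zA-Z]+$/,
  "alphanumeric": /^[-a-zA-Z0-9_]+$/,
  "color-code": /^#(?:[a-fA-F0-9]{3}|[a-fA-F0-9]{6})$/,
  "hex": /^[a-fA-F0-9]+$/,
};

// Return the value if it matches the named type, otherwise "".
// Assumption: an unknown type name counts as "no match".
function stringIs(value, type) {
  if (!Object.prototype.hasOwnProperty.call(PATTERNS, type)) return "";
  return PATTERNS[type].test(value) ? value : "";
}

console.log(stringIs("abc", "alphabetic"));
console.log(stringIs("#00ff00", "color-code"));
```

Rejecting unknown type names explicitly means a misspelt type shows up as an empty result instead of passing silently.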

Minor tweak…

\define color-code-regexp() ^#(?:[a-fA-F0-9]{3}|[a-fA-F0-9]{4}|[a-fA-F0-9]{6}|[a-fA-F0-9]{8})$

#0008 or #00000088 is roughly 50% transparent black (alpha 0x88 is about 0.53, close to opacity: 0.5 in CSS).

Of course, in a perfect world, that would be…

\define color-code-regexp() ^#(?:[a-fA-F0-9]{3|4|6|8})$

I don’t think there’s a flavor out there that supports something like that, sadly.
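
One way to get close to that intent, at least in JavaScript, is to generate the alternation from a list of allowed lengths instead of writing it out by hand. This is only a sketch; `hexColorRegExp` is a made-up helper name, and it does not help inside a `\define`, where the pattern still has to be spelled out.

```javascript
// Character class for one hex digit.
const HEX_DIGIT = "[a-fA-F0-9]";

// Build ^#(?:H{3}|H{4}|...)$ from a list of allowed digit counts.
function hexColorRegExp(lengths) {
  const alternatives = lengths.map((n) => `${HEX_DIGIT}{${n}}`).join("|");
  return new RegExp(`^#(?:${alternatives})$`);
}

const colorCode = hexColorRegExp([3, 4, 6, 8]);

console.log(colorCode.source);
console.log(colorCode.test("#0008"), colorCode.test("#00000"));
```

A tempting shortcut such as `^#(?:[a-fA-F0-9]{3,4}){1,2}$` is shorter still, but it also accepts 7 digits (3 + 4), so the explicit list of lengths is safer.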