Help needed about HTML2TW: Macro

Mark_S · August 15, 2022, 4:18pm

Made some updates. Here’s what it looks like for 3 of your cases.

However, the problem with “Parameterised transclusions” is that there is syntax nested inside of tables. AFAIK, most block wikitext doesn’t work inside wikitext tables. So blockquotes, multi-line code blocks, headers – none of these work.

I’m thinking of removing the code that attempts to make wikitext tables and instead keeping tables as HTML, albeit simplified – stripped of classes and arranged so the tags are easy to see. That way other tags could then be rendered as wikitext. Wikitext purists could then change the html table to wikitext by hand, if that seems warranted.
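For anyone trying to picture the "simplified HTML table" idea, here is a minimal sketch in Python (not the macro's own code, and not how html2tw necessarily does it). The helper name `simplify_table_html` and the list of tags are assumptions made for illustration: it drops `class` attributes from table-related tags and puts the structural tags on their own lines.

```python
import re

# Hypothetical helper, not part of html2tw: tidy an HTML table before pasting it into a tiddler.
TABLE_TAGS = r"table|thead|tbody|tfoot|tr|th|td|caption"

def simplify_table_html(html: str) -> str:
    def strip_class(match):
        # Remove class="..." or class='...' from a single opening tag.
        return re.sub(r"""\s+class\s*=\s*("[^"]*"|'[^']*')""", "", match.group(0))

    # 1. Strip class attributes, but only inside table-related opening tags.
    html = re.sub(rf"<(?:{TABLE_TAGS})\b[^>]*>", strip_class, html, flags=re.IGNORECASE)
    # 2. Put table-level and row-level tags on their own lines so the structure is easy to see.
    html = re.sub(r"\s*(</?(?:table|thead|tbody|tfoot|tr)\b[^>]*>)\s*", r"\n\1\n",
                  html, flags=re.IGNORECASE)
    # 3. Drop the blank lines that step 2 leaves behind.
    return "\n".join(line.strip() for line in html.splitlines() if line.strip())

clipped = '<table class="x"><tr class="r"><td class="c">a</td><td>b</td></tr></table>'
tidy = simplify_table_html(clipped)
```

The cell contents are left untouched, so anything inside a `<td>` could still be converted separately.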

So, anyone else using this macro?

[Screenshots: Parameterized, Calendar, Colorize]

arunnbabu81 · August 15, 2022, 4:46pm

I saw some time back that you had updated the macro, and I checked it then.

Calendar tiddler was correctly converted.

Colour tags tiddler - I was not able to get it as shown in the image. How did you get it?

Parameterised tiddler - sometimes it worked except for the quotes, and sometimes it did not work at all.

I will test it when it’s available. I don’t know how often we need to clip from pages with a structure similar to the Parameterised one.

Mark_S · August 15, 2022, 5:03pm

Should be version 0.0.11.

Start selecting with the word “This” and select all the way down to the image. The html has to contain both the start and end code tags to be processed correctly.
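To make the "both the start and end tags" requirement concrete, here is a rough Python check that a clipped fragment has as many opening as closing tags of a given name. This is only a sketch: the thread does not say exactly which tags the macro keys on, so the `code` default and the helper name `has_complete_pairs` are assumptions.

```python
import re

def has_complete_pairs(fragment: str, tag: str = "code") -> bool:
    """True if the fragment has the same number of <tag ...> and </tag> markers."""
    name = re.escape(tag)
    opens = len(re.findall(rf"<{name}\b[^>]*>", fragment, flags=re.IGNORECASE))
    closes = len(re.findall(rf"</{name}\s*>", fragment, flags=re.IGNORECASE))
    return opens == closes

# A selection that starts too late loses the opening tag and fails the check.
good = has_complete_pairs("This <code>x = 1</code> and an image")
bad = has_complete_pairs("x = 1</code> and an image")
```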

arunnbabu81 · August 15, 2022, 5:20pm

I tried from your demo site. It’s still not working. Do the browser and OS matter? I am using Firefox on a MacBook.

Mark_S · August 15, 2022, 5:34pm

Oh, try it from the original post – Colorize system tags - #28 by TiddlyTweeter.

Hmm. You’re right. It’s not working on Firefox. I’ve been using Chromium on Linux. But if it’s not working on Firefox, that would probably mean that BJ’s clipper extension is returning different HTML in the two browsers.

arunnbabu81 · August 15, 2022, 5:47pm

It is working in Chrome for me.

The Parameterised tiddler shows a somewhat OK conversion at the end of the clipped text (except for the quotes) if I clip two times.

Mark_S · August 15, 2022, 5:59pm

It’s going to depend somewhat on where you stop and start selection, since that will determine if you have a complete set of tags. This is what my selection looks like (chromium):

arunnbabu81 · August 15, 2022, 6:10pm

When I selected in a similar fashion, I also got the same result, but it’s without any formatting.

Did you check the Parameterised tiddler in my demo wiki? I selected that whole reply and clipped two times. The first time the conversion was not right at all. The second time the conversion worked except for the quotes. Is that helpful info at all?

arunnbabu81 · September 17, 2022, 6:16pm

This is an appreciation post. I want to thank you, @Mark_S, for all your help with this macro. It greatly reduces the need to edit the clipped text, since it preserves the formatting needed. I am not saying that there aren’t any issues, but it is certainly of great value to me. Here is an example of a TW Google Group post clipped using TiddlyClip and this macro. I clipped each comment separately; no extra effort was needed to get the exact formatting, except for the initial code, which was not within triple backticks.
I added the authors’ names after clipping, for better structuring.

Do you use it in your wikis?

arunnbabu81 · September 18, 2022, 7:28pm

@Mark_S

Do the regexps tools mentioned in this post give any added benefit to the html2tw macro? You didn’t mention how to use them as a plugin. I would like to try it out.

Mark_S · September 19, 2022, 3:59am

I don’t think it would particularly benefit html2tw, since html2tw already uses regular expressions. But who knows.

Here’s a plugin. Be sure to back up before trying – it’s been a long time since I checked up on this, and this is the first time I bundled it into a plugin.

$__plugins_mas_regexps.json (4.9 KB)

TiddlyTitch · September 19, 2022, 10:48am

@Mark_S, your understanding of regex in TW is v. good at its various levels of use!

FYI, a professional Italian programmer friend of mine commented on this when I suggested he look at TW to deliver solutions to his clients well & cost-effectively (he mainly makes money from doing heavily customised WordPress sites for his clients ATM) …

He commented (translated): “TiddlyWiki looks like a serious ‘Regular Expression Machine’ that could be most useful for clients with specific needs. It might simplify a lot of things that currently I can only do via a server backend. It is interesting!”

Just a side comment. (Not meant to ruin the thread!)

Best, TT

TiddlyTitch · September 19, 2022, 10:51am

I will definitely look at that.

Grazie, TT

arunnbabu81 · September 19, 2022, 8:29pm

@Mark_S I added it to my wiki (after making a back up). Is there any way I can test it out?

Mark_S · September 19, 2022, 11:47pm

If you have the famous “HelloThere” text available, then this should give you a list of all regular words that have 10 or more characters.

<$vars regexp="(?g)\b\w{10,}\b">
<$list filter="[[HelloThere]get[text]regexps<regexp>]">

</$list>
</$vars>
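For readers who want to see what that pattern picks out without installing the plugin, here is a rough Python analogue. It is a sketch, not the plugin's implementation: the `regexps` function below is a stand-in written for illustration, and it assumes the `(?g)` prefix is the plugin's own marker for "return every match" (it is not standard regular expression syntax, so it is dropped here). The sample sentence is made up rather than taken from the real HelloThere tiddler.

```python
import re

def regexps(text: str, pattern: str) -> list:
    """Stand-in for the filter: return every non-overlapping match as a list of strings."""
    return [m.group(0) for m in re.finditer(pattern, text)]

# \b = word boundary, \w{10,} = ten or more word characters in a row.
sample = "TiddlyWiki is a non-linear personal web notebook for capturing information"
long_words = regexps(sample, r"\b\w{10,}\b")
print(long_words)  # ['TiddlyWiki', 'information']
```

Note that "non-linear" is not matched: the hyphen is not a word character, so it splits into two short words.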

TiddlyTitch · September 20, 2022, 10:35am

Nice example! It illustrates a simple TW syntax well, but also how one needs to understand regular expression syntax (JS).

I DO wonder how much an ordinary end-user would grasp that excellent regex?
This is NOT any criticism of you @Mark_S!
I think it an ace example!

BUT the caveat is this: How much non-TW stuff do you need to know to leverage Mother TW?

An implication is, maybe, this: how much should we be pointing to the “Other Thing” … i.e. external resources needed to master regex??

Or should we better illustrate it IN TW in a more methodical way? Like maybe a full scope TWegexer?

Just a simple query, TT.

arunnbabu81 · September 20, 2022, 6:30pm

Thank you @Mark_S. I will need to learn how to use regexp in TW. I have seen @Mohammad’s regexp site and have read about some of the basics of regexp there (it is a little complicated to understand, though). Is there any other place where I can read about regexp? Maybe once I understand how to write regexps, I will get more ideas about how to use them in my daily TW use.

Also, why were you suggesting this regexp plugin when I was asking about complex image extraction in this post? How can I use the plugin to do such image extraction?

Mark_S · September 20, 2022, 8:58pm

To tell the truth, I was hoping that Saq’s enigmatic references to a new toolset would save the day.

Maybe my solution isn’t the right path. But the following would extract the lightbox references from your text:


<$vars regexp="<<lightbox\s+.*?>>(?g)">
<$list filter="[[Cortical laminar necrosis]get[text]regexps<regexp>]">

</$list>
</$vars>

I suspect that learning regular expressions is considerably easier than “Cortical laminar necrosis”.
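The key part of the lightbox pattern is the lazy `.*?`, which stops at the first `>>` instead of running on to the last one. A small Python sketch of the difference follows; the macro parameters in the sample text are invented for illustration, and the `(?g)` marker is left out on the assumption that it only asks the plugin for all matches.

```python
import re

# Made-up tiddler text with two macro calls (the parameters are hypothetical).
text = 'Intro <<lightbox "img1.png" "Axial CT">> middle <<lightbox "img2.png">> end'

# Lazy: each match ends at the nearest ">>".
lazy = re.findall(r"<<lightbox\s+.*?>>", text)

# Greedy: one match that swallows everything up to the last ">>".
greedy = re.findall(r"<<lightbox\s+.*>>", text)

print(lazy)    # ['<<lightbox "img1.png" "Axial CT">>', '<<lightbox "img2.png">>']
print(len(greedy))  # 1
```

One caveat: by default `.` does not match a newline, so a macro call spread over several lines would not be picked up by either form.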

arunnbabu81 · September 20, 2022, 9:08pm

I too was waiting for his custom filters and macros to be released.

Mark_S:

Maybe my solution isn’t the right path. But the following would extract the lightbox references from your text:
<$vars regexp="<<lightbox\s+.*?>>(?g)">
<$list filter="[[Cortical laminar necrosis]get[text]regexps<regexp>]">

</$list>
</$vars>

I will test it out

TW_Tones · September 21, 2022, 12:23am

I recently restarted my endeavour to use regexp in TiddlyWiki to extract possibly nested “html”, widgets or arbitrary tags, e.g.:

<tagname attrib=atval ..>
Content
</tagname>

Returning the attribute/values in one variable and the content in another.
I would then be keen to process or render the content using a template or macro.

Of note is, on the internet, I stumbled across a lot of “opinionated” statements that regular expressions can’t do this with HTML because “HTML is not a regularised language”.

This sounds wrong, if not misleading, but it could explain why I have found it hard to achieve this, even with the help of others here. People have kindly pointed me in possible directions of a solution, but never quite to the TiddlyWiki solution.
A lot of people here have provided information towards this request of mine, but I have not yet been able to find my way to a solution based on this information.
A solution to this would allow a whole range of easy-to-use features:
- Apply custom CSS, formatting or tools to operate on the content of tags
- Allow actions to be applied to the content of a tag, such as sorting, reformatting, parsing, etc.
- Simultaneously format and selectively display content with regular CSS
- Hide such “sections” with CSS display: none; but from the view template “resurface the content” interactively.

I raise this here because it is all but identical to the OT, HTML > TiddlyWiki, but in this case allowing further TiddlyWiki handling to be applied.
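On the "regular language" point: the statement usually quoted is that HTML is not a regular language in the formal sense, meaning that matching arbitrarily deep open/close pairs needs a counter or a stack, which a classical regular expression does not have. A small parser that tracks nesting depth sidesteps this. The following is a sketch outside TiddlyWiki, using Python's standard `html.parser`; the class name `TagExtractor` is hypothetical, and it returns the attributes in one value and the inner content in another for each outermost matching element. Widget names that start with `$` would probably not be recognised as tags by this parser, so this covers plain HTML-style tags only.

```python
from html.parser import HTMLParser

class TagExtractor(HTMLParser):
    """Collect (attributes, inner content) for each outermost <tagname> element."""

    def __init__(self, tagname: str):
        super().__init__()
        self.tagname = tagname.lower()
        self.depth = 0      # how many <tagname> elements are currently open
        self.attrs = {}     # attributes of the current outermost element
        self.parts = []     # pieces of its inner content
        self.results = []   # finished (attrs, content) pairs

    def handle_starttag(self, tag, attrs):
        if tag == self.tagname:
            self.depth += 1
            if self.depth == 1:          # outermost element: start collecting
                self.attrs, self.parts = dict(attrs), []
                return
        if self.depth:                   # any tag inside: keep its source text
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if self.depth:                   # self-closing tag such as <br/>
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == self.tagname and self.depth:
            self.depth -= 1
            if self.depth == 0:          # outermost element closed: store result
                self.results.append((self.attrs, "".join(self.parts)))
                return
        if self.depth:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

parser = TagExtractor("tagname")
parser.feed('<tagname attrib=atval id="a">Outer <tagname>inner</tagname> text</tagname>')
parser.close()
attrs, content = parser.results[0]
```

Here `attrs` holds the attribute/value pairs and `content` the raw inner markup, nested element included, which could then be handed to a template or processed further.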