Pretty fractions

Scott_Sauyet · January 1, 2024, 7:30pm

This is a summary of the findings in “Can we run a regex replace at render time?”

My goal was to make it easy in a wiki for the user to type fractions as normal ("2/3") but have them display in a prettier format ("²⁄₃"). I knew how to write a procedure to do this, but I didn’t want the user to have to invoke a proc (macro, function, whatever); they should just enter plain text. I understood how to use regular expression replacement to make the change, but not how to apply it without user intervention.

There was a great discussion and two very useful techniques emerged.

First, @btheado pointed out that the wikitext parser/formatter is built extensibly, and we could write our own rule for it:

I’ve posted a version of this that involves the following code:

/*\
title: wikiparser/rules/fraction.js
type: application/javascript
module-type: wikirule

Wiki text inline rule for prettifying simple fractions.

Turns, say, "3/4" into <sup>3</sup>⁄<sub>4</sub>

\*/
(function(){

/*jslint node: true, browser: true */
/*global $tw: false */
"use strict";

exports.name = "fraction";
exports.types = {inline: true};

exports.init = function(parser) {
	this.parser = parser;
	var regexText = "\\b(\\d+)\\/(\\d+)\\b";
	// Regexp for parser to match
	this.matchRegExp = new RegExp(regexText, "mg");
	// Regexp to collect numerator and denominator
	this.execRegExp = new RegExp(regexText);
};

exports.parse = function() {
	// Extract the numerator and denominator
	var matches = this.parser.source.slice(this.parser.pos).match(this.execRegExp);
	// Move past the match
	this.parser.pos = this.matchRegExp.lastIndex;


	// Return the sup-slash-sub elements
	return [{
		type: "element",
		tag: "sup",
		children: [{type: 'text', text: matches[1]}]
	}, {
		type: "text",
		text: "⁄"
	}, {
		type: "element",
		tag: "sub",
		children: [{type: 'text', text: matches[2]}]
	}];
};

})();

This was my first time writing such a rule, and it’s possible I’ve broken conventions of some sort here. But it seems to work well.
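To see what the rule produces without loading it into a wiki, here is a minimal standalone sketch. It reuses the same regular expression and emits the same node shapes, but `toNodes` and `toHtml` are hypothetical helpers written for illustration; they are not part of TiddlyWiki, and the real parser and renderer do far more (HTML escaping, for one).

```javascript
// Same pattern as the rule: digits, a slash, digits, bounded by word boundaries.
const FRACTION = /\b(\d+)\/(\d+)\b/g;
const FRACTION_SLASH = "\u2044"; // the Unicode fraction slash used in the rule

// Hypothetical helper: split a string into plain-text nodes and
// sup/slash/sub triples shaped like the ones the rule returns.
function toNodes(source) {
  const nodes = [];
  let pos = 0;
  for (const m of source.matchAll(FRACTION)) {
    if (m.index > pos) {
      nodes.push({type: "text", text: source.slice(pos, m.index)});
    }
    nodes.push(
      {type: "element", tag: "sup", children: [{type: "text", text: m[1]}]},
      {type: "text", text: FRACTION_SLASH},
      {type: "element", tag: "sub", children: [{type: "text", text: m[2]}]}
    );
    pos = m.index + m[0].length;
  }
  if (pos < source.length) {
    nodes.push({type: "text", text: source.slice(pos)});
  }
  return nodes;
}

// Hypothetical helper: naive serialisation, with no escaping of any kind.
function toHtml(nodes) {
  return nodes
    .map((n) => n.type === "text" ? n.text : `<${n.tag}>${toHtml(n.children)}</${n.tag}>`)
    .join("");
}

console.log(toHtml(toNodes("Add 3/4 cup sugar")));
//=> "Add <sup>3</sup>⁄<sub>4</sub> cup sugar"
```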

Second, @hoelzro pointed out that we can use the View Template Body Cascade to overwrite the default body template, and several people offered improvements.

My version of this uses

title: ViewTemplateBodyFilters/recipes
tags: $:/tags/ViewTemplateBodyFilter

[tag[Recipe]then[ViewTemplate/recipe]]

and

title: ViewTemplate/recipe

<$let filteredText={{{ [{!!text}search-replace:g:regexp[\b(\d+)/(\d+)\b],[^^$1^^⁄,,$2,,]] }}} >
<$transclude $variable="filteredText" $mode="block" />
</$let>
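For readers more at home in JavaScript, the replacement the filter performs can be approximated as below. This is only a sketch of the string transformation; `toWikitext` is a hypothetical name, and the result would still need to be rendered as wikitext, where `^^…^^` marks superscript and `,,…,,` marks subscript.

```javascript
// Same pattern as in the search-replace filter step.
const fractionPattern = /\b(\d+)\/(\d+)\b/g;

// Hypothetical helper: wrap numerator and denominator in TiddlyWiki's
// superscript (^^) and subscript (,,) markers around a fraction slash.
function toWikitext(text) {
  return text.replace(fractionPattern, "^^$1^^\u2044,,$2,,");
}

console.log(toWikitext("Mix 1/2 cup milk with 3/4 cup flour"));
//=> "Mix ^^1^^⁄,,2,, cup milk with ^^3^^⁄,,4,, cup flour"
```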

Both techniques do the job well. The wikirule technique has the advantage of running across all wikitext. Anywhere that //text// would generate italic text, this should also work, including in calls to <$wikify>. The implementation, though, is in JS, and will be more obscure to many readers.

The body cascade technique is more familiar, implemented in wikitext, and has the advantage of allowing us to target specific filters to choose where to run it. On the other hand, there’s no way that I can see to run it across all wikitext.

For my usage, I’ve chosen the wikirule technique, but can easily see changing that.

I also have to mention a third option from @TW_Tones:

Just using that Unicode character as the division sign automatically generates usable fractions. By replacing " / " with " ⁄ " in

1/4 + 1/2 = 3/4

we get this:

1⁄4 + 1⁄2 = 3⁄4

Notice that it looks different from

¹⁄₄ + ¹⁄₂ = ³⁄₄

I find it too small to be useful for this use-case, and for now the technique is not suitable for my purposes, as it still requires the user to apply the function. But it could be combined with either of the other two techniques if we wanted fractions at that font size, and I did post a version of it online as well.

There was also a suggestion from @buggyj that I never found time to investigate:

as well as a variant of the body template cascade from @Charlie_Veniot that instead of cascading, simply used CSS to hide the default body.

All in all, a fantastic thread!

Thanks to @btheado, @hoelzro, @TW_Tones, @buggyj, @Mohammad, @EricShulman, @CodaCoder, @Charlie_Veniot, and anyone else who participated!

CodaCoder · January 1, 2024, 8:12pm

Thanks for the diligence in writing this up, @Scott_Sauyet. Exemplary.

pmario · January 2, 2024, 11:49am

@jeremyruston – I think this “fraction parse rule” would be of value for the core too.

I would even activate it by default, because rendering fractions that way seems to be natural.

It’s easy to deactivate from ControlPanel → Info → Advanced → Parsing tab and with the \rules except fraction pragma.

@jeremyruston – What do you think? Should we create a PR for the core?

vilc · January 2, 2024, 2:00pm

Some examples that come to my mind that match the regex in the example above are:

- dates in day/month or month/day format, or even day/month/year
- house/door numbers
- sometimes telephone numbers, e.g. to separate the area code
- various arbitrary ID numbers and so on
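A quick way to check cases like these is to run the rule's regular expression over some sample strings. The samples below are made up for illustration, and `findFractions` is a hypothetical helper, not part of the posted rule.

```javascript
// The pattern from the fraction rule earlier in the thread.
const fractionRegex = /\b(\d+)\/(\d+)\b/g;

// Hypothetical helper: list every substring the rule would prettify.
function findFractions(text) {
  return text.match(fractionRegex) || [];
}

console.log(findFractions("3/4 cup sugar"));    //=> ["3/4"]
console.log(findFractions("due 12/25/2023"));   //=> ["12/25"]
console.log(findFractions("Flat 12/3"));        //=> ["12/3"]
console.log(findFractions("call 089/123456"));  //=> ["089/123456"]
console.log(findFractions("and/or"));           //=> []
```

Note that a three-part date is only partly matched: the first two numbers are taken as a fraction and the year is left as plain text, which would arguably look worse than leaving the date alone.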

I usually don’t use any of these formats, so I would welcome this parsing rule as a default one. But I guess many users could be using some of these formats, and they would have to turn it off in the settings or use the pragma.

But then changing the default settings is not necessarily a backward compatibility issue; the CamelCase linking was changed some versions ago to default off.

Scott_Sauyet · January 2, 2024, 2:13pm

I didn’t even realize that this was available. I could see adding it to the core, but for the reasons that @vilc points out, I don’t think it would be a good idea to have it on by default.

Yes, and I’m sure we could come up with more. Moreover, if we decided to extend it to algebraic fractions, rendering a/b as ^a⁄_b, we would also run into weird things such as and/or rendering as ^and⁄_or.

pmario · January 2, 2024, 3:02pm

OK. It should be an opt-in functionality.

For me personally the most important behaviour is that the standard TW search function needs to work. E.g., if I search for 1/2 it should show all tiddlers that contain a fraction in text.

For me this requirement rules out possibilities that have been discussed in the other thread.

Just some thoughts

Charlie_Veniot · January 2, 2024, 3:20pm

I think the approach I threw into the other thread works for search.

Scott_Sauyet · January 2, 2024, 3:23pm

Agreed. In fact, I think any solution that meets my initial design goal (have the user enter plain text fractions, but change the display to make them nicer) shouldn’t interfere with searching… although I admit I hadn’t tested for search.
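The reasoning can be sketched in a few lines: if the stored text keeps the plain "1/2" and only the displayed text is prettified, a search over the stored text still finds it, whereas saving the prettified form would not. The `prettify` function here is a hypothetical stand-in using Unicode superscript and subscript digits; it is not the code of any of the techniques above, and it says nothing about how TiddlyWiki's search is actually implemented.

```javascript
const SUPERSCRIPTS = "⁰¹²³⁴⁵⁶⁷⁸⁹";
const SUBSCRIPTS = "₀₁₂₃₄₅₆₇₈₉";

// Hypothetical display-time transform: "1/2" becomes "¹⁄₂".
function prettify(text) {
  return text.replace(/\b(\d+)\/(\d+)\b/g, (whole, num, den) => {
    const top = [...num].map((d) => SUPERSCRIPTS[Number(d)]).join("");
    const bottom = [...den].map((d) => SUBSCRIPTS[Number(d)]).join("");
    return top + "\u2044" + bottom;
  });
}

const stored = "Add 1/2 cup sugar";   // what the user typed and what is saved
const displayed = prettify(stored);   // what the reader sees

console.log(displayed);                  //=> "Add ¹⁄₂ cup sugar"
console.log(stored.includes("1/2"));     //=> true: searching stored text works
console.log(displayed.includes("1/2"));  //=> false: had this been saved instead
```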

TW_Tones · January 3, 2024, 12:52am

I think having number/number displayed with one of the other Unicode characters, or with the HTML approach, is reasonable, whether that happens at render time or in the parser. The advantage of the Unicode symbol is that it’s a simple replacement, and even the HTML is less verbose. On/off to start with, as you prefer.

However I think this would be the opportunity to introduce a mechanism to allow select characters to be replaced with a Unicode equivalent during parsing.
If built correctly it will be performant.

I can foresee other opportunities such as alternate bullets, and other special symbols opening other possibilities.

You could see this as partially opening the parser to allow a little hacking.
I am sure our smart community will discover other simple replacements whose only impact is in the final render. Unicode has a wide range of codes that can alter other codes. Many of these are available in the “Noto Sans” font (Noto is short for “no tofu”), which we could consider adding to the “font family”.
A way to escape or disable this feature may also be useful.
Multi-character replacement may be helpful, such as replacing +- with ±; basically, it permits “one or more” keyboard-entered characters to be mapped to “one or more” Unicode characters. See its wide range of uses here

I see an increasing number of cases where people are modifying core tiddlers (especially myself, but others too), or introducing new ones to handle edge cases like this. This now includes redefining core widgets with custom widgets. I think there is value in instead making it hackable, so that several such changes can coexist without clashes.

I think if we don’t provide that hackability, we may find a few more clashes occurring.

I do think we underutilise Unicode a lot, and can see things like editor dropdowns for the selection and insertion of nominated symbols, including Smilies, sentiments, would be helpful.

On Windows I have started using win+. for this but I don’t get to curate what is available.

TW_Tones · January 3, 2024, 1:05am

Just for completeness, remember that the KaTeX and AsciiMath plugins go some way to doing this already.

However it may be overly complex for most users.

Scott_Sauyet · January 3, 2024, 2:26am

I think this is a very good idea. I haven’t looked to see where it is, but a start at something like this is built in. A double hyphen is replaced by an en dash, and a triple one by an em dash. I don’t know if that code is written with extensibility in mind, though.

If not, though, this doesn’t sound horribly difficult. I can see a configuration panel in Control Panel > Info > Advanced, which simply lists typed characters and their replacements. A single parser step might handle all of them.

One twist I see is that many of the uses I would imagine involve characters which would reasonably appear in text, and so might need to be delineated. While +- probably doesn’t have too many legitimate usages not related to ±, I would love to have something for the section symbol (§) and for the degree symbol (°). These would most likely use text characters that have legitimate uses of their own, so they might be better marked up somehow, perhaps as :sec: and :deg:.

Although not much of a user of the editor buttons, I can also see an editor button that brings up a grid of such characters to insert, which inserts the Unicode as desired, or perhaps inserts the key such as +-, if that will help with searchability.

The JS would be fairly simple. Here is a dummy version which hard-codes the lookups rather than gathering them from a tiddler:

const transform = ((
  chars, 
  regex = new RegExp(Object.keys(chars).map(c => c.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")).join('|'), 'g')
) => (s) => s.replace(regex, (k) => chars[k]))({
  '+-': '±',
  ':deg:': '°',
  ':sec:': '§',
  ':para:': '¶',
})

transform('In :sec:3-:para:4a, we learn that we can alter the value by +-10:deg:')
//=> "In §3-¶4a, we learn that we can alter the value by ±10°"

(This uses more modern JS than is allowed in the core, but we can fix that later. This is only POC.)

There are interesting questions that would arise from trying to turn this into a production version – especially about caching the transform function and its internal regex, but invalidating the cache and rebuilding when the token list changes. But I’m sure there are reasonable solutions.
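One possible shape for that caching, as a rough sketch only: keep the compiled regex alongside a fingerprint of the token table, and rebuild when the fingerprint changes. Everything here is an assumption made for illustration. `createCachedTransform` and `getTokens` are hypothetical names, `getTokens` stands in for reading the table from a tiddler, and comparing a JSON string on every call is a simplification; a production version would presumably react to tiddler changes instead.

```javascript
// Escape regex metacharacters so keys such as "+-" match literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical factory: getTokens() returns the current {typed: replacement} table.
function createCachedTransform(getTokens) {
  let cachedKey = null;    // fingerprint of the table the regex was built from
  let cachedTokens = {};
  let cachedRegex = null;
  let rebuilds = 0;

  function refresh() {
    const tokens = getTokens();
    const key = JSON.stringify(tokens);
    if (key === cachedKey) return;  // table unchanged: keep the compiled regex
    // Longest keys first, so a key that is a prefix of another cannot win.
    const keys = Object.keys(tokens).sort((a, b) => b.length - a.length);
    cachedRegex = keys.length > 0
      ? new RegExp(keys.map(escapeRegExp).join("|"), "g")
      : null;
    cachedTokens = Object.assign({}, tokens);
    cachedKey = key;
    rebuilds += 1;
  }

  return {
    transform(text) {
      refresh();
      return cachedRegex ? text.replace(cachedRegex, (k) => cachedTokens[k]) : text;
    },
    rebuildCount() { return rebuilds; }
  };
}

let tokenTable = {"+-": "±", ":deg:": "°"};
const cached = createCachedTransform(() => tokenTable);

console.log(cached.transform("+-10:deg:"));  //=> "±10°"
tokenTable = Object.assign({}, tokenTable, {":sec:": "§"});
console.log(cached.transform("see :sec:3")); //=> "see §3"
console.log(cached.rebuildCount());          //=> 2
```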

vilc · January 3, 2024, 8:20am

I think that parse rules for things like § ° ¶ might be too specific to be generally useful. Where we put this line is arbitrary. It has been decided that rules for -- and --- are commonly useful enough to be in the core. We could argue about +-. But I see the following problems with parser rules for more “exotic” symbols like § ° ¶ (not just these three specifically, I mean them as an example):

- It makes the wikitext less useful outside of TW and less readable in the editor. While --, ---, and maybe +- are similarly readable to – — ±, :para: is much less readable than ¶.
- Depending on platform or keyboard layout, some users might not need them at all:
  - some keyboard layouts already have some of the symbols, e.g. the German layout has ° and §
  - afaik macOS has a default shortcut for en-dash and quite many exotic symbols available with the Option key, albeit somewhat hidden (not printed on keys)
- If I were using § a lot, I think that :sec: would be quite unwieldy anyway, so I would make some other custom solution.
- :sec: and :para: are not much of an improvement over the HTML entities &sect; and &para;

For things that do have a clear Unicode representation (so the pretty fractions with upper/lower index might be an exception here, but the “standard” ones like 1⁄4 are not), there is no parser rule needed. Some affordance to input the symbol would be enough:

The core already has snippets, improving this mechanism (e.g. a searchable list with aliases for symbols, grid layout to fit more symbols in small popup) would be more or less what you propose:

I have developed a good workflow using Auto Complete plugin and a dictionary tiddler that contains Unicode symbols with names/aliases, see here. So I can type e.g. ::deg followed by Enter, and the typed ::deg is changed to °. I think I’m just going to add ⁄ to that list and get the standard 1⁄4 fractions with this method.
This has the upside of showing up a list of suggestions and handles multiple aliases for one symbol, so it’s useful even for rarely used symbols for which one does not remember the exact name.
On Windows I use AutoHotkey with “hotstring” rules that immediately change e.g. \alpha to α.
Other than having the obvious disadvantage of being dependent on OS and ability to run AHK and my script, I catch myself forgetting the hotstrings I set up myself for rarely used symbols.
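The alias lookup behind such a workflow can be illustrated roughly as follows. This is not how the Auto Complete plugin is implemented; the dictionary entries, the `suggest` function, and the prefix-matching rule are all assumptions made for the sake of a small example.

```javascript
// Hypothetical dictionary: each symbol can be reached through several aliases.
const SYMBOLS = [
  {symbol: "°", aliases: ["deg", "degree"]},
  {symbol: "§", aliases: ["sec", "section"]},
  {symbol: "¶", aliases: ["para", "pilcrow"]},
  {symbol: "\u2044", aliases: ["frac", "fraction-slash"]},
];

// Hypothetical lookup: given what was typed (including the trigger),
// return every symbol with an alias starting with the typed prefix.
function suggest(typed, trigger = "::") {
  if (!typed.startsWith(trigger)) return [];
  const prefix = typed.slice(trigger.length).toLowerCase();
  if (prefix === "") return [];
  return SYMBOLS
    .filter((entry) => entry.aliases.some((alias) => alias.startsWith(prefix)))
    .map((entry) => entry.symbol);
}

console.log(suggest("::deg"));  //=> ["°"]
console.log(suggest("::se"));   //=> ["§"]
console.log(suggest(":deg"));   //=> [] (a single colon does not trigger)
```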

Scott_Sauyet · January 3, 2024, 1:57pm

Yes, I keep going back and forth on this, but you’re right. An easy way to enter the symbol would be enough. For those of us who prefer keyboard-based solutions and not editor buttons, it would be nice if it worked like emojis do in some environments, such as this Discourse editor I’m typing in now. If I type :smi, I get a small popup offering a collection of emojis. I can choose one, and have the preview show me the symbol.

In some editors (like GitHub), the inserted text is Unicode, others (Discourse) it’s formatted text that will render as Unicode, or with images from some emoji set. Either seems acceptable to me.

This looks extremely useful. But there’s an interesting question of whether this should make it into the core.

To my mind, none of this really addresses the fraction question that opens this thread, as there we don’t want to replace every /; we only want to replace {digits}/{digits} with the appropriate output, and it wouldn’t make sense in various cases as you enumerated earlier.

vilc · January 3, 2024, 3:07pm

Of course, your original idea about fractions remains valid and useful!

I didn’t mean it should. I just wanted to show that there are both an existing core mechanism (snippets) and plugins (AC) that can help in filling out Unicode symbols. And they seem better solutions to begin with than parser rules when it comes to simple Unicode chars (not necessarily for the fractions).

Btw my AC trigger can be configured to work very much like the Discourse/GH examples, where typing :smi and pressing Enter gives 😀. The double colon as trigger :: is just my choice, so that it won’t be triggered every time I use a single colon.

andrewg_oz · January 5, 2024, 9:52am

AsciiMath is intended for easy typing. My AsciiMath plugin with the KaTeX plugin come very close to what you want:

$am$1/4+1/2=3/4$$

Scott_Sauyet · January 5, 2024, 11:22am

Yes, I should have mentioned this as an option in the initial post. A mathematician by training, I have used and very much like AsciiMath and KaTeX.

But I don’t think it’s appropriate for our target audience.

The motivating example here is an attempt at a community-built recipe-tracking wiki. This would chiefly be a getting-started tool for new users. So the primary goal here is to allow users to type, say, 3/4 cup sugar and have it render nicely. There’s too much training involved for them to type $am$3/4$ cup sugar.

It’s a great tool, and thank you for creating it. But I don’t think it’s the right tool for this job.