TiddlyWiki syntax/grammar for the CodeMirror 6 plugin

While writing up my initial idea (see the details at the end of the post), I found out it’s too complex to start with.

I had a closer look at the Lezer Markdown parser.

It seems we could use it to get the basic TW highlighting going. The MD source code seems to be relatively straightforward.

The repo’s intro says the following (I highlighted the important part):

This is an incremental Markdown (CommonMark with support for extension) parser that integrates well with the Lezer parser system. It does not in fact use the Lezer runtime (that runs LR parsers, and Markdown can’t really be parsed that way), but it produces Lezer-style compact syntax trees and consumes fragments of such trees for its incremental parsing.

So it produces a tree that the CodeMirror highlighter can work with.

I think that could be the way to go.

Since the parser already implements an extension system (used, e.g., to implement GFM autolinking), we could use that system to implement TW-specific highlighting one rule at a time, which would make it less intimidating.

Those are my thoughts.

have fun!
-mario


Details: Initial idea, which goes nowhere

I think the right way to go would be to use `$tw.wiki.renderTiddler(outputType,title,options)` as the basis. `.renderTiddler` lets TW convert the parseTree into a widgetTree, which is then converted to HTML output.

See the code:

  • line 1229 – var parser = this.parseTiddler(title,options) instantiates the parser class and also caches the parseTree
  • line 1230 – widgetNode = this.makeWidget(parser,options); converts the parseTree into the widgetTree
  • line 1232 – widgetNode.render(container,null); creates the HTML output.

So instead of outputType="text/html" we would need e.g. outputType="application/vnd.lezer+json" or something similar.

  • instead of widgetNode we would get a lezer Tree
    • where each child is either a readonly Tree or a TreeBuffer

class TreeBuffer

Tree buffers contain (type, start, end, endIndex) quads for each node. In such a buffer, nodes are stored in prefix order (parents before children, with the endIndex of the parent indicating which children belong to it).

I think the TreeBuffer could be created from the information stored in the TW parseTree. The TW parseTree holds very similar information.
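To make the quoted layout a bit more concrete, here is a minimal sketch of flattening a nested tree into (type, start, end, endIndex) quads in prefix order. The node shape `{type, start, end, children}` and the function name `toQuads` are assumptions for illustration only; they are not the real TW parseTree format or a Lezer API, and `endIndex` follows my reading of the description above (the buffer index just past the node's last descendant).

```javascript
// Hypothetical node shape for illustration: {type: number, start, end, children: []}
function toQuads(node, buffer = []) {
  const index = buffer.length;
  // Write the parent first (prefix order); endIndex is patched afterwards
  buffer.push(node.type, node.start, node.end, 0);
  for (const child of node.children || []) {
    toQuads(child, buffer);
  }
  // Everything between index + 4 and endIndex belongs to this node
  buffer[index + 3] = buffer.length;
  return buffer;
}

// "''bold'' text": a paragraph (type 1) containing a bold span (type 2)
const sample = {
  type: 1, start: 0, end: 13,
  children: [{type: 2, start: 0, end: 8, children: []}]
};
const quads = toQuads(sample);
console.log(quads); // [1, 0, 13, 8, 2, 0, 8, 8]
```

A real TreeBuffer also needs numeric type ids that belong to a NodeSet, which is part of why this turns out to be harder than it looks.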

BUT

The lezer tree has a context that is created by the lezer-parser. So to convert a TW AST to a lezer-tree we would basically need to recreate the parser.

→ While writing this and studying the docs I see this goes nowhere. → Way too complex to start with.

Once phase 1 is done, there should be enough experience to implement a conversion from the TW AST to a “lezer Tree”.

Thanks @pmario , I’m happy about your feedback and your idea!

I believe you understand the Markdown source code way better than me. I’m not experienced in typescript, I’m not experienced in Javascript.

I believed that it’d be much easier. What the Markdown parser does is “creating that lezer tree step by step”, if I understood the code correctly.
What I want to do is “create the lezer tree at once” from the result of this.widget.wiki.parseText("text/vnd.tiddlywiki",input.doc.toString()); and return it in advance() when “parsedPos” is at the end of the document.

I don’t know if I’m “on the wrong road” but it was my initial idea… and I’m learning from experimenting with it

I was thinking about that too but then I had a closer look at that structure. The lezer tree is optimized for highlighters. So it has a lot of internal info to be fast. We would need to recreate that.

It may be possible to go from a TW tree to a lezer tree but not as the first project.

The MD parser was written by the author. So it should be a good base.

Once we have enough knowledge about the lezer tree it may be a possibility

This is what my code for the tiddlywiki parser looks like at the moment:

	switch(mode) {
		case "text/vnd.tiddlywiki":
			// Pull the Lezer building blocks out of the bundled CodeMirror object
			var {Tree,Parser,TreeBuffer, NodeType, NodeProp, NodePropSource, TreeFragment, NodeSet, TreeCursor, Input, PartialParse, SyntaxNode, ParseWrapper} = CM["@lezer/common"];
			var {html,htmlLanguage} = CM["@codemirror/lang-html"];
			var tiddlywikiParser = new Parser();
			tiddlywikiParser.createParse = function(input,fragments,ranges) {
				return {
					advance: function() {
						if(this.parsedPos === input.doc.toString().length) {
							var parseTree = self.widget.wiki.parseText("text/vnd.tiddlywiki",input.doc.toString()).tree;
							// Tree(type, children, positions, length): an empty tree spanning the whole document
							var tree = new Tree(NodeType.none,[],[],input.doc.toString().length);
							return tree;
						}
						return null;
					},
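					parsedPos: input.doc.toString().length,
					// stoppedAt is part of the PartialParse interface: null until stopAt() is called
					stoppedAt: null,
					stopAt: function(pos) {
						// Record the stop position instead of overwriting parsedPos, so advance() can still finish
						this.stoppedAt = pos;
					}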
				}
			};
			var data = defineLanguageFacet({commentTokens: {block: {open: "<!--", close: "-->"}}});
			var htmlNoMatch = html({matchClosingTags: false});
			var support = [htmlNoMatch.support];
			var lang = new Language(data,tiddlywikiParser,[],"tiddlywiki");
			var tiddlywiki = function() {
				return new LanguageSupport(lang,support);
			}
			editorExtensions.push(tiddlywiki());

This at the moment returns an empty tree - it should return the final lezer tree created from the parseTree

var parseTree = self.widget.wiki.parseText("text/vnd.tiddlywiki",input.doc.toString()).tree;
var tree = new Tree(NodeType.none,[],[],input.doc.toString().length);
// recursively enter the parseTree here and create a new Tree() from each node.
// identify the children of the first tree and add them (as Trees) to the Tree.children Array
return tree;

That’s where I’m struggling.
I need a recursive function. Maybe more.
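A minimal sketch of what such a recursive function could look like, under some loud assumptions: every parseTree node that should show up carries numeric `start` and `end` offsets, nodes without them are simply skipped, and the names `convertNode`, `convertParseTree`, `makeTree` and `typeFor` are made up for this example. The tree factory is injected so the sketch runs without Lezer; with Lezer loaded it could be something like `(type, children, positions, length) => new Tree(type, children, positions, length)`, and `typeFor` would then have to return real NodeType objects rather than strings.

```javascript
// Recursively convert one parse tree node into a (type, children, positions, length) tree
function convertNode(node, makeTree, typeFor) {
  const children = [];
  const positions = [];
  for (const child of node.children || []) {
    // Assumption: nodes without start/end carry no source range and are skipped
    if (typeof child.start !== "number" || typeof child.end !== "number") continue;
    children.push(convertNode(child, makeTree, typeFor));
    // Child positions are offsets relative to the start of the parent
    positions.push(child.start - node.start);
  }
  return makeTree(typeFor(node), children, positions, node.end - node.start);
}

function convertParseTree(parseTree, docLength, makeTree, typeFor) {
  // The top-level parse tree is an array, so wrap it in a synthetic root node
  const root = {type: "root", start: 0, end: docLength, children: parseTree};
  return convertNode(root, makeTree, typeFor);
}

// Plain-object stand-in for Tree, for demonstration only
const plainTree = (type, children, positions, length) => ({type, children, positions, length});

// Hand-written sample resembling a parse tree for "''bold'' text"
const sampleParseTree = [
  {type: "element", tag: "p", start: 0, end: 13, children: [
    {type: "element", tag: "strong", start: 0, end: 8, children: [
      {type: "text", text: "bold", start: 2, end: 6}
    ]},
    {type: "text", text: " text", start: 8, end: 13}
  ]}
];
const result = convertParseTree(sampleParseTree, 13, plainTree, (node) => node.tag || node.type);
console.log(JSON.stringify(result, null, 2));
```

This only covers the nesting and the relative positions. Node types, props and anything the highlighter expects beyond that would still have to be added.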

I’m currently reverse engineering the lezer markdown parser in Javascript
I also came to the conclusion that that’s the way to go - looking how it’s done there and learning from it


Many things that the lezer markdown parser does aren’t needed for the tiddlywiki parser because we do the parsing ourselves


Ok now some News:

I managed to recreate the markdown parser in Javascript and integrate it as tiddlywiki parser

I’d like to put the code on GitHub and see if someone is interested to jump in

By jumping in I mean changing the code so that it highlights tiddlywiki syntax correctly


Is anybody interested?

Thank you and best wishes,
Simon


Just in TiddlyWiki syntax/grammar for the CodeMirror 6 plugin - #20 by linonetwo

You mean you embed the lezer parser as a TW wikirule module?

Hi @linonetwo

I mean I converted the markdown parser to Javascript and integrated it in the codemirror engine. I use that as a starting point and change/extend it to “understand” tiddlywiki syntax

That would be overkill

Hi all, I’ve now figured out how to highlight and complete widgets (widget tags)

I’m getting close :slightly_smiling_face:
