That’s correct, yes.
It might be helpful to look at some example Lezer trees, if you’ve found any?
Yes @jeremyruston
I’ve already looked at lezer trees
They are not too complex
What I’d have to do is recursively cycle through the tiddlywiki parse tree and create a lezer tree from it
I have a function that cycles through the parse tree but I need a way to create that new object from it containing a lezer tree for each child and subchild and so on…
Currently I can only post a link to Lezer Reference Manual
There you can see that the parse tree must be converted to a lezer Tree containing an array of children which are also lezer Trees (and their children must be lezer Trees and so on)
So what I’m trying to do is cycle through the parse tree and convert each node to a lezer tree. Where I’m struggling is putting it all together.
I cannot post an example of what I’m doing because I’m currently away from my desk AND I really doubt that I’m doing this correctly.
What I am sure of is that I’m in the correct function (createParse, see the Lezer Reference Manual), returning the resulting lezer tree in advance().
Just copy some code from tiddly-gittly/wikiast (Wiki AST transformation for WYSIWYG editor and advanced dynamic widgets in TiddlyWiki). It is similar to syntax-tree/mdast-util-to-nlcst (a utility to transform mdast to nlcst), which is a very standard way to do AST transformation. It is easy to write, because you only transform a single node in each file.
And it is easy to test. As you can see, I’ve written many tests. You can’t write this without unit tests, otherwise you will have many regressions.
Just read all the code in this folder: https://github.com/tiddly-gittly/wikiast/blob/master/packages/wikiast-util-to-slate-plate-ast/src/index.ts There isn’t much. And I have some notes about it.
While writing up my initial idea (see details at the end of the post), I found out it’s too complex to start with.
I did have a closer look at the lezer markdown-parser plugin.
It seems we could use it to get the basic TW highlighting going. The MD source code seems to be relatively straightforward.
The repo’s intro says the following (I highlighted the important part):
This is an incremental Markdown (CommonMark with support for extension) parser that integrates well with the Lezer parser system. It does not in fact use the Lezer runtime (that runs LR parsers, and Markdown can’t really be parsed that way), but it produces Lezer-style compact syntax trees and consumes fragments of such trees for its incremental parsing.
So it produces a tree that the CodeMirror highlighter can work with.
I think that could be the way to go.
Since the parser already implements an extension system (e.g. to implement GFM autolinking), we could use this system to implement TW-specific highlighting one rule at a time, which would make it less intimidating.
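To make the extension idea concrete, here is a minimal sketch of what one TW-specific inline rule could look like. It assumes the extension object shape that @lezer/markdown documents (defineNodes, parseInline with a parse(cx, next, pos) function). The node name Transclusion and the simple {{...}} matching are illustrative assumptions, not existing code.

```javascript
// Hypothetical extension: mark {{Title}} transclusions as their own node.
// Assumes the @lezer/markdown extension shape (defineNodes / parseInline).
const TiddlyWikiTransclusion = {
  defineNodes: ["Transclusion"],
  parseInline: [{
    name: "Transclusion",
    // next is the character code at pos; return -1 when the rule does not apply.
    parse(cx, next, pos) {
      // 123 is the character code of "{"
      if (next !== 123 || cx.char(pos + 1) !== 123) return -1;
      const rest = cx.slice(pos + 2, cx.end);
      const close = rest.indexOf("}}");
      if (close < 0) return -1;
      // Element spans the opening braces, the title and the closing braces.
      return cx.addElement(cx.elt("Transclusion", pos, pos + 2 + close + 2));
    }
  }]
};
```

Such an object would presumably be handed to the markdown parser via parser.configure(TiddlyWikiTransclusion), and each further TW rule could be added the same way.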
That’s my thoughts
have fun!
-mario
See the code:
var parser = this.parseTiddler(title,options)
instantiates the parser class and also caches the parseTree.
widgetNode = this.makeWidget(parser,options);
converts the parseTree into the widgetTree.
widgetNode.render(container,null);
creates the HTML output.
So instead of outputType="text/html" we would need e.g. outputType="application/vnd.lezer+json" or something similar.
Instead of a widgetNode we would get a lezer Tree or a TreeBuffer:
class TreeBuffer
Tree buffers contain (type, start, end, endIndex) quads for each node. In such a buffer, nodes are stored in prefix order (parents before children, with the endIndex of the parent indicating which children belong to it).
I think the TreeBuffer could be created with the information stored in the TW parseTree. The TW parseTree has very similar info.
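The quad layout quoted above can be illustrated with a small sketch. It only shows the data layout and is not wired to Lezer. It assumes that every TW parse tree node carries numeric start and end offsets and an optional children array, and the typeIds table is a hypothetical mapping from TW node types to numeric ids.

```javascript
// Hypothetical helper: flatten a TW-style parse tree into
// (type, start, end, endIndex) quads in prefix order.
function flattenToQuads(nodes, typeIds, out = []) {
  for (const node of nodes) {
    const at = out.length;
    // Write the parent first; endIndex is patched after its children are written.
    out.push(typeIds[node.type] ?? 0, node.start, node.end, 0);
    flattenToQuads(node.children || [], typeIds, out);
    // endIndex points just past the last descendant of this node.
    out[at + 3] = out.length;
  }
  return out;
}

// Usage with a tiny hand-written tree (assumed node shapes).
const exampleQuads = flattenToQuads(
  [{type: "element", start: 0, end: 11, children: [
    {type: "text", start: 0, end: 5},
    {type: "text", start: 6, end: 11}
  ]}],
  {element: 1, text: 2}
);
```

Everything between a node's own quad and its endIndex belongs to that node, which is how the parent/child relationship is recovered from the flat array.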
BUT
The lezer tree has a context that is created by the lezer-parser. So to convert a TW AST to a lezer-tree we would basically need to recreate the parser.
→ While writing this and studying the docs I see this goes nowhere. → Way too complex to start with.
Once phase 1 is done, there should be enough experience to implement a conversion from TW AST to “lezer Tree”.
Thanks @pmario , I’m happy about your feedback and your idea!
I believe you understand the Markdown source code way better than me. I’m not experienced in TypeScript, and I’m not experienced in JavaScript.
I believed that it’d be much easier. What the Markdown parser does is “creating that lezer tree step by step”, if I understood the code correctly.
What I want to do is “create the lezer tree at once” from the result of this.widget.wiki.parseText("text/vnd.tiddlywiki",input.doc.toString());
and return it in advance() when “parsedPos” is at the end of the document.
I don’t know if I’m “on the wrong road”, but it was my initial idea… and I’m learning from experimenting with it.
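A minimal sketch of that “all at once” idea, assuming the partial-parse shape from the Lezer Reference Manual (advance, parsedPos, stopAt, stoppedAt) and an input object with read(from, to) and length. The buildTree callback is hypothetical and stands for whatever turns the TW parse result into a lezer Tree.

```javascript
// Sketch: a partial parse that does all of its work in one advance() call.
// buildTree(text) is a hypothetical callback returning the finished lezer Tree.
function createOneShotParse(input, buildTree) {
  return {
    parsedPos: 0,
    stoppedAt: null,
    stopAt(pos) {
      // Only recorded here; a real implementation would likely need to respect it.
      this.stoppedAt = pos;
    },
    advance() {
      // Read the whole document, build the tree, and report everything as parsed.
      const text = input.read(0, input.length);
      const tree = buildTree(text);
      this.parsedPos = input.length;
      // Returning a tree (instead of null) signals that the parse is finished.
      return tree;
    }
  };
}
```

The object returned by createOneShotParse is what a createParse function would hand back. Returning null from advance() would instead mean “not done yet, call me again”.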
I was thinking about that too but then I had a closer look at that structure. The lezer tree is optimized for highlighters. So it has a lot of internal info to be fast. We would need to recreate that.
It may be possible to go from a TW tree to a lezer tree but not as the first project.
The MD parser was written by the author. So it should be a good base.
Once we have enough knowledge about the lezer tree it may be a possibility
This is what my code for the tiddlywiki parser looks like at the moment:
switch(mode) {
    case "text/vnd.tiddlywiki":
        var {Tree, Parser, TreeBuffer, NodeType, NodeProp, NodePropSource, TreeFragment, NodeSet, TreeCursor, Input, PartialParse, SyntaxNode, ParseWrapper} = CM["@lezer/common"];
        var {html,htmlLanguage} = CM["@codemirror/lang-html"];
        var tiddlywikiParser = new Parser();
        tiddlywikiParser.createParse = function(input,fragments,ranges) {
            return {
                advance: function() {
                    if(this.parsedPos === input.doc.toString().length) {
                        var parseTree = self.widget.wiki.parseText("text/vnd.tiddlywiki",input.doc.toString()).tree;
                        var tree = new Tree();
                        tree.type = NodeType.none;
                        tree.children = [];
                        tree.positions = [];
                        tree.length = input.doc.toString().length;
                        return tree;
                    }
                    return null;
                },
                parsedPos: input.doc.toString().length,
                stopAt: function(number) {
                    console.log(number);
                    this.parsedPos = number;
                    return number;
                }
            }
        };
        var data = defineLanguageFacet({commentTokens: {block: {open: "<!--", close: "-->"}}});
        var htmlNoMatch = html({matchClosingTags: false});
        var support = [htmlNoMatch.support];
        var lang = new Language(data,tiddlywikiParser,[],"tiddlywiki");
        var tiddlywiki = function() {
            return new LanguageSupport(lang,support);
        }
        editorExtensions.push(tiddlywiki());
At the moment this returns an empty tree. It should return the final lezer tree created from the parseTree:
var parseTree = self.widget.wiki.parseText("text/vnd.tiddlywiki",input.doc.toString()).tree;
var tree = new Tree(NodeType.none,[],[],input.doc.toString().length);
// recursively enter the parseTree here and create a new Tree() from each node.
// identify the children of the first tree and add them (as Trees) to the Tree.children Array
return tree;
That’s where I’m struggling.
I need a recursive function. Maybe more.
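One possible shape for such a recursive function, as a sketch only. It assumes that the parse tree nodes carry numeric start and end offsets, that lezer is an object providing the Tree class (for example CM["@lezer/common"]), and that typeFor is a hypothetical lookup from a parse tree node to a lezer NodeType.

```javascript
// Sketch: convert one parse tree node and all of its descendants into nested Trees.
// lezer must provide Tree; typeFor(node) is a hypothetical NodeType lookup.
function toLezerTree(node, lezer, typeFor) {
  const children = [];
  const positions = [];
  for (const child of node.children || []) {
    // Assumption: nodes without source offsets cannot be placed, so skip them.
    if (typeof child.start !== "number" || typeof child.end !== "number") continue;
    children.push(toLezerTree(child, lezer, typeFor));
    // Child positions are offsets relative to the start of the parent tree.
    positions.push(child.start - node.start);
  }
  return new lezer.Tree(typeFor(node), children, positions, node.end - node.start);
}
```

Since parseText(...).tree is an array, it could be wrapped in a synthetic root first, for example toLezerTree({type: "root", start: 0, end: text.length, children: parseTree}, lezer, typeFor). Defining the node types so that the highlighter recognises them is a separate step that is not shown here.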
I’m currently reverse engineering the lezer markdown parser in JavaScript.
I also came to the conclusion that that’s the way to go: looking at how it’s done there and learning from it.
Many things that the lezer markdown parser does aren’t needed for the tiddlywiki parser because we do the parsing ourselves
OK, now some news:
I managed to recreate the markdown parser in JavaScript and integrate it as the tiddlywiki parser.
I’d like to put the code on GitHub and see if someone is interested in jumping in.
By jumping in I mean changing the code so that it highlights tiddlywiki syntax correctly.
Is anybody interested?
Thank you and best wishes,
Simon
You mean you embed the Lezer parser as a TW wikirule module?
Hi @linonetwo
I mean I converted the markdown parser to JavaScript and integrated it into the CodeMirror engine. I use that as a starting point and change/extend it to “understand” tiddlywiki syntax.
That would be overkill
Hi all, I’ve now figured out how to highlight and complete widgets (widget tags)
I’m getting close