Thanks, @pmario. Your whole post is very useful.
After much playing around, observing the behavior, and sometimes peeking at the code, I’ve come up with this long document about TW parsing modes. It is still a work in progress, but I thought I’d share what I have so far. After writing this explanation, I find I can easily figure out my mistakes when the parser does something I don’t quite expect. Hopefully reading it will give others the same skill.
Here is the tiddler I wrote and it can be dropped on tiddlywiki.com: Understanding WikiText parsing.json (8.1 KB)
For convenience, here is the html output (doesn’t render quite as well here as on tiddlywiki.com):
WikiText parser modes
The WikiText parser has three modes:
- block mode - the parser will recognize only block mode WikiText punctuation
- inline mode - the parser will recognize only inline WikiText punctuation
- ignore mode - the parser will ignore WikiText punctuation
WikiText recognized only while in block mode
WikiText which spans at least an entire line will only be recognized while the parser is in block mode. The common theme here is that at least one entire line of text is required to delimit the WikiText. Another common theme is their closing punctuation must come at the end of the line (in some cases the end of the line is the closing punctuation):
While the above WikiText types are only recognized in block mode, the text enclosed by most of them will be parsed in inline mode (Block Quotes in WikiText and Styles and Classes in WikiText are the two exceptions in which the parser will continue in block mode). While in inline mode the parser may encounter something which moves it to block mode (more on this later).
At the end of the terminating line, the parser will return to block mode.
If the punctuation for the above types of WikiText is encountered while the parser is in inline mode, it will be ignored and output as-is.
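To make the difference concrete, here is a small sketch that can be pasted into a new tiddler. It relies only on the behavior described above. The expectations in the note after it are what I would expect to see on tiddlywiki.com, not captured output.

```
* ! this heading punctuation sits inside a list item

<<<
* this list sits inside a multi-line block quote
<<<
```

The content of the list item is parsed in inline mode, so the "!" should come out as-is rather than producing a heading. The multi-line block quote keeps the parser in block mode, so the "*" line inside it should still render as a list item.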
WikiText recognized while in inline mode
The remaining WikiText can be expressed without an entire line of text. These constructs aren't required to be all on one line; the point is that they can be expressed within a single line, so more than one can appear within a single line. In other words, line endings are not involved when the parser tries to find where the particular WikiText begins and ends. These will be recognized while the parser is in inline mode:
The text enclosed by these WikiText types will be parsed in inline mode unless the parser encounters something which moves it to block mode (more on this in the next section).
The other inline mode WikiText types are technically only detected while the parser is in inline mode. However, the opening punctuation will also trigger the start of Paragraphs in WikiText, which will automatically cause the parser to go into inline mode. Therefore, practically speaking, it is just as useful to consider these WikiText types as recognized while the parser is in either inline mode or block mode.
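A minimal sketch of that last point, assuming the text is placed at the top level of a tiddler (so the parser starts out in block mode):

```
''Bold'' at the start of a line opens a paragraph,
* so this list punctuation is output as-is.

* After the blank line, this is a real list item.
```

The bold punctuation on the first line implicitly starts a paragraph, which puts the parser in inline mode. The "*" on the second line is therefore not treated as a list, while the one after the blank line is.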
When the parser transitions between inline and block mode
In this section, the term "block mode WikiText" is used as shorthand for "WikiText only recognized while in block mode" and the term "inline mode WikiText" is used as shorthand for "WikiText only recognized while in inline mode".
- The parser starts in block mode by default.
- The parser will move to inline mode when it encounters the start punctuation of any block mode WikiText construct except multi-line block quotes and multi-line style blocks.
The start "punctuation" for a paragraph is "invisible". Even for paragraphs the parser moves to inline mode.
- The parser will move back to block mode after the end of a line which terminates block mode WikiText.
- Transcluded text will inherit the current parsing mode unless the mode attribute of TranscludeWidget is used to override it.
- When the opening widget or HTML tag is followed by a blank line, the contents enclosed by the tag will be parsed in block mode. If the opening tag is not followed by a blank line, the parser will be placed in inline mode. This inline mode can only be "escaped" either by using another nested opening tag followed by a blank line or by using the mode attribute of TranscludeWidget.
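The blank-line rule from the last bullet can be tried with a sketch like the following. The div elements are just a convenient choice of HTML tag for illustration, and the described result is what I would expect rather than captured output.

```
<div>

* parsed in block mode, so this is a list item

</div>

<div>
* parsed in inline mode, so the asterisk is output as-is
</div>
```

The only difference between the two is the blank line after the opening tag. In the first div the parser is in block mode and recognizes the list. In the second it is in inline mode and the list punctuation is ignored.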
Examples
Block mode WikiText examples
Paragraphs are the most common WikiText. It is important to know they do not end until a blank line is encountered. Once a paragraph starts, the parser will be in inline mode. Until that blank line is encountered, other block mode syntax will be ignored:
This is a paragraph.
Only __//inline mode//__ punctuation is recognized here
Block mode punctuation will be <b>ignored</b> until
the paragraph ends (i.e. a blank line is encountered).
For example:
* List item punctuation is ignored
|tables|are|ignored|
! headings are ignored
Horizontal rules are ignored:
---
<<<
multi-line block quotes are ignored
<<<
That renders as:
This is a paragraph.
Only inline mode punctuation is recognized here
Block mode punctuation will be ignored until
the paragraph ends (i.e. a blank line is encountered).
For example:
* List item punctuation is ignored
|tables|are|ignored|
! headings are ignored
Horizontal rules are ignored:
—
<<<
multi-line block quotes are ignored
<<<
Most other block mode WikiText can end without the need for a blank line. Therefore, the parser will recognize the WikiText when written one line after the other with no blanks in between:
* list item one
* list item two
!!! heading
---
|cell 1|cell 2|
|cell 3|cell 4|
<<<
block quote
<<<
; Term
: Definition of that term
Paragraph can start here also, but it won't end until blank line
That renders as:
- list item one
- list item two
heading
cell 1 | cell 2
cell 3 | cell 4
block quote
- Term
- Definition of that term
Paragraph can start here also, but it won't end until blank line
Example changing the parser mode with the transclude widget
The HelpCommand tiddler contains a code block which requires the parser to be in block mode. List item content is parsed in inline mode. The mode attribute of the TranscludeWidget can be used to render in block mode as desired:
# one
# <$transclude tiddler=HelpCommand mode=block/>
# three
That renders as:
- one
-
Displays help text for a command:
--help [<command>]
If the command name is omitted then a list of available commands is displayed.
- three
Without the explicit request for block mode, the result of the rendering does not look right:
# one
# {{HelpCommand}}
# three
That renders as:
- one
- Displays help text for a command:
`
–help [<command>]
If the command name is omitted then a list of available commands is displayed.
- three
Example changing the parser mode with blank line after html/widget opening tag
TODO (also mention this approach does not work for Tables in WikiText syntax)
When the parser ignores WikiText (ignore mode)
TODO - explain how the text enclosed by these constructs is skipped by the parser
Parsing of widget and html attributes
TODO