Parsing modes: inline vs. block

@tw-FRed, thanks for that link.

I found 3 widgets with a mode attribute that explicitly specifies which parsing mode to use. They are described in the related widget documentation:

https://tiddlywiki.com/#TranscludeWidget:TranscludeWidget%20ViewWidget%20WikifyWidget

In addition, the HTML in WikiText tiddler mentions that a blank line after the opening tag changes the parsing mode inside an HTML tag from the default inline to block. The same is true for Widgets in WikiText.

This filter will display the names of the rules which will only be parsed when in block mode:

[wikiparserrules[block]]

This filter displays the names of the rules which will be parsed when in either block or inline mode:

[wikiparserrules[inline]]

Even after reading these, I’m not able to come up with an all-encompassing definition for the two parsing modes.

Inline and block are terms that originate in HTML, which actually has more display types than these two. TiddlyWiki has chosen a similar approach in its widgets, but ultimately they are rendered into HTML.

In HTML this is usually set with the CSS display property; when it is not set, the element falls back to its default, which can also depend on what it sits within.

The beauty of TiddlyWiki is its use of global standards such as HTML, CSS and JavaScript, but it also means these rules are sometimes not restated in TiddlyWiki's documentation.

See the ListWidget documentation, https://tiddlywiki.com/#ListWidget, under Content and Attributes:

  • Otherwise, a default template is used consisting of a <span> or <div> element wrapped around a link to the item
  • If the list widget is completely empty (ie only whitespace between the opening and closing tags), then it behaves as if the content were a DIV or a SPAN containing a link to the current tiddler (it’s a DIV if the list widget is in block mode, or a SPAN if it is in inline mode)

With TWclassic (TWc) we had the problem that two text paragraphs weren’t rendered as HTML P-tags. … For example, the wikitext

Paragraph 1

Paragraph 2 

created this HTML output

Paragraph 1<br>
<br>
Paragraph 2<br>

TiddlyWiki 5 produces:

<p>Paragraph 1</p>
<p>Paragraph 2</p>

TW5 allows proper CSS styling for paragraphs. … TWc does not, because there are no P-tags.

That was the main reason why the wikitext parser has 2 different modes.

  • Block-mode, which tries to detect where paragraphs start and end. …
  • Inline-mode for elements like bold, italic and so on, which are identified once a block has been identified.

The primary rule to identify a wikitext paragraph is: A paragraph starts with text and ends with 2 newlines.

The problem with this rule is that it is sometimes too simple. It can break down with, for example:

Some ''bold'' text {{transclusion}}

The parser identifies this as 1 block and wraps it in a P-tag. … If the {{transclusion}} contains only simple text without block-level formatting such as headings, everything is OK.

  • Step 1: The parser always starts in block mode for tiddler content.
  • Step 2: The parser identifies the block and wraps it in an HTML paragraph tag.
  • Step 3: The parser switches to inline mode and parses the content of the block.
  • Step 4: It finds ''bold'' and wraps it in an HTML STRONG tag.
  • Step 5: It finds {{transclusion}} and, since the current parse state is “inline”, it parses the content of the transcluded tiddler in inline mode.

If there is some simple text that’s OK

If there is

! heading
and some text

it breaks down. … That’s why we have the mode=block parameter for the <$transclude> widget.
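
A toy model can make the steps above concrete. The following is a minimal Python sketch, not TiddlyWiki's actual parser: the tiddler store, the titles "plain" and "withHeading", and the function names are all assumptions made for illustration. It only knows paragraphs, headings, bold and transclusion.

```python
import re

# Hypothetical tiddler store; both titles are made up for this sketch.
TIDDLERS = {
    "plain": "just text",
    "withHeading": "! heading\nand some text",
}

def parse_inline(text):
    """Inline mode: handles ''bold'' and {{transclusion}}, never headings."""
    text = re.sub(r"''(.+?)''", r"<strong>\1</strong>", text)
    # A transclusion inherits the current (inline) mode.
    return re.sub(r"\{\{(.+?)\}\}",
                  lambda m: parse_inline(TIDDLERS[m.group(1)]), text)

def parse_block(text):
    """Block mode: leading heading lines are recognized, the rest is a paragraph."""
    out = []
    for chunk in re.split(r"\n\s*\n", text.strip()):
        lines = chunk.split("\n")
        while lines and lines[0].startswith("!"):
            line = lines.pop(0)
            level = len(line) - len(line.lstrip("!"))
            out.append(f"<h{level}>{parse_inline(line[level:].strip())}</h{level}>")
        if lines:
            out.append("<p>" + parse_inline("\n".join(lines)) + "</p>")
    return "".join(out)

def transclude(title, mode="inline"):
    """Toy stand-in for the widget's mode attribute."""
    parse = parse_block if mode == "block" else parse_inline
    return parse(TIDDLERS[title])
```

In this model, parse_block("Some ''bold'' text {{withHeading}}") leaves "! heading" as literal text inside the paragraph, while transclude("withHeading", mode="block") produces a heading followed by a paragraph.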

Hope that helps

-mario

Thanks, @pmario. Your whole post is very useful.

After much playing around, observing the behavior and sometimes peeking at the code, I’ve come up with this long document about TW parsing modes. It is still a work in progress and I thought I’d share what I have so far. After writing this explanation, I find I am able to easily figure out my mistakes when the parser does something I don’t quite expect. Hopefully reading it might provide the same skill to others.

Here is the tiddler I wrote and it can be dropped on tiddlywiki.com: Understanding WikiText parsing.json (8.1 KB)

For convenience, here is the html output (doesn’t render quite as well here as on tiddlywiki.com):


WikiText parser modes

The WikiText parser has three modes:

  • block mode - the parser will recognize only block mode WikiText punctuation
  • inline mode - the parser will recognize only inline WikiText punctuation
  • ignore mode - the parser will ignore WikiText punctuation

WikiText recognized only while in block mode

WikiText which spans at least an entire line will only be recognized while the parser is in block mode. The common theme here is that at least one entire line of text is required to delimit the WikiText. Another common theme is their closing punctuation must come at the end of the line (in some cases the end of the line is the closing punctuation):

WikiText: Punctuation

  • Block Quotes in WikiText: Multi-line block quotes are enclosed by lines containing only the text <<<; single line block quotes are also possible.
  • Code Blocks in WikiText: Enclosed by lines containing only the text ```
  • Definitions in WikiText: Each term is on its own line and each definition is on its own line.
  • Hard Linebreaks in WikiText: Enclosed by lines containing only the text """.
  • Headings in WikiText: Entire line starting with !.
  • Horizontal Rules in WikiText: A line containing only the text ---.
  • Lists in WikiText: Each list item is on its own line.
  • Paragraphs in WikiText: Any text other than the start punctuation of one of the other block mode WikiText will start a paragraph. Even the start punctuation of inline mode WikiText will start a paragraph. The parser includes all following lines into the paragraph until it encounters a blank line.
  • Styles and Classes in WikiText: Enclosed by lines starting with @@.
  • Tables in WikiText: Each table row is a line starting and ending with |.
  • Typed Blocks in WikiText: Enclosed by lines starting with $$$.

While the above WikiText types are only recognized in block mode, the text enclosed by most of them will be parsed in inline mode (Block Quotes in WikiText and Styles and Classes in WikiText are the two exceptions in which the parser will continue in block mode). While in inline mode the parser may encounter something which moves it to block mode (more on this later).

At the end of the terminating line, the parser will return to block mode.

Note: Hard Linebreaks in WikiText require an extra blank line after the trailing """ before the parser will return to block mode.

If the punctuation for the above types of WikiText is encountered while the parser is in inline mode, it will be ignored and output as-is.

WikiText recognized while in inline mode

The remaining WikiText can be expressed without an entire line of text. These constructs are not required to fit on one line, but they can be expressed within a single line, so more than one can appear on the same line. In other words, line endings are not involved while the parser tries to find where the particular WikiText begins and ends. These will be recognized while the parser is in inline mode:

The text enclosed by these WikiText types will be parsed in inline mode unless the parser encounters something which moves it to block mode (more on this in the next section).

Macro Calls in WikiText and Transclusion in WikiText will be recognized in block mode if the macro call or transclusion spans an entire line.

The other inline mode WikiText types are technically only detected while the parser is in inline mode. However, the opening punctuation will also trigger the start of Paragraphs in WikiText, which will automatically cause the parser to go into inline mode. Therefore, practically speaking, it is just as useful to consider these WikiText types as recognized while the parser is in either inline mode or block mode.

When the parser transitions between inline and block mode

In this section, the term "block mode WikiText" is used as shorthand for "WikiText only recognized while in block mode" and the term "inline mode WikiText" is used as shorthand for "WikiText only recognized while in inline mode".

  1. The parser starts in block mode by default.
  2. The parser will move to inline mode when it encounters the start punctuation of any block mode WikiText construct except multi-line block quotes and multi-line style blocks.

    The start "punctuation" for a paragraph is "invisible". Even for paragraphs the parser moves to inline mode.

  3. The parser will move back to block mode after the end of a line which terminates block mode WikiText.
  4. Transcluded text will inherit the current parsing mode unless the mode attribute of TranscludeWidget is used to override it.
  5. When the opening widget or HTML tag is followed by a blank line, then the contents enclosed by the tag will be parsed in block mode. If the opening tag is not followed by a blank line, then the parser will be placed in inline mode. This inline mode can only be "escaped" by either using another nested opening tag followed by a blank line or by using the mode attribute of TranscludeWidget.
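
Rule 5 above can be sketched as a small check on the text that follows an opening tag. This is a minimal Python illustration under assumptions: the regular expression and the function name content_mode are made up here and are not taken from TiddlyWiki's source, which may treat edge cases differently.

```python
import re

# Optional trailing spaces, a line break, then a line that is empty
# or holds only spaces/tabs (assumed definition of "blank line").
BLANK_LINE_AFTER_TAG = re.compile(r"[ \t]*\r?\n[ \t]*\r?\n")

def content_mode(source, pos):
    """Mode for element content, given `pos` just past the opening tag's '>'."""
    return "block" if BLANK_LINE_AFTER_TAG.match(source, pos) else "inline"

with_blank = "<div>\n\n* item\n</div>"
without_blank = "<div>\n* item\n</div>"
print(content_mode(with_blank, with_blank.index(">") + 1))        # block
print(content_mode(without_blank, without_blank.index(">") + 1))  # inline
```

In the first case the list item would be seen by block mode rules; in the second it would be treated as inline text.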

Examples

Block mode WikiText examples

Paragraphs are the most common WikiText. It is important to know they do not end until a blank line is encountered. Once a paragraph starts the parser will be in inline mode. Until that blank line is encountered other block mode syntax will be ignored:

This is a paragraph.
Only __//inline mode//__ punctuation is recognized here
Block mode punctuation will be <b>ignored</b> until
the paragraph ends (i.e. a blank line is encountered).
For example:
* List item punctuation is ignored
|tables|are|ignored|
! headings are ignored
Horizontal rules are ignored:
---
<<<
multi-line block quotes are ignored
<<<

That renders as:

This is a paragraph. Only inline mode punctuation is recognized here Block mode punctuation will be ignored until the paragraph ends (i.e. a blank line is encountered). For example: * List item punctuation is ignored |tables|are|ignored| ! headings are ignored Horizontal rules are ignored: — <<< multi-line block quotes are ignored <<<

Most other block mode WikiText can end without the need for a blank line. Therefore, the parser will recognize the WikiText when written one line after the other with no blanks in between:

* list item one
* list item two
!!! heading
---
|cell 1|cell 2|
|cell 3|cell 4|
<<<
block quote
<<<
; Term
: Definition of that term
Paragraph can start here also, but it won't end until blank line

That renders as:

  • list item one
  • list item two

heading


cell 1 cell 2
cell 3 cell 4

block quote

Term
Definition of that term

Paragraph can start here also, but it won't end until blank line
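
The difference between the two examples above can be mimicked with a toy block scanner. This Python sketch is an illustration only: the function name scan_blocks, the prefix table and the block labels are assumptions, and it covers just a handful of the block mode constructs. The point it shows is that most constructs end at the end of a line, while a paragraph swallows every line up to the next blank one.

```python
def scan_blocks(text):
    """Split text into (kind, lines) blocks the way a block-mode scanner might."""
    single_line = [("!", "heading"), ("---", "rule"), ("*", "list"), ("#", "list"),
                   ("|", "table"), (";", "definition"), (":", "definition")]
    lines, blocks, i = text.split("\n"), [], 0
    while i < len(lines):
        line = lines[i]
        if not line.strip():              # blank lines separate blocks
            i += 1
            continue
        if line.strip() == "<<<":         # multi-line block quote: read to closing <<<
            end = i + 1
            while end < len(lines) and lines[end].strip() != "<<<":
                end += 1
            blocks.append(("quote", lines[i + 1:end]))
            i = end + 1
            continue
        kind = next((k for prefix, k in single_line if line.startswith(prefix)), None)
        if kind:                          # these constructs end at the end of the line
            blocks.append((kind, [line]))
            i += 1
            continue
        end = i                           # paragraph: swallow lines until a blank one
        while end < len(lines) and lines[end].strip():
            end += 1
        blocks.append(("paragraph", lines[i:end]))
        i = end
    return blocks
```

Feeding it text that starts with an ordinary sentence returns a single paragraph block, whatever punctuation the following lines start with. Feeding it the second example returns one block per construct.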

Example changing the parser mode with the transclude widget

The HelpCommand tiddler contains a code block which requires the parser to be in block mode. List item content is parsed in inline mode. The mode attribute of the TranscludeWidget can be used to render in block mode as desired:

# one
# <$transclude tiddler=HelpCommand mode=block/>
# three

That renders as:

  1. one
  2. Displays help text for a command:

    --help [<command>]

    If the command name is omitted then a list of available commands is displayed.

  3. three

Without the explicit request for block mode, the result of the rendering does not look right:

# one
# {{HelpCommand}}
# three

That renders as:

  1. one
  2. Displays help text for a command:

    `
    –help [<command>]

    If the command name is omitted then a list of available commands is displayed.

  3. three

Example changing the parser mode with blank line after html/widget opening tag

TODO (also mention this approach does not work for Tables in WikiText syntax)

When the parser ignores WikiText (ignore mode)

TODO - explain how the text enclosed by these constructs is skipped by the parser

Parsing of widget and html attributes

TODO

As far as I can see … it looks right.

Hi @btheado excellent work, thank you. Very impressive to figure out the details and write about them coherently at the same time.

The only thing that raised a flag for me was the mention of “ignore mode”, which doesn’t quite match how I think about things. The parser manages a list of “rules” which can be seen in the control panel. Each rule is flagged as being one or more of “inline”, “block” or “pragma”.

Each rule only matches when the parser is in the right mode. It actually starts out in “pragma” mode where it accepts pragmas that start with a backslash. When the parser is invoked it is provided with the starting mode of either “inline” or “block”. Exactly as you explain, each rule behaves differently in terms of whether its content is parsed as inline or block.
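
The idea of rules flagged with one or more modes can be pictured as a lookup table. This is a minimal Python sketch with sample data: the rule names and their flags are chosen for illustration and should be checked against the control panel, and rules_for is a hypothetical helper, not a TiddlyWiki function.

```python
# Illustrative rule table; the names and flags are sample data for this sketch.
RULES = {
    "heading": {"block"},
    "list": {"block"},
    "bold": {"inline"},
    "html": {"inline", "block"},
    "macrodef": {"pragma"},
}

def rules_for(mode):
    """Names of the rules that may match while the parser is in `mode`."""
    return sorted(name for name, modes in RULES.items() if mode in modes)

print(rules_for("block"))   # ['heading', 'html', 'list']
print(rules_for("inline"))  # ['bold', 'html']
```

A rule flagged for more than one mode, like "html" in this table, shows up in both lists.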

There are some further subtleties with the “element” parse rule that may be worth exploring. It’s the rule that parses HTML elements like <div> and widgets like <$text>. The rules it uses are particularly complex. For example, it avoids generating paragraphs for elements where the opening tag is on a line by itself and followed by two newlines:

This is a paragraph

<div>

This is a paragraph within a div

</div>

I think it would be well worth working this up into a PR. There’s a lot of material in that tiddler, so perhaps it could be split somehow?

Thanks @jeremyruston!

The concept I’m trying to convey with “ignore mode” is that the parser will ignore any starting WikiText punctuation.

For example Linking in WikiText:
[[ only look for bar or close braces here | only look for close braces here ]]

Another example Macro Calls in WikiText:
<<mymacrocall the parser ignores wikitext here>>

In my writing I’m trying to make it clear to the user which (if any) WikiText punctuation will be recognized and where it will be recognized. What I’m calling “ignore mode” is an important piece of information. Maybe there is a different way I should be describing it?

I know the parser doesn’t have an explicit “ignore mode”. It just happens because the code doesn’t make any calls to the parseInline or parseBlock functions. The parser will only recognize punctuation specific to the internals of the current WikiText construct.
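
The link case can be sketched as follows. This is a toy Python illustration under assumptions: the regular expression and the name parse_link are made up here and are not TiddlyWiki's own. Inside the brackets only the bar and the closing brackets are treated as special, and nothing is handed to an inline or block parser, so other punctuation passes through untouched.

```python
import re

# Pretty link: [[text|target]] or [[target]]; only "|" and "]]" are special inside.
LINK = re.compile(r"\[\[(.*?)(?:\|(.*?))?\]\]")

def parse_link(source):
    """Return (text, target) for the first link; both parts are kept verbatim."""
    m = LINK.search(source)
    if not m:
        return None
    text = m.group(1)
    target = m.group(2) if m.group(2) is not None else text
    return text, target

print(parse_link("[[''not bold''|Some Tiddler]]"))  # ("''not bold''", 'Some Tiddler')
```

The bold markers survive as literal characters because no further parsing is applied to the link text.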

Yes, that is a whole topic I ignored so far, but it deserves to be addressed in order to make the documentation comprehensive.

Yes, I had planned that for one of the sections I marked with TODO.

I had already planned to discuss the inline vs. block mode of the html/widget content and how the blank line after the opening tag influences it. That behavior doesn’t require the opening tag to be on a line by itself. So I think you are pointing out a different subtlety specific to paragraph wrapping which happens when the opening tag also starts a new line.

I hadn’t noticed that behavior before and I’ll try to describe it as well.

Thanks for the encouragement and I will work on it. I agree the material needs to be split somehow.

I expanded and split the content and submitted a PR which has been merged. It is now available on tiddlywiki.com.

Here is a link to all the new tiddlers (and one tiddler which already existed but goes along with the others).

Ideas for improvement are welcome. Please share.

I also discovered @sobjornstad has a nice write up about block mode and inline mode in his excellent Grok TiddlyWiki.

Thank you @btheado, your contributions to the documentation are highly appreciated.

As @saqimtiaz says

It is important we have reference materials for people to understand this when needed.

As you say;

So;

Perhaps in future in the “WikiText Parser Modes” tiddler we could give a little high level explanation for newer users, or those without the knowledge of inline/block from other systems like html.

Note: TiddlyWiki “parses” the contents of your Tiddlers (text field) and interprets the content therein and applies the “WikiText” rules before displaying the result.

Note: Inline and block modes determine whether the results flow together in the same line or occur on separate lines, in blocks.

I agree the tiddler as written jumps right in without any high level explanation:

The WikiText parser has three modes:

The parser transitions between these modes based on the text it encounters. In addition, there are places where the parser ignores WikiText punctuation.

Maybe changing the first line to the following captures the intent of your first note (and following the style guide: https://tiddlywiki.com/#Reference%20Tiddlers):

In order to display Tiddlers (usually the text field), the WikiText parser reads and interprets the content and applies WikiText rules. The parser has three modes:

To me it seems like the inline and block mode of HTML is different enough from the WikiText parse modes that any mention of them should also contrast the two. As you say, in HTML it is about whether results flow together in the same line or not.

However for WikiText, the two modes are mainly about which syntax will be recognized.

But now that I inspect your w3schools link more closely I see that most of the WikiText recognized in block mode corresponds to HTML block level elements and the same is true for many of the inline WikiText.
