Create PDF Book from Collection of Tiddlers

Mohammad · August 19, 2021, 12:08pm

Yes @pmario, the requirements are different! I just mean that PDFs (ebooks, theses, reports, etc.) are very common and used on a daily basis!

We could aim to generate PDF books like the Wikipedia ones, and of course a good plugin would let the user customize the output!

Later, one can use a template here to produce the PDF of one's choice!

abesamma · August 19, 2021, 3:06pm

Glad to see this gap has been pointed out. This deserves further exploration with working code. I think something that allows flexible formatting of the PDF document output would be more desirable.

stobot · August 19, 2021, 3:29pm

This isn’t a direct answer per se, but I will say it’s a similar need to what’s provided by R Markdown. R is a statistical programming language I use for my work, and it has a markdown language and plugins that go from a few things (R, Python, SQL, etc.) through R Markdown to many things (HTML pages, PDF, Word, Beamer, HTML5 slides, books, dashboards, articles, websites, etc.).

While it’s pretty far off wikitext, given that it’s open-source, if somebody were to spend effort on this, they might want to leverage what’s been built there as well.

There is also a related project (bookdown) that goes all the way to full books, and I’ve learned R through textbooks that were created this way, which is common in data science.

https://rmarkdown.rstudio.com

clutterstack · August 19, 2021, 3:47pm

Wonder how Pandoc and RStudio’s tool compare. I would think we’d definitely want someone else to do the PDFing, so we would have to get our tiddlers into a format (markdown, HTML, ?) they could ingest.

TW_Tones · August 20, 2021, 7:52am

I have come late here, but I already produce PDF documents from TiddlyWiki. Just build a tiddler that displays the content you want in your PDF, bringing in content from as many tiddlers as you want.

Now open it in a new window, print (use shortcuts if necessary) and select a PDF printer.

It is easy to insert page breaks in the output (invisible to the interactive user) that force a new page when needed.
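
A minimal sketch of such a print-only page break, assuming the collected content is wrapped in block elements. The class name `new-page` is a hypothetical choice, and in TiddlyWiki the CSS rule could equally live in a tiddler tagged `$:/tags/Stylesheet`. On screen the class does nothing, so the interactive reader sees no difference.

```html
<style>
/* Only applies when printing (including "print to PDF") */
@media print {
  /* Hypothetical class: the marked block starts on a fresh page */
  .new-page { break-before: page; }
}
</style>

<section>
  <h2>Chapter one</h2>
  <p>Content collected from several tiddlers ...</p>
</section>

<!-- Looks the same as any other section on screen; starts a new page in print -->
<section class="new-page">
  <h2>Chapter two</h2>
  <p>More collected content ...</p>
</section>
```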

To me, the effort should be put into predictive page breaks and tools to support page layout in TiddlyWiki. Remember also that generating HTML output and saving it to files is trivial in TiddlyWiki, and HTML is often a valid import source for many other document apps.

TW_Tones · August 20, 2021, 7:53am

As a community we should add a converter to Pandoc to/from TiddlyWiki markup.

clutterstack · August 20, 2021, 2:25pm

@TW_Tones I wonder if this would be useful to some people, maybe coupled with some templates. A lot of my wikis’ HTML comes through macros, though, so if I wanted to generate PDF reports or something, I’d have to start at the HTML stage you were discussing in your previous post.

I think I’ve surpassed my current expertise on this topic and would have to do some reading to have any good opinions!

Best,
Chris

TW_Tones · August 21, 2021, 12:00am

Here is a little more on this subject.

If you install the “internals” plugin, the preview on tiddlers includes an HTML view, which can be copied and pasted elsewhere. Anything that renders on the screen is presented in HTML, and with a little trickery it can be exported or printed. Jeremy pointed out that if you save some output, such as a tiddler, as static HTML, rename it to docx and open it with Microsoft Word, you have converted it.

This approach, using any appropriate word processor, could be used for what we may call “print preparation”, i.e. all the content is collected and presented in TiddlyWiki, then you use a word processor for final print and PDF preparation using automatic and manual page breaks, headers, footers, page numbers, etc.

With the above workflow refined, many will be happy and need no more: they can control and organise their information in TiddlyWiki and use a fit-for-purpose tool for its sophisticated print preparation and output.
However, we can progressively move more of these print preparation methods into TiddlyWiki; with each new feature there will be a subset of documents that no longer need the additional print preparation.
I have already used HTML tables to page break with column headings/footers repeated as needed, and can print these directly to PDF/printer.
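
A rough sketch of the table approach, with invented sample rows. Browsers that paginate tables generally repeat the `thead` (and often the `tfoot`) rows on each printed page. The `display` values below are already the browser defaults and are restated only in case a theme overrides them. `break-inside: avoid` is a request that the browser may or may not honour.

```html
<style>
@media print {
  thead { display: table-header-group; } /* header rows repeat on each printed page */
  tfoot { display: table-footer-group; } /* footer rows may repeat as well */
  tr    { break-inside: avoid; }         /* ask the browser not to split a row across pages */
}
</style>

<table>
  <thead>
    <tr><th>Title</th><th>Summary</th></tr>
  </thead>
  <tbody>
    <!-- Sample rows; in practice these would be generated from tiddlers -->
    <tr><td>First tiddler</td><td>...</td></tr>
    <tr><td>Second tiddler</td><td>...</td></tr>
  </tbody>
  <tfoot>
    <tr><td colspan="2">Report footer</td></tr>
  </tfoot>
</table>
```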

Mark_S · August 23, 2021, 7:02pm

I’ve never seen a HTML-based solution for PDF that didn’t mess up somewhere. There are extensions that can convert your page to PDF, but they usually get the formatting wrong. Printing to PDF almost always cuts across lines and images.
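
For what it is worth, CSS does offer some hints that reduce cuts across images and lines when printing. The sketch below uses standard fragmentation properties, but support varies between browsers and none of them is a guarantee, so it may well not cure the problems described above.

```html
<style>
@media print {
  /* Ask the browser to keep these blocks on a single page */
  img, figure, pre, blockquote { break-inside: avoid; }

  /* Try to keep a heading together with the content that follows it */
  h1, h2, h3 { break-after: avoid; }

  /* Request at least three lines of a paragraph on either side of a break */
  p { orphans: 3; widows: 3; }
}
</style>
```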

What I do is copy the rendered text into LibreOffice or Word. These products have decades of experience with formatting text. Then I export to PDF, which is better than printing because you’re more likely to get exactly what you see. The whole process is very fast, at least if I’m using an existing document as a starting point.

Mark_S · August 23, 2021, 7:07pm

Somewhere on GG I posted a Lua filter for conversion to markdown (I think). Making an actual converter is difficult, but making a conversion filter is pretty straightforward, though it’s written in Lua, which isn’t even the base language of Pandoc.

Mohammad · August 24, 2021, 6:10am

Hi Mark,
How do you copy the rendered text, especially when the text is created from many tiddlers and produces a long page?

Jason_Cunliffe · July 11, 2023, 12:21am

Typora

TW_Tones · July 11, 2023, 12:26am

@Jason_Cunliffe please explain in your reply/post why you share a link when you do. The community should rightly be suspicious if someone posts links without explanation. They could be phishing. The last person who did a lot of this was eventually banned; their behaviour became quite suspect.

All you need to do is explain why you think a link is relevant to the topic and why someone may want to look at it.

DaveGifford · July 2, 2025, 6:11am

Hey Tones,

I am printing to pdf from my browser, but the page breaks aren’t working. Here is an example of a header that is not working. What might I be doing wrong?

<span class="subtopicburgundy" id="abreviaturas" style="page-break-before: always;">Abreviaturas</span>

EricShulman · July 2, 2025, 7:14am

First, note that page-break-before:always is deprecated. While it still works in some browsers, for more general compatibility, you should use break-before:page instead.

More importantly, this CSS only applies to “block elements that generate a box”. Thus, it can be used with a block-level element such as a non-empty div, but not with an inline span element.

Try using:

<div style="break-before:page">
<span class="subtopicburgundy" id="abreviaturas">Abreviaturas</span>
...
</div>

where the <div>...</div> encloses the entire page content.

-e

DaveGifford · July 2, 2025, 8:46am

Thanks for this Eric! Works like a charm.

TW_Tones · July 3, 2025, 2:49am

I also find break-after “more semantic”, because you are in effect saying that this content is for the current page and what follows is a new page. The worst you get is a blank page at the end, whereas a blank page at the beginning may make people think the print is broken.

The big issue here

One of the greatest problems in converting variable output to print is choosing when a page break will be needed and responding appropriately. This is in part because we don't know how much content can fit on a page given the fonts, margins and even print driver (WYSIWYG) in use. So we do not know where automatic page breaks will be forced on us before sending the output to the print driver.

So we tend to force a break before it is forced on us, leaving the bottom of many pages empty.
I am starting to wonder if PDF is still a valid output, now that most such content is not printed but browsed in its electronic form. Long documents with no page breaks are more practical, especially if you have a progress indicator and bookmarks. I did some preliminary research to see if we could generate infinite-length pages in PDF; no luck yet.
There are valid ways to break over pages, such as using tables where it's fine if the column headings are displayed again, but splitting a table because the last row does not fit could be corrected by reducing fonts in that table. Again, we can't see this until we attempt to print.

A good alternative output may be simple HTML with embedded fonts, images, etc. and no page breaks, but with a viewer that allows page breaks to be set if someone does in fact need to print.

I don't know if there are open standards on this, but there should be.

Remember that TiddlyWiki already allows exporting a “compound tiddler” as static HTML, which ultimately becomes an HTML page, so it would be possible to construct a template with even better layout and even some reading tools. However, you may want to avoid embedding JavaScript (perhaps use CSS calc instead), because a file containing JavaScript will rightly be viewed with suspicion and will not arrive in many mailboxes.

Reading tools may include a scrolling window, adjustable fonts, a position or percent indicator and more.
CSS may also use the view window to determine how many “pages” of view there are.

I am going to research this a little more, with the help of ChatGPT and Google.

Initial research

Makes mention of the EPUB format, which I believe some people have produced from TiddlyWiki.
See Export selected tiddlers to epub, but I don't know if it has the paging problem. However, it does point out that you can save as HTML and then use tools to convert HTML to EPUB, but perhaps also PDF? See Pandoc.
- I believe @jeremyruston has done things with EPUB professionally.
Printing to HTML
- Should allow the page to be as wide as possible, but ideally allow the width to be altered for readability.
- Be accessible to screen readers, semantic with HTML tags and headers, and include a TOC/index if possible, all of which we can do in TiddlyWiki.
- Browsers can search, resize and do a lot more on an HTML page, including with add-ons.
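
As a rough illustration of the points above, here is a hand-written static HTML skeleton with no JavaScript. The titles, ids and the `70ch` reading width are placeholder assumptions. It is not the output of any existing TiddlyWiki template.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Exported book</title>
<style>
  /* Readable line length; browser zoom still lets the reader adjust text size */
  body { max-width: 70ch; margin: 0 auto; padding: 1rem; line-height: 1.5; }
  nav  { border-bottom: 1px solid #ccc; margin-bottom: 2rem; }
</style>
</head>
<body>
  <header><h1>Exported book</h1></header>

  <!-- Labelled landmark so screen readers can find the table of contents -->
  <nav aria-label="Table of contents">
    <ol>
      <li><a href="#chapter-1">Chapter 1</a></li>
      <li><a href="#chapter-2">Chapter 2</a></li>
    </ol>
  </nav>

  <main>
    <section id="chapter-1"><h2>Chapter 1</h2><p>...</p></section>
    <section id="chapter-2"><h2>Chapter 2</h2><p>...</p></section>
  </main>
</body>
</html>
```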

ChatGPT has all the info needed to create a new static HTML output with a range of features that will assist screen reading. The example it gave can't be uploaded here, not even as a ZIP, which demonstrates the issue with HTML.

Big files should use compression

DaveGifford · July 3, 2025, 4:07am

Thanks Tones. I use break before because I insert them at headers. I don’t add breaks with the first header, so I don’t get blank pages.
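
One way this pattern could be expressed in a stylesheet rather than per header, assuming the headers are block-level `h2` elements that are siblings within the same parent (an assumption; the headers in the earlier example were spans with a class, and the first heading text here is a placeholder):

```html
<style>
@media print {
  /* Every h2 that follows an earlier sibling h2 starts a new page;
     the first h2 is not matched, so no blank page precedes it */
  h2 ~ h2 { break-before: page; }
}
</style>

<h2>First section</h2>
<p>...</p>

<h2 id="abreviaturas">Abreviaturas</h2>
<p>...</p>
```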

I normally put out my Spanish materials in long static HTML with or without breaks. But I was having trouble with a macro. It was printing to pdf fine, but not showing any of the content when exporting to statics. Never had that before. I am not sure if the problem is the macro itself, which doesn’t seem likely, or the customized system tiddlers for statics that don’t work with the macros.

Anyway, Eric’s solution worked for me. As they always do.

Thanks for the thoughts on printing and exporting. Good advice. Blessings

TiddlyTitch · July 3, 2025, 11:33am

Right. Is your solution interactive?

Just asking, TT.

TW_Tones · July 3, 2025, 12:55pm

My solution is to not have page breaks, although other solutions may help.