Bulk conversion of _canonical_uri tiddlers into embedded image tiddlers

Hello

Can anyone help with the bulk conversion of _canonical_uri tiddlers into embedded image tiddlers?

I have a travel blog where I needed to keep the size of the wiki as small as possible while travelling, but I now want to convert that wiki into a single-file HTML.

Thanks in anticipation, Rob

This might be related to yours

I saw this but I want to do the opposite conversion and I am not sufficiently competent to reverse the code!

I’m not a coding expert! But it strikes me that TW has no mechanism to parse the images that are external, though it can point to them, and channel them to the browser for display. When the _canonical_uri field holds a url, there’s no way to get TiddlyWiki to go “eat” those files and import their data as tiddlers.

But perhaps there are ways to batch-import them, and then to replace all the image references in some standardized way.

I’m personally very curious what we learn here, because I’m in the process of converting a tiddlyhost web-hosted wiki (which helps itself to lots of externally hosted images, both from the web and from my own servers) into a format that can be distributed to a set of 100% offline (locked down) laptops.

For my purpose, I’m not hoping to directly embed all images (I’m worried that would slow things down?), but I at least need to get all the web-hosted stuff into a neat local directory alongside the TiddlyWiki html.

For one data point, I have a wiki with no slowdown caused by including around 120 headshot photos, mostly between 200x200 and 500x500 pixels. I don’t know the number of images nor the total size that would start causing problems, but I’m not there yet.

Hi @Springer

I like your suggestion

But perhaps there are ways to batch-import them, and then to replace all the image references in some standardized way.

I’ll give that a try although my coding expertise might not be up to it. Let’s see

Thanks


Why can’t you just drag and drop your image files into your TW file? In the import dialog, change any incoming file names so that they match existing names. That is, so they’ll overwrite the old tiddler.

The only complication is that you will lose any tags. So it might be good to write a routine that would save the tag information and then reapply it after the import.

That might work @Mark_S but there are around 250 images so I need a batch solution.

I’m not sure a batch solution is much faster – I mean you can instantly select an entire directory of files. But let me repost the thing I just deleted. I actually felt a little silly for having made it.


This seems to work, but it is absolutely experimental. We always say to make a backup, but this time I really mean it.

The code below (made with the help of AI, yes) works in python3 (not earlier versions). You can change the line on the bottom to match your file image directory and desired output file. I used “output.json”.

Once you have your output file, drag and drop it into your TW file. Every title in the import dialogue should say “A tiddler with this title already exists”. If it doesn’t, then you need to change the incoming name to match. Then import. You made a backup, right?

So, this seems to work, but I don’t understand how TW creates its own image export. It puts a lot of stuff on the front of the base64 ASCII. I don’t know if any of that is important, but the import seems to work without it.

This was done on Linux. I don’t know if there will be problems on Windows, which doesn’t recognise the difference between “Title”, “tItle”, and “TITLE”.

Edit: If you do it this way, any tags or extra fields in your original tiddlers will be lost. So maybe someone will come up with a way to save and then re-apply those attributes after import.

import os
import base64
import json
from datetime import datetime, timezone

# Map file extensions to MIME types (".jpg" must become "image/jpeg").
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".bmp": "image/bmp",
    ".webp": "image/webp",
}

def image_to_base64(image_path):
    # Read the raw bytes and return them as base64 text.
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return encoded

def create_tiddler(image_path):
    image_name = os.path.basename(image_path)
    extension = os.path.splitext(image_name)[1].lower()
    encoded_image = image_to_base64(image_path)

    tiddler = {
        # TiddlyWiki timestamps are UTC in the form YYYYMMDDHHMMSSmmm.
        "created": datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S%f")[:17],
        "title": image_name,  # the file name is used as the tiddler title
        "type": MIME_TYPES[extension],
        "text": encoded_image,
        "tags": "image"
    }
    return tiddler

def pack_images_to_json(image_folder, output_file):
    tiddlers = []
    # Sort so the output order is the same on every run.
    for filename in sorted(os.listdir(image_folder)):
        if filename.lower().endswith(tuple(MIME_TYPES)):
            image_path = os.path.join(image_folder, filename)
            tiddlers.append(create_tiddler(image_path))

    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(tiddlers, f, indent=2)

# Example usage:
# pack_images_to_json("path/to/image/folder", "output.json")

pack_images_to_json("./files","output.json")
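On the lost tags and fields: one possible variant, sketched here and not tested against a real wiki, starts from a JSON export of the existing _canonical_uri tiddlers instead of building new tiddlers from scratch. It assumes the export is a JSON array of tiddler objects and that the file name at the end of each _canonical_uri matches a file in the local image folder. The names embed_images and convert_export, and the file names in the usage comment, are made up for illustration.

```python
import base64
import json
import os
from urllib.parse import unquote, urlparse


def embed_images(tiddlers, image_folder):
    """Return copies of the tiddlers with local image data embedded.

    Tiddlers without a _canonical_uri, or whose file is not found in
    image_folder, are returned unchanged.
    """
    result = []
    for tiddler in tiddlers:
        uri = tiddler.get("_canonical_uri")
        if not uri:
            result.append(dict(tiddler))
            continue
        # Assumption: the last path segment of the URI is the local file name.
        filename = os.path.basename(unquote(urlparse(uri).path))
        image_path = os.path.join(image_folder, filename)
        if not os.path.isfile(image_path):
            result.append(dict(tiddler))
            continue
        with open(image_path, "rb") as image_file:
            encoded = base64.b64encode(image_file.read()).decode("ascii")
        # Keep every existing field (tags, created, type, ...) except the URI.
        embedded = {k: v for k, v in tiddler.items() if k != "_canonical_uri"}
        embedded["text"] = encoded
        result.append(embedded)
    return result


def convert_export(export_file, image_folder, output_file):
    """Read an exported tiddler array, embed the images, write a new file."""
    with open(export_file, encoding="utf-8") as f:
        tiddlers = json.load(f)
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(embed_images(tiddlers, image_folder), f, indent=2)


# Example usage (hypothetical file names):
# convert_export("tiddlers.json", "./files", "output.json")
```

Because the titles come straight from the export, the import dialogue should report every tiddler as already existing, and the tags and extra fields travel along with the image data. The backup advice still applies.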

TW is not intended to be an image store. Images can be huge. Inside a wiki they need to be stored as base64-encoded text, so their size is increased by about 60%.

How big are your images?

Say you have images of about 4 MByte each, which is common for a decent JPG these days. With 250 images you will get an HTML file of about 1.6 GByte, which does not make any sense at all.

Even if your files are only half the size it will be a problem. I personally would not do this.

Mario

One thing I sometimes do is create a field with the basic name of the image (basename). Then I rewrite the _canonical_uri depending on what device I’m on. So for travel, the _canonical_uri will refer to a cloud storage site. But for local use (like a laptop), it will refer to a local site. This is done through a drop-down selector.

Thinking out loud, a travel version could also use thumbnail versions of the original images. Those 250 images at 5k apiece would only add about 2 megs.

Minor correction: it increases by about 33%. Every three (8-bit) bytes get encoded using four (essentially 6-bit) bytes.

But if your images are not this large, it may not be a problem. That has been my experience, with numerous smaller, embedded images (10 - 20 KB).
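For anyone who wants to check the arithmetic, here is a small sketch. The helper base64_size is my own name, and the 4 MB and 250 figures are simply the ones used earlier in this thread, not measurements.

```python
import base64


def base64_size(raw_bytes):
    """Length of the base64 text for raw_bytes bytes, including padding."""
    # Every group of up to three input bytes becomes four output characters.
    return 4 * ((raw_bytes + 2) // 3)


# Sanity check of the formula against the real encoder for a few lengths.
for n in (0, 1, 2, 3, 10, 1000):
    assert len(base64.b64encode(bytes(n))) == base64_size(n)

image_bytes = 4_000_000   # one 4 MB image
image_count = 250

overhead = base64_size(image_bytes) / image_bytes - 1
total_gb = image_count * base64_size(image_bytes) / 1e9

print(f"overhead per image: {overhead:.1%}")
print(f"{image_count} images: about {total_gb:.2f} GB of base64 text")
```

With those inputs the total comes out at roughly 1.33 GB rather than 1.6 GB, which is still far too large for a single HTML file. The same formula gives a quick estimate for smaller, resampled images.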

I would export the canonical_uri tiddlers to a JSON file and run a repeated text edit on them to change the code. Then import the new json file back into the wiki.

Sometimes I find it is easier to change the comma between fields into a tab and then import that file into a spreadsheet so you can manipulate the various columns. Export it as a CSV file and replace tabs with commas to get back to where you started.

As long as you keep the tiddlers’ titles the same, this will overwrite what is already there, maintaining any tags, fields, etc.

This is the code for my image tiddlers

As for different versions of image tiddlers, have a look at Fallback handling for $image widget. @EricShulman provided an edit for the $image widget that falls back to a second location if the first location does not have the image. I use that to keep a local image directory and a copy on a server, so that if the local directory is not available the image is pulled from the server copy.

This is how I configure the fallback location.

bobj
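The export, edit and re-import round trip described above could also be scripted rather than done by hand in a text editor or spreadsheet. The following is only a sketch under a few assumptions: the export is a JSON array of tiddler objects, every image should end up under one base location, and only the file name at the end of each _canonical_uri is kept. The names rewrite_uris and rewrite_export, and the paths in the usage comment, are hypothetical.

```python
import json
import posixpath
from urllib.parse import urlparse


def rewrite_uris(tiddlers, new_base):
    """Point every _canonical_uri at new_base, keeping only the file name."""
    base = new_base.rstrip("/")
    rewritten = []
    for tiddler in tiddlers:
        updated = dict(tiddler)  # copy, so tags and other fields are untouched
        uri = updated.get("_canonical_uri")
        if uri:
            # The file name stays percent-encoded, since the result is a URI.
            filename = posixpath.basename(urlparse(uri).path)
            if filename:
                updated["_canonical_uri"] = f"{base}/{filename}"
        rewritten.append(updated)
    return rewritten


def rewrite_export(export_file, new_base, output_file):
    """Read an exported tiddler array, rewrite the URIs, write a new file."""
    with open(export_file, encoding="utf-8") as f:
        tiddlers = json.load(f)
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(rewrite_uris(tiddlers, new_base), f, indent=2)


# Example usage (hypothetical paths):
# rewrite_export("tiddlers.json", "./images", "tiddlers-local.json")
```

Since the titles are not changed, importing the output file should overwrite the existing tiddlers in place.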

Hi @Mark_S

I did as you suggested and, hey presto, I was able to achieve what I wanted without any problems.

For your information my wiki with 450 images is about 75MB.

I use the read-only plugin by TonGerner
https://tongerner.tiddlyspot.com/#Readonly%20plugin
to restrict access to the wiki when I share it

Many thanks for the helpful suggestion.

Cheers, Rob

Rob, since it is likely to be slow to load, have you added a splash screen? It helps to stop people abandoning the site due to no apparent activity while loading.

No, but that is a very good idea. I have done just that.
Thanks @TW_Tones

Many thanks for the advice @pmario

For clarity, my wiki is being used as a travel blog with images, not as an image store. The images are resampled to about 150kb each.

Cheers, Rob