Tiddler Compression? Using LZ/LZMA?

It appears this idea hasn’t been explored much before. The closest things I’ve encountered are UglifyTW5 and JSZip. However, the concept I’m contemplating is more literal. Picture this:

Imagine a TiddlyWiki that compresses tiddler data and fields, but not the core itself. Similar to how sjcl.js handles encryption, the proposal is to compress tiddlers (excluding system/shadow ones) on save and decompress them on load. In my trials, LZMA typically reduced space usage by approximately 60%. Now imagine combining that with something like UglifyTW5 on a wiki the size of the main page.

Using some rough calculations based on the main page of TiddlyWiki, the numbers came out as follows:

  • TiddlyWiki 5 (5.3.3) = 7.30MB
  • TiddlyWiki 5 Uglified (5.3.3) = 6.20MB
  • TiddlyWiki 5 Uglified + LZMA.js Level 9 (5.3.3) = 3.66MB

This estimation involved determining the empty size of TiddlyWiki, exporting the non-system/shadow/plugin (user-only) tiddlers as CSV, compressing only the text fields (not titles or other field values) with the online LZMA.js demo, and then reintegrating the results back into TiddlyWiki in place of the uncompressed columns. Compressing the text fields brought the file from 7.30MB down to 3.66MB (about 50% smaller), not counting the ‘lossy’ compression from UglifyTW5.

Notably, since LZMA is lossless, anything you store comes back intact on load. This concept could be useful for a number of reasons. According to the LZMA.js demo, the text-field size was reduced by approximately 67% in 11.5 seconds. Moreover, LZMA.js claims compatibility as far back as IE6, which suggests it should work across all browsers, mobile, and HTA hacks/apps.

Additionally, LZMA.js allows users to choose their preferred compression level. In my tests it compressed around 2.2MB of data in roughly 3.6 seconds and decompressed the same data in about 0.5 seconds at level 1, achieving about 59% compression. At level 9, compression and decompression took roughly twice as long for approximately 8% more efficient storage. While 8% might seem modest, it’s impressive given TW5’s existing footprint.

Overall, combined with uglification, the concept appears to reduce the TW5 home page by approximately 50% without sacrificing functionality or data. At the very least, a plugin akin to the single-tiddler encryption plugin could be developed to compress individual tiddlers.

My grasp of JavaScript is limited, and when it intertwines with TiddlyWiki it becomes even more bewildering for me. An attempt I made in a sandbox to incorporate LZMA failed, unfortunately, and my effort to jerry-rig it went nowhere due to my lack of expertise in this area.

Thoughts?

TL;DR:

LZMA.js enables compression in browser environments; in theory, all or select tiddlers could be reduced by 50–70%. “Theoretical tinkering” resulted in a ~50% smaller wiki, from 7.3MB to 3.6MB, and I’m too dumb to implement any of this.

Links:

Demo Used to Test Text-Field Compression

A Demo of LZ-string vs LZMA.js


I did respond in detail at: What about Tiddler compression as a feature like encryption? (LZMA.js) · Jermolene/TiddlyWiki5 · Discussion #7920 · GitHub

My conclusion is:

TLDR;

If hard-disk space is a real concern, every modern OS can compress document folders transparently, on the fly.
Activating this option will compress every file where possible, which has a much higher impact than compressing the TW HTML files alone.

Very interesting idea! I’d like to see a demo/plugin to play with so I can check the performance.

I’ve contemplated this for quite a while now (I have mixed feelings about the html file I serve not being plaintext at large), as I transmit a ~60-65MB TW every few minutes through a Tor straw (not always with the option to compress either). I know those who download it would prefer it were smaller, and even gzip can be beaten. zstd may also be worth a gander (I get mine down to ~16MB with it, but that includes the core). To my eyes, hard disk space isn’t the relevant issue: it’s bandwidth, and, in some cases, the time it takes for anything functional to be rendered at all. For some contexts, there’s another step I can think of to help, which is a more robust multi-stage loading (similar to lazyloading), though I think that would require far more effort, IIUC.