Extracting Tiddlers from the JSON Store (TW5.2.0) with Ruby

This short Ruby script shows how to extract tiddlers from the tiddler store of a single-file TW 5.2.0 wiki. Hopefully it is mildly interesting and useful to someone.

Usage:
    $ tiddlers.rb full.html

Which results in something like:

"A free, open source wiki revisited" by Mark Gibbs, NetworkWorld
modified: 20160204225307847

"A Thesis Notebook" by Alberto Molina
modified: 20130302084548184

"ATWiki" by Lamusia Project
modified: 20210106151026834

"BJTools" by buggyj
modified: 20210106151026926

"BrainTest - tools for a digital brain" by Danielo Rodriguez
modified: 20210106151026982

"Cardo - Task and Project Management Wiki" by David Szego
modified: 20210106151026996

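...

The script:

require 'nokogiri'
require 'json'

# Path of the single-file wiki, taken from the command line.
fn = ARGV.shift or abort "usage: tiddlers.rb WIKI.html"

doc = Nokogiri::HTML(File.read(fn))

# TW 5.2.0 keeps its tiddlers as a JSON array inside this script element.
# Each matching element is parsed on its own, in case the file has more than one.
stores = doc.xpath("//script[@class='tiddlywiki-tiddler-store']")

stores.each do |store|
  JSON.parse(store.text).each do |tid|
    puts tid["title"]
    puts "modified: #{tid["modified"]}"
    puts
  end
end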

With the Nokogiri library, the contents of the tiddlers could also be modified and the wiki file saved again with the modifications, but that requires a slightly more complex script.
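A minimal sketch of such a modify-and-save step follows. Note that it does not use Nokogiri for the write-back: it replaces the store contents in the raw file string, so the rest of the HTML stays byte-for-byte unchanged. The names STORE_RE and update_store and the "reviewed" field are hypothetical, and the pattern assumes the class attribute comes first in the script tag, which is how I understand a TW 5.2.0 file to look. The output is compact JSON rather than one tiddler per line; the sketch assumes the wiki only needs valid JSON there.

```ruby
require 'json'

# Assumption: the store element starts with exactly this class attribute.
# Because "<" is escaped inside the store, the first "</script>" after the
# opening tag is the real closing tag, so a lazy match is enough.
STORE_RE = %r{(<script class="tiddlywiki-tiddler-store"[^>]*>)(.*?)(</script>)}m

# Hypothetical helper: yields every tiddler (a Hash) of the first store to
# the given block, then returns the HTML with the updated store written back.
def update_store(html)
  html.sub(STORE_RE) do
    m = Regexp.last_match
    open_tag, body, close_tag = m[1], m[2], m[3]
    tiddlers = JSON.parse(body)
    tiddlers.each { |tid| yield tid }
    # Re-escape "<" so tiddler text can never close the script element.
    # The block form of gsub inserts the replacement literally.
    open_tag + JSON.generate(tiddlers).gsub('<') { '\u003C' } + close_tag
  end
end

# Usage: ruby update.rb full.html modified.html
if __FILE__ == $0 && ARGV.length == 2
  src, dest = ARGV
  html = File.read(src, encoding: 'UTF-8')
  # "reviewed" is a made-up field, used only to show a modification.
  File.write(dest, update_store(html) { |tid| tid['reviewed'] = 'yes' })
end
```

Writing to a second file rather than overwriting the original keeps a backup in case the pattern does not match your wiki as expected; if nothing matches, the HTML comes back unchanged.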

Are you sure you undo the < to \u003C replacement? I can’t see it in the script.

A tiddler with this content:

Das ist ein <test> xxx </test>

is stored like this in the JSON store:

{"created":"20211009175026181","text":"Das ist ein \u003Ctest> xxx \u003C/test>","tags":"","title":"test","modified":"20211009175042346"}

FYI there’s some similar Ruby code here used by Tiddlyhost:

Edit: It’s probably hard to use it directly since there are minor dependencies on some Rails libraries (I’ve just realised).

There’s no special handling needed iiuc. The standard JSON parsing handles that encoding.

(I wasn’t certain, but I think https://simon.tiddlyhost.com/text/EscapingTest.tid confirms it, see Site #1 — misc notes and experiments .)

You are right. … But I think it is still important to mention that some “escape” mechanism is needed.

If somebody adds a JSON store to a wiki directly, they need to escape < as \u003C, both as best practice and for security reasons.