Scripts for Backing Up TiddlyHost

Simon has posted a script to make backups of your site. I found it was fairly easy to modify it so that it would download (back up) each file in a list of TW files.

Whatever method you use, this might be a good time to make sure you have a local copy of your files.

Would you care to share that? (…and perhaps a few words on how to use it.) I have 142 tiddlyhost sites…

Well, it runs under Linux. It can probably run under Cygwin (Windows).

Unzip the attached file into your download directory. Edit the sh file.

There’s a section at the top to put your site name, email, and password. Scroll down to the variable “strings”, where there’s a section to list your Tiddlyhost files.

Save your changes. Be sure to make the .sh file executable (e.g. chmod 744 thost-downloader-public.sh).

Run the file. Your files should download into your current directory.
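
Conceptually, all the script does for each site is fetch the site’s HTML over HTTP and save it to disk. A rough sketch of that one step, shown in PowerShell for illustration (not the attached script itself; the site name is a placeholder, and private sites also need the login the script performs with your email and password):

# Rough illustration only -- not the attached script
$site = "mysite"                                  # placeholder Tiddlyhost site name
Invoke-WebRequest -Uri "https://$site.tiddlyhost.com/" -OutFile "./$site.html"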

If I had 142 files, I think I would break this into groups of 10 initially in order to verify everything is working. Of my site files, I found that one older one, possibly going back to the days of TiddlySpace (? what was it?), did not download. I didn’t pursue the issue. It wouldn’t surprise me if there was some sort of time-out for people with 142 files.

It occurs to me that a good project for someone much smarter than me would be to deploy this as a GitHub Action. This would eliminate the need for a Linux environment, and could download directly to your GitHub repository.

https://filedn.com/lS4tJjRXRq8jyjPabjSqR2k/TiddlyWiki-shared/thost-downloader-public.sh.zip


Well, it runs under Linux. It can probably run under Cygwin (Windows).

Thank you, Mark. I didn’t realize those were the premises, so I’m thinking maybe I’ll just back up the more important ones. I’ll first convince myself that it is “A wonderful opportunity for reviewing my stuff.”

Backups are good… But TiddlyHost also archives my backup history (on a paid plan), and is — as far as I know — the only game in town for functioning smoothly — “plug and play” as it were — as a web host.

(I did work with GitHub for a couple of projects, but sending students to a GitHub URL — and/or reconfiguring all the hard-coded iframe links within the LMS — is quite awkward! Also, GitHub saves are less “real-time” — they sometimes take a while to show up, and/or a subsequent save gets stuck in a bottleneck if it’s too soon after another save.)

That kind of backup is handy if you need to “time-travel”, to recover something you accidentally wrote over. But as an actual backup … you’re putting all your eggs in one basket, as the saying goes.

Of course, I back up my files locally on a regular basis, and with greater frequency whenever I’m doing big adventurous work — like “clipping in” while rock-climbing.

But my point was that if TiddlyHost were to cease to be available (or to cease to be reliable), then an important feature (the fine-grained incremental backups — recently enhanced with labels, thanks to @simon) would be effectively lost. I have all those eggs in that one basket, because otherwise it would not be realistic to keep all those eggs at all :wink: I don’t think I’m confused about the importance of backing up. I’m simply wringing my hands over various ways in which TiddlyHost has been amazing, and (for my own case) there’s no other path that puts all those features together in such a lovely “just works” [until now :grimacing:] way.

(As of just now, I tried to open three tiddlyhost sites: one was zippy, one took 30 seconds, and one gave me a gateway timeout after 60. Really feels random right now — not in a “small chance of a problem” way, but “small chance it’ll work responsively” way.)


OK, here’s a script that runs in PowerShell. But I have only tested it on Linux, so I might need some feedback; there might be path discrepancies. It would be neat if the script could retry files after timeouts, but that is more than I am up to today.

With this version, you make a file for your password, a file for your user id (email address), and a file for your list of files. From the comments:

# Example usage:
#   <this scriptname> [FileList] [Password File] [User Id File] [Download Dir]
# Where
#   [FileList] is a file with a list of projects/files you want to download.
#      Do not include the .html extension.
#   [Password File] is file with your password (so it doesn't need to be hard-coded here)
#      Defaults to "./thostpass.txt"
#   [User Id File] is a file with your user id (usually your email)
#      Defaults to "./user.txt"
#   [Download Dir] is the name of the directory where you want your downloads to go.
#      Defaults to "." (current directory)

So FileList is a file with a list of the files you want to process. One name per line.
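
For example, the input files could be created like this (the file names match the defaults in the comments above; the password, email, and site names are placeholders):

# Create the input files -- names follow the script defaults, contents are placeholders
Set-Content -Path ./thostpass.txt -Value 'my-tiddlyhost-password'
Set-Content -Path ./user.txt -Value 'me@example.com'
Set-Content -Path ./files.txt -Value @('mysite', 'another-wiki', 'project-notes')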

You can also specify parameters with parameter names, rather than by order.
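
Either way, a run looks something like this (substitute whatever name you saved the script under; anything after the file list falls back to the defaults above if you leave it off):

# Placeholder script name -- use whatever you saved the script as
./thost-downloader.ps1 ./files.txt ./thostpass.txt ./user.txt ./backups
./thost-downloader.ps1 ./files.txt     # password file, user id file, and download dir use the defaults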

If you have an error with one file, it will skip to the next. So watch the output for error messages. This is different from the bash shell script, where an error with one file would terminate the program.
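
In outline, the loop is something like this (a simplified sketch, not the script verbatim; $FileList and $DownloadDir stand for the corresponding parameters, and the login details are left out):

# Simplified sketch of the skip-on-error loop (authentication omitted)
foreach ($site in Get-Content $FileList) {
    try {
        Invoke-WebRequest -Uri "https://$site.tiddlyhost.com/" -OutFile "$DownloadDir/$site.html"
        Write-Output "Downloaded $site"
    }
    catch {
        Write-Warning "Failed to download ${site}: $($_.Exception.Message)"
        # carry on with the next file instead of stopping
    }
}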

In terms of warnings, well, do not try this where it will overwrite your existing backups … in case there’s something wrong with these backups. If you try this, let me know whether the status field returns a number, because on Linux it doesn’t.


That would be great! The caveat is that I haven’t heard back from any Windows users.

Have you used it yourself?

For testing of course – on Linux.

PowerShell runs on Windows, Linux, and macOS. I think PowerShell comes by default on Windows 10 and 11. It may have slight differences in capabilities from the Linux/Mac version, but (I think) these usually relate to access to the hardware, which should be irrelevant for this use. It would be handy if someone out there (@twMat?) gave the script a test run on Windows.
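
You can check which version you have from a PowerShell prompt:

# Prints the version of PowerShell you are running
$PSVersionTable.PSVersion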

I’ll give it a try within a day or two. I didn’t previously because, by the time you kindly posted it, I had already backed up almost everything manually. But I realize now that I probably didn’t communicate that! Thank you @Mark_S

And, of course, a huge thank you to @simon! Speed seems to be back to normal now! Regarding:

installing a robots.txt file to instruct web-crawlers not to relentlessly follow every filter and sort link in the “Explore” page.

…it’s a bit strange, though, that the problems arose suddenly rather than gradually. If web-crawlers were always “following every filter…”, why would it change all of a sudden…? Or maybe the crawlers themselves, or how they do their thing, changed ~2 weeks ago? Or maybe there was a sudden influx of users? Or an abrupt change in how users use the system?

I’m working with this backup script, and it keeps stalling. My feeling is that the site problem isn’t over yet.

I have gotten the PowerShell script to work on my old Win 7 machine. I forgot that PowerShell actually works better under Linux than Windows. There are a couple of extra steps people will need to know about.
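
A step that typically comes up on Windows is the script execution policy, which blocks unsigned .ps1 files by default. Assuming that is the blocker here, this lifts it for the current session only:

# Allow scripts to run in this PowerShell session only (assumes the execution policy is what blocks the script)
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass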

I’m adding code to retry download attempts.


@Mark_S many thanks for the script. Until now I was using the bash script with some modifications for multiple wikis through WSL. This will make things much easier on Windows.

I have tried it on Windows 10 and it has worked well so far – it correctly downloaded a couple of wikis. I really like how easy it is to configure with the file list.

In a rather locked-down business environment I get the error “New-Object : Cannot create type. Only core types are supported in this language mode.” on line 44 ($WebSession = New-Object Microsoft.PowerShell.Commands.WebRequestSession). I’ll give it a try at home later today.
However, it seems this line is not necessary anyway; the $WebSession variable is only referenced in commented-out code.

Btw, @simon, is it possible to download a local core version of external core wikis (and the core js) using these scripted methods?
I tried appending ?mode=local_core (like in the TH sites panel) to the URL, but with no effect.
This would make scripted downloads a better tool for backing up in case of TH issues.


It works.

Long live Progeny Of Polly!!! (PoP)

TT, x


I have a new version of the downloader in PS. It has code to trap and retry files that have timed out, up to 5 times. Unfortunately, it has only been lightly tested because … TiddlyHost has been working too well!
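
The retry logic is roughly this shape (a simplified sketch, not the script verbatim; $sites stands for the list read from your file list, and authentication is omitted):

# Simplified sketch: up to 5 attempts per file, with a timeout on each request
$maxAttempts = 5
foreach ($site in $sites) {
    for ($attempt = 1; $attempt -le $maxAttempts; $attempt++) {
        try {
            Invoke-WebRequest -Uri "https://$site.tiddlyhost.com/" -OutFile "./$site.html" -TimeoutSec 60
            break    # success -- move on to the next file
        }
        catch {
            Write-Warning "Attempt $attempt of $maxAttempts failed for $site"
        }
    }
}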

I’ve downloaded 12 out of 12 files every time I’ve tested over the last 12 hours. Hopefully this means TH is back in the game.


Things seem to work well on TH right now, but regardless:

OK, I’m looking into your Tiddlyhost downloader script now (thank you!) and immediately, even before running it, I hit a “conceptual” problem for my use case:

[FileList] is a file with a list of projects/files you want to download.

First, I’m guessing “file” refers to the wiki names seen in the tiddlyhost.com/sites list… right? (Not entirely clear. Maybe it would be better to use “sites”, or, even clearer, “wiki names” / “WikiList”?)

Second, having to manually list all the files defeats the point of using the script to begin with: it is not difficult to manually download a wiki, so the purpose - or at least my purpose - of using a script would be to eliminate as much manual hassle as possible. But having to search a 100+ item list with sometimes semi-cryptic titles, and flip back and forth to type out names in a list, is burdensome. I’d suggest an option to download all the wikis - maybe even have this as the default (e.g. if no list is given). The user can thereafter manually delete the undesired ones, but I imagine this is typically used as a backup, so maybe it’s never even relevant to delete anything afterwards.

I can see how typing titles is not a big deal if you have some 10 TH sites… but then I also wonder why one would need a script at all?

OK, hope this made sense. Again, I didn’t yet get to the actual execution of the script because of this.

Thank you for sharing Mark!

Clicking 143 sites, every week (or however often you do a backup), isn’t a hassle?

In any event, once you’ve downloaded your sites, it should be pretty easy to make the list.

But the real reason I did it this way is that the original script has you pass the title.

But downloading everything would be useful. So it would depend on whether it’s even possible. @simon, are the sites in TH listable (once you’ve logged in, of course)?

Oh, once a week is a different use case from what I thought this was about. I’ve seen this as a kind of emergency action for when/if TH behaves strangely. But you’re right, it makes sense to back up more often, and yes, manually clicking 143 sites is a hassle… but, still, identifying and typing out the titles of, say, 50 of those is even more of a hassle IMO.

I wouldn’t type them all by hand. Since you have them downloaded, you just do something like:

dir > files.txt

Then edit files.txt to cut out the cruft. Also remove all the .html extensions, which probably needs to be stated more directly. There are also utilities on Windows (File Commander??) that will let you copy a nice little list.
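
Or, in PowerShell, assuming the downloaded .html files are in the current directory, this builds the list with the extensions already stripped:

# Write one site name per line, without the .html extension
Get-ChildItem *.html | ForEach-Object { $_.BaseName } | Set-Content files.txt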