URL encoding using a `+` instead of `%20` for a space

https://github.com/Jermolene/TiddlyWiki5/blob/642f8da6ed4210af9552858efaa66988e3b255ed/core/modules/server/routes/get-tiddler-html.js#L20

Here is another related file

Hi @Scott_Sauyet, using + to encode a space character is only allowed within the query string part of URLs, and not in the main path of the URL.

Here’s a Stack Overflow article that gives a good overview of the situation:

The key passage is:

The use of a ‘%20’ to encode a space in URLs is explicitly defined in RFC 3986, which defines how a URI is built. There is no mention in this specification of using a ‘+’ for encoding spaces - if you go solely by this specification, a space must be encoded as ‘%20’.

The mention of using ‘+’ for encoding spaces comes from the various incarnations of the HTML specification - specifically in the section describing content type ‘application/x-www-form-urlencoded’. This is used for posting form data.

As far as I can tell, any encoding used in fragment identifiers is a matter for the application, and is not affected by the spec.
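To make the distinction in the quoted passage concrete, here is a minimal sketch using two standard JavaScript APIs. The `title` parameter name is only an illustration and has nothing to do with TiddlyWiki itself.

```javascript
// Form-style encoding (application/x-www-form-urlencoded) writes a space as "+".
const formEncoded = new URLSearchParams({ title: "Working with TiddlyWiki" }).toString();

// Percent-encoding, as produced by encodeURIComponent, writes a space as "%20".
const percentEncoded = encodeURIComponent("Working with TiddlyWiki");

console.log(formEncoded);    // title=Working+with+TiddlyWiki
console.log(percentEncoded); // Working%20with%20TiddlyWiki
```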

TiddlyWiki uses URL encoding in a few situations:

  • to construct permalink URLs
  • to construct file paths from tiddler titles when saving static renderings
  • to construct HTTP paths to each tiddler

I think that the only one that meets the criteria for using + to encode spaces is the permalink. Changing the other usages would, for example, break file paths on Windows, which doesn’t allow + in filenames.

Changing the encoding used in permalinks wouldn’t in fact require a way to distinguish between the old and new encodings; a particular instance of TiddlyWiki would interpret permalinks according to its prevailing encoding. Upgrades would work, too: existing links to tiddlers with spaces in their titles would still have %20 encoding, but it would still be decoded correctly. I think the only issue would be with existing permalinks to tiddlers with a “+” in their title, which seems tolerable.

So, I think your proposal is feasible as a core change. However, I have always expected that we’d solve the problem of making prettier permalinks through some kind of slug mechanism, where the system generates a unique, readable URL fragment for each tiddler title. That way we would end up with URLs like https://tiddlywiki.com/#help-improve-tiddlywiki, and any punctuation etc. would be removed.

Ok. Trying to type a response on my mobile phone as an automobile passenger. Let’s see how it goes.


First of all @jeremyruston, thank you for the response. I was planning on working this through a bit more and then if it still seemed feasible, raising an issue (and perhaps a PR) on GitHub. If that doesn’t make sense, please let me know.

I think it’s subtly different than that. The specification says that the space character cannot appear anywhere in a URL, and offers the percent-encoding as a way to encode arbitrary bytes, including the space character. Especially in the fragment part of the URL, though, there’s no proscription of what those characters actually mean. TW is already slightly altering the usual HTML understanding of the fragment as a DOM id, even if the solution is in the same spirit.

Yes, I read that before I asked and reread the specification. Some of the answers are reading in a proscriptiveness I can’t find in the spec.

What do you mean by that third one?

That was my conclusion as well.

I think so, but I’m still not sure whether or not it’s a good idea.

Although that would be both more familiar and more readable than my + solution, I can’t see how to resolve the ambiguities, especially given TW’s dynamic nature.

Would a visitor arriving with #more-coffee be offered the choice between the three existing tiddlers More Coffee?, More Coffee!, and MORE COFFEE!!!? Or would we arbitrarily choose one? It only gets worse with permaviews.
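A rough sketch of why this is ambiguous. The `slugify` function below is purely hypothetical (TiddlyWiki does not generate permalinks this way); it just lower-cases the title and collapses anything non-alphanumeric into hyphens, which is what most slug schemes do.

```javascript
// Hypothetical slug function, for illustration only:
// lower-case, turn runs of non-alphanumerics into "-", trim stray hyphens.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

const titles = ["More Coffee?", "More Coffee!", "MORE COFFEE!!!"];
const slugs = titles.map(slugify);

// All three distinct titles collapse to the same slug.
console.log(slugs); // [ 'more-coffee', 'more-coffee', 'more-coffee' ]
console.log(new Set(slugs).size); // 1
```

Any real slug mechanism would need some extra rule (a suffix, a stored mapping) to tell these apart, and that rule would have to stay stable as tiddlers are added and renamed.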

So I have attached my permalink protection button. It’s a working proof of concept. On any tiddler, click it and it will copy a permalink to the clipboard, with the spaces in the tiddler’s title replaced with _, and create a matching tiddler. We can use different rules if we wish. permalink-protection_button.json (2.1 KB): drop it on TiddlyWiki to see it work, and capture a permalink on a tiddler with spaces. Look in Recent to see what was created and look at or edit it.

  • It would be a simple matter to hide such permalink protection tiddlers from the general search. -[object-type[permalink-tiddler]]
  • Even if you rename the source tiddler, with relink installed the same URI will show the same content.
  • If you publish a URI and never delete your “permalink protection tiddlers”, then not only will the same content be displayed, but you can choose to redirect previously published URIs to different content in the same wiki.
  • We can even adjust it to appear with the same space separated title.
  • This is better than adding slug to a tiddler because it is change tolerant and you can gather information about the links you have shared outside the wiki.

For me personally as a user that’s a “no go”. I do not want to create 2 tiddlers to be able to use 1 of them. – Especially if I want to publish links from sites I do not control.


I think the “masquerade” concept is OK, but it’s plugin territory.

If I needed to implement a system like this one, I’d use my “aliases” field, which is also part of a plugin.

My aliases field already contains “short titles” for tiddler-links. It would be no problem to add a new value or use an existing one for the permaview or permalink.

But as I wrote – That’s plugin territory

My solution is about publishing links into the wiki, to tiddlers, not publishing links to other resources from the wiki. It’s a single tiddler solution.

  • I am creating a second tiddler for a reason
    • it can be renamed if desired.
  • however this discussion has prompted me to look into incoming links as I have on other web technologies.
  • I can see extending the solution I gave with additional features.

I am sure we could use your alias tools here but this came from encoding permalinks.

I am not sure about your emphasis on plugins. All my solutions are JSON packages, but could be made plugins. If what you mean is core versus plugin, I vote plugin, and argued against core changes at the top of the OT.

  • one advantage of masquerade is bringing system tools into the tiddlers name space and searchable without editing shadows.
  • it would be worth looking if aliases can be used this way.

This has been raised as an issue on GitHub, with a Work-in-Progress pull request demonstrating an initial approach.

@Scott_Sauyet

Can you please clarify for me?

  • Is this intended to replace spaces with + in all permalinks that contain them?
  • Can it be switched on/off?
  • Is it only applied to the tiddler titles, that is the “fragment”, or the whole URI?
  • What If we prefer underscore _ ?

I too would be in favor of encoding fewer characters in the search or fragment parts of the permalink, to make it more readable, but I believe this extended set needs to be configurable. One reason is that if someone wishes to use permalinks for GET or POST requests, it is not possible without the existing encoding. This is now possible with “HTTP Requests in WikiText”.

Yes, although only in the fragment (starting with # to the end of the url). It does more, transforming permaviews the same way, keeping the : separator between the focused tiddler and the full list, and putting commas between the list elements. Right now, if we have a current tiddler of Foo Bar Baz and also have open Qux and Corge Grault, we separate the current tiddler from the list with a colon (:), space-separate the list, wrapping any elements that include spaces in [[ ... ]], to get a basic fragment of

#Foo Bar Baz:[[Foo Bar Baz]] Qux [[Corge Grault]]

which we then use JavaScript’s encodeURIComponent to turn into this:

#Foo%20Bar%20Baz:%5B%5BFoo%20Bar%20Baz%5D%5D%20Qux%20%5B%5BCorge%20Grault%5D%5D

My suggestion is that we instead turn it into

#Foo+Bar+Baz:Foo+Bar+Baz,Qux,Corge+Grault
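For anyone who wants to play with the two formats, here is a minimal sketch of both, reconstructed from the description above. The function names and the `stringifyList` helper are my own stand-ins for illustration, not the code in the core or in the pull request.

```javascript
// Hypothetical helper mirroring the list format described above:
// titles containing spaces are wrapped in [[ ... ]], then space-separated.
function stringifyList(titles) {
  return titles.map(t => (t.includes(" ") ? `[[${t}]]` : t)).join(" ");
}

// Current style: percent-encode the target and the stringified list separately.
function currentPermaview(target, storyList) {
  return "#" + encodeURIComponent(target) + ":" +
    encodeURIComponent(stringifyList(storyList));
}

// Proposed style: encode each title, show spaces as "+", join the list with ",".
// A literal "+" or "," in a title is still percent-encoded by encodeURIComponent,
// so generated links stay unambiguous.
function proposedPermaview(target, storyList) {
  const enc = t => encodeURIComponent(t).replace(/%20/g, "+");
  return "#" + enc(target) + ":" + storyList.map(enc).join(",");
}

const story = ["Foo Bar Baz", "Qux", "Corge Grault"];
console.log(currentPermaview("Foo Bar Baz", story));
// #Foo%20Bar%20Baz:%5B%5BFoo%20Bar%20Baz%5D%5D%20Qux%20%5B%5BCorge%20Grault%5D%5D
console.log(proposedPermaview("Foo Bar Baz", story));
// #Foo+Bar+Baz:Foo+Bar+Baz,Qux,Corge+Grault
```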

While this is a definite improvement in those permaviews, it is to my eyes immensely better for permalinks. This:

https://tiddlywiki.com/#Working+with+TiddlyWiki

is tremendously nicer to look at than this:

https://tiddlywiki.com/#Working%20with%20TiddlyWiki

No, it’s intended to be a backward-compatible replacement for loading previously-generated links.[1] And it would entirely supplant the current permalink/view creation. I see very little reason to ever want to turn it off. Do you see one?

Only to the fragment.

If the community prefers underscores to plus signs, giving us instead

https://tiddlywiki.com/#Working_with_TiddlyWiki

I’m perfectly happy with that. + is what came to mind first, as I think it’s more common to see, but Wikipedia is one of the biggest sites on the internet, and it uses _. That would be fine. It has an additional advantage in that _ seems less likely to be used in a title than +. But we can bikeshed later over the specific character to use if we decide this overall is a good idea.

But, I don’t see much sense in making this customizable. First, it would make decoding/loading much more difficult. Second it would lead to inconsistencies between wikis for little benefit. Third, the most obvious form of customization would be to allow the wiki creators to supply the character to use in replacing spaces; this would then lead to problems if they choose characters illegal in the fragment, as seems all too likely.

I’m not clear where this is coming from, but I’m pretty sure it’s irrelevant. The fragment is not supposed to be supplied to the server at all, and I’m pretty sure that no major JS clients actually do so. If for some reason a permalink is supplied as a query parameter or part of the body of an HTTP request, then it would presumably be encoded as otherwise required, but that would be layered atop whatever we’ve already done to encode it.


[1] There is a caveat. If someone has a tiddler whose title contains a + or a ,, and they’ve hand-crafted a permalink to it, rather than using the one generated by TW, and that includes the + or , directly, without percent-encoding, then it will link to the wrong place, likely a non-existent tiddler, but conceivably an existing one.
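A minimal sketch of the decoding side, to show both the backward compatibility and the caveat. `decodeTitle` is a hypothetical name and covers a single title only; it is not the actual code in boot.js or in the pull request.

```javascript
// Hypothetical decoder for one title: treat "+" as a space, then percent-decode.
function decodeTitle(encoded) {
  return decodeURIComponent(encoded.replace(/\+/g, " "));
}

// New-style and old-style links decode to the same title.
console.log(decodeTitle("Working+with+TiddlyWiki"));     // Working with TiddlyWiki
console.log(decodeTitle("Working%20with%20TiddlyWiki")); // Working with TiddlyWiki

// A generated link to a title containing "+" percent-encodes it, so it survives.
console.log(decodeTitle("C%2B%2B%20notes")); // C++ notes

// The caveat: a hand-crafted link with a literal "+" now decodes to spaces.
console.log(JSON.stringify(decodeTitle("C++%20notes"))); // "C   notes"
```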

@Scott_Sauyet, to cut a long story short: if you are prepared to use the underscore instead, I would be happier. It has to do with the encodeURI and encodeURIComponent standards. The plus + is not inside published standards, even in the fragment, although it will work in many cases.

  • Ideally this would be opt-in or opt-out, because I can foresee cases where the existing permalink encoding method is needed.

    • The following characters are left unencoded by both encodeURI and encodeURIComponent: A–Z a–z 0–9 - _ . ! ~ * ' ( ). Note that the _ underscore is among them; + is not.
    • The following additional characters are also left unencoded by encodeURI: ; / ? : @ & = + $ , #. The + symbol and others are in this subset. So if we modify your proposed change to “not encode these additional characters”, the permalinks will look even easier to read.
      • However I have read that not all systems may consider this a valid URI even if only used in the fragment. Thus I believe we need to allow people to opt out or in.
  • I have described this issue back in this thread but happy to give more details if you request so.
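The difference between the two functions described in the list above can be checked directly. A small sketch, using only the standard JavaScript built-ins:

```javascript
// Characters from the first list pass through both functions untouched.
const underscoreComponent = encodeURIComponent("_"); // "_"
const underscoreUri = encodeURI("_");                // "_"

// Characters from the second list are escaped by encodeURIComponent only.
const plusComponent = encodeURIComponent("+");  // "%2B"
const plusUri = encodeURI("+");                 // "+"
const commaComponent = encodeURIComponent(","); // "%2C"
const commaUri = encodeURI(",");                // ","

// A space is escaped by both.
const spaceComponent = encodeURIComponent(" "); // "%20"
const spaceUri = encodeURI(" ");                // "%20"

console.log({ underscoreComponent, plusComponent, plusUri, commaComponent, commaUri });
```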

I have an alternative way to patch this to achieve the above, rather than using a replace with +, that I can show you. We just need to add an opt-in/opt-out process.

The latest commit switches to underscores. I agree that it makes for a much more readable URL. You can see my comment on it on GitHub.

This is not correct. The relevant parts of the specification are these:

   fragment      = *( pchar / "/" / "?" )
   pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
   pct-encoded   = "%" HEXDIG HEXDIG
   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="

with these in a different spec included by reference:

    ALPHA        =  %x41-5A / %x61-7A   ; A-Z / a-z
    DIGIT        =  %x30-39  ; 0-9
    HEXDIG       =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

This means that the characters allowed in a fragment, besides alphanumeric and percent-encoded ones are these:

- . _ ~ ! $ & ' ( ) * + , ; = : @ / ?

So this is a legitimate URL:

https://tiddlywiki.com/#A_(surprising?)_tiddler_with_various_*special/unusual*_characters,_&_more!

Note that encodeURI and encodeURIComponent are not normative as to what forms a legitimate URL. Instead, they are tools to turn URL-like strings into guaranteed-legitimate URLs. There are plenty of legitimate URLs they could never produce. That doesn’t mean those URLs are not legitimate.
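As a rough check of the grammar quoted above, the fragment rule can be transcribed into a regular expression. This is my own transcription (the names `FRAGMENT_RE` and `isValidFragment` are made up for the example), so treat it as a sketch rather than a reference validator.

```javascript
// fragment = *( pchar / "/" / "?" ), transcribed from the RFC 3986 rules above:
// unreserved, sub-delims, ":", "@", "/", "?", or a "%" followed by two hex digits.
const FRAGMENT_RE = /^(?:[A-Za-z0-9\-._~!$&'()*+,;=:@\/?]|%[0-9A-Fa-f]{2})*$/;

// Expects the fragment text without the leading "#".
function isValidFragment(fragment) {
  return FRAGMENT_RE.test(fragment);
}

console.log(isValidFragment(
  "A_(surprising?)_tiddler_with_various_*special/unusual*_characters,_&_more!"
)); // true
console.log(isValidFragment("Working+with+TiddlyWiki")); // true
console.log(isValidFragment("Working with TiddlyWiki")); // false: raw space
console.log(isValidFragment("100%"));                    // false: bare "%"

// encodeURIComponent would never emit the "+" form, yet that form is still valid.
console.log(encodeURIComponent("a+b")); // a%2Bb
```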

That’s in the latest update.

That’s entirely possible, and we might have to see how that could play out. Do you have any references?

How do you envision that working? Multiple permalink/permaview buttons? A control panel setting? Something else?

This thread has gotten long. I would appreciate a refresher or a pointer to the specifics you’ve already posted.

Thank you as always for your feedback!

In the GitHub discussion I address at least a part of this, the fact that many programs consider sentence-ending punctuation like ., ?, and ! at the end of a url not to be part of the link, but rather part of the surrounding text.

You can test it in its current state at https://tiddlywiki5-2vg2v0leq-jermolene.vercel.app/


@Scott_Sauyet, for a totally unrelated tool I wanted to construct a URL to a tiddler. I could replace spaces with _, but is there another way to get the “permalink format”, as we do not have a permalink operator?

  • Perhaps a compatible operator or additional format:permalink[] would be helpful?

If there’s support for this proposal, it would be easy enough to refactor what I’ve written, offering this formatting for both purposes.

It’s not clear, though, if there is any support. This discussion has been mostly you and me, with some technical details from @pmario, some interesting interjections from @oeyoews, and a somewhat mixed-signals message from @jeremyruston. The only interactions on the actual pull request have been a few well-appreciated corrections and suggestions from @pmario. I’m not sure whether to take this as discouragement or only a sign that things may move slowly around here.

However, if there is no support for this proposal, it would be easy enough to extract the relevant code to make a permalink formatter of some sort. It could be stand-alone or submitted for inclusion in the core. The trouble would be that those permalinks, at least as I’ve been doing them, would not actually be useful if they weren’t being decoded when loading.

Actually in this case it would be simple enough to make a plugin, minimise the number of core tiddlers you replace, such as writing an alternative permalink button that does as desired. You add functionality rather than modify it.

  • If it proves to be useful it may then become a core addition.
  • This would suit me as I have a hunch, no “educated guess”, that this being a default may break something.
  • Yet at the same time it would be nice to not encode the other permitted characters in the fragment as well.

The only thing I don’t know is how, when such a link is followed, it opens the required tiddler.

  • I would like to understand this.

Notwithstanding the above, I do have a custom permalink button that creates a permalink tiddler, allowing the original to change title while the previously published permalink still “works”. It also allows you to see which permalinks you generated and thus published.

  • In this case there is no need to redirect underscore containing tiddlers.
  • This works more like sophisticated link managers such as yoast in WordPress, it helps maintain your “SEO juice”.

In a related matter I am building a way to generate links to other wikis so they are opened in named window/tabs so you can maintain one tab per wiki even after following multiple links.

Therein lies the rub..

The whole idea of permalinks is that they lead you to the desired resource. So encoding must be paired with decoding.

That’s why this has to be a core change. The code in question is buried in the middle of one of the most central tiddlers, boot.js.

That sounds fascinating. I have questions, but I’m typing on my phone on a bouncing bus, so I’ll save them for later.

  • Not with my approach
  • Will have a look now, thanks.

The idea is that, rather than a permalink, you drag or copy an HTML a tag containing the link, a pretty link, the target window name, and maybe more information (e.g. browser brand). Primarily we put such links into a TiddlyWiki as a link to an external resource.

  • You can also build a method to do the same thing from a href in any wiki, but it requires more assumptions than getting it from the source wiki.

You can see my changes to that in the pull request.

Is this related to the earlier discussion about URL shorteners? That’s something I thought I wanted, but eventually realized will not satisfy my use-cases, where there are many read-only users and only a small number of editors for a wiki. It certainly has many uses, though.

Or does it have to do with what you suggested earlier, about creating a new tiddler with a simpler permalink URL that transcludes the target tiddler? While that doesn’t appeal much to me, I still look forward to seeing how it goes.

Or is it something different altogether?

I’m not quite following, although I look forward to seeing what you come up with.

From your earlier post, I thought you were spawning new windows/tabs from one wiki to another and then intercommunicating among them. I haven’t done much multi-window/multi-tab work in years, but I remember the great number of complexities involved, and I was wondering how you were handling them. It sounds as though this is somewhat different.

  • Thanks
  • Indirectly because it gives an option to add this feature.
  • Keep in mind that TiddlyWiki already uses many additional tiddlers (tiddlywiki.com is using 4,000+), and you only need additional ones for the external links and/or short links you publish. The way TiddlyWiki works, they get loaded into memory, effectively caching them.
  • It is also easy to keep them out of searches etc, so you would not know they are there.
  • Yes, but I think you will be surprised, they can look exactly like the tiddler they are masquerading as, most will not know the difference, and they do not appear in searches.

I have most of this 90%+ completed so will share next week sometime at the latest.

  • That’s a different project, focused on providing a window manager for tiddlers opened in a new window. I have what I need to proceed, just not the time.
  • I have considered interwiki communications for all in the file: “domain” but not further yet.

The previous PR for this was closed as a backward compatibility question was raised just before 5.2.0 was released. There is now a new version on GitHub. This version returns to using + characters for space, as initially discussed. The _ character turned out to cause problems.
