@TW_Tones, I think you’re onto something here about there being a lot of people who want to contribute but don’t because:
- It takes extra time and steps to do so, many of which are difficult or unclear;
- You have to make sure everyone is OK with changes first.
My initial thoughts were about not knowing how to get started, which indeed I don’t know how to do, but @pmario’s responses and your thoughts made me realize there is a more general problem after those things get resolved. The more I think about it, the more skeptical I am that a pull-request-based process is the right way to write documentation, especially when we are trying to encourage contributions from more users. We are gating things and thereby eliminating contributions to make sure the documentation quality is kept high, when the actual problem is that the quality is much worse than it could be because we don’t have enough contributions.
Here’s why the gating is discouraging contributions.
It’s hard to squash general improvements to documents into PRs
I think the focus on familiarity with GitHub that has so far characterized these discussions is misplaced. That’s a factor for lots of users, but I think there’s a larger problem.
For me, it isn’t familiarity at all. I’m a software developer, I work with pull requests all the time, and I am pretty much obsessive about producing a clean Git history that is easy to follow and explains why I made each change in detail. I will happily spend 5 minutes rebasing something if I accidentally added two lines to a commit that don’t belong. I also was the Git, SCM, and development automation expert at my last workplace for 3 years and learned all sorts of esoterica I am not happy to have needed to know. So I am very Git, GitHub, and PR-positive in general. But I’m also a writer, and when I am sitting down and working on improving a large document, I cannot get my prose to fit effectively into neat little changes that align with pull requests. I have tried numerous times going back to when I learned distributed version control in 2008 – because I would love to be able to produce a coherent timeline of my “development” process – and I just cannot make it work.
Here’s, as best I can explain, why it doesn’t work:
When you’re writing software, you can decide you want to add X feature or fix Y bug; exactly what behavior you want is clearly definable and clearly defined. Then you go and make changes until that’s implemented. You know exactly when you’ve finished implementing it, because before it didn’t work and now it does. You finish testing it, and then you commit it and move on to X’ or Y’. If you change a feature that has documentation associated with it, you can also go update that documentation and include it in the commit, if you like. The changes you need here are fairly circumscribed and clearly connect back to the functional changes you just made.
But then we have the situation where there’s a document out there. For our purposes here, we’ll say the TiddlyWiki documentation. This contains, in addition to topics that are tightly connected to sections of the code, numerous explanations of best practices, tutorials, and so on, and we see some part of it that’s not as clear as it could be, or that could use additional information, or could maybe be reorganized, or whatever. I don’t know about you, but I don’t do this by picking out some extremely specific change I want to perform, doing it, committing it, and then moving on. I open the document to the section that looks like it needs work, read it, think about what’s missing, try adding some text, go remove some other text, rewrite some text, realize that this is partially duplicating another section, go over there and swap some stuff between sections for a little bit, notice a broken link in the supporting material for that and fix that, come back, realize that something is tagged wrong and fix that, notice that I can make things clearer by reorganizing this into two sections…eventually 45 minutes have passed and the document is better than it was when I started, in a way generally according with what I set out to improve, but maybe in other ways as well.
TiddlyWiki’s structure often makes this effect even more pronounced, at least for me, because one topic will likely be spread over several tiddlers, and it’s therefore easier to encounter other things that need to be improved on the way. The number of things that need to be reshuffled to end with an effective organization can also be higher. Sure, I could add all those items I come across to a to-do list and come back to them later, but if the problem is, e.g., “these italics over here should be changed to a macro,” or “there’s a missing comma in this tiddler,” that’s kind of ridiculous. Yet if you start allowing these kinds of things to squish into an existing commit, the boundaries start to blur, and in my experience it’s very difficult to avoid making larger and larger additional changes, even unintentionally, and these then become extremely challenging or impossible to separate. Also, in some cases two changes are intimately connected with each other and it’s not really possible to separate them at all.
So then put them into one PR, that’s OK, they’re part of the same change right? But then you end up with a huge PR and people complain that there’s too much in it, and it takes 53 days for all the reviewers to look at it, and by that time you’ve forgotten what you were doing, you’re on vacation, and you’ve given up on editing the TiddlyWiki documentation because it took forever to get anything done. (This isn’t really an exaggeration; it took 30 days to have my 4-character typo fix reviewed in September. Some of this could be fixed by ensuring that new PRs get triaged better, but I think anyone who has worked on open-source can testify that sometimes, they just take a long time to deal with.)
In the end, I find that my changes smoosh together enough that it really isn’t worth trying to separate out the changes that precisely. Obviously it’s possible to broadly group them rather than working for 8 hours before committing anything – which is sometimes important if you’re working collaboratively on something – but the kind of precision naturally expected by version control doesn’t come easily, if it’s possible at all. When I work on something by myself, I often don’t even bother with general groupings and just do end-of-day checkpoints so I have backups, as I don’t find the groupings to be helpful in any way.
Submitting things for review causes excessive friction
I’ve used over a dozen different tools for documentation, and without exception the tools that required extra steps to submit changes for review, merge them, etc., have been the ones that went out of date and the ones where I often left ugly bits and even outright errors because it was too much trouble to go fix them. Sure, you can say “it only takes 45 seconds to submit a pull request,” and that’s true, but only to a point:
- The more steps there are in a process, the harder it seems to get started. Have you ever not gone to get something you wanted from the next room because you had to get out of your comfy chair? I’m a young and reasonably fit guy, and I do that all the time. The inconvenience can be completely trivial, and it’s enough to stop you from doing something that is obviously beneficial.
- Unless you own the project and there are no other reviewers, once you take 45 seconds to submit a pull request, you have to keep thinking about it. It’s not actually public yet. Someone might come back and challenge what you wrote, and then you have to go defend it or update it. Someone else’s PR might get merged first, and then you have to go fix merge conflicts. People might forget about it entirely, and then you have to go poke someone. As mentioned in the previous section and by @Mark_S, this can take a long time.
When we’re talking about functional software, clearing these hurdles is part of the game, and they’re totally worth it. Because it’s relatively easy to bundle software changes into small, neat groups, even if you leave it for a while and have to come back to it, it’s usually pretty easy to re-grok your request and get it moving again. Meanwhile, introducing bugs into software that a lot of people rely on is pretty bad, even when it’s TiddlyWiki and it’s unlikely to hurt anybody, and over the long run mistakes can easily cost more time and trouble than doing the code reviews would.
Again, general improvements to a large document are a different animal. They’re hard to bundle and delays are much more costly because changes build on each other more often. Waiting for an existing PR takes up a lot of mental space that makes it hard to keep working on other parts of the document, and you usually can’t hop back to the current mainline revision and work on a different feature, then merge them together, like you usually can with software. There are a lot fewer objective problems that need to be caught; while the docs can certainly contain factual errors, the debates about them are usually more about organization, surface errors, etc., which are subjective to a significant degree, and “failures” in them are a matter of making the docs a little harder to use rather than making something outright not work. Realistically the worst thing that can happen is “the documentation is a little harder to understand and we might have to revert it when somebody notices.” Given all the downsides of a gated review process in this case, I think it’s really hard to justify that.
Look at what happened to Wikipedia – the project behind it was originally an encyclopedia farmed out to individual experts, with a review process and so on. The wiki part started as an experiment for small portions of the encyclopedia, and it sounded like an absurd idea…but we all know now that the little upstart experiment worked so dang well that it’s one of the most visited websites in the world and they canceled the initial project entirely. This works by getting more people involved. Maybe 95% of the (non-vandalism) edits are improvements; the others get noticed and fixed because there are a lot of people involved, and overall you trend upwards.
I don’t mean any disrespect (y’all are doing great work!), but I think the folks who are in charge of the current process for handling documentation do the vast majority of their documentation writing in the “writing documentation to go with a change” mode. For this mode, pull requests work pretty well, and there’s no denying that having the documentation’s history tied to the code is useful. So naturally, they’re biased towards this mode. But the reason our documentation is tricky to navigate, in places hard to understand, and in general in need of revisions and additional content of some types is that more or less all of it was written to go with a specific change, and some other types of work are needed from time to time as well. Why is all of it written to go with a change? Because it is a pain in the rear end to do anything else currently.
Have you ever worked in a bureaucracy where you have to ask a committee for permission to do something, and then they go talk to 6 other people, and then they come back and you have to go to two other meetings so you can explain what you want to do multiple times to people who don’t understand what you’re doing? Then once it finally gets approved, you have to go back and do the next step in your 6-step project plan? It can take months or years to get simple things done under these processes. That is what I see from afar when I think about working on editing a document where I don’t have some form of direct write access, and I’m inclined to run away.
How can we fix it?
I’m not suggesting changing platforms. That’s a bother, and it would be kind of embarrassing if the TiddlyWiki docs didn’t use TiddlyWiki! If it were me, I would try getting rid of reviews entirely on documentation files in the tiddlywiki-com branch – have them merged instantly by a bot or a scheduled process and sent live on a schedule. Ask for a commit message, but don’t be too worried about exactly what’s changed.
Since we wouldn’t have the benefit of instant rollbacks by anyone like we would in, say, MediaWiki, I would want a couple of precautions:
- Require contributors to be approved the first time they contribute. (As things are, we probably want them to sign the CLA anyway.) It’s probably too dangerous to let anyone with a GitHub account commit any changes they want to the docs directory, but hopefully we can trust established members of the community. Worst case, we could even require people to apply to get this access if “known, has committed OK things before, and signed their name in the license agreement” still seems too risky, but I don’t like bureaucracy for projects like this.
- Limit the number of tiddlers and/or lines that can change in one PR to a reasonable number – this can prevent accidents and make sure that each PR is small enough to read through if appropriate.
- Tag/otherwise mark any particularly dangerous or high-profile tiddlers (e.g., the front page) as protected and require approval for those.
- Since I saw concerns earlier in another thread about changing things that would affect inbound permalinks, something to detect renames without redirects and block PRs until this is fixed should be implemented. There may be many additional automated checks that would make sense.
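To make the precautions above a bit more concrete, here is a rough sketch of the kind of check a merge bot could run before auto-merging. It is only an illustration of the idea, not a proposal for the actual implementation: the `docs/` prefix, the protected file name, both limits, and the `Change` record (including its `leaves_redirect` flag) are placeholders I made up, and how the bot would get the list of changes from GitHub is left out entirely.

```python
from dataclasses import dataclass
from typing import List, Optional

# Every name and threshold here is a made-up placeholder, not a real project setting.
DOCS_PREFIX = "docs/"                      # hypothetical docs directory
PROTECTED_PATHS = {"docs/HelloThere.tid"}  # hypothetical "front page" tiddler file
MAX_FILES = 10                             # hypothetical cap on tiddlers per PR
MAX_LINES = 300                            # hypothetical cap on changed lines per PR


@dataclass
class Change:
    path: str                       # path of the file after the change
    status: str                     # "added", "modified", "removed", or "renamed"
    lines_changed: int              # added plus deleted lines
    old_path: Optional[str] = None  # previous path, for renames
    leaves_redirect: bool = False   # True if the old title still resolves somehow


def review_reasons(changes: List[Change]) -> List[str]:
    """Return reasons a PR needs a human; an empty list means it can auto-merge."""
    reasons = []
    if len(changes) > MAX_FILES:
        reasons.append(f"too many files changed ({len(changes)} > {MAX_FILES})")
    total_lines = sum(c.lines_changed for c in changes)
    if total_lines > MAX_LINES:
        reasons.append(f"too many lines changed ({total_lines} > {MAX_LINES})")
    for c in changes:
        if not c.path.startswith(DOCS_PREFIX):
            reasons.append(f"{c.path}: outside the docs directory")
        if c.path in PROTECTED_PATHS or c.old_path in PROTECTED_PATHS:
            reasons.append(f"{c.path}: protected tiddler")
        if c.status == "renamed" and not c.leaves_redirect:
            reasons.append(f"{c.path}: renamed without a redirect")
    return reasons


if __name__ == "__main__":
    pr = [
        Change("docs/Filters.tid", "modified", 12),
        Change("docs/New Title.tid", "renamed", 0, old_path="docs/Old Title.tid"),
    ]
    for reason in review_reasons(pr):
        print(reason)
```

The point of returning a list of reasons rather than a yes/no is that the bot can tell the contributor exactly what is holding the change up, and anything that comes back empty goes straight in.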
If we aren’t willing to do that, then something needs to be done to streamline the review process. Unless I can expect leniency on exactly what (maybe partly unrelated) changes go in a PR and the turnaround time is vastly improved from the status quo, I just don’t think I can write effectively, and I doubt I’m unusual. Further, in most cases I’m not sure it really makes sense for documentation to block changes until every reviewer suggestion has been addressed; unless the changed version is clearly way worse, usually it’s better to just make the changes and then have someone go back in and make additional improvements if others have suggestions.
It might also be helpful to add some kind of subscription or recent-changes feed so people could see what’s changed more easily and see if they disagree with anything. And there should be a clear forum to bring up questions, whether they’re about changes already made or about potential future ones.
The work in progress to make it easier to submit changes directly from TiddlyWiki could help to address some of the remaining problems with the tools being developer-centric.
Maybe none of this will work, but the basic approach seems to work in most places it’s been tried across the web – I do not think it’s pie in the sky by any means.