Improving tiddler import (keyboard responsiveness + duplicate handling)

Springer · March 3, 2024, 4:41pm

There are some aspects of the import process that I wish I could tweak, though I’m sure it’s way beyond my ability to mess with this on my own… Perhaps many would agree that these would be good areas to improve:

(1) I’d love better keyboard responsiveness in the import process, which might require making the import into an alert window.

Cancel with ESCAPE? When the import is a mistake (say, the json has an error, or I didn’t realize “A tiddler with this title already exists”), I wish I could dismiss the import tiddler from the keyboard (perhaps with a further confirmation, analogous to how I can just hit “escape” key to back out of New Tiddler, confirming from keyboard with “return”).
Confirm with RETURN? When there’s an import in the story river and all looks good, I wish I could confirm from the keyboard.
Heck, why not have a keystroke to activate “batch edit” (with the Utility plugin by @Mohammad) while we’re at it?
Tiddler-title edits could be better handled. In order to rename the incoming tiddler(s), I reach for the cursor to hit the edit button (which is fine, probably hard to avoid), but then focus is not automatically moved to the edit field. (Granted, keyboard can get me into that edit field with TAB, but why should that extra keystroke be needed?) Worse, I need to remember to reach for the mouse to confirm that new name. If I accidentally confirm “IMPORT” with the new name not yet confirmed, then (if I was renaming so as not to overwrite an existing tiddler) I have not only typed out this new desired name for nothing, but I’ve also lost my old tiddler. This is especially common when I’m dragging and dropping an image, since it gets imported with a default title, which I’ve already accepted (by failing to rename successfully, for exactly these reasons!) on some past occasion. (I’ve brought this up before, but I’m including it here because it belongs under this heading.)

Presumably most of this responsiveness would be much easier if the import dropzone triggered something more like an alert, less like just another tiddler hanging out in the story river. After all, I pretty much always do want to follow up on that import step right away!

(2) Handling of duplicates (beyond the retitling issue above): It’s helpful that the import tiddler highlights (in pink) the apparent duplicate rows within a multi-tiddler import, say, from a large json file. It’s helpful that there’s a select-all and deselect-all button, but…

It would be super-helpful to have a “deselect duplicate titles” affordance. Suppose I already have half of the tiddlers in an incoming json batch of dozens or hundreds (and have made changes on some/all of them, so overwriting would be terrible). I currently go through this mindless step of scrolling-mousing-clicking through to de-selecting exactly all the pink rows.
Ignore true duplicates: Could it be easy for the import to inspect past the duplicate title flag to recognize whether or not ALL other fields match, so that identical incoming tiddlers can be visually recognized and/or filtered automatically? (Perhaps certain fields could be configured as to-be-ignored for this purpose, with creation/modification dates coming to mind, to focus on whatever the user considers to be the fields that matter.)
When incoming tiddlers are exact copies (assuming such detection is possible), could they be deselected by default?
“Dovetail” import (OK, this is BIG STRETCH wishlist): have the import be able to recognize when the incoming tiddler and old tiddler have complementary info at the level of fields present — ignoring (or easily flagging) any fields where there’s substantive conflict. For example, the incoming tiddler (from a public database-generated json) has a field with card-catalog-id numbers and other technical data I want. My existing tiddler with the same name lacks that info, but has a tags field that I’ve already populated with things that matter to my own projects. Imagine the import dialogue offers a “dovetail” checkbox column which means something like “Add new field/value data to existing tiddler, without overwriting any existing fields?” Yes please!
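
To make the duplicate-handling wishes above concrete, here is a minimal sketch in Python (chosen only for illustration; it is not TiddlyWiki code). It assumes tiddlers are plain dicts of string fields, as in a JSON export, and the function names and the default set of ignored fields are hypothetical.

```python
# Fields ignored when deciding whether two tiddlers are "truly" identical.
# This default is an assumption; the idea above is that it would be configurable.
IGNORED_FIELDS = {"created", "modified"}


def comparable(tiddler, ignored=IGNORED_FIELDS):
    """Return the tiddler's fields with the ignored ones removed."""
    return {k: v for k, v in tiddler.items() if k not in ignored}


def classify_import(existing, incoming, ignored=IGNORED_FIELDS):
    """Map each incoming title to 'new', 'identical', or 'conflict'.

    existing: dict mapping title -> fields dict (the current wiki)
    incoming: list of fields dicts (the import batch)
    """
    result = {}
    for tiddler in incoming:
        title = tiddler["title"]
        if title not in existing:
            result[title] = "new"
        elif comparable(existing[title], ignored) == comparable(tiddler, ignored):
            result[title] = "identical"
        else:
            result[title] = "conflict"
    return result


def default_selection(classification):
    """Select only brand-new titles, so every duplicate title starts deselected."""
    return [title for title, status in classification.items() if status == "new"]
```

With a classification like this, "deselect duplicate titles" is just `default_selection`, and exact copies could be hidden or greyed out separately from real conflicts.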

What say you all? Would there be any reason not to develop the import UI toward one or both of these kinds of flexibility? Any serious technical or performance obstacles in the way?

stobot · March 3, 2024, 4:54pm

For me this would be huge. I think this was a discussion not that long ago and someone (Saq maybe) hacked together a modified Import tiddler that performed at least some of this.

I maintain all of my contacts in TiddlyWiki and have different sources - each with their own fields. I would love to have an option on the import page to have some global settings on duplicates. Something like:

Ignore: If there’s already a tiddler, ignore the import tiddlers with that name
Supplement: Merge in new field values only if the field doesn’t exist already
Overwrite: Merge in all field values, overwriting the old values
Replace: Fully replace the old tiddlers with the new ones, even if new tiddlers have fewer fields
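
As a rough illustration of how these four settings differ, here is a small Python sketch. It assumes tiddlers are plain dicts of fields; `merge_tiddler` and the mode names are hypothetical and only mirror the list above, not any existing TiddlyWiki option.

```python
def merge_tiddler(old, new, mode):
    """Combine an existing tiddler (old) with an incoming one (new).

    old may be None when no tiddler with that title exists yet.
    """
    if old is None:
        return dict(new)          # nothing to collide with: plain import
    if mode == "ignore":
        return dict(old)          # keep the existing tiddler untouched
    if mode == "supplement":
        return {**new, **old}     # add new fields; existing values win
    if mode == "overwrite":
        return {**old, **new}     # merge fields; incoming values win
    if mode == "replace":
        return dict(new)          # discard old fields entirely
    raise ValueError(f"unknown mode: {mode}")
```

Note that "overwrite" and "replace" only differ when the old tiddler has fields the incoming one lacks: "overwrite" keeps them, "replace" drops them.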

stobot · March 3, 2024, 4:57pm

@Springer Found what I thought I’d remembered. Looks like you were in that thread so you probably already had knowledge of it, but I didn’t want to leave that hanging.

saqimtiaz · March 3, 2024, 5:44pm

The import process is a rather complex set of code both in terms of the wikitext involved and how it interacts with the core JavaScript. I have wanted to make various improvements to the import process for quite some time but have held back because there are some rather fundamental changes that really should be made first, before we introduce more complexity and backwards compatibility constraints with more features.

One of the major shortcomings of the import process is that once you have tried to import something, you no longer have any control over how the incoming data is interpreted. Should it be imported as a raw blob of JSON or separate tiddlers? Is it a bibtex or a moodle import or just raw text? You have to get it right at drop/paste time by using a custom dropzone or using a file extension that a deserializer recognizes. You have no ability to change it after that.

Instead, what would be desirable is that after you start an import, you have an affordance in the import tiddler to switch deserializers and thereby change between and preview how the different deserializers interpret the data before proceeding with the import. What this entails is a rewrite of the import process so that we also save the raw data from each drop/paste event and not just the interpreted version after it has been run through a deserializer. There may be some backwards compatibility issues in making this change that will not be entirely clear until someone has the opportunity to prototype it.
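
The idea of keeping the raw data so the interpretation can be changed later can be sketched as follows. This is a toy model in Python, not the TiddlyWiki implementation: the `PendingImport` class, the two deserializer functions, and the registry are all hypothetical names used for illustration.

```python
import json


def as_json_tiddlers(raw, title):
    """Interpret the raw text as a JSON export: one tiddler or a list of them."""
    data = json.loads(raw)
    return data if isinstance(data, list) else [data]


def as_plain_text(raw, title):
    """Interpret the raw text as a single plain-text tiddler."""
    return [{"title": title, "type": "text/plain", "text": raw}]


# Hypothetical registry keyed by content type.
DESERIALIZERS = {
    "application/json": as_json_tiddlers,
    "text/plain": as_plain_text,
}


class PendingImport:
    """Holds the raw dropped/pasted data so it can be re-interpreted later."""

    def __init__(self, raw, title, content_type):
        self.raw = raw
        self.title = title
        self.content_type = content_type

    def preview(self, content_type=None):
        """Run the raw data through the chosen (or original) deserializer."""
        deserialize = DESERIALIZERS[content_type or self.content_type]
        return deserialize(self.raw, self.title)
```

Because `raw` is never discarded, the user could call `preview("text/plain")` after a drop that was first read as JSON, compare the results, and only then commit to one interpretation.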

In terms of technical challenges and requirements, the other thing we probably need is to further flesh out the JSON filter operators. In this regard we are in a much better place now than 12 months ago, but we still need some new operators for operations such as unsetting an object property.

The biggest hindrance to any of the above from my perspective is really just a lack of resources and opportunity for someone to focus on the work.

Springer · March 3, 2024, 6:09pm

Thanks @stobot, I had a niggling feeling that at least some of these issues had seen some action… Apologies to @saqimtiaz for not remembering and following up on the merge/dovetail progress made there!

Springer · March 3, 2024, 6:16pm

Thanks, this is all really great to hear!

If any of the above is “low-hanging fruit” that can be picked without lots of structural effort, I’d be delighted… maybe just the “de-select duplicate titles” affordance, since clearly that status is already being tracked for each row? (But I know nothing about whether this is much harder than it sounds!)

At any rate, I’m very grateful for your help (and the steady work you’re always doing behind the scenes). I feel sheepish about losing track of that import-and-merge solution that you already (6 months ago) put effort into sharing!

pmario · March 3, 2024, 7:29pm

My bundler-plugin has an option, that will automatically create new unique titles, if a tiddler already exists.
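
For readers wondering what "automatically create new unique titles" could look like, here is a minimal Python sketch. The numeric-suffix scheme and the function names are assumptions for illustration and are not necessarily what the plugin does.

```python
def unique_title(base, taken):
    """Return base if it is free, otherwise base plus the first free numeric suffix."""
    if base not in taken:
        return base
    n = 1
    while f"{base} {n}" in taken:
        n += 1
    return f"{base} {n}"


def rename_incoming(incoming, existing_titles):
    """Give every incoming tiddler a title that clashes with nothing.

    Titles handed out earlier in the same batch are also treated as taken.
    """
    taken = set(existing_titles)
    renamed = []
    for tiddler in incoming:
        title = unique_title(tiddler["title"], taken)
        taken.add(title)
        renamed.append({**tiddler, "title": title})
    return renamed
```

This covers the dragged-image case: a second image arriving under the same default title would get a fresh title instead of silently overwriting the first.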

Springer · March 3, 2024, 7:46pm

Ah, I admit that I installed this at some point (on my demo site, where it’s still present), but then got a bit confused about how to configure it. I see lots in your documentation about what problems the plugin can solve (cool), and also lots about what kinds of bundles it can create (ok)…

What I didn’t easily see was the beginner-friendly translation bits: “Here’s how to make use of bundles for a simple use-case… Specifically, here’s what to do so that when you drag new content to your wiki (tiddler, json, image, whatever), you can ensure that your incoming tiddlers all get fresh names…”.

EDIT: AH, I see now that in order to solve this particular import problem, I don’t really have to understand much about bundles, and what kinds exist or why! … The checkbox in your plugin settings takes care of this, even if I am not yet confident with how to handle bundles… (I just did not realize that, and got lost at thinking I had to understand bundle management before I could configure anything useful with your plugin.)

I’m sorry, because those connections probably seem straightforward to you, or maybe I just overlooked where it’s spelled out… but my eyes glazed over. Could you add (or point me to) a simple step-by-step?

saqimtiaz · March 3, 2024, 7:59pm

Possibly, I cannot be certain without revisiting the code. At present there is an interplay of wikitext and JavaScript that the import process UI relies on and that has some hard limitations, which in part is why the support for renaming and selection during import is so limited. A proper rewrite of the import process would probably move more of that into the realm of wikitext, allowing for more flexibility, whereas adding new core features for import at this time might necessitate JavaScript changes that would later become obsolete with a rewrite, creating more backwards compatibility constraints that any future rewrite would need to deal with.

I think the best way to make progress in the short term would be by replacing the core import process via plugins, which could entirely be wikitext based. The above mentioned demo I posted of an alternate import process, this work on easier image imports and even the external content plugin all are examples of implementing import in wikitext. Such experimentation would also facilitate and inform the future rewrite of the core import process.

pmario · March 3, 2024, 7:59pm

You are right, the documentation can be improved. I probably should add some more text from the talk “Intro thread” into the official documentation.

Springer · March 3, 2024, 8:07pm

I do not envy this kind of development dilemma you face!

Personally, I’m happy with waiting for the careful implementation to arrive (and taking advantage of ad hoc solutions here and there in the meantime), so long as I know that this feedback (about the awkward aspects of the import process) has been heard by those of you steering development of the core.

And this super-fast response here by both you (@saqimtiaz) and @pmario does indeed convince me that you both are taking the import-process seriously as something to improve!

Scott_Sauyet · March 3, 2024, 8:40pm

This actually strikes me as less of a stretch than you seem to suggest. I don’t have any time to spend on this for at least two weeks, but I can easily imagine a UI that simply offers to merge the old and the new versions, includes a few automatic merge features for fields such as created (keep the older), modified (keep the newer), revision (keep the newer), and then takes the fields that are unique to either one and incorporates them, takes the non-empty version of fields where there’s a conflict, and then, for any remaining fields, presents the user with the chance to choose one or the other, or to manually merge them.
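
The automatic part of such a merge can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tiddlers are plain dicts of strings, `created` and `modified` are fixed-width timestamp strings (so string comparison orders them), `revision` is a numeric string, and `auto_merge` is a hypothetical name.

```python
def auto_merge(old, new):
    """Merge two versions of a tiddler.

    Returns (merged, unresolved), where unresolved maps each conflicting
    field to its (old_value, new_value) pair for the user to decide.
    """
    merged, unresolved = {}, {}
    for field in sorted(set(old) | set(new)):
        a, b = old.get(field, ""), new.get(field, "")
        if a == b:
            merged[field] = a
        elif field == "created":
            # Keep the older timestamp, ignoring an empty side.
            merged[field] = min(x for x in (a, b) if x)
        elif field == "modified":
            merged[field] = max(a, b)  # keep the newer timestamp
        elif field == "revision":
            # Assumes numeric revision strings; empty counts as 0.
            merged[field] = max(a, b, key=lambda x: int(x or 0))
        elif not a or not b:
            merged[field] = a or b     # unique to one side, or other side empty
        else:
            unresolved[field] = (a, b) # genuine conflict: ask the user
    return merged, unresolved
```

Only the fields left in `unresolved` would need to be shown in a choose-or-merge UI.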

This is easy enough to imagine for single imports, but if you are importing dozens of tiddlers, and need to do this often, that could become tedious too. I don’t quite have my head around how it would work, but I feel as though such decisions could easily be applied to other imports from the same batch: "Always take the foo field from the current tiddler, but use the bar and baz ones from the import, and concatenate the qux field." I’m still trying to figure out an appropriate user experience for this, though.
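
One way to picture batch-wide decisions like these is a per-field rule table applied to every colliding pair. The Python sketch below is an assumption-laden illustration: the rule names (`current`, `import`, `concat`), the space-separated concatenation, and `apply_rules` are all hypothetical.

```python
# Decisions made once and reused for the whole batch (example from the post).
RULES = {"foo": "current", "bar": "import", "baz": "import", "qux": "concat"}


def apply_rules(old, new, rules, default="current"):
    """Merge one existing/incoming pair according to per-field rules."""
    merged = {}
    for field in set(old) | set(new):
        a, b = old.get(field), new.get(field)
        rule = rules.get(field, default)
        if a is None or b is None:
            # Field exists on one side only: nothing to decide.
            merged[field] = b if a is None else a
        elif rule == "import":
            merged[field] = b
        elif rule == "concat":
            # Joining with a space is an arbitrary choice for this sketch.
            merged[field] = a if a == b else f"{a} {b}"
        else:
            merged[field] = a  # "current": keep the existing value
    return merged
```

The open user-experience question is then mostly about how the `RULES` table gets built, for example by remembering the choice made on the first conflict for each field.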

Springer · March 3, 2024, 9:17pm

Anyone is to be forgiven for not combing through every single post! But @stobot pointed out in Post #3 that @saqimtiaz had actually built something very much like this — for me no less — about 6 months ago. Check it out… I forgot about it because I haven’t yet tried to implement this solution on my own sites. (I had just briefly tested the functions in place at saq’s demo site, and managed that way to stitch together a set of bibliographic tiddlers that kept the field-contents from two overlapping sources with different fields, just as I had hoped would be possible).

TW_Tones · March 4, 2024, 12:10am

@Springer I have knowledge and experience with every little detail in your top post. In that we are kindred spirits on imports.

One of my key motivations is to allow wiki review submissions from anyone, to be processed, reviewed and applied by a wiki owner in a single step workflow.

The solutions provider paradox

I have explored and found ways to intervene and overcome many of these import limitations. However, unless the cost of using workarounds outweighs the cost of coding a new solution, it’s hard to justify the effort. When you multiply the benefit of a solution across the whole community, the increased benefits would appear to justify an altruistic donation. However, in trying to meet community needs, the scope and demands can grow, so it costs the designer (me) more and more, and the value becomes more and more speculative.

That said, as TiddlyWiki develops, the cost of intervention comes down; the new JSON operators are one example.
I continue to explore what I would call the “solution space” to see if I can find shortcuts or tricks that may allow solutions to be found via quicker less costly development routes.

A couple of quick comments;

I think you would be surprised: you can do more than you estimate once a couple of key facts about the import process are known to you.

If you are interested, I think there exist some ways to overcome this without too much intervention, and definitely with little or no core or JavaScript changes. For example:
- Different dropzones or import buttons allow us to use different deserializers.
However, even with the default deserializers, as long as there is no loss of information we can intervene after something is dropped and before we hit the import button.

@Springer here is my top Importing work around that may help in many cases;

Drag and drop multiple tiddlers and json files of tiddlers on to a target wiki such that you have a single list of tiddlers to import.
Don’t hit Import!
Edit the $:/Import tiddler, changing its name to a regular tiddler that describes its contents (I made a button for this).
Edit the plugin-type field to plugin, delete the status field.

You now have a plugin that contains all your import tiddlers as shadow tiddlers. They will not overwrite existing “edited” tiddlers, and the plugin can be removed without affecting the wiki, as if never imported.

Now once you have a plugin of tiddlers, you can write some smarts to interrogate the shadows and compare with the tiddlers in the wiki and provide your various options on a per tiddler basis.
Be aware that if other plugins also have the same tiddler as a shadow, the last one wins, including if edited into a tiddler.
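
The kind of "smarts" meant here can be sketched outside the wiki as well. The Python below is only an illustration: it assumes the plugin tiddler's text field is JSON with a top-level "tiddlers" object mapping titles to fields, that the wiki's real tiddlers are available as a dict, and `shadow_report` is a hypothetical name.

```python
import json


def differing_fields(a, b, ignored=("created", "modified")):
    """List the fields whose values differ between two tiddlers."""
    keys = (set(a) | set(b)) - set(ignored)
    return sorted(k for k in keys if a.get(k) != b.get(k))


def shadow_report(plugin_tiddler, wiki):
    """Compare each shadow in the plugin with the real tiddler of the same title.

    Result per title:
      None      -> no real tiddler exists, so the shadow shows through
      []        -> a real tiddler exists and matches the shadow
      [fields]  -> a real tiddler exists and differs in these fields
    """
    shadows = json.loads(plugin_tiddler["text"])["tiddlers"]
    report = {}
    for title, shadow in shadows.items():
        real = wiki.get(title)
        report[title] = None if real is None else differing_fields(real, shadow)
    return report
```

A per-tiddler UI could then offer "keep mine", "take the shadow", or a field-level merge only for the titles whose report is a non-empty list.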

This is I believe the source of “Low Hanging Fruit”.
Not only do you enable advanced imports, but you build more tools for handling plugins.
This will also help with using plugins as a method for transferring and manipulating data, not only code solutions. What I call Data Plugins.

Springer · March 4, 2024, 2:36am

Fascinating. This sounds potentially similar to the bundles approach from @pmario — having something like a freeze-frozen package (bundle or json super-tiddler) that can be massaged and tweaked in multiple steps, and maintained for troubleshooting etc… Of course there may be important differences in the details of your approach, but I see how the pattern is similar.

You and @pmario and @saqimtiaz have all given me glimpses into what’s possible (and @Scott_Sauyet is clearly also thinking along parallel lines with @saqimtiaz about flexible merge possibilities).

Many thanks!