Easier import of JSONs that lack title field?

Dear all — especially the JSON wizards,

From Absurd Big Clumsy Database ™ I can get various JSON files, but I can’t control their format.

Each is a flat (spreadsheet-structure) JSON file, but no field (no “key” in the JSON) is called title.

I’d like to convert these JSON files into batches of tiddlers, but of course tiddlywiki “out of the box” won’t import from a JSON lacking a title field.

I understand I could write custom deserializers (in theory, with a learning curve!), but there’s not a single field that appears consistently across all these files. And writing multiple ad-hoc deserializers seems like the wrong way to go. I could also pre-process each of files with a search-and-replace function. But this is potentially a routine workflow going forward, and I’d like to streamline it.

Could there be an on-the-fly solution?

I’m imagining a deserializer that allows the user to edit a field value (ultimately with a setting/CP interface) to specify, prior to dragging in an import, which incoming field (json key) should serve as the title field?

Then all I would have to do is peek into the next JSON on deck to see that … this one will have a suitable incoming “description” field that I can treat as tiddler title … and now this one will have a reliably unique “id” field that can serve as title… etc.

I would entirely ignore any field from the JSON. Just create a new unique title if non exists in the JSON to be imported.

IMO a custom importer and / or upgrader should be able to handle that. But I am not sure, since they also work with titles. So some experimentation would be needed.

How does one of your JSON files look like?

I had imagined exactly this, as well as allowing control over mapping of other fields, when we last discussed JSON import in the context of moodle. This actually points out a need for refactoring the import mechanism.

For the time being a quick fix might be to toss your JSON to ChatGPT and ask it transform the JSON mapping a selected field to title.

I have an alternative approach that I hope to have time to complete this evening, but can’t do so right now. The idea is that instead of trying to modify the import process, we have a text area to paste your JSON. The code finds the first record in the JSON and extracts the field names, offering you a choice of which one to use for your title, and putting the others into fields. Presumably you would also want to add specific tags, so there’s an entry for that. And then there’s a button to create the tiddlers. Here’s a screenshot of a naive version of that:

There would likely be additions to this idea. I can easily imagine allowing the user to enter a filter instead of a single field name ([[Person/] [<first-name>] [[ ]] [<last-name>] +[join[]], for instance, to give the tiddler a name like Person/Fred Flintstone) and perhaps offering a running sequence number, to create ones like Person/3. These ideas would be necessary if there is no single field that is unique across the whole JSON input; even if there is, the additional control over the title seems useful.

It’s also quite possible that there are other fields you would like these tiddlers to have in common, so there might be a way to specify common field name/value pairs.

All of these can be added fairly easily I believe.

We also might want a test-run capability that generates a compound tiddler you can verify before actually adding them to the wiki. Or we might want to offer a mapping of selected field names ("type" could be a common problem) and an simple way to exclude certain fields. But all of that would be for later, only if this idea in the end makes sense.

You can test where it is so far with this: Parse_unTITLED_JSON.json (1.8 KB)

What works in here is the parsing of the JSON to offer a choice of field names for the title, the entry of common tags, and the creation of tiddlers with the proper title and tags. But it’s not yet properly adding the other fields. If someone can tell me what’s wrong with this, perhaps we can fix it quickly:

<$button class="create-entry">Create <$text text={{{ [{!!raw-json}jsonindexes[]count[]] }}} /> tiddlers
<$let
    fields={{{ [{!!raw-json}jsonindexes[0]] }}}
    raw-json = {{{ [{!!raw-json}] }}}
>
<$list filter=[{!!raw-json}jsonindexes[]] variable="idx">
    <$action-createtiddler $basetitle={{{ [{!!raw-json}jsonget<idx>,{!!title-field}] }}}} tags={{{ [{!!common-tags}] }}}>

<! ------------------- Problem below ----------------------------->
        <$action-setmultiplefields 
              $tiddler=<<createTiddler-title>>
              $fields={{{ [{!!raw-json}jsonindexes<idx>] }}}
              $values={{{ [<fields>] :map[<raw-json>jsonextract<idx>,<currentTiddler>]  }}}
        />
<! ------------------- Problem above ----------------------------->

	    <!-- uncomment for debugging <$action-navigate $to=<<createTiddler-title>>/>  -->
    </$action-createtiddler> 
</$list>
</$let>
</$button>

I think there is something wrong with the extraction of $fields and $values in the inner action. I’m thinking it’s likely mainly a syntax error, but it could be a more fundamental misunderstanding; these JSON operators still do not feel at all natural to me.

Any suggestions would be quite welcome.

2 Likes

Wow, this looks fantastic, Scott! I’m looking forward to testing it out!

Since you anticipated the potential need to use a filter expression to compose the tiddler title, I’ll just enlist your sympathetic exasperation with the fact that I’m also stuck doing the inverse of that in this workflow:

The ABCD (to which I alluded above ~cough~moodle~cough~ ) insists on spitting out a JSON log which concatenates various fields’ worth of info into a text field called description which looks like this:

The user with id ‘94135’ has viewed the discussion with id ‘834301’ in the forum with course module id ‘6422219’.

And no, it does not offer any option to retrieve those key-value pairs directly. :grimacing:

Alas, I can’t submit anything related to educational records (the ABCD above is indeed moodle) via ChatGPT or other online services (unless there is a decently secure guarantee that the data is not being fed to the corporate borgs).

I can of course do SnR on each file via BBEdit and such, but again I’m considering how to make a sane workflow from moodle output to tiddlers (which continues to work great for the special case that you already solved — and it makes a great difference to my routine workflow!)

I’m afraid that I didn’t get any time to work on it last night. Local politics can be all-consuming.

I don’t have further thoughts on the place I got stuck, so if nothing comes to me this morning, I may raise that in its own topic.

As a programmer, I do that sort of data extraction all the time. If the records are consistently of that format, with only the IDs changing, then it’s easy enough to do. But in these sorts of things, complexities always pile up. Many of my wikis are data-heavy, with some external source that requires substantial massaging before I import it into a wiki.

What I’m proposing above would allow us to do the simplest versions of this sort of massaging in-wiki.

But I guess I need to get it working first. :blush:

There is a way of thinking. You can first merge the json file into a large json file, then convert it to csv using an online tool or python, and edit the csv file with excel (check the csv field values exported by standard tiddlers), such as using ctrl+D to fill. Set the default title field as uname in cell a1 and automatically fill it with (uname1, uname2, uname3,…) Or (uname, uname, uname,…) Save the csv file, and then use an online tool or programming language to convert the csv edited in excel into a json file supported by tiddlywiki

Boring meetings are good for something!

I created a version that fixes the issue discussed above. (There were several problems, but I think the one that consistently caused issue over my many attempts was in using {{{ - }}} in a place I needed to be using " - ".) Thanks to @saqimtiaz, whose earlier post in another topic demonstrated a similar process and straightened me out.

Here’s the latest version: Parse_unTITLEd.json (1.7 KB)

It does not attempt any of the extensions discussed such as using a filter instead of field for title, offering a sequence number, adding common custom fields beyond tags, or filtering/renaming the included fields. I think all of those would be straightforward to add. It certainly doesn’t try my idea of creating a compound tiddler as a dry run of the conversion before you modify the current wiki. I think that would be useful, but it sounds like much more work.

I don’t know if I’ll pursue this further. My personal workflow, of writing JavaScript conversion code to create tiddlers that I can simply import has been working for me for many projects. But I can see the benefit of this for myself in the simple cases and for non-programmers for many cases.

If you’re looking for any of those extensions I (recklessly) called simple; let me know, and I’ll see what I can do.