JSON Filters of Data in an Array

Continuing the discussion from Idea of multi field data tiddler like a data base:

I have a Tiddlywiki I use to make notes about decks I build playing Magic: the Gathering. I have a JSON file with data about the cards (card name, the card’s “cost” in-game, its card type, and a URI to an image of the card). The format of the JSON is an array of objects:

[
  {
    "name": "A Realm Reborn",
    "image_uris.normal": "https://cards.scryfall.io/normal/front/d/1/d1af74e4-38d5-44b5-85e1-4d13f6970453.jpg?1748706495",
    "mana_cost": "{4}{G}{G}",
    "type_line": "Enchantment",
    "legalities.commander": "legal",
    "set_name": "Final Fantasy"
  },
  {
    "name": "A Tale for the Ages",
    "image_uris.normal": "https://cards.scryfall.io/normal/front/c/a/ca0c8d3b-ce30-4da5-a6a8-9bdcb3c757f9.jpg?1692941043",
    "mana_cost": "{1}{W}",
    "type_line": "Enchantment",
    "legalities.commander": "legal",
    "set_name": "Wilds of Eldraine"
  },
  {
    "name": "Aarakocra Sneak",
    "image_uris.normal": "https://cards.scryfall.io/normal/front/2/a/2a83882c-3e03-4e85-aaac-97fa1d08a772.jpg?1722040128",
    "mana_cost": "{3}{U}",
    "type_line": "Creature — Bird Rogue",
    "legalities.commander": "legal",
    "set_name": "Commander Legends: Battle for Baldur's Gate"
  }
]

I’ve only shown 4 cards of data but the total list is 29,000 cards (!!), far more than I think would be reasonable to create static tiddlers for (but maybe I’m wrong). I’d like to be able to query these objects by the “name” field in order to auto-complete a list of card names in tiddlers I’m editing. However, all the JSON examples I can find assume a single object. In the Tiddlywiki documentation and Grok Tiddly wiki, there are no examples showing how to search or filter an array of similar objects based on the value of a particular field. Is this possible without building my own plugin?

Eventually I want to setup some shadow tiddlers based on this data so I can get info on the card by name, but for now auto-completing would be sufficient.

My wiki is here and the data tiddler is MtgCards. The Auto Complete Plug-in config is in the Evidently Cube tab in $/ControlPanel.

Welcome back @technome I expect others will have valuable things to say here. I am researching this a little, to see if I can help. 29,000 cards is a lot, If you moved this into a structure like a CSV where the columns are named rather than needing to repeat key/value pairs, but as far as I can tell you would have to construct your own lookup, perhaps having the key/name in the first column. But you may be able to divide this data based on one of the other values.

  • Why is it in this format in the first place?
  • Have you tried JSONMangler? I am trying to find a source
  • Erics Rerelease of SaveAs may provide useful later moving these as tiddlers to a Plugin.
  • Some time ago we got good performance on a wiki with 60,000 tiddlers, but a few tweaks were needed to avoid accidently itterating all the tiddlers eg; in sidebar recent tab.

The data comes from a community site that tracks the cards (Scryfall) in a HUGE JSON file (168MB) with lots of data I’m not interested in. I can reformat in whatever format would make it easy to search or filter in Tiddlywiki. As is, I used Excel/PowerQuery to reduce the original file to a 4MB CSV file:


name,image_uris.normal,mana_cost,type_line,legalities.commander,set_name
A Realm Reborn,https://cards.scryfall.io/normal/front/d/1/d1af74e4-38d5-44b5-85e1-4d13f6970453.jpg?1748706495,{4}{G}{G},Enchantment,legal,Final Fantasy
A Tale for the Ages,https://cards.scryfall.io/normal/front/c/a/ca0c8d3b-ce30-4da5-a6a8-9bdcb3c757f9.jpg?1692941043,{1}{W},Enchantment,legal,Wilds of Eldraine
Aarakocra Sneak,https://cards.scryfall.io/normal/front/2/a/2a83882c-3e03-4e85-aaac-97fa1d08a772.jpg?1722040128,{3}{U},Creature — Bird Rogue,legal,Commander Legends: Battle for Baldur's Gate

I know Python and can use other tools to prepare this data in some form to pull it into Tiddlywiki, but I’m trying to figure out what the best format would be (JSON, Dictionary [probably not], individual tiddlers).

i exported the file ( after copying the data made my system “slow” )

# size : 9.6M !
file MtgCards.json 
 MtgCards.json: Unicode text, UTF-8 text, with very long lines (65276), with no line terminators

original json text copy and (eventually) saved to file 8.7M

the array ‘length’ is 29677

copying every occurrence of ‘https://cards.scryfall.io/normal/front/’ “url-data” into a file , it size is 1.2M
(therefor you can save that much by removing that reoccurring string(s) …)

saving the repeated field’s to a file , it size is 2.8M

so (just under) half of the data is repeated

data:   field   url     total
         ( 2.8 + 1.2  =4)

if it was represented as some-table-format
eg csv

 (headers)   
name ,image_uris.normal  ,mana_cost ,type_line  ,legalities.commander ,set_name 
 (data)

it appears you could half** the size with current data
double the data for the same size

**perhaps more than ( removing quotes 1.4M )

… idk about table / csv support in tw (it’s just stings realy) but
i think some how a method exists to append the sections of url

??? filter [https://cards.scryfall.io/normal/front/] ~{!!image_uris.normal}? +[join[]]

/crunch

1 Like

I found the original JSONMangler Plugin both in GitHub (for Node) and my own records, which also requires the modloader plugin. For our convienece I have placed it here https://jsonmangler.tiddlyhost.com

Your array placed in a tiddler with the type application/json is able to be manipulated in a number of ways, I am still exploring it now. I will let you know what I learn for your application.

I am starting to look closer as JSON as a result of a number of recent discussions where we seem to be strugling with handling JSON as data tiddlers.

One thing I know is if you import a CSV on a wiki With JSONMangler its easy to import them as tiddlers and to save them in a plugin, thus they become shadow tiddlers. Once you have them as shaddow tiddlers there are different ways to extract and save data in different forms. I use a utility wiki to prepare data for ultimate use in a seperate wiki.

1 Like

The jsonindexes[] and jsonget[] filter operators can extract the card names from the “MtgCards” JSON tiddler, using a filter like this:

[[MtgCards]get[text]jsonindexes[]] :map[[MtgCards]get[text]jsonget<currentTiddler>,[name]]

The first filter run gets all the numeric indexes from the JSON array. Then, the :map filter run gets the name for each array index number. The problem is that the :map filter run is horribly inefficient. If you were to use this syntax as the suggestion filter in the Auto Complete Plugin, as soon as you type the [[ trigger, it would immediately freeze up while trying to build the list of card names.

Note: To count the number of cards in “MtgCards”, you can use this filter:

There are <$count filter="[[MtgCards]get[text]jsonindexes[]]"/> cards

which reports “There are 29677 cards”

Here’s an alternative idea that MIGHT work a little bit better… since the really slow part is extracting the list of names, let’s suppose we create a $button that does that work in advance, and then saves the result as a space-separated, bracketed list in another tiddler called “MtgCardNames”.

To do this, first create a tiddler (e.g., “GetCardNames”), containing

\define make()
<$let cards={{MtgCards}}>
<$set name=cardnames filter="[<cards>jsonindexes[]] :map[<cards>jsonget<currentTiddler>,[name]]">
<$action-setfield $tiddler="MtgCardNames" text=<<cardnames>>/>
\end
<$button actions=<<make>>>make cardname list</$button>

When you press the “make cardname list” button, it invokes the make() macro, which will take a LOOOONG time to complete since it has to process all 29677 cards! Your browser will repeatedly ask for more time (the “wait or exit page” message), but if you are patient, then eventually the “MtgCardNames” tiddler will be created.

Note: I did a quick experiment by adding a limit[1000] operator in the filter, like this:

<$set name=cardnames filter="[<cards>jsonindexes[]limit[1000]] :map[<cards>jsonget<currentTiddler>,[name]]">

It took a few seconds to complete, but the browser didn’t ask for “wait or exit page”. Then, I increased the limit to 2000, and the browser asked “wait or exit page” once. So, we can “guess-timate” that to process all 29677 cards, the browser will probably ask you to wait somwhere between 15 and 20 times before it completes the process. As I said… “if you are patient”…

Once this tedious process completes, you can then define your Auto Complete suggestion filter using:

[enlist{MtgCardNames}search:title<query>]

which will function reasonably quickly, even though it is seaching through 29677 card names, because it is just enlist-ing the card names, rather than calculating them “on-the-fly”.

A third alernative that might be significanty faster would be to use some external tools (i.e., “Excel/PowerQuery”, or Python) to generate the list of card names as a space-separated, bracketed list in a text file, and then import that file into your TiddlyWiki and name it “MtgCardNames”.

Let me know if any of this helps…

enjoy,
-e

2 Likes

Thanks @EricShulman and @TW_Tones for your suggestions! I definitely want to pre-process this as much as possible. I wouldn’t update this maybe more than once a month, so the data stays pretty static. I’ll play around with both of these ideas tomorrow and report back on what I find.

Just a not that this kind of timeout was common in the past, there was a browser setting you could change, even if just for this purpose, multiply the timeout by 10 by adding a zero would go a long way to making this bariable.

Question;
Would it be possible to move the name value into the place of the index? Saving the need for the seperate “index tiddler”.

In JSON Mangler world the JSON can be flattened or given numeric indexes, perhaps this could be used in the process?

If you can share the CSV I can give you back a plugin to get you started?

In the mean time I want to see how hard it would be to query the csv directly.

I’ve added it to my original wiki linked above in the tidder MtGQuery.csv

https://mtg-sketchbook.tiddlyhost.com/#MtGQuery.csv

This definitely seems like a “right tool for the job situation”! I tried this but got impatient, stopped it, and wrote a function in my Excel workbook to wrap the names in double square brackets instead, copied the cell range to Notepad (Tiddlywiki wanted to treat the paste as an image instead of text) and copied the text to a new tiddler named MtgCardNames. The rest worked great and the autocomplete is plenty fast enough for my needs. Thanks!

Next up: getting the rest of the JSON or CSV data into tiddlers that I can use to store the rest of the information I want about the cards. I think once I understand the best way to create the tiddlers in bulk with the data stored in fields, I can create a default view that will show the information I care about and reference other tiddlers that reference that card.

As far as I can see that is not CSV

@technome … Do you control how the JSON tiddler with the cards is created?

If yes, you could try to create a JSON file, that can be imported as single tiddlers, per card, to run tests.

Using json-operators is relatively complex, even if the structure is relatively simple. TW is optimised to be used with tiddler. So it would be worth some tests.

If "name": will be changed to "title:", it should be possible to import it into TW.

-m

1 Like

It attached it as a binary file (not sure why, maybe because my system has CSV associated with Excel?) I have deleted and re-added a new tiddler with the same name and copied the contents in.

I’ve also created a rough sketch of what I would like each card to look like if I can create all 29,667 tiddlers. Ideally this would use a default template for tiddlers with the “Card” tag.

1 Like

I do have control over it, as I had to filter out a lot of data I didn’t want to import.

Is the JSON format for import documented somewhere? I know I’ve seen examples of what it looks like but I am struggling to find an example now. I would want to set the title and multiple fields along with a tag of “Card”

  • I have imported it into a wiki with JSON Mangler and another required plugin.
  • I changed the type to application/csv known in the wiki with JSONMangler installed
  • Using the import tools that appear on the CSV because of JSON Managler I imported it with no setting changes
    • Note there is a display issue but its working.
  • Exported it as a JSON file, The result is this 84kb file with 250 items? Data_MtGQuery.json (83.4 KB)

This only took a minute and no timeouts. I did it on a single file wiki at file://

Due to load time on mtg-sketchbook.tiddlyhost.com I recomend a splash screen

I could improve the import based on what you want the tiddler title to be. eg MTGcard-name-number

1 Like

See: How do TWs JSON Formats Look Like – It basically is the first format, that is described.

1 Like

in addition to that
size wize

i imagine using
a key/tiddler to link field name text to number

title:obj-key
1: image_uris.normal
2: legalities.commander
3: mana_cost
4: set_name
5: type_line

put the data in
numbered field’s (if that is possible?) should lessen size of repeated key name’s reducing final size

1 Like

Update: I swear that when I answered there were no other answers! I must have been blind. Sorry, this post doesn’t add much to the discussion.

This wiki has over 32K tiddlers, with no noticeable slowdown. I would suggest that it’s simply worth trying. You could just import the above if you converted slightly:

  {
    "title": "A Realm Reborn",
    "tags": "MtgCard",
    "image_uris.normal": "https://cards.scryfall.io/normal/front/d/1/d1af74e4-38d5-44b5-85e1-4d13f6970453.jpg?1748706495",
    "mana_cost": "{4}{G}{G}",
    "type_line": "Enchantment",
    "legalities.commander": "legal",
    "set_name": "Final Fantasy"
  }

That is, change name to title and add a tags parameter to identify the sort of data involved. The tag makes it much easier to query, but is not required. title is required; it’s the unique key. If any of these names are likely to conflict with other tiddler titles you want to have, then you can add a common prefix, such as Card/A Realm Reborn, in which case, you might also want caption: A Realm Reborn to have the familiar title in lists.

(I would also suggest normalizing the field names to using a single separator. But only for cleanliness; it’s not essential, and it won’t hurt anything to have image_uris.normal, but it can get difficult to remember. Most peoples settle on kebab case or snake case, but anything will do, so long as there are no spaces.)

If this doesn’t slow things down, and I don’t think it will, then you should find that querying your data is much easier than it would be in JSON. You can see in a current thread that there are discussions about how to make it easier to query JSON tiddlers, and you’ll see me arguing that it might not be worth doing. Right now JSON data is definitely a second-class citizen in TW, and there’s a very good chance that this will long remain so.

1 Like

Agreed. This was a quick and dirty PowerQuery of the JSON data. I’ll come up with a more robust ingestion workflow that cleans this up and standardizes.