Access to image metadata (EXIF)

Over in the thread on lightboxes, interest has been expressed in having access to image metadata (EXIF and perhaps XMP and IPTC?) in TiddlyWiki.

A quick search came up with three libraries that might be useful towards this end:

Of these, only piexifjs allows modifying the metadata as well, and actually seems like it would be very straightforward to implement for embedded (binary) image tiddlers as it works with base64 data. This might make a good beginner project for someone.
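To make the embedded case concrete, here is a minimal sketch. It assumes an embedded JPEG tiddler keeps its base64 data in the text field with a type of image/jpeg, and that the EXIF reader accepts a JPEG data URL (as far as I can tell, piexifjs's `piexif.load` does). The tiddler shape and the function names `toJpegDataUrl` and `readEmbeddedExif` are made up for illustration. The loader is passed in as a parameter so the sketch does not depend on any particular library.

```typescript
// Hypothetical minimal shape of an embedded image tiddler.
interface ImageTiddler {
  title: string;
  type: string; // e.g. "image/jpeg"
  text: string; // base64-encoded image data
}

// Any function that turns a JPEG data URL into a metadata object.
// With piexifjs this would presumably be `piexif.load` (assumption).
type ExifLoader = (jpegDataUrl: string) => Record<string, unknown>;

// Build the data URL that a base64-based EXIF reader expects.
function toJpegDataUrl(tiddler: ImageTiddler): string {
  if (tiddler.type !== "image/jpeg") {
    throw new Error(tiddler.title + " is not a JPEG tiddler");
  }
  return "data:image/jpeg;base64," + tiddler.text;
}

// Read metadata from an embedded JPEG tiddler using the supplied loader.
function readEmbeddedExif(
  tiddler: ImageTiddler,
  load: ExifLoader
): Record<string, unknown> {
  return load(toJpegDataUrl(tiddler));
}
```

Since the data is already base64, no decoding step is needed before handing it to the reader, which is what makes the embedded case look so approachable.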

exif-js might be easier to get working with non-embedded images, as it seems to work with any image being displayed, but it can only read metadata, not modify it. Furthermore, having to display every image to retrieve its metadata and then potentially filter based on it may not be ideal.

Is anyone aware of other libraries that are worth consideration?

What isn’t clear and would be helpful is knowing the potential use cases for metadata access. Some user stories or scenarios would be very helpful here.

Off the top of my head some unanswered questions are:

  • embedded images, external images on a local drive, or remote online images? In particular external images on a local drive might be difficult to work with.
  • a one-time extraction of image metadata, or an extraction on demand when displaying an image?

Thanks @saqimtiaz

That is a very useful starting place!

TT, x

That’s interesting. Yesterday I read a blog post that mentioned the BlurHash library: https://blurha.sh/

It would let us create initial thumbnail tiddlers that contain the extracted EXIF data in fields, plus a _canonical_uri pointing to the original, to keep the TW small.

When the image is shown the BlurHash can be used as a thumbnail and the images can be lazy loaded.

So the wiki itself would be searchable but small.

That would be a use case that I personally prefer.
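A rough sketch of what such a placeholder tiddler could look like, treating a tiddler as a plain map of string fields. The `_canonical_uri` field is the one mentioned above. The `exif-` field prefix, the `blurhash` field name and the function name are assumptions made up for this example.

```typescript
// Field values in a tiddler are strings.
type TiddlerFields = Record<string, string>;

// Build a small placeholder tiddler: EXIF values become fields,
// while the image itself stays external via _canonical_uri.
// The "exif-" prefix and the "blurhash" field are hypothetical names.
function makePlaceholderTiddler(
  title: string,
  imageUri: string,
  exif: Record<string, string | number>,
  blurhash: string
): TiddlerFields {
  const fields: TiddlerFields = {
    title: title,
    type: "image/jpeg",
    _canonical_uri: imageUri,
    blurhash: blurhash,
  };
  for (const key of Object.keys(exif)) {
    // Lower-case the tag name so it is usable as a field name.
    fields["exif-" + key.toLowerCase()] = String(exif[key]);
  }
  return fields;
}
```

Because every EXIF value ends up in an ordinary field, the wiki stays searchable on it without carrying any image data.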

BlurHash

In short,
BlurHash takes an image, and gives you a short string (only 20-30 characters!) that represents the placeholder for this image.

You do this on the backend of your service, and store the string along with the image. When you send data to your client, you send both the URL to the image, and the BlurHash string.

Your client then takes the string, and decodes it into an image that it shows while the real image is loading over the network. The string is short enough that it comfortably fits into whatever data format you use. For instance, it can easily be added as a field in a JSON object.


Some personal remarks. … I have tens of thousands of images on my HDs and another 1000++ on my phone. Movies not counted.

All of them have EXIF data, but not a single one of them has additional meta data. Because the existing editing tools I know all suck. So I never did add any new data there.

On the contrary, when I send “polished” images to friends, I remove the EXIF info for privacy reasons.

All the meta info is part of the directory structure. … That’s “just good enough” for me.

Just my thoughts


Interesting approach.

Personally I would prefer to create an actual thumbnail no matter how low the quality.

Yes and it is a recurrent source of stress to remember that step.


Here is a quick and dirty proof of concept of using the piexif library to display metadata for JPG image tiddlers:

https://saqimtiaz.github.io/sq-tw/temp/piexif.html

I’m curious as to the nature of your photo EXIF data that requires this step? I can’t remember a time I have intentionally stripped metadata from photos. I currently have about 270GB of photos and videos…

One reason may be this: Exif - Wikipedia … But geolocation is not the only info that can tell a lot about the owner. E.g. Exif contains the model name … If I have a look at this data and you happen to have a Canon model, where fewer digits in the model number usually correspond to higher prices, I know exactly in what price range I can target you with additional equipment ads :wink:

oh, so you are a whistleblower or journalist? Spy?

I mean, I’m not trying to talk you out of it, but for me I would never worry about it. Most of the time I want my photos to keep all of this data, even when publicly shared. If I had a real worry about privacy or security, I’d just not post it, or send it to friends. :man_shrugging:


@dixonge @pmario sits on our shoulders or in the breast pocket and makes sure we are aware of all security risks, I value this highly.

There are many occasions where one may want to obscure the metadata, especially for photos taken in private places (remove location data). If I remember, I would remove everything before posting on any social media site, but I often edit and crop photos first anyway, which is the opportunity to censor.

Notwithstanding the above, as a photography hobbyist and traveler I really like to keep the metadata for my own use. I would love to integrate all this with the file uploads plugin, with EXIF data stored in tiddler fields, and use TiddlyWiki as a master database of my 100 thousand plus photos.

Also, some enthusiast photography sites prefer this information be made available for insight into the photographer’s methods, and definitely if selling photos.

Nice Saq.


Whoah! Amazing you are able to put this together so quickly! As a non-programmer that kinda thing wows me!

Looking at the detail of the scope of the meta-data retrieved: it is brilliant! e.g. …

I’ll comment separately on my use cases.

Best, TT

Right. I do see there could be several basic approaches.

Though I’ve mentioned it before it might be useful to re-iterate one typical use case of mine …

– 4,000 paintings by school children in the UK of “Guy Fawkes Night”. Scanned to jpg. Stored as external images in a sub-dir of the wiki at fireworks.html/img/. This is for local use and online use with the same structure on a host. So external images at relative addresses for both local and online.

– Each image contains meta-data I added via an image editor for:

  • Name:
  • Age:
  • Gender:
  • School:

– It is this data I want to display when the image is loaded. So extraction of meta-data on demand that exists in TW only for the showing.

I hope this is clear and the reason to want to do it that way too :slight_smile:

Best wishes
TT

I would paraphrase that as follows:
Extraction of metadata on demand where the metadata is not saved in the TW.

Some thoughts off the top of my head follow.

Extracting metadata from images could easily get resource heavy, so it would make sense to cache the data for repeated display of the same image. There could also be use cases where users want the metadata to persist in the wiki. The logical approach therefore would be to facilitate extracting metadata on demand and saving it in a tiddler, from where it can be displayed and re-used as needed. A setting that determines whether the tiddlers created to hold the metadata are temp tiddlers would allow for both use cases: one where the metadata is saved in the wiki, and one where it is only available until the wiki is reloaded.
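The caching idea can be sketched roughly as follows, with a plain `Map` standing in for the tiddler store. The class name, the `persist` flag and the `$:/metadata/` prefix are assumptions made up for this example; only the `$:/temp/` prefix for unsaved tiddlers reflects an existing TiddlyWiki convention. The extractor is passed in, so any library could sit behind it.

```typescript
// Extracted metadata for one image, kept as a plain object.
type Metadata = Record<string, unknown>;

// A small stand-in for a tiddler store: title -> JSON text.
class MetadataCache {
  private store: Map<string, string> = new Map<string, string>();
  private persist: boolean;

  // `persist` mirrors the proposed setting: false means use a temp title.
  constructor(persist: boolean) {
    this.persist = persist;
  }

  // Title of the tiddler that would hold the metadata (hypothetical naming).
  titleFor(imageTitle: string): string {
    const prefix = this.persist ? "$:/metadata/" : "$:/temp/metadata/";
    return prefix + imageTitle;
  }

  // Return cached metadata, extracting only on the first request.
  get(imageTitle: string, extract: (title: string) => Metadata): Metadata {
    const title = this.titleFor(imageTitle);
    const cached = this.store.get(title);
    if (cached !== undefined) {
      return JSON.parse(cached) as Metadata;
    }
    const metadata = extract(imageTitle);
    this.store.set(title, JSON.stringify(metadata));
    return metadata;
  }
}
```

With this shape, the only difference between the two use cases is the title the data is stored under. The extraction and display paths stay the same.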

A related issue is that the structure and content of the image metadata can vary widely so deciding how to store it in a tiddler would be challenging. Considering the nested nature of the data, the simplest approach would be to just save the raw JSON data and use the new core JSON operators currently in development to fetch and display what is required. That may however not be very user friendly for non-technical end-users.
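To illustrate the raw JSON approach, here is a small sketch of looking up one value in nested metadata by a list of keys. It is a plain function, not the core JSON operators, and the group and tag names in the usage example are illustrative only, since the real structure depends on the library and the image.

```typescript
// Look up a value in nested JSON text by following a list of keys.
// Returns undefined as soon as any step of the path is missing.
function getNested(json: string, path: string[]): unknown {
  let current: unknown = JSON.parse(json);
  for (const key of path) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Usage with made-up group and tag names:
const rawMetadata = JSON.stringify({ Exif: { ISO: 200 }, Image: { Model: "X100" } });
const iso = getNested(rawMetadata, ["Exif", "ISO"]);
const missing = getNested(rawMetadata, ["GPS", "Latitude"]);
```

Here `iso` is 200 and `missing` is undefined. A missing group does not raise an error, which matters when the metadata varies from image to image.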


Right. That is clearer.

TT

Right!

A footnote to my “Guy Fawkes Night” use case, which I did not mention (to avoid complexity), is that the end-user should be able to “favourite” an image. In those cases I was thinking only that the “favourite” holds the link to the image (so you can visit it again) AND adds one additional field for “comments”.

Hope this is clear!
TT

In theory yes. But meta-data saved, eventually, for 4,000 images?

That was always my concern—that burdening TW with meta-data for 4,000 images was overkill?

I totally see for galleries of say 100 images it is no problem saving all the meta-data.

But for my use case of 4,000 I’d say it doesn’t add anything to do that, except for a few where the end-user wants to preserve a link to an image???

Just thoughts!
I don’t actually know.
TT

Please pay attention to the subsequent sentence. As I mentioned, there could be a setting so that the metadata is saved in $:/temp/* tiddlers. Temp tiddlers would only be created for the images that are actually viewed. They are not saved in the wiki: they exist only in memory until the wiki is closed or reloaded, and should have no significant impact on the wiki in any form.

My emphasis in bold …

Right.

BUT I think IF there is a final user interface that can query the JSON simply it will be well okay with idiots like me.

Just a comment
TT

Ciao @saqimtiaz

I think that approach is good!

Part of the issue for a developer is that there are at least 3 scenarios …

1 – Only fetch meta-data for temporary use (e.g. only a sample from TT’s 4,000 for showing). Images always external. Nothing kept.

2 – Fetch meta-data from images for permanent storage in TW. Images stay external to TW, but the link and the kept meta-data are internal (e.g. smaller numbers needing decent docs, say 100).

3 – Fetch meta-data AND images from an arbitrary address for permanent storage in TW (e.g. my 15 cat images with meta-data).

I think this is the point where, in a way, it gets complicated unless we also think about what it is FOR.

Just comments
Not wanting to muddy-the-waters.

TT

Keep in mind that metadata or EXIF data is going to be small compared to a typical image. My guess is not more than 1 KB per image, so 4,000 images would need about 4 MB. That is safe even in a single-file wiki, though perhaps I would need a Node install for my 100k images. And one can purge data one doesn’t need.
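The back-of-the-envelope estimate, written out as a tiny sketch. The 1 KB per image figure is the guess from above, not a measurement, and the function name is made up.

```typescript
// Rough storage estimate: image count times an assumed per-image
// metadata size in kilobytes, converted to megabytes (1 MB = 1000 KB).
function estimateMetadataMegabytes(
  imageCount: number,
  kilobytesPerImage: number
): number {
  return (imageCount * kilobytesPerImage) / 1000;
}

const schoolPaintings = estimateMetadataMegabytes(4000, 1);
const photoArchive = estimateMetadataMegabytes(100000, 1);
```

With the guessed 1 KB per image, 4,000 images come to about 4 MB and 100,000 images to about 100 MB.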

I think that is very true. One MIGHT keep such meta-data regardless at lowish overhead.

But in my 4,000 scenario what would be the added value of doing that?

I guess you could analyse & filter on Age / School / Gender but to do that meaningfully I’d hope you’d be a child psychologist, art therapist or serious lover of primitive art.

A comment
TT