Copy info from web site to tiddlywiki using bookmarklet and json tiddler dropzone

I created a json tiddler text deserializer for use with a Dropzone to import json tiddler text. It works, but it turns out that there is a better way to do what I wanted. I am sharing my findings in this somewhat long post; maybe it will be helpful to somebody.

JSON TIDDLER DROPZONE
The use case I have in mind is to use bookmarklets to scrape info from specific web sites to the clipboard, and import it into tiddlywiki by pasting (with CTRL-V) json tiddler text into a dropzone. Tiddlywiki already supports import of json tiddler files using the import button and via drag and drop. In contrast, this approach:

  • uses bookmarklets to capture info from web pages.
  • uses the clipboard to export the captured info from bookmarklets, in view of the security-related constraints on bookmarklets.
  • uses the json tiddler format to transfer multiple pieces of info, spread across one or more tiddlers, to tiddlywiki.
  • imports by pasting directly (using CTRL-V) from the clipboard into a dropzone in Tiddlywiki.

A simple use case is to copy a selection of text from a website, along with the site url and title and import as a tiddler like this:

  [ { "text":"A sample selection of text from a web site",
      "type":"text/vnd.tiddlywiki",
      "title":"Illuminate your world",
      "url":"https://thinking-about-things.com/"} ]

This is similar to the use case for BibTex, from which I took the cue to create the json tiddler deserializer:

The dropzone and deserializer work! With them, I can select some text from a web site, launch the bookmarklet to copy the text and url to the clipboard, then click the dropzone in tiddlywiki and press CTRL-V (paste) to import the info as a tiddler.
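For readers who want a feel for the bookmarklet side, here is a minimal sketch of how the json tiddler text for one selection could be built. It is not the code from the packaged json file; the function name `buildTiddlerJson` is made up for illustration, and the clipboard call in the comment assumes the page and browser allow `navigator.clipboard.writeText`.

```javascript
// Hypothetical helper (not the packaged bookmarklet): build json tiddler text
// for one clipped selection, in the same shape as the sample above.
function buildTiddlerJson(selection, pageTitle, pageUrl) {
  const tiddler = {
    text: selection,
    type: "text/vnd.tiddlywiki",
    title: pageTitle,
    url: pageUrl
  };
  // The dropzone expects an array of tiddlers, even for a single one.
  return JSON.stringify([tiddler], null, 2);
}

// Inside a bookmarklet (browser only) the call might look like this:
//   const json = buildTiddlerJson(String(window.getSelection()),
//                                 document.title, location.href);
//   navigator.clipboard.writeText(json); // assumption: clipboard write is permitted
```

Keeping the JSON building separate from the DOM and clipboard calls makes it easy to try the helper in the browser console before wrapping it as a bookmarklet.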

I tried other bookmarklets as well, such as one that captures all the image links from a web site into a tiddler, and one that captures a list of local file links through the browser so they can be referenced within tiddlywiki. The latter is possible because the web browser can browse local file directories, and the file links in the directory page can be captured with a bookmarklet. The json tiddler dropzone and deserializer plus the bookmarklet examples are packaged in this json file: json tiddler dropzone.json (12.1 KB)

JSON TIDDLER INPUT BOX
Before I hit on creating the json text deserializer myself, I found an alternate solution that uses a text input box instead of a dropzone. (The thread above came closest to creating a json text deserializer, yet somehow didn’t take that last step. Tiddlywiki also has a built-in json text deserializer for file import, but I had no idea how to use it for my purpose, so I copied its javascript code for my json text deserializer instead :slight_smile:) The idea came from the thread “Importing a tiddler with a specific name - #4 by sobjornstad” by saqimtiaz, which breaks down the tiddlywiki file import button into its component steps:

$browse -> import with deserializer -> Import preview using "$:/Import" tiddler -> tm-perform-import

I learnt that if I start instead with a text input box that accepts json tiddler text in the format for “$:/Import”, I can create the “$:/Import” tiddler directly and send a “tm-perform-import” message to complete the importing, like this:

\define handleTextPaste()
    <$vars textin=<<actionTiddler>>>
    <$action-createtiddler  
        $basetitle="$:/Import" 
        type="application/json" 
        plugin-type="import" 
        status="pending" 
        text=<<textin>> 
    >
    <$action-sendmessage $message="tm-perform-import" $param=<<createTiddler-title>>/>
    </$action-createtiddler>
    </$vars>
\end

<style> .wideEdit { width:100%; } </style>
<$edit-text field="dropjsontext" placeholder="Paste JSON IMPORT text here" tag=textarea class="wideEdit"  />
<$button>
   <$vars actionTiddler={{!!dropjsontext}} >
   <$action-setfield dropjsontext=''/>
   <<handleTextPaste>> 
   </$vars>
{{$:/core/images/copy-clipboard}} Import
</$button>

The json tiddler format for “$:/Import” is well documented in the thread “How do TWs JSON Formats Look Like” by pmario. Only a minor change is needed in my bookmarklet to create it. Below is a sample of the “$:/Import” json tiddler format. If you copy and paste it into the text input box above and click the [Import] button, a new tiddler is (silently) created. No json tiddler text deserializer is required.

{"tiddlers":
     {"Illuminate your world": 
        { "text":"A sample selection of text from a web site",
          "type":"text/vnd.tiddlywiki",
          "title":"Illuminate your world",
          "url":"https://thinking-about-things.com/"}}}
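The "minor change" in the bookmarklet amounts to wrapping the tiddlers in an object keyed by title instead of emitting a plain array. A rough sketch of that conversion follows; `toImportPayload` is a hypothetical name of mine, not something from the packaged bookmarklets.

```javascript
// Hypothetical helper: convert an array of tiddler objects into the
// {"tiddlers": {title: tiddler}} shape used as the text of "$:/Import".
function toImportPayload(tiddlers) {
  const byTitle = {};
  for (const tiddler of tiddlers) {
    // If two entries share a title, the later one replaces the earlier one.
    byTitle[tiddler.title] = tiddler;
  }
  return JSON.stringify({ tiddlers: byTitle });
}
```

Feeding it the single-tiddler array from the dropzone example should give text equivalent to the sample above, ready to paste into the input box.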

JSON TIDDLER DROPPABLE ZONE
I tried to do the same with the Droppable widget. The Droppable zone below uses the same handleTextPaste() actions as the text input box above. If you DRAG the json text above (or a text file containing the json text) into this Droppable zone, the tiddler will also be created (silently). However, I can’t find a way to paste the json text from the clipboard into the Droppable zone using CTRL-V.

<$droppable actions=<<handleTextPaste>>>

<$button style="width:10em;height:10em;">Droppable zone - Drag your JSON IMPORT text/file here</$button>

</$droppable>

BOOKMARKLET EXAMPLES FOR JSON TIDDLER INPUT BOX AND FDIR@VIEWER APPLICATION
The json text input box plus the example bookmarklets are packaged in this json file json tiddler input box.json (22.3 KB).

The bookmarklets in this case output their contents in the “$:/Import” json format expected by the text input box. Also included is a “fdir@viewer” tiddler to navigate and inspect imported tiddlers. This application started life as a basic $list to check the contents of tiddlers imported thru the bookmarklets, and sort of grew into a pseudo folder-based navigator and viewer. It shows a navigable folder view of imported tiddlers in the top half and the content of the selected tiddler in the bottom half.


Screenshot of fdir@viewer showing list of files and directories captured from the browser directory view

You can explore the sample bookmarklets using fdir@viewer as well.

SUMMARY
In retrospect, I prefer the text input box approach to the json tiddler dropzone:

  • Dropzone doesn’t have the “actions” option for further processing. It does provide an import preview, which I do not need for my purpose, but I understand it can be tapped for some intermediate processing by modifying the import view template to add some button actions.

  • The text input box approach doesn’t invoke import preview. Note that the import using tm-perform-import will overwrite existing tiddlers with the same name. I can’t find an option to disable overwrite.

  • The text input box approach has the flexibility to process the json tiddler text further before sending “tm-perform-import” to import the tiddlers. An example is the “fdir@viewer” application, where the input box processing extracts info from the json tiddler text to update the $:/Import tiddler before importing, so that the $:/Import tiddler can be re-purposed as an import-cum-directory tiddler by fdir@viewer.

  • It would be best if the droppable zone could support CTRL-V pasting via the clipboard. It was a pleasant surprise to me that the text input box with an Import button works equally well, and that a json tiddler text deserializer is not needed after all.

Thanks for sharing @jacng, this is very interesting to me. You seem to have discovered things I have not, and I suspect I know a few you do not as well.

  • I think a lot of people may be interested in this, especially for desktop integration.

I am interested in your bookmarklet code too.

Off to test your solution, thanks

This is more than amazing! This can be converted to a great TW clipper.
The import of files is really helpful for creating albums of images stored in an external folder.

Yes, you are right indeed! I was still using v5.1.23 when I last looked into dropzone. I remember clearly that there weren’t many options for Dropzone, and that is indeed so for v5.1.23. I subsequently switched to a copy of v5.2.3 for Help references, but somehow didn’t re-check Dropzone, so I missed the newer features altogether :frowning: Will look into it. Thanks!

Will be glad to hear your feedback.

The post “Advanced bookmarking: combining LinkPreview API with TiddlyWIki?” might be of interest to you. I still use this regularly in nearly all my wikis to drag a link into my TW and have a very nicely formatted tiddler created.

I didn’t know about this post, thanks for the info!

I’m aware of those nice site images and descriptions though, because I also capture them in my personal python-based bookmarking solution. Unfortunately, not nearly enough web sites use them; it is mostly news sites, where the info provides a pretty link preview when we share their news links.

A bookmarklet is actually in a better place to retrieve this info than a third-party site like LinkPreview (which also needs an API key to access), because a bookmarklet runs in the context of the web site, and this info is in the html header of the web page based on The Open Graph protocol (ogp.me). The catch, as I found out, is that this info is not consistently defined by the many, many web sites out there despite the protocol standard, so a script to retrieve it will have to try different variations to cover most bases. I’m using a python module (GitHub - erikriver/opengraph: A python module to parse the Open Graph Protocol) in my own solution. I suppose there is a javascript library somewhere out there that can do the same in a bookmarklet.
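To illustrate the "try different variations" idea, here is a small sketch of a fallback lookup over a page's meta tags. It is not the python module's logic and not a library I am recommending; `pickPreview` and the particular fallback order (Open Graph first, then Twitter card names, then the plain description tag) are my own assumptions.

```javascript
// Hypothetical helper: choose preview fields from meta tags, trying several
// common variants. metaTags is an array of { key, content } pairs, where key
// is the tag's "property" or "name" attribute.
function pickPreview(metaTags, fallbackTitle) {
  const lookup = new Map();
  for (const { key, content } of metaTags) {
    const value = (content || "").trim();
    // Keep the first non-empty value seen for each key.
    if (key && value && !lookup.has(key)) lookup.set(key, value);
  }
  const firstOf = (keys) => {
    for (const k of keys) {
      if (lookup.has(k)) return lookup.get(k);
    }
    return "";
  };
  return {
    title: firstOf(["og:title", "twitter:title"]) || fallbackTitle || "",
    description: firstOf(["og:description", "twitter:description", "description"]),
    image: firstOf(["og:image", "twitter:image"])
  };
}

// In a bookmarklet (browser only) the tags could be collected like this:
//   const tags = Array.from(document.querySelectorAll("meta")).map((m) => ({
//     key: m.getAttribute("property") || m.getAttribute("name"),
//     content: m.getAttribute("content")
//   }));
//   const preview = pickPreview(tags, document.title);
```

The resulting fields could then be added to the json tiddler as extra fields alongside the url.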

Oh, I should highlight that the json tiddler text deserializer is an outcome of the amazing works you and Saq have pioneered on the BibTex deserializer. Thanks, guys!

The great TW clipper is actually Tiddlyclip, which I consider the premium one-click do-it-all clipper. This is a lightweight alternative for when we don’t have the extension to do the heavy stuff, and is just one more of the many ways to import stuff into Tiddlywiki.

@Mohammad

This bookmarklet by @jacng will do the trick. I plan to use it with the spotlight and macy plugins to create beautiful albums of local images.

@jacng This is excellent work. I think you should create a tiddlyhost demo wiki for users to try it out easily (which should be just a matter of a few minutes to set up).

I use tiddlyclip extensively in my wikis. I think your bookmarklet can complement tiddlyclip. Tiddlyclip configurations which I use are shown in the home page of this wiki.

One thing I miss in tiddlyclip is saving multiple images from a website like twitter in a single step. Can it be done with this bookmarklet? For example, some tweets may have multiple images in them. Currently I have to clip each image separately using tiddlyclip. Is it possible to select all those images together and paste them into the dropzone or a docked tiddler? (Docked tiddler is a new feature in tiddlyclip: if we click on the dock viewtoolbar of the tiddler, the subsequent clips are added to the docked tiddler only.)
Tiddlyclip goes one step further and saves these clipped images in a subdirectory of the wiki as local images with the help of the savemedia addon. Can you make use of the savemedia addon and save the images captured by the bookmarklet into a local folder? This might be difficult, I guess.

These are just some suggestions which came to my mind. Thanks for sharing the bookmarklet with the community.

There is a sample “Bookmarklet - Copy all image links on a web page to TW.js” in both JSON files. When launched in a web page, it simply copies all the html image tags found in that web page as a list of [img[]] image links that can be imported as a tiddler into tiddlywiki. No idea if it works well with sites like Twitter though.
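The core of such a bookmarklet is small. The sketch below is an illustration only, not the code of the sample file; `toImageLinkTiddler` is a hypothetical name, and removing duplicate and empty urls is my own addition.

```javascript
// Hypothetical helper: turn a list of image urls into one tiddler whose text
// is a list of [img[]] links, one per line.
function toImageLinkTiddler(title, imageUrls) {
  // Drop empty entries and duplicates (e.g. the same icon used many times).
  const unique = [...new Set(imageUrls.filter(Boolean))];
  const text = unique.map((url) => `[img[${url}]]`).join("\n");
  return { title, type: "text/vnd.tiddlywiki", text };
}

// In a bookmarklet (browser only) the urls could be gathered with:
//   const urls = Array.from(document.querySelectorAll("img")).map((img) => img.src);
//   const tiddler = toImageLinkTiddler(document.title, urls);
```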

As to copying out the images, a browser extension/add-on like Tiddlyclip is the better approach as it “works like an extension of the browser”, with inherently more access rights than bookmarklets. In comparison, bookmarklets run in the context of the web site and face security restrictions with respect to local disk access.

Local images
By the way, if you use a bookmarklet to copy local image links thru the browser directory listing, then instead of generating a list of image links in one tiddler, it is also possible to create individual image tiddlers with a canonical uri (the _canonical_uri field) containing the image link, if that is useful for your purpose.
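A sketch of the individual-tiddler variant, under my own assumptions: `toImageTiddlers` is a hypothetical helper, the extension-to-type table is a small illustrative subset, and links with unrecognised extensions are simply skipped.

```javascript
// Hypothetical helper: one image tiddler per link, with the link stored in
// the _canonical_uri field so the image itself stays outside the wiki.
function toImageTiddlers(imageUrls) {
  // Illustrative subset of content types; extend as needed.
  const types = {
    png: "image/png", jpg: "image/jpeg", jpeg: "image/jpeg",
    gif: "image/gif", svg: "image/svg+xml", webp: "image/webp"
  };
  const safeDecode = (s) => {
    try { return decodeURIComponent(s); } catch (e) { return s; }
  };
  return imageUrls.flatMap((url) => {
    // Use the last path segment (without query or hash) as the tiddler title.
    const fileName = safeDecode(url.split("/").pop().split(/[?#]/)[0]);
    const ext = fileName.includes(".") ? fileName.split(".").pop().toLowerCase() : "";
    if (!types[ext]) return []; // skip anything that is not a known image type
    return [{ title: fileName, type: types[ext], text: "", _canonical_uri: url }];
  });
}
```

Note that two files with the same name in different folders would end up with the same title, so a real bookmarklet may want to include part of the path in the title.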

On the other hand, for processing local image files, I think tools like powershell scripts, though platform specific, are better than bookmarklets, as they have full disk access, can traverse the directory tree, and can also generate json files or copy stuff to the clipboard for easy transfer to tiddlywiki.

Hi @jacng,
This is a nice and useful piece of work!
I have often needed to import tables from html pages, and sometimes the tables are big and lengthy!
Can your solution capture tables from html, as it does images, and then import them as WikiText tables?

Thank you

If you can extract precisely the table data you need from the page, then you can import it thru one or more tiddlers and manipulate it however you wish within tiddlywiki. I would think the length and size of the table is not a constraint, unless it’s one of those dynamic, endlessly scrolling web pages. The complexity of the task very much depends on the web page; there is no one-size-fits-all solution, I’m afraid. Beware that some web sites may restrict bookmarklets from running.

The usual process is to check the html code of the specific web page using the developer console “More → Developer Tools → Element”. Find the table and poke around the table elements in the console to see how you can extract what you need, looking out for identifiers that help to pinpoint those table elements. Then craft the javascript bookmarklet to do exactly that, consistently.

An example is the file directory page in the browser, which uses an html table internally. Look up how the sample bookmarklet extracts file links from the browser file directory.

Images can be retrieved by simply looking for ALL the <img> tags using document.querySelectorAll("img") (which may include things like icons, buttons, invisible images, background/shading images, etc).

Similarly, you can pull up all the table rows by looking for the <tr> tag using document.querySelectorAll("tr"), loop thru the cells with the <td> tag blindly, and hope that is enough. But html tables are also widely used internally in web pages for organising the page layout and presenting design elements, so you can have table within table within table :sweat_smile:
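Once the rows and cells have been pulled out, turning them into a WikiText table is the easy part. A minimal sketch follows; `rowsToWikiTable` is a hypothetical helper, and replacing "|" inside a cell with the html entity `&#124;` is my assumption about a safe way to keep it from being read as a cell separator. Merged cells (rowspan/colspan) and nested tables are not handled.

```javascript
// Hypothetical helper: convert an array of rows (each an array of cell texts)
// into WikiText table markup, optionally treating the first row as a header.
function rowsToWikiTable(rows, hasHeader = true) {
  const clean = (cell) =>
    String(cell)
      .replace(/\s+/g, " ")       // collapse newlines and runs of spaces
      .trim()
      .replace(/\|/g, "&#124;");  // assumption: entity keeps "|" from splitting the cell
  return rows
    .filter((row) => row.length > 0) // skip rows without cells
    .map((row, i) => {
      const prefix = hasHeader && i === 0 ? "!" : ""; // "!" marks a header cell
      return "|" + row.map((cell) => prefix + clean(cell)).join("|") + "|";
    })
    .join("\n");
}

// In a bookmarklet (browser only), with `table` being the element you
// identified in the developer console:
//   const rows = Array.from(table.querySelectorAll("tr")).map((tr) =>
//     Array.from(tr.querySelectorAll("th,td")).map((td) => td.textContent));
//   const wikitext = rowsToWikiTable(rows);
```

The resulting string can go straight into the text field of the json tiddler.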

A follow-up on using Dropzone with bookmarklet. I’m sharing my findings on post-Drop processing of imported tiddlers.

Dropzone (v5.2 onwards) has an “actions” attribute for post-drop processing, as highlighted by TW_Tones. Mark offers an example in “Bulk import local images into single tiddler as links - #24 by Mark_S”.

What I learnt:

  1. Pasting a json tiddler string using ctrl-v works if the dropzone is defined using <$dropzone deserializer="application/json">. No additional deserializer is required :sweat_smile: I believe this is from v5.2.0 onwards. On v5.1.27, which I’m still using, I can’t get the dropzone to work with a json tiddler string and had to fall back to the input box method.

  2. In the same thread ( Bulk import local images into single tiddler as links - #32 by Mark_S ), Mark highlighted that the “$:/Import” tiddler json text has been json-stringified again by the import process and could be ‘brittle’. That’s my experience as well while working on this example ( Bulk import local images into single tiddler as links - #39 by jacng ). Instead of working with the raw pre-import json text, the better approach seems to be to import the json tiddlers text and then process the imported tiddlers. The text and field values of imported tiddlers are also easily accessible without having to parse the raw json tiddlers text.

  3. To my surprise, checking for post-import tiddlers is straightforward and doesn’t require hooking into the import mechanism. Tiddlywiki seems like magic to me sometimes :slight_smile:

  4. In this example of post-import tiddler processing, the bookmarklet plants a field value images.link.update="no" into the json tiddlers text. A $list filter loop placed next to the dropzone detects it and performs the post-import processing:

'<$list filter="[has[images.link.update]search:images.link.update:[no]]" variable="unprocessImport" >'
         <!--- Process the imported tiddlers `unprocessImport` here ------->

The Dropzone also validates the dropped json text by checking for the planted field “images.link.update” before performing the import:

<$dropzone 
   deserializer="application/json"
   actions=""" 
   <$let rawtext={{{ [<importTitle>] :filter[search:text:["images.link.update": ]] +[get[text]] }}}  >
      <!-------------------  Images import found : perform import ----------->
      <$list filter="[<rawtext>!match[]]" variable="none">
         <$action-sendmessage $message='tm-perform-import' $param=<<importTitle>> />
          ...

The json for this example for reference: JsonTiddlersforImages 3.1.json (9.2 KB)

Thanks for this work and documentation @jacng

Yes, TiddlyWiki already has a comprehensive set of solutions that is not evident on the surface. It helps to think of it as a platform that we can extend and into which we can incorporate many things.

  • I would encourage other open source projects to adapt their solutions for tiddlywiki, because the audience, the prototyping, etc. are then simplified. With this platform they get:
    • Editors and a user interface, organisational methods, tags and fields, tiddlers that can be objects, a filter and query language, single file and server, to name but a few.