Expanded ChatGPT Interface

well-noted · October 20, 2024, 8:42pm

Hey all, I posted a while back in ChatGPT for TiddlyWiki to suggest some fixes and improvements - - I have continued to play around with expanding its capabilities and I’ve reached the point where it feels like posting a separate plugin would be appropriate and useful

$__plugins_NoteStreams_Expanded-Chat-GPT.json (46.2 KB)

Please note, this is my first time packaging and releasing a plugin - - I’ve likely missed steps and may not have done enough to acknowledge the original plugins creator (@Sttot) and/or quite a lot of other things, though I have tested this on a blank Wiki and it works. Most of my experience with Tiddlywiki and javascript more generally has come from kit-bashing other people’s projects. I have recently decided to begin repackaging some of the (many) customized versions of plugins that I use, but will need some time to untangle the hairs and perform testing.

Please let me know if there’s anything inappropriate about sharing this plugin or the format of this post and I will attempt to correct.

Also, I value @linonetwo for pointing out in the original ChatGPT post that the new version of TW has a chatbot plugin under development which also has the (exciting!) ability to use local models - - as far as I can tell, that version is currently (and maybe even only intended) to be a chat bot.

That is only one of the features of this plugin, however. An AI agent within Tiddlywiki has been a long time dream for me, and I find that a chatbot, while cool, does not do quite enough to justify itself.

Here are some of the important features of this plugin:

Enhanced Memory: The agent can remember 5 previous messages by default, but this can be adjusted in the widget settings.
Reference Tiddler: This version creates a reference tiddler to store content deemed important about the user, enhancing long-term memory and personalization.
Default Model: Uses gpt-4o-mini by default, which can be changed in the widget however please note that this widget uses multiple API calls at its own discretion in order to complete a request, each one of these costing money
That said, even though I have been extensively testing using both simple and complex prompts over the last several days, the most I’ve paid on any given day, using this as the default model, was $0.60 and even then I was prompting, modifying the code, and prompting again.
Contextual Awareness: The system prompt provides context for the agent, including its TiddlyWiki environment and its capabilities to reference and modify tiddlers.
Extended Capabilities: The AI assistant can interact with TiddlyWiki in various ways, including:
- Retrieving tiddler titles and content
- Searching tiddlers by content, tags, or custom fields
- Adding, retrieving, and revising notes about the user
- Accessing conversation history
- Creating and modifying tiddlers
Streams Plugin Compatibility: By default, this plugin is compatible with the Streams plugin. The agent will refer to the content within a tiddler’s stream-list field in addition to the content of the tiddler itself, providing a more comprehensive context.
Specific Tiddler Reference: Users can reference a specific tiddler as context for the conversation by using the tiddlerTitle parameter in the widget.
Access to the Current Time: As in, the agent can check the date and report info about tasks that may be overdue, for example, or list tiddlers that were created on a particular day. It can also be used to create timestamps which can be used in text, fields, tags, or titles.

I have not attempted to play with the new versions chatbot yet – though I am incredibly enthusiastic about the changes in the new version, I have not done any work with SQLite as of yet and have no idea if this version I’m releasing would be easily converted to utilize the SQLite database or not. I can report that I developed and tested on a MultiWikiServer instance, though, and it seems to do data retrieval well. That said, it is compatible with both the prerelease and the current version, at the very least.

Alright, now that acknowledgements are over, let’s look at some examples so we can talk about the great advantages that can be leveraged by using ML within a TW Environment.

As you can see, this agent has been given the ability to create new tiddlers based on natural language prompts.

Here you can see a demonstration of 1) its conversational memory and 2) its ability to make changes to existing tiddlers based on the users natural language request.

You can see that the agent is not only able to recognize specific tiddlers and call information from them, but will also consider content from that tiddler’s stream-list when answering a prompt, if it deems that appropriate. It can do this by searching for a matching title or for the content in any of the tiddlers aliases fields

Here you can see that the agent self-selects information that it deems important and stores that in a reference tiddler, which it is able to use when it deems appropriate to responding to a user’s prompt. This allows one to “train” (deceptive choice of word in this context) the agent, by explaining important parts of one’s personal workflow. This is not specifically called out in the above example, but one can see that the agent does not simply add to this list, but can recognize if important information is already in the reference material, and (revise) the entry without modifying the rest of the content.

And I have also personally followed @jeremyruston 's example and used a small brain icon to “summon” the agent – this is not included in the plugin, but is simple enough to implement

It’s also possible to create a sidebar agent which directly references the storyTiddler – you can do this with

<$chat-gpt useCurrentTiddler="yes" />

{{$:/HistoryList!!current-tiddler}}

The above code is telling the widget to send the contents of the storyTiddler along with the prompts, and then it is transcluding the title below that for the users reference. This way, one might interact with their wiki without having to specify to which tiddler they are referring.

All of this together, I think, creates an incredibly helpful tool for interacting with one’s own content.

Some improvements I’m considering for the future:

Wikify responses within the chatbox
Give the agent the ability to open tiddlers in the story river
Expanded control over tiddlers (i.e. the capability to delete tiddlers – with a user confirmation – from the interface) For obvious reasons, I have been unwilling to add this feature until I was sure the plugin was stable-ish.
Expanded Reference capability that the agent might categorize its own reference material and to do so in a way that is more legible to the user (if one wants to review what it is the agent is using as reference)
Multimodal capabilities (image recognition, voice to text/text to voice, etc.)

linonetwo · October 21, 2024, 7:42am

These are features that I also want in core AI tools plugin, maybe you can PR them to the original PR (fork the repo, check out the branch, and create new branch based on that branch. And create PR , select the target branch to be that brahcn).

This way, we can make sure future AI features are on the same foundation. And as you may see, we are deprecating the old chatgpt plugin in favior of the core plugin one. Building new features based on it may not get enough discussion like dev in the core AI plugin.

I don’t have time for AI plugins recently, because I want to refactor the WYSIWYG editor plugin, so AI in it can works like AI in notion.

Scribs · October 21, 2024, 2:10pm

great work, this looks super useful!

CodaCoder · October 21, 2024, 4:03pm

@well-noted

That’s an extraordinary feat of accomplishment. Well done.

You would do well to mention Safety and Security, how prompts are tunneled, what stays local and what leaves the local machine.

The linked article here concerns me a great deal: LLMs NOT welcome in my wikis -- yet.

well-noted · October 21, 2024, 5:33pm

Thank you, @CodaCoder – I’m happy to discuss, with the caveats that 1), this will not be a comprehensive discussion of the topic at this time and 2) I do not use my TW to handle especially sensitive information, i.e. information I wouldn’t be more than mildly embarrassed about (such as personal reflections). It should go without saying that if you handle national security secrets, for example, you shouldn’t include this plugin in your wiki.

^{Minor note on point 2, however, my understanding leads me to believe that one of the most exciting parts of MWS would be that sensitive data kept in one bag could be viewed in the context of a recipe that also includes a general bag full of less sensitive, and the bag containing the sensitive data could also be kept separate from the recipe which includes access from a non-local LLM}

To your point, broadly:

When making an API call with OpenAI, inputs are no longer used for training unless one applies for that through a multi-step process. This is different than if one uses the ChatGPT interface in which one has to opt out of training data. There is definitely real concern for the gen-pop user who might be discussing sensitive work data with ChatGPT through the web-interface who may not have gone to the trouble of opting out - - it has been documented time and time again that LLMs are especially sensitive to “coercive behavior” designed to override their usual protocols. That said, I am not especially concerned in this context about data being passed through API calls being accessible via one of these methods of attack.

The particular behavior the article you link to in your post, (Invisible text that AI chatbots understand and humans can’t? Yep, it’s a thing. - Ars Technica) does make clear that OpenAI API calls do not include sensitive information within invisible text and therefore its users are not susceptible to the potential security threats related to that.

This, to my eye, is one of the most important reasons to opt for OpenAI (API calls specifically) at this time (though I do like Claude, for example, for certain tasks): Because they have “all-eyes-on-them” as the big name in ML tech (I don’t include Google because it is so diversified whereas OpenAI is “all-in”) they have an especially strong interest in finding strategies to avoid major security pitfalls such as the invisible text problem described. I have seen @linonetwo suggest Anthropic API calls - - at this time, I would not use Claude for contextual searches such as those enabled by this plugin.

Now:

There is a cache of information that is very relevant to this conversation but I have not mentioned yet - - pretty recently, OpenAI began caching common user prompts that are made through API calls. This cache is tied directly to your API key - - if someone had your API key, they could certainly access this cache through coercive prompting.

However, the larger point here is that, if you have structured your prompts well (in this case, you would want the system prompt, which is generally static, to precede the user prompt, which is usually dynamic) the agent will select to pull that information from the cache rather than handling it as original text to process (again, broad overview of the behavior). This means (generally) that, once someone has run the same prompt multiple times, they will avoid having to pay for processing the same system prompt multiple times, and they will only have to pay for the original content.

For example, I use a script that processes all of my reading notes from a plaintext format:

^{Exported from Boox}

Into individual jsons which are formatted for Tiddlywiki and contain all the tags:

^{Individual JSON format created by running python script which reorganizes information via openAI api call to gpt-4o-mini}

I have a pretty detailed (and extensive) prompt which I have developed over time to produce incredibly consistent results using even small models like gpt-4o-mini – The script I’ve developed passes this (static) system prompt through the API call first, and then extracts the characters from a .txt file in order to reprocess them.

Since I do this nearly every day to import notes into my wiki, the system prompt is cached (and tied to my unique API key) so the system recognizes that it can skip over that content – then I am only paying for the dynamic content which are my notes (usually this costs less than a cent a day even for multiple chapters across multiple books).

For our purposes here, it means that we can have a very extensive system prompt – say, one that explains how Tiddlywiki works, how the agent should respond to different situations, a full and detailed list of its capabilities, a set of fallback instructions for handling unique or unfamiliar situations, etc, without the cost becoming excessive in the long run, and, as long as you don’t share your API key could even include personal information.

It is therefore important, when developing a strategy like this, to make sure that all static prompting precedes dynamic prompting

CodaCoder · October 21, 2024, 5:43pm

Thank you. Much appreciated.

well-noted · October 27, 2024, 5:32pm

Announcing the release of the improved version of the Expanded ChatGPT Plugin

$__plugins_NoteStreams_Expanded-Chat-GPT (.8).json (73.2 KB)

Updates in this version:

Wikified text: Chat responses in this version are wikified to include internal links and tiddlywiki features
Increased Context: Agent has been given more context for what it means to be a Tiddlywiki agent, and has been given explicit understanding of how to perform tiddlywiki operations including filtering, tables, lists, etc.
Additional Macro: Clarify Text: <<clarify-text>> macro which takes the content of the storyTiddler and provides a pass at editing the content
Additional Macro: Text Completion: <<yadda-yadda>> macro which takes the content of the storyTiddler and appends the text with additional text to complete the sentence logically.

In addition to these macros, there are streams-specific versions <<clarify-text-streams>> and <<yadda-yadda-streams>> which perform roughly the same operation within the context of a Streams edit box and which can be added as an actionMacro to the streams context menu or can be included as a keyboard trigger (as demonstrated below)

Will be opening up a discussion about these and future ways of incorporating ML into Tiddlywiki going forward, but want here to briefly demonstrate some of these abilities:

^{Here you can see both a demonstration of the ability to perform a filtered list operation, and also to provide that list as links}

^{And a demonstration of how the model can split large amounts of content into multiple tiddlers with contextual naming schemes}

^{Here you can see the ability to combine a contextual search with the creation of functional buttons in its answer}

^{From raw data can also produce logically formatted tables. Not seen here, the user can prompt the agent to provide the wikitext used to generate the table, and copy/paste that to use the table anywhere}

^{A basic implementation of the Streams plugin, using the context menu}

vivaldi_NPMpGgEXUH
^{A keyboard-activated implementation of the streams-specific plugin <<yadda-yadda-streams>>}

signal-2024-10-27-072729_002
^{A demonstration of the <<yadda-yadda>> macro when implemented as (packaged) button from the edit box}

signal-2024-10-27-075712_002
^{A demonstration of the clarify text function implemented as a (also packaged) button from the edit box}

Will post some more soon about ideas for future development, and some of the ways that these new features could be implemented functionally into one’s workflow.

All earlier terms and conditions apply

it appears stable on multiple tests I’ve run and so far I’ve not had it make any major overwrites (though it certainly has the capability!) so please always back up.
Works (for me) on the current version (5.35) and the prerelease of (5.36) running MWS
A brief word of warning about the new macros – if you call them directly in a tiddler, or do not properly """<<encapsulate>>""" the action within its trigger, the macro will trigger immediately and repeatedly and you will be unable to delete that tiddler from the dropdown menu (because it will be constantly updating), requiring you to perform an operation to delete it.

CodaCoder · October 27, 2024, 6:41pm

What are the prerequisites? Do users need to have a paid license?

well-noted · October 27, 2024, 10:49pm

You will just need an OpenAI API key and you’ll need to fund you account. Go into the config file or the readme for the plugin and you will find a box that you can fill… or you can just replace the content of the api-key tiddler. As mentioned above, all the functionality described works with the Gpt-4o-mini model which is the cheapest and which I’ve set as the default – I’ve spent less than
3 dollars altogether running dozens of API calls a day to develop and test

well-noted · October 28, 2024, 12:21am

No prerequisites in terms of non-core plugins as far as I am aware - - once you plug in the API key it should work out of the box. If one is additionally using the streams plugin, the two streams macros that I mentioned should also work, though that would require the user to take the extra step of adding the macro to the context menu template and/or adding a keyboard shortcut to the streams editor template.

well-noted · October 28, 2024, 6:04pm

Minor update:
$__plugins_NoteStreams_Expanded-Chat-GPT (0.0.83).json (75.3 KB)
^{Improved chaining of actions and made this capability more explicit to agent}

well-noted · November 2, 2024, 5:17pm

Major release of the Expanded ChatGPT Plugin

$__plugins_NoteStreams_Expanded-Chat-GPT (0.0.9).json (92.9 KB)

If anyone has any problems with this release, I’ll be happy to address. Can report it is working for me on 5.3.5 and 5.3.6 pre release.

Updates in this version:

Timezone now automatically corresponds to user’s desktop
$:/plugins/Notestreams/Expanded-Chat-GPT/User-instructions tiddler, included in config, allows user to input custom instructions to be sent along with each query
Model switcher

Dropdown menu on interface allows for switching between models
Added $:/plugins/Notestreams/Expanded-ChatGPT/model-list tiddler, included in config, which allows user to add or remove any OpenAI models

Multimodal

Agent has been given ability to query dall-e-3 for image generation upon request
Vision Capabilities
a) Transclusions of image tiddlers are pushed for analysis
b) Upload button allows users to reference image files from system without importing
c) Add listener event for “Paste” which allows images to be uploaded by pasting into chat box

Custom Instructions allows users to input their own instructions to be pushed with each query

This is kind of a big update, so let’s get into it:

As promised, a space has been provided in the config that allows the user to input custom instructions such as their personal shorthand, which will be sent immediately following the system prompt each time a query is sent.
^{See above conversation of caching}

This should give the user far more control over their outputs, and, in that vein,

a model selection dropdown has also been included in this version. This allows the user to select from any openAI language model that exists currently.

Models can be added or removed from this list from the config menu now, allowing user to select with more detail and consideration.

These are the updates designed to give the user more control over their agent and the outputs they might expect. The default is still gpt-4o-mini, which I have found is excellent for most tasks – however, this will allow the user to switch to gpt-4o or any other model they feel might be better suited for a particular task, and easily switch back without interrupting a conversation.

Now for the fun updates:

As you can see, the agent itself has been given the ability to call for image generation and render those in the conversation history. Rather than the user prompting an image model directly (which I experimented with and decided against) this allows for a much more natural flow of conversation and allows the agent more flexibility.

Additionally, this update includes vision capabilities, which allows the model to recognize images either as transclusions or as uploads from the file system

I don’t really have a way of representing this, but one can also upload files by copying an image and then pasting into the text box.

This can allow for some very useful abilities:

As one can see above, I doodled in paint a nonsense chart – the model was able to accurately depict the relationship described visually through the creation and tagging of original tiddlers.

Even with limited context, this ability performs quite well – additional context should result in more refined results.

And here you can see these two skills being implemented in one query/response block

This is prettymuch as far as I’m planning to go with this phase of the project: As a final step I will be going through the plugin to do some cleanup of the code and maybe add a few fixes and new features for a version which I’ll release here, but wont necessarily be in a hurry to release unless users report a bug or have a request I may have already fixed in the experimental version.

The next most obvious phase I foresee is the multimodal ability to work with audio - - since I think MWS is going to be a major step-forward for me, in terms of incorporating non-tiddler files into my wiki, I am quite comfortable waiting for the official release before focusing too much on this

Cheers!

CodaCoder · November 2, 2024, 5:54pm

You may be surprised, but I’m impressed with Custom Instructions. Great idea.

well-noted · November 2, 2024, 6:42pm

(@CodaCoder thanks for reminding me to add it to the list)

And here I thought the best touch was pasting to upload

Mostly I just got tired of opening the JS in order to add new instructions

well-noted · November 3, 2024, 3:44pm

Minor Fix - 0.9.1:

$__plugins_NoteStreams_Expanded-Chat-GPT (0.9.1).json (95.3 KB)

^{Realized Transclusion pattern for images was interfering with text, minor fix which distinguishes images from text and, since I was in there, also pushes a stream-list for text tiddlers, if it exists}

well-noted · November 6, 2024, 9:17pm

I know I’m going against what I said only a few days ago, but I’m releasing a not-insignificant upgrade,

because of this plugin’s mention in the Tiddlywiki Newsletter likely to bring it more traffic, and
because I finally had the agent misunderstand and perform a task that was frustrating enough to undo that I felt it amounted to a “fix” that I could not ignore

$__plugins_NoteStreams_Expanded-Chat-GPT (0.9.3).json (116.9 KB)

The “fix” in question is technically a new feature, but probably deserves to have been included earlier in the process – an undo button will now undo most any action the agent can perform. Just a reminder that this includes Creation and Modifying of tiddlers.

vivaldi_u5bwbgW5pX

As you can see here, one of the “experimental” features that’s packaged with this update is the ability for the agent to open and close Tiddlers in the story river. Another, which can also be affected by the undo button, is a function for Renaming Tiddlers

Behind the scenes, there’s another improvement that I believe might qualify as a “fix” – the agent must now perform a process verifying its actions before it reports that it has performed them. This eliminates a problem in which the model would either

hallucinate that it had performed an action without even attempting to do so, or
attempt to perform an action, following the methods for doing so that it has been provided, and reporting that the action had been completed based on the assumption that calling the function was always sufficient for an action to be completed.

This latter point may be a bit confusing, as it’s not unreasonable to assume that functions would consistently perform if they are constructed well – to elaborate, I will use another improvement as an example:

There are occasions in which the agent might be responding to “partial hits” – let’s take the case of “Please add the tag Yolo to all tiddlers with their Generation field value set to Millennial”

This type of complex task requires multiple steps which the agent first breaks down into a list performs one at a time.
It is very possible in a situation like this that only some of the appropriate changes are performed, even if the model has correctly identified all the changes that need to be made.
In such a case, it would be reasonable for the agent to affirm that modification of the tiddlers has taken place, because it is accurately reporting partial successes.

A change to the modifyTiddler function has reinforced the agents ability to perform this kind of multi-step task – Additionally, the verification step makes sure that the agent confirms that the action has actually been performed, attempts a maximum of 3 times if it has not, and then informs the user of the results, rather than arbitrarily reporting.

This is a feature that still very much qualifies as experimental. Validation can become complex quickly and balancing that against wait times and timeouts and retries is all a bit complicated… While it’s definitely stable and safe to use at this point, there’s definitely still refinement I’ll be making to this process before it’s completely seamless.

While I have a few other “fixes” along these lines that I am working on, in addition to UI features and prompt refinements that I would like to make, I feel that it is timely and valuable enough to release this improved version for people to experiment with.

Without going too deeply into those fixes/features, it might be worth mentioning that I have begun with some experimentation that would allow the agent to perform the undo operation however, for the time being, it is incapable of that.

I think the button is a more valuable feature, frankly, but imagine a situation where the agent has performed an incredibly complex operation involving more than a dozen unique tasks in one query. The ability to tell an agent to undo “the last action,” could, effectively, undo dozens of actions at once, rather than requiring the user to sit and click the button dozens of times – a possibility I consider relevant.

JanJo · November 11, 2024, 7:04pm

Could this Plugin use handwriting recognition in you have a ChatGPT Account?

well-noted · November 11, 2024, 7:19pm

@JanJo, one will need to have an API key to use this at all, but other than the account associated with that, there is no need for a ChatGPT subscription or anything like that.

If one has that, it will perform handwriting recognition, yes, you can see in Expanded ChatGPT Interface - #12 by well-noted that it has taken an image of my (pretty poor) handwriting and done a solid job of interpreting it and setting the text of the tiddler accordingly.

well-noted · November 11, 2024, 7:24pm

On a separate note, the validation and verification steps that I am trying to to incorporate for more complex tasks are not quite there yet - - I intend to do further work to refine that, but any suggestions from the group (either specific code or abstract ideas about how the operations would go) are greatly appreciated.

In the meantime, the plugin continues to perform very well and I’ve incorporated it into many of my daily tasks – one should just verify that tasks have been performed correctly, which is much easier since you can prompt the agent to open the tiddler for you to validate, and the undo button works very consistently.

JanJo · November 11, 2024, 7:43pm

Could the API-Key be hidden to make the function accessible without revealing the Key?