AI model tuned to know TW by reinforcement learning

Until last week I thought that, in order to train an AI to gain expertise in a particular field, you needed to provide a huge amount of sanitized, high-quality input data.
For example, to fine-tune an AI so it understands wikitext, how to use TW filters, and so on, you had to gather a lot of data covering these topics. (I recall there is an initiative to collect this sample data.)

Buuuut, what if a functional TW is all that is required?

It looks like DeepSeek’s reinforcement learning approach lowers the barrier to training models, and at the same time GPT Operator suggests that an AI can autonomously interact with browsers.

So we might just need to give an AI a TW user interface to play with, and then fine-tune it using reinforcement learning techniques.
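
To make that concrete, here is a minimal sketch of what such a training environment might look like, assuming a Gymnasium-style interface and Playwright for driving the browser; the action encoding and reward below are hypothetical placeholders, not a working trainer:

```python
# Hypothetical sketch only: a Gymnasium-style environment that wraps a
# live TiddlyWiki in a real browser so an agent can act on it and
# receive a reward. Action encoding and reward shaping are placeholders.
import gymnasium as gym
from playwright.sync_api import sync_playwright


class TiddlyWikiEnv(gym.Env):
    def __init__(self, url="http://localhost:8080"):
        super().__init__()
        self._pw = sync_playwright().start()
        self._browser = self._pw.chromium.launch()
        self.page = self._browser.new_page()
        self.url = url

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.page.goto(self.url)  # reload a fresh wiki for each episode
        return self._observe(), {}

    def step(self, action):
        # Here `action` is just a filter expression evaluated through
        # TW's own JS API; a real agent would drive the actual UI
        # (clicks, keystrokes) instead.
        result = self.page.evaluate(
            "(f) => $tw.wiki.filterTiddlers(f)", action
        )
        reward = 1.0 if result else -0.1  # placeholder reward signal
        return self._observe(), reward, False, False, {}

    def _observe(self):
        # Crude observation: the rendered page text.
        return self.page.inner_text("body")

    def close(self):
        self._browser.close()
        self._pw.stop()
```

A real setup would also need proper observation/action spaces and a much more careful reward signal, but the point stands: a running wiki itself becomes the training data.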

I’m consistently confused by these conversations. What is it that people are trying to achieve with AI/Tiddlywiki that they cannot already?

The documentation for TW could definitely use some work, but it’s sufficient that, if you provide an AI with references, it’s able to handle even complex operations. And AI is already trained on an enormous amount of JS.

I agree, however, that Operator would be useful if you were trying something very complex that required multiple iterations to fine-tune – in that case, I would love to be able to leave the AI running at computer speed rather than having to test a thing myself and then communicate the results back to the model.

My understanding is that Operator is not quite at the level of being able to do this well, YET. Anthropic’s computer use is supposed to be better (though it still needs development and supervision) – however, I haven’t attempted to use it yet because Anthropic’s user-data protections are not as robust as I’d like.

You can easily get whirlpooled into the unknown when creating complex functions/wikitext structures; on occasion I’ve spent long hours, even days, coming up with solutions.
Documentation is great once you have a clue about what you might need; sometimes you don’t even know that, and you come to this forum to ask for help.

So this TW-trained AI would serve to speed up that process and flatten the skill set required to create custom solutions for your TWs.

Hmm. Maybe. What model are you using?

It’s not my intention at all to disregard other people’s experiences – but I have found that, when using a sufficiently advanced model such as Claude 3.5 Sonnet, most of the times I get sucked into extended whirlpools like the ones you’re describing are actually rooted in my inability to conceptualize or articulate the actual question I am trying to solve – and that’s something that no amount of training will resolve.

These aren’t tools for replacing thinking: it’s my experience that, oftentimes, even if I get stuck in a long conversation “unnecessarily,” I actually learn something about assumptions I have made or unconscious wording that may be inappropriately framing the problem.

So, while I would love to find that a model trained on an extensive TW dataset actually results in a far superior experience, I have serious doubts, and I want to challenge people to think more broadly about this:

  1. Models that can search the web now have access to all the documentation – point a model to the TW website and enable deep search.

  2. As you point out, a model that is able to interact with an environment is going to be far better at gathering data about, and responding to, what’s happening than the average person who is just trying things and communicating back to the agent, acting as an intermediary.

  3. On the other hand, a human intermediary can, in many situations, be a vast improvement, avoiding many of the logical pitfalls that transformers fall into.

I wonder, is the goal really that we have tools that give perfect solutions to every problem on the first use? And is that really realistic?

So far in human history, there have never been any tools that have worked like that – all tools require skill and practice, and complex problems always require strategic application of those skills and practice. Maybe AI is going to be the first tool that exceeds this “limitation,” but I believe that may be an unrealistic expectation for a glorified hammer.

What is it that people are trying to achieve with AI/Tiddlywiki that they cannot already?

Having used GitHub Copilot and Cursor in other programming languages for about 2 years now, they can definitely save a lot of time in at least three situations:

  • You need to write something kind of fussy and large, but not complicated – the LLM can quickly draft something that works which you can then tweak as necessary
  • You have a bug and know you’re missing something obvious, but can’t spot it – the LLM will often (though certainly not always) immediately see the problem
  • You know exactly what you want to do, but can’t quite remember the right widget / API / pattern – the LLM can give you the example immediately so you don’t have to hunt it down in the documentation

I guess it has been a little while since I tried this, but the last time I tried any of it with TiddlyWiki, the LLM fundamentally misunderstood how TW worked and kept giving answers with invalid syntax, so you can’t reliably do any of these things with wikitext yet.

I read recently that one of the standard frontier models was able to write competently in a language from New Guinea that only had a couple hundred speakers (and no known text in the training set) given just a grammar and dictionary fed in as cached context. So I’ve been inclined to try using a data dump of Grok TiddlyWiki and/or the documentation as context to Sonnet (maybe Gemini if 200,000 tokens isn’t enough), but haven’t gotten around to trying it yet.
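
For what it’s worth, here is roughly what that experiment could look like using the Anthropic Python SDK’s prompt caching; the file name, model alias, and example question are placeholders I made up:

```python
# Rough sketch of the experiment described above: hand Claude a TW
# documentation dump as cached context. The file name, model alias,
# and question are placeholders; assumes the Anthropic Python SDK.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env

with open("tw-docs-dump.txt") as f:
    docs = f.read()

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    system=[
        {"type": "text",
         "text": "Answer TiddlyWiki questions using the reference below."},
        # cache_control marks the big docs block for prompt caching, so
        # follow-up queries don't re-process the whole dump each time.
        {"type": "text", "text": docs,
         "cache_control": {"type": "ephemeral"}},
    ],
    messages=[{
        "role": "user",
        "content": "Write a filter listing tiddlers tagged 'Project' "
                   "that were modified in the last seven days.",
    }],
)
print(response.content[0].text)
```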

WikiSage has all the pieces to achieve this, though I’d probably wanna push an update before a test on that scale :slight_smile: However, this use case is exactly what I’ve had in mind, so I’ve given the model reference material on TiddlyWiki, and it has access to all the tiddler titles, which it can reference when a title seems relevant to a search. It then caches some of this information for chain-of-thought reference, so it doesn’t have to go through the same search-and-lookup process every time.
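
For readers curious what that lookup step might look like, here’s a hypothetical sketch assuming a Node.js TiddlyWiki server and its TiddlyWeb-style JSON endpoints; WikiSage’s actual implementation will differ:

```python
# Hypothetical sketch of a title-lookup-and-cache step like the one
# described above, assuming a Node.js TiddlyWiki server; this is not
# WikiSage's actual code.
from functools import lru_cache

import requests

BASE = "http://localhost:8080"


def all_titles():
    # The Node server serves "skinny" tiddlers (fields, no body) here.
    skinny = requests.get(f"{BASE}/recipes/default/tiddlers.json").json()
    return [t["title"] for t in skinny]


@lru_cache(maxsize=256)
def fetch_text(title):
    # Cache tiddler bodies so repeated references to the same tiddler
    # skip the search-and-lookup round trip.
    r = requests.get(f"{BASE}/recipes/default/tiddlers/{title}")
    return r.json().get("text", "")


def relevant_context(query, limit=5):
    # Naive relevance: keep titles sharing a word with the query, then
    # pull their text into the model's context.
    words = set(query.lower().split())
    hits = [t for t in all_titles() if words & set(t.lower().split())]
    return {t: fetch_text(t) for t in hits[:limit]}
```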

I’ve also been working to implement a “handoff” so if a query goes beyond the max tokens, it will summarize and hand off to “another” model, resetting tokens.
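
Something like this, perhaps – the threshold, model alias, and summarization prompt below are all placeholders, not the actual implementation:

```python
# Sketch of that handoff: once the running total nears the context
# limit, summarize and restart with a fresh token budget. Token counts
# would come from the API's usage reports (e.g. response.usage).
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-3-5-sonnet-latest"
TOKEN_BUDGET = 150_000  # arbitrary cutoff below the real maximum


def handoff_if_needed(messages, used_tokens):
    # Assumes `messages` ends on an assistant turn, as it would
    # mid-session, so appending a user turn keeps roles alternating.
    if used_tokens < TOKEN_BUDGET:
        return messages  # plenty of room, keep the full history
    summary = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=messages + [{
            "role": "user",
            "content": "Summarize this conversation so another "
                       "assistant can continue it: goals, decisions "
                       "made so far, and open questions.",
        }],
    ).content[0].text
    # The "other" model starts over with only the summary as history.
    return [{"role": "user",
             "content": f"Continuing a prior session. Summary:\n{summary}"}]
```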

Some work could definitely be done (by me eventually, if nobody else undertakes it) to rewrite Grok TiddlyWiki into prompting language, which would make it more effective for the model – though that’s not strictly necessary, especially with a sophisticated, “grown-up” model like Sonnet 3.5.

I would say that both of these are too niche for TiddlyWiki code. I access models through You.com, which includes web search, and those models are designed for broader knowledge application – they therefore seem more capable to me at writing TW language than the same model queried the same way through Copilot.

To be clear, when I tried this with TiddlyWiki I ran a query of a similar type directly against Claude Sonnet; I didn’t try to use Cursor.
