Creating a "What's Nearby" feature

Scott_Sauyet · December 28, 2022, 10:33pm

I’m at the very beginning of considering ways to add some sort of “What’s nearby?” feature for my content Tiddlers, and I have several questions.

I’m wondering if there is useful prior art to look at. Are there ready-made tools for this?
I haven’t figured out the presentation, but to do anything worthwhile, I need to be able to collect the data first. These are the factors I’m thinking about, from strongest connection to weakest, for non-system, non-shadow Tiddlers, A and B:
1. A transcludes B / B transcludes A
2. A is tagged B / B is tagged A
3. A and B both transclude C, for some C / both A and B are transcluded by some C
4. A links to B / B links to A
5. A and B are both tagged C, for some C / some C is tagged both A and B
6. A has a field with the value B / B has a field with the value A
7. A has a list field which includes the value B / B has a list field that includes the value A.
Am I missing important other connections? Would you argue for a different ordering of them?
I’m sure I can create a reasonable model from these, then run something like Dijkstra’s algorithm to find the closest tiddlers to my current one. I imagine this would have to be a JavaScript macro, since I can’t imagine coding the closest neighbor algorithm in native wiki tools. I don’t know how to do that yet, but think I can figure that out, and I imagine the APIs available will trivially give me access to all but #1 and #3. Are those also easily available?

Are there other suggestions for how I might go about any of this?

TW_Tones · December 28, 2022, 11:26pm

I imagine in your above examples A is the current tiddler and B is any or all tiddlers? Is this correct?

Very inspiring. Here are some hopefully useful contributions, motivated by your idea, which leads me to imagine very powerful possibilities for TiddlyWiki information/knowledge analysis.

Most of this kind of information can be found behind the Info Icon on the tiddler or made available using filters and plugins that provide references and backlinks etc…

In some ways this would be everything nearby
- Each case should be representable with a filter, with a count taken of each type of relationship.
You could then assign a weight, or use your algorithm, to reduce the list to the “most nearby”. But then it looks like the weight is the number of steps (the distance) between A and B.

This is an expansive question so I “expect more will arise over time” but one could include with respect to the current tiddler, all that can be represented by filter(s).

Tiddlers whose title or other fields contains words in the current one
- This includes tiddlers for which the current one is a prefix or a suffix
Similarly search for words from the current tiddler found in other nominated fields eg keywords, or all fields
Something like the freelinks plugin, detecting titles found in the text.
- Rendering a/all tiddlers and searching the result, to see if the current tiddler is getting listed (via list widget) in another tiddler.
Finding the current title, or part thereof, as/in a fieldname
The relationship between A and B in a hierarchy, the “Kin Operator” would help here.

I do think it would be helpful to have a shortest path algorithm but what gives you the weights in this “Nearby” case?
- Is it the number of steps to the other tiddler? A to B?
- This is just a count of the number of tiddlers between A and B in a given list/set.

I can imagine it in wiki text, macros and widgets, and personally would even want to avoid JavaScript, because the components used to build such a solution could be used independently, adding additional features.
If you were to write a JavaScript macro, I think it best to make it highly generalised, so you simply hand it a list of tiddlers (via a filter), how to determine the weight (another filter), and then the titles of tiddlers A and B between which to find the shortest (and longest) path etc…
- This could then be used many ways beyond your initial “What’s Nearby” feature.
- It makes me think of the “kin operator” which addresses hierarchical sets, but a “critical path tool” could be a tool for networked relationships. Imagine shortest path filter operator.

A shortest path filter operator would be a powerful addition.

if we assume various possible “sets” can be explored with a given filter, and both A and B are in the set, then simply extracting the distance between them will give us a number representing the distance.
I think I can write a filter for this.
Keep in mind the sort order will also influence the distance.

TW_Tones · December 29, 2022, 12:05am

Further;

The Distance between two items (count) in a set is one way to count then find a shortest path, but there will be applications where there is some kind of metric stored in each tiddler in the set, such as a weight that we would want to use instead, so the distance between two tiddlers in a list would be a sum of the weights found in all tiddlers between them.
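As a rough illustration of the two readings above (a plain count versus a sum of stored weights), here is a minimal JavaScript sketch. The function name `listDistance`, the `weightOf` callback and the sample titles and millisecond values are all made up for the example; in a wiki the list would come from a filter and the weight from a field.

```javascript
// Hypothetical helper: distance between two titles in an ordered list.
// `weightOf` is an assumed callback returning a numeric weight per title;
// by default every tiddler in between counts as 1, giving a plain count.
function listDistance(titles, a, b, weightOf = () => 1) {
  const i = titles.indexOf(a);
  const j = titles.indexOf(b);
  if (i === -1 || j === -1) return Infinity; // A and B are not both in the set
  const [lo, hi] = i < j ? [i, j] : [j, i];
  // Sum the weights of the tiddlers strictly between the two.
  return titles.slice(lo + 1, hi).reduce((sum, t) => sum + weightOf(t), 0);
}

// Invented sample data: an ordered list and a per-tiddler weight (e.g. milliseconds).
const toc = ["Intro", "Install", "Filters", "Widgets", "Macros"];
const ms = { Install: 20, Filters: 5, Widgets: 12 };

const byCount = listDistance(toc, "Intro", "Macros");                    // 3 tiddlers in between
const byWeight = listDistance(toc, "Macros", "Intro", (t) => ms[t] ?? 0); // 20 + 5 + 12
```

Because the result depends only on positions in `titles`, re-sorting the list changes the distance, which is the point about sort order made earlier.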

An example may be an average milliseconds between nodes in a network or the internet. eg on windows at command prompt tracert google.com

Scott_Sauyet · December 29, 2022, 6:58pm

Well, I’m imagining that this would run for every pair of tiddlers if this runs at startup/reload, but it could also run for every B when I open A. I’m still in the dreaming stages.

Yes; I think I can get the others easily enough. I was wondering mostly about transclusions. I can do a regex test of the text of a tiddler, but was wondering if there was any more ergonomic API to list all the direct transclusions of a tiddler.
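For reference, the regex approach might look something like this minimal sketch. It is not TiddlyWiki's own parser: it is an assumption-laden scan of raw text that only recognises the `{{Title}}`, `{{Title||Template}}`, `{{Title!!field}}` and `{{Title##index}}` shortcut forms, and the function name `directTransclusions` is invented here.

```javascript
// Assumption: only the double-brace shortcut syntax is considered.
// <$transclude> widgets, filtered transclusions {{{ ... }}} and macros are ignored.
function directTransclusions(text) {
  // Exactly two braces on each side, so {{{ filter }}} is skipped.
  const pattern = /(?<!\{)\{\{([^{}]+)\}\}(?!\})/g;
  const titles = new Set();
  for (const match of text.matchAll(pattern)) {
    const reference = match[1].split("||")[0];        // drop any template part
    const title = reference.split(/!!|##/)[0].trim(); // drop field / index part
    if (title) titles.add(title);                     // empty means "current tiddler"
  }
  return [...titles];
}

const sample =
  "See {{Alpha}} and {{Beta||Card}} plus {{Gamma!!caption}}, " +
  "{{||OnlyTemplate}}, {{{ [tag[x]] }}} and {{Alpha}} again.";
const found = directTransclusions(sample);
```

Titles that themselves contain `!!`, `##` or `||` would be cut short by this sketch, which is part of why a proper API would be preferable.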

If I try to turn this into something fairly generic, this would make sense. It’s not likely important to my current scenario.

Ditto all these

I’ll have to look up the Kin Operator. Thanks!

I’m certain it’s beyond my current skills with wiki text. But if I do pursue this and get it working in JS, then I will come back to this group with my results and ask if there are clean ways of converting it. Even as simple as Dijkstra’s algorithm is, I simply can’t even imagine how to begin coding it in wiki text!

It would make sense to me to be handed a collection of tiddlers and return a structure which gives the distance between each pair of them according to whatever metric we come up with, OR to receive a specific tiddler and a collection of others and return the distances of each from our specified tiddler.

Yes, I haven’t spent real time thinking of the other cases this might offer. As I said, I’m still in the dreaming stage.

I would expect this to work the other way. We would create a metric based on some of the criteria I listed and/or additional ones as you suggested, and the application of that metric would be what it used to find the shortest path.

I’d love to see it!

I would think there is something fundamentally flawed with the metric if this were the case.

TW_Tones · December 30, 2022, 1:39am

This has been done before, and relink is also aware of transclusions. I am sure you will find ways in the forum
The simplest method is searching for the transcluded tiddlers title in text fields perhaps with the }} suffix
However you could intentionally add additional information during the transclude process, eg store all transclusion tiddler titles in a list field of the current tiddler.
- You make a tool so easy to use to transclude tiddlers that also does some additional work for example use the Link button on the editor toolbar to insert a link, a modified version would insert the transclusion and trigger an action to also add the transclusion to a list field or something else.
- [Edited] Then adding the list field to relink config you get “referential integrity”

I will try if possible but it would be of the form

filter in which A and B are members, then allafter[A]allbefore[B] +count[]]

Or rather than count you may use "map" and calculate/accumulate the weights.

Not if the paths and relationships between tiddlers are based on a filter, because filter results are sorted in some way whether asked for or not. See the recent discussion of how using the indexes operator returns sorted results: The Indexes Operator
The power of a shortest path tool is amplified if it responds directly to TiddlyWiki filters.

My dream would be I can avoid this at startup and have it presented only in context, when looking at the tiddlers or;
Or in a dedicated whole of system view/report with or without additional filters for specific views.
It really is good to stick with TiddlyWiki’s always up to date model. You can always take a snapshot if you want to record some “as at” moment in time.

Scott_Sauyet · December 30, 2022, 4:11am

Ah, yes, of course. I am using relink. I will see if I can follow its implementation.

I believe that’s what I’m trying to do!

Right. I did think of this. But I was wondering if the JS API provided something simpler.

That sounds rather fragile. I, for one, don’t use the toolbar for much except the “excise” button. Moreover, what would cause this information to disappear if I deleted the text that included this?

TW_Tones: I think I can write a filter for this.

Scott_Sauyet: I’d love to see it!

TW_Tones: I will try if possible but it would be of the form
filter in which A and B are members, then allafter[A]allbefore[B] +count[]]

That seems to presuppose that you’ve already done the difficult work of computing the distances and applying some shortest-path algorithm to the result. To me that is the core of the matter.

My current environment is a read-only documentation wiki (editable in the normal way by people in the know but not for end users.) So the dynamic behavior is not essential. But I think that since the presentation of the information is likely on viewing a tiddler, having it run at tiddler load would be fine if it’s efficient enough. But, there is a secondary use I can imagine of something like an automatically-generated mind-map style visualization of the entire wiki. Again, if it’s fast enough, it would be fine if that were generated on the fly, but as there are O (n^2) distances to calculate, this might be a problem if the number of content tiddlers is large enough.

TW_Tones · December 30, 2022, 5:25am

There is no disagreement here, more a discourse on a shared interest.

Not exactly. Let’s say I was looking for the shortest path by either of two means
- If both A and B have the same tag, how far apart are they? Return a distance number
- If both A and B are in the same table of contents how far apart are they? Return a distance number
- The smallest number is the shortest path, of the two possible paths.
Now add other filters that measure the distance between the two tiddlers by other methods, and again find the shortest path.
- The embedding of multiple filters in a single string as discussed here How to Delimit more than one filter in a string or field? - #15 by TW_Tones may help here.

One advantage of this approach is you can choose to display the distance or even the path according to the different path algorithms inline.

or share multiple nearby items.

Understood. You could also do this with the snapshot-when-editing method if the start action is too slow. It will just load the result or working values at last save.

Yes we both have an interest in finding similar information for different reasons. I am coming from this perspective Navigating complex tiddler relationships, a discussion for the enthusiast where we can actually navigate alternative paths interactively and the idea of adding “shortest path”, or “shortcuts” is very interesting.

It’s not an overnight project for sure, perhaps a slow burn project.

One other conceptual approach is identifying when a relationship is a “long path” then selecting from a set of alternative methods to add a new shortest or shorter path.

Imagine a tiddler with little or no relationships to other tiddlers actually asking the user/editor for more information to improve this relationship model. eg please categorise, link to or add a related tiddler.
This makes use of the fact that Humans tend to be the smartest component in human computer interactions, and the “right system” will highlight missing information or possible relationships.

twMat · December 30, 2022, 9:46am

It is not fully clear what “nearby” refers to and your use of the word “art” in the quote above further confuses me. But then your text does seem to refer to relationships between tiddlers so then perhaps the Kin filter op by Bimlas is what you’re looking for?

Scott_Sauyet · December 30, 2022, 12:54pm

This thread was meant to clarify that somewhat fuzzy notion. The basic idea, though, is that inherent in the linking and tagging structure of a wiki is some notion of relationship between parts. If two tiddlers share a common tag, they likely have more in common with one another than do two random tiddlers. If they also both link to one another, then there is an even closer relationship. I’m hoping to capture that.

This is mostly an attempt to automatically capture the notion that, for instance, the match Operator is conceptually closer to the join Operator than it is to the Text-Slicer Edition.

My use-case is for a wiki documenting a relatively complex software system. While there are plenty of explicit linkages between tiddlers, it would be nice to offer an unobtrusive, autogenerated, “Related Ideas” section to my content tiddlers.

Sorry, I thought the term prior art was relatively well-known. It simply refers to preexisting related work. While used often in discussions of patents, it’s also common in software development circles:

“I’m thinking of writing a non-hierarchical, heavily linked knowledge management system.”
“Oh really? Have you looked at the prior art?”
“Like what?”
“Obsidian, Roam Research, and especially the wonderful Tiddlywiki”.
“No, I’ll have to check them out.”

Here I just meant to ask if people had already created such tools.

That’s clearly related, but it seems designed to work with well-defined, explicit hierarchies. I’m hoping to work with more implicit relationships, to use the mechanisms of linking, tagging, and transclusion to make clear that the match operator is conceptually closer to the join operator than to the text-slicer edition.

Scott_Sauyet · December 30, 2022, 5:13pm

Absolutely. I asked the question mostly to find out whether this was already done. While trying to do it myself would probably improve my own skills, I would be quite happy to find out that someone had already created a tool like this and I could just use it.

Secondarily, I would love to hear about other types of linkage than the ones in my OP. If I do try this, I would probably eventually make this sort of thing configurable in some manner, but I would prefer to have a relatively stable set of useful relationships to begin.

I’m imagining something slightly more mathematical, and choosing the (inverse) distance as a weighted summation of several different factors. Let’s pick a few of my original ideas, and assign weights (fairly arbitrarily for now):

A is tagged B, or vice versa: 8 points
A links to B, or vice versa: 5 points
A and B are both tagged C for some C: 3 points for the first such tag, 1 point for each additional tag
A has a field with the value B, or vice versa: 1 point

Now we take three tiddlers, X, Y, and Z.

For X and Y, we happen to have X links to Y, for 5 points, X and Y are both tagged T and both tagged U, for 3 + 1 = 4 more points, and a raw score of 9

For X and Z we happen to have both are tagged U, for 3 points, and Z has a field with value X for 1 more, and a total raw score of 4.

We take reciprocals to find edge-weights for X-Y and X-Z of 0.111111 and 0.25, respectively. So X is closer to Y than to Z.

But when we compare X with tiddler W, we find none of these factors come into play, and the raw score is 0, meaning an infinite edge-weight for X-W. They are simply not neighbors.
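The arithmetic above can be sketched in a few lines of JavaScript. The weights are the ones listed earlier in this post; the property names (`taggedWith`, `linksTo`, `sharedTags`, `fieldValue`) and the shape of the input object are invented for the illustration, and nothing here collects the relationships from a real wiki.

```javascript
// Weights copied from the list above; the key names are made up for this sketch.
const WEIGHTS = { taggedWith: 8, linksTo: 5, firstSharedTag: 3, extraSharedTag: 1, fieldValue: 1 };

// The argument is a hypothetical summary of what two tiddlers have in common.
function rawScore({ taggedWith = false, linksTo = false, sharedTags = 0, fieldValue = false }) {
  let score = 0;
  if (taggedWith) score += WEIGHTS.taggedWith;
  if (linksTo) score += WEIGHTS.linksTo;
  if (sharedTags > 0) {
    // 3 points for the first shared tag, 1 for each additional one.
    score += WEIGHTS.firstSharedTag + (sharedTags - 1) * WEIGHTS.extraSharedTag;
  }
  if (fieldValue) score += WEIGHTS.fieldValue;
  return score;
}

// Edge weight is the reciprocal of the raw score; a score of 0 means no edge.
const edgeWeight = (score) => (score > 0 ? 1 / score : Infinity);

const xy = rawScore({ linksTo: true, sharedTags: 2 });    // 5 + 3 + 1 = 9
const xz = rawScore({ sharedTags: 1, fieldValue: true }); // 3 + 1 = 4
const xw = rawScore({});                                  // nothing in common: 0
```

Here `edgeWeight(xy)` is 1/9, `edgeWeight(xz)` is 0.25 and `edgeWeight(xw)` is `Infinity`, matching the X, Y, Z and W cases described above.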

We can use these values directly to list direct neighbors, and sort them to find closest neighbors, and we can use Dijkstra’s algorithm to extend this to shortest paths, and use these for visualizations or for “You might also like” recommendations.
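To make the Dijkstra step concrete, here is a minimal sketch over a small hand-built graph. The X-Y and X-Z weights are the ones from the worked example; tiddler V and every edge touching it are invented so that there is a multi-step path to look at. The graph shape (an object of objects) and the name `shortestDistances` are assumptions, and the linear scan for the closest node is chosen for brevity rather than speed.

```javascript
// graph: { title: { neighbourTitle: edgeWeight, ... }, ... }
// Assumption: every tiddler appears as a key, even if it has no neighbours.
function shortestDistances(graph, source) {
  const dist = {};
  for (const node of Object.keys(graph)) dist[node] = Infinity;
  dist[source] = 0;
  const unvisited = new Set(Object.keys(graph));
  while (unvisited.size > 0) {
    // Pick the closest unvisited node (linear scan; fine for a sketch).
    let current = null;
    for (const node of unvisited) {
      if (current === null || dist[node] < dist[current]) current = node;
    }
    if (dist[current] === Infinity) break; // everything left is unreachable
    unvisited.delete(current);
    for (const [neighbour, weight] of Object.entries(graph[current])) {
      const candidate = dist[current] + weight;
      if (candidate < dist[neighbour]) dist[neighbour] = candidate;
    }
  }
  return dist;
}

const graph = {
  X: { Y: 1 / 9, Z: 0.25 },
  Y: { X: 1 / 9, V: 0.125 }, // V and its weights are invented
  Z: { X: 0.25, V: 0.5 },
  V: { Y: 0.125, Z: 0.5 },
  W: {},                     // no neighbours at all
};

const fromX = shortestDistances(graph, "X");
// Reachable tiddlers other than X, nearest first.
const nearest = Object.entries(fromX)
  .filter(([title, d]) => title !== "X" && d < Infinity)
  .sort((a, b) => a[1] - b[1])
  .map(([title]) => title);
```

In this made-up graph V ends up nearer to X than Z does (1/9 + 0.125 via Y, versus 0.25 directly), even though V is not a direct neighbour of X, which is the kind of result a list of direct neighbours alone would miss.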

This algorithm is by no means set in stone, but it feels like a good starting point… with precise choice of commonalities and their weights still to be decided. I would not be surprised to find that it’s not a true mathematical metric, that it might violate the triangle inequality, but this does not concern me much.

This is a fascinating usage, far from what I was considering, but still obviously related. It’s amusing to think here about Humans being the smartest part of the system, when my initial goal is to algorithmically tease out information that the human writer(s) have not managed to make explicit. But it still works well.

TW_Tones · December 31, 2022, 1:21am

I understand this somewhat but see you have done more thinking about this. I still envisage this as a set of filters which may include the weight and reciprocal within them (or referencing a parameter eg multiply{$:/config/tag-weight})

To me distance is anything you wish it to be from a simple count, multiplication factor or to “something that changes with the weather”.

A worthy goal yes, in someway it is automating what the user would do if he/she could and had the time.

I raised this a few times years ago: the idea of analytics about the TiddlyWiki handed back to the user interactively, which helps them learn about and build on what knowledge/information is already in their wiki. A virtuous circle.

I also think we can,
- A lot can also be done with dates and relative times comparisons.
- how we navigate, can also capture information we can use to help us navigate in the future. Follow paths, develop stories, document journeys through our data.
- Using effective metaphors to help make complex relationships easier to understand.

Scott_Sauyet · December 31, 2022, 4:09am

You may well be right. To me, though, that’s an implementation detail that I’m willing to hash out later. You clearly know a lot more than I do about filters and what can be done with them. Most of my programming over the last fifteen years, and a fair bit for the ten years before that has been in JavaScript, so thinking in its terms comes naturally to me.

I’ve seen enough of your posts to recognize that you’re a strong advocate of doing everything possible with the tools provided by TW, and almost never stepping out of the system to the JS host language. I don’t yet feel I have the experience to make a stand for or against this notion, but it’s definitely not ingrained in me the way functional JavaScript is.

My original training is in mathematics. So I would love this to be a metric, which by definition has several useful properties. This would require that distance (x, y) === distance (y, x), for all x, y, distance (x, x) === 0, for all x, distance (x, y) > 0 for all x != y, and if we want a true metric, distance (x, z) + distance (z, y) >= distance (x, y) for all x, y, and z. But I would compromise on that last if necessary.
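These conditions are easy to test mechanically on a small set of tiddlers. The following is only a sketch: `metricViolations` is a made-up name, the distance function is assumed to return a number or `Infinity`, and the sample weights (including tiddler V) are invented. It checks every triple, so it is cubic in the number of points and meant for spot checks rather than whole wikis.

```javascript
// Returns a list of human-readable violations of the metric conditions.
function metricViolations(points, distance, eps = 1e-12) {
  const problems = [];
  for (const x of points) {
    if (distance(x, x) !== 0) problems.push(`identity: ${x}`);
    for (const y of points) {
      if (x !== y && !(distance(x, y) > 0)) problems.push(`positivity: ${x},${y}`);
      if (distance(x, y) !== distance(y, x)) problems.push(`symmetry: ${x},${y}`);
      for (const z of points) {
        // Triangle inequality, with a small tolerance for floating point.
        if (distance(x, z) + distance(z, y) < distance(x, y) - eps) {
          problems.push(`triangle: ${x},${y} via ${z}`);
        }
      }
    }
  }
  return problems;
}

// Invented direct edge weights: X-Y and Y-V are neighbours, X-V are not.
const direct = { "X|Y": 1 / 9, "Y|V": 0.125 };
const directDistance = (a, b) =>
  a === b ? 0 : direct[`${a}|${b}`] ?? direct[`${b}|${a}`] ?? Infinity;

const problems = metricViolations(["X", "Y", "V"], directDistance);
```

With these sample values the raw edge weights fail the triangle inequality for X and V (the direct distance is infinite, yet the route via Y is finite), while the other three conditions hold.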

There are several ways in which this might not be possible. For instance, there might be disconnected subsets of tiddlers which have no paths between them. But that doesn’t worry me overmuch. For our purposes, we can simply call those distances infinite and be done with it.

Or even more than that, it might help the writer actually learn some previously unrecognized facts about the data. Again, that’s not part of my core goals, but it would not surprise me to have that happen. As you say, it could be a virtuous circle.

Probably not for the data I’m working with, but I can see it in other circumstances. It might help me with my own notes wikis.

In my few-writers-many-readers scenario, I’m not sure if that would really help. But it could be very powerful for single-user wikis.

If you’re suggesting that this can be automated by TW tools, then TW is a lot closer to Artificial Intelligence than I’m really comfortable with.

Scott_Sauyet · January 3, 2023, 4:45am

Just added topic #5700 with a very early proof-of-concept of this idea. I’d love to hear any feedback people have.

Cyrill_Andreani · January 3, 2023, 9:55am

Hey,
your idea reminds me of Mat’s http://seealso.tiddlyspot.com , but without the points scoring!

Scott_Sauyet · January 3, 2023, 2:12pm

I’ll take a look. Thanks.

CodaCoder · January 3, 2023, 5:56pm

This. 10,000 times this.

Scott_Sauyet · January 11, 2023, 3:10am

I finally got around to looking at this. There is definitely an overlapping idea. That is a much simpler concept, I believe, but it’s still interesting. If I have it right, that is simply allowing you to specify certain tags for a footer treatment. If the current tiddler has one of those tags, then a footer is shown listing the other tiddlers that also have it. If the tiddler has multiple such tags, it gets multiple footers.

There’s no information here that you cannot already get from the tag pills, but it gives more emphasis to it for the specified tags.

My version is trying to do something similar, but based not on a specified list of tags but rather the entire linking/tagging/field structure (eventually to also include transclusions.)

This falls somewhere between my notion of the info section, which gives much information about tiddler relations, but very explicitly displaying certain categories of relationships, and the current concept, which is to try to extract relationships, especially tiddler distance, implicitly, based on combinations of all these explicit relations.

Nearby might help writers learn things about their data that they didn’t realize, but its main goal is in situations where the reader of the wiki doesn’t care about details of tiddler relationships, but would still like guidance about things close by. This will usually imply readers who are not also writers of the content, although for large or long-lived wikis, this might still help writers.