How to Filter A + (B-A)? - and AI Generated Code

TW_Tones · July 10, 2024, 5:17am

I replied there

twMat · July 10, 2024, 6:34am

What, specifically, is not clear in my OP?

twMat · July 10, 2024, 6:45am

Oh! That’s very neat and seems very efficient. In my ignorance about these things, I’m guessing that appending is just a matter of the system changing a pointer or two, and that unique then does a more costly traversal to compare titles, but that can’t be avoided given the requirement.

saqimtiaz · July 10, 2024, 6:57am

There is unlikely to be a significant performance difference between this filter and the one I suggested, as the costly bit is the unique operator. So I suggest using whichever fits your cognitive pattern best.

TW_Tones · July 10, 2024, 8:26am

It is not that we can’t get clarity on what your problem is, but phrased another way, we may be able to state it in terms of logic or plain language much more easily. What is the reason you want this to be evaluated, and specifically to maintain this order?

You don’t have to answer that, since you have the answer already.

I just think we may be able to provide a whole class of solutions, not just a specific one.

Given list A append list B, retaining the order and removing duplicates

This is exactly how [enlist<A>append<B>unique[]] reads.
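For anyone who wants to see that plain-language statement outside of filter syntax, here is a minimal Python sketch of the same idea. It only models the behaviour described above (append, then keep the first occurrence of each title). It is not TiddlyWiki's implementation, and the function name and sample titles are made up for illustration.

```python
def append_unique(list_a, list_b):
    """Append list_b to list_a, then keep only the first occurrence of each title."""
    seen = set()      # titles already emitted
    result = []
    for title in list_a + list_b:
        if title not in seen:
            seen.add(title)
            result.append(title)
    return result


# Hypothetical sample data, standing in for lists A and B.
list_a = ["Apple", "Banana", "Cherry"]
list_b = ["Cherry", "Date", "Apple", "Elderberry"]

print(append_unique(list_a, list_b))
# ['Apple', 'Banana', 'Cherry', 'Date', 'Elderberry']
```

The order of A is preserved, and only the titles of B that are not already in A are added after it, which is the A + (B-A) result the thread asks for.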

Mario · July 10, 2024, 8:50am

Hi folks, I was not able to split the AI posts from the OP, so @twMat, I changed the thread title to fit the actual content.

Springer · July 10, 2024, 12:34pm

Fascinating! I would not have thought that unique could be more costly than any other of the listops when working with similar numbers of items. Are you suggesting that an additional remove step (as in [enlist<list2>] -[enlist<list1>] +[prepend<list1>]) would be less costly (computationally) than using unique (as in [enlist<list1>append<list2>unique[]])? If so, we have a genuine tradeoff here between machine-efficient code and natural-reading, easily-expandable code.
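To make the comparison concrete, the two approaches can be modelled in a few lines of Python. This is only a sketch of the logic, assuming neither list contains internal duplicates. The function names are invented for illustration, and nothing here says anything about how TiddlyWiki implements these operators.

```python
def remove_then_prepend(list1, list2):
    """Model of: [enlist<list2>] -[enlist<list1>] +[prepend<list1>]"""
    in_list1 = set(list1)
    # Keep only the list2 titles that do not appear in list1.
    remainder = [title for title in list2 if title not in in_list1]
    # Put list1 in front of what is left.
    return list(list1) + remainder


def append_then_unique(list1, list2):
    """Model of: [enlist<list1>append<list2>unique[]]"""
    # dict.fromkeys keeps the first occurrence of each key, in insertion order.
    return list(dict.fromkeys(list(list1) + list(list2)))


list1 = ["a", "b", "c"]
list2 = ["c", "d", "a", "e"]

print(remove_then_prepend(list1, list2))
print(append_then_unique(list1, list2))
# Both print ['a', 'b', 'c', 'd', 'e']
```

Under that assumption the two give the same result, so the choice comes down to cost and readability rather than output.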

If Dominant Append is the most efficient (machine-wise), and unique is the least so, then perhaps I should finally trot out my own Rube Goldberg solution now (which respects one of my desiderata, which is being easily able to accommodate any number of lists, which the remove-prepend acrobatics cannot):

[enlist<list2>reverse[]] [enlist<list1>reverse[]] +[reverse[]]

Just backing that train right up to the station…
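In case the reversing is hard to follow, here is a rough Python model of why it works. It assumes, as I understand Dominant Append, that an unprefixed run adds each title at the end of the results and drops any earlier copy, and that a + run operates on everything accumulated so far. The helper names are hypothetical, and the list handling is a logic sketch only, not a performance model.

```python
def dominant_append(results, run_output):
    """Unprefixed run: add each title at the end, dropping any earlier copy first."""
    for title in run_output:
        if title in results:
            results.remove(title)
        results.append(title)
    return results


def reverse_trick(*lists):
    """Model of: ... [enlist<list2>reverse[]] [enlist<list1>reverse[]] +[reverse[]]"""
    results = []
    # The runs are written last list first, each one reversed.
    for titles in reversed(lists):
        dominant_append(results, list(reversed(titles)))
    # The final +[reverse[]] run flips the accumulated results.
    results.reverse()
    return results


list1 = ["a", "b", "c"]
list2 = ["c", "d", "a", "e"]

print(reverse_trick(list1, list2))
# ['a', 'b', 'c', 'd', 'e']
```

Because later runs win under Dominant Append, feeding the lists in backwards lets list1 win every conflict, and the final reverse puts everything back in reading order. Adding a third list is just one more run at the front.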

saqimtiaz · July 10, 2024, 12:44pm

To clarify, the unique operator is costly compared to the rest of the operations of those rather simple filter expressions.

Potentially, though I would need to study the code implementation to answer definitively. However, I do think that unless we are dealing with lists hundreds of thousands of items long, the difference if any is going to be negligible.

My uninformed gut feeling says that might be the slowest option, since the reverse operator needs to loop through all its input.

TW_Tones · July 11, 2024, 12:52am

I appreciate your insight here

I suppose what we need is some kind of reference for the missing number, one that helps us scale it to a meaningful time cost.

The unique operator may cost twice as much when compared to the rest of the operations, such as 2¢ vs 1¢ per hundred thousand.

Perhaps we could use an existing large data set and a standard environment to extract some metrics as a general rule?
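As a starting point, a timing harness could look something like the Python sketch below. To be clear about the assumptions: it times Python stand-ins for two of the approaches discussed here, not TiddlyWiki's operators, so the numbers it prints say nothing about real filter cost. The list size, titles, and function names are all invented. A meaningful metric would need the same kind of loop run against actual filters in a real wiki.

```python
import random
import timeit


def append_unique(list_a, list_b):
    # Stand-in for append followed by unique.
    return list(dict.fromkeys(list_a + list_b))


def remove_then_prepend(list_a, list_b):
    # Stand-in for removing A from B, then prepending A.
    in_a = set(list_a)
    return list_a + [title for title in list_b if title not in in_a]


def time_approaches(size=10_000, repeats=5, seed=0):
    """Return the best-of-`repeats` time in seconds for each stand-in."""
    rng = random.Random(seed)
    universe = [f"Tiddler {i}" for i in range(size * 2)]
    # sample() picks without replacement, so neither list has internal duplicates.
    list_a = rng.sample(universe, size)
    list_b = rng.sample(universe, size)

    # Only compare timings for approaches that agree on the result.
    assert append_unique(list_a, list_b) == remove_then_prepend(list_a, list_b)

    candidates = {
        "append+unique": append_unique,
        "remove+prepend": remove_then_prepend,
    }
    return {
        name: min(timeit.repeat(lambda fn=fn: fn(list_a, list_b), number=1, repeat=repeats))
        for name, fn in candidates.items()
    }


for name, seconds in time_approaches().items():
    print(f"{name}: {seconds:.6f} s")
```

Taking the minimum over several repeats is a common way to reduce noise from whatever else the machine is doing at the time.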