How to efficiently store activity log?

For example, habit tracker log, flashcard review log.

I’m afraid doing things for 10 years will result in a huge log file. So what is the best way to store time series data in tw?

I really do think it depends on the size (bytes or uniqueness of the information) and the frequency.

A spaced repetition log can be simplified by relying on the algorithm that makes it work.

Can you estimate these (size and frequency), and also consider whether the data ages such that it loses value over time?

  • Is it really of historical value?
  • Can you purge it?
  • Export periodically to a dedicated log wiki
    • Summarise periodically, export and then purge old items?
  • Dump to external files you can read and/or import
  • Perhaps you can process it into sums and counts and keep only that
  • Perhaps you can compress it?
    • Perhaps into a simplified or tokenized list that needs conversion back.
  • Perhaps a dedicated wiki or pair of wikis?
  • Can the data be funneled into different logs with different characteristics?

I do have dozens of techniques, but they first depend on size and frequency. Sometimes the easiest thing is to set it up and monitor it, for example by presenting statistics such as count and bytes on your home page.

  • It won’t take that long for you to get a good idea.
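Before measuring, a rough upper bound can be worked out by hand. The sketch below is only back-of-envelope arithmetic: the function name, the 365-day year, and the 14 bytes per entry (a 13-digit millisecond timestamp plus a separator) are assumptions for illustration, not measurements from any wiki.

```javascript
// Rough size estimate for a log that stores one fixed-size entry per event.
// All parameters are assumptions you would replace with your own numbers.
function estimateLogBytes(eventsPerDay, bytesPerEvent, years) {
  const DAYS_PER_YEAR = 365; // ignores leap days, fine for an estimate
  return eventsPerDay * bytesPerEvent * DAYS_PER_YEAR * years;
}

// One full timestamp per day for ten years, versus one character per day.
const fullTimestamps = estimateLogBytes(1, 14, 10);
const oneCharPerDay = estimateLogBytes(1, 1, 10);
console.log(fullTimestamps, oneCharPerDay);
```

Even the naive timestamp log stays in the tens of kilobytes at one event per day, so the frequency of events matters more than the encoding until it gets high.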

For spaced repetition activity

Using a full review log to record the complete data would be larger, but since the spaced repetition algorithm makes the interval larger and larger, you can’t review a card more than a few times, so it doesn’t matter.

For daily activity

For keeping track of your habits and clock-ins, you will stick with it for a long time and the log will get bigger and bigger, so it’s worth thinking about the data structure.

Use a bloom-filter-like string such as 8f8456208a4eae615f616e3e18755b9f

Based on the lastdate field, the next time you click, it fills in a 0 for each missed day and then sets today’s record to 1.
The disadvantage is that you can only record by day. The advantage is that it is very space-saving: even if you sign in every day, one character per day for ten years is only about 3,650 characters!
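The fill-in step described above can be sketched as follows. This is a minimal illustration, not the habit tracker's actual code: the function name `checkIn`, the use of one '0'/'1' character per day, and storing `lastDate` as a UTC-midnight timestamp are all assumptions. Packing the bits into hex, as the example string suggests, would be denser still.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper. `log` is a string of '0'/'1', one character per day;
// `lastDate` is the UTC-midnight timestamp (ms) of the day the last character stands for.
function checkIn(log, lastDate, now) {
  const today = Math.floor(now / DAY_MS) * DAY_MS; // truncate to UTC midnight
  if (log.length === 0) {
    return { log: "1", lastDate: today };
  }
  const gap = Math.round((today - lastDate) / DAY_MS);
  if (gap <= 0) {
    // An entry for today already exists: just make sure it is marked as done.
    return { log: log.slice(0, -1) + "1", lastDate };
  }
  // Fill each skipped day with '0', then append '1' for today.
  return { log: log + "0".repeat(gap - 1) + "1", lastDate: today };
}

// Usage: first check-in on 2021-01-01, next one three days later.
const start = Date.UTC(2021, 0, 1);
let state = checkIn("", 0, start);
state = checkIn(state.log, state.lastDate, start + 3 * DAY_MS);
console.log(state.log);
```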

From Habit tracker in TW

For irregular activity

[0.4,0.6,2.4,5.8,4.93,0.94,0.86,0.01,1.49,0.14,0.94,2.18,0.05,0.34,1.26,0.29,2.61]

I’m writing a clocking plugin and the clock-in intervals are indeterminate, so I record the intervals between clock-ins in days (0.4 days, 0.6 days, 2.4 days). When I want to visualize the log, I calculate each previous time point from the modified field of the log tiddler and the intervals: the last point is the time in the modified field, and the second-to-last point is modified - 2.61 days.

It’s a good idea to use a dedicated review field instead of modified, so that the time points are not miscounted if the tiddler is manually modified.
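Walking the intervals backwards from the last timestamp can be sketched like this. The function name is hypothetical, and it assumes that every interval is the gap between two consecutive clock-ins, so n intervals yield n + 1 time points.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: rebuild absolute timestamps (ms) from intervals in days.
// `lastTimestamp` is the time of the most recent clock-in, e.g. taken from
// the modified field or a dedicated field.
function intervalsToTimestamps(intervalsInDays, lastTimestamp) {
  const points = [lastTimestamp];
  let t = lastTimestamp;
  // Step back one interval at a time, newest first.
  for (let i = intervalsInDays.length - 1; i >= 0; i--) {
    t -= intervalsInDays[i] * DAY_MS;
    points.push(Math.round(t));
  }
  return points.reverse(); // oldest first
}

const points = intervalsToTimestamps([0.4, 0.6, 2.4], Date.UTC(2021, 0, 10));
console.log(points.map((p) => new Date(p).toISOString()));
```

Because the stored intervals are rounded, the reconstructed points drift slightly the further back they go, which is usually acceptable for a visualization.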

Count activity per day

1,3,0,1,3,2

If the activity happens more than once per day and you don’t need the exact time of each occurrence, you can use a bloom-filter-like counter. Its size is of the same order as the bloom filter example above: ten years is about 3,650 entries, or roughly 7,300 characters with comma separators when the counts are single digits.
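A per-day counter can be updated much like the bit string, except that today's entry is incremented instead of set. The sketch below assumes a comma-separated string of counts and a UTC-midnight `lastDate`; the name `recordActivity` is made up for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper. `counts` is a comma-separated string, one number per day;
// `lastDate` is the UTC-midnight timestamp (ms) of the day the last number stands for.
function recordActivity(counts, lastDate, now) {
  const today = Math.floor(now / DAY_MS) * DAY_MS;
  const values = counts === "" ? [] : counts.split(",").map(Number);
  if (values.length === 0) {
    return { counts: "1", lastDate: today };
  }
  const gap = Math.round((today - lastDate) / DAY_MS);
  if (gap <= 0) {
    // Same day: bump the last counter.
    values[values.length - 1] += 1;
    return { counts: values.join(","), lastDate };
  }
  // Skipped days get 0, today starts at 1.
  for (let i = 1; i < gap; i++) values.push(0);
  values.push(1);
  return { counts: values.join(","), lastDate: today };
}

const day0 = Date.UTC(2021, 0, 1);
console.log(recordActivity("1,3", day0, day0 + 3 * DAY_MS).counts);
```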

I really want to make tw all-in-one, so I won’t use a dedicated wiki or external files.

Yes, I reviewed some existing plugins, and the above is my summary.

You can also do a roll-up, for example count the glasses per day, or even just the days you reached your target, and keep only that summary the next year instead of the binary/decimal array.
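Such a roll-up could look like the sketch below. The function name, the shape of the summary object, and the idea of a numeric daily target are assumptions for illustration only.

```javascript
// Hypothetical roll-up: reduce an array of per-day counts to a small summary
// that can replace the detailed array once the period is over.
function rollUp(dailyCounts, target) {
  let total = 0;
  let daysOnTarget = 0;
  for (const count of dailyCounts) {
    total += count;
    if (count >= target) daysOnTarget += 1;
  }
  return { days: dailyCounts.length, total, daysOnTarget };
}

// Glasses of water per day, with a target of 2 per day.
console.log(rollUp([1, 3, 0, 1, 3, 2], 2));
```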

Tiddlers carry some fields and so have a minimum size. If that is more than you need per entry, use a data tiddler.

Yes, I updated the comment above to include this.

I will allow the user to choose the log type the user wants. Each will have a balance between size and limitations of visualization.

But I will keep log data on the tiddler that describes the activity, so it is easier to uninstall them all together.

My latest design for the three types described above

DailyCount

Use a bloom-filter-like bit string or counter, like

DailyCount1609459200000: 1,0,0,0,3,0,0

Each row’s key starts with DailyCount, followed by a timestamp representing the date of the last modification, created by new Date('2021-01-01').getTime() and updated on each modification. Each row holds one month: once a row contains more than 30 values, start a new row.
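Reading such a row back can be sketched as below. This is not the plugin's parser: `parseLogRow` and the regular expression are assumptions based only on the row format shown in this post (a type name, a millisecond timestamp, a colon, then comma-separated values).

```javascript
// Hypothetical parser for rows like "DailyCount1609459200000: 1,0,0,0,3,0,0".
function parseLogRow(row) {
  // Letters = log type, digits = last-modified timestamp, rest = values.
  const match = /^([A-Za-z]+)(\d+):\s*(.*)$/.exec(row);
  if (match === null) return null;
  const [, type, timestamp, body] = match;
  return {
    type,
    lastDate: Number(timestamp),
    values: body === "" ? [] : body.split(",").map(Number),
  };
}

const row = parseLogRow("DailyCount1609459200000: 1,0,0,0,3,0,0");
console.log(row.type, new Date(row.lastDate).toISOString(), row.values);
```

Note that new Date('2021-01-01') is parsed as UTC midnight, so the timestamp in the key does not depend on the local time zone.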

DayInterval

Record the intervals between clock-ins in days (0.4 days, 0.6 days, 2.4 days). Calculate each previous time point from the last date of the line and the intervals: the last point is the time in the lastdate field, and the second-to-last point is lastdate minus the last interval (2.18 days in the row below). The minimum interval is 1 minute (about 0.0007 days), and the stored precision depends on the interval: if it is larger than 1 day, store it with toFixed(1); if it is between 1 hour and 1 day, store it with toFixed(2), so that 1 hour (1/24 day) becomes 0.04; if it is between 1 minute and 1 hour, store it with toFixed(5), which gives a resolution of 0.00001 days (under one second).

DayInterval1609459200000: 0.4,0.6,2.4,5.8,4.93,0.94,0.86,0.01,1.49,0.14,0.94,2.18

Each row’s key starts with DayInterval, followed by a timestamp representing the date of the last modification, the same as for DailyCount. Each row is meant to hold one month: once a row contains more than 30 values, start a new row.
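The precision rule for DayInterval can be sketched as a small formatter. The function name is hypothetical and the thresholds follow the description in this post. The plugin itself may round differently, since the example row keeps two decimals for some multi-day intervals.

```javascript
const HOUR_IN_DAYS = 1 / 24;

// Hypothetical formatter: pick the number of decimals from the interval size.
function formatInterval(days) {
  if (days >= 1) return days.toFixed(1); // a day or more: 0.1-day resolution
  if (days >= HOUR_IN_DAYS) return days.toFixed(2); // an hour to a day
  return days.toFixed(5); // under an hour: 0.00001-day resolution
}

// 2.4 days, 12 hours, and 1 minute expressed in days.
console.log([2.4, 0.5, 1 / 1440].map(formatInterval).join(","));
```

toFixed keeps trailing zeros, so 12 hours is stored as 0.50 rather than 0.5. Stripping them would save a little more space at the cost of an extra step.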

Date

Store each date. Suitable for very infrequent events, or for events that happen within a short period but then won’t happen again for a long time. We only include the date here for visualization, and for privacy we do not record the title here.

0: 1609459200000
1: 1609459200001
2: 1609459200002
3: 1609459200003

I made it a plugin. It currently has only a JS API and no wikitext API, until someone needs one.

https://tiddly-gittly.github.io/tw-gamification/#%24%3A%2Fplugins%2Flinonetwo%2Factivity-log-tools

Also available on CPL as $:/plugins/linonetwo/activity-log-tools

It currently serves to visualize check-in activity logs.
