gedcom and tiddlywiki

Just wanted to link to a resource. Tamura Jones has been writing about genealogy software for a long time. There are some good articles on GEDCOM versions, names, dates, gender, etc.

https://www.tamurajones.net/

GEDCOM is a deceptively simple format: it is fairly easy to parse the data, less so to interpret the information.

Great question. I’ll keep that in mind. On the last project I worked on (over 20 years ago), I imported any raw GEDCOM the host application did not support and kept it as-is. That data was merged back in during an export.

I have a Ruby program to generate a TW JSON file from my GEDCOM. It’s too customized to be generally useful, but I can clean it up if there’s interest.

One strategy I thought of using was to generate a data tiddler for each gedcom INDI record and FAM record. That way, the gedcom data is preserved 100%, and exporting would be a matter of concatenating the data tiddlers. Then use TW as a gedcom editor, in essence. I’m not sure how generally useful it would be to the average person as you’d need to be comfortable with manually editing the gedcom record. Then you could build tiddlers that are essentially views into the raw gedcom data tiddlers. I never experimented with this idea so I can’t say if it’s a viable workflow.
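A minimal sketch of that idea, assuming each level-0 GEDCOM record (INDI, FAM, HEAD, and so on) is stored verbatim in one data tiddler. The function names, the `GEDCOM/record/` title scheme, and the `gedcom-*` field names are made up for illustration and are not part of any existing plugin.

```javascript
// Split raw GEDCOM text into level-0 records, keeping each record's lines verbatim.
// Blank lines are dropped and line endings are normalised to "\n".
function splitRecords(gedcomText) {
  const records = [];
  const lines = gedcomText.replace(/^\uFEFF/, "").split(/\r?\n/);
  for (const line of lines) {
    if (line.trim() === "") continue;
    // A level-0 line looks like "0 @I1@ INDI" or "0 HEAD".
    const m = /^0 (?:(@[^@]+@) )?(\S+)/.exec(line);
    if (m) {
      records.push({ xref: m[1] || null, tag: m[2], lines: [line] });
    } else if (records.length > 0) {
      records[records.length - 1].lines.push(line);
    }
  }
  return records;
}

// Build the fields of one data tiddler per record (hypothetical naming scheme).
function recordToTiddlerFields(record, index) {
  return {
    title: "GEDCOM/record/" + String(index + 1).padStart(4, "0"),
    type: "text/plain",
    text: record.lines.join("\n"),
    "gedcom-tag": record.tag,
    "gedcom-xref": record.xref || ""
  };
}

// Export is then just concatenation of the data tiddlers in their original order.
function exportGedcom(tiddlerFieldsList) {
  return tiddlerFieldsList.map((fields) => fields.text).join("\n") + "\n";
}
```

Because the zero-padded titles sort in file order, sorting the data tiddlers by title before calling `exportGedcom` would reproduce the original record sequence.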

Well said. I see there is support for media. A separate challenge.

Another interesting idea. I’ll make a note of this too. There may be ways of reducing the GEDCOM look in TW. Thank you.

Craig - thanks for taking this on. I would love to help test your GEDCOM import once it’s ready for testing. I have a copy of your Memory Keeper that I’ve started playing with (which I love by the way).

Are you thinking the raw GEDCOM (for an INDI record for example) is imported directly to a text field? I was wondering if we could, optionally, take advantage of Tiddler fields somehow. It would be nice for TW to still be used to draw charts, perform relationship calculations, etc, with the data still in this raw GEDCOM format. I’ll want an easy way to cross reference tiddlers.

I’d also want to write some sort of data integrity routines—users could easily corrupt GEDCOM syntax.
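As a rough illustration of such an integrity routine, the sketch below checks a hand-edited record against the basic GEDCOM line shape (level, optional cross-reference ID, tag, optional value) and flags level jumps. `validateRecord` is a hypothetical name, and the check is deliberately strict and far from a full GEDCOM 5.5.1 validator.

```javascript
// One GEDCOM line: level, optional @XREF@, tag, optional value.
const GEDCOM_LINE = /^(0|[1-9]\d?) (?:(@[^@\s][^@]*@) )?([A-Za-z0-9_]+)(?: (.*))?$/;

// Returns a list of { line, message } problems; an empty list means no problem was found.
function validateRecord(text) {
  const problems = [];
  let previousLevel = -1; // -1 means "no line seen yet"
  text.split(/\r?\n/).forEach((line, i) => {
    if (line === "") return; // ignore empty lines such as a trailing newline
    const m = GEDCOM_LINE.exec(line);
    if (!m) {
      problems.push({ line: i + 1, message: "not a valid GEDCOM line" });
      return;
    }
    const level = Number(m[1]);
    if (previousLevel === -1 && level !== 0) {
      problems.push({ line: i + 1, message: "record must start at level 0" });
    } else if (level > previousLevel + 1) {
      problems.push({
        line: i + 1,
        message: "level jumps from " + previousLevel + " to " + level
      });
    }
    previousLevel = level;
  });
  return problems;
}
```

A routine like this could run whenever a raw data tiddler is saved, so that a broken record is reported before it ever reaches an export.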

Thank you. This will be a large undertaking and will need plenty of testing. I have a good collection of GEDCOM files, but I’ll for sure miss things.

I know Memory Keeper is not for everyone, but a GEDCOM plugin might attract a greater number of genealogists to TW.

I recommend this version:

https://www.tamurajones.net/DownloadGEDCOM551AnnotatedEdition.xhtml

The annotations are varied in nature. There are corrections of errors, commentary about differences with GEDCOM version 5.5, observations on the structure of GEDCOM files, notes on the meaning of some sections, clarifications of confusing parts and resolutions of contradictions. There is commentary on bad examples within the specification and on some real-world issues the specification never mentions. Several annotations provide advice, guidelines and best practices, often with links to relevant articles.
Obsolete and deprecated sections have been clearly marked as such, but have not been otherwise ignored; those sections do still contain corrections and annotations.

@clsturgeon Once I removed the Windows-specific FileBrowser code, your script worked fine in PowerShell on Linux, specifically Debian. It was still kinda slow.

I have a ton of GEDCOM files as well. Some are quite small and some are quite large. I’m sure I could give it a good go.

Put me down for testing imports into MemoryKeeper! Would sure save a LOT of typing, especially if sources or at least citations would come in as well. More concerned about linking to local copies of files (photos, scans, etc.) as a source rather than online data or books.

@dixonge thank you. I’ll post back here with any progress, which has been very little so far. I said it’ll be slow, and even slower if the weather is nice. Last weekend was gorgeous and this weekend is even better. I did get some work done last week, but more importantly I elected to create a “Change Management” system (in TW) to capture requirements, design notes, tasks, etc. It is also a knowledge base to capture tips, solutions, etc. My very first TW was a change management system for work, so I had a shell to work with. I have since created another instance for tracking Memory Keeper.

Looking for some feedback. I have been moving forward with a GEDCOM plugin. However, some feedback now might prevent a nasty rewrite later. To provide feedback, you do not need to know or understand GEDCOM or its structure. You simply need to know it is a text-based, human-readable file format that represents a genealogical record exported from a genealogical tool, which we want to import into TW.

So a GEDCOM plugin will do the following:

  • import GEDCOM data source into TW
  • export GEDCOM data source from TW

Anything else?

To support various TW implementations these two methods will follow a configuration. Therefore, the plugin will also include a UI to manage GEDCOM import/export configurations. The plugin will come with a few system-supplied configurations, such as one for Memory Keeper. New configurations can be created from scratch or cloned from an existing one. Configurations are defined in tiddlers tagged with: $:/tags/cls/mk/gedcom/configuration

An example configuration might include something like… a GEDCOM INDI record (a person record) generates a dedicated tiddler tagged with “person”. A user could change the tag. How else could one represent a person in TW? GEDCOM SEX M is mapped to tag: “male” and SEX F is mapped to tag: “female”. With a configurable approach, the user could change this so GEDCOM SEX M is mapped to a field: “gender” with a value of M.
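To make the example concrete, here is a hedged sketch of how two such configurations could drive the same mapping code. The configuration shape (`personTag`, `sex`, `tag`, `field`, `value`) and the function name are invented for illustration; the plugin's real configuration tiddlers may look quite different.

```javascript
// Hypothetical configuration A: SEX values become tags.
const tagConfig = {
  personTag: "person",
  sex: { M: { tag: "male" }, F: { tag: "female" } }
};

// Hypothetical configuration B: SEX values become a "gender" field.
const fieldConfig = {
  personTag: "person",
  sex: {
    M: { field: "gender", value: "M" },
    F: { field: "gender", value: "F" }
  }
};

// Turn a parsed person ({ name, sex }) into tiddler fields according to a configuration.
function personToTiddlerFields(person, config) {
  const fields = { title: person.name };
  const tags = [config.personTag];
  const rule = config.sex[person.sex]; // undefined for unmapped values such as "U"
  if (rule && rule.tag) tags.push(rule.tag);
  if (rule && rule.field) fields[rule.field] = rule.value;
  fields.tags = tags; // kept as an array so tags containing spaces stay intact
  return fields;
}
```

With this shape, the same person record yields the tags `person` and `male` under the first configuration, and the tag `person` plus a `gender` field of `M` under the second.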

Using the examples above, what other configurable options could we define in TW? What I mean is: what other approaches could be defined in a configuration beyond tags and fields? There are obviously more complex data structures than a person’s gender that need to be addressed, so I suspect I’ll want more configurable options. I’ve seen other TW solutions use tags to represent parent/child relationships. I’m not a fan of this approach, but I should support it.

When developing MK I took my own past experience and TW knowledge to define how I wanted to represent genealogical information, but this plugin should support what others might want to do.

Once the user selects a configuration and selects the GEDCOM file to import, I have defined the import process with three major stages:

  1. import GEDCOM into a JS object model
  2. apply necessary changes to the data in the model to accommodate the import configuration
    – E.g. address potential duplicate tiddler names (likely another configurable option)
  3. generate tiddlers from the JS objects and based on the selected configuration
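Stage 1 above could look roughly like the following sketch, which turns GEDCOM lines into a tree of plain JS objects using the level numbers. This is not the plugin's actual object model; `parseGedcom` and the node shape (`level`, `xref`, `tag`, `value`, `children`) are assumptions made for illustration, and CONT/CONC continuation handling is left out.

```javascript
// Parse GEDCOM text into a forest of nodes, one root per level-0 record.
function parseGedcom(text) {
  const roots = [];
  const stack = []; // stack[n] holds the most recent node seen at level n
  const lines = text.replace(/^\uFEFF/, "").split(/\r?\n/);
  for (const raw of lines) {
    if (raw.trim() === "") continue; // skip blank lines
    const line = raw.replace(/^\s+/, ""); // tolerate leading indentation
    const m = /^(\d+) (?:(@[^@]+@) )?(\S+)(?: (.*))?$/.exec(line);
    if (!m) throw new Error("Cannot parse GEDCOM line: " + raw);
    const level = Number(m[1]);
    const node = {
      level: level,
      xref: m[2] || null,
      tag: m[3],
      value: m[4] || "",
      children: []
    };
    if (level === 0) {
      roots.push(node);
    } else {
      const parent = stack[level - 1];
      if (!parent) throw new Error("No parent for line: " + raw);
      parent.children.push(node);
    }
    stack[level] = node;
    stack.length = level + 1; // forget nodes deeper than the current level
  }
  return roots;
}
```

Since this stage knows nothing about configurations, stages 2 and 3 can walk the resulting tree and decide per tag how a node becomes a tiddler, a tag, or a field.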

Step 1 (above) does not need to know anything about the configuration. It is written, and the object model appears to be populated as expected. I think I have implemented about 80% of the specification’s data structures and elements, namely the most commonly used ones. I feel the good ole 80-20 rule applies here. The last 20% will take 80% of the time. :slight_smile:

I have not started the next two steps or written anything for the configuration.

Other considerations an import should/could/must address are:

  • what happens when multiple GEDCOM files are imported?
  • related to the point above, should the plugin offer duplicate resolution tools?
    – This refers to duplicate individuals, or duplicate source records (not duplicate tiddler names)
    – A tool would enable the user to merge data. I’m thinking this is a complex process and should be managed elsewhere, not in this plugin.

Anything else?

Appreciate any comments, feedback, things to watch out for, and better ideas!!

Sorry for the late response. I’ve been under the weather the last few days.

I like the configuration idea. A good thing to have under configuration options might be the capability of adding custom tags. I have a large file in RootsMagic right now, and that program uses tags specific to it. I like the idea of this being configurable since many different programs treat GEDCOM slightly differently.

You may have thought of this already, but, at least with GEDCOM import, an error reporting mechanism would be extremely useful. This could be written to a log file or tiddler that would log what caused each of the errors such that the user could then go back and correct the records.

Those are the things that immediately come to mind. If I think of anything else (now that I’m a bit clearer headed), I’ll let you know.

Yes. This is a must. I have not finalized a target location yet. I do want a full log, not just errors. Currently I’m logging to the console, but I’m finding that it has limits. I’ll experiment with a tiddler, perhaps tagged as $:/temp so it does not persist. Not sure! Thank you for the feedback.
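One way to experiment with a log tiddler is sketched below: messages are collected in memory during the import and written to a single tiddler at the end. The `ImportLog` class and the `$:/temp/gedcom/import-log` title are hypothetical. As far as I know, it is the `$:/temp/` title prefix (rather than a tag) that TW's default save filter excludes, so the sketch assumes a title under that prefix.

```javascript
// Collects import messages and renders them as the fields of one log tiddler.
class ImportLog {
  constructor(title) {
    this.title = title;
    this.entries = [];
  }

  // severity is a free-form label such as "info", "warning" or "error";
  // lineNumber is optional and refers to a line in the GEDCOM source.
  add(severity, message, lineNumber) {
    this.entries.push({ severity: severity, message: message, lineNumber: lineNumber });
  }

  errorCount() {
    return this.entries.filter((e) => e.severity === "error").length;
  }

  toTiddlerFields() {
    const text = this.entries
      .map((e) => {
        const where = e.lineNumber ? " (line " + e.lineNumber + ")" : "";
        return e.severity.toUpperCase() + where + ": " + e.message;
      })
      .join("\n");
    // text/plain keeps GEDCOM fragments in messages from being parsed as wikitext.
    return { title: this.title, type: "text/plain", text: text };
  }
}

// Inside TiddlyWiki, the log could be written once at the end of the import:
//   $tw.wiki.addTiddler(new $tw.Tiddler(log.toTiddlerFields()));
```

Writing the tiddler once at the end, rather than on every message, avoids triggering a refresh for each log line.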

To all: please read on (there is an odd TW observation you might be interested in, and hopefully you will comment on it). @HistoryBuff and @dixonge, are you still willing to test a GEDCOM import? I have not written a generic GEDCOM plugin, though I still want to. I have implemented a GEDCOM import specifically for Memory Keeper. This built-in import mechanism dramatically outperforms the PowerShell script I wrote.

My goal is to complete an import for build 9; a GEDCOM export will come later. Build 9 is not complete, but I think the GEDCOM import is ready for testing, though there are a few more things I want to do.

A pre-release of MK build 9 can be exported from my “Churchill” demo project.

Using the link above navigate to the advanced search tiddler. Then on the filter tab select from the dropdown:

Memory Keeper plugin tiddlers

This will list all the MK tiddlers. Export the tiddlers to a JSON file. Then import that JSON file into your MK project file.

Be very careful. This is a pre-release. You will also find a number of other new enhancements and bug fixes. My favourite enhancements are the new views. There is a new photograph view to show them in a grid (see demo). To take advantage of this view, the photocaption and caption fields need to be populated. Other new items include a new source type: DNA results; new statistical charts; custom datasets; and a configuration to enable the cross-referencing of tiddlers between separate MK project files.

I want to mention one issue I had with the GEDCOM import process, so you can watch out for it, and perhaps others here can comment on the TW behaviour.

There were a number of bugs and performance issues I needed to address, and I will not be surprised if more are found. One issue I ran into (which I hope I have fixed) is that a specific set of tiddlers could not be edited in TW after the import. When I opened one of these tiddlers in edit mode and made changes, TW saved the changes into a new tiddler and left the existing tiddler intact. What I found was that my import process was wrongly adding a trailing space character to the titles of these tiddlers. I also discovered that TW trims trailing space characters from the title when you save your tiddler edits. I think I understand why, but shouldn’t the addTiddler method, which I am using in my import, remove trailing spaces from the title field too, to prevent this behaviour?

Example call the import process performs to add a tiddler:

this.wiki.addTiddler(new $tw.Tiddler(this.wiki.getCreationFields(),this.wiki.getModificationFields(),fields));

Aside: Having hundreds of tiddlers with this stray trailing space character also caused a refresh performance issue. Therefore, correcting the trailing space not only corrected the edit behaviour but also dramatically helped the performance.
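A defensive workaround on the import side, sketched here as an assumption rather than as the fix actually used in Memory Keeper, is to normalise the title before the fields ever reach addTiddler. `withTrimmedTitle` is a hypothetical helper name.

```javascript
// Return a copy of the fields with leading/trailing whitespace removed from the title.
// The input object is left untouched.
function withTrimmedTitle(fields) {
  const title = typeof fields.title === "string" ? fields.title.trim() : fields.title;
  return Object.assign({}, fields, { title: title });
}

// Inside the importer (TiddlyWiki context), the call shown above would then become:
//   this.wiki.addTiddler(new $tw.Tiddler(
//     this.wiki.getCreationFields(),
//     this.wiki.getModificationFields(),
//     withTrimmedTitle(fields)));
```

Any field that stores another tiddler's title (for example a parent or spouse reference) would need the same trimming so that links still resolve after the import.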

The GEDCOM import process can be found using the menu (Contents tab): Research -> Research Tools -> GEDCOM Import

Thank you to all. I think most of the enhancement ideas I implement come from here, not to mention many of the bug fixes. Thank you in advance to @HistoryBuff and @dixonge

Issues can be logged here:

https://github.com/clsturgeon/MemoryKeeper/issues

Craig

@clsturgeon I am definitely still interested and looking forward to it.

I’m interested, although it might be a bit until I have time to test.

Craig,

I was finally able to get to experimenting with this today, but am having trouble with the import process. Is the GEDCOM import button supposed to open a file chooser? How do I tell it which GEDCOM file to import? Neither the import button nor the review button seems to do anything. I followed the above instructions. What am I missing? Thanks.