Drosophila data for whole-embryo lineage reconstruction with linajea
This article provides access to the Drosophila data (120828) for "Automated reconstruction of whole-embryo cell lineages by learning from sparse annotations" (Malin-Mayor et al. 2023, DOI: https://doi.org/10.1038/s41587-022-01427-7).
Here we provide the ground truth tracks used to train the deep learning model, the trained networks, and the predicted tracks. Additionally, we provide information on how to access the image data, although it is not uploaded here due to size. Related artifacts include the source code for experiments and methods.
Image Data
The image dataset in n5/zarr format (as used in Malin-Mayor et al. 2023) can be accessed at the following Dropbox link: https://www.dropbox.com/scl/fi/vjaqxf447th926eespjwu/120828_drosophila_sequence.tar.gz?rlkey=lq9f83uhydzesfm8wh6pbpevc&dl=0. This image dataset was originally published in "Fast, accurate reconstruction of cell lineages from large-scale fluorescence microscopy data" (Amat et al. 2014, DOI: https://doi.org/10.1038/nmeth.3036).
Ground Truth Tracks
Inside gt_tracks.zip there are a number of files containing different subsets of tracks. Each has the following columns separated by tabs: time, z, y, x, cell_id, parent_id, track_id.
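The column layout above can be read with a few lines of Python. This is a minimal sketch, not the loader used in the paper: the TrackPoint type and the parse_tracks and load_tracks helpers are hypothetical names, and it assumes that the ID columns hold plain integers, that coordinates parse as floats, and that any header row (if one exists) starts with a non-numeric field.

```python
import csv
from typing import NamedTuple


class TrackPoint(NamedTuple):
    """One row of a track file, in the documented column order."""
    time: int
    z: float
    y: float
    x: float
    cell_id: int
    parent_id: int
    track_id: int


def parse_tracks(lines):
    """Parse tab-separated rows (any iterable of strings) into TrackPoint records."""
    points = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # blank line
        try:
            time = int(row[0])
        except ValueError:
            continue  # assumption: a non-numeric first field marks a header row
        z, y, x = (float(value) for value in row[1:4])
        cell_id, parent_id, track_id = (int(value) for value in row[4:7])
        points.append(TrackPoint(time, z, y, x, cell_id, parent_id, track_id))
    return points


def load_tracks(path):
    """Read one of the track files, e.g. an extracted copy of tracks.txt."""
    with open(path, newline="") as handle:
        return parse_tracks(handle)
```

After extracting gt_tracks.zip, load_tracks("tracks.txt") would return a list of records whose fields can be accessed by name, for example point.parent_id.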
tracks.txt is the main file. It contains the manual annotations of individual cells, from the start to the end of the video, that were used to train the model. These tracks are sparse, but each cell included in tracks.txt had its whole lineage traced as completely as possible from the start to the end of the video.
tracks_side_1.txt and tracks_side_2.txt contain the same tracks as tracks.txt, but each lineage is assigned to one of the two sides of the embryo. Because cells do not cross the center line of the embryo, this creates a natural division of the tracks into two sides, which are used separately for training and evaluation.
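Since the two side files are described as a split of tracks.txt, a quick consistency check is possible after extraction. The sketch below is illustrative only: read_ids and check_side_partition are hypothetical helpers, and the check assumes that cell_id values are preserved unchanged between tracks.txt and the two side files, which is not stated explicitly above.

```python
import csv

CELL_ID_COLUMN = 4  # cell_id is the fifth column in the documented layout


def read_ids(lines, column=CELL_ID_COLUMN):
    """Collect the integer IDs found in one column of tab-separated rows."""
    ids = set()
    for row in csv.reader(lines, delimiter="\t"):
        try:
            ids.add(int(row[column]))
        except (IndexError, ValueError):
            continue  # skip blank lines and an assumed optional header row
    return ids


def check_side_partition(all_ids, side_1_ids, side_2_ids):
    """Report how far the two sides are from an exact split of all_ids."""
    all_ids, side_1_ids, side_2_ids = set(all_ids), set(side_1_ids), set(side_2_ids)
    union = side_1_ids | side_2_ids
    return {
        "overlap": side_1_ids & side_2_ids,  # cells assigned to both sides
        "missing": all_ids - union,          # cells assigned to neither side
        "extra": union - all_ids,            # cells absent from the main file
    }
```

If the assumption about shared IDs holds, all three sets in the report should be empty when the function is given the IDs read from tracks.txt, tracks_side_1.txt, and tracks_side_2.txt.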
full_frame_divisions.txt is a different set of manually annotated short tracks around divisions. These tracks cover every division at time points 150 and 300 and in the adjacent frames as completely as possible. They were used for evaluation only, not for model training.
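Division events can be recovered from the parent_id column: a cell that is listed as the parent of two or more other cells has divided. The following sketch shows one way to do this. The find_divisions helper is hypothetical, and it assumes that parent_id refers to the cell_id of a cell in the same file and that cells without a parent carry a placeholder value that is not itself a cell_id (the actual placeholder is not specified above).

```python
from collections import defaultdict


def find_divisions(rows):
    """Find dividing cells in an iterable of (time, cell_id, parent_id) tuples.

    Returns a sorted list of (parent_time, parent_id, child_ids) entries.
    """
    time_of = {}
    children = defaultdict(list)
    for time, cell_id, parent_id in rows:
        time_of[cell_id] = time
        children[parent_id].append(cell_id)

    divisions = []
    for parent_id, child_ids in children.items():
        # Requiring the parent to be an annotated cell excludes the
        # placeholder parent value used for cells with no parent.
        if len(child_ids) >= 2 and parent_id in time_of:
            divisions.append((time_of[parent_id], parent_id, sorted(child_ids)))
    return sorted(divisions)
```

Filtering the result by parent_time would then select the divisions near a given time point, such as 150 or 300.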
Trained Models
trained_networks.zip includes both networks trained on the Drosophila dataset. The config files are in the drosophila_config_files directory. The other directories are named after the model_name in each config file and contain the trained model files. There is one model trained/validated on each side of the embryo, and evaluated on the other side, as described in Supplemental Note 1.
Predicted Tracks
predicted_tracks.zip contains both the TGMM baseline results and the results for the linajea method.
tgmm.txt contains the TGMM results provided to us by the authors of the TGMM method.
The linajea results are organized similarly to the trained models, with one text file for each side of the embryo. drosophila_side_1_tracks_071621.txt contains tracks generated by the model trained on side 2 and tested on side 1, and drosophila_side_2_tracks_071621.txt contains tracks generated by the model trained on side 1 and tested on side 2.
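The cross-validation layout described above can be written down as a small lookup table, which may help when scripting an evaluation. This is only a sketch: LINAJEA_PREDICTIONS and files_for_evaluation are hypothetical names, and pairing each prediction file with the ground truth file of the same side (tracks_side_1.txt or tracks_side_2.txt) is an assumption based on the descriptions in this article rather than a documented procedure.

```python
# Each model is tested on the side of the embryo it was not trained on.
LINAJEA_PREDICTIONS = {
    1: {"trained_on_side": 2, "predicted": "drosophila_side_1_tracks_071621.txt"},
    2: {"trained_on_side": 1, "predicted": "drosophila_side_2_tracks_071621.txt"},
}


def files_for_evaluation(test_side):
    """Return the file names relevant to evaluating one side of the embryo."""
    entry = LINAJEA_PREDICTIONS[test_side]
    return {
        "predicted": entry["predicted"],
        # Assumed pairing: predictions for a side are compared with the
        # ground truth tracks assigned to that same side.
        "ground_truth": f"tracks_side_{test_side}.txt",
        "trained_on_side": entry["trained_on_side"],
    }
```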