Janelia Research Campus
9 files

Linear regression classifier development and application scripts

posted on 2022-04-11, 19:49 authored by David Stern, Clair Han

Code used in the development of the gene structure-based bicycle classifier from H. cornu and its application to all 22 species from Figure 1 of "Gene structure-based homology search identifies highly divergent putative effector gene family", Stern and Han, bioRxiv https://doi.org/10.1101/2021.09.24.461719.

Classifier development from H. cornu data: Hcor_bicycle_classifier.Rmd

All other files in this folder are input files for this script.

Generates all panels in Fig S2.

Application to all 22 species:


All annotation input files can be found on Janelia Figshare at https://doi.org/10.25378/janelia.17777888

Generates classifier bicycle counts in column 3 of Fig 1.


Generates all panels of Fig S3, whole transcriptome exon counts vs median exon lengths.
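For readers unfamiliar with the two quantities plotted in Fig S3, the following is a minimal Python sketch of how a per-transcript exon count and median exon length could be derived from exon coordinates. It is not the authors' script. The function name `exon_summary`, the tuple layout, the toy records, and the assumption of 1-based inclusive coordinates (the GFF/GTF convention) are all illustrative.

```python
from collections import defaultdict
from statistics import median


def exon_summary(exons):
    """Return {transcript_id: (exon_count, median_exon_length)}.

    `exons` is an iterable of (transcript_id, start, end) tuples.
    Coordinates are assumed to be 1-based and inclusive, so an exon
    spanning 100..149 has length 50.
    """
    lengths = defaultdict(list)
    for transcript_id, start, end in exons:
        lengths[transcript_id].append(end - start + 1)
    return {tid: (len(vals), median(vals)) for tid, vals in lengths.items()}


# Made-up toy records for illustration; not taken from the dataset.
toy_exons = [
    ("tx1", 100, 149),  # length 50
    ("tx1", 300, 319),  # length 20
    ("tx1", 500, 529),  # length 30
    ("tx2", 10, 109),   # length 100
]

summary = exon_summary(toy_exons)
for tid, (count, med_len) in sorted(summary.items()):
    print(tid, count, med_len)
```

Each transcript then contributes one point (exon count, median exon length) to a scatter plot of the kind described for Fig S3.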

Input files available on Janelia Figshare at https://doi.org/10.25378/janelia.17783336 and


Please contact Clair Han at hanc@janelia.hhmi.org with questions.

See related materials in collection at https://doi.org/10.25378/janelia.c.5778905.