Linear regression classifier development and application scripts

posted on 08.01.2022, 02:53 by David Stern, Clair Han
Code used to develop the gene structure-based bicycle classifier from H. cornu data and to apply it to all 22 species from Figure 1 of "Gene structure-based homology search identifies highly divergent putative effector gene family", Stern and Han, bioRxiv, https://doi.org/10.1101/2021.09.24.461719.

Classifier development from H. cornu data: Hcor_bicycle_classifier.Rmd
All other files in this folder are input files for this script.
Generates all panels in Fig S2.

Application to all 22 species:
All annotation input files can be found on Janelia Figshare at https://doi.org/10.25378/janelia.17777888
Generates classifier bicycle counts in column 3 of Fig 1.

Generates all panels of Fig S3 (whole-transcriptome exon counts vs. median exon lengths).
Input files available on Janelia Figshare at https://doi.org/10.25378/janelia.17783336 and

Please contact Clair Han at hanc@janelia.hhmi.org with questions.

See related materials in collection at https://doi.org/10.25378/janelia.c.5778905.