Hormaphis cornu population genetics scripts

journal contribution
posted on 19.01.2021, 14:52 authored by Aishwarya Korgaonkar, Clair Han, David Stern
Code used to analyze signatures of selection around bicycle genes in Hormaphis cornu and Hormaphis hamamelidis, associated with the publication "A novel family of secreted insect proteins linked to plant gall development" by Korgaonkar et al. (2021).

The scripts included here cover three analyses:

1) Polymorphism within each of H. cornu and H. hamamelidis, and divergence between them
Input files: polydiv_runlog.sh (requires fasta_consensus_Hham.sh, fasta_consensus.sh, make_masking_bed_Hham.sh, make_masking_bed.sh, subset_SNPs_Hham.sh, subset_SNPs.sh)
Three sets of plotting files:
-set 1: HcorVsHham_pi_Dxy_wholeGenome_w1000_v2.R (uses gene_cluster2.names and HcorVsHham_pi_Dxy_wholeGenome_w1000_biCYCle_genes_table_v2.txt); generates figures 7G-J.
-set 2: karyoplot_pi_Dxy.Rmd (uses jckhmmer_gene_cluster2); generates figure 6S.
-set 3: bicycle_polydiv_examples.R; generates figures 7C-F.
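
For readers unfamiliar with the two statistics named in these scripts, the following is a minimal Python sketch of nucleotide diversity (pi) within a species and absolute divergence (Dxy) between two species, computed from aligned consensus sequences. It is not taken from the deposited scripts, which are written in shell and R. The function names, the toy sequences and the 4 bp window are assumptions made for illustration; the "w1000" in the plotting file names suggests, but does not confirm, that the published analysis used 1000 bp windows.

```python
from itertools import combinations

VALID = set("ACGT")  # masked or missing sites (e.g. N, -) are skipped


def pairwise_diff(a, b):
    """Proportion of differing sites between two aligned sequences."""
    compared = differing = 0
    for x, y in zip(a, b):
        if x in VALID and y in VALID:
            compared += 1
            differing += x != y
    return differing / compared if compared else float("nan")


def pi(seqs):
    """Mean pairwise difference among sequences of one species."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def dxy(seqs_x, seqs_y):
    """Mean pairwise difference between sequences of two species."""
    total = sum(pairwise_diff(a, b) for a in seqs_x for b in seqs_y)
    return total / (len(seqs_x) * len(seqs_y))


def windowed(stat, window, *groups):
    """Apply a statistic in non-overlapping windows along the alignment."""
    length = len(groups[0][0])
    return [
        stat(*[[s[i:i + window] for s in g] for g in groups])
        for i in range(0, length, window)
    ]


# Hypothetical toy alignments, two sequences per species.
hcor = ["ACGTACGT", "ACGTACGA"]
hham = ["ACGAACGT", "ACGAACGT"]

print(pi(hcor), pi(hham), dxy(hcor, hham))
print(windowed(pi, 4, hcor), windowed(dxy, 4, hcor, hham))
```

A window in which Dxy is high while pi within each species is low is the kind of pattern that the plotting scripts above would display around candidate genes.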

2) Whole-genome SweeD analysis in H. cornu and H. hamamelidis
-Neutral simulations to generate significance cutoff: neutral_simulation_runlog.sh and SweeD_neutral_sig_cutoff.R
-Genome-wide SweeD analysis: SweeD_runlog.sh, SweeD_wholeGenome_permutation_v2.R, SweeD_wholeGenome.R, HcorVsHham_pi_Dxy_wholeGenome_w1000_biCYCle_cluster_table.txt
These scripts generate figures 7K-N and S7A-B.
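
The idea of deriving a significance cutoff from neutral simulations can be sketched as follows. This is an illustrative Python example, not the logic of SweeD_neutral_sig_cutoff.R: it assumes that each neutral simulation is summarized by its maximum composite likelihood ratio (CLR) and that the cutoff is a high quantile of those maxima. The quantile value, the function names and the numbers are all hypothetical.

```python
import math


def neutral_cutoff(max_clr_per_simulation, quantile=0.99):
    """Nearest-rank quantile of the per-simulation maximum CLR values."""
    values = sorted(max_clr_per_simulation)
    if not values:
        raise ValueError("need at least one simulation")
    # Round before ceil to guard against floating-point noise in quantile * n.
    rank = math.ceil(round(quantile * len(values), 9))
    return values[max(rank, 1) - 1]


def significant_positions(scan, cutoff):
    """Positions in an observed scan whose CLR exceeds the neutral cutoff."""
    return [pos for pos, clr in scan if clr > cutoff]


# Hypothetical maxima from ten neutral simulations.
neutral_maxima = [1.0, 2.5, 3.0, 4.2, 5.1, 2.2, 3.3, 1.9, 6.0, 2.8]
cutoff = neutral_cutoff(neutral_maxima, quantile=0.9)

# Hypothetical observed scan as (position, CLR) pairs.
observed = [(100, 1.2), (200, 5.5), (300, 7.0)]
print(cutoff, significant_positions(observed, cutoff))
```

In practice far more simulations than ten would be needed for a stable estimate of an upper quantile.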

3) Genome-wide adaptive non-synonymous substitutions
-DnDs: codeml_runlog.sh, dnds_allGenes.R
(codeml_runlog.sh requires clean_names.sh, fasta_to_phylip.sh, MK_all_biCYCle.R, run_codeml.sh, transcript_alignment.sh)
These scripts generate figures 7A-B.
-Number of adaptive non-synonymous substitutions for different categories of genes over-expressed in fundatrix salivary glands: MK_polymorphorama_runlog.sh, MK_allGenes.R
(MK_polymorphorama_runlog.sh requires clean_names_Hcor.sh, transcript_alignment_Hcor.sh, run_polymorphorama_MAF0.03.sh and gene lists annot_no_sigp_names, annot_sigp_names, bicycle_cluster1.names, gene_cluster2.names, gene_cluster3.names, no_annot_no_sigp_names, no_annot_sigp_names, non_bicycle_cluster1.names)
These scripts generate figure 7C.
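
As background for the counts of adaptive non-synonymous substitutions, here is a minimal Python sketch of the standard McDonald-Kreitman style estimate, in which the number of non-synonymous substitutions expected under neutrality is Ds * Pn / Ps and the excess over that expectation is treated as adaptive. Whether MK_allGenes.R uses exactly this estimator is an assumption; the function names and the counts below are hypothetical. The "MAF0.03" in one script name suggests that low-frequency polymorphisms were filtered before counting, which this sketch does not reproduce.

```python
def adaptive_substitutions(dn, ds, pn, ps):
    """Estimated number of adaptive non-synonymous substitutions.

    dn, ds: non-synonymous and synonymous fixed differences.
    pn, ps: non-synonymous and synonymous polymorphisms.
    """
    if ps == 0:
        return float("nan")  # neutral expectation is undefined
    return dn - ds * pn / ps


def alpha(dn, ds, pn, ps):
    """Proportion of non-synonymous substitutions estimated to be adaptive."""
    if dn == 0 or ps == 0:
        return float("nan")
    return 1 - (ds * pn) / (dn * ps)


def pooled(genes):
    """Sum the four counts over the genes in one category before estimating."""
    dn, ds, pn, ps = (sum(column) for column in zip(*genes))
    return adaptive_substitutions(dn, ds, pn, ps), alpha(dn, ds, pn, ps)


# Hypothetical (dn, ds, pn, ps) counts for two genes in one category.
category = [(12, 6, 3, 6), (8, 4, 2, 4)]
print(pooled(category))
```

Pooling counts across a gene list before estimating avoids undefined per-gene values when a single gene has no synonymous polymorphisms.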

Please contact Clair Han at hanc@janelia.hhmi.org with questions.