Janelia Research Campus

Data related to Lillvis JL et al., 2024: Nested neural circuits generate distinct acoustic signals during Drosophila courtship

Version 4 2024-02-16, 21:58
Version 3 2024-02-05, 17:21
Version 2 2023-12-19, 22:04
Version 1 2023-12-14, 18:31
posted on 2024-02-16, 21:58, authored by Joshua Lillvis


This dataset includes the raw and analyzed audio files for all optogenetic experiments, the SongExplorer model used to segment song, and the ground truth song annotations used to train that model. Details of each can be found below.


This dataset includes optogenetic silencing experiments with courting male D. melanogaster (GtACR_MaleFemalePairs.zip), optogenetic activation experiments with courting males (CsChrimson_MaleFemalePairs.zip), and optogenetic activation experiments with isolated males (CsChrimson_isolatedMales.zip).

The file naming scheme is as follows:

  • date_recordingID_chamber_genotype
    • genotype, e.g., SS46542CHRIM = SS46542>CsChrimson (CsChrimson expressed in pIP10) or SS46542GTACR = SS46542>GtACR1 (GtACR1 expressed in pIP10)
    • .WAV files are the raw audio files
    • -ambient, -mel-ipi, -mel-pulse, -mel-sine, -other, and -poly .wav files contain the probability of each song class as classified by SongExplorer
    • -predicted-1.0pr.csv files are the ethograms of each recording, obtained by thresholding the probabilities for each word
      • precision/recall 1.0 thresholds were as follows:
        • poly: 0.79424
        • mel-pulse: 0.742793
        • mel-ipi: 0.598294
        • mel-sine: 0.892484
        • other: 0.982875
        • ambient: 0.930561
  • LED2.WAV is the timing and relative amplitude of the 525 nm light (GtACR experiments)
    • the intensities used for GtACR_MaleFemalePairs.zip were (in µW/mm²) 3, 5, 8, 10, 18, 24, 30, 36
  • LED1.WAV is the timing and relative amplitude of the 625 nm light (CsChrimson experiments)
    • the intensities used for CsChrimson_MaleFemalePairs.zip were (in µW/mm²) 2, 4, 6, 8, 17, 25, 34, 42
    • the intensities used for CsChrimson_isolatedMales.zip were (in µW/mm²) 0.3, 0.7, 1, 1.5, 2, 2.4, 2.8, 3.2, 3.7, 4.1, 4.5, 4.9, 5.4, 5.8, 6.2, 8.4, 10.5, 12.7, 14.8, 17, 19, 21, 25, 29, 34, 38, 42
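The thresholding step that turns the per-class probability traces into an ethogram can be sketched as follows. This is a minimal illustration, not the SongExplorer implementation; the probability trace here is a synthetic array rather than one of the shared probability .wav files.

```python
import numpy as np

# Precision/recall = 1.0 thresholds reported above.
THRESHOLDS = {
    "poly": 0.79424,
    "mel-pulse": 0.742793,
    "mel-ipi": 0.598294,
    "mel-sine": 0.892484,
    "other": 0.982875,
    "ambient": 0.930561,
}

def ethogram_intervals(probs, threshold):
    """Return (start, stop) index pairs where probs exceeds threshold.

    probs: 1-D array of per-tick class probabilities.
    stop is exclusive (half-open interval).
    """
    above = np.asarray(probs) > threshold
    edges = np.diff(above.astype(int))
    starts = np.flatnonzero(edges == 1) + 1
    stops = np.flatnonzero(edges == -1) + 1
    if above[0]:
        starts = np.r_[0, starts]
    if above[-1]:
        stops = np.r_[stops, above.size]
    return list(zip(starts.tolist(), stops.tolist()))

# Synthetic example: a pulse-probability trace with one supra-threshold bout.
trace = np.array([0.1, 0.2, 0.9, 0.95, 0.8, 0.3, 0.1])
print(ethogram_intervals(trace, THRESHOLDS["mel-pulse"]))  # [(2, 5)]
```

Applying each class's threshold to its probability trace in this way yields the per-word bouts that the -predicted-1.0pr.csv ethograms record.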


This dataset includes audio .wav files and annotated .csv files used to train the classifier that was used to segment all audio recordings.

See https://github.com/JaneliaSciComp/SongExplorer for detailed instructions on training a convolutional neural network and analyzing data using SongExplorer. In brief, the songSegmentationModel_groundTruthData.zip files shared here can be inserted into the groundtruth directory used for training. The songSegmentationModelTrainingParameters.log file includes all of the parameters used for training. These values can be inserted into the corresponding boxes in SongExplorer before training.
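The logged training parameters could also be read back programmatically before pasting them into SongExplorer. This is only a sketch: it assumes the log contains simple "key = value" lines, which may not match the actual format of songSegmentationModelTrainingParameters.log, and the parameter names shown are illustrative.

```python
# Hypothetical example: the real log format is SongExplorer's own; here we
# assume plain "key = value" lines purely for illustration.
log_text = """\
context_ms = 204.8
audio_tic_rate = 5000
batch_size = 32
learning_rate = 1e-6
"""

def parse_params(text):
    """Parse 'key = value' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            params[key.strip()] = value.strip()
    return params

params = parse_params(log_text)
print(params["batch_size"])  # prints 32
```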

The input to the model was a 204.8 ms interval of the raw microphone waveform, centered on points chosen randomly within each annotated interval; at a sample rate of 5000 Hz this equates to 1024 ticks. There were four blocks of layers, each with a 1D convolution (kernel size 128, eight feature maps), a ReLU non-linearity, and 50% dropout. The last three layers were also strided by two, downsampling the output resolution by a factor of eight to 625 Hz. Notably, the network did not explicitly create a spectrogram or use a Fourier transform. The output was six taps representing the predicted probability of each labelled class (mel-pulse, mel-sine, mel-ipi, poly, ambient, other). In total, the model had 23,630 parameters, all of which were trainable. Training used a batch size of 32, a learning rate of 1e-6, and the Adam optimizer. Ten percent of the data were withheld for validation, and the number of training steps, 7 million, was chosen to ensure that the validation accuracy plateaued.
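The timing bookkeeping above can be checked directly. This sketches only the arithmetic (window length and downsampling), not the network itself:

```python
SAMPLE_RATE_HZ = 5000
INPUT_TICKS = 1024

# 1024 ticks at 5000 Hz span 204.8 ms.
window_ms = INPUT_TICKS * 1000 / SAMPLE_RATE_HZ
print(window_ms)  # 204.8

# Three of the four conv blocks are strided by two,
# downsampling the output rate by 2**3 = 8.
downsample = 2 ** 3
output_rate_hz = SAMPLE_RATE_HZ / downsample
print(output_rate_hz)  # 625.0
```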

The version of SongExplorer used was dated 20 Feb 2022, and corresponds approximately to commit dce7cb87cb8f in the git repository. The exact values of the model parameters that were learned and used are contained in the included PB file (songSegmentationModel.zip) and in the log file (songSegmentationModelTrainingParameters.log). The model used in Lillvis JL et al., 2024 (songSegmentationModel.zip) will not work in the latest version of SongExplorer. However, training a new model using this data (songSegmentationModel_groundTruthData.zip) and similar parameters to those indicated above will generate a model with similar ability to segment D. melanogaster song.


Howard Hughes Medical Institute

