• Figure 2.
    Electrophysiology validates that odorant receptor-optimized molecular descriptors can successfully identify new ligands for Drosophila.

    Mean increase in response of neurons to 0.5-s stimulus of indicated odors (10⁻² dilution) predicted for each associated Or. Dashed lines indicate the activator threshold (50 spikes/s). ΔH: Or85b (ab3B) = flies lack expression of Or22a in neighboring neuron, thus all observed neuron activation is unambiguously caused by Or85b. N = 3, error bars = s.e.m.

    DOI: http://dx.doi.org/10.7554/eLife.01120.008

    Figure 3.
    Predicted receptor–odor interactions are highly specific.

    (A) (Top) Plot of activity for electrophysiologically tested receptor–odor interactions. (Bottom) Plot indicating locations of predicted receptor–odor combinations (green) and the same odorants tested in non-target receptor–odor combinations (gray). (B) Plot of percentage of activating odors (>50 spikes/s) considering all activating or inactive odors (>0 spikes/s) across ranking bins for all odors tested using electrophysiology.

    DOI: http://dx.doi.org/10.7554/eLife.01120.009

    Figure 4.
    Analysis of receptor–odor relationships and breadth of tuning.

    (A) Hierarchical clusters created from Euclidean distance values between Drosophila Ors calculated using (left to right): shared optimized descriptors; known activity to training set odors (Hallem and Carlson, 2006); overlap across top 500 predicted ligands; and phylogenetic tree of receptors (Hallem and Carlson, 2006). Subclusters are shaded with colors or bars. (B) Frequency distribution of compounds from the >240K library within the top 15% distance from highest active plotted to generate predicted breadth of tuning curves. Green arrows indicate relative distance of the furthest known activating compound determined by electrophysiology.

    DOI: http://dx.doi.org/10.7554/eLife.01120.010

    Figure 5.
    Analysis of predicted natural odor sources and cross activation.

    (A) (Left) The numbers of compounds present in the collected volatile library according to source. (Right) The numbers and sources of predicted ligands for the 19 Drosophila odor receptors/neurons within the top 500 predicted compounds. (B) Comparison of plots for percentage of receptors that are: (top left) activated by percentage of known odors from training set (Hallem and Carlson, 2006); (bottom left) predicted to be activated by Natural compound library; (top right) predicted to be activated from >240K library; and (bottom right) activated by ligands for 10 shared Ors in this study vs activated by comparable actives previously tested (Hallem and Carlson, 2006).

    DOI: http://dx.doi.org/10.7554/eLife.01120.011

    Figure 6.
    Predicted odor space and network view of odor coding.

    (A) Expansion of the peripheral olfactory code in this study: large increase in numbers of identified activators and inhibitors. The different sized circles represent the approximate ratio of numbers of previously known ligands (top circles) and predicted ligands based on a cutoff of the top 500 predicted compounds per receptor and corrected to the validation success rate (lower, diffuse circles). (B) Drosophila receptor–odor network. Each known interaction (>50 spikes/s) from this and previous studies (Hallem and Carlson, 2006) is linked by a purple edge. Predicted receptor–odor interactions (top 500 hits) are linked by light-grey edges. All compounds are represented as small black circles and Ors are represented as large colored circles matching the colors used in Figure 4A.

    DOI: http://dx.doi.org/10.7554/eLife.01120.012

  • Table 1.

    Optimized molecular descriptor set compositions

    DOI: http://dx.doi.org/10.7554/eLife.01120.006

    Descriptor class type counts for all Ors
     GETAWAY descriptors                     75
     3D-MoRSE descriptors                    66
     2D autocorrelations                     44
     Edge adjacency indices                  44
     2D binary fingerprints                  44
     Functional group counts                 43
     Atom-centred fragments                  37
     WHIM descriptors                        36
     Topological charge indices              24
     Atomtypes (Cerius2)                     23
     Burden eigenvalues                      23
     Molecular properties                    23
     Topological descriptors                 22
     Geometrical descriptors                 18
     2D frequency fingerprints               11
     RDF descriptors                         8
     Walk and path counts                    6
     Connectivity indices                    5
     Information indices                     5
     Topological (Cerius2)                   4
     Constitutional descriptors              3
     Structural (Cerius2)                    2
     Randic molecular profiles               2
    Optimized descriptor analysis
     Average descriptor overlap between Ors  13%
     Average number of descriptors per Or    29.9
     Average number 3D descriptors per Or    10.8
     Average number 2D descriptors per Or    12.2
     Average number 1D descriptors per Or    6.6
     Average number 0D descriptors per Or    0.3
    Descriptor dimensionality counts
     Number three dimensional descriptors    205
     Number two dimensional descriptors      232
     Number one dimensional descriptors      126
     Number zero dimensional descriptors     5
    Descriptor Origin
     Number Dragon descriptors               539
     Number Cerius descriptors               29
    • Breakdowns of the molecular descriptor class type, dimensionality, origin, and average overlap for all optimized molecular descriptors selected for each Or.
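
The three breakdowns in Table 1 (class type, dimensionality, origin) count the same pool of selected descriptors, so their totals should agree, and dividing by the number of receptors should reproduce the per-Or averages. The short Python sketch below checks that arithmetic. The variable names are illustrative, and the receptor count of 19 is an assumption taken from the Figure 5 legend rather than from the table itself.

```python
# Counts transcribed from Table 1; names are illustrative, not from the source.
class_counts = {
    "GETAWAY": 75, "3D-MoRSE": 66, "2D autocorrelations": 44,
    "Edge adjacency indices": 44, "2D binary fingerprints": 44,
    "Functional group counts": 43, "Atom-centred fragments": 37,
    "WHIM": 36, "Topological charge indices": 24, "Atomtypes (Cerius2)": 23,
    "Burden eigenvalues": 23, "Molecular properties": 23,
    "Topological descriptors": 22, "Geometrical descriptors": 18,
    "2D frequency fingerprints": 11, "RDF": 8, "Walk and path counts": 6,
    "Connectivity indices": 5, "Information indices": 5,
    "Topological (Cerius2)": 4, "Constitutional": 3,
    "Structural (Cerius2)": 2, "Randic molecular profiles": 2,
}
dimensionality = {"3D": 205, "2D": 232, "1D": 126, "0D": 5}
origin = {"Dragon": 539, "Cerius": 29}

N_ORS = 19  # assumption: receptor count quoted in the Figure 5 legend

total = sum(class_counts.values())
totals_agree = total == sum(dimensionality.values()) == sum(origin.values())

# Per-receptor averages, rounded to one decimal as in the table.
avg_per_or = round(total / N_ORS, 1)
avg_by_dim = {dim: round(n / N_ORS, 1) for dim, n in dimensionality.items()}

print(total, totals_agree, avg_per_or, avg_by_dim)
```

Under that assumption the totals match across all three breakdowns, and the computed averages agree with the "Optimized descriptor analysis" rows.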

  • Table 2.

    Predicted receptor–odor interactions validated as highly accurate using electrophysiology

    DOI: http://dx.doi.org/10.7554/eLife.01120.007

    Classification                 Or7a  Or10a  Or22a  Or47a  Or49b  Or59b  Or85a  Or85b  Or98a  Total
    Ligands (%)                      88     31     86     39     27     91     92     87    100     71
    Agonists (>50 spikes/s) (%)      63     31     81     33     18     64     69     70     92     58
    Agonists (>100 spikes/s) (%)     31     13     62     11      9     45     48     48     67     37
    Inverse agonists (%)             25      0      5      6      9     25     23     17      8     13
    • Summary of prediction accuracy percentages obtained by electrophysiology validation. Ligands = Agonists (≥50 spikes/s) + Inverse agonists (>50% reduction from baseline activity).
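
The relation stated in the note (Ligands = Agonists + Inverse agonists) can be checked column by column. The Python sketch below does so with the Table 2 percentages as transcribed here; the variable names are illustrative. It also checks that the stricter >100 spikes/s agonist rate never exceeds the >50 spikes/s rate, which must hold because the first group is a subset of the second.

```python
# Percentages transcribed from Table 2; names are illustrative.
columns = ["Or7a", "Or10a", "Or22a", "Or47a", "Or49b",
           "Or59b", "Or85a", "Or85b", "Or98a", "Total"]
ligands      = [88, 31, 86, 39, 27, 91, 92, 87, 100, 71]
agonists_50  = [63, 31, 81, 33, 18, 64, 69, 70, 92, 58]
agonists_100 = [31, 13, 62, 11,  9, 45, 48, 48, 67, 37]
inverse      = [25,  0,  5,  6,  9, 25, 23, 17,  8, 13]

# Residual of the identity: ligands - (agonists + inverse agonists).
residual = {
    col: lig - (ago + inv)
    for col, lig, ago, inv in zip(columns, ligands, agonists_50, inverse)
}
mismatches = {col: r for col, r in residual.items() if r != 0}

# Agonists above 100 spikes/s are a subset of those above 50 spikes/s.
nested_ok = all(strict <= loose
                for strict, loose in zip(agonists_100, agonists_50))

print(mismatches, nested_ok)
```

With these values the identity holds exactly in every column except Or59b, where the reported ligand percentage is 2 points higher than the sum. The source gives no explanation for the difference, so it is worth checking against the original table.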

  • Supplementary file 1.

    (A) Optimized descriptor sets for each Drosophila Or. Optimized descriptor occurrences, symbol, brief description, class, and dimensionality are listed. A summary of the total number of descriptors selected for the receptor repertoire is provided at the beginning. Descriptors are listed in the order in which they were selected into the optimized set, earliest first, such that the descriptors selected first are more important. Weights indicate the number of times a descriptor was selected in an optimized descriptor set. (B) Top 100 predicted compounds for each Drosophila Or. Chemical name or PubChem compound ID (CID), SMILES string, and distance are listed for the top ∼100 predicted compounds for each Or. All distances represent the minimum distance based on optimized descriptors to the previously known strongest active compound listed in the gray cells for that particular Or.

    DOI: http://dx.doi.org/10.7554/eLife.01120.013

    Download source data [supplementary-file-1.media-1.xlsx]
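
The ranking described for Supplementary file 1(B), in which compounds are ordered by their distance to the strongest known active over a receptor's optimized descriptors, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The Euclidean metric is an assumption borrowed from the Figure 4A legend, the function name `rank_by_distance` is hypothetical, and the descriptor values and compound names are invented for illustration.

```python
import math

def rank_by_distance(reference, candidates, descriptor_idx):
    """Sort candidates by Euclidean distance to `reference`,
    using only the descriptor positions listed in `descriptor_idx`."""
    ref = [reference[i] for i in descriptor_idx]
    scored = []
    for name, vector in candidates.items():
        sub = [vector[i] for i in descriptor_idx]
        scored.append((math.dist(ref, sub), name))
    return sorted(scored)  # smallest distance first

# Toy descriptor vectors (made-up values, four descriptors per compound).
reference = [0.2, 1.0, 3.5, 0.0]          # strongest known active
candidates = {
    "compound_a": [0.2, 9.0, 3.5, 0.0],   # differs only on an unselected descriptor
    "compound_b": [0.5, 1.0, 3.1, 0.0],
    "compound_c": [2.2, 1.0, 0.5, 1.0],
}
optimized = [0, 2, 3]                     # hypothetical optimized descriptor subset

ranking = rank_by_distance(reference, candidates, optimized)
for distance, name in ranking:
    print(f"{name}: {distance:.3f}")
```

Restricting the distance to a receptor-specific subset is what makes the ranking differ between receptors: `compound_a` ranks first here even though it differs strongly from the reference on a descriptor that was not selected.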