Introduction

Development of an in-depth, mechanistic understanding of brain function in health and disease requires different scientific approaches spanning multiple scales, from gene expression to behavior. While “wet” experimental approaches are essential for characterizing the properties of neural systems and testing hypotheses, theory and modeling are critical for exploring how these complex systems behave across a wider range of conditions, and for generating new experimentally testable, physically plausible hypotheses. Theory and modeling also provide a way to integrate a panoply of experimentally measured parameters, functional properties, and responses to perturbations into a physio-chemically coherent framework that reproduces the properties of the neural system of interest (Einevoll et al., 2019; Yao et al., 2022; Poirazi and Papoutsi, 2020; Gurnani and Silver, 2021; Gleeson et al., 2018; Cayco-Gajic et al., 2017; Billings et al., 2014; Vervaeke et al., 2010; Kriener et al., 2022; Billeh et al., 2020; Markram et al., 2015).

Computational models in neuroscience often focus on different levels of description. For example, a cellular physiologist may construct a complex multicompartmental model to explain the dynamical behavior of an individual neuron in terms of its morphology, biophysical properties, and ionic conductances (Hay et al., 2011; De Schutter and Bower, 1994; Migliore et al., 2005). In contrast, to relate neural population activity to sensory processing and behavior, a systems neurophysiologist may build a circuit level model consisting of thousands of much simpler integrate-and-fire neurons (Lapicque, 1907; Potjans and Diesmann, 2014; Brunel, 2000). Domain specific tools have been developed to aid the construction and simulation of models at varying levels of biological detail and scales. An ecosystem of diverse tools is powerful and flexible, but it also creates a number of serious challenges for the research community (Cannon et al., 2007). Each tool typically has its own design, features, Application Programming Interface (API) and syntax, a custom set of utility libraries, and last but not least, a distinct machine-readable representation of the model’s physiological components. This represents a complex landscape for users to navigate. Additionally, models developed in different simulators cannot be mixed and matched, nor easily compared, and the translation of a model from one tool-specific implementation to another can be non-trivial and error-prone. This fragmentation in modeling tools and approaches can act as a barrier to neuroscientists who wish to use models in their research, and can limit how Findable, Accessible, Interoperable, and Reusable (FAIR) models are (Wilkinson et al., 2016).

To counter fragmentation and promote cooperation and interoperability within and across fields, standardization is required. The International Neuroinformatics Co-ordinating Facility (INCF) (Abrams et al., 2022) has highlighted the need for standards to “make research outputs machine-readable and computable and are necessary for making research FAIR” (INCF, 2023). In biology, a number of community standards have been developed to describe experimental data (e.g. Brain Imaging Data Structure (BIDS) (Gorgolewski et al., 2016), Neurodata Without Borders (NWB) (Teeters et al., 2015)) and computational models (e.g. Systems Biology Markup Language (SBML) (Hucka et al., 2003), CellML (Lloyd et al., 2004), Scalable Open Network Architecture TemplAte (SONATA) (Dai et al., 2020), PyNN (Davison et al., 2009) and Neural Open Markup Language (NeuroML) (Gleeson et al., 2010)). These standards have enabled open and interoperable ecosystems of software applications, libraries and databases to emerge, facilitating the sharing of research outputs, an endeavor encouraged by a growing number of funding agencies and scientific journals.

The initial version of the NeuroML standard, version 1 (NeuroMLv1), was originally conceived as a model description format (Goddard et al., 2001), and implemented as a three layered, declarative, modular, simulator independent language (Gleeson et al., 2010). NeuroMLv1 could describe detailed neuronal morphologies and their biophysical properties as well as specific instantiations of networks. It enabled the archiving of models in a standardized format and addressed the issue of simulator fragmentation by acting as the common language for model exchange between established simulation environments—NEURON (Hines and Carnevale, 1997; Awile et al., 2022), GENESIS (Bower and Beeman, 1997), and MOOSE (Ray and Bhalla, 2008). While solving a number of long-standing problems in computational neuroscience, NeuroMLv1 had several key limitations. The most restrictive of these was that the dynamical behavior of model elements was not formally described in the standard itself, making it only partially machine readable. Information on the dynamics of elements (i.e. how the state variables should evolve in time) was only provided in the form of human-readable documentation, requiring the developers of each new simulator to re-implement the behavior of these elements in their native format. Additionally, introduction of new model components required updates to the standard and all supporting simulators, making extension of the language difficult. Finally, the use of Extensible Markup Language (XML) as the primary interface language limited usability—applications would generally have to add their own code to read/write XML files.

To address these limitations, NeuroML was redesigned from the ground up in version 2 (NeuroMLv2) using the Low Entropy Modeling Specification (LEMS) language (Cannon et al., 2014). LEMS was designed to define a wide range of physio-chemical systems, enabling the creation of fully machine-readable, formal definitions of the structure and dynamics of any model components. The core modeling elements in NeuroMLv2 (cells, ion channels, synapses) have their mathematical and structural definitions described in LEMS (e.g. the parameters required and how the state variables change with time). Thus, NeuroMLv2 retains all the features of NeuroMLv1—it remains modular, declarative, and continues to support multiple simulation engines—but unlike version 1, it is extensible, and all specifications are fully machine-readable. NeuroMLv2 also moved to Python as its main interface language and provides a comprehensive set of Python libraries to improve usability (Vella et al., 2014), with XML retained as a machine-readable serialization format (i.e. the form in which the model files are saved/shared).

Since its release in 2014, the NeuroMLv2 standard, the software ecosystem, and the community have all steadily grown. An open, community based governance structure was put in place: an elected Editorial Board, overseen by an independent Scientific Committee, maintains the standard and core tools. Although the core tools were initially focused on enabling the simulation of models on multiple platforms, they have been expanded to support all stages of the model life cycle (Fig. 1). Modelers can use these tools to easily create, inspect and visualize, validate, simulate, fit and optimize, share and disseminate NeuroMLv2 models and outputs (Billings et al., 2014; Cayco-Gajic et al., 2017; Gurnani and Silver, 2021; Kriener et al., 2022; Gleeson et al., 2019). To provide clear, concise, searchable information for both users and developers, the NeuroML documentation has been significantly expanded and re-deployed using modern web technologies. Increased community-wide collaborations have also extended the software ecosystem well beyond the NeuroMLv2 tools developed by the core NeuroML team: additional simulators such as Brian (Stimberg et al., 2019), NetPyNE (Dura-Bernal et al., 2019), Arbor (Abi Akar et al., 2019) and EDEN (Panagiotou et al., 2022) all support NeuroMLv2. We have worked to ensure interoperability with other structured formats for model development in neuroscience such as PyNN (Davison et al., 2009) and SONATA (Dai et al., 2020). Platforms for collaboratively developing, visualizing, and sharing NeuroML models (Open Source Brain (OSB) (Gleeson et al., 2019)) as well as a searchable database of NeuroML model components (NeuroML Database (NeuroML-DB) (Birgiolas et al., 2023)) have been developed. These enhancements, driven by an ever expanding community, have helped NeuroMLv2 develop into a standard that has been officially endorsed by international organizations such as the INCF and COmputational Modeling in Biology NEtwork (COMBINE) (Hucka et al., 2015), and that is now sufficiently mature to be incorporated into a wide range of research workflows.

The NeuroML software ecosystem supports all stages of the model development life cycle.

In this paper, we provide an overview of the current scope of version 2 of the NeuroML standard, describe the current software ecosystem and community, and outline the extensive resources to assist researchers to incorporate NeuroML into their modeling work. We demonstrate, with examples, that NeuroML supports users in all stages of the model development life cycle (Fig. 1) and promotes FAIR principles in computational neuroscience. We highlight the various core NeuroML tools and libraries, additional utilities, supported simulation engines, and the related projects that build upon NeuroML for automated model validation, advanced analysis, visualization, and sharing/re-use of models. Finally, we summarize the organizational aspects of NeuroML, its governance structure and community.

Results

NeuroML provides a ready to use set of curated model elements

A central aim of the NeuroML initiative is to enable and encourage the use of multi-scale biophysically detailed models of neurons and neuronal circuits in neuroscience research. The initiative takes a number of steps to achieve this aim. NeuroML provides users with a curated library of model elements that form the NeuroML standard (Fig. 2). The standard is maintained by the NeuroML Editorial Board which has identified a core set of model types to support, to ensure that a significant proportion of commonly used neurobiological modeling entities can be described with the language. This includes (but is not limited to): active membrane conductances (using Hodgkin-Huxley style (Hodgkin and Huxley, 1952) or kinetic scheme based ionic conductances), multiple synapse and plasticity mechanisms, detailed multicompartmental neuron models with morphologies and biophysical properties, abstract point neuron models, and networks of such cells spatially arranged in populations, connected by targeted projections, receiving spiking and current based inputs.

NeuroML is a modular, hierarchical format that supports multi-scale modeling. Elements in NeuroML are formally defined, independent, self-contained building blocks with hierarchical relationships between them. (a) Models of ensembles of ionic conductances can be defined as a composition of gates, each with specific voltage (and potentially [Ca2+]) dependence that controls the conductance. (b) Morphologically detailed neuronal models specify the 3D structure of the cells, along with passive electrical properties, and reference ion channels that confer membrane conductances. (c) Network models contain populations of these cells connected via synaptic projections. (d) A truncated illustration of the NeuroMLv2 core element hierarchy. The standard includes commonly used model elements/building blocks that have been pre-defined for users in a hierarchical representation: Cells: neuronal models ranging from simple spiking point neurons to biophysically detailed cells with multicompartmental morphologies and active membrane conductances; Synapses and ionic conductance models: commonly used chemical and electrical synapse models (gap junctions), and multiple representations for ionic conductances; Inputs: to drive cell and network activity, e.g. current or voltage clamp, spiking background inputs; Networks: of populations (containing any of the aforementioned cell types), and projections between them, form the core of a NeuroML model description. The full list of NeuroML elements can be found in Appendix 1 Tables 4 and 5.

The NeuroMLv2 standard consists of two levels that are designed to enable users to easily create their models without worrying about simulator-specific details. The first level defines a formal “schema” for the standard model elements, their attributes/parameters (e.g. an integrate and fire cell model and its necessary attributes: a threshold parameter, a reset parameter, etc.), and the relationships between them (e.g. a network contains populations; a multicompartmental cell morphology contains segments). This allows the validation of the completeness of the description of individual NeuroML model elements and models, prior to simulation. The second level defines the underlying dynamical behavior of the model elements (e.g. how the time varying membrane potential of a cell model is to be calculated). This second level is enabled by LEMS and, among other features, facilitates the automated translation of simulator-independent NeuroML models into simulator-specific code. Most users do not need to interact with it directly.
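To make the distinction between the two levels concrete, the following is a minimal sketch of the kind of dynamical behavior that the second level specifies for an integrate and fire cell. It assumes simple leaky integration toward a leak reversal potential with a threshold and reset; the function name, parameter values, initial condition, and the forward-Euler scheme are illustrative assumptions, and the authoritative definition remains the LEMS ComponentType in the standard.

```python
def simulate_iaf_tau(leak_reversal_mv, thresh_mv, reset_mv, tau_ms,
                     duration_ms=100.0, dt_ms=0.01):
    """Integrate dv/dt = (leak_reversal - v) / tau with threshold and reset.

    Hypothetical helper for illustration only: forward Euler with a fixed
    time step, starting from the reset potential (an assumption).
    Returns the list of spike times in ms.
    """
    v = reset_mv
    spike_times = []
    n_steps = int(round(duration_ms / dt_ms))
    for step in range(n_steps):
        # Dynamics: the membrane potential relaxes toward the leak reversal
        v += dt_ms * (leak_reversal_mv - v) / tau_ms
        # Event: crossing the threshold emits a spike and resets v
        if v > thresh_mv:
            spike_times.append((step + 1) * dt_ms)
            v = reset_mv
    return spike_times


# Illustrative parameter values (not taken from the standard)
spikes = simulate_iaf_tau(leak_reversal_mv=-50.0, thresh_mv=-55.0,
                          reset_mv=-70.0, tau_ms=30.0)
print(f"{len(spikes)} spikes, first at {spikes[0]:.2f} ms")
```

The first level would only check that the four parameters are present and well formed; the update rule inside the loop is what the second level formalizes so that it need not be re-implemented by hand for each simulator.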

Thus, modelers can use the standard NeuroML elements to conveniently build simulator-independent models, while also being able to examine and extend the underlying implementations of models. As a simulator independent language, NeuroML also promotes interoperability between different computational modeling tools, and as a result, the standard library is complemented by a large, well maintained ecosystem of software tools that supports all stages of the model life cycle—from creation, analysis, simulation and fitting, to sharing and reuse. Finally, as discussed in later sections, for advanced use cases where the existing NeuroML model building blocks are insufficient, NeuroML also includes a framework for creating and including new model elements.

NeuroML is a modular, structured language for defining FAIR models

NeuroMLv2 is a modular, structured, hierarchical, simulator independent format. All NeuroML elements are formally defined, independent, self-contained elements with hierarchical relationships between them. An “ionic conductance” model element in NeuroML, for example, can contain zero, one or more “gates” and plug into a “cell” model element, which can then fit into a “population” of a “network” (Fig. 2). To support the range of electrical properties found in biological neurons, ionic conductances with distinct ionic selectivities and dynamics can be generated in NeuroML through the inclusion of different types of gates (e.g. activation, inactivation), their dependence on variables such as voltage and [Ca2+] and their reversal potential. Cell types with different functional and biophysical properties can then be generated by conferring combinations of ionic conductances on their membranes. The conductance density can be adjusted to generate the electrophysiological properties found in vivo. In practice, many examples of ionic conductances that underlie the electrical behavior of neurons are already available in NeuroMLv2 and can simply be plugged into a cell membrane (Fig. 2). Indeed, a model element, once defined in NeuroML, acts as a building block that may be reused any number of times within or across models. Such reuse of model components speeds model construction and prototyping irrespective of the simulation engine used.
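The containment relationships described above can be sketched with plain Python dataclasses. This is a toy object model written for illustration, not the libNeuroML API: all class names, field names, and numerical values are assumptions, and the open fraction is computed as the product of gate open probabilities raised to the number of gate instances, as in Hodgkin-Huxley style conductances.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List


@dataclass
class Gate:
    id: str
    instances: int           # number of identical gating particles
    open_probability: float  # current open probability, in [0, 1]


@dataclass
class IonChannel:
    id: str
    gates: List[Gate] = field(default_factory=list)  # zero, one or more gates

    def open_fraction(self) -> float:
        # A channel with no gates is always open (prod of nothing is 1)
        return prod(g.open_probability ** g.instances for g in self.gates)


@dataclass
class ChannelDensity:
    ion_channel: IonChannel
    cond_density: float  # maximal conductance density, S/m2 (illustrative)

    def conductance(self) -> float:
        return self.cond_density * self.ion_channel.open_fraction()


@dataclass
class Cell:
    id: str
    channel_densities: List[ChannelDensity] = field(default_factory=list)


@dataclass
class Population:
    id: str
    component: Cell
    size: int


@dataclass
class Network:
    id: str
    populations: List[Population] = field(default_factory=list)


# One channel definition is reused by two cell types at different densities
na = IonChannel("na", [Gate("m", 3, 0.5), Gate("h", 1, 0.6)])
leak = IonChannel("leak")
pyr = Cell("pyr", [ChannelDensity(na, 1200.0), ChannelDensity(leak, 3.0)])
inh = Cell("inh", [ChannelDensity(na, 800.0)])
net = Network("net", [Population("pyr_pop", pyr, 10),
                      Population("inh_pop", inh, 5)])
```

Because both cell types hold a reference to the same channel object, a correction to the channel definition propagates to every model that reuses it, which is the practical benefit of the building block approach.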

The defined structure of each model element and the relationships between them inform users of exactly how model elements are to be created and combined. This encourages the construction of well-structured models, reduces errors and redundancy and ensures that FAIR principles are firmly baked into NeuroML models and the ecosystem of tools. As we will see in the following sections, NeuroML’s formal structure also enables features such as model validation prior to simulation, translation into simulation specific formats, and the use of NeuroML as a common language of exchange between different tools.

NeuroML supports a large ecosystem of software tools that cover all stages of the model life cycle

Model building and the generation of scientific knowledge from simulation and analysis of models is a multi-step, iterative process requiring an array of software tools. NeuroML supports all stages of the model development life cycle (Fig. 1), by providing a single model description format that interacts with a myriad of tools throughout the process. Researchers typically assemble ad-hoc sets of scripts, applications, and processes to help them in their investigations. In the absence of standardization, they must work with the specific model formats and APIs that each tool they use requires, and somehow convert model descriptions when using multiple applications in a toolchain. NeuroML addresses this issue by providing a common language for the use and exchange of models and their components between different simulation engines and modeling tools. The NeuroML ecosystem includes a large collection of software tools, both developed and maintained by the main NeuroML contributors (the “core NeuroML tools”) and those external applications which have added NeuroML support (Fig. 3, Appendix 1 Tables 1 and 2).

Step by step guides for using NeuroML, illustrating the various stages of the model development lifecycle. An updated list is available at http://neuroml.org/gettingstarted.

The ecosystem of NeuroML compliant tools and their relation to the model life cycle. The inner circle shows the core NeuroML tools that are maintained directly by the NeuroML developers. These provide core functionality to read, modify, or create new NeuroML models, as well as to analyze, visualize and simulate the models. The outermost layer shows NeuroML compliant tools which have been developed independently to allow various interactions with NeuroML models. These complement the core tools by facilitating model creation, validation, visualization, simulation, fitting/optimization, sharing and reuse. Further information on each of the tools shown here can be found in Appendix 1 Tables 1 and 2.

The core NeuroML tools currently include APIs in several programming languages—Python, Java, C++ and MATLAB. These tools provide critical functionality to allow users to interact with NeuroML components and build models. Using these, researchers can build models from scratch, or read, modify, analyze, visualize, and simulate existing NeuroML models on supported simulation platforms. The simulation platforms (e.g. EDEN (Panagiotou et al., 2022), NEURON (Hines and Carnevale, 1997)), along with other independently developed tools, form the next layer of the software ecosystem—providing extra functionality such as interactive model construction (e.g. neuroConstruct (Gleeson et al., 2007), NetPyNE (Dura-Bernal et al., 2019)), additional visualization (e.g. OSB (Gleeson et al., 2019)), analysis (e.g. NeuroML-DB (Birgiolas et al., 2023)), data-driven validation (e.g. SciUnit (Gerkin et al., 2019)), and archival/sharing (e.g. OSB, NeuroML-DB). Indeed, OSB and NeuroML-DB are prime examples of how advanced neuroinformatics resources can be built on top of standards such as NeuroML.

Table 1 lists a number of interactive, step-by-step guides in the NeuroML documentation, illustrating how software tools can be used to achieve specific tasks across the model development life cycle. In the following sections we discuss the specific functionality available at each stage of model development.

Creating NeuroML models

The structured declarative elements of NeuroMLv2, when combined with a procedural scripting language such as Python, provide a powerful and yet intuitive “building block” approach to model construction. For this reason, Python is now the recommended language for interacting with NeuroML (Fig. 4), although XML remains the primary serialization language for the format (i.e. for saving to disk and depositing in model repositories (Fig. 5)). Python has emerged as a key programming language in science, including many areas of neuroscience (Muller et al., 2015). A Python based NeuroML ecosystem ensures that users can take advantage of Python’s features, and also use packages from the wider Python ecosystem in their work (e.g. Numpy (Harris et al., 2020), Matplotlib (Hunter, 2007)). pyNeuroML, the Python interface for working with NeuroML, is built on top of the Python NeuroML API, libNeuroML (Vella et al., 2014; Sinha et al., 2023a,b) (Fig. 4).

The core NeuroML software stack, and an example NeuroML model created using the Python NeuroML tools. (a) The core NeuroML software stack consists of Java (blue) and Python (orange) based applications/libraries, and the LEMS model ComponentType definitions (green), wrapped up in a single package, pyNeuroML. Each of these modules can be used independently or the whole stack can be obtained by installing pyNeuroML with the default Python package manager, Pip: pip install pyneuroml. (b) An example of how to create a simple NeuroML model is shown, using the NeuroMLv2 Python API (libNeuroML) to describe a model consisting of a population of 10 integrate and fire point neurons (IafTauCell) in a network. The IafTauCell, Network, Population, and NeuroMLDocument model ComponentTypes are provided by the NeuroMLv2 standard. The underlying dynamics of the model are hidden from the user, being specified in the LEMS ComponentType definitions of the elements (see Methods). The simulator-independent NeuroML model description can be simulated on any of the supported simulation back-ends. (c) XML serialization of the NeuroMLv2 model description shows the correspondence between the Python object model and the XML serialization.
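The correspondence between a programmatic model description and its XML serialization can be sketched using only the Python standard library. The example below does not use libNeuroML; it builds the XML directly with xml.etree.ElementTree for a network containing a population of 10 integrate and fire cells. The element and attribute names follow our understanding of the NeuroMLv2 schema, while the identifiers and parameter values are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

NML2_NAMESPACE = "http://www.neuroml.org/schema/neuroml2"


def build_iaf_network_xml(n_cells: int = 10) -> str:
    """Return an XML string describing a population of n_cells iafTauCells.

    Hypothetical helper for illustration; ids and values are assumptions.
    """
    # Root document element, declaring the NeuroMLv2 namespace
    root = ET.Element("neuroml", {"xmlns": NML2_NAMESPACE, "id": "IafNet"})
    # The cell type is defined once at the top level of the document
    ET.SubElement(root, "iafTauCell", {
        "id": "iaf0",
        "leakReversal": "-50mV",
        "thresh": "-55mV",
        "reset": "-70mV",
        "tau": "30ms",
    })
    # The network contains a population that refers to the cell by its id
    network = ET.SubElement(root, "network", {"id": "net0"})
    ET.SubElement(network, "population", {
        "id": "pop0",
        "component": "iaf0",
        "size": str(n_cells),
    })
    return ET.tostring(root, encoding="unicode")


xml_text = build_iaf_network_xml()
print(xml_text)
```

In practice the Python API generates and validates this serialization on the user's behalf, so a sketch like this is mainly useful for seeing how each object maps onto an XML element and each parameter onto an attribute with units.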

Workflow showing how to create and simulate NeuroML models using Python. The Python API can be used to create models which can include elements built from scratch from the core NeuroML standard, re-use elements from previously created models, or create new components based on novel model definitions expressed in LEMS (red). The generated model elements are saved in the default XML-based serialization (blue). The NeuroML core tools (orange) include modules to import model descriptions expressed in the XML serialization, and support multiple options for how simulators can execute these models (green). These include native execution of the NeuroML models by core simulators (1) and others such as EDEN (2) and generation of helper Python scripts allowing direct importation of the NeuroML model by simulators which support this (3). For other simulators, the fully expanded LEMS description of the models can be generated, which can be mapped to simulator specific formats to generate scripts to run in the target simulators (4). Finally, a number of other standardized formats are supported (5), to which the loaded model can be mapped.

As illustrated in Fig. 5, Python can be used to combine different NeuroML components into a model. NeuroML supports a number of pathways for the creation of new models. Modelers may use elements included in the NeuroML standard, re-use user-defined NeuroML model elements from other models or define completely new model elements using LEMS (Fig. 5) (see section on extending NeuroML below). It is common for models to use a combination of these strategies, e.g. Gurnani and Silver (2021); Kriener et al. (2022); Cayco-Gajic et al. (2017), highlighting the flexibility provided by the modular design of NeuroML. NeuroML APIs support all these workflows. The Python tools also include many additional higher level utilities to speed up model construction, such as factory functions, type hints, and convenience functions for building complex multicompartmental neuron models (Box 1).

For construction of complex 3D circuit models, or for users who are not experienced with Python, there are a number of NeuroML compliant online and standalone applications with graphical user interfaces for model construction. These include NetPyNE’s interactive web interface (Dura-Bernal et al., 2019) (which is available on the latest version of OSB) and neuroConstruct (Gleeson et al., 2007) which can export models directly into NeuroML and LEMS. These applications can be used to build and simulate new NeuroML models without requiring programming. Thus, users can take advantage of the individual features provided by these applications to generate NeuroML compliant models and model elements.

Validating NeuroML models

Ensuring a model is “valid” can have different meanings at different stages of the development, testing and analysis of models - from checking whether the source files are in the correct format, to ensuring the model reproduces a significant feature of its biological counterpart. NeuroML’s hierarchical, well-defined structure allows users to check their model descriptions for correctness at multiple levels (Fig. 6), in a manner similar to multi-level testing in software development. Importantly, a majority of the validation tests in NeuroML are run on the models’ NeuroML descriptions prior to simulation.

NeuroML model development incorporates multi-level validation of models. Checks are performed on the model descriptions (blue) before simulation using validation at both the NeuroML and LEMS levels (green). After the models are simulated (yellow), further checks can be run to ensure the output is in line with expected behavior (brown). The OSB Model Validation (OMV) framework can be used to ensure consistent behavior across simulators, and comparisons can be made of model activity to their biological equivalents using SciUnit.

A first level of validation checks the structure of individual model elements against their formal specifications contained in the NeuroML standard. The standard includes information on the parameters of each model element, restrictions on parameter values, their allowed units, their cardinality, and the location of the model element in the model hierarchy, i.e. parent/children relationships. A second level of validation includes a suite of semantic and logical checks. For example, at this level, a model of a multicompartmental cell can be checked to ensure that all segments referenced in segment groups (e.g. the group of dendritic segments) have been defined, and only defined once with unique identifiers. A list of validation tests currently included in the NeuroML core tools can be found in Appendix 1 Table 3. These can be run against NeuroML files at the command line or programmatically in Python (Box 1).
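The segment group example can be illustrated with a short, self-contained sketch of such a semantic check. The function name, input format, and messages are hypothetical and are not taken from the NeuroML core tools; the sketch only shows the logic of verifying that identifiers are unique and that every reference resolves.

```python
from collections import Counter


def check_segment_groups(segment_ids, segment_groups):
    """Return a list of problems found; an empty list means the checks pass.

    segment_ids: ids of all segments defined in the morphology
    segment_groups: mapping of group id -> list of referenced segment ids
    (Hypothetical helper written for illustration.)
    """
    problems = []
    counts = Counter(segment_ids)
    # Each segment must be defined exactly once
    for seg_id, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"segment id {seg_id} is defined {n} times")
    # Every segment referenced by a group must have been defined
    defined = set(counts)
    for group_id, members in segment_groups.items():
        for seg_id in members:
            if seg_id not in defined:
                problems.append(
                    f"group '{group_id}' references undefined segment {seg_id}"
                )
    return problems


valid = check_segment_groups([0, 1, 2], {"dendrite_group": [1, 2]})
invalid = check_segment_groups([0, 1, 1], {"dendrite_group": [1, 3]})
for message in invalid:
    print(message)
```

Checks of this kind need only the model description, which is why they can be run before any simulator is invoked.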

A key advantage of using the NeuroML2/LEMS framework is that dimensions and units are built into LEMS descriptions. This enables automated conversion of units and unit checking, together with the validation of equations. Any expressions in models which are dimensionally inconsistent will be highlighted at this stage. Note that LEMS handles unit conversions “under the hood”: modelers have flexibility in how they enter the units of parameter values (e.g. specifying conductance density in S/m² or mS/cm²) in the NeuroML files, with the underlying LEMS definitions ensuring that a consistent set of dimensions is used in model equations (Cannon et al., 2014). LEMS then takes care of mapping the entered units to the target simulator’s preferred units. This makes model definition, inspection, use, extension, and translation easier and less error-prone.
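The principle behind dimension-aware unit handling can be sketched in a few lines. The following is not how LEMS is implemented; it is a minimal illustration in which each unit carries a scale factor to SI and a tuple of exponents over (mass, length, time, current), so that conversions between units of the same dimension are automatic and mismatched dimensions are rejected. All names are assumptions.

```python
# Dimensions as exponents of (mass, length, time, current)
CONDUCTANCE_DENSITY = (-1, -4, 3, 2)   # S/m2 = A^2 s^3 kg^-1 m^-4
VOLTAGE = (1, 2, -3, -1)               # V = kg m^2 s^-3 A^-1
CURRENT_DENSITY = (0, -2, 0, 1)        # A/m2

# Unit name -> (scale factor to SI, dimension)
UNITS = {
    "S_per_m2": (1.0, CONDUCTANCE_DENSITY),
    "mS_per_cm2": (10.0, CONDUCTANCE_DENSITY),  # 1e-3 S / 1e-4 m2
    "V": (1.0, VOLTAGE),
    "mV": (1e-3, VOLTAGE),
    "A_per_m2": (1.0, CURRENT_DENSITY),
}


def convert(value, from_unit, to_unit):
    """Convert between units, refusing to mix different dimensions."""
    from_scale, from_dim = UNITS[from_unit]
    to_scale, to_dim = UNITS[to_unit]
    if from_dim != to_dim:
        raise ValueError(f"cannot convert {from_unit} to {to_unit}")
    return value * from_scale / to_scale


def multiply_dimensions(a, b):
    """Dimension of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))


# The same physical quantity entered in either unit
print(convert(1.0, "mS_per_cm2", "S_per_m2"))

# Checking an equation: conductance density x voltage is a current density
assert multiply_dimensions(CONDUCTANCE_DENSITY, VOLTAGE) == CURRENT_DENSITY
```

Representing dimensions explicitly is what allows an expression such as a conductance density multiplied by a voltage to be verified as a current density before any numerical value is computed.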

Once the set of NeuroML files is validated, the model can be simulated, and checks can be made to test whether execution produces consistent results (e.g. firing rate of neurons in a given population) across multiple simulators (or versions of the same simulator). For this, the OSB Model Validation (OMV) framework has been developed (Gleeson et al., 2019). This framework can automatically check that the output (e.g. spike times) of a NeuroML model running on a given simulator is within an allowed tolerance of the expected value. OMV has been applied to NeuroML models that have been shared on OSB, to test consistent behavior of models as the models themselves, and all supported simulators, are updated. This has proven to be a valuable ongoing process for ensuring uniform usage and interpretation of NeuroML across the ecosystem of supporting tools.
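The idea of a tolerance based comparison of simulator output can be sketched as follows. This is not OMV code and does not reproduce its exact tolerance definition; the function name, the choice of a relative tolerance, and the spike time values are all assumptions made for illustration.

```python
def within_tolerance(observed, expected, tolerance):
    """Check observed values against expected ones, element by element.

    Uses a relative tolerance: |observed - expected| <= tolerance * |expected|.
    A different number of events counts as a failure.
    (Hypothetical helper; the tolerance definition is an assumption.)
    """
    if len(observed) != len(expected):
        return False
    return all(abs(o - e) <= tolerance * abs(e)
               for o, e in zip(observed, expected))


# Illustrative spike times (ms): a stored reference and two simulator outputs
expected_spikes = [41.6, 83.2, 124.8]
simulator_a = [41.59, 83.18, 124.77]
simulator_b = [41.6, 85.0, 128.4]

print(within_tolerance(simulator_a, expected_spikes, tolerance=0.001))
print(within_tolerance(simulator_b, expected_spikes, tolerance=0.001))
```

Running such a comparison automatically whenever a model or a simulator is updated turns cross-simulator consistency into a regression test rather than a manual inspection.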

A final level of validation concerns checking whether the model elements have emergent features that are in line with experimentally observed behavior of the biological equivalents. NeuronUnit (Gerkin et al., 2019), a SciUnit (Omar et al., 2014) package for data-driven unit testing and validation of neuronal and ion channel models, is fully NeuroML compliant and supports automated validation of NeuroML models shared on NeuroML-DB and OSB.

Visualizing/analyzing NeuroML models

Multiple visualization, inspection, and analysis tools are available in the NeuroML software ecosystem. Since NeuroML models have a fixed, well defined structure, the core NeuroML libraries can extract all information from their descriptions. This information can be used by modelers and their programs/tools to run automated programmatic analyses on models.

pyNeuroML includes a range of ready-made inspection utilities for users (Box 1) that can be used via Python scripts, interactive Jupyter Notebooks, and command line tools. Examining the structure of cell and network models with 2D and 3D views is important for manual validation and to compare them to their biological counterparts. Graphical views of cell model morphology and the 3-dimensional network layout (Fig. 7), population and connectivity matrices/graphs at different levels (Fig. 8), and model summaries including the mathematical dynamics can all be generated (Fig. 9). In addition to these inspection functions, a number of utilities for the inspection of NeuroML descriptions of electrophysiological properties of membrane conductances and their spatial distribution over the neuronal membrane are also provided (Fig. 9).
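Because connectivity is stated explicitly in the model description, summaries such as a population level connectivity matrix can be computed without running a simulation. The sketch below shows the underlying counting logic on a toy list of connections; it is not the pyNeuroML implementation, and the function name, population names, and data are hypothetical.

```python
def population_connectivity(populations, connections):
    """Count connections between populations.

    populations: ordered list of population ids
    connections: iterable of (pre_population, post_population) pairs
    Returns a matrix where entry [i][j] is the number of connections
    from populations[i] to populations[j].
    (Hypothetical helper written for illustration.)
    """
    index = {pop: i for i, pop in enumerate(populations)}
    matrix = [[0] * len(populations) for _ in populations]
    for pre, post in connections:
        matrix[index[pre]][index[post]] += 1
    return matrix


pops = ["exc", "inh"]
conns = [("exc", "exc"), ("exc", "inh"), ("exc", "inh"), ("inh", "exc")]
matrix = population_connectivity(pops, conns)
for pop, row in zip(pops, matrix):
    print(pop, row)
```

The resulting matrix can be passed directly to a plotting library such as Matplotlib to produce a heat map of the kind used to inspect network structure.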

Visualization of detailed morphologies of neurons and networks together with their functional properties (results from model simulation) enabled by NeuroML. (a) Interactive 3-D (VisPy (Campagnola et al., 2023) based) visualization of an olfactory bulb network with detailed mitral and granule cells (Migliore et al., 2014), generated using pyNeuroML. (b) Visualization of an inhibition stabilized network based on Sadeh et al. (2017) using Open Source Brain (OSB) version 1 (Gleeson et al., 2019). (c) Visualization of a network of simplified multicompartmental neurons together with spiking properties using NetPyNE’s GUI (Dura-Bernal et al., 2019), which is embedded in OSB version 2.

Analysis and visualization of network connectivity from NeuroML model descriptions prior to simulation. Network connectivity schematic (a) and connectivity matrix (b) for a half scale implementation of the human layer 2/3 cortical network model (Yao et al., 2022) generated using pyNeuroML.

Examples of analysis functions available for a NeuroML model neuron. (a) Electrophysiological analysis features provided by the NeuroML-DB web based platform (Birgiolas et al., 2023). Plots show four superimposed voltage traces in the top panel and the corresponding current injection traces below. (b) Example plots of steady states of activation (na_channel na_m inf) and inactivation (na_channel na_h inf) variables and their time courses (na_channel na_m tau and na_channel na_h tau) for the Na channel from the classic Hodgkin Huxley model (Hodgkin and Huxley, 1952). (c) The distribution of the peak conductances for the Ih channel over a layer 5 Pyramidal cell (Hay et al., 2011). Both (b) and (c) were generated using the analysis features in pyNeuroML, and similar functionality is also available in OSB (Gleeson et al., 2019).

Box 1.

NeuroML Python tools for users

PyNeuroML, based on the Python libNeuroML API, provides Python functions and command line utilities that support all stages of the model life cycle.

Create
Validate
Inspect and visualize
Simulate
Share and archive

The graphical applications included in the NeuroML ecosystem (e.g. neuroConstruct, NeuroML-DB, OSB (v1 and v2), NetPyNE, and Arbor-GUI) also provide many of their own analysis and visualization functions. OSBv1, for example, supports automated 3D visualization of networks and cell morphologies, network connectivity graphs and metrics, and advanced model inspection features (Gleeson et al., 2019) (Fig. 7b). On OSBv2, NetPyNE provides advanced graphical plotting and analysis facilities (Fig. 7c). A complete JupyterLab interface is also included in OSBv2 for Python scripting, allowing interactive notebooks to be created and shared, mixing scripting and graphical elements, including those generated by pyNeuroML. NeuroML-DB also provides information on electrophysiology, morphology, and the simulation aspects of neuronal models, such as their computational complexity, i.e. an indication of how “fast” the model will run (Birgiolas et al., 2023) (Fig. 9a). In general, any NeuroML compliant application can be used to inspect and analyze elements of NeuroML models, each having its own distinct advantages.

Simulating NeuroML models

Users can simulate NeuroML models using a number of simulation “back ends” without making any changes to their models. This is because the NeuroML/LEMS descriptions of the models can be automatically translated into any number of simulator specific formats. pyNeuroML facilitates access to all available simulation options, both from the command line and using single function calls in Python scripts when using the API (Box 1).

The simulation back ends can be classified into five broad categories (Fig. 5):

  1. core NeuroML/LEMS simulation engines (reference implementations)

  2. other simulators that natively support NeuroML

  3. simulators that import/translate NeuroML to their own internal formats

  4. back ends which are supported through generation of simulator-specific scripts by the core NeuroML tools.

  5. export to other standardized formats which may allow simulation/analysis in other packages.

Having multiple strategies in place for supporting NeuroML gives more freedom to simulator developers to choose how much they wish to be involved with implementing and supporting NeuroML functionality in their applications, while maximizing the options available for end users.

The first category includes jLEMS, jNeuroML, and PyLEMS (Fig. 4). jLEMS serves as the reference implementation for the LEMS language, and as such it can simulate any model described in LEMS (not necessarily from neuroscience). When coupled with the LEMS definitions of NeuroML standard entity structure/dynamics, it can simulate most NeuroML models, though it does not currently support numerically solving the cable equation (Rall, 1962), which is required for multicompartmental neurons. jNeuroML bundles the NeuroML standard LEMS definitions, jLEMS and other functionality into a single package for ease of installation/usage. There is also a pure Python implementation of a LEMS interpreter, PyLEMS, which can be used in a similar way to jLEMS. The pyNeuroML package encapsulates all of these tools to give easy access (at both command line and in Python) to all of their functionality (Box 1).

The second category deals with simulators which use NeuroML natively as their base descriptions of modeling entities, as opposed to having their own specific formats. The EDEN simulator is one such recently developed example, which was designed from its inception to read NeuroML and LEMS models for efficient, parallel simulation (Panagiotou et al., 2022).

The third category involves simulators which have their own internal formats, and which include methods to import NeuroMLv2/LEMS and in this way translate the models to their own formats. Examples include NetPyNE (Dura-Bernal et al., 2019), MOOSE (Ray and Bhalla, 2008), and N2A (Rothganger et al., 2014).

The fourth category comprises simulators for which scripts can be generated by pyNeuroML. These include native NEURON code (consisting of Python, and the simulator’s hoc and NMODL format files), the Arbor simulator (which reuses many NEURON formats, e.g. NMODL for ionic conductances and hoc for morphologies), and the Brian simulator (Stimberg et al., 2019).

The final category consists of export options to standardized formats in neuroscience and the wider computational biology field, which enable interaction with simulators and applications supporting those formats. These include the PyNN package (Davison et al., 2009), which can be run in either NEURON, NEST (Gewaltig and Diesmann, 2007) or Brian, the SONATA data format (Dai et al., 2020) and the SBML standard (Hucka et al., 2003) (see Reusing NeuroML models for more details).

Each of these simulation options has different strengths and limitations, and NeuroML users can take advantage of these by simply changing their simulation back end when simulating their model. Many of these tools also support exporting NeuroML models once they have been created. It is important to note, though, that not all NeuroML models can be exported to, or are supported by, each of these target simulators. This depends on the capabilities of the simulator in question (e.g. whether it supports networks, or morphologically detailed cells), and pyNeuroML will provide feedback if a feature of the model is not supported in a chosen environment.
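As a rough sketch of what changing the simulation back end can look like from the command line, the snippet below builds `pynml` invocations for one and the same LEMS simulation file. The helper `pynml_command`, the `BACKEND_FLAGS` table, and the file name `LEMS_example.xml` are hypothetical and introduced only for illustration; the `-neuron`, `-run`, and `-nogui` flags are assumed from the pyNeuroML command line help and should be checked against the installed version.

```python
# Hypothetical helper: the model/simulation file stays the same and only
# the flags passed to the `pynml` command line tool change.
BACKEND_FLAGS = {
    "jneuroml": [],                 # default: run with the bundled jNeuroML
    "neuron": ["-neuron", "-run"],  # assumed flags: generate NEURON code, then run it
}


def pynml_command(lems_file, backend="jneuroml", nogui=True):
    """Return the argument list for running `lems_file` on `backend`."""
    if backend not in BACKEND_FLAGS:
        raise ValueError(f"unsupported back end in this sketch: {backend}")
    cmd = ["pynml", lems_file, *BACKEND_FLAGS[backend]]
    if nogui:
        cmd.append("-nogui")  # assumed flag: do not open plot windows
    return cmd


if __name__ == "__main__":
    for name in BACKEND_FLAGS:
        # The list could be handed to subprocess.run(...) to execute it.
        print(" ".join(pynml_command("LEMS_example.xml", backend=name)))
```

The point of the sketch is that the model description itself is never edited; only the target of the translation step changes.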

A Python based ecosystem ensures that automated simulation of models can easily be carried out either using scripts, or the command line tools. Utilities to facilitate the execution of simulations on dedicated supercomputing resources, such as the Neuroscience Gateway (NSG) (Sivagnanam et al., 2013), are also available within the ecosystem. OSBv1 takes advantage of these to support the submission of NeuroML model simulation jobs using the NEURON simulator on NSG. NetPyNE also includes batch processing and parameter exploration features, and its deployment on the cloud based OSBv2 platform supports on-demand scaling of computing resources. Finally, the JupyterLab environment on OSBv2 contains all of the core NeuroML tools and various simulation back ends as pre-installed software packages, ready to use.

Optimizing NeuroML models

Development of biologically detailed models of brain function requires that components and emergent properties match the behavior of the corresponding biology as closely as possible. Thus, fitting neurons and networks to experimental data is a critical step in the model life cycle (Rossant et al., 2011; Druckmann et al., 2007). NeuroML promotes data-driven modeling by providing functions to fit and optimize NeuroML models against experimental data in pyNeuroML. pyNeuroML includes the NeuroMLTuner module, which builds on the Neurotune package for tuning and optimizing NeuroML models against data using evolutionary computation techniques. This module allows users to select a set of weighted features from their data to calculate the fitness of populations of candidate models. In each generation, the fittest models are found and mutated to create the next generation of models, until a set of models that best exhibit the selected data features is isolated (see Guide 5 in Table 1).
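The select-and-mutate loop described above can be illustrated with a small, self-contained sketch. This is not the NeuroMLTuner or Neurotune API: the function names, the two parameters (`g_na`, `g_leak`), and the linear `toy_model_features` stand-in for "simulate the model and extract features" are all hypothetical and chosen only to keep the example runnable without a simulator.

```python
import random


def toy_model_features(params):
    # Stand-in for "simulate the candidate model and extract features".
    # The linear relations below are hypothetical, for illustration only.
    return {
        "mean_rate": 10.0 * params["g_na"],
        "resting_v": -70.0 + 5.0 * params["g_leak"],
    }


def fitness(candidate, targets, weights):
    """Weighted squared error against the target features (lower is better)."""
    features = toy_model_features(candidate)
    return sum(weights[k] * (features[k] - targets[k]) ** 2 for k in targets)


def evolve(targets, weights, generations=30, pop_size=20, n_keep=5, seed=0):
    rng = random.Random(seed)
    pop = [{"g_na": rng.uniform(0, 5), "g_leak": rng.uniform(0, 5)}
           for _ in range(pop_size)]
    history = []  # best error found in each generation
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, targets, weights))
        history.append(fitness(pop[0], targets, weights))
        parents = pop[:n_keep]  # the fittest candidates survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            parent = rng.choice(parents)
            # Mutation: small Gaussian perturbation of every parameter.
            children.append({k: v + rng.gauss(0, 0.2) for k, v in parent.items()})
        pop = parents + children
    pop.sort(key=lambda c: fitness(c, targets, weights))
    return pop[0], history


if __name__ == "__main__":
    targets = {"mean_rate": 20.0, "resting_v": -65.0}
    weights = {"mean_rate": 1.0, "resting_v": 2.0}
    best, history = evolve(targets, weights)
    print(best, history[0], history[-1])
```

Because the fittest candidates are carried over unchanged, the best error in this sketch can never increase from one generation to the next; a real tuning run would replace `toy_model_features` with a simulation of the NeuroML model.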

The NeuroML ecosystem includes a number of tools that also provide model fitting features. The Blue Brain Python Optimisation Library (BluePyOpt) (Van Geit et al., 2016), an extensible framework for data-driven model parameter optimization, supports exporting optimized models to NeuroML files. Similar to pyNeuroML, NetPyNE also uses the inspyred Python package to provide evolutionary computation based model optimization features (Dura-Bernal et al., 2019).

Sharing NeuroML models

The NeuroML ecosystem includes the advanced web based model sharing platforms NeuroML-DB (Birgiolas et al., 2023)12 and OSB (Gleeson et al., 2019). These resources have been designed specifically for the dissemination of models and model elements standardized in NeuroML. The integrated collaborative OSB platform also supports visualization, analysis, simulation, and development of NeuroML models. Researchers can create shared, collaborative NeuroML projects on them and can take advantage of the in-built automated visualization and analysis pipelines to explore and re-use models and their components. Whereas version 1 (OSBv1) focused on providing an interactive 3D interface for running pre-existing NeuroML models (e.g. sourced from linked GitHub repositories) (Gleeson et al., 2019), OSBv2 provides cloud based workspaces for researchers to construct NeuroML based computational models as well as analyze, and compare them to, the experimental data on which they are based, thus facilitating data-driven computational modeling. Appendix 1 Table 6 provides a list of stable, well tested NeuroML compliant models from brain regions including the neocortex, cerebellum and hippocampus, which have been shared on OSB.

NeuroML-DB aims to promote the uptake of standardized NeuroML models by providing a convenient location for archiving and exploration. It includes advanced database search functions, including ontology based search (Birgiolas et al., 2015), coupled with pre-computed analyses of models’ electrophysiological and morphological properties, as well as their computational complexity (an indication of the relative speed of execution of different models).

NeuroML’s modular nature ensures that models and their components can be easily shared with others through standard code sharing resources. The simplest way of sharing NeuroML models and components is to share their Python descriptions or their XML serializations available through these resources. Indeed, it is straightforward to make Python descriptions or the XML serializations available via different file, code (GitHub, GitLab), model sharing (ModelDB (Migliore et al., 2003; McDougal et al., 2017)), and archival (Zenodo, Open Science Framework) platforms, just like any other code/data produced in scientific investigations. Complex models with many components, spanning multiple files, such as networks and neuronal models that reference multiple cell and ionic conductance definitions, can also be exported into a COMBINE archive (Bergmann et al., 2014), a zip file with a manifest that includes metadata about its contents. pyNeuroML includes functions to easily create COMBINE archives from NeuroML models and simulations (Box 1).
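To make the "zip file with a manifest" structure concrete, the following sketch assembles a minimal COMBINE-style archive in memory using only the Python standard library. It does not use pyNeuroML's own archive functions; the helper `build_combine_archive` and the file name `cell.nml` are hypothetical, and the format URIs are assumptions based on the COMBINE archive conventions that should be checked against the specification before use.

```python
import io
import zipfile
from xml.sax.saxutils import quoteattr

# Assumed identifiers, following COMBINE archive (OMEX) conventions.
MANIFEST_NS = "http://identifiers.org/combine.specifications/omex-manifest"
OMEX_FORMAT = "http://identifiers.org/combine.specifications/omex"
NEUROML_FORMAT = "http://identifiers.org/combine.specifications/neuroml"


def build_combine_archive(files):
    """Return zip bytes for `files`: archive path -> (text content, format URI)."""
    entries = [
        f"  <content location={quoteattr('.')} format={quoteattr(OMEX_FORMAT)}/>",
        f"  <content location={quoteattr('./manifest.xml')} format={quoteattr(MANIFEST_NS)}/>",
    ]
    for path, (_, fmt) in files.items():
        entries.append(
            f"  <content location={quoteattr('./' + path)} format={quoteattr(fmt)}/>"
        )
    manifest = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f"<omexManifest xmlns={quoteattr(MANIFEST_NS)}>\n"
        + "\n".join(entries)
        + "\n</omexManifest>\n"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", manifest)  # metadata about the contents
        for path, (content, _) in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_combine_archive({"cell.nml": ("<neuroml/>", NEUROML_FORMAT)})
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(archive.namelist())
```

A multi-file model would simply pass more entries, one per cell, ion channel, network, or simulation file, so that the manifest lists every component the archive contains.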

OSB is designed so that researchers can share their code on their chosen platform (e.g. GitHub), while retaining full control over write access to their repositories. Afterwards, a page for the model can be created on OSB which lists the latest files present there, with links to OSB visualization/analysis/simulation features which can use the standardized files found in the resource.

NeuroML supports the embedding of structured ontological information in model descriptions (Neal et al., 2018). Models can include NeuroLex (now InterLex) (Larson and Martone, 2013) identifiers for their components (e.g. neuro_lex_id in Box 1). This links model components to their biological counterparts and makes them more transparent, findable, and reusable. For example, different types of neurons and brain regions have unique ontological ids. A user can use these ids to search for relevant model components on NeuroML-DB.
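As an illustration of how such identifiers make components findable, the sketch below searches a stripped-down XML fragment for elements carrying a given ontology identifier. The fragment is not a schema-valid NeuroML document, the identifiers `EXAMPLE:0001` and `EXAMPLE:0002` are placeholders rather than real InterLex terms, and the helper `components_with_id` is hypothetical. The XML attribute name `neuroLexId` is assumed to be the serialized counterpart of the `neuro_lex_id` field in the Python API.

```python
import xml.etree.ElementTree as ET

NML_NS = "http://www.neuroml.org/schema/neuroml2"

# Simplified fragment for illustration; placeholder ontology identifiers.
EXAMPLE = f"""<neuroml xmlns="{NML_NS}" id="example">
  <cell id="pyr_cell" neuroLexId="EXAMPLE:0001"/>
  <cell id="granule_cell" neuroLexId="EXAMPLE:0002"/>
  <cell id="unannotated_cell"/>
</neuroml>"""


def components_with_id(xml_text, ontology_id):
    """Return the ids of all elements annotated with `ontology_id`."""
    root = ET.fromstring(xml_text)
    return [el.get("id") for el in root.iter()
            if el.get("neuroLexId") == ontology_id]


if __name__ == "__main__":
    print(components_with_id(EXAMPLE, "EXAMPLE:0002"))
```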

Reusing NeuroML models

NeuroML models, once openly shared, become community resources that are accessible to all. Researchers can use models shared on NeuroML-DB and OSB without restrictions. Guide 7 in Table 1 provides an example of finding NeuroML based model components using the API of NeuroML-DB, and creating novel models incorporating these elements.

In addition to these platforms, other experimental data and model dissemination platforms also provide standardized NeuroML versions of relevant models to promote uptake and reuse. For example, NeuroMorpho.org (Ascoli et al., 2007) includes a tool to download NeuroML compliant versions of its cellular reconstructions. NeuroML versions of models released by organizations such as the Blue Brain Project (Markram et al., 2015) (whole cell models as well as ion channel models from Channelpedia (Ranjan et al., 2011)), the Allen Institute for Brain Science (Billeh et al., 2020), and the OpenWorm project (Gleeson et al., 2018) are also openly available for reuse (Appendix 1 Table 6).

NeuroML can also interact with other standards to further promote model re-use. Whereas NeuroML is a declarative standard, PyNN (Davison et al., 2009) is a procedural standard with a Python API for creating network models that can be simulated on multiple simulator back-ends. NeuroML models which are within the scope of PyNN can be converted to the PyNN format, and vice versa. Similarly, NeuroML also interacts with the SONATA data format (Dai et al., 2020) by supporting the two way conversion of the network structures of NeuroML models into SONATA. Among standards not specific to neuroscience, models from the well established SBML standard (Hucka et al., 2003) can be converted to LEMS (Cannon et al., 2014) for inclusion in neuroscience related modeling pipelines, and a subset of NeuroML/LEMS models can be exported to SBML, which allows use with simulators and analysis packages compliant with this standard, e.g. Tellurium (Choi et al., 2018). Simulation execution details of NeuroML/LEMS models can also be exported to the Simulation Experiment Description Markup Language (SED-ML) (Waltemath et al., 2011), allowing advanced resources such as Biosimulators (Shaikh et al., 2022) to feature NeuroML models.

NeuroML is extensible

While the core NeuroML elements (Appendix 1 Tables 4 and 5) provide a broad range of curated model types for simulation based investigations, NeuroML can be extended (using LEMS) to incorporate novel model elements and types when they are not (yet) available in the core standard.

LEMS is a general purpose model specification language for creating fully machine readable definitions of the structure and behavior of model elements (Cannon et al., 2014). The dynamics of NeuroML elements are described in LEMS. The hierarchical nature of LEMS means that new elements can build on pre-existing elements of the modular NeuroML framework. For example, a novel ionic conductance element can extend the “ionChannelHH” element, which in turn extends “baseIonChannel”. Thus, the new element will be known to the NeuroML elements as depending on an external voltage and producing a conductance, properties that are inherited from “baseIonChannel”. Other elements, e.g. a cell, can incorporate this new type without needing any other information about its internal workings.

LEMS (and therefore NeuroML) element definitions (called “ComponentTypes”) specify the dynamical behavior of the model element in terms of a list of yet to be set parameters. Once the generic model behavior is defined, modelers only need to fill in the appropriate values of the required parameters (e.g. conductance density, reversal potential, etc.) to create new instances (called “Components”) of the element (see Methods for more details). Users can therefore create arbitrary, reusable model elements in LEMS, which can be treated the same way as core model elements (for an example see Guide 8 in Table 1).
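The relationship between ComponentTypes (definitions with unset parameters, which may extend other types) and Components (instances with parameter values filled in) can be mimicked in a few lines of Python. This is only a conceptual sketch and not the LEMS or libNeuroML API: the classes below are hypothetical, as are the new type `shiftedChannel` and the parameter names `conductance` and `shift`; only the type names `baseIonChannel` and `ionChannelHH` are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ComponentType:
    name: str
    parameters: Tuple[str, ...] = ()           # parameters declared by this type
    extends: Optional["ComponentType"] = None  # parent type, if any

    def all_parameters(self):
        """Parameters declared here plus those inherited from parent types."""
        inherited = self.extends.all_parameters() if self.extends else ()
        return inherited + self.parameters

    def is_a(self, type_name):
        """True if this type is, or extends, the named type."""
        current = self
        while current is not None:
            if current.name == type_name:
                return True
            current = current.extends
        return False

    def instantiate(self, component_id, **values):
        """Create a Component once every required parameter has a value."""
        missing = [p for p in self.all_parameters() if p not in values]
        if missing:
            raise ValueError(f"missing parameters: {missing}")
        return Component(component_id, self, dict(values))


@dataclass
class Component:
    id: str
    type: ComponentType
    values: Dict[str, str]


# Illustrative hierarchy; parameter names are assumptions.
base_ion_channel = ComponentType("baseIonChannel", ("conductance",))
ion_channel_hh = ComponentType("ionChannelHH", (), extends=base_ion_channel)
shifted_channel = ComponentType("shiftedChannel", ("shift",), extends=ion_channel_hh)

if __name__ == "__main__":
    chan = shifted_channel.instantiate("na_shifted", conductance="10pS", shift="5mV")
    print(chan.type.is_a("baseIonChannel"), chan.values)
```

In this sketch a cell would only need to check `is_a("baseIonChannel")` to know that the new element can be treated as a conductance, which mirrors how inheritance lets other elements use a new type without knowing its internal workings.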

Another major advantage of NeuroML’s use of the LEMS language is its translatability. Since LEMS is fully machine readable, its primitives (e.g. state variables and their dynamics, expressed as ordinary differential equations) can be readily mapped into other languages. As a result, simulator specific code (Blundell et al., 2018) can be generated from NeuroML models and their LEMS extensions (Fig. 5), allowing NeuroML to remain simulator-independent while supporting multiple simulation back-ends.

Newly created elements that may be of interest to the wider research community can be submitted to the NeuroML Editorial Board for inclusion into the standard. The standard, therefore, evolves as new model elements are added, and improved versions of the standard and associated software tool chain are regularly released to the community.

NeuroML is maintained by an international, collaborative community

NeuroML is an open community standard, maintained collectively by a diverse set of stakeholders. This ensures that NeuroML supports the myriad of use cases generated by a multi-disciplinary computational modeling community.

The NeuroML Scientific Committee and the elected NeuroML Editorial Board oversee the standard, the core tools, and the initiative. The Scientific Committee sets the scientific focus of the NeuroML initiative. It ensures that the standard represents the state of the art, that is, that it can encapsulate the latest knowledge in neuronal anatomy and physiology in the corresponding model components. The Scientific Committee also defines the governance structure of the initiative, and works with the wider scientific community to gather feedback on NeuroML and promote its use. The Editorial Board manages the day to day development and maintenance of LEMS, the NeuroML schema, the core software tools, and critical resources such as the documentation. The Editorial Board works with simulator developers in the extended ecosystem to help make tools NeuroML compliant by testing reference implementations and answering technical queries about NeuroML and the core software tools.

NeuroML is strongly linked to global standardization initiatives. NeuroML is an endorsed INCF (Abrams et al., 2022) community standard (Martone et al., 2019) and is one of the core standards of the international COMBINE initiative (Hucka et al., 2015), which supports the development of other standards in computational biology as well (e.g. SBML (Hucka et al., 2003) and CellML (Lloyd et al., 2004)). Participation in these organizations guarantees that NeuroML follows current best practices in standardization, and remains linked to and interoperable with other standards wherever possible.

Outreach and training are critical to the uptake and improvement of NeuroML. These include in person (and in recent years online) tutorials, development workshops, and presentations at the annual meetings of multiple organizations such as INCF, COMBINE, and the Organization for Computational Neuroscience (OCNS). Under the umbrella of the INCF, the NeuroML community also participates in the Google Summer of Code initiative, where candidates have converted published computational models into the standardized NeuroML format. These standardized models are made available to the community on the OSB platform and promoted at OSB workshops.

Communication within the NeuroML community is facilitated via a public mailing list for asynchronous communication and announcements, while open chat channels on Gitter (now Matrix/Element) give users and developers more immediate access to the NeuroML team for troubleshooting and general discussion. Finally, all software repositories hosted on GitHub also have issue trackers which are used for software specific queries. These channels and the aforementioned training activities provide the opportunity to learn more about NeuroML and to get help with standardizing models and making tools NeuroML compliant. A community Code of Conduct sets the standards of communication and behavior expected on all community channels.

A core aim of NeuroML is to enable Open Science and ensure models in computational neuroscience are FAIR. To this end, all development and discussions related to NeuroML are done publicly. The schema, all core software tools, and related resources such as documentation are made freely available under suitable Free/Open Source Software (FOSS) licenses on public platforms. Everyone can, therefore, use, modify, study, and share all NeuroML artifacts without restriction. Users and developers are encouraged to contribute modifications and improvements to the schema and core tools and to participate in the general maintenance and release process.

Discussion

The model description language NeuroMLv2 has matured into a widely adopted community standard for computational neuroscience. Its modular, hierarchical structure can define a wide range of neuronal and circuit model types including simplified representations and those with a high degree of biological detail. The standardized, machine readable format of the NeuroMLv2/LEMS framework provides a flexible, common language for communicating between a wide range of tools and simulators used to create, validate, visualize, analyze, simulate, share and reuse models. By enabling this interoperability, NeuroMLv2 has spawned a large ecosystem of interacting tools that cover all stages of the model development life cycle, bringing greater coherence to a previously fragmented landscape. Moreover, the modular nature of the model components and hierarchical structure conferred by NeuroMLv2, combined with the flexibility of coding in Python, has created a powerful “building block” approach for constructing standardized models from scratch. This can be facilitated by harnessing the NeuroMLv2 tool ecosystem and the multi-level automated validation of NeuroMLv2/LEMS based models. NeuroML has therefore evolved from a standardized archiving format into a mature language that supports an ecosystem of tools for the creation and execution of models that support the FAIR principles and promote open, transparent and reproducible science.

Evolution of NeuroML and emergence of the NeuroMLv2 tool ecosystem

NeuroML was first conceived (Goddard et al., 2001) and developed (Gleeson et al., 2010) as a declarative XML based framework for defining biophysical models of neurons and networks in a standardized form in order to compare model properties across simulators and to promote transparency and reuse. NeuroML version 1 achieved these aims and was mainly used to archive and visualize existing models (Gleeson et al., 2010). Building on this, the subsequent development of the NeuroMLv2/LEMS framework provided a way to describe models as a hierarchical set of components with dimensional parameters and state variables, so that their structure and dynamics are fully machine readable (Cannon et al., 2014). This enabled models to be losslessly mapped to other representations, greatly facilitating interoperability between tools through read-write and automated code generation (Blundell et al., 2018). As NeuroMLv2 matured and became a community standard recognized by the INCF with a formal governance structure, an increasingly wide range of models and modeling tools have been developed or modified to be NeuroMLv2 compliant (Appendix 1 Tables 1, 2 and 6). The core tools, maintained directly by the NeuroML developers (Fig. 4), provide functionality to read, modify, or create new NeuroML models, as well as to analyze, visualize, and simulate the models. Moreover, there are now a large number of tools that have been developed by other members of the community (Fig. 3), including a neuronal simulator designed specifically for NeuroMLv2 (Panagiotou et al., 2022). The emergence of an ecosystem of NeuroMLv2 compliant tools enables modelers to build tool chains that span the model life cycle and to build and reuse standardized models.

NeuroML and other standards in computational neuroscience

A number of other standards and formats exist to support computational modeling of neuronal systems. Whereas NeuroML is a modular, declarative simulator independent standard for biophysical neuronal modelling, PyNN (Davison et al., 2009) and SONATA (Dai et al., 2020) provide a procedural Python based simulator independent API and a framework for efficiently handling large scale network simulations, respectively. Even though there is some overlap in the functionality provided by these standards, they each target distinct use cases and have their own goals and features. The teams developing these standards work in concert to ensure that they remain interoperable with each other, frequently sharing methods and techniques (Dai et al., 2020). This allows researchers to use their standard of choice and easily combine with another if the need arises. PyNN and SONATA are therefore integral parts of the wider NeuroML ecosystem.

Why using NeuroML and Python facilitates the construction of FAIR models

The modular and hierarchical structure of NeuroMLv2, when combined with Python, provides a powerful combination of structured declarative elements and flexible procedural approaches that enables a “Lego-like” building block approach for constructing biologically detailed models (Cayco-Gajic et al., 2017; Billings et al., 2014; Kriener et al., 2022; Gurnani and Silver, 2021). This has been facilitated by the development of pyNeuroML, a single installable package which offers direct access to a range of functionality for handling NeuroML models (Box 1). Moreover, the web-based documentation of NeuroMLv2, with multiple Python scripts illustrating the usage of the language and associated tools (Table 1), has recently been updated and expanded (https://docs.neuroml.org). This provides a core resource for both new and experienced users of NeuroML, facilitating its use in model building. As the examples in this resource illustrate, building models using NeuroMLv2 is efficient and intuitive, as the model components are pre-made and how they fit together is specified. The structured format allows APIs like libNeuroML to incorporate features such as auto-completion and inline validation of model parameters and structure as scripts are being developed. In addition, automated multi-stage model validation checks the code, equations and internal structure against the NeuroML schema, minimizing human errors, and confirms that model simulation outputs are within acceptable bounds (Fig. 6). The NeuroMLv2 ecosystem also provides convenient ways to visualize and inspect the inner structure of models. pyNeuroML provides Python functions and corresponding command line utilities to view neuronal morphology (Fig. 7), neuronal electrophysiology (Fig. 9), and circuit connectivity and schematics (Fig. 8). In addition, custom analysis pipelines and advanced neuroinformatics resources can easily be built using the APIs. For example, loading a NeuroML model of a neuron into OSB enables visualization of the morphology and the spatial distribution of ionic conductance over the membrane as well as inspection of the conductance state variables, while the connectivity and synaptic weight matrices can be automatically displayed for circuit models (Fig. 7, Gleeson et al. (2019)). Such features of OSB, which are made possible by the structured format of NeuroMLv2, promote model transparency, reproducibility and sharing. By enabling the development and sharing of well tested and transparent models, the wider NeuroMLv2 ecosystem promotes Open Science.

Limitations of NeuroML and current work

A limitation of any standardized framework is that there will always be models and model elements that fall outside the current scope of the standard. While NeuroML suffers from this limitation, the underlying LEMS based framework provides a flexible route to develop a wide range of new types of physio-chemical model (Cannon et al., 2014). This is relatively straightforward if the new model component, such as a synaptic plasticity mechanism, fits within the existing hierarchical structure of NeuroMLv2, as the new type of synaptic element can build on an existing base synapse type, which specifies the relevant input and outputs (e.g. local voltage and synaptic current). For more radical shifts in model types (e.g. neuronal morphologies that grow during learning), that do not fit neatly into the current NeuroMLv2 schema, structural changes to the language would be required. This route is more involved as the pros and cons of changes to the structure of NeuroMLv2 would need to be considered by the Scientific Committee and, if approved, implemented by the Editorial Board.

While the current scope of NeuroMLv2 encompasses models of spiking neurons and networks at different levels of biological detail, plans are in place to extend its scope to include more abstract, rate-based models of neuronal populations (e.g. see Wilson and Cowan (1972); Mejias et al. (2016) in Appendix 1 Table 6). Additionally, work is under way to extend current support for SBML (Hucka et al., 2003) based descriptions of chemical signaling pathways (Cannon et al., 2014), to enable more complete biochemical descriptions of sub-cellular activity in neurons and synapses.

There is a growing interest in the field for the efficient generation and serialization of large scale network models, containing numbers of neurons closer to their biological equivalents (Markram et al., 2015; Billeh et al., 2020; Einevoll et al., 2019). While a number of applications in the NeuroML ecosystem support large scale model generation (e.g. NetPyNE, neuroConstruct, PyNN), the default serialization of NeuroML (XML) is inefficient for reading/writing/storing such extensive descriptions. NeuroML does have an internal format for serializing in the binary format HDF5 (see Methods), but has also recently added support for export of models to the SONATA data format (Dai et al., 2020), allowing efficient serialization of large scale models. While individual instances of large scale models are useful, the ability to generate families of these for multiple simulation runs, and more particularly a way to encapsulate, examine and reuse “recipes” for network models, is also required. A prototype package, NeuroMLlite, has been developed which allows these concise network templates to be described and multiple instances of networks to be generated, and facilitates interaction with simulation platforms and efficient serialization formats.

As discoveries and insights in neuroscience inform machine learning and vice versa, there is an increasing need to develop a common framework for describing both biological and artificial neural networks. Model Description Format (MDF) has been developed to address this (Gleeson et al., 2023). This initiative has developed a standardized format, along with a Python API, which allows the specification of artificial neural networks (e.g. Convolutional Neural Networks, Recurrent Neural Networks) and biological neurons using the same underlying entities. Support for mapping MDF to/from NeuroMLv2/LEMS has been included from the start. This work will enable deeper integration of computational neuroscience and “brain-inspired” networks in Artificial Intelligence (AI).

Conclusion and vision for the future

NeuroMLv2 is already a mature community standard that provides a framework for standardizing biologically detailed neuronal network models. By providing a stable, common framework defining the core entities required for biologically detailed neuronal modeling, NeuroML has spawned an ecosystem of tools that span all stages of the model development life cycle. In the short term, we envision the functionality of NeuroML to expand further and for new online resources that facilitate the construction of FAIR models using pyNeuroML to be taken up by the community. The NeuroML development team are also beginning to explore how to combine NeuroML-based circuit models with musculo-skeletal simulations to enable models of the neural control of behavior. In the longer term, developing seamless interfaces between NeuroML and other domain specific standards will enable the development of more holistic models of the neural control of body systems across a wide range of organisms, as well as greater exchange of models and insights between computational neuroscience and AI.

Methods

NeuroMLv2 is formally specified by the NeuroMLv2 XML schema, which defines the allowed structure of XML files that comply with the standard, and the LEMS ComponentType definitions, which define the internal state variables of the underlying elements, providing a machine-readable specification of the time evolution of model components. This core specification is backed up by a suite of software tools that support the model life cycle, and the accompanying usage and development documentation.

We illustrate the key parts of this framework using the HindmarshRose cell model (Hindmarsh et al., 1984; Fig. 10), which, as an abstract point neuron model, serves as an appropriate simple NeuroMLv2 ComponentType.

Example model description of a HindmarshRose1984Cell NeuroML component. (a) XML serialization of the model description containing the main hindmarshRose1984Cell element, with a set of parameters which result in regular bursting. A current clamp stimulus is applied using a pulseGenerator, and a population of one cell is added with this in a network. This XML can be validated against the NeuroML Schema. (b) Membrane potentials generated from a simulation of the model in (a). The LEMS simulation file to execute this is shown in Fig. 14.

The NeuroML XML Schema

We begin with the NeuroMLv2 standard. The standard consists of two parts, each serving different functions:

  1. the NeuroMLv2 XML schema

  2. corresponding LEMS component type definitions

The NeuroMLv2 schema is a language independent data model that constrains the structure of a NeuroMLv2 model description. The NeuroML schema is formally described as an XML Schema document20 in the XML Schema Definition (XSD) format, a recommendation of the World Wide Web Consortium (W3C)21. An XML document that claims to conform to a particular schema can be validated against the schema. All NeuroMLv2 model descriptions can, therefore, be validated against the NeuroMLv2 schema.

The basic building blocks of an XSD schema are “simple” or “complex” types and their “attributes”. All types are created as “extensions” or “restrictions” of other types. Complex types may contain other types and attributes whereas simple types may not. Fig. 11 shows some example types defined in the NeuroMLv2 schema. For example, the Nml2Quantity_none simple type restricts the in-built “string” type using a regular expression “pattern” that limits what string values it can contain. The type Nml2Quantity_none is to be used for unit-less quantities (e.g. 3, 6.7, -1.1e-5) and its restriction pattern translates to “a string that may start with a hyphen (negative sign), followed by any number of numerical characters (potentially containing a decimal point) and, optionally, a string containing a capital or small ‘e’ (to specify the exponent)”. The restriction pattern for the Nml2Quantity_voltage type is similar, but the number must be followed by a “V” or “mV”. In this way, the restriction ensures that a value of type “Nml2Quantity_voltage” represents a physical voltage quantity with units “V” (volt) or “mV” (millivolt). Furthermore, a NeuroMLv2 model description that uses a voltage value that does not match this pattern, for example “0.5 s”, will be invalid.
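The effect of such restriction patterns can be illustrated with ordinary regular expressions. The following is a minimal sketch in Python; the two expressions are assumptions written to follow the verbal description above, and the authoritative patterns are those in the NeuroMLv2 XSD schema itself. The helper name is_valid is hypothetical.

```python
import re

# Assumed patterns that follow the verbal description in the text; the
# authoritative versions are defined in the NeuroMLv2 XSD schema.
NUMBER = r"-?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?"
NML2_QUANTITY_NONE = re.compile(NUMBER)
NML2_QUANTITY_VOLTAGE = re.compile(NUMBER + r"\s*(V|mV)")


def is_valid(pattern, value):
    """Return True only if the whole string matches, as an XSD pattern requires."""
    return pattern.fullmatch(value) is not None


for value in ["3", "6.7", "-1.1e-5", "-60mV", "0.5 s"]:
    print(
        f"{value!r}: unit-less={is_valid(NML2_QUANTITY_NONE, value)}, "
        f"voltage={is_valid(NML2_QUANTITY_VOLTAGE, value)}"
    )
```

Under these assumed patterns, the unit-less examples from the text are accepted by the first expression only, while “0.5 s” is rejected by both, mirroring the invalid voltage example.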

Type definitions taken from the NeuroMLv2 schema, which describes the structure of NeuroMLv2 elements. Top: “simple” types may not include other elements or attributes. Here, the Nml2Quantity_none and Nml2Quantity_voltage types define restrictions on the default string type to limit what strings can be used as valid values for attributes of these types. Bottom: example of a “complex” type, the HindmarshRose cell model (Hindmarsh et al., 1984), that can also include other elements of other types, and extend other types.

The example of a complex type in Fig. 11 is the HindmarshRose1984Cell type that extends the BaseCellMembPotCap complex type (the base type for any cell producing a membrane potential v with a capacitance parameter C), and defines a number of new “required” (compulsory) attributes. These attributes are of simple types—these are all unit-less quantities apart from v_scaling, which has dimension voltage. Note that inherited attributes are not re-listed in the complex type definition—the compulsory capacitance attribute, C, is inherited here from BaseCellMembPotCap.

The NeuroMLv2 schema serves a number of critical functions. A variety of tools and libraries support the validation of files against XSD schema definitions. Therefore, the NeuroMLv2 schema enables the validation of model descriptions prior to simulation (Fig. 6). XSD schema definitions, as language independent data models, also allow the generation of APIs in different languages. More information on how APIs in different languages are generated using the NeuroMLv2 XSD schema definition is provided in later sections.

The NeuroMLv2 XSD schema is also released and maintained as a versioned artifact, similar to the software packages. The current version is 2.3, and can be found in the NeuroML2 repository on GitHub22.

LEMS ComponentType definitions

The second part of the NeuroMLv2 standard consists of the corresponding LEMS ComponentType definitions. Whereas the XSD Schema describes the structure of a NeuroMLv2 model description, the LEMS ComponentType definitions formally describe the dynamics of the model elements.

LEMS (Cannon et al., 2014) is a domain independent general purpose machine-readable language for describing models and their simulations. A complete description of LEMS is provided in Cannon et al. (2014) and in our documentation23. Here, we limit ourselves to a short summary necessary for understanding the NeuroMLv2 ComponentType definitions.

LEMS allows the definition of new model types referred to as ComponentTypes. These are formal descriptions of how a generic model element of that type behaves (the “dynamics”), independent of the specific set of parameters in any instance. To describe the dynamics, such descriptions must list any necessary parameters that are required, as well as the time-varying state variables. The dimensions of these parameters and state variables must be specified, and any expressions involving them must be dimensionally consistent. An instance of such a generic model is termed a Component, and can be instantiated from a ComponentType by providing the necessary parameters. One can think of ComponentTypes as user defined data types similar to “classes” in many programming languages, and Components as “objects” of these types with particular sets of parameters. Types in LEMS can also extend other types, enabling the construction of a hierarchical library of types. In addition, since LEMS is designed for model simulation, ComponentType definitions also include other simulation related features such as Exposures, specifying quantities that may be accessed/recorded by users.
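The class/object analogy can be made concrete with a few lines of Python. This is only a sketch of the idea and not the LEMS or PyLEMS API: the classes ComponentType and Component below are hypothetical stand-ins, only a subset of the HindmarshRose parameters is listed for brevity, and the parameter values are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ComponentType:
    """Generic description of a model element: parameter names mapped to dimensions."""
    name: str
    parameters: Dict[str, str]
    extends: Optional["ComponentType"] = None

    def all_parameters(self) -> Dict[str, str]:
        """Collect parameters, including those inherited from parent types."""
        inherited = self.extends.all_parameters() if self.extends else {}
        return {**inherited, **self.parameters}

    def instantiate(self, id: str, **values: str) -> "Component":
        """Create a Component, insisting that every parameter is supplied."""
        required = self.all_parameters()
        missing = sorted(set(required) - set(values))
        unknown = sorted(set(values) - set(required))
        if missing or unknown:
            raise ValueError(f"missing={missing}, unknown={unknown}")
        return Component(id=id, type=self, values=dict(values))


@dataclass
class Component:
    """An instance of a ComponentType with a particular set of parameter values."""
    id: str
    type: ComponentType
    values: Dict[str, str]


# Hypothetical, abbreviated type hierarchy (placeholder parameter values).
base = ComponentType("baseCellMembPotCap", {"C": "capacitance"})
hr_type = ComponentType(
    "hindmarshRose1984Cell", {"a": "none", "v_scaling": "voltage"}, extends=base
)
hr0 = hr_type.instantiate("hr0", C="1pF", a="1.0", v_scaling="1mV")
print(hr0.type.name, hr0.values)
```

Omitting the inherited parameter C, or supplying a parameter the type does not declare, raises an error in this sketch, which is the behaviour one would expect when instantiating a Component without the necessary parameters.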

There is a one-to-one mapping between elements specified in the NeuroML XSD schema, and LEMS ComponentTypes, with the same parameters specified in each. The addition of new model elements to the NeuroML standard, therefore, requires the addition of new type definitions to both the XSD schema and the LEMS definitions. New user defined ComponentTypes, however, can be easily defined in LEMS and used freely in models, and these do not need to be added to the standard before use. The only limitation here is that new user defined ComponentTypes cannot be validated against the NeuroML schema, since their type definitions will not be included there.

Fig. 12 shows the ComponentType definition for the HindmarshRose1984Cell model element. Here, the HindmarshRose1984Cell ComponentType extends baseCellMembPotCap and inherits its elements. The ComponentType includes a number of parameters that users must provide when creating a new instance (component): a, b, c, d, r, s, x1, v_scaling.

LEMS ComponentType definition of the HindmarshRose cell model (Hindmarsh et al., 1984).

Other parameters, x0, y0, and z0, are used to initialize the three state variables of the model, x, y, z. x is the proxy for the membrane potential of the cell used in the original formulation of the model (Hindmarsh et al., 1984) and is here scaled by a factor v_scaling to expose a more physiological value for the membrane potential of the cell in StateVariable v. A Constant MSEC is defined to hold the value of 1 ms for use in the ComponentType. Next, an Attachment is used to enable the addition of entities that would provide external inputs to the ComponentType. Here, synapses are Attachments of the type basePointCurrent and provide synaptic current input to this ComponentType.

The Dynamics block lists the mathematical formalism required to simulate the ComponentType. By default, variables defined in the Dynamics block are private, i.e., they are not visible outside the ComponentType. To make these visible to other ComponentTypes and to allow users to record them, they must be connected to Exposures. Exposures for this ComponentType include the three state variables, but also the internal derived variables, which while not used by other components, are useful in inspecting the ComponentType and its dynamics. An extra exposure, spiking, is added to allow other NeuroML components access to the spiking state of the cell that will be determined in the Dynamics block.

A number of StateVariable definitions are followed by DerivedVariables, variables whose values depend on other variables. The total synaptic current, iSyn, is a summation of all the synaptic currents, i, received from the synapses that may be attached to this ComponentType. The add value of the reduce field tells LEMS that when there are multiple values, they should all be summed. As noted, x is a scaled version of the membrane potential variable, v. This is followed by the three derived variables, phi, chi, rho where:

The total membrane current of the cell, iMemb, is calculated as the sum of the capacitive current and the synaptic current:

TimeDerivatives are defined for v, y, and z, with the “value” representing the rate of change of each variable:

The final few blocks set the initial state of the component (OnStart),

and define conditional expressions to set the spiking state of the cell:
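As a companion to the walk-through of the Dynamics block, the following Python sketch integrates the Hindmarsh-Rose equations in their standard dimensionless form with a forward Euler scheme. Several things here are assumptions rather than content of the ComponentType definition: the mapping of phi, chi and rho onto the three right-hand-side terms, the commonly used textbook parameter values and injected current (not those of the figure), the initial state, and the zero-crossing rule used to set the spiking flag. The scaling by v_scaling, the constant MSEC and the capacitance are omitted.

```python
def simulate_hindmarsh_rose(t_end=500.0, dt=0.01, current=3.0,
                            a=1.0, b=3.0, c=1.0, d=5.0,
                            r=0.006, s=4.0, x1=-1.6):
    """Forward-Euler integration of the dimensionless Hindmarsh-Rose equations."""
    x, y, z = -1.6, -11.8, 0.0   # assumed initial state (x0, y0, z0)
    spiking = False              # plays the role of the "spiking" exposure
    xs, spike_times = [], []
    for step in range(round(t_end / dt)):
        # Derived variables, all evaluated from the state at the current step.
        phi = y - a * x**3 + b * x**2   # fast terms of dx/dt (assumed mapping)
        chi = c - d * x**2 - y          # dy/dt
        rho = s * (x - x1) - z          # drives the slow variable z
        x += dt * (phi - z + current)
        y += dt * chi
        z += dt * r * rho
        # Assumed spike rule: flag an upward crossing of zero, clear on return.
        if x > 0.0 and not spiking:
            spiking = True
            spike_times.append((step + 1) * dt)
        elif x < 0.0:
            spiking = False
        xs.append(x)
    return xs, spike_times


xs, spike_times = simulate_hindmarsh_rose()
print(f"{len(spike_times)} spikes, x in [{min(xs):.2f}, {max(xs):.2f}]")
```

The separation into a fast pair (x, y) and a slow adaptation variable z, controlled by the small parameter r, is what allows this model to produce bursts of spikes.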

Both the XSD schema and the LEMS ComponentType definitions enable model validation. However, despite some overlap, they support different types of validation. Whereas the XSD schema allows for the validation of model descriptions (e.g. the XML files), the LEMS ComponentType definitions enable validation of model instances, i.e., the “runnable” instances of models that are constructed once components have been created by instantiating ComponentTypes with the necessary parameters, and various attachments created between source and target components. A model description may be used to create a number of different model instances for simulation. Indeed, it is common practice to run models that include stochasticity with different seeds for random number generators to verify the robustness of simulation results. Thus, the validation of dimensions and units that LEMS carries out is done only after a runnable instance of a model has been created.
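The kind of dimensional check described here can be sketched by representing each dimension as a set of exponents over base dimensions. The code below is an illustration only and does not reproduce how jLEMS or PyLEMS implement the check; the function names are hypothetical, only four base dimensions are included, and the rate expression of the form current divided by capacitance is an assumed example.

```python
from typing import Dict

Dimension = Dict[str, int]  # exponents of mass (M), length (L), time (T), current (I)

VOLTAGE: Dimension = {"M": 1, "L": 2, "T": -3, "I": -1}
CURRENT: Dimension = {"I": 1}
CAPACITANCE: Dimension = {"M": -1, "L": -2, "T": 4, "I": 2}
TIME: Dimension = {"T": 1}


def divide(numerator: Dimension, denominator: Dimension) -> Dimension:
    """Dimension of a quotient: subtract exponents and drop any that cancel."""
    result = dict(numerator)
    for base, exponent in denominator.items():
        result[base] = result.get(base, 0) - exponent
    return {base: e for base, e in result.items() if e != 0}


def check_time_derivative(variable: Dimension, rate: Dimension) -> bool:
    """A rate of change must have the dimension of the variable divided by time."""
    return rate == divide(variable, TIME)


# A voltage state variable whose rate is a current divided by a capacitance.
print(check_time_derivative(VOLTAGE, divide(CURRENT, CAPACITANCE)))  # consistent
# Using a bare current as the rate of change of a voltage is inconsistent.
print(check_time_derivative(VOLTAGE, CURRENT))
```

Because the check needs the dimensions of every parameter and variable that appears in an expression, it fits naturally at the stage where a complete, runnable model instance is available.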

The LEMS ComponentType definitions for NeuroMLv2 are also maintained as versioned files that are updated along with the XSD schema. These can also be seen in the NeuroMLv2 GitHub repository24. An index of the ComponentTypes included in version 2.3 of the NeuroML standard, with links to online documentation, is also provided in Appendix 1 Tables 4 and 5.

NeuroML APIs

The NeuroMLv2 software stack relies on the NeuroML APIs that provide functionality to read, write, validate, and inspect NeuroML models. The APIs are programmatically generated from the machine readable XSD schema, thus ensuring that the class for defining a specific NeuroML element in a given language (e.g. Java) has the correct set of fields with the appropriate type (e.g. float or string) corresponding to the allowed parameters in the corresponding NeuroML element. NeuroMLv2 currently provides APIs in a number of languages—Python (libNeuroML which is generated via generateDS25), Java (org.neuroml.model via JAXB XJC26), C++ (NeuroML_CPP via XSD27) and MATLAB (NeuroML-Toolbox which accesses the Java API from MATLAB), and APIs for other languages can also be easily generated as required. LEMS is also supported by a similar set of APIs—PyLEMS in Python, and jLEMS in Java—and since a NeuroMLv2 model description is a set of LEMS Components, the LEMS APIs also support them (e.g. the hindmarshRose1984Cell example in Fig. 10 could be loaded by jLEMS and treated as a LEMS Component).

Fig. 13 shows the use of the NeuroML Python API to describe a model with one HindmarshRose cell. In Python, the instances of ComponentTypes, their Components, are represented as Python objects. The hr0 Python variable stores the created HindmarshRose1984Cell component/object. This is added to a Population, pop0 in the Network net. The network also includes a PulseGenerator with amplitude 5 nA as an ExplicitInput that is targeted at the cell in the population. The model description is serialized to XML (Fig. 10) and validated. Note that, as the standard convention for classes in Python is to use capitalized names, HindmarshRose1984Cell is used in Python, which is serialized as <hindmarshRose1984Cell> in the XML. Users can either share the Python script itself, or the XML serialization. Any valid XML serialization can also be loaded into a Python object model and modified.
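The relationship between capitalized Python class names and the element names in the XML serialization can be mimicked with the Python standard library alone. The sketch below is not the libNeuroML API: the Element base class and the element_name helper are hypothetical stand-ins, and all ids and attribute values other than the 5 nA amplitude and the HRPop0 population name are placeholders. The cell itself is left out, so the output is a fragment rather than a valid NeuroML document.

```python
import xml.etree.ElementTree as ET


def element_name(class_name: str) -> str:
    """Map a capitalized Python class name to its lower-camel-case XML tag."""
    return class_name[0].lower() + class_name[1:]


class Element:
    """Hypothetical stand-in for an API class; keyword arguments become attributes."""

    def __init__(self, **attributes: str):
        self.attributes = attributes
        self.children = []

    def to_xml(self) -> ET.Element:
        node = ET.Element(element_name(type(self).__name__), self.attributes)
        node.extend([child.to_xml() for child in self.children])
        return node


class Network(Element):
    pass


class Population(Element):
    pass


class ExplicitInput(Element):
    pass


class PulseGenerator(Element):
    pass


# Placeholder ids and timings; only the amplitude and population name follow the text.
pulse = PulseGenerator(id="pg0", delay="0ms", duration="1000ms", amplitude="5nA")
net = Network(id="net")
net.children.append(Population(id="HRPop0", component="hr0", size="1"))
net.children.append(ExplicitInput(target="HRPop0[0]", input="pg0"))

print(ET.tostring(pulse.to_xml(), encoding="unicode"))
print(ET.tostring(net.to_xml(), encoding="unicode"))
```

In the real API the classes are generated from the schema, so they also know which attributes and child elements are allowed, which this sketch does not attempt to reproduce.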

Example model description of a HindmarshRose1984Cell NeuroML component in Python, using parameters for regular bursting. This script generates the XML in Fig. 10.

XML is the default serialization of NeuroML, and all existing APIs can read and write the format (and it should be seen as a minimal requirement for new APIs to support XML). There is, however, an alternative HDF528 based serialization of NeuroML files which is supported by both libNeuroML and the Java API, org.neuroml.model29. This format is based on an efficient representation of cell positions and connectivity data as HDF5 data sets, which can be serialized in compact binary format, and loaded into memory for optimized access (e.g. as numpy arrays in libNeuroML). This reduces the size of the saved files for large scale networks, and speeds up loading/writing models, eliminating the need to parse/generate large text files containing XML. Models serialized in this format can be loaded and transformed to simulator code in the same way as XML based models by the Java and Python APIs.

Simulating NeuroML models

The model description shown in Fig. 10 contains no information about how it is to be simulated, or on the dynamics of each model component. Providing this simulation information and linking in the ComponentType definition requires creating a LEMS file to fully specify the simulation. Fig. 14 shows the use of utilities included in the Python pyNeuroML package to describe a LEMS simulation of the HindmarshRose model defined in Fig. 10. The LEMSSimulation object includes simulation specific information, such as the duration of the simulation, the integration time step, and the seed value. It also allows the specification of files for the storage of data recorded from the simulation. In this example, we record the membrane potential, v, of our cell in its population, HRPop0[0]. Similar to the NeuroMLv2 model description, the simulation object can also be serialized to XML for storage and sharing (Fig. 14, bottom).

An example simulation of the HindmarshRose model description shown in Fig. 13, with the LEMS serialization shown at the bottom.

As noted previously, NeuroML/LEMS model and simulation descriptions are machine readable and simulator independent. They can, therefore, be translated into specific formats for simulation on different simulation backends. The core tool for simulating NeuroML/LEMS models is jNeuroML, which is also made available to Python users via pyNeuroML. In addition to functions for programmatic access, jNeuroML and pyNeuroML also provide the jnml and pynml command line tools (Box 1).

jNeuroML supports all five simulator engine categories (Fig. 5). It includes the jLEMS reference LEMS simulator that can simulate LEMS models that do not use the cable equation. It can, therefore, directly simulate simpler models. For example, the HindmarshRose model shown in Figs. 13 and 14 is simulated directly in jNeuroML. jNeuroML can also pass simulations to the EDEN simulator (Panagiotou et al., 2022) for direct simulation. For simulators that require the conversion of NeuroML/LEMS simulations into either scripts which allow importing by the simulator’s own methods (e.g. NetPyNE (Dura-Bernal et al., 2019)) or full conversion to native simulator scripts (e.g. NEURON (Hines and Carnevale, 1997)), jNeuroML includes the org.neuroml.export library30. This library takes the complete, expanded LEMS model simulation instance that is available to it in jNeuroML (via jLEMS) and translates it into the required format. Thus, supporting a new simulation back end that requires translation of NeuroML/LEMS into another format requires the addition of new “writers” to the org.neuroml.export library. jNeuroML also includes the org.neuroml.import31 library that converts from other formats (e.g. SBML (Hucka et al., 2003)) to LEMS for combination with NeuroML models.

All NeuroML and LEMS software packages are made available under FOSS licenses. The source code for all NeuroML packages and the standard can be obtained from the NeuroML GitHub organization32. The NeuroML Python API33 was developed in collaboration with the NeuralEnsemble initiative34, which also maintains other commonly used Python packages such as PyNN (Davison et al., 2009), Neo (Garcia et al., 2014) and Elephant (Denker et al., 2018). LEMS packages are available from the LEMS GitHub organization35.

To ensure replication and reproduction of studies, it is important to note the exact versions of software used in studies. For NeuroML and LEMS packages, archives of each release along with citations are published on Zenodo36 to enable researchers to cite them in their work.

Documentation

A standard and its accompanying software ecosystem must be supported by comprehensive documentation if it is to be of use to the research community. The primary NeuroML documentation for users that accompanies this paper has been consolidated into a JupyterBook (Executable Books Community, 2020) at https://docs.neuroml.org. This includes explanations of NeuroML and computational modeling concepts, interactive tutorials with varying levels of complexity, and information about tools and the functions they provide to support different stages of the model life cycle. The JupyterBook framework supports “executable” documentation through the inclusion of interactive Jupyter notebooks which may be run in the users’ web browser on free services such as OSBv2, Binder.org37 (Project Jupyter et al., 2018) and Google Colab38. Finally, the machine readable nature of the schema and LEMS also enables the automated generation of human readable documentation for the standard and low level APIs (Fig. 15) along with their examples39. In addition, the individual NeuroML software packages each have their own documentation (e.g. pyNeuroML40, libNeuroML41).

Documentation for the HindmarshRose1984Cell NeuroMLv2 ComponentType generated from the XSD schema and LEMS definitions on the NeuroML documentation website, showing its dynamics. More information about the ComponentType can be obtained from the tabs provided.

As with the rest of the NeuroML ecosystem, the documentation is hosted on GitHub42, licensed under a FOSS license, and community contributions to it are welcomed. A PDF version of the documentation can also be downloaded for offline use43.

Acknowledgements

We thank all the members of the NeuroML Community who have contributed to the development of the standard over the years, have added support for the language to their applications, or who have converted published models to NeuroML. We would particularly like to thank the following for contributions to the NeuroML Scientific Committee: Upi Bhalla, Avrama Blackwell, Hugo Cornells, Robert McDougal, Lyle Graham, Cengiz Gunay and Michael Hines. The following have also contributed to developments related to the named tools/simulators/resources: EDEN - Mario Negrello and Christos Strydis, SONATA - Anton Arkhipov and Kael Dai, MOOSE - Subhasis Ray, NeuroML-DB - Justas Birgiolas, NeuroMorpho.Org - Giorgio Ascoli, N2A - Fred Rothganger, pyLEMS - Gautham Ganapathy, MDF - Manifest Chakalov, libNeuroML and NeuroTune - Mike Vella, Open Source Brain - Matt Earnshaw, Adrian Quintana and Eugenio Piasini, SciUnit/NeuronUnit - Richard C Gerkin, Brian - Marcel Stimberg and Dominik Krzemiński, Arbor - Nora Abi Akar, Thorsten Hater and Brent Huisman, BluePyOpt - Jaquier Aurélien Tristan and Werner van Geit, C++/MATLAB APIs - Jonathan Cooper. We thank Rokas Stanislavos, András Ecker, Jessica Dafflon, Ronaldo Nunes, Anuja Negi and Shayan Shafquat for their work converting models to NeuroML format as part of the Google Summer of Code program. We also thank Diccon Coyle for feedback on the manuscript.

Funding

NeuroML software core tools, with a description of their scope, the main programming language they use (or other interaction means, e.g. Command Line Interface (CLI)), and links for more information.

Tools in the wider NeuroML software ecosystem, with a description of their scope, the main programming language they use (or other interaction means, e.g. through a web browser, Graphical User Interface (GUI) or Command Line Interface (CLI)), and links for more information.

Listing of validation tests run by NeuroML

Index of standard NeuroMLv2 ComponentTypes

Index of standard NeuroMLv2 ComponentTypes (continued)

Listing of NeuroML models and example repositories