Simulation Settings
===================

The script ``r.py`` in the Araudia repo can be used to launch a general simulation. The various settings available are explained below.

General Parameters
------------------

**evolution_on**

When ``True``, dividing protocells can occasionally give rise to one offspring that is a daughter variant, giving heritable variation and thus enabling natural selection. If ``False``, protocells divide and protocell type populations grow in size, but daughter variants (which start new protocell type populations) are never produced. Setting this to ``False`` is useful if only the ecological aspects of a group of protocells are of interest.

**regulation_on**

When ``True``, regulatory networks can become active in protocells. Regulatory networks can dynamically control the levels of nutrient import enzymes over short time periods. They are implemented as Neural ODEs with tuneable weights. If ``False``, protocells maintain fixed levels of import enzymes, which can only be changed via evolutionary mutations in offspring.

**min_ptypes_to_continue**

An integer specifying the number of protocell types that must exist in the chemostat for the simulation to keep running. This allows undesirable simulations to be halted early. When set to ``0``, the chemostat will continue simulating even when all protocell types are washed out (simulating just the inflow/outflow of chemicals). When set to ``1``, at least one protocell type must exist for the simulation to keep running, and so on.

**run_on_HPC**

When ``True``, the simulator neither checks parameters nor prompts the user to confirm that the chemostat feeds are correct: it simply force-starts the simulation. Parameters and feeds are assumed to be acceptable. This is a "headless" mode, needed when an HPC cluster, rather than a user, is interacting with the program.

**no_output**

When ``True``, the simulation makes no data files at all, except for a ``sim.stats`` file at the end.
This option is infrequently used, but it can be employed, for example, if only the species diversity at simulation end time is of interest.

**record_trajectories**

When ``True``, detailed event-by-event information about chemical level and protocell population level trajectories is recorded. It is only advisable to use this option for short simulations, because the data output can otherwise quickly fill the filesystem.

**start_in_steady_state**

When ``True``, the chemostat begins with internal chemical concentrations equal to the feed concentrations at time = 0. This is the default option. If set to ``False``, the reactor starts with solvent only and fills up with the chemical solutes supplied by the feeds.

Chemostat Parameters
--------------------

**MU**

Dilution rate of the chemostat. Higher ``MU`` means faster dilution, i.e. more volume per unit time enters the chemostat from the nutrient feeds. Lower ``MU`` means slower dilution (higher mean residence time of the chemostat contents).

**OMEGA**

Chemostat volume (in arbitrary units). Also equal to the number of particles per concentration unit. The chemostat has a constant volume.

Metabolic Network Parameters
----------------------------

**BETA**

The fraction of an imported nutrient's growth value that is guaranteed to be leaked back to the environment as (various) byproducts. From ``0`` to ``1.0``. For example, ``0.2`` means that **at least** 20 percent of the growth value of each imported nutrient is leaked back to the environment (it can be more for some nutrients). This 'minimum leakage constant' promotes cross-feeding relationships.

**SELF_MAINTAIN_COST**

Growth value per unit time that a protocell must use for self-maintenance / housekeeping. Applies to all protocell types. A protocell will only grow and divide if it imports more growth value per unit time than this lower limit.

**RESOURCE_MAX**

Maximum enzyme cost supported by a protocell. This places an upper limit on the number of import enzymes a protocell can have.


**E_MIN**

The minimum level that an import enzyme can have, in any protocell type.

**g_MULTIPLIER**

Multiplier converting the rate of imported growth value per unit time into the division rate of a protocell. A higher value means less nutrient growth value needs to be imported for a protocell to divide. The multiplier enforces that protocells must absorb many nutrient particles before dividing, in line with them being reproducing 'systems' rather than replicating molecules.

Regulatory Network Parameters
-----------------------------

**k_TAU**

Timescale multiplier for regulatory networks. Scales up or down how fast regulatory responses are. A higher value means that regulatory networks can change internal protocell enzyme levels more quickly.

**b1_MULTIPLIER**, **b2_MULTIPLIER**

"Push back" constants, which make sure that regulatory network dynamics keep internal enzyme levels within a valid range. A higher value means a harder push back when an enzyme level limit is crossed.

Evolutionary Parameters
-----------------------

**MUTANT_EVERY_N_DIVISIONS**

A variant daughter protocell is produced, on average, once every this many protocell divisions. In this 'special' division event, one normal and one variant daughter protocell are produced.

**P_MUTATE_PARAMETER**

The probability that any single parameter in the phenotype of a parent protocell will be perturbed (mutated) before being passed on to the variant daughter. Parent types with longer phenotypes have more parameters perturbed on average.

**P_MAJOR_INNOVATION**

The probability that a major innovation happens in a daughter variant protocell. A major innovation involves the loss/gain of a nutrient input and/or the loss/gain of an excreted byproduct.

**PARAMETER_PERTURB_FRAC**

The magnitude of evolutionary parameter changes, given as a fraction of absolute parameter ranges. From ``0`` to ``1.0``. For example, ``0.05`` means that all phenotype parameters are perturbed by up to +/- 5% of their maximum ranges.

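
As a rough illustration of how ``P_MUTATE_PARAMETER`` and ``PARAMETER_PERTURB_FRAC`` might interact, the sketch below perturbs a phenotype stored as a flat list of numbers. It is not taken from the Araudia source: the function name ``perturb_phenotype``, the flat-list phenotype, the uniform perturbation, and the clamping to the parameter range are all assumptions made for the example.

```python
import random


def perturb_phenotype(params, ranges, p_mutate, perturb_frac, rng):
    """Return a perturbed copy of `params` (hypothetical helper, not Araudia code).

    params       -- list of phenotype parameter values
    ranges       -- list of (low, high) bounds, one pair per parameter
    p_mutate     -- plays the role of P_MUTATE_PARAMETER
    perturb_frac -- plays the role of PARAMETER_PERTURB_FRAC
    """
    child = []
    for value, (low, high) in zip(params, ranges):
        # Each parameter is perturbed independently with probability p_mutate.
        if rng.random() < p_mutate:
            span = high - low
            # Assumption: a uniform shift of up to +/- perturb_frac of the range.
            value += rng.uniform(-perturb_frac, perturb_frac) * span
            # Assumption: values are clamped so they stay inside their range.
            value = min(max(value, low), high)
        child.append(value)
    return child


rng = random.Random(0)
parent = [0.2, 0.5, 0.9]
ranges = [(0.0, 1.0)] * len(parent)

# With p_mutate = 1.0 every parameter moves by at most 5% of its range.
variant = perturb_phenotype(parent, ranges, p_mutate=1.0, perturb_frac=0.05, rng=rng)

# With p_mutate = 0.0 the daughter is an exact copy of the parent.
clone = perturb_phenotype(parent, ranges, p_mutate=0.0, perturb_frac=0.05, rng=rng)
```

Because each parameter is tested independently, the expected number of perturbed parameters grows with phenotype length, which matches the remark above about longer phenotypes.
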

Secondary Parameters
--------------------

**f_MULTIPLIER**

Reactor feed concentrations multiplier. Normalised concentrations in the reactor feed CSV files (see below) are multiplied by this number to get the actual reactor feed concentration units.

**MIN_NUTRIENT_PARTICLES_PER_DIV**

A protocell must take up **at least** this many nutrient particles before dividing. This is checked when the chemostat first initialises.

**k_MULTIPLIER**

The Monod "K" constant for all nutrient uptake rates. Set to ``1.0`` by default.

**u1_MULTIPLIER**, **u2_MULTIPLIER**

These constants are used when a chemical universe object is initialised, to define the absolute magnitudes for growth value and import enzyme cost (respectively) for each chemical in the system. These constants can be disregarded if the chemical universe has values set directly in another way.

Format of Nutrient Feed Forcing CSV Files
-----------------------------------------

The type of external nutrient forcing to apply to the chemostat is supplied by using a CSV file. Below is a simple example of a nutrient forcing CSV file:

::

    0, ,1,
    0,0.7,0,0.9

And in a more intuitive table form:

+-----------+-----------+-----------+-----------+
| 0         |           | 1         |           |
+===========+===========+===========+===========+
| 0         | 0.7       | 0         | 0.9       |
+-----------+-----------+-----------+-----------+

- The header row describes which nutrients are flowing into the chemostat: nutrients 0 and 1 in this case.
- Each nutrient has two associated columns. The first column is the **time**, and the second column is the **normalised feed concentration** of the nutrient at that time (between 0 and 1).

In this example, nutrient 0 is supplied to the chemostat at normalised concentration 0.7 at time = 0. It will continue to be supplied at this concentration until the end of the simulation. Likewise, nutrient 1 is supplied at normalised concentration 0.9 at time = 0. The normalised concentrations are multiplied by the ``f_MULTIPLIER`` parameter.
If ``f_MULTIPLIER = 200``, then nutrient 0 is actually supplied at 140 (arbitrary) concentration units and nutrient 1 is supplied at 180 units.

.. note::

   If creating a CSV file in e.g. Excel, make sure to save the file in raw CSV format, not in Excel format.

In the example below, five nutrients are pulsed into the reactor at different times. Each nutrient is given a spike from 0 to maximum concentration that lasts for 5000 time steps:

+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| 0         |           | 1         |           | 2         |           | 3         |           | 4         |           |
+===========+===========+===========+===========+===========+===========+===========+===========+===========+===========+
| 0         | 0         | 0         | 0         | 0         | 0         | 0         | 0         | 0         | 0         |
+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| 10000     | 1         | 20000     | 1         | 30000     | 1         | 40000     | 1         | 50000     | 1         |
+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| 15000     | 0         | 25000     | 0         | 35000     | 0         | 45000     | 0         | 55000     | 0         |
+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+

Note that the time columns for each nutrient are **independent**. Times are not required to line up across a row. One nutrient may change availability in a complicated way and need many time entries, whereas another nutrient may, for example, be constant and need only one time entry, as in the simple example above.

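
The column layout described above can be read with Python's standard ``csv`` module. The sketch below is not Araudia's own loader: the function name ``parse_feed_csv`` is hypothetical, and it assumes that nutrients with fewer time entries simply leave their cells empty in the longer rows.

```python
import csv
import io


def parse_feed_csv(text, f_multiplier):
    """Parse feed CSV text into {nutrient_id: [(time, concentration), ...]}.

    Hypothetical helper for illustration. Concentrations are scaled by
    f_multiplier, mirroring the role of the f_MULTIPLIER parameter.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    feeds = {}
    # Each nutrient owns a (time, concentration) column pair; the nutrient
    # label sits in the first header cell of that pair.
    for col in range(0, len(header), 2):
        label = header[col].strip()
        if not label:
            continue
        schedule = []
        for row in body:
            if col + 1 >= len(row):
                continue  # row too short to hold this column pair
            time_cell = row[col].strip()
            conc_cell = row[col + 1].strip()
            if not time_cell or not conc_cell:
                continue  # assumption: empty cells mean "no entry here"
            schedule.append((float(time_cell), float(conc_cell) * f_multiplier))
        feeds[int(label)] = schedule
    return feeds


# The simple two-nutrient example from this section.
simple = "0, ,1,\n0,0.7,0,0.9\n"
feeds = parse_feed_csv(simple, f_multiplier=200)

# Independent time columns: nutrient 0 is pulsed, nutrient 1 is constant.
ragged = "0, ,1,\n0,0,0,0.9\n10000,1,,\n15000,0,,\n"
ragged_feeds = parse_feed_csv(ragged, f_multiplier=1)
```

Keeping one schedule per nutrient, rather than one table of rows, reflects the point that the time columns are independent of each other.
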

In the final example below, nutrients 2 and 3 are supplied to the chemostat as anti-phase sine waves:

+-----------+-----------+-----------+-----------+
| 2         |           | 3         |           |
+===========+===========+===========+===========+
| 0         | 0.60      | 0         | 0.60      |
+-----------+-----------+-----------+-----------+
| 500       | 0.65      | 500       | 0.55      |
+-----------+-----------+-----------+-----------+
| 1000      | 0.70      | 1000      | 0.50      |
+-----------+-----------+-----------+-----------+
| 1500      | 0.75      | 1500      | 0.45      |
+-----------+-----------+-----------+-----------+
| 2000      | 0.79      | 2000      | 0.41      |
+-----------+-----------+-----------+-----------+
| 2500      | 0.84      | 2500      | 0.36      |
+-----------+-----------+-----------+-----------+
| ...       | ...       | ...       | ...       |
+-----------+-----------+-----------+-----------+

Note that the chemostat nutrient forcing function is **always a step function**. That is, it is made up of discrete steps rather than varying continuously between time entries.
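
One way to picture the step function is as a lookup that returns the most recent entry at or before the query time. The sketch below uses ``bisect`` from the standard library. It is an illustration rather than the simulator's code: the name ``feed_concentration`` is hypothetical, and it assumes that each entry takes effect exactly at its listed time and that the feed is zero before the first entry.

```python
from bisect import bisect_right


def feed_concentration(schedule, t):
    """Evaluate a step-function feed schedule at time t (hypothetical helper).

    schedule -- list of (time, concentration) pairs sorted by time
    """
    times = [time for time, _ in schedule]
    # Index of the last entry whose time is <= t.
    index = bisect_right(times, t) - 1
    if index < 0:
        return 0.0  # assumption: nothing is fed before the first entry
    return schedule[index][1]


# One nutrient from the pulse example: a spike between t = 10000 and t = 15000.
pulse = [(0, 0.0), (10000, 1.0), (15000, 0.0)]

before = feed_concentration(pulse, 9999)     # still at the initial level
during = feed_concentration(pulse, 12500)    # inside the spike
after = feed_concentration(pulse, 60000)     # held at the last entry
```

Under this reading, a smooth forcing such as the sine waves above is approximated more closely by using more closely spaced time entries.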