data science, dynamic simulation modelling, genomics, interactive visualisation, dashboards, image & video analysis

**If you’d like me to work on projects of any size, please get in touch**

e: cnr.lwlss@gmail.com

t: @cnrlwlss

In this post, I will describe a family of three related mathematical models of population dynamics. Since their development, scientists have used these models to simulate the growth of a wide range of different types of populations including the growth of microbial cells, plant growth and the growth of human populations. Each of these three models are written down in the form of ordinary differential equations (ODEs) and are relatively straightforward to simulate from. I will discuss the development of the models and how they work. I will also demonstrate interactive plots for visualising and comparing simulation results across the family.

In 1798, the Reverend Thomas Malthus wrote an influential book outlining his concerns about over-crowding. He developed a simple mathematical model to predict future human population sizes in response to increased rates of food production. He used simulations from this model to make dire predictions about our ability to support the growing human population in the near future and encouraged “moral restraint” to restrict human population growth.

The birth-death model describes the rate of increase in size of a population ($N$): the expression on the left-hand side of the equation below. The rate of change caused by birth (positive) is simply assumed to be proportional to the size of the population itself at any given time. The constant of proportionality between population size and the rate of change of population size with time, due to birth, is the model parameter: $b$. An equivalent parameter $d$ represents the rate at which members of the population die (again, this term depends on $N$). Assuming deterministic behaviour and a continuous population size, the model can be written down as an ordinary differential equation (ODE):

$$\frac{dN}{dt} = b N - d N$$

For discrete stochastic modelling, which explicitly takes the integer nature of real populations and random effects into account, the distinction between birth and death processes can be important, and the model as written above should be considered for use directly. However, in this post we will consider the more usual continuous, deterministic form of the model, which is indistinguishable from the mean of the discrete stochastic equivalent when populations are large, in some sense. In a continuous, deterministic context, the effects of birth and death can usefully be combined into an effective birth constant (or intrinsic growth rate) $r$ (where $r = b - d$), to give the exponential model:

$$
\begin{aligned}
\frac{dN}{dt} &= (b - d) N \\
\Rightarrow \frac{dN}{dt} &= r N
\end{aligned}
$$

This model is consistent with assuming either a) members of this population die at a negligibly low rate (i.e. $d << b$) or b) that there is no advantage to considering birth and death as separate processes. This ODE can be solved, given the initial condition, the size of the population at the start of the experiment, ($N_0$), to give what is called an analytical solution of the ODE. This solution is a mathematical function, which provides a straightforward method to calculate population size at any given time ($t$), given values for the model parameter $r$ and an initial condition $N(0)=N_0$.

$$N(t) = N_0 e^{rt}$$

The functional form for the solution is straightforward to plot. Adjust the sliders below the figure panels to visualise the effects of changing the initial population size $N_0$ and the growth rate $r$ on population dynamics simulated from the exponetial model:

Malthus’s predictions were ultimately incorrect for several reasons, but one major reason was that the model underlying his predictions failed to account for any secondary effects of population size on population growth rate. In particular, according to this model, there is never any negative effect of population size on growth rate, regardless of how big the population eventually becomes. Effectively, this model assumes that the difference between birth rate and death rate does not depend on the size of the population at all. The extreme population explosion predicted by the exponential model is not sustainable. All real populations rely on some substrate (e.g. food) or resource (e.g. housing) for growth as well as the presence of a starting population. In reality, all substrates and resources are finite.

The exponential model is a valid description of populations which are not yet resource-limited or density-limited. As such, it well describes the early growth of small, colonising populations, recently moved to a rich environment with plenty of space and resources. Its simplicity and its ability to describe the early growth of populations makes this model accurate, predictive and useful in quite a wide range of circumstances. For example, the rate at which pure, isolated microbial cell populations grow under laboratory conditions is an important phenotype in microbiology. However the exponential model never does a good job of describing the whole life-cycle of living populations. Given the rapid initial growth experienced by such well-resourced populations, this inability to describe complete growth curves often quickly becomes a major drawback.

Developing and considering the exponential model of population growth is an interesting example of a model being wrong, but useful. It makes testable, quantitative predictions about long-term population behaviour which can be demonstrated to be false. After reading Malthus’s work on population dynamics, Verhulst, another population dynamics researcher, created a more realistic version, which he called the logistic model.

In 1838, in order to improve on Malthus’s exponential model of population growth by capturing the negative effect of high densities, Pierre Francois Verhulst proposed the slightly more complex and more realistic logistic model. The logistic model includes a second parameter $K$, representing the maximum achievable size of the population being modelled. $K$ is often called the population’s carrying capacity, and represents the quality or size of the environment in which the population is growing.

$$\frac{dN}{dt} = r N\left ( 1-\frac{N}{K} \right )$$

The logistic model is also an ODE with the left hand side representing the rate of increase of population size with time. The right hand side of this model consists of two terms multiplied together. The first has the same form as the exponential model (rate constant times population size). The second is a new term representing density dependence. The form of the second term was chosen so that, when the population size is small, this term is essentially equal to $1$, giving the original exponential model. Small in this case is where $N << K$, making the $\frac{N}{K}$ term essentially zero, the second term essentially equal to one, recovering the exponential model. So, following the assumptions underlying the logistic model, small, founder populations grow exponentially, until $\frac{N}{K}$ starts to be noticeably greater than zero. At that point, the population size predictions made by the logistic model start to diverge from those of the exponential model. As the population grows, at some point, $N$ reaches values close to $K$ and so $\frac{N}{K}$ starts to become close to $1$. This means that the second term starts to become close to zero, representing a slowing down of population growth ($\frac{dN}{dt} \rightarrow 0$). In the limit as $N \rightarrow K$, the rate of increase of population size with time becomes zero. In other words, population growth slows down as population size approaches its carrying capacity.

Although $N$ can never actually reach $K$, it can get arbitrarily close, given enough time. $N=K$ is a solution of the equation $\frac{dN}{dt} = 0$ (no growth because population has reached its carrying capacity) and is known as the stable steady state of the logistic ODE, because growth drives populations of all sizes towards $K$. Another steady state solution is the trivial solution $N=0$ (no growth because no population present). This steady state is unstable: any minor deviations from $N=0$ (i.e. by introducing a colonising population into the system) would cause growth to occur and $N$ would rapidly move away from zero.

The logistic model also has an analytical solution. Solving the logistic model ODE is a good exercise in integration by parts:

$$N(t) = \frac{K N_0 e^{rt}}{K + N_0 \left( e^{rt} - 1\right)}$$

The plot below compares simulated output from the exponential model and the logistic model. Adjust the sliders to compare the effect of the parameter values and initial condition on the population dynamics simulated from the two models.

In the controlled environments of microbiology laboratories, it is possible to observe the growth of populations from a small founder population (called an inoculum) to the population’s carrying capacity in a particular environment. A full set of observations of growth from a small inoculum until carrying capacity is called a growth curve. In fact, such experiments are regularly carried out in order to compare the rate at which different microbial populations grow. The symmetrical, sigmoidal curve of the logistic equation (the analytical solution of the logistic model) looks like real growth curves: it has an exponential growth phase and a stationary phase. However, in some cases, real growth curves are not symmetrical: the way that the population grows after inoculation is not necessarily the mirror image of the way that it slows down as it approaches its carrying capacity.

While working at Rothamsted, Francis Richards wrote an article describing the generalised logistic model, which includes another parameter $\nu$, for introducing some asymmetry into the logistic model. Although this extra model parameter widens the range of empirical growth curves that can be well described by the model, the extra parameter also makes the model significantly less mechanistic. They symmetry parameter $\nu$ does not really have a direct biological interpretation. The intrinsic growth constant $r$ represents the rate at which members of a population reproduce, the carrying capacity $K$ represents the amount of space or resources available to a population in a given environment. These are fairly concrete ideas corresponding fairly to biological processes. However, $\nu$ represents the degree of asymmetry of the population’s growth curve, which is far more abstract. Since $\nu$ and $r$ both affect the shape of the early part of the growth curve, the introduction of this new parameter actually interferes with how we can interpret $r$. Nevertheless, the new model looks like this:

$$\frac{dN}{dt} = r N\left ( 1-\left(\frac{N}{K} \right)^\nu \right)$$

As before, it has an analytic solution:

$$N(t) = \frac{K}{\left( 1+\left(\frac{K}{N_0}-1\right)^\nu e^{-r \nu t} \right)^{\frac{1}{\nu}}}$$

Simulations from this model can easily be plotted and compared with the simpler models in the family. As before, adjust the sliders to visualise and compare the effect of model parameters on simulated output:

The family of three population models presented here share a set of parameters ($r$, $K$ & $\nu$) and initial condition ($N_0$). Each of the simpler models can be recovered from either of their more complex equivalents. For example, by setting $K=\infty$, it is possible to recover the exponential model from the logistic model. In practical terms that just means setting $K$ to be sufficiently large so that differences between the logistic model and exponential model over the timescale of interest can be ignored. Similarly, by setting the generalised logistic model parameter $\nu=1$ we can recover the logistic model. We can investigate these scenarios by inspecting the growth curves resulting from appropriate parameter combinations. Another interesting observation which we can make from this visualisation tool is that the effect of increasing $r$ and increasing $N_0$ on the simulated growth curve is quite similar under a range of parameter values. This has implications for being able to identify parameter values by comparing simulated results against noisy data.

In another post, I will fit simulations from these three models to some experimentally observed and some synthetic population growth curves and consider the relative quality of their fits to data. Being able to switch between members of the model family by appropriate choice of parameter values allows us to consider all three alternatives during one model fitting procedure. This will prove useful for asking: which model is the most useful representation of a particular growth curve? This question is distinct from the usual model fitting question: which set of parameters fits data the best, given a particular model?

Tags: population exponential logistic generalised interactive visualisation dynamic simulation solution ODE