Using Zeus and Nestle to model the sunspot cycles
A sunspot is an area on the surface of the Sun that appears much darker than its surroundings, caused by intense magnetic fields inhibiting convection at the Sun's surface. Humanity has been tracking the average number of sunspots visible on the solar disk from Earth, and has found that the sunspot number follows a cycle of roughly 11 years.
In this example, I'll use the "Zeus" and "Nestle" samplers to fit a model describing the sunspot number to a single solar cycle. I'll then use this model to create a new model which predicts the properties of a solar cycle, given the properties of the previous cycle.
Useful imports
# numpy
import numpy as np
# pandas
import pandas as pd
# scipy
from scipy.signal import find_peaks
from scipy.stats import gaussian_kde
from scipy import integrate
from scipy.special import ndtri, gammaln
# Plotting
import corner
import matplotlib.pyplot as plt
%matplotlib inline
# Samplers
import zeus
print('Zeus version: {}'.format(zeus.__version__))
import nestle
print('Nestle version: {}'.format(nestle.__version__))
# misc
import logging
from time import time
Zeus version: 1.0.7
Nestle version: 0.2.0
Viewing the data
Using Pandas, we can load the .csv file containing the average sunspot number over 24 solar cycles, going back to the 1750s. First I will plot the entire dataset to decide which cycles are of interest. Before plotting, though, I'll take the square root of the sunspot number: the square root is a standard variance-stabilising transform for count data, and decreasing the variance in the peak heights will make everything a little easier when it comes to making a model that can predict the height of the next peak.
dataframe = pd.read_csv("SN_m_tot_V2.0.csv", sep=";", usecols=[2,3], header=None)
dates = list(dataframe[2])[80:]
ssn = list(dataframe[3])[80:]
sqrtssn = np.sqrt(ssn)
plt.figure(figsize=(16,3))
plt.plot(dates,sqrtssn)
plt.xlabel("Date / years")
plt.ylabel("Square Root of Sunspot Number")
plt.title("Average sunspot number from 1750-2020")
plt.show()
I chose to look at the 4 solar cycles starting around the year 1924, as there seems to be a consistent (predictable!) change from peak to peak. Next, we need to take a closer look at those 4 solar cycles so that we can create a model.
start = 2020 # data point corresponding to the start of the 1924 solar cycle
period = 123 # average width of a solar cycle, in monthly data points
fig,(ax1,ax2) = plt.subplots(1,2,figsize=(12,3))
# plot region of interest
ax1.plot(dates[start:start + 4*period],sqrtssn[start:start + 4*period])
ax1.set_xlabel("Date / years")
ax1.set_ylabel("Square Root of Sunspot Number")
ax1.set_title("Average sunspot number from 1924-1965")
# fragment the 4 solar cycles into 4 separate lists
peaks = [sqrtssn[start + i*period:start + (i+1)*period] for i in range(4)]
peaktimes = [dates[start + i*period:start + (i+1)*period] for i in range(4)]
# plot a typical solar cycle
ax2.plot(peaktimes[0],peaks[0])
ax2.set_xlabel("Date / years")
ax2.set_title("A typical solar cycle")
plt.show()
The model
We can create the model by splitting it into two parts: the ascending region, and the descending region. To define these regions, we need to know when the cycle begins, when the cycle peaks, and when the cycle ends. We'll call these "t0", "tmax", and "t1" respectively. Next, we need to know the amplitude of the peak, which we will call "c". Finally, we need 2 more parameters, "a1" and "a2", which describe how much the ascending and descending regions curve.
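For reference, the model is a piecewise power law (this is exactly the functional form implemented in the code below):

$$\mathrm{ssn}(t) = \begin{cases} c\left[1 - \left(\frac{t_{\mathrm{max}} - t}{t_{\mathrm{max}} - t_0}\right)^{a_1}\right], & t < t_{\mathrm{max}} \\ c\left[1 - \left(\frac{t - t_{\mathrm{max}}}{t_1 - t_{\mathrm{max}}}\right)^{a_2}\right], & t \geq t_{\mathrm{max}} \end{cases}$$

with the result clipped to remain positive.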
Using all of these, we can define our model for a single solar cycle below:
def cycle(times, c, a1, a2, t0, tmax, t1):
    # enforce a1, a2 > 1
    if a1 < 1:
        a1 = 1
    if a2 < 1:
        a2 = 1
    # enforce t0 < tmax < t1
    if t0 >= tmax:
        t0 = tmax - 1
    if t1 <= tmax:
        t1 = tmax + 1
    # sunspot number as a function of time
    ssn = [c*(1 - ((tmax - t)/(tmax - t0))**a1) if t < tmax
           else c*(1 - ((t - tmax)/(t1 - tmax))**a2) for t in times]
    # keep ssn strictly positive (the Poisson rate used later must be > 0)
    ssn_non_negative = [i if i > 0 else 0.1 for i in ssn]
    return ssn_non_negative
I'll take the first solar cycle (starting 1924), and use some guesses for each parameter to overplot a model, just to give an idea of what the model will look like.
# plot cycle data
plt.plot(peaktimes[0],peaks[0],label="Original Data")
# plot an example model, with guessed parameters
x = np.linspace(min(peaktimes[0]),max(peaktimes[0]),300)
y = cycle(x,11,2.1,1.3,1923.5,1927.5,1935)
plt.plot(x,y,label="Example Model")
plt.xlabel("Date / years")
plt.ylabel("Square Root of Sunspot Number")
plt.title("A typical solar cycle, with an \n example model overplotted")
plt.legend()
plt.show()
Modelling with Zeus
Zeus has a very similar interface to the sampler "emcee"; both use an ensemble of walkers to sample the posterior distribution. The next step is defining the prior distributions for each parameter, which we'll also use to draw the initial positions of the walkers. I'll use a normal distribution for the peak amplitude "c", since it's pretty easy to eyeball. I'll use uniform priors for every other parameter, as they're a little trickier to guess.
nens = 100 # number of ensemble points
ndims = 6 # number of parameters
# mean and standard deviation of normal parameter priors
cmu,csig = 11,1
# lower and upper bounds of uniform parameter priors
a1min,a1max = 1,3
a2min,a2max = 1,3
t0min,t0max = 1922,1925
tmaxmin,tmaxmax = 1926,1929
t1min,t1max = 1933,1936
param_priors = []
# normal prior on c
param_priors.append(np.random.normal(cmu,csig,nens))
# uniform prior on a1
param_priors.append(np.random.uniform(a1min,a1max,nens))
# uniform prior on a2
param_priors.append(np.random.uniform(a2min,a2max,nens))
# uniform prior on t0
param_priors.append(np.random.uniform(t0min,t0max,nens))
# uniform prior on tmax
param_priors.append(np.random.uniform(tmaxmin,tmaxmax,nens))
# uniform prior on t1
param_priors.append(np.random.uniform(t1min,t1max,nens))
param_samples = np.array(param_priors).T
Next, we need to define a log prior, log likelihood, and log posterior. The log prior can be defined as below:
def logprior(theta):
    """
    Function to return the log of the prior, given the set of current parameters
    """
    lprior = 0
    for i in range(len(param_priors)):
        # sum the log priors from each parameter
        if i == 0:
            # normal prior (up to an additive constant)
            lprior -= 0.5*((theta[i] - cmu) / csig)**2
        else:
            # uniform priors: set the bounds
            if i == 1:
                low, up = a1min, a1max
            elif i == 2:
                low, up = a2min, a2max
            elif i == 3:
                low, up = t0min, t0max
            elif i == 4:
                low, up = tmaxmin, tmaxmax
            else:
                low, up = t1min, t1max
            # the parameter must lie between the bounds
            if not low < theta[i] < up:
                lprior = -np.inf
    return lprior
The log likelihood takes the form of a Poisson likelihood, since the sunspot numbers are non-negative counts. Using "lmbda" as the expected value given by the cycle function, we can define the likelihood as below:
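Explicitly, this is the standard Poisson log likelihood,

$$\ln L(\theta) = \sum_i \left[ n_i \ln \lambda_i(\theta) - \lambda_i(\theta) - \ln(n_i!) \right],$$

where $n_i$ are the observed values and the $\ln(n_i!)$ term is evaluated with gammaln(n_i + 1).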
def loglike(theta, times, obs):
    """
    Function to return the log likelihood, given the current parameters, dates, and sunspot counts
    """
    # unpack parameters
    c_like, a1_like, a2_like, t0_like, tmax_like, t1_like = theta
    # expected value of the model at each time
    lmbda = np.array(cycle(times, c_like, a1_like, a2_like, t0_like, tmax_like, t1_like))
    # Poisson log likelihood (the log(n!) term uses gammaln)
    a = np.sum(gammaln(np.array(obs) + 1))
    b = np.sum(np.array(obs) * np.log(lmbda))
    return -np.sum(lmbda) - a + b
Finally, the log posterior is simply the sum of the log prior and the log likelihood:
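In symbols, up to a normalising constant,

$$\ln p(\theta \mid D) = \ln p(\theta) + \ln p(D \mid \theta).$$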
def logposterior(theta, times, obs):
    """
    Function to return the log posterior, given the log prior and log likelihood
    """
    lprior = logprior(theta)
    # check log prior is finite
    if not np.isfinite(lprior):
        return -np.inf
    return lprior + loglike(theta, times, obs)
Sampling the data
Now that we've defined everything we need, we can easily run the sampling process with Zeus.
# create sampler using the first peak
sampler = zeus.sampler(nens, ndims, logposterior, args=[peaktimes[0], peaks[0]])
nburn = 500 # burn-in points
nsamples = 500 # points after burn-in
time0 = time()
sampler.run_mcmc(param_samples, nburn + nsamples)
time1 = time()
print("Time taken to sample first peak with Zeus: {} seconds".format(time1-time0))
Initialising ensemble of 100 walkers...
Sampling progress : 100%|██████████| 1000/1000 [01:16<00:00, 13.13it/s]
Time taken to sample first peak with Zeus: 76.26689600944519 seconds
Results
Collecting the samples after the sampling is equally simple, remembering to discard the burn-in samples at the start of the chain. We can create a corner plot using "corner.py", which shows the posteriors of each parameter along with contour plots describing how one parameter varies with any other:
samples_zeus = sampler.get_chain(flat=True, discard=nburn)
def plotposts(samples, labels, **kwargs):
    """
    Corner plot of the samples, with a Gaussian KDE overlaid on each 1D marginal posterior
    """
    fig = corner.corner(samples, labels=labels, hist_kwargs={'density': True}, **kwargs)
    # indices of the diagonal (1D marginal) axes in the corner plot
    pos = [i*(len(labels)+1) for i in range(len(labels))]
    for axidx, samps in zip(pos, samples.T):
        kde = gaussian_kde(samps)
        xvals = fig.axes[axidx].get_xlim()
        xvals = np.linspace(xvals[0], xvals[1], 50)
        fig.axes[axidx].plot(xvals, kde(xvals), color='firebrick')
labels = ['c', 'a1', 'a2', 't0', 'tmax', 't1']
plotposts(samples_zeus, labels)
The posterior for the "a2" parameter piles up against its lower bound, as though it would peak below a value of 1 if the prior allowed it. However, the model does not have the correct shape if "a2" is less than 1, which suggests that the choice of model may be limiting the quality of the fit.
Next, I'll find the mean and standard deviation of each parameter. I'll also draw random samples from the chains of each parameter, which I'll use to visualise the posteriors.
param_mu = [np.mean(samples_zeus[:,i]) for i in range(6)]
param_sig = [np.std(samples_zeus[:,i]) for i in range(6)]
nfits = 300
param_samples = [np.random.choice(samples_zeus[:,i],nfits) for i in range(6)]
post_samples = np.array(param_samples).T
print("Parameters describing the square root SSN cycle starting 1923: \n \n" +
" c = {} \u00B1 {} square root counts\n".format(param_mu[0], param_sig[0]) +
" a1 = {} \u00B1 {} \n".format(param_mu[1], param_sig[1]) +
" a2 = {} \u00B1 {} \n".format(param_mu[2], param_sig[2]) +
" t0 = {} \u00B1 {} years\n".format(param_mu[3], param_sig[3]) +
"tmax = {} \u00B1 {} years\n".format(param_mu[4], param_sig[4]) +
" t1 = {} \u00B1 {} years\n".format(param_mu[5], param_sig[5]))
Parameters describing the square root SSN cycle starting 1923:
c = 11.23419227189279 ± 0.5155512374811139 square root counts
a1 = 2.046091218734536 ± 0.5511917043187438
a2 = 1.3175512464689283 ± 0.2551941095539496
t0 = 1923.4364606185572 ± 0.3683613113062203 years
tmax = 1927.5745437571138 ± 0.6532168763478214 years
t1 = 1935.0535923941627 ± 0.30717761056370796 years
Plotting the posterior
Below I'll show two plots. The left plot will show the model produced using the mean parameters for the first peak. The right plot will show a posterior predictive plot, where the darker the colour of the plot, the higher the probability of the model passing through that region.
fig, (ax1,ax2) = plt.subplots(1,2,figsize=(13,3))
# mean posterior plot
ax1.plot(peaktimes[0], peaks[0])
x = np.linspace(min(peaktimes[0]), max(peaktimes[0]), 300)
y = cycle(x,*param_mu)
ax1.plot(x,y)
ax1.set_xlabel("Date / years")
ax1.set_ylabel("square root counts")
ax1.set_title("Mean posterior plot for solar cycle starting 1924")
# posterior predictive plot
x = np.linspace(min(peaktimes[0]),max(peaktimes[0]),300)
ax2.plot(peaktimes[0], peaks[0], label="Original Data")
for i in range(len(post_samples)):
    params = post_samples[i]
    y = cycle(x, *params)
    ax2.plot(x, y, 'orange', alpha=0.02, linewidth=3, label="Fitted Model" if i == 0 else "")
ax2.set_xlabel("Date / years")
ax2.set_title("Posterior predictive plot for SSN cycle starting 1924")
leg = ax2.legend()
for lh in leg.legendHandles:
    lh.set_alpha(1)
plt.show()
Predicting the next solar cycles
We now know the properties of the solar cycle between 1924-1934. Next, we want to know the properties of the next three cycles in our region of interest. We could just redefine our prior distributions, and run the sampling again for each peak. Alternatively, we could create a new model which takes the current solar cycle parameters, and uses them to predict the next solar cycle. This model will only realistically be able to make long-range predictions if the solar cycles evolve consistently, as mentioned above.
The way this model is put together is quite complicated. More details can be found here, but in brief: we need 3 scaling parameters (y1,y2,y3), and 4 translation parameters (d0,d1,d2,d3). This model can be implemented as follows:
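For reference, the update rules implemented below are (writing $t_{\mathrm{max}}^{(1)}$ for the fitted peak time of the first cycle, i.e. param_mu[4]):

$$t_0 = t_1^{\mathrm{prev}} + d_0, \qquad c = \frac{y_1\,c^{\mathrm{prev}}}{t_0 - t_{\mathrm{max}}^{\mathrm{prev}}} + d_1,$$

$$t_{\mathrm{max}} = t_0 + y_2\,c + d_2\,(t_0 - t_{\mathrm{max}}^{(1)})^{0.9}, \qquad t_1 = t_{\mathrm{max}} + y_3\,c + d_3\,(t_0 - t_{\mathrm{max}}^{(1)})^{0.3},$$

with $a_1$ and $a_2$ carried over unchanged.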
def predict_cycle(theta_prev, d0, d1, d2, d3, y1, y2, y3):
    # theta_prev contains the set of parameters describing the previous solar cycle
    c_prev, a1, a2, t0_prev, tmax_prev, t1_prev = theta_prev
    # start of cycle found using a slight offset d0 from the end of the previous cycle
    t0 = t1_prev + d0
    # current c found using the current t0 and the previous c, tmax
    c = y1*c_prev/(t0 - tmax_prev) + d1
    # current tmax found using the current t0, c
    # t0 - param_mu[4] is the time elapsed since the peak of the first fitted cycle
    tmax = t0 + y2*c + d2*(t0 - param_mu[4])**0.9
    # current t1 found using the current tmax, c
    t1 = tmax + y3*c + d3*(t0 - param_mu[4])**0.3
    # a1, a2 unchanged
    # return the new set of parameters
    theta_new = (c, a1, a2, t0, tmax, t1)
    return theta_new
We can now try to guess the scaling and translational parameters, and using the set of parameters defining the first solar cycle, we can try to predict the second solar cycle between 1934-1944. I'll show this by plotting the data for the second solar cycle, and overplotting the predicted model. After some trial and error, I found the following prediction:
# predict the next peak parameters, using current peak parameters
params = predict_cycle(param_mu,-1,0,-0.4,0.2,7.5,0.5,0.5)
# plot second solar cycle
plt.plot(peaktimes[1],peaks[1],label="Original Data")
# overplot predicted model with new parameters
x = np.linspace(min(peaktimes[1]),max(peaktimes[1]),300)
plt.plot(x,cycle(x,*params),label="Predicted model")
plt.xlabel("Date / years")
plt.ylabel("Square Root of Sunspot Number")
plt.title("Average square root sunspot number between 1934-1944 \n with overplotted predicted model")
plt.legend()
plt.show()
Using the above guesses, we can define some prior distributions. This is done in exactly the same way as above; however, because of the hard limits baked into the model (such as a1 and a2 being strictly greater than 1), I'll use uniform priors for all parameters.
nens = 100
ndims = 7
d0min,d0max = -2.,0
d1min,d1max = -1.,1.
d2min,d2max = -1.,0.
d3min,d3max = 0.,0.4
y1min,y1max = 7.,8.
y2min,y2max = 0.2,0.8
y3min,y3max = 0.25,0.75
predict_priors = []
# uniform prior on d0
predict_priors.append(np.random.uniform(d0min,d0max,nens))
# uniform prior on d1
predict_priors.append(np.random.uniform(d1min,d1max,nens))
# uniform prior on d2
predict_priors.append(np.random.uniform(d2min,d2max,nens))
# uniform prior on d3
predict_priors.append(np.random.uniform(d3min,d3max,nens))
# uniform prior on y1
predict_priors.append(np.random.uniform(y1min,y1max,nens))
# uniform prior on y2
predict_priors.append(np.random.uniform(y2min,y2max,nens))
# uniform prior on y3
predict_priors.append(np.random.uniform(y3min,y3max,nens))
priors = [(d0min,d0max),(d1min,d1max),(d2min,d2max),(d3min,d3max),
(y1min,y1max),(y2min,y2max),(y3min,y3max)]
predict_samples = np.array(predict_priors).T
We need to create new log prior, log likelihood, and log posterior functions. These are almost exactly the same as above, with only very slight variations, so I won't go into detail about the machinery in these functions.
def loglike_predict(theta, times, obs):
    """
    Function to return the log likelihood, given the current parameters, dates, and sunspot counts
    """
    # predict the new cycle parameters, then mimic the old likelihood function
    params_new = predict_cycle(param_mu, *theta)
    lmbda = np.array(cycle(times, *params_new))
    a = np.sum(gammaln(np.array(obs) + 1))
    b = np.sum(np.array(obs) * np.log(lmbda))
    return -np.sum(lmbda) - a + b
def logprior_predict(theta):
    """
    Function to return the log of the prior, given the set of current parameters
    """
    lprior = 0
    for i in range(len(predict_priors)):
        # uniform prior: the parameter must lie between its bounds
        low, up = priors[i][0], priors[i][1]
        if not low < theta[i] < up:
            lprior = -np.inf
    return lprior
def logposterior_predict(theta, times, obs):
    """
    Function to return the log posterior, given the log prior and log likelihood
    """
    lprior = logprior_predict(theta)
    # check log prior is finite
    if not np.isfinite(lprior):
        return -np.inf
    return lprior + loglike_predict(theta, times, obs)
Running the sampler, using the same burn-in and chain lengths as above, we can tune the parameters so they predict the parameters of the second solar cycle well. We'll then assume that these parameters are the same for the third and fourth peaks, and see how good the quality of the fit is with increasing forecast time.
sampler = zeus.sampler(nens, ndims, logposterior_predict, args=[peaktimes[1], peaks[1]])
nburn = 500
nsamples = 500
time0 = time()
sampler.run_mcmc(predict_samples, nburn + nsamples)
time1 = time()
print("Time taken to sample predict function with Zeus: {} seconds".format(time1-time0))
Initialising ensemble of 100 walkers...
Sampling progress : 100%|██████████| 1000/1000 [01:19<00:00, 12.56it/s]
Time taken to sample predict function with Zeus: 79.66238069534302 seconds
Prediction results
Out of curiosity, we can check the corner plot for the prediction model. Since every parameter in the prediction model interacts with the others, we can expect a lot of covariance between parameters.
samples_zeus2 = sampler.get_chain(flat=True, discard=nburn)
predict_param_mu = [np.mean(samples_zeus2[:,i]) for i in range(7)]
predict_param_sig = [np.std(samples_zeus2[:,i]) for i in range(7)]
nfits = 300
predict_param_samples = [np.random.choice(samples_zeus2[:,i],nfits) for i in range(7)]
predict_post_samples = np.array(predict_param_samples).T
labels = ['d0','d1','d2','d3','y1', 'y2', 'y3']
plotposts(samples_zeus2,labels)
Let's check what predictions we can now make. The code below takes the set of parameters describing the first solar cycle and uses it to predict the second solar cycle. The prediction is plotted over the data, and then the program moves on to the next peak and repeats the process. Since we're making predictions from predictions, over and over, we can expect the fit to get worse the deeper we look into the future.
I'll also use the samples from the prediction model posteriors, and create a posterior predictive plot for each cycle. This should show the quality of the prediction, and how it evolves with increasing forecast time.
fig,axs = plt.subplots(3,2,figsize=(10,10))
# predict second peak parameters
# plot mean posterior plot
new_param_mu = predict_cycle(param_mu,*predict_param_mu)
axs[0,0].plot(peaktimes[1],peaks[1])
x = np.linspace(min(peaktimes[1]),max(peaktimes[1]),300)
axs[0,0].plot(x,cycle(x,*new_param_mu))
axs[0,0].set_ylabel("Square Root of Sunspot Number")
axs[0,0].set_title("Average sunspot number for the solar cycles\n between 1933-1965" +
                   " with overplotted\n mean predicted models")
# posterior predictive plot
axs[0,1].plot(peaktimes[1],peaks[1])
new_pred = []
for i in range(len(predict_post_samples)):
    pred = predict_cycle(param_mu, *predict_post_samples[i])
    new_pred.append(pred)
    y = cycle(x, *pred)
    axs[0,1].plot(x, y, "orange", alpha=0.04, linewidth=3)
axs[0,1].set_title("Average sunspot number for the solar cycles\n between 1933-1965" +
                   " with overplotted\n posterior predictive plots")
# predict third peak parameters
# plot mean posterior plot
new_param_mu = predict_cycle(new_param_mu, *predict_param_mu)
axs[1,0].plot(peaktimes[2],peaks[2])
x = np.linspace(min(peaktimes[2]),max(peaktimes[2]),300)
axs[1,0].plot(x,cycle(x,*new_param_mu))
axs[1,0].set_ylabel("Square Root of Sunspot Number")
axs[1,0].set_xlabel("Date / years")
# posterior predictive plot
axs[1,1].plot(peaktimes[2],peaks[2])
new_pred2 = []
for i in range(len(predict_post_samples)):
    pred = predict_cycle(new_pred[i], *predict_param_mu)
    new_pred2.append(pred)
    y = cycle(x, *pred)
    axs[1,1].plot(x, y, "orange", alpha=0.04, linewidth=3)
# predict fourth peak parameters
# plot mean posterior plot
new_param_mu = predict_cycle(new_param_mu, *predict_param_mu)
axs[2,0].plot(peaktimes[3],peaks[3],label="Original Data")
x = np.linspace(min(peaktimes[3]),max(peaktimes[3]),300)
axs[2,0].plot(x,cycle(x,*new_param_mu),label="Predicted Model")
axs[2,0].set_ylabel("Square Root of Sunspot Number")
axs[2,0].set_xlabel("Date / years")
axs[2,0].legend()
# posterior predictive plot
axs[2,1].plot(peaktimes[3],peaks[3])
for i in range(len(predict_post_samples)):
    pred = predict_cycle(new_pred2[i], *predict_param_mu)
    y = cycle(x, *pred)
    axs[2,1].plot(x, y, "orange", alpha=0.04, linewidth=3)
axs[2,1].set_xlabel("Date / years")
plt.show()
Considering that our prediction model is fairly simple, it actually does a pretty good job of predicting the solar cycle shape. The posterior plots show the predictions getting worse over time, as expected, but even so the accuracy of the mean plots is within 20% for the majority of the cycles.
Modelling with Nestle
The predictions from this model only work if we know in advance that the next solar cycle will carry on the trend from the previous solar cycle. Since in reality it is very hard to say how the Sun will behave in advance, this model won't be all that useful.
However, it did save us from having to individually sample 3 solar cycles. To decide whether or not the time saved is worth the loss in accuracy, I'll use "Nestle" to sample both the second solar cycle (between 1934-1945) and the prediction model for that same solar cycle. Nestle uses nested sampling, and so it produces a value for the marginalised evidence. Comparing these values will allow us to see how much accuracy we're sacrificing by predicting the solar cycle instead of just sampling the next peak.
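For reference, the marginalised evidence that nested sampling estimates is the integral of the likelihood over the prior,

$$Z = \int \mathcal{L}(\theta)\,\pi(\theta)\,\mathrm{d}\theta,$$

so a higher Z indicates a model that explains the data better, averaged over its prior.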
Sampling the data
To start, we'll assume that Zeus did a good enough job of sampling the first solar cycle, and jump straight to the second cycle (1934-1945). Let's define some new guesses for each parameter (c,a1,a2,t0,tmax,t1):
# mean and standard deviation of normal parameter priors
cmu,csig = 13,1
# lower and upper bounds of uniform parameter priors
a1min,a1max = 1,3
a2min,a2max = 1,3
t0min,t0max = 1931,1935
tmaxmin,tmaxmax = 1936,1940
t1min,t1max = 1943,1947
param_priors = [(cmu,csig),(a1min,a1max),(a2min,a2max),
(t0min,t0max),(tmaxmin,tmaxmax),(t1min,t1max)]
We can reuse most of what we had from before, except for the log prior functions. Nestle samples from a unit hypercube parameter space, so instead of a log prior we need a function that transforms points from the unit cube back into the true parameter space. This only requires a slight modification to our previous "logprior" function, making use of scipy's ndtri function.
I'll also make a very minor change to the log likelihood function. Instead of taking the dates and sunspot counts as arguments, it will use the data for the second solar cycle by default.
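For reference, the transform is straightforward: a unit-cube coordinate $u \in [0,1]$ maps to

$$\theta = \theta_{\mathrm{min}} + u\,(\theta_{\mathrm{max}} - \theta_{\mathrm{min}})$$

for a uniform prior, and to

$$\theta = \mu + \sigma\,\Phi^{-1}(u)$$

for a normal prior, where $\Phi^{-1}$ is the inverse normal CDF (scipy's ndtri).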
def prior_transform(theta):
    """
    Function to transform the parameters from the unit hypercube to their true form
    """
    trans_params = []
    # transform the normal prior on c
    trans_params.append(param_priors[0][0] + param_priors[0][1]*ndtri(theta[0]))
    # transform the uniform priors on the remaining parameters
    for i in range(1, 6):
        mini, maxi = param_priors[i][0], param_priors[i][1]
        trans_params.append(theta[i]*(maxi - mini) + mini)
    return trans_params
def loglike_nestle(theta):
    """
    Function to return the log likelihood, with the data fixed to the solar cycle between 1934-1945
    """
    obs, times = peaks[1], peaktimes[1]
    # unpack parameters
    c_like, a1_like, a2_like, t0_like, tmax_like, t1_like = theta
    # expected value
    lmbda = np.array(cycle(times, c_like, a1_like, a2_like, t0_like, tmax_like, t1_like))
    a = np.sum(gammaln(np.array(obs) + 1))
    b = np.sum(np.array(obs) * np.log(lmbda))
    return -np.sum(lmbda) - a + b
We're now ready to run the sampling using Nestle:
# set number of dimensions, live points, sampling method, and stopping criterion
ndims = 6
nlive = 1024
method = 'multi'
stop = 0.1
time0 = time()
results_sample = nestle.sample(loglike_nestle, prior_transform, ndims,
method=method, npoints=nlive, dlogz=stop)
time1 = time()
print("Time taken to sample second solar cycle with Nestle: {} seconds".format(time1-time0))
Time taken to sample second solar cycle with Nestle: 18.891968965530396 seconds
We'll now try the sampling again, but this time we'll use the parameters from the first solar cycle (found using Zeus), and sample the parameters of the prediction model. Start by guessing at the parameters for the prediction model:
# reuse our guesses from Zeus
d0min,d0max = -2.,0
d1min,d1max = -1.,1.
d2min,d2max = -1.,0.
d3min,d3max = 0.,0.4
y1min,y1max = 7.,8.
y2min,y2max = 0.2,0.8
y3min,y3max = 0.25,0.75
predict_priors = [(d0min,d0max),(d1min,d1max),(d2min,d2max),(d3min,d3max),
(y1min,y1max),(y2min,y2max),(y3min,y3max)]
We have to tinker with our likelihood and prior-transform functions to suit the prediction model. This is done exactly as it was previously with Zeus.
def loglike_nestle_predict(theta):
    """
    Function to return the log likelihood, given the current parameters, dates, and sunspot counts
    """
    obs, times = peaks[1], peaktimes[1]
    # predict the new cycle parameters, then mimic the old likelihood function
    params_new = predict_cycle(param_mu, *theta)
    lmbda = np.array(cycle(times, *params_new))
    a = np.sum(gammaln(np.array(obs) + 1))
    b = np.sum(np.array(obs) * np.log(lmbda))
    return -np.sum(lmbda) - a + b

def prior_transform_predict(theta):
    """
    Function to transform the parameters from the unit hypercube to their true form
    """
    trans_priors = []
    for i in range(len(predict_priors)):
        # transform each uniform prior
        mini, maxi = predict_priors[i][0], predict_priors[i][1]
        trans_priors.append(theta[i]*(maxi - mini) + mini)
    return trans_priors
Now that everything is set up, I'll run through the same sampling process, using the same hyper-parameters for fairness.
# set number of dimensions, live points, sampling method, and stopping criterion
ndims = 7
nlive = 1024
method = 'multi'
stop = 0.1
time0 = time()
results_predict = nestle.sample(loglike_nestle_predict, prior_transform_predict, ndims,
method=method, npoints=nlive, dlogz=stop)
time1 = time()
print("Time taken to sample prediction model with Nestle: {} seconds".format(time1-time0))
Time taken to sample prediction model with Nestle: 13.779999017715454 seconds
Results
Next, we find the log of the marginalised evidence provided by Nestle. This can be done as follows, using the information gain to estimate the error:
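Nestle returns the information gain h, and the standard error on $\ln Z$ can be estimated as

$$\sigma_{\ln Z} \approx \sqrt{h / n_{\mathrm{live}}}.$$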
logZ_sample = results_sample.logz
logZerr_sample = np.sqrt(results_sample.h/nlive)
logZ_predict = results_predict.logz
logZerr_predict = np.sqrt(results_predict.h/nlive)
print("log(Z) from sampling the solar cycle = {} ± {}".format(logZ_sample, logZerr_sample))
print("log(Z) from predicting the solar cycle = {} ± {}".format(logZ_predict, logZerr_predict))
log(Z) from sampling the solar cycle = -266.75746256813557 ± 0.07494693736170717
log(Z) from predicting the solar cycle = -266.81473144676227 ± 0.07314472974917986
The Bayes factor is a metric that describes how much more likely one model is than another to produce the observed data set. It's defined as the ratio of the marginalised evidences of the two models:
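In our case,

$$K = \frac{Z_{\mathrm{predict}}}{Z_{\mathrm{sample}}} = \exp\left(\ln Z_{\mathrm{predict}} - \ln Z_{\mathrm{sample}}\right).$$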
K = np.exp(logZ_predict - logZ_sample)
print("Bayes factor: {}".format(K))
Bayes factor: 0.9443401223523545
This tells us that sampling via the prediction method, rather than just sampling the peak, doesn't come at much of a cost to the quality of the model. Since the methods produce similar results, let's collect the samples from the prediction method to use for visualising our results. Since Nestle uses nested sampling, we have to resample with weights to obtain the posteriors.
weights = results_predict.weights/np.max(results_predict.weights)
mask = np.where(np.random.rand(len(weights)) < weights)[0]
# collect posterior samples
samples_nestle = results_predict.samples[mask,:]
# collect posterior means
predict_param_mu_nestle = [np.mean(samples_nestle[:,i]) for i in range(7)]
# collect samples for posterior predictive plot
nfits = 300
predict_param_samples_nestle = [np.random.choice(samples_nestle[:,i],nfits) for i in range(7)]
predict_post_samples_nestle = np.array(predict_param_samples_nestle).T
Plotting the posterior
Let's start by making a comparison between the posteriors from Zeus and Nestle. Below I'll create two plots. The first will show the mean posterior predictions of the second solar cycle, from both Zeus and Nestle. The second plot will show the posterior predictive plots, for both samplers also.
fig,(ax1,ax2) = plt.subplots(1,2,figsize=(13,4))
# plot mean posterior plot
new_param_mu_zeus = predict_cycle(param_mu,*predict_param_mu)
new_param_mu_nestle = predict_cycle(param_mu,*predict_param_mu_nestle)
ax1.plot(peaktimes[1],peaks[1],label="Original Data")
x = np.linspace(min(peaktimes[1]),max(peaktimes[1]),300)
ax1.plot(x,cycle(x,*new_param_mu_zeus),"orange",label="Zeus Prediction")
ax1.plot(x,cycle(x,*new_param_mu_nestle),"purple",label="Nestle Prediction")
ax1.set_ylabel("Square Root of Sunspot Number")
ax1.set_xlabel("Date / years")
ax1.set_title("Average sunspot number for the solar cycles\n between 1934-1945" +
              " with overplotted\n mean predicted models")
ax1.legend()
# posterior predictive plot
ax2.plot(peaktimes[1],peaks[1])
for i in range(len(predict_post_samples)):
    pred = predict_cycle(param_mu, *predict_post_samples[i])
    y = cycle(x, *pred)
    ax2.plot(x, y, "orange", alpha=0.04, linewidth=3)
    # small 0.01 offset so the overlapping Zeus and Nestle curves can be told apart
    pred = predict_cycle(param_mu, *(np.array(predict_post_samples_nestle[i]) + 0.01))
    y = cycle(x, *pred)
    ax2.plot(x, y, "purple", alpha=0.04, linewidth=3)
ax2.set_xlabel("Date / years")
ax2.set_title("Average sunspot number for the solar cycles\n between 1934-1945" +
              " with overplotted\n posterior predictive plots")
plt.show()
Whilst the posterior predictive plot looks a little messy, it tells us that the posterior predictives from Zeus and Nestle are difficult to tell apart, since they overlap so heavily. The mean posterior plot supports this, showing essentially no deviation between the two samplers over the entire cycle.
Finally, let's make a posterior predictive plot that shows the evolution of the prediction accuracy over time. Below is a plot that shows predictions over the entire range of 4 solar cycles between 1923 and 1965.
plt.figure(figsize=(14,4))
# plot data between 1923 and 1965
plt.plot(dates[start:start + 4*period],sqrtssn[start:start + 4*period],label="Original Data")
# plot first cycle posterior predictive, using Zeus samples
x = np.linspace(min(peaktimes[0])-20,max(peaktimes[0]),300)
for i in range(len(post_samples)):
    params = post_samples[i]
    y = cycle(x, *params)
    plt.plot(x, y, "purple", alpha=0.02, linewidth=3, label="Fitted Model" if i == 0 else "")
# use first cycle posterior predictive to find second cycle posterior predictive
new_pred = []
x = np.linspace(min(peaktimes[1])-20,max(peaktimes[1])+20,300)
for i in range(len(predict_post_samples_nestle)):
    pred = predict_cycle(param_mu, *predict_post_samples_nestle[i])
    new_pred.append(pred)
    y = cycle(x, *pred)
    plt.plot(x, y, "orange", alpha=0.04, linewidth=3, label="Predicted Model" if i == 0 else "")
# use second cycle posterior predictive to find third cycle posterior predictive
x = np.linspace(min(peaktimes[2])-20,max(peaktimes[2])+20,300)
new_pred2 = []
for i in range(len(predict_post_samples_nestle)):
    pred = predict_cycle(new_pred[i], *predict_param_mu_nestle)
    new_pred2.append(pred)
    y = cycle(x, *pred)
    plt.plot(x, y, "orange", alpha=0.04, linewidth=3)
# use third cycle posterior predictive to find fourth cycle posterior predictive
x = np.linspace(min(peaktimes[3])-20,max(peaktimes[3])+20,300)
for i in range(len(predict_post_samples_nestle)):
    pred = predict_cycle(new_pred2[i], *predict_param_mu_nestle)
    y = cycle(x, *pred)
    plt.plot(x, y, "orange", alpha=0.04, linewidth=3)
plt.xlabel("Date / years")
plt.ylabel("Square Root of Sunspot Number")
plt.title("Predictions of average sunspot number between 1934-1965,\n" +
" using the cycle between 1923-1934 to make predictions")
plt.xlim(1923,1967)
plt.ylim(1)
leg = plt.legend(loc="upper left")
for lh in leg.legendHandles:
    lh.set_alpha(1)
plt.show()