Walkthrough

Warning

This walk through has not been fleshed out with the relevant astrophysical content yet (for example, whether a fit is good, what the different parameters mean, etc.), and so assumes some familarity with spectral fitting in general.

It is also not yet complete, nor a faithful illustration of everything SpectralFitting.jl can do. It serves to illustrate similarities and differences in syntax between SpectralFitting.jl and XSPEC.

This example walkthrough is the SpectralFitting.jl equivalent of the Walk through XSPEC from the XSPEC manual. We will use the same dataset, available for download from this link to the data files.

Overview

The first thing we want to do is load our datasets. Unlike in XSPEC, we have no requirement of being in the same directory as the data, or even that all of the response, ancillary, and spectral files be in the same place. For simplicity, we'll assume they are:

Note

Be sure to set DATADIR pointing to the directory where you keep the walkthrough data.

using SpectralFitting, XSPECModels, Plots

DATADIR = "..."
spec1_path = joinpath(DATADIR, "s54405.pha")
data = OGIPDataset(spec1_path)
┌ SpectralDataset{SpectralFitting.OGIPData}:
│   . Object              : 1E 1048-5937
│   . Observation ID      : [no observation id]
│   . Exposure ID         : [no exposure id]
│ SpectralData with 125 active channels: 
│   . Chn. E (min/max)    : (0.69357, 70.563)
│   . Masked channels     : 0 / 125
│   . Model domain size   : 129
│   . Domain (min/max)    : (0.44347, 36.632)
│ Primary Spectrum:
│ Spectrum: EXOSAT[ME]
│   Units                 : counts s^-1
│   . Exposure time       : 23580.0
│   . Channels            : 125
│   . Data (min/max)      : (-0.028499, 0.30515)
│   . Grouped             : yes
│   . Bad channels        : yes (40)
│ Response:
│ Response Matrix(125 x 128)channels:
│   . Chn. E (min/max)    : (0.69357, 70.563)
│   . Domain E (min/max)  : (0.44347, 36.632)
│ Background: nothing
│ Ancillary: nothing
└ 

This will print a little card about our data, which shows us what else SpectralFitting.jl loaded. We can see the Primary Spectrum, the Response, but that the Background and Ancillary response files are missing. That's to be expected, since we don't have those files in the dataset.

We can check what paths it used by looking at

data.user_data.paths
SpectralFilePaths:
  . Spectrum       : /home/runner/work/SpectralFitting.jl/SpectralFitting.jl/docs/build/../../ex-datadir/s54405.pha
  . Response       : /home/runner/work/SpectralFitting.jl/SpectralFitting.jl/docs/build/../../ex-datadir/s54405.rsp
  . Background     : nothing
  . Ancillary      : nothing

We can load and alter any part of a dataset as we do our fitting. For example, if you have multiple different ancillary files at hand, switching them between fits is a one-liner.

To visualize our data, we can use some of the Plots.jl recipes included in SpectralFitting.jl:

plot(data, xlims = (0.5, 70), xscale = :log10)
Example block output

Note that the units are currently not divided by the energy bin widths. We can either do that manually, or use the normalize! to convert whatever units the data is currently in to the defacto standard counts s⁻¹ keV⁻¹ for fitting. Whilst we're at it, we see in the model card that there are 40 bad quality bins still present in our data. We can drop those as well, and plot the data on log-log axes:

normalize!(data)
drop_bad_channels!(data)
plot(data, ylims = (0.001, 2.0), yscale = :log10, xscale = :log10)
Example block output

Note that when there are no negative axes, the scale defaults to log on the plot unless otherwise specified.

Next we want to specify a model to fit to this data. Models that are prefixed with XS_ are models that are linked from the XSPEC model library, provided via LibXSPEC_jll. For a full list of the models, see Models library.

Warning

It is advised to use the Julia implemented models. This allows various calculations to benefit from automatic differentiation, efficient multi-threading, GPU offloading, and various other useful things, see Why & how.

We will start by fitting a photoelectric absorption model that acts on a power law model:

Note

To see information about a model, use the ? in the Julia REPL:

julia> ?PowerLaw
XS_PowerLaw(K, a)

    •  K: Normalisation.

    •  a: Photon index.

Example
≡≡≡≡≡≡≡
...
model = PhotoelectricAbsorption() * PowerLaw()
┌ CompositeModel with 2 model components:
│      m1 * a1
│ Model key and parameters:
│    a1 => PowerLaw
│       K_1 ->  1 ± 0.1  ∈ [ 0, Inf ]   FREE
│       a_1 ->  2 ± 0.2  ∈ [ 0, Inf ]   FREE   m1 => PhotoelectricAbsorption
│      ηH_1 ->  1 ± 0.1  ∈ [ 0, Inf ]   FREE

If we want to specify paramters of our model at instantiation, we can do that with

model = PhotoelectricAbsorption() * PowerLaw(a = FitParam(3.0))
┌ CompositeModel with 2 model components:
│      m1 * a1
│ Model key and parameters:
│    a1 => PowerLaw
│       K_1 ->  1 ± 0.1  ∈ [ 0, Inf ]   FREE
│       a_1 ->  3 ± 0.3  ∈ [ 0, Inf ]   FREE   m1 => PhotoelectricAbsorption
│      ηH_1 ->  1 ± 0.1  ∈ [ 0, Inf ]   FREE

SpectralFitting.jl adopts the SciML problem-solver abstraction, so to fit a model to data we specify a FittingProblem:

prob = FittingProblem(model => data)
┌ FittingProblem:
│   . Models     : 1
│   . Datasets   : 1
│   Parameter Summary:
│   . Total      : 3
│   . Frozen     : 0
│   . Bound      : 0
│   . Free       : 3
└ 

SpectralFitting.jl makes a huge wealth of optimizers availble from Optimizations.jl, and others from further afield. For consistency with XSPEC, we'll use here a delayed-gratification least-squares algorithm from LsqFit.jl:

result = fit(prob, LevenbergMarquadt())
┌ FittingResult:
│   Model: CompositeModel[PhotoelectricAbsorption * PowerLaw]
│   . u     : [22.181, 2.2488, 0.56637]
│   . σᵤ    : [4.3720, 0.12638, 0.27433]
│   . χ²    : 108.94
└ 

Here we can see the parameter vector, the estimated error on each parameter, and the measure of the fit statistic (here chi squared). We can overplot our result on our data easily:

plot(data,
    ylims = (0.001, 2.0),
    xscale = :log10,
    yscale = :log10
)
plot!(result)
Example block output

Our model does not account for the high energy range well. We can ignore that range for now, and select everything from 0 to 15 keV and refit:

mask_energies!(data, 0, 15)
result = fit(prob, LevenbergMarquadt())
┌ FittingResult:
│   Model: CompositeModel[PhotoelectricAbsorption * PowerLaw]
│   . u     : [21.584, 2.2307, 0.53199]
│   . σᵤ    : [4.2347, 0.12576, 0.27277]
│   . χ²    : 43.970
└ 
plot(data,
    ylims = (0.001, 2.0),
    xscale = :log10,
    yscale = :log10
)
plot!(result, label = "PowerLaw")
Example block output

The result is not yet baked into our model, and represents just the outcome of the fit. To update the parameters and errors in the model, we can use update_model!

update_model!(model, result)
┌ CompositeModel with 2 model components:
│      m1 * a1
│ Model key and parameters:
│    a1 => PowerLaw
│       K_1 ->   21.6 ± 0.1  ∈ [ 0, Inf ]   FREE
│       a_1 ->   2.23 ± 0.3  ∈ [ 0, Inf ]   FREE   m1 => PhotoelectricAbsorption
│      ηH_1 ->  0.532 ± 0.1  ∈ [ 0, Inf ]   FREE
Note

Since fitting and updating a model is often done in tandem, SpectralFitting.jl has both a fit and fit! method, the latter automatically updates the model parameters after fit.

To estimate the goodness of our fit, we can mimic the goodness command from XSPEC. This will use the simulate function to simulate spectra for a dataset (here determined by the result), and fit the model to the simulated dataset. The fit statistic for each fit is then appended to an array, which we can use to plot a histogram:

spread = goodness(result; N = 1000, seed = 42, exposure_time = data.spectrum.exposure_time)
histogram(spread, ylims = (0, 300), label = "Simulated")
vline!([result.χ2], label = "Best fit")
Example block output

Note we have set the random number generator seed with seed = 42 to allow our results to be strictly reproduced.

The goodness command will log the percent of simulations with a fit statistic better than the result, but we can equivalently calculate that ourselves:

count(<(result.χ2), spread) * 100 / length(spread)
56.9

Next we want to calculate the flux in an energy range observed by the detector. We can do this with LogFlux or XS_CalculateFlux, as they are both equivalent implementations.

We can modify our model by accessing properties from the model card and writing a new expression:

calc_flux = XS_CalculateFlux(
    E_min = FitParam(0.2, frozen = true),
    E_max = FitParam(2.0, frozen = true),
    log10Flux = FitParam(-10.3, lower_limit = -100, upper_limit = 100),
)

flux_model = model.m1 * calc_flux(model.a1)
┌ CompositeModel with 3 model components:
│      m1 * c1(a1)
│ Model key and parameters:
│    a1 => PowerLaw
│              K_1 ->   21.6 ± 0.1   ∈ [    0, Inf ]   FREE
│              a_1 ->   2.23 ± 0.3   ∈ [    0, Inf ]   FREE   c1 => XS_CalculateFlux
│          E_min_1 ->    0.2                               FROZEN
│          E_max_1 ->      2                               FROZEN
│      log10Flux_1 ->  -10.3 ± 1     ∈ [ -100, 100 ]   FREE   m1 => PhotoelectricAbsorption
│             ηH_1 ->  0.532 ± 0.1   ∈ [    0, Inf ]   FREE

Since we used the old model to define the new one, our best fit values are automatically copied into the new model. We can now freeze the normalization, as we are using the flux integrating model to scale the powerlaw component:

flux_model.a1.K.frozen = true
flux_model
┌ CompositeModel with 3 model components:
│      m1 * c1(a1)
│ Model key and parameters:
│    a1 => PowerLaw
│              K_1 ->   21.6                               FROZEN
│              a_1 ->   2.23 ± 0.3   ∈ [    0, Inf ]   FREE   c1 => XS_CalculateFlux
│          E_min_1 ->    0.2                               FROZEN
│          E_max_1 ->      2                               FROZEN
│      log10Flux_1 ->  -10.3 ± 1     ∈ [ -100, 100 ]   FREE   m1 => PhotoelectricAbsorption
│             ηH_1 ->  0.532 ± 0.1   ∈ [    0, Inf ]   FREE

Looking at the data card, we see the fit domain does not include the full region that we want to integrate the flux over. We therefore need to extend the fitting domain:

flux_problem = FittingProblem(flux_model => data)
# TODO: domain extensions not fully implemented yet
┌ FittingProblem:
│   . Models     : 1
│   . Datasets   : 1
│   Parameter Summary:
│   . Total      : 6
│   . Frozen     : 3
│   . Bound      : 0
│   . Free       : 3
└ 

Now to fit we can repeat the above procedure, and even overplot the region of flux we integrated:

flux_result = fit(flux_problem, LevenbergMarquadt())

plot(data,
    ylims = (0.001, 2.0),
    xscale = :log10,
    yscale = :log10
)
plot!(flux_result)
vspan!([flux_model.c1.E_min.value, flux_model.c1.E_max.value], alpha = 0.5)
Example block output

Let's try alternative models to see how they fit the data. First, an absorbed black body:

model2 = PhotoelectricAbsorption() * XS_BlackBody()
┌ CompositeModel with 2 model components:
│      m1 * a1
│ Model key and parameters:
│    a1 => XS_BlackBody
│       K_1 ->  1 ± 0.1  ∈ [ 0, Inf ]   FREE
│       T_1 ->  3 ± 0.3  ∈ [ 0, Inf ]   FREE   m1 => PhotoelectricAbsorption
│      ηH_1 ->  1 ± 0.1  ∈ [ 0, Inf ]   FREE

We fit in the same way as before:

prob2 = FittingProblem(model2 => data)
result2 = fit!(prob2, LevenbergMarquadt())
┌ FittingResult:
│   Model: CompositeModel[PhotoelectricAbsorption * XS_BlackBody]
│   . u     : [0.46462, 0.89091, 0.0]
│   . σᵤ    : [0.022068, 0.032552, 0.27922]
│   . χ²    : 123.78
└ 

Let's overplot this result against our power law result:

dp = plot(data,
    ylims = (0.001, 2.0),
    xscale = :log10,
    yscale = :log10,
    legend = :bottomleft,
)
plot!(dp, result, label = "PowerLaw $(round(result.χ2))")
plot!(dp, result2, label = "BlackBody $(round(result2.χ2))")
Example block output

Or a bremsstrahlung model:

model3 = PhotoelectricAbsorption() * XS_BremsStrahlung()
prob3 = FittingProblem(model3 => data)
result3 = fit(prob3, LevenbergMarquadt())
┌ FittingResult:
│   Model: CompositeModel[PhotoelectricAbsorption * XS_BremsStrahlung]
│   . u     : [13.868, 5.3034, 0.0]
│   . σᵤ    : [1.3178, 0.70705, 0.19664]
│   . χ²    : 40.027
└ 
plot!(dp, result3, label = "Brems $(round(result3.χ2))")
Example block output

Let's take a look at the residuals of these three models. There are utility methods for this in SpectralFitting.jl, but we can easily just interact with the result directly:

function calc_residuals(result)
    # select which result we want (only have one, but for generalisation to multi-model fits)
    r = result[1]
    y = invoke_result(r)
    @. (r.objective - y) / sqrt(r.variance)
end

domain = SpectralFitting.plotting_domain(data)

rp = hline([0], linestyle = :dash, legend = false)
plot!(rp,domain, calc_residuals(result), seriestype = :stepmid)
plot!(rp, domain, calc_residuals(result2), seriestype = :stepmid)
plot!(rp, domain, calc_residuals(result3), seriestype = :stepmid)
rp
Example block output

We can compose this figure with our previous one, and change to a linear x scale:

plot(dp, rp, layout = grid(2, 1, heights = [0.7, 0.3]), link = :x, xscale = :linear)
Example block output

We can do all that plotting work in one go with the plotresult recipe:

plotresult(
    data,
    [result, result2, result3],
    ylims = (0.001, 2.0),
    xscale = :log10,
    yscale = :log10,
    legend = :bottomleft,
)
Example block output

Let's modify the black body model with a continuum component

bbpl_model = model2.m1 * (PowerLaw() + model2.a1) |> deepcopy
┌ CompositeModel with 3 model components:
│      m1 * (a2 + a1)
│ Model key and parameters:
│    a1 => XS_BlackBody
│       K_1 ->  0.465 ± 0.1  ∈ [ 0, Inf ]   FREE
│       T_1 ->  0.891 ± 0.3  ∈ [ 0, Inf ]   FREE   a2 => PowerLaw
│       K_2 ->      1 ± 0.1  ∈ [ 0, Inf ]   FREE
│       a_1 ->      2 ± 0.2  ∈ [ 0, Inf ]   FREE   m1 => PhotoelectricAbsorption
│      ηH_1 ->      0 ± 0.1  ∈ [ 0, Inf ]   FREE
Note

We pipe the model to deepcopy to create a copy of all the model parameters. Not doing this means the parameters in bbpl_model will be aliased to the parameters in model2, and changing one with change the other.

We'll freeze the hydrogen column density parameter to the galactic value and refit:

bbpl_model.ηH_1.value = 4
bbpl_model.ηH_1.frozen = true
bbpl_model
┌ CompositeModel with 3 model components:
│      m1 * (a2 + a1)
│ Model key and parameters:
│    a1 => XS_BlackBody
│       K_1 ->  0.465 ± 0.1  ∈ [ 0, Inf ]   FREE
│       T_1 ->  0.891 ± 0.3  ∈ [ 0, Inf ]   FREE   a2 => PowerLaw
│       K_2 ->      1 ± 0.1  ∈ [ 0, Inf ]   FREE
│       a_1 ->      2 ± 0.2  ∈ [ 0, Inf ]   FREE   m1 => PhotoelectricAbsorption
│      ηH_1 ->      4                           FROZEN

And fitting:

bbpl_result = fit(
    FittingProblem(bbpl_model => data),
    LevenbergMarquadt()
)
┌ FittingResult:
│   Model: CompositeModel[PhotoelectricAbsorption * (PowerLaw + XS_BlackBody)]
│   . u     : [83.697, 0.15572, 77.052, 2.9226]
│   . σᵤ    : [66.153, 0.015973, 12.743, 0.11682]
│   . χ²    : 40.864
└ 

Let's plot the result:

plot(data,
    ylims = (0.001, 2.0),
    xscale = :log10,
    yscale = :log10,
    legend = :bottomleft,
)
plot!(bbpl_result)
Example block output

Update the model and fix the black body temperature to 2 keV:

update_model!(bbpl_model, bbpl_result)

bbpl_model.T_1.value = 2.0
bbpl_model.T_1.frozen = true
bbpl_model
┌ CompositeModel with 3 model components:
│      m1 * (a2 + a1)
│ Model key and parameters:
│    a1 => XS_BlackBody
│       K_1 ->  83.7 ± 0.1  ∈ [ 0, Inf ]   FREE
│       T_1 ->     2                          FROZEN   a2 => PowerLaw
│       K_2 ->  77.1 ± 0.1  ∈ [ 0, Inf ]   FREE
│       a_1 ->  2.92 ± 0.2  ∈ [ 0, Inf ]   FREE   m1 => PhotoelectricAbsorption
│      ηH_1 ->     4                          FROZEN

Fitting:

bbpl_result2 = fit(
    FittingProblem(bbpl_model => data),
    LevenbergMarquadt()
)
┌ FittingResult:
│   Model: CompositeModel[PhotoelectricAbsorption * (PowerLaw + XS_BlackBody)]
│   . u     : [0.38234, 573.17, 4.8351]
│   . σᵤ    : [0.034796, 83.183, 0.16012]
│   . χ²    : 71.708
└ 

Overplotting this new result:

plot!(bbpl_result2)
Example block output

MCMC

We can use libraries like Pidgeons.jl or Turing.jl to perform Bayesian inference on our paramters. SpectralFitting.jl is designed with BYOO (Bring Your Own Optimizer) in mind, and so makes it relatively easy to get at the core fitting functions to be used with other packages.

Let's use Turing.jl here, which means we'll also want to use StatsPlots.jl to plot our walker chains.

using StatsPlots
using Turing

Turing.jl provides enormous control over the definition of the model, and this is not control SpectralFitting.jl wants to take away from you. Although we will provide utility scripts to do the basics, here we'll show you everything step by step to give you an overview of what you can do.

Let's go back to our first model:

model
┌ CompositeModel with 2 model components:
│      m1 * a1
│ Model key and parameters:
│    a1 => PowerLaw
│       K_1 ->   21.6 ± 0.1  ∈ [ 0, Inf ]   FREE
│       a_1 ->   2.23 ± 0.3  ∈ [ 0, Inf ]   FREE   m1 => PhotoelectricAbsorption
│      ηH_1 ->  0.532 ± 0.1  ∈ [ 0, Inf ]   FREE

This gave a pretty good fit but the errors on our paramters are not well defined, being estimated only from a convariance matrix in the least-squares solver. MCMC can give us better confidence regions, and even help us uncover dependencies between paramters. Here we'll take all of our parameters and convert them into a Turing.jl model with use of their macro:

@model function mcmc_model(domain, objective, variance, f)
    K ~ Normal(20.0, 1.0)
    a ~ Normal(2.2, 0.3)
    ηH ~ truncated(Normal(0.5, 0.1); lower = 0)

    pred = f(domain, [K, a, ηH])
    return objective ~ MvNormal(pred, sqrt.(variance))
end
mcmc_model (generic function with 2 methods)

A few things to note here: we use the Turing.jl sampling syntax ~ to say that a variable is sampled from a certain type of prior distribution. There are no fixed criteria for what a distribution can be, and we encourage you to consult the Turing.jl documentation to learn how to define your own custom probability distributions. In this case, we will use Gaussians for all our parameters, and for the means and standard deviations use the best fit and estimated errors.

At the moment we haven't explicitly used our model, but f in this case takes the roll of invoking our model, and folding through instrument responses. We call it in much the same way as invokemodel, despite it going the extra step to fold our model. To instantiate this, we can use the SpectralFitting.jl helper functions:

config = FittingConfig(FittingProblem(model => data))
mm = mcmc_model(
    make_model_domain(ContiguouslyBinned(), data),
    make_objective(ContiguouslyBinned(), data),
    make_objective_variance(ContiguouslyBinned(), data),
    # _f_objective returns a function used to evaluate and fold the model through the data
    SpectralFitting._f_objective(config),
)
eltype(args) = CompositeModel{PhotoelectricAbsorption{Interpolations.Extrapolation{Float64, 1, Interpolations.GriddedInterpolation{Float64, 1, Vector{Float64}, Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}, Tuple{Vector{Float64}}}, Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}, Interpolations.Line{Nothing}}, FitParam{Float64}}, PowerLaw{FitParam{Float64}}, MultiplicationOperator, FitParam{Float64}, Additive}
eltype(args) = SpectralData{Float64, SpectralFitting.OGIPData, SpectralFitting.OGIPMetadata{FITSIO.FITSHeader}}

That's it! We're now ready to sample our model. Since all our models are implemented in Julia, we can use gradient-boosted samplers with automatic differentiation, such as NUTS. We'll walk 5000 itterations, just as a small example:

chain = sample(mm, NUTS(), 5_000)
Chains MCMC chain (5000×15×1 Array{Float64, 3}):

Iterations        = 1001:1:6000
Number of chains  = 1
Samples per chain = 5000
Wall duration     = 15.22 seconds
Compute duration  = 15.22 seconds
parameters        = K, a, ηH
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std      mcse    ess_bulk    ess_tail      rhat   ⋯
      Symbol   Float64   Float64   Float64     Float64     Float64   Float64   ⋯

           K   20.2880    0.8586    0.0216   1583.9714   2060.4402    1.0000   ⋯
           a    2.1888    0.0340    0.0008   1681.6584   2248.5744    1.0003   ⋯
          ηH    0.4707    0.0722    0.0015   2420.3930   2042.0820    1.0002   ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           K   18.5775   19.7203   20.2969   20.8620   21.9507
           a    2.1212    2.1660    2.1895    2.2118    2.2545
          ηH    0.3356    0.4210    0.4690    0.5170    0.6161

In the printout we see summary statistics about or model, in this case that it has converged well (rhat close to 1 for all parameters), better estimates of the standard deviation, and various quantiles. We can plot our chains to make sure the caterpillers are healthy and fuzzy, making use of StatsPlots.jl recipes:

plot(chain)
Example block output

Corner plots are currently broken at time of writing.