Walkthrough
This walkthrough has not been fleshed out with the relevant astrophysical content yet (for example, whether a fit is good, what the different parameters mean, etc.), and so assumes some familiarity with spectral fitting in general.
It is also not yet complete, nor a faithful illustration of everything SpectralFitting.jl can do. It serves to illustrate similarities and differences in syntax between SpectralFitting.jl and XSPEC.
This example walkthrough is the SpectralFitting.jl equivalent of the Walk through XSPEC from the XSPEC manual. We will use the same dataset, available for download from this link to the data files.
Overview
The first thing we want to do is load our datasets. Unlike in XSPEC, we have no requirement of being in the same directory as the data, or even that all of the response, ancillary, and spectral files be in the same place. For simplicity, we'll assume they are:
Be sure to set DATADIR to point to the directory where you keep the walkthrough data.
using SpectralFitting, XSPECModels, Plots
DATADIR = "..."
spec1_path = joinpath(DATADIR, "s54405.pha")
data = OGIPDataset(spec1_path)
┌ SpectralDataset{SpectralFitting.OGIPData}:
│ . Object : 1E 1048-5937
│ . Observation ID : [no observation id]
│ . Exposure ID : [no exposure id]
│ SpectralData with 125 active channels:
│ . Chn. E (min/max) : (0.69357, 70.563)
│ . Masked channels : 0 / 125
│ . Model domain size : 129
│ . Domain (min/max) : (0.44347, 36.632)
│ Primary Spectrum:
│ Spectrum: EXOSAT[ME]
│ Units : counts s^-1
│ . Exposure time : 23580.0
│ . Channels : 125
│ . Data (min/max) : (-0.028499, 0.30515)
│ . Grouped : yes
│ . Bad channels : yes (40)
│ Response:
│ Response Matrix(125 x 128)channels:
│ . Chn. E (min/max) : (0.69357, 70.563)
│ . Domain E (min/max) : (0.44347, 36.632)
│ Background: nothing
│ Ancillary: nothing
└
This will print a little card about our data, which shows us what else SpectralFitting.jl loaded. We can see the Primary Spectrum and the Response, but the Background and Ancillary response files are missing. That's to be expected, since we don't have those files in the dataset.
We can check what paths it used by looking at
data.user_data.paths
SpectralFilePaths:
. Spectrum : /home/runner/work/SpectralFitting.jl/SpectralFitting.jl/docs/build/../../ex-datadir/s54405.pha
. Response : /home/runner/work/SpectralFitting.jl/SpectralFitting.jl/docs/build/../../ex-datadir/s54405.rsp
. Background : nothing
. Ancillary : nothing
We can load and alter any part of a dataset as we do our fitting. For example, if you have multiple different ancillary files at hand, switching them between fits is a one-liner.
To visualize our data, we can use some of the Plots.jl recipes included in SpectralFitting.jl:
plot(data, xlims = (0.5, 70), xscale = :log10)
Note that the units are currently not divided by the energy bin widths. We can either do that manually, or use normalize! to convert whatever units the data is currently in to the de facto standard counts s⁻¹ keV⁻¹ for fitting. Whilst we're at it, we see in the data card that there are 40 bad quality bins still present in our data. We can drop those as well, and plot the data on log-log axes:
normalize!(data)
drop_bad_channels!(data)
plot(data, ylims = (0.001, 2.0), yscale = :log10, xscale = :log10)
Note that when there are no negative axes, the scale defaults to log on the plot unless otherwise specified.
Next we want to specify a model to fit to this data. Models that are prefixed with XS_ are linked from the XSPEC model library, provided via LibXSPEC_jll. For a full list of the models, see Models library.
It is advised to use the Julia-implemented models where possible. This allows various calculations to benefit from automatic differentiation, efficient multi-threading, GPU offloading, and various other useful things; see Why & how.
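As a quick illustration (not part of the original XSPEC walkthrough), a Julia-implemented model can be evaluated directly over an arbitrary energy grid with invokemodel, which also appears later in this walkthrough; the grid values below are purely illustrative:
# Evaluate a Julia-native model over a grid of energy bin edges (keV).
# The grid here is arbitrary and only for illustration.
energies = collect(range(0.1, 10.0; length = 100))
flux = invokemodel(energies, PowerLaw())
# `flux` contains the integrated model flux per energy bin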
We will start by fitting a photoelectric absorption model that acts on a power law model:
To see information about a model, use the ? in the Julia REPL:
julia> ?PowerLaw
XS_PowerLaw(K, a)
• K: Normalisation.
• a: Photon index.
Example
≡≡≡≡≡≡≡
...
model = PhotoelectricAbsorption() * PowerLaw()
┌ CompositeModel with 2 model components:
│ m1 * a1
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 1 ± 0.1 ∈ [ 0, Inf ] FREE
│ a1 => PowerLaw
│ K -> 1 ± 0.1 ∈ [ 0, Inf ] FREE
│ a -> 2 ± 0.2 ∈ [ 0, Inf ] FREE
└
If we want to specify parameters of our model at instantiation, we can do that with:
model = PhotoelectricAbsorption() * PowerLaw(a = FitParam(3.0))
┌ CompositeModel with 2 model components:
│ m1 * a1
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 1 ± 0.1 ∈ [ 0, Inf ] FREE
│ a1 => PowerLaw
│ K -> 1 ± 0.1 ∈ [ 0, Inf ] FREE
│ a -> 3 ± 0.3 ∈ [ 0, Inf ] FREE
└
Alternatively, we can modify the parameters by reaching into the corresponding model component:
set_value!(model.a1.a, 4.0)
4.0
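The same parameter can also be modified with the property syntax used later in this walkthrough, for example:
# Equivalent to the set_value! call above, using property access
model.a1.a.value = 4.0
# parameters can be frozen or thawed in the same way
model.a1.a.frozen = false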
SpectralFitting.jl adopts the SciML problem-solver abstraction, so to fit a model to data we specify a FittingProblem:
prob = FittingProblem(model => data)
┌ FittingProblem:
│ . Models : 1
│ . Datasets : 1
│ Parameter Summary:
│ . Total : 3
│ . Frozen : 0
│ . Bound : 0
│ . Free : 3
└
SpectralFitting.jl makes a wealth of optimizers available from Optimization.jl, and others from further afield. For consistency with XSPEC, here we'll use a delayed-gratification least-squares algorithm from LsqFit.jl:
result = fit(prob, LevenbergMarquadt())
┌ FitResult:
│ Model: CompositeModel
│ . Name : m1.ηH a1.K a1.a
│ . u : 0.56637 22.181 2.2488
│ . Δu : 0.27433 4.3720 0.12638
│ . χ² : 108.94
└ Σχ² = 108.94
Here we can see the parameter vector, the estimated error on each parameter, and the measure of the fit statistic (here chi squared). We can overplot our result on our data easily:
plot(data,
ylims = (0.001, 2.0),
xscale = :log10,
yscale = :log10
)
plot!(result)
Our model does not account for the high energy range well. We can ignore that range for now, and select everything from 0 to 15 keV and refit:
mask_energies!(data, 0, 15)
result = fit(prob, LevenbergMarquadt())
┌ FitResult:
│ Model: CompositeModel
│ . Name : m1.ηH a1.K a1.a
│ . u : 0.53199 21.584 2.2307
│ . Δu : 0.27277 4.2347 0.12576
│ . χ² : 43.970
└ Σχ² = 43.970
plot(data,
ylims = (0.001, 2.0),
xscale = :log10,
yscale = :log10
)
plot!(result, label = "PowerLaw")
The result is not yet baked into our model, and represents just the outcome of the fit. To update the parameters and errors in the model, we can use update_model!
update_model!(model, result)
┌ CompositeModel with 2 model components:
│ m1 * a1
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 0.532 ± 0.1 ∈ [ 0, Inf ] FREE
│ a1 => PowerLaw
│ K -> 21.6 ± 0.1 ∈ [ 0, Inf ] FREE
│ a -> 2.23 ± 0.3 ∈ [ 0, Inf ] FREE
└
To estimate the goodness of our fit, we can mimic the goodness command from XSPEC. This uses the simulate function to simulate spectra for the dataset (here determined by the result), and fits the model to each simulated dataset. The fit statistic for each fit is then appended to an array, which we can use to plot a histogram:
pcent, spread = goodness(result; N = 1000, seed = 42, exposure_time = data.spectrum.exposure_time)
@info "%goodness = $pcent"
histogram(spread, ylims = (0, 300), label = "Simulated")
vline!([result[1].stats], label = "Best fit")
Note that we have set the random number generator seed with seed = 42 to make our results reproducible.
The goodness command returns the percentage of simulations with a fit statistic better than the result, in addition to the statistics of each individual trial.
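For instance, that percentage can be recovered by hand from the simulated statistics; a small sketch using the spread and result variables from above:
# Fraction of simulated fits with a better (lower) statistic than the best fit,
# expressed as a percentage; this should reproduce the `pcent` value returned above.
manual_pcent = 100 * count(s -> s < result[1].stats, spread) / length(spread)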
Next we want to calculate the flux in an energy range observed by the detector. We can do this with LogFlux or XS_CalculateFlux, as they are equivalent implementations.
We can modify our model by accessing properties from the model card and writing a new expression:
calc_flux = XS_CalculateFlux(
E_min = FitParam(0.2, frozen = true),
E_max = FitParam(2.0, frozen = true),
log10Flux = FitParam(-10.3, lower_limit = -100, upper_limit = 100),
)
flux_model = model.m1 * calc_flux(model.a1)
┌ CompositeModel with 3 model components:
│ m1 * c1(a1)
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 0.532 ± 0.1 ∈ [ 0, Inf ] FREE
│ c1 => XS_CalculateFlux
│ E_min -> 0.2 FROZEN
│ E_max -> 2 FROZEN
│ log10Flux -> -10.3 ± 1 ∈ [ -100, 100 ] FREE
│ a1 => PowerLaw
│ K -> 21.6 ± 0.1 ∈ [ 0, Inf ] FREE
│ a -> 2.23 ± 0.3 ∈ [ 0, Inf ] FREE
└
Since we used the old model to define the new one, our best fit values are automatically copied into the new model. We can now freeze the normalization, as we are using the flux-integrating model to scale the power law component:
flux_model.a1.K.frozen = true
flux_model
┌ CompositeModel with 3 model components:
│ m1 * c1(a1)
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 0.532 ± 0.1 ∈ [ 0, Inf ] FREE
│ c1 => XS_CalculateFlux
│ E_min -> 0.2 FROZEN
│ E_max -> 2 FROZEN
│ log10Flux -> -10.3 ± 1 ∈ [ -100, 100 ] FREE
│ a1 => PowerLaw
│ K -> 21.6 FROZEN
│ a -> 2.23 ± 0.3 ∈ [ 0, Inf ] FREE
└
Looking at the data card, we see the fit domain does not include the full region that we want to integrate the flux over. We therefore need to extend the fitting domain:
flux_problem = FittingProblem(flux_model => data)
# TODO: domain extensions not fully implemented yet
┌ FittingProblem:
│ . Models : 1
│ . Datasets : 1
│ Parameter Summary:
│ . Total : 6
│ . Frozen : 3
│ . Bound : 0
│ . Free : 3
└
Now to fit we can repeat the above procedure, and even overplot the region of flux we integrated:
flux_result = fit(flux_problem, LevenbergMarquadt())
plot(data,
ylims = (0.001, 2.0),
xscale = :log10,
yscale = :log10
)
plot!(flux_result)
vspan!([flux_model.c1.E_min.value, flux_model.c1.E_max.value], alpha = 0.5)
Let's try alternative models to see how they fit the data. First, an absorbed black body:
model2 = PhotoelectricAbsorption() * XS_BlackBody()
┌ CompositeModel with 2 model components:
│ m1 * a1
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 1 ± 0.1 ∈ [ 0, Inf ] FREE
│ a1 => XS_BlackBody
│ K -> 1 ± 0.1 ∈ [ 0, Inf ] FREE
│ T -> 3 ± 0.3 ∈ [ 0, Inf ] FREE
└
We fit in the same way as before:
prob2 = FittingProblem(model2 => data)
result2 = fit!(prob2, LevenbergMarquadt())
┌ FitResult:
│ Model: CompositeModel
│ . Name : m1.ηH a1.K a1.T
│ . u : 0.0 0.46462 0.89091
│ . Δu : 0.27922 0.022068 0.032552
│ . χ² : 123.78
└ Σχ² = 123.78
Let's overplot this result against our power law result:
dp = plot(data,
ylims = (0.001, 2.0),
xscale = :log10,
yscale = :log10,
legend = :bottomleft,
)
plot!(dp, result, label = "PowerLaw $(round(sum(result.stats)))")
plot!(dp, result2, label = "BlackBody $(round(sum(result2.stats)))")
Or a bremsstrahlung model:
model3 = PhotoelectricAbsorption() * XS_BremsStrahlung()
prob3 = FittingProblem(model3 => data)
result3 = fit(prob3, LevenbergMarquadt())
┌ FitResult:
│ Model: CompositeModel
│ . Name : m1.ηH a1.K a1.T
│ . u : 0.0 13.870 5.3022
│ . Δu : 0.19659 1.3217 0.69966
│ . χ² : 40.027
└ Σχ² = 40.027
plot!(dp, result3, label = "Brems $(round(sum(result3.stats)))")
Let's take a look at the residuals of these three models. There are utility methods for this in SpectralFitting.jl, but we can easily just interact with the result directly:
function calc_residuals(result)
# select which result we want (only have one, but for generalisation to multi-model fits)
r = result[1]
y = calculate_objective!(r, r.u)
obj, var = get_objective(r), get_objective_variance(r)
@. (obj - y) / sqrt(var)
end
domain = SpectralFitting.plotting_domain(data)
rp = hline([0], linestyle = :dash, legend = false)
plot!(rp, domain, calc_residuals(result), seriestype = :stepmid)
plot!(rp, domain, calc_residuals(result2), seriestype = :stepmid)
plot!(rp, domain, calc_residuals(result3), seriestype = :stepmid)
rp
We can compose this figure with our previous one, and change to a linear x scale:
plot(dp, rp, layout = grid(2, 1, heights = [0.7, 0.3]), link = :x, xscale = :linear)
We can do all that plotting work with some of the built-in recipes:
function plot_result(data, results...)
p1 = plot(data,
ylims = (0.001, 2.0),
xscale = :log10,
yscale = :log10,
legend = :bottomleft,
)
p2 = plot(xscale = :log10)
for r in results
plot!(p1, r)
residualplot!(p2, r)
end
plot(p1, p2, link = :x, layout = @layout [top{0.75h} ; bottom{0.25h}])
end
plot_result(data, result, result2, result3)
Let's modify the black body model by adding a continuum component:
bbpl_model = model2.m1 * (PowerLaw() + model2.a1) |> deepcopy
┌ CompositeModel with 3 model components:
│ m1 * (a1 + a2)
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 0 ± 0.1 ∈ [ 0, Inf ] FREE
│ a1 => PowerLaw
│ K -> 1 ± 0.1 ∈ [ 0, Inf ] FREE
│ a -> 2 ± 0.2 ∈ [ 0, Inf ] FREE
│ a2 => XS_BlackBody
│ K -> 0.465 ± 0.1 ∈ [ 0, Inf ] FREE
│ T -> 0.891 ± 0.3 ∈ [ 0, Inf ] FREE
└
We pipe the model to deepcopy to create a copy of all the model parameters. Not doing this means the parameters in bbpl_model will be aliased to the parameters in model2, and changing one will change the other.
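To see what that aliasing means in practice, here is a small illustrative sketch (the value assigned is arbitrary and not part of the walkthrough):
# Without `deepcopy`, the composite shares the same FitParam objects as `model2`:
aliased_model = model2.m1 * (PowerLaw() + model2.a1)
set_value!(aliased_model.a2.T, 1.5)   # set the black body temperature here...
model2.a1.T                           # ...and it changes in `model2` as well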
We'll freeze the hydrogen column density parameter to the galactic value and refit:
bbpl_model.m1.ηH.value = 4
bbpl_model.m1.ηH.frozen = true
bbpl_model
┌ CompositeModel with 3 model components:
│ m1 * (a1 + a2)
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 4 FROZEN
│ a1 => PowerLaw
│ K -> 1 ± 0.1 ∈ [ 0, Inf ] FREE
│ a -> 2 ± 0.2 ∈ [ 0, Inf ] FREE
│ a2 => XS_BlackBody
│ K -> 0.465 ± 0.1 ∈ [ 0, Inf ] FREE
│ T -> 0.891 ± 0.3 ∈ [ 0, Inf ] FREE
└
And fitting:
bbpl_result = fit(
FittingProblem(bbpl_model => data),
LevenbergMarquadt()
)
┌ FitResult:
│ Model: CompositeModel
│ . Name : a1.K a1.a a2.K a2.T
│ . u : 77.052 2.9226 83.697 0.15572
│ . Δu : 12.743 0.11682 66.153 0.015973
│ . χ² : 40.864
└ Σχ² = 40.864
Let's plot the result:
plot_result(data, bbpl_result)
Update the model and fix the black body temperature to 2 keV:
update_model!(bbpl_model, bbpl_result)
bbpl_model.a2.T.value = 2.0
bbpl_model.a2.T.frozen = true
bbpl_model
┌ CompositeModel with 3 model components:
│ m1 * (a1 + a2)
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 4 FROZEN
│ a1 => PowerLaw
│ K -> 77.1 ± 0.1 ∈ [ 0, Inf ] FREE
│ a -> 2.92 ± 0.2 ∈ [ 0, Inf ] FREE
│ a2 => XS_BlackBody
│ K -> 83.7 ± 0.1 ∈ [ 0, Inf ] FREE
│ T -> 2 FROZEN
└
Fitting:
bbpl_result2 = fit(
FittingProblem(bbpl_model => data),
LevenbergMarquadt()
)
┌ FitResult:
│ Model: CompositeModel
│ . Name : a1.K a1.a a2.K
│ . u : 573.17 4.8351 0.38234
│ . Δu : 83.183 0.16012 0.034796
│ . χ² : 71.708
└ Σχ² = 71.708
Overplotting this new result:
plot!(bbpl_result2)
MCMC
We can use libraries like Pigeons.jl or Turing.jl to perform Bayesian inference on our parameters. SpectralFitting.jl is designed with BYOO (Bring Your Own Optimizer) in mind, and so makes it relatively easy to get at the core fitting functions to be used with other packages.
Let's use Turing.jl here, which means we'll also want to use StatsPlots.jl to plot our walker chains.
using StatsPlots
using Turing
Turing.jl provides enormous control over the definition of the model, and this is not control SpectralFitting.jl wants to take away from you. Although we will provide utility scripts to do the basics, here we'll show you everything step by step to give you an overview of what you can do.
Let's go back to our first model:
model
┌ CompositeModel with 2 model components:
│ m1 * a1
│ Model key and parameters:
│ m1 => PhotoelectricAbsorption
│ ηH -> 0.532 ± 0.1 ∈ [ 0, Inf ] FREE
│ a1 => PowerLaw
│ K -> 21.6 ± 0.1 ∈ [ 0, Inf ] FREE
│ a -> 2.23 ± 0.3 ∈ [ 0, Inf ] FREE
└
This gave a pretty good fit, but the errors on our parameters are not well defined, being estimated only from a covariance matrix in the least-squares solver. MCMC can give us better confidence regions, and even help us uncover dependencies between parameters. Here we'll take all of our parameters and convert them into a Turing.jl model using its @model macro:
@model function mcmc_model(objective, stddev, f)
K ~ Normal(20.0, 1.0)
a ~ Normal(2.2, 0.3)
ηH ~ truncated(Normal(0.5, 0.1); lower = 0)
pred = f(K, a, ηH)
return objective ~ MvNormal(pred, stddev)
end
mcmc_model (generic function with 2 methods)
A few things to note here: we use the Turing.jl sampling syntax ~ to say that a variable is sampled from a certain type of prior distribution. There are no fixed criteria for what a distribution can be, and we encourage you to consult the Turing.jl documentation to learn how to define your own custom probability distributions. In this case, we will use Gaussians for all our parameters, and for the means and standard deviations use the best fit and estimated errors.
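As an illustrative variant (not from the original walkthrough), a flat prior on the photon index could be swapped in like so:
# The same model as above, but with a uniform prior on the photon index.
@model function mcmc_model_flat(objective, stddev, f)
    K ~ Normal(20.0, 1.0)
    a ~ Uniform(1.0, 4.0)                        # flat prior on the photon index
    ηH ~ truncated(Normal(0.5, 0.1); lower = 0)  # keep the column density positive
    pred = f(K, a, ηH)
    return objective ~ MvNormal(pred, stddev)
end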
At the moment we haven't explicitly used our model, but f in this case takes the role of invoking our model and folding it through the instrument response. We call it in much the same way as invokemodel, except that it goes the extra step of folding our model. To instantiate this, we can use the SpectralFitting.jl helper functions:
config = FittingConfig(FittingProblem(model => data))
mm = mcmc_model(
get_objective_single(config),
sqrt.(get_objective_variance_single(config)),
get_invoke_wrapper_single(config),
)
That's it! We're now ready to sample our model. Since all our models are implemented in Julia, we can use gradient-based samplers with automatic differentiation, such as NUTS. We'll run 5000 iterations, just as a small example:
chain = sample(mm, NUTS(), 5_000)
Chains MCMC chain (5000×15×1 Array{Float64, 3}):
Iterations = 1001:1:6000
Number of chains = 1
Samples per chain = 5000
Wall duration = 13.87 seconds
Compute duration = 13.87 seconds
parameters = K, a, ηH
internals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ⋯
Symbol Float64 Float64 Float64 Float64 Float64 Float64 ⋯
K -1.4743 0.0860 0.0020 1838.1078 2331.1757 1.0001 ⋯
a 3.7689 0.2107 0.0059 1283.0951 1696.3780 1.0002 ⋯
ηH 1.1266 0.0361 0.0010 1339.5614 1787.8445 1.0002 ⋯
1 column omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
K -1.6396 -1.5321 -1.4766 -1.4143 -1.3095
a 3.3597 3.6284 3.7709 3.9125 4.1828
ηH 1.0538 1.1021 1.1277 1.1508 1.1967
In the printout we see summary statistics about our model, in this case that it has converged well (rhat close to 1 for all parameters), better estimates of the standard deviation, and various quantiles. We can plot our chains to make sure the caterpillars are healthy and fuzzy, making use of StatsPlots.jl recipes:
plot(chain)
Using PairPlots.jl to create corner plots:
import PairPlots, Makie, CairoMakie
table = (; # named tuple syntax
K = vec(chain["K"]),
a = vec(chain["a"]),
ηH = vec(chain["ηH"])
)
PairPlots.pairplot(table)
