Surrogate models

Surrogate models allow you to create fast, memory-efficient approximations of model components, or to assist in optimizing an objective function directly. SpectralFitting uses the Surrogates.jl library, which provides pure-Julia surrogate models. Consequently, surrogate models also permit the use of automatic differentiation in fitting, and are therefore powerful tools for improving fitting performance.

Surrogates overview

Warning

Surrogate model optimization currently does not work well for most XSPEC models. This is being actively developed.

Any function may be wrapped as a surrogate model using the SurrogateSpectralModel type.

SpectralFitting.SurrogateSpectralModelType
SurrogateSpectralModel <: AbstractSpectralModel
SurrogateSpectralModel(modelkind, surrogate, params, params_symbols)

Used to wrap a surrogate function into an AbstractSpectralModel.

Example

Creating a surrogate function using make_surrogate_harness:

# build and optimize a surrogate model
surrogate = make_surrogate_harness(model, lower_bounds, upper_bounds)

# create surrogate spectral model
sm = SurrogateSpectralModel(
    Multiplicative(),
    surrogate,
    (FitParam(1.0),),
    (:ηH,)
)

The lower_bounds and upper_bounds must be tuples in the form (E, params...), where E denotes the bounds on the energy range to train over.

source

To facilitate easy surrogate builds, SpectralFitting exports a number of utility functions.

SpectralFitting.make_surrogate_harnessFunction
make_surrogate_harness(
    model::M,
    lowerbounds::T,
    upperbounds::T;
    optimization_samples = 200,
    seed_samples = 50,
    S::Type = RadialBasis,
    sample_type = SobolSample(),
    verbose = false,
)

Creates and optimizes a surrogate model of type S for model, using wrap_model_as_objective and optimize_accuracy! for optimization_samples iterations. The model is initially seeded with seed_samples points prior to optimization.

Warning

Additive models integrate over energies to calculate flux, which surrogate models are currently not capable of doing. Results for Additive models are therefore likely to be inaccurate. This will be patched in a future version.

source

SpectralFitting.optimize_accuracy!Function
optimize_accuracy!(
    surr::AbstractSurrogate,
    obj::Function,
    lb,
    ub;
    sample_type::SamplingAlgorithm = SobolSample(),
    maxiters = 200,
    N_truth = 5000,
    verbose = false,
)

Improve the accuracy (faithfulness) of the surrogate model in recreating the objective function.

Samples a new space of N_truth points between lb and ub, and calculates the objective function obj at each. Finds the point with the largest MSE between the surrogate and the objective, and adds that point to the surrogate pool. Repeats maxiters times, adding maxiters points to the surrogate model.

Optionally prints the MSE and iteration count to stdout when verbose = true.

Note that upper- and lower-bounds should be in the form (E, params...), where E is a single energy and params are the model parameters.

source
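
For illustration, bounds in the (E, params...) form described by these docstrings might look like the following for a single-parameter model. The values here are placeholders chosen for this sketch, not values taken from the library:

# first entry bounds the energy, the second bounds the single model parameter
lb = (0.1, 1e-3)
ub = (20.0, 30.0)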

Creating a surrogate for XS_PhotoelectricAbsorption

Before we start, let us discuss a number of benefits the use of surrogate models may bring us:

  • SurrogateSpectralModel permits the use of automatic differentiation.
  • Surrogate models may be allocation-free depending on setup, whereas XSPEC wrappers will always have to allocate for type conversions.
  • Surrogate models may be considerably faster, especially for table models.
  • Surrogate models are shareable (see Sharing surrogate models), and are tunable in size.

XS_PhotoelectricAbsorption is an XSPEC model that is exposed to Julia through a thin C wrapper. The implementation of this model consists of a number of Fortran routines from the late 1990s, including a tabulation of ~3000 lines of data that has been copied directly into the Fortran source code.

The performance of this model reflects its complexity.

using SpectralFitting, XSPECModels

energy = collect(range(0.1, 20.0, 200))
model = XS_PhotoelectricAbsorption()

flux = similar(energy)[1:end-1]
199-element Vector{Float64}:
 NaN
   5.0e-324
   5.0e-324
   5.0e-324
   1.0e-323
   1.0e-323
   1.0e-323
   1.0e-323
   2.0e-323
   2.0e-323
   ⋮
   4.7e-322
   2.5e-322
   4.7e-322
   2.5e-322
   2.47e-322
   2.47e-322
   0.0
   0.0
   0.0

Benchmarking with BenchmarkTools.jl:

using BenchmarkTools
@benchmark invokemodel!($flux, $energy, $model)
BenchmarkTools.Trial: 4804 samples with 1 evaluation per sample.
 Range (min … max):  1.029 ms …   1.628 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.032 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.037 ms ±  39.541 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▆██▆▄▃▂▂▂▁                                              ▂
  ██████████████▇▇▆▆▆▆▆▆▅▅▃▅▅▆▆▅▅▅▃▃▃▃▄▁▁▅▆▆▄▅▃▄▄▅▃▃▅▅▁▄▁▄ █
  1.03 ms      Histogram: log(frequency) by time     1.08 ms <

 Memory estimate: 144 bytes, allocs estimate: 4.

The surrogate we'll construct will have to be tailored somewhat to the data we wish to fit, as we need to specify the parameter ranges over which the surrogate should be trained. For example, we might be interested in energies between $0.1$ and $20$ keV (expressed in our domain), with an equivalent hydrogen column $\eta_\mathrm{H}$ anywhere between $10^{-3}$ and $30$. We specify the parameter bounds using tuples:

lower_bounds = (1e-3,)
upper_bounds = (30.0,)
Note

The first index is always the energy bounds, and the subsequent indices are the parameters in the same order they are defined in the model structure.

Next, we use make_surrogate_harness to build and optimize a surrogate function for our model. By default, the surrogate uses linear radial basis functions, and seeds the coefficients with a number of seed points. The function then improves the accuracy of the surrogate using optimize_accuracy! until a maximum number of iterations has been reached.

For illustration purposes, we'll omit the accuracy-improving step and perform it ourselves. We can do this by setting optimization_samples = 0 in the keyword arguments:

using Surrogates

harness = make_surrogate_harness(
    (x, y) -> RadialBasis(x, y, lower_bounds, upper_bounds),
    energy,
    model,
    lower_bounds,
    upper_bounds;
    # default is 50, but to illustrate the training behaviour we'll set this low
    seed_samples = 2,
)

# number of points the surrogate has been trained on
length(harness.surrogate.x)
2

We can examine how well our surrogate reconstructs the model for a given test parameter:

using Plots
# random test value
ηh_test = 22.9

model.ηH.value = ηh_test
f = invokemodel(energy, model)

f̂ = harness.surrogate([ηh_test])
[Plot: model evaluation compared against the surrogate prediction at the test parameter]
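
The plotting code is not shown above; a minimal sketch of how the comparison figure might be produced with Plots.jl, assuming f and f̂ are both vectors over the 199 output bins, is:

# overlay the surrogate prediction on the full model evaluation
plot(energy[1:end-1], f, label = "XS_PhotoelectricAbsorption", xlabel = "E (keV)", ylabel = "f")
plot!(energy[1:end-1], f̂, label = "surrogate (2 seed points)")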

Now we'll use optimize_accuracy! to improve the faithfulness of our surrogate. This requires making use of wrap_model_as_objective as a little wrapper around our model:

optimize_accuracy!(harness; maxiters=50)

length(harness.surrogate.x)
52

We can plot the surrogate model again and see the improvement.

new_f̂ = harness.surrogate([ηh_test])
[Plot: model evaluation compared against the improved surrogate prediction]
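
To quantify the improvement, one option (plain Julia, not part of the SpectralFitting API) is to compare the mean squared error of the surrogate before and after the accuracy optimization:

# helper for the mean squared error between two flux vectors
mse(a, b) = sum(abs2, a .- b) / length(a)

# error of the 2-point surrogate vs. the optimized surrogate
mse(f, f̂), mse(f, new_f̂)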

Tight. We can also inspect the memory footprint of our model:

# in bytes
Base.summarysize(harness)
174000

This may be reduced by lowering maxiters in optimize_accuracy!, at the cost of decreased faithfulness. For comparison, the Fortran tabulated source file in the XSPEC source code is approximately 224 KB; the surrogate model with all its training data is of the same order.

Using a surrogate spectral model

Now that we have the surrogate model, we use SurrogateSpectralModel to wrap it into an AbstractSpectralModel. The constructor also needs to know the model kind, have a copy of the model parameters, and know which symbols to represent the parameters with.

sm = make_model(harness)
┌ SurrogateSpectralModel
│      ηH ->  22.9 ± 0.1  ∈ [ 0, Inf ]   FREE

We can now use the familiar API and attempt to benchmark the performance:

@benchmark invokemodel!($flux, $energy, $sm)
BenchmarkTools.Trial: 10000 samples with 7 evaluations per sample.
 Range (min … max):  4.171 μs …   1.826 ms  ┊ GC (min … max): 0.00% … 99.35%
 Time  (median):     4.237 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.681 μs ±  18.235 μs  ┊ GC (mean ± σ):  3.88% ±  0.99%

  ▇█▄▂        ▁▃   ▂▅▅▄▂   ▃▃▁      ▁▁                     ▂
  █████▆▅▃▅▇▆▅███▇▅██████▇████▇▆▄▄▄▆███▇▇▆▆▅▅▅▄▁▄▆▄▅▄▄▄▅▆▇ █
  4.17 μs      Histogram: log(frequency) by time     6.13 μs <

 Memory estimate: 1.62 KiB, allocs estimate: 2.

Comparing this to the initial benchmark of XS_PhotoelectricAbsorption, we see a speedup of roughly two orders of magnitude, and this surrogate model is now ready for automatic differentiation.
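
As a sketch of what automatic differentiation readiness buys us, one might differentiate the surrogate output directly with ForwardDiff.jl. This assumes the pure-Julia surrogate accepts dual numbers, which is not something the documentation guarantees:

using ForwardDiff

# derivative of the total (summed) surrogate flux with respect to ηH,
# evaluated at the test value used above
dflux_dηH = ForwardDiff.derivative(p -> sum(harness.surrogate([p])), 22.9)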

Evaluating the model

p_range = collect(range(1.0, 30.0))

fluxes_vecs = map(p_range) do p
    model.ηH.value = p
    f = invokemodel(energy, model)
end
fluxes_mat = reduce(hcat, fluxes_vecs)

surface(p_range, energy[1:end-1], fluxes_mat, xlabel = "ηH", ylabel = "E", zlabel = "f", title = "Model")
[Surface plot: model flux as a function of ηH and E]
s_fluxes_vecs = map(p_range) do p
    sm.params[1].value = p
    f = invokemodel(energy, sm)
end
s_fluxes_mat = reduce(hcat, s_fluxes_vecs)

surface(p_range, energy[1:end-1], s_fluxes_mat, xlabel = "ηH", ylabel = "E", zlabel = "f", title = "Surrogate")
[Surface plot: surrogate flux as a function of ηH and E]

Sharing surrogate models

To export and import surrogate models, JLD2.jl is recommended.
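
For example, assuming the harness trained above is in scope, it could be written to and read from disk with JLD2's @save and @load macros (the file name here is illustrative):

using JLD2

# persist the trained harness (surrogate and training points) to disk
@save "xs_phabs_surrogate.jld2" harness

# ...and restore it in a later session
@load "xs_phabs_surrogate.jld2" harness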