The Lightweaver Framework for NLTE Radiative Transfer in Python

Tools for computing detailed optically thick spectral line profiles out of local thermodynamic equilibrium have always been focused on speed, due to the large computational effort involved. With the Lightweaver framework, we have produced a more flexible, modular toolkit for building custom tools in a high-level language, Python, without sacrificing speed against the current state of the art. The goal of providing a more flexible method for constructing these complex simulations is to decrease the barrier to entry and allow more rapid exploration of the field. In this paper we present an overview of the theory of optically thick NLTE radiative transfer, the numerical methods implemented in Lightweaver including the problems of time-dependent populations and charge-conservation, as well as an overview of the components most users will interact with, to demonstrate their flexibility.


INTRODUCTION
Optically thick non-local-thermodynamic equilibrium (NLTE) radiative transfer is one of the most computationally intensive problems in modern solar and stellar physics. It consists of taking a model atmosphere and computing self-consistent atomic populations whilst taking into account the fact that radiation originating from these atomic transitions may also affect their states elsewhere in the atmosphere. The high numerical cost of this problem is due in part to the high dimensionality of the the intensity, as it varies with wavelength and direction in addition to the spatial and temporal variation of most other quantities considered, and also the possibly large number of contributors at each wavelength. The NLTE problem can be extended to take into account the problem of finding an electron density consistent with the atomic populations, and this will also be discussed.
In recent times there has been a rise of flexible high performance frameworks available in high-level languages such as Python. One domain where these have demonstrated their power is machine learning, where the building blocks provided by the frameworks allow researchers to rapidly prototype new systems with little loss in performance over a hand-tuned highly specific low-level implementation. The goal of Lightweaver is to provide a similar set of tools for plane-parallel optically thick radiative transfer. To this end it consists of an extensible Python frontend with a clean high-performance C++ backend. During development the code has been extensively tested against both RH (Uitenbroek 2001;Pereira & Uitenbroek 2015) and SNAPI (Milić & van Noort 2018), to ensure agreement between all three on a range of problems. Whilst most radiative transfer tools are designed specifically for a single task, there is much commonality between the operations performed (especially the most costly operations, such as the formal solution of the radiative transfer equation). It is therefore reasonable to abstract out these common building blocks in a way that allows a user to quickly build what amounts to a specialised tool with very little code, in a high-level, memory-safe language that is widely supported in the scientific computing community.
This report describes in detail the components of the Lightweaver framework, including the numerical methods used. In Section 2 we provide an overview of NLTE radiative transfer and describe the numerical methods and their implementations. Then in Section 3 the structure of the framework is described to demonstrate how modularity is achieved.
Lightweaver can be installed by an end-user through the standard Python package manager pip without need for particular compilers to be installed. The code is freely available under the permissive MIT license 1 and is available on GitHub 2 with archival on Zenodo (Osborne 2021a). Lightweaver is in constant development and suggestions and enhancements are welcomed by contacting the authors or through the software's repository. All examples in this paper were tested against the most recent release of Lightweaver , v0.5.0. These examples are available on Zenodo (Osborne 2021b).

NUMERICAL NLTE RADIATIVE TRANSFER
In this Section we first present a brief overview of NLTE radiative transfer; for a much more in depth introduction see Hubený & Mihalas (2014). We also explain how most terms are implemented in Lightweaver , especially those that are less apparent.
Solving the NLTE radiative transfer problem consists primarily of two coupled sub-problems: • Solving the radiative transfer equation to obtain the specific intensity at each frequency, point, and direction in the discretised computational domain for a given set of atomic populations and a given atmospheric model. This step is known as the formal solution of the radiative transfer equation.
• Updating the populations based on the radiative rates obtained from the formal solution.
These two problems are solved iteratively; a formal solution is first computed from an initial guess of the atomic populations, which is then used to construct a linear operator applied in conjunction with the current populations to update these until convergence. Note that to compute the population update it is necessary to compute the intensity from transitions that do not overlap the spectral range of interest, due to their effect on the balance of transitions between the atomic population levels.
In the following we will expand on the solution of these two problems, first describing the terms that enter the equations, the construction of a linear operator, and the formal solution to the radiative transfer equation.

Basic Definitions
The most basic quantity to consider in the study of radiation and radiation transport is specific intensity. This is commonly denoted I(ν, d) at some frequency ν and direction d and has (SI) units J m −2 s −1 Hz −1 sr −1 . Specific intensity (and its projections in the Stokes vector) is the quantity where our observations and simulations meet, and the only vector by which spectroscopic (and polarimetric) information arrives from the observed object.
A ray travelling through a medium, such as neutral gas or a plasma, gains a certain amount of energy per unit length due to emission processes in the plasma, and loses another amount due to absorption (χ abs ) and scattering (χ scatt ) processes. These gain and loss terms are called emissivity and opacity, typically denoted η and χ = χ abs + χ scatt respectively, and will depend both on the frequency considered, as well as the location and direction of the ray.
Considering the case of a ray travelling through plasma made up of neutral and ionised atoms, the emissivity and opacity will depend on the number of atoms, the frequency-dependent cross-sections of these atoms, and the quantum mechanical processes coupling the photons and the plasma.
For a bound-bound process we then arrive at the following expression for the radiative rates for a transition from level i to level j (j > i) where φ is the line absorption profile, ψ is the line emission profile, A and B are the Einstein coefficients for the transition. The radiative rates have units s −1 , and locally describe the number of atomic transitions (i → j and j → i respectively) per unit time. For bound-free transitions the radiative rates are given by where h is Planck's constant, c is the speed of light, α ij is the photoionisation cross-section, k B is Boltzmann's constant, and n e is the electron number density. Φ is the Saha-Boltzmann equation defined such that where n * is the population of the species in LTE, m e is the electron mass, ∆E ji is the energy difference between levels j and i, and g i is the statistical weight of level i. The Saha-Boltzmann equation is obtained by combining the Saha ionisation equation and the Boltzmann excitation equation, and describes the distribution of the total atomic population across its possible states, at a given electron density, and under the assumption of local thermodynamic equilibrium (LTE). The rest frequency ν ij of a transition is given by In Lightweaver , due to existing convention in other RT codes, the energy of each level relative to the ground level of the model atom is supplied in cm −1 . The Einstein coefficients are related to each other and to the oscillator strength (a dimensionless quantity describing absorption probability) by where f ij is the oscillator strength, e is the charge of an electron, and 0 is the vacuum permittivity. As the oscillator strength can be used to compute the Einstein coefficients for a transition, and again for consistency with other codes, Lightweaver requires f ij for each spectral line.

Line Broadening
A transition between well-defined energy states in a bulk motionless plasma is not infinitely narrow, but instead broadened by a number of factors, including natural radiative broadening, Doppler broadening, and collisional broadening, to give the absorption profile φ ij . By default we follow the standard assumption of a Voigt absorption profile, and allow for a combination of different damping terms. There are described in detail in Appendix A.
The design of Lightweaver also supports non-Voigt line profiles, such as the more complex model for electric pressure broadening discussed in Kowalski et al. (2017), whilst only modifying the Python code in the model atom object, however the standard code-path for the Voigt profile is more optimised. The example presented in Section 3.7 shows how a Doppler line profile could be implemented.

MALI
In the following we present a brief review of the numerical techniques implemented in Lightweaver . Much of the content follows Uitenbroek (2001) and the RH code described therein.
Given an atomic species where the population of excitation level i is given by n i the general form of the kinetic equilibrium equation is given by where v is the bulk macroscopic velocity of the particle distribution, P ij is the total transition rate between atomic states i and j and is given by where R ij is the radiative rate, due to interaction with photons or spontaneous emission, and C ij is the collisional rate, due to interaction with other particles. In NLTE studies it is normally assumed that the collisional rate can be known a priori from the model atmosphere definition. A common simplification of (10) is to assume that the atmosphere is in a steady state (i.e. ∂n i /∂t = 0), and therefore the advective term can also be ignored. This sets the left-hand side of (10) to 0 and we obtain the statistical equilibrium equation For both (10) and (12), the system must be solved simultaneously for all levels of an atomic species. The latter requiring a constraint equation to avoid degeneracies. Lightweaver adopts the same Rybicki-Hummer full preconditioning approach (Rybicki & Hummer 1992) as used in RH (Uitenbroek 2001), although the implementation is slightly different. Following these authors, we write the emissivity η and opacity χ of a transition between atomic levels i and j, at frequency ν, along a ray of direction d as where n i is the population density level i. We assume here that j > i and then, by convention, χ ji = −χ ij .
The U and V terms are defined for bound-bound and bound-free transitions as By convention we define U ij = U ii = V ii = 0. Lightweaver can also treat lines necessitating partial redistribution (PRD). Following Uitenbroek (2001) we define and thus In the case of complete redistribution (CRD) ρ = 1. These terms will be discussed in detail in Section 2.5. The total opacity and emissivity can then be found by summing over all species In Lightweaver we split species into three categories: • Background: Bound-free transitions are considered under the assumption of LTE. The opacity and emissivity contribution here is considered to be isotropic.
• Detailed: All transitions (bound-bound and bound-free) are considered in detail, using either given (e.g. from a previous NLTE simulation) or LTE populations. The opacity and emissivity contribution here is considered to be angle-dependent.
• Active: All transitions are considered in detail and terms necessary for iterating the populations are accumulated. The opacity and emissivity contribution here is considered to be angle-dependent.
The expressions for total emissivity and opacity can then be written as the summation over the emissivity and opacity in each of the three previous categories, for each frequency and direction.
The source function at a given frequency and direction is then given by where σ is the continuum scattering coefficient that will be discussed further in Section 2.9.5. It is common to define an operator, Λ used to obtain the monochromatic radiation field in a particular direction from the source function In essence, this is our formal solver, discussed in Section 2.4.1. Rybicki & Hummer (1992) introduce an additional operator, Ψ, that aids in the construction of a linear preconditioned iterative scheme for solving (12) such that where χ † tot is the opacity evaluated with the populations from the previous iteration. For the converged solution, these two operators are equivalent, as χ † = χ.
Taking the n as the vector of level populations {n 1 , n 2 , . . . , n N } at a location in the atmosphere, we can write our iterative scheme for (10) as where Γ i is a row vector from the matrix Γ = Γ C + Γ R , which is evaluated using the previous population estimate. Γ C and Γ R represent the preconditioned collisional and radiative rate equations respectively. We will address the construction of Γ C later. From Rybicki & Hummer (1992) and Uitenbroek (2001) we can write for l = l , and all † terms evaluated with the current level population and ρ estimates. The term when assuming a diagonal Ψ * operator, describes the non-local contribution to the radiation field from the atom in question, and the local contribution from other species. It is often separated, as it remains constant for all transitions in an atom. The diagonal terms of Γ are computed using the conservation property that requires, for the sake of total number conservation, that the sum of each column of Γ be zero (Rybicki & Hummer 1992).
Now that Γ has been constructed it can be used in (26). In the case of statistical equilibrium, we must solve the matrix-vector equation Γ n = 0. A constraint equation is also needed, to avoid the trivial solution ( n = 0), typically a constraint on the total number density of the species. In effect, this amounts to replacing one of the equations with a sum over the level populations, i.e. replacing one of the rows of Γ with ones, and the associated entry in right-hand side with the total population number density.
The discretisation of the time-dependent form of the kinetic equilibrium equation is discussed in the following section as it involves extra complexities strongly coupled to the numerical methods applied.

Numerical Implementation
Most of the integration terms proceed similarly to those of the RH code, but as those have not been presented in a single document, we describe the numerical implementation in detail here.
When considering a one-dimensional plane parallel atmosphere, as is done in Lightweaver , it is efficient to discretise the integration over solid-angle using Gauss-Legendre quadrature over the cosine of the angle between the ray and the normal to atmospheric slabs, commonly denoted µ. These integrations are then implemented as weighted summations of the integrand at the Gauss-Legendre nodes i.e. the angle averaged intensity at a point in the atmosphere can be calculated from The number of angle points is user defined, as it depends on the problem (the anisotropy of the radiation field): for static atmospheres three angle samplings are normally sufficient, whereas five is more reliable in dynamic atmospheres.
As Lightweaver handles overlapping transitions, there needs to be a common wavelength grid that covers all transitions for the problem in question. Each transition provides a set of wavelengths that need to be taken into account to reliably solve the RT problem (e.g. lines are typically densely sampled in the line core and sparse in the wings). All of these individual wavelength grids are combined to produce the global wavelength grid, and a new grid is created for each transition which contains all of the original points, as well as the wavelength points from all other transitions that overlap.
These wavelength grids also define the basis of a numerical quadrature that is described in Appendix B. Therein we also describe the specific accumulation terms used in the construction of the fully preconditioned Γ.
In the case of the time-dependent kinetic equilibrium, there is no "one-size-fits-all" approach to this equation and Lightweaver provides the following tools. The advective term in (26) is ignored, as this requires a more complete treatment including consideration of hydrodynamics. We can discretise ∂ n/∂t = Γ n using a theta method where the superscripts t and t + 1 indicate the start and end of the timestep being integrated over, ∆t the duration of the timestep, and θ the degree of implicitness. θ = 0.5 represents the Crank-Nicolson scheme, θ = 1 the backwards Euler scheme, and θ = 0.55 is commonly used as it is often found to cope better with stiff systems (eg. Viallet et al. 2011). This system is solved by storing Γ t at the start the process, and then updating Γ t+1 using revised updates of the populations n t+1 with each iteration. The process of obtaining a new estimate for n t+1 can be found by rearranging (32) into the form where I is the identity matrix. As the right-hand side is known a priori, it can be evaluated directly, and (33) is a matrix-vector system that can be solved equivalently to the statistical equilibrium case, albeit without the need for a constraint equation. Currently only the fully implicit θ = 1 case is supported, as during testing the differences were found to be insignificant, however we plan to include support for other θ in the future, and this has already been implemented in separate packages that use Lightweaver , but without modifying the base framework.

Radiative Transfer Equation
To obtain the intensity terms in the radiative rates, as well as the outgoing intensity, we need to solve the monochromatic radiative transfer equation (RTE), which for a one-dimensional plane-parallel atmosphere stratified along the z-axis is expressed as or along an optical-depth stratification as for the optical depth defined as dτ (ν) = −χ(ν) dz. Solving this equation for multiple projected angles µ provides the radiation field throughout the model atmosphere that is necessary to compute the Γ operator. Typically the optical depth formulation of (35) is solved as it is more numerically robust (Janett et al. 2018; de la Cruz Rodríguez & Piskunov 2013).

Formal Solver
The formal solver is the technique by which the RTE (34) is solved and the approximate operator Ψ * is computed. By default we adopt the third order Bézier spline short-characteristics approach of de la Cruz Rodríguez & Piskunov (2013); de la Cruz Rodríguez et al. (2019), however investigation is also under way into the use of pragmatic formal solvers as discussed in Janett et al.
In the short-characteristics approach the formal solver is provided with the opacity and source function at discretised points throughout the atmosphere, and the behaviour of these between the known points is assumed to follow a simple function that can be analytically integrated, in this case a third order Bézier spline. It is important to choose an interpolating function that varies smoothly and minimises, or better yet, eliminates under-and overshoots in the interpolant. The third order Bézier spline has proven to be robust in this setting and has been applied in other modern codes such as STiC (de la Cruz Rodríguez et al. 2019) and SNAPI (Milić & van Noort 2018).
The integration routine proceeds from one end of the atmosphere to the other, accumulating these terms through the analytic short-characteristics integration to obtain the up-or down-going intensity for this ray at each point in the atmosphere.
The approximate Ψ operator Ψ * is simply the diagonal of the true Ψ operator, a matrix that would map the vector of emissivity to the intensity. Ψ * is trivially computed during the formal solution from the local contribution terms to the intensity and the local opacity.
The simple linear short-characteristics formal solver is also present and new formal solvers that conform to the interface used in Lightweaver can be compiled separately and loaded from a shared code library, allowing Lightweaver to serve as a testbed without need to modify the core package.

Partial Frequency Redistribution
The effects of partial frequency redistribution (PRD) are important for some NLTE lines, typically strong resonance lines and lower density regions where radiative effects dominate over collisional effects (Hubený & Mihalas 2014). For a complete treatment of the theory describing PRD lines we direct readers to Hubený & Mihalas (2014) and references therein, but we will provide a basic overview here. The common assumption of complete frequency redistribution (CRD) in spectral lines is that ψ = φ. The argument is that most lines are formed in regions with sufficient elastic collisions that atoms are well distributed across the sub-states of each energy level. Emission is therefore not correlated with the absorbed photon that excited the atom into this state. When the plasma is less collisional, there is said to be a natural population of a particular level, i.e. a population where the emission frequency is correlated to the absorption frequency. In this case the emission profile ψ differs from the absorption profile φ, and these coherent scattering effects must be considered.
Lightweaver currently adopts the iterative PRD approach presented in Uitenbroek (2001), but may also in future implement a direct solution, as it may prove more robust than the iterative approach for some highly dynamic problems, despite the higher computational cost. Currently cross-redistribution is not implemented, but the ground-work is present, and the remaining changes would be a simple extension following Uitenbroek (2001) and the RH code.
In the common case where flows are lower than the thermal Doppler velocity, the integrations needed to solve the PRD equations can be simplified by assuming isotropy of the radiation field. This is known as angleaveraged PRD. In cases with stronger flows we instead employ the hybrid PRD approach of Leenaarts et al. (2012) which consists of computing ρ in the atom's rest frame. This approximation agrees quite well with a full angle-dependent treatment, is simple to implement, and much faster to evaluate than the full angle-dependent case. Due to the additional computation effort involved in PRD calculations, regardless of the method used, lines need to be explicitly labelled as PRD.
The derivation of the PRD equations and their numerical implementation is described in Appendix C.

Self-consistent Electron Density
The MALI technique assumes that the electron density is known a priori, but this is often not the case. Assuming that the electron density can be given by the LTE ionisation state of the plasma can yield substantially incorrect results for chromospheric and prominence lines (Heinzel 1995;Paletou 1995;Bjørgen et al. 2019). An additional iteration process is therefore needed to determine the correct electron density within the framework of the NLTE problem.
Whilst not quite as robust as the pure MALI treatment a secondary Newton-Raphson iteration to selfconsistently compute electron density was proposed by Heinzel (1995) and Paletou (1995), and forms the basis of the method implemented here. The time-dependent case is based on Kašparová et al. (2003), and the numerical implementation of both of these is described in Appendix D.

Collisional Rates
Based on the RH code, a number of different formulations for collisional rates are available in Lightweaver . Currently these include tabulated collision strength (Ω) against temperature for excitation of ions by electrons, tabulated collisional ionisation and excitation rates of neutrals by electrons (known in RH as CI and CE), tabulated collisional excitation by protons, neutral hydrogen, and charge exchange with these species (CP, CH, CH+, and CH0 respectively). Additionally the collisional ionisation rates of Arnaud & Rothenflug (1985), and Burgess & Chidichimo (1983) are present. These can be extended further in user code with no modifications to the base library. The choice of pre-implemented collisional rates in the Lightweaver "standard library" allow the direct conversion of the majority of model atoms that are distributed with RH to also be distributed with Lightweaver . The collisional rates for a level depend only on the local parameters, and have no wavelength dependence, therefore the implementation is much more straightforward and does not require complicated numerical integration.
By default, the collisional rates are re-evaluated at the start of each formal solution, although this can be disabled by the user.

Full Stokes Treatment
We also support Zeeman splitting and polarisation effects where the complete set of anomalous Zeeman splitting terms can be computed from the quantum numbers J, L, and S for the levels considered through the LS coupling formalism, or a classical Zeeman triplet computed from an effective Landé g-factor present in the definition of a line. Lightweaver does not support full Stokes iteration of the populations, but provides support for both the field-free and polarisation-free approaches (Trujillo Bueno & Landi Degl'Innocenti 1996). The final formal solution is then undertaken with the thirdorder Bézier spline Diagonal Element Lambda Operator (DELO) method of de la Cruz Rodríguez & Piskunov (2013).

Miscellaneous
Like RH, Lightweaver utilises base SI units throughout, with the singular exception of wavelength being treated in nm. The units of a variable are therefore easy to determine, with little consideration of derived units. In the remainder of this Section we will discuss other small implementation details of the code.

Collisional-Radiative Switching
The collisional-radiative switching (CRSW) technique of Hummer & Voels (1988) is available in Lightweaver . Using MALI, many problems will converge without much issue, however in the case of strong atmospheric gradients the corrections to the populations in early iterations can be overly large and drive the system into a poorly conditioned state. To avoid this the CRSW technique multiplies the collisional contributions to Γ by a significant factor, so as to force the system into LTE. This factor is slowly reduced, allowing a graceful departure from LTE towards NLTE. The exact decay of this parameter can be configured by the user.

Isotopes
Isotopic models are also supported as valid atomic models. By default the abundances for all elements and their isotopic proportions are taken from Asplund et al. (2009), however these can easily be modified by the user.

Equation of State
Lightweaver contains a simple equation of state and background opacity package based on Mihalas (1970), implemented by Wittmann, and ported to Python by J. de la Cruz Rodriguez 3 . This equation of state has also been used in SIR (Ruiz Cobo & del Toro Iniesta 1992) and NICOLE (Socas-Navarro et al. 2015). In Lightweaver it is often used to determine the values of unknown parameters in a provided model atmosphere, and determining an LTE hydrostatic stratification if necessary (based on NICOLE). The equation of state also provides an estimate of the reference opacity τ 500 at 500 nm for model atmospheres that provide a height or column mass based stratification.

Molecules
Whilst molecular lines are not currently supported by Lightweaver , it can compute molecular formation in instantaneous chemical equilibrium, using the same molecular models as RH. These molecules reduce the populations of the atoms bound up in them and some (OH, CH, and H − ) contribute to the background opacity. The H − population is always computed, due to its importance in obtaining correct background opacities.

Background Treatment
The default implementation of background emissivities, opacities, and scattering terms currently follows that of RH, but a more general interface that is trivially overrideable in user code without modifying the framework is also present. The components present in the default background opacity package are listed in Table 1. The OH and CH opacities are not present unless these molecules are explicitly loaded and instantaneous chemical equilibrium is computed as discussed in Section 2.9.4.
2.9.6. Interpolation 3 https://github.com/jaimedelacruz/witt/ For interpolation duties, other than those in the formal solver and calculation of the PRD terms, we adopt the rapid, but robust fourth-order weighted essentially non-oscillatory (WENO) approach presented in Janett et al. (2019). Whilst this technique does not guarantee monotonicity around discontinuities, the over-and under-shoots remain very small, with no ringing artifacts, and we feel that the high quality of the solution in smooth regions makes it worthwhile. We have provided a performant implementation of this technique as a separate Python package that is available through pip as weno4, on GitHub 4 and archived on Zenodo (Osborne 2021c).

DESCRIPTION OF MAJOR CODE COMPONENTS
In this section we provide a brief overview of the components of Lightweaver a user will typically interact with. The frontend is entirely constructed in Python with a binding layer written in Cython 5 (Behnel et al. 2011) to allow it communicate with the C++ backend.

Atomic Models
The information stored in model atoms used by contemporary codes is more than simple atomic data, and in essence these codes are defining their own ad hoc scripting languages to support reading the various terms encoded in these files. As we have access to a high-level dynamic language in the form of Python, it is reasonable to model these as object hierarchies where we can take advantage of the common Python convention that obj == eval(repr(obj)) i.e. evaluating the textual representation of the object generates an equivalent object.
The code in Listing 1 shows the complete source necessary for a 3 level with continuum Hydrogen atom. It is constructed from nested Python classes, and through inheritance user code that implements the same interfaces will be able to extend these further.
For example, taking the quadrature component of a spectral line, we see here that the LinearCoreExpWings class is used. This is a derived class of LineQuadrature and any derived instance of this class can be used here. The requirements are that it provide at least functions doppler units, wavelength, and repr , that return the quadrature in Doppler units and wavelength respectively, and specify how to print the object so it can be re-evaluated. The last of these is trivial and there are plenty of examples throughout the Lightweaver codebase. Line broadening terms are implemented similarly. One strength of the model atoms being implemented in terms of objects is the ease at which they can be manipulated with a simple script before being used or saved in text form (or an optimised Python object storage format such as pickle 6 ).

Radiative Set
During the configuration of a simulation, all atomic models, whether "active" (full NLTE), "detailed static" (transitions computed in detail, but populations fixed), or "passive" (background contributions only) are stored in a RadiativeSet object. This is responsible for producing the common wavelength grid from all transitions taken into account, and the final grid for each transition whilst taking the other transitions into account as discussed in Section 2.3.1. This data is returned in a SpectrumConfiguration object, that can create another instance of itself, covering a restricted range of wavelengths, that is often used for computing a final formal solution over a line in detail, after the NLTE iteration is complete.
The RadiativeSet is also responsible for determining the LTE populations of these species from their models and the atmospheric data provided. During this process the electron density can be assumed fixed, as provided in the atmospheric data, or can be iterated to be selfconsistent with the LTE populations. In the future this object will also be responsible for optionally applying a variant of the second order escape probability method of Hummer & Rybicki (1982), applied to MALI by Judge (2017), which currently resides in the C++ backend. This LTE and initial condition atomic population data is returned in a SpeciesStateTable. Table   6 https://docs.python.org/3/library/pickle.html

Species State
The SpeciesStateTable is responsible for holding both the LTE and NLTE populations of the species (and molecules) present in the simulation, as well as the radiative rates for species treated in detail. This object can also update the LTE and H − populations given an updated set of atmospheric data, thus facilitating timedependent simulations. The arrays in this object are updated automatically by the C++ backend, as they are in fact shared by reference, and the backend is operating directly on the same memory, with no duplication necessary. This is achieved with a lightweight C++ library allowing multi-dimensional views onto a non-owned segment of data. These arrays provide a subset of NumPy functionality, and are limited to handling contiguous memory for performance. Thanks to the use of C++ templates for various data types, these have been verified to compile to assembly equivalent to access into a flat array, with no performance loss, but substantially greater memory safety than raw pointers, and the option to enable bounds-checking during debugging (by adjusting compilation flags).

Context
The code objects discussed so far primarily describe the configuration of the simulation which is then controlled by the Context object. The Context takes this data, in addition to several other configuration options, such as the whether to use hybrid PRD, charge conservation, CRSW, Ng acceleration (Ng 1974), multithreading options, and initial solution to use (which as discussed in Section 3.2, will be moved to the frontend in future). The effects of most of these options can also be achieved by calling some extra methods, but are simplified when used as arguments to the Context initialiser. During initialisation the Context computes background opacity, emissivity and scattering, line profiles, and maps the data into a form which can be used by the backend. After this initial setup the Context can be used to interact with the backend by calling various methods. These include • formal sol gamma matrices which evaluates the collisional rates, formal solution for all wavelengths, and constructs the Γ operator.
• single stokes fs which computes the polarised line profiles (if not already present), and computes a full Stokes formal solution.
• prd redistribute which performs a number of PRD subiterations, until either the maximum number of sub-iterations is performed, or the update size falls under a configurable tolerance.
• stat equil which computes the solution of the statistical equilibrium equations given the previously computed Γ operator.
• time dep update which computes the solution of the kinetic equilibrium equations for one step of a provided duration.
• nr post update which computes the selfconsistent electron density following Section 2.6. In the case of statistical equilibrium this is called automatically, if the Context was initialised in charge conservation mode.
• update deps which updates various quantities such as the background opacity and line profiles to handle modifications to the model atmosphere (e.g. computing a finite-difference response function or changing timesteps in the case of a timedependent simulation).
• compute rays which can compute a formal solution for one or multiple different viewing angles, optionally with full Stokes RT.
The Context and all other associated Cython components have been designed to support the Python pickle serialisation and deserialisation standard. Therefore it is possible to save an entire context context to disk, and reload it and continue processing with only a few lines of code. This will be discussed further in Section 3.5.

Parallelisation
Two forms of parallelisation are supported by Lightweaver , one explicitly, and the other implicitly. The Context object explicitly supports parallelisation of the formal solver over multiple threads of a single process by splitting the wavelengths over threads. This approach also applies to PRD lines, for which the scattering integral can be computed in parallel.
When the aim is to process multiple atmospheres, for example in the case of a finite difference response function or a 1.5D atmosphere simulation, it is more efficient to use Lightweaver in a multi-process mode. This can be done simply using e.g. ProcessPoolExecutor 7 from the Python standard library for single computing nodes, or a Python MPI implementation for a multi-node cluster. This method of computing is supported by the ability to pickle the Context, allowing the entire simulation state to be shipped between nodes in a single block if desired (although messages this large are taxing on process interconnects, and in many cases it is simpler to simply ship the data necessary to reconstruct the Context on a different node).

Example
The code in Listing 2 presents a simple script for running the comparison between the line profiles obtained for Ca ii 8542Å with different electron densities in the FAL C atmosphere (Fontenla et al. 1993).
These are plotted against RH's solution for the electron density given in FAL C in Figure 1. The agreement between RH and Lightweaver is extremely good when solving the same problem, as shown by the blue and dashed orange curves. The red curve, using LTE electron density, shows the importance of the correct electron density for Ca ii 8542Å. The green curve, which is the solution computed with charge conservation from an LTE starting solution, approaches the reference electron density solution, and represents a selfconsistent solution for electron density taking into account H and Ca in NLTE, however the final differences between the charge conserved solution and the reference solution are probably due to other elements, such as Fe, being treated in LTE.

Advanced Example
In this section we present a more advanced example, first demonstrating the implementation of a different line profile (in this case Doppler) in a Ca ii model atom, and then using this modified model in a program that reprocesses output from the RADYN (Carlsson 1992; Allred et al. 2015) radiation-hydrodynamic code in a time-dependent fashion.
Listing 3 demonstrates the modification of a 5 level with continuum Ca ii atom to use Doppler line profiles. The DopplerLine class is first defined, with a new implementation of the compute phi method expected on an instance of AtomicLine. It is then necessary to define the NoOpBroadener for the LineBroadening object provided to these lines, this class does nothing but provide comparison against itself and a repr method, allowing the model atom to be constructed from repr as discussed in Section 3.1. Finally, the model atom is constructed as before, but using the newly defined DopplerLine class. In this way model atoms can contain features such as different line profiles and collision rate parameterisations that are not known to the core Lightweaver package but remain compartmentalised in user code.
First, the pre-processed data is loaded and an atmosphere object atmos is constructed from the initial timestep of the data. Several functions are then defined: • construct context for constructs a Context for an atmosphere and collection of model atoms, similarly to Listing 2.
• initial stat eq computes the statistical equilibrium solution using this context similarly to iterate ctx in Listing 2.
• load step loads the thermodynamic atmospheric properties, and hydrogen populations from the chosen timestep into the Context, before recomputing the line profiles and background opacities via ctx.update deps.
• compute time dependent profiles then uses the load step to load each timestep of the data present, solve the radiative transfer problem, and advance the calcium populations in time. It returns a list of the outgoing radiation from each timestep in the data.
Finally these functions are applied twice to construct two different simulations, one for each of the different calcium model atoms used. A complete reprocessing of the RADYN simulation, considering the effects of advection of the atomic populations is substantially more complex, and outside the scope of the core Lightweaver framework. However, after implementing a suitable advection scheme the code presented here could easily be adapted.

Performance Comparison
Taking the FAL C example (with given electron density) presented in Section 3.6, for a 6-level model hydrogen atom and 6-level model calcium atom we will compare the performance of Lightweaver with RH. These comparison tests were run on an Intel Xeon E3-1270 v3 (4 cores/8 logical threads, Haswell microarchitecture) with 1600 MHz DDR3 memory inside the Windows Subsystem for Linux environment in Microsoft Windows 10.0.18363.1016. The compiled components of both codes were compiled with the GNU compiler collection 7.5.0, and Lightweaver was run using Python 3.8.2.
In both codes the atomic populations are initialised to LTE and Ng acceleration is disabled, to allow direct comparison of the iteration speed. All tests were run five times, and the final result is the mean of these. The variability from run-to-run is extremely low, so is not shown here.
The single-threaded results are shown in Table 2. The difference the total wall time (real-world elapsed time) and the "setup & iteration only" time is the time taken to load and correctly configure the model atmosphere. The FAL C model used is defined on a column mass stratification. Both RH and Lightweaver work in geometric height, so this stratification must first be converted. This is a simple procedure with the total hydrogen density specified in the model atmosphere file. To make the model atmosphere easy to manipulate on its own in Lightweaver , the continuum optical depth τ 500 is also evaluated at the same time. In RH this step takes place after the background opacities are computed, and these can be used directly (hence the very low cost of this step). To improve flexibility in Lightweaver , this term uses background opacities obtained from the equation of state package discussed in Section 2.9.3. Many improvements could be made to the speed of this package by reimplementing its most numerically costly functions in a more performant language (or perhaps binding the pre-existing FORTRAN version to Python). In practice, this one-off cost is rarely an issue as many models are now specified in terms of height, and Lightweaver does not require the calculation of column mass and continuum optical depth when a height stratified atmospheric definition is provided.
The codes were also compared when running on multiple threads. In this case 8 threads were used in both codes, as this provided the fastest execution on this system. These results are shown in  Listing 4. Performing a basic time-dependent synthesis from RADYN simulation with and without Doppler line profiles in the Ca ii model atom cost of the τ 500 conversion in Lightweaver can again be seen in these results. Lightweaver 's threading model, that uses a thread-pool and lockless accumulation of the Gamma operator, provides significantly faster results at the cost of slightly higher memory consumption (one copy of the Γ matrix per atom per thread, and one copy of the accumulation terms (see Appendix B) per atom per thread). Ignoring the aforementioned expensive one-off cost of computing τ 500 for the model atmosphere via the EOS, RH achieves a 1.2x speedup from utilising multiple threads, whereas Lightweaver achieves a 3.4x speedup. Accounting for the high cost of RH's threading model on Windows (where thread creation is very costly), this test was also run on a computer running CentOS 7, where a maximum speedup of 1.5x was recorded.

CONCLUSIONS
We have presented a brief overview of NLTE radiative transfer, and the methods used to solve associated problems employed by the Lightweaver framework. We have also discussed the design of the framework, and hope that it will allow simpler experimentation with RT methods due to its "factoring out" of common operations into composable building blocks, and providing a single language approach to running and analysing simulations thanks to the extensive pre-existing set of scientific tools available in Python. The nature of the framework allows programs for specialised tasks to be written far more easily than is possible in the traditional "configuration file" based monolithic code. We are currently working on multi-dimensional extensions to the framework, to allow the synthesis of radiation from 2D and simple 3D atmospheres, but do not anticipate applying the advanced domain decomposition techniques of e.g. Multi3d (Leenaarts & Carlsson 2009) or PORTA (Štěpán & Trujillo Bueno 2013), however such an extension would be relatively simple thanks to the modularity of the codebase and simple serialisation of Context state.

B. MALI NUMERICAL IMPLEMENTATION
In the following the integration and accumulation terms used in the implementation of the MALI method (Section 2.3) are presented.
As discussed in Section 2.3.1 a common wavelength grid is first computed from which the final individual wavelength grid for each transition is extracted. For each transition with its final individual discrete wavelength grid (denoted λ) we define the integration weights with For lines, this can be used to compute the line-profile normalisation factor that ensures that (A5) holds for each discretised point in the atmosphere k. φ num is defined in Appendix A. This line-profile normalisation factor is essential in ensuring that Γ is correctly scaled, especially for more sparsely sampled lines. The integration weights for the terms contributing to Γ R for each transition ij at each depth are then these terms may not be immediately evident upon comparison with (27); the differences arise from ensuring that all terms are integrated in frequency, despite the discretisation being performed in wavelength. From the previous discussion of line profiles in Appendix A, we have under the reasonable approximation that ν = ν ij over the integration range for a spectral line. This latter formulation is used when evaluating U and V in Lightweaver . Additionally, we follow Uitenbroek (2001) and define and which can be expressed as U ji = A ji /B ji V ji for bound-bound transitions. With these expressions the U and V terms are very efficient to evaluate in a vectorised manner, and we therefore choose to not cache them. If one were writing an entirely CRD code, these terms can be computed once per transition and stored, which could be a valuable optimisation in some cases, at the expense of computer memory. During the accumulation of emissivities and opacities at each wavelength and direction for the formal solver we also accumulate the emissivity per atom total U ji per level j and effective self-opacity in for each level l If no lines are associated with a wavelength point, then all the sources of emissivity and opacity are direction independent, and these accumulations can only be performed once for all of the directional formal solutions. As the accumulation of terms into Γ R can be done after the formal solution for each direction is performed, the storage of these terms does not increase with the number of directions. If one adopts the "same-transition" preconditioning approach of (Rybicki & Hummer 1992) then none of these accumulation arrays are needed. This could be advantageous in the implementation of higher dimensional schemes, as in most cases there appears to be no difference in convergence speed between the two methods.
The formal solver then provides the values of I † (ν, d) and Ψ * ν, d , for all depths, one direction at a time. The per-atom I eff term of (28) for this direction can then simply be computed as where the final term is a simple scalar multiplication, under the assumption that Ψ * is simply the diagonal of the true Ψ operator (which is currently the case in Lightweaver ). With these definitions the integration of the off-diagonal entries of Γ R at each spatial point, for each active atom, is performed by looping over each contributing transition ij such that The radiative rates are computed similarly In the following we describe the terms needed to apply PRD to a spectral line and their implementation in Lightweaver . For the previous definition of ρ ij from (18), under the assumptions of a line with an infinitely sharp lower level and broadened upper level, and the validity of PRD being in the atomic frame being approximated by PRD in the observer's frame (Uitenbroek 2001), following Hubený & Mihalas (2014) we then have where R II is the generalised redistibution function for transitions of this kind (Hubený 1982), and γ is the branching ratio, or coherency fraction. The summation over l and lji subscript on R II describe the scattering process. When ignoring cross-redistribution (Raman scattering), we have l = i, and the summation is replaced by a single term, as we are only considering resonance scattering within the line i → j.
The coherency fraction γ describes the normalised probability of a photons being re-emitted from the same sublevel of energy level j before an elastic collision that will redistribute it across sublevels, provided that it is re-emitted at all. This is then given by where P j is the total rate of transitions out of level j (depopulation rate), and Q j is the total rate of elastic collisions affecting this level. Defining g II (ν, ν ) = R II (ν, ν )/φ ij (ν ), which is normalised such that 1 4π g II (ν, ν ) dν dΩ = 1, as per Gouttebroze (1986) and Uitenbroek (1989) wherein fast approximations to this function are derived, and ignoring cross-redistribution we then have Ignoring bulk plasma flows, the integrals over angle and frequency can then be split, providing an angle-averaged form of ρ that is much easier to compute, given by is the frequency-integrated mean intensity across the transition. The numerical implementation of angle-averaged PRD simply follows the method of Uitenbroek (2001), which is briefly summarised below. The redistribution function is often very sharply peaked, an accurate evaluation of the scattering integral therefore requires much finer sampling of the wavelength grid than that which is required the evaluate the terms in Γ R . However for each wavelength in a line's grid there is only a small surrounding region for which g II is non-zero. J(ν) is then interpolated onto a fine grid over this region, over which (C27) is trivially implemented by an application of Simpson's rule. It is essential that the scattering integral component of (C27) be normalised as per (C26) to avoid the addition or destruction of photons in the transition. The term g II (ν, ν )J(ν )dν (C30) is then implemented as where δν i are the integration weights over this fine grid. The iterative method currently employed consists of performing a formal solution over the wavelengths where PRD lines are active, and updating ρ using this method whilst maintaining the populations fixed. When solving a PRD problem, a number of these sub-iterations to update ρ (commonly 3) are interleaved between every complete formal solution and population update.
For the hybrid PRD case of Leenaarts et al. (2012) used when plasma flows exceed the thermal Doppler velocity, J(ν) in (C28) is then replaced by J rest (ν), the mean intensity in the atom's rest frame. This approximation is much faster to evaluate than the full angle-dependent case, as the accumulation of J rest (ν) can be done during the formal solution, using a linear interpolation off the Doppler-shifted frequency grid. ρ can be linearly interpolated from the atomic rest frame during the calculation of the U and V terms, or into a directionally dependent array at the end of each PRD sub-iteration. Currently Lightweaver does the former of these.
To ensure that J rest is accumulated correctly during the PRD sub-iterations, we no longer simply perform formal solutions over wavelengths where PRD lines are present, but also over wavelengths that when shifted back to the rest frame contribute to J rest in these regions. We have found that this modification, that is not present in RH1.5D (Pereira & Uitenbroek 2015), can dramatically aid convergence in atmospheres with high velocity shifts.

D. SELF-CONSISTENT NEWTON-RAPHSON ELECTRON DENSITY ITERATION
From Section 2.3, the equations of statistical equilibrium (ESE) are given by (12). Let us write this system for a level i of a species s as F s,i ( n s , n e ) = j =i n j P ji ( n s , n e ) − n i j =i P ij ( n s , n e ) = 0 (D32) With a fixed electron density the preconditioned linear formulation of these equations for a species s can be written Γ s n s = 0.
We start by obtaining the solution to this linear system that, in the following, will be denoted n s . The previous values of these populations, i.e. the ones at which Γ s was evaluated, will once again be denoted with †.
The principle of Newton-Raphson iteration is to compute F (x 0 + δx) = 0, which can be written as F (x 0 ) + Jδx = 0, with J the Jacobian of F evaluated at x 0 , and δx some small correction to x 0 . This can be rearranged to −Jδx = F (x 0 ). Applying this to technique (D32) and explanding to first order we have where δn j and δn e indicate the corrections to these populations necessary to render them self-consistent. This expression can be written for each level of each active species. Looking more closely at each term we have where the term involving Γ C depends on the exact form of the collisional rates. Due to the number of collisional rate options available in Lightweaver this is evaluated through finite differences, which remains relatively efficient due to the local nature of this term. As can be seen from (D35) and (D36), all terms are linear in δn and δn e , however additional constraints are needed to close this system: a constraint on the total population of each species s, and a constraint on charge neutrality. In total, this forms a system s N level,s + 1 equations, where N level,s is the number of levels treated in detail for species s, and the summmation is performed over all active species. This system can therefore be written at each point in the atmosphere as a block diagonal matrix, with each block being N level,s × N level,s , with a final row and column due to the electron density terms and charge conservation equation that couple all blocks. The block terms are simply −Γ s