Mixed Symbolic-Numeric Perturbation Theory

Symbolics.jl is a fast and modern Computer Algebra System (CAS) written in Julia. It is an integral part of the SciML ecosystem of differential equation solvers and scientific machine learning packages. While Symbolics.jl is primarily designed for modern scientific computing (e.g. automatic differentiation and machine learning), it is also a powerful CAS that can be used for classic scientific computing. One such application is using perturbation theory to solve algebraic and differential equations.

Perturbation methods are a collection of techniques to solve hard problems that generally don't have a closed solution, but depend on a tunable parameter and have closed or easy solutions for some values of this parameter. The main idea is to assume a solution that is a power series in the tunable parameter (say $ϵ$), such that $ϵ = 0$ corresponds to an easy solution, and then solve iteratively for higher-order corrections.

The hallmark of perturbation theory is the generation of long and convoluted intermediate equations by this process. These are subjected to algorithmic and mechanical manipulations, which makes perturbation methods an excellent fit for a CAS. In fact, CAS software have been used for perturbation calculations since the early 1970s.

This tutorial shows how to mix symbolic manipulations and numerical methods to solve algebraic equations with perturbation theory. Another tutorial applies it to differential equations.

Solving the quintic equation

The “hello world!” analog of perturbation problems is to find a real solution $x$ to the quintic (fifth-order) equation

using Symbolics
@variables x
quintic = x^5 + x ~ 1

\[ \begin{equation} x + x^{5} = 1 \end{equation} \]

According to Abel's theorem, a general quintic equation does not have a closed form solution. But we can easily solve it numerically using Newton's method (here implemented for simplicity, and not performance):

function solve_newton(eq, x, x₀; abstol=1e-8, maxiters=50)
    # symbolic expressions for f(x) and f′(x)
    f = eq.lhs - eq.rhs # want to find root of f(x)
    f′ = Symbolics.derivative(f, x)

    xₙ = x₀ # numerical value of the initial guess
    for i = 1:maxiters
        # calculate new guess by numerically evaluating symbolic expression at previous guess
        xₙ₊₁ = substitute(x - f / f′, x => xₙ)
        if abs(xₙ₊₁ - xₙ) < abstol
            return xₙ₊₁ # converged
        else
            xₙ = xₙ₊₁
        end
    end
    error("Newton's method failed to converge")
end

x_newton = solve_newton(quintic, x, 1.0)
println("Newton's method solution: x = ", x_newton)
Newton's method solution: x = 0.7548776662466927

To solve the same problem with perturbation theory, we must introduce an expansion variable $ϵ$ in the equation:

@variables ϵ # expansion variable
quintic = x^5 + ϵ*x ~ 1

\[ \begin{equation} x \epsilon + x^{5} = 1 \end{equation} \]

If $ϵ = 1$, we get our original problem. With $ϵ = 0$, the problem transforms to the easy quintic equation $x^5 = 1$ with the trivial real solution $x = 1$ (and four complex solutions which we ignore). Next, expand $x$ as a power series in $ϵ$:

x_coeffs, = @variables a[0:7] # create Taylor series coefficients
x_taylor = series(x_coeffs, ϵ) # expand x in a power series in ϵ

\[ \begin{equation} a_{0} + a_{1} \epsilon + \epsilon^{2} a_{2} + \epsilon^{3} a_{3} + \epsilon^{4} a_{4} + \epsilon^{5} a_{5} + \epsilon^{6} a_{6} + \epsilon^{7} a_{7} \end{equation} \]

Then insert this into the quintic equation and expand it, too, to the same order:

quintic_taylor = substitute(quintic, x => x_taylor)
quintic_taylor = taylor(quintic_taylor, ϵ, 0:7)

\[ \begin{equation} a_{0}^{5} + \left( a_{0} + 5 a_{0}^{4} a_{1} \right) \epsilon + \frac{1}{2} \epsilon^{2} \left( 2 a_{1} + 10 a_{0}^{4} a_{2} + 20 a_{1}^{2} a_{0}^{3} \right) + \frac{1}{6} \epsilon^{3} \left( 6 a_{2} + 30 a_{0}^{4} a_{3} + 120 a_{0}^{3} a_{1} a_{2} + 60 a_{1}^{3} a_{0}^{2} \right) + \frac{1}{24} \epsilon^{4} \left( 24 a_{3} + 120 a_{0}^{4} a_{4} + 480 a_{0}^{3} a_{1} a_{3} + 240 a_{2}^{2} a_{0}^{3} + 720 a_{1}^{2} a_{0}^{2} a_{2} + 120 a_{1}^{4} a_{0} \right) + \frac{1}{120} \epsilon^{5} \left( 120 a_{4} + 600 a_{0}^{4} a_{5} + 2400 a_{0}^{3} a_{1} a_{4} + 2400 a_{0}^{3} a_{2} a_{3} + 3600 a_{1}^{2} a_{0}^{2} a_{3} + 3600 a_{2}^{2} a_{0}^{2} a_{1} + 2400 a_{1}^{3} a_{0} a_{2} + 120 a_{1}^{5} \right) + \frac{1}{720} \epsilon^{6} \left( 720 a_{5} + 3600 a_{0}^{4} a_{6} + 14400 a_{0}^{3} a_{1} a_{5} + 14400 a_{0}^{3} a_{2} a_{4} + 7200 a_{3}^{2} a_{0}^{3} + 21600 a_{1}^{2} a_{0}^{2} a_{4} + 43200 a_{0}^{2} a_{1} a_{2} a_{3} + 7200 a_{2}^{3} a_{0}^{2} + 14400 a_{1}^{3} a_{0} a_{3} + 21600 a_{2}^{2} a_{1}^{2} a_{0} + 3600 a_{1}^{4} a_{2} \right) + \frac{1}{5040} \epsilon^{7} \left( 5040 a_{6} + 25200 a_{0}^{4} a_{7} + 100800 a_{0}^{3} a_{1} a_{6} + 100800 a_{0}^{3} a_{2} a_{5} + 100800 a_{0}^{3} a_{3} a_{4} + 151200 a_{1}^{2} a_{0}^{2} a_{5} + 302400 a_{0}^{2} a_{1} a_{2} a_{4} + 151200 a_{3}^{2} a_{0}^{2} a_{1} + 151200 a_{2}^{2} a_{0}^{2} a_{3} + 100800 a_{1}^{3} a_{0} a_{4} + 302400 a_{1}^{2} a_{0} a_{2} a_{3} + 100800 a_{2}^{3} a_{0} a_{1} + 25200 a_{1}^{4} a_{3} + 50400 a_{2}^{2} a_{1}^{3} \right) = 1 \end{equation} \]

This messy equation must hold for each power of $ϵ$, so we can separate it into one nicer equation per order:

quintic_eqs = taylor_coeff(quintic_taylor, ϵ, 0:7)
quintic_eqs[1:5] # for readability, show only 5 shortest equations

\[ \begin{align} a_{0}^{5} &= 1 \\ a_{0} + 5 a_{0}^{4} a_{1} &= 0 \\ \frac{1}{2} \left( 2 a_{1} + 10 a_{0}^{4} a_{2} + 20 a_{1}^{2} a_{0}^{3} \right) &= 0 \\ \frac{1}{6} \left( 6 a_{2} + 30 a_{0}^{4} a_{3} + 120 a_{0}^{3} a_{1} a_{2} + 60 a_{1}^{3} a_{0}^{2} \right) &= 0 \\ \frac{1}{24} \left( 24 a_{3} + 120 a_{0}^{4} a_{4} + 480 a_{0}^{3} a_{1} a_{3} + 240 a_{2}^{2} a_{0}^{3} + 720 a_{1}^{2} a_{0}^{2} a_{2} + 120 a_{1}^{4} a_{0} \right) &= 0 \end{align} \]

These equations show three important features of perturbation theory:

  1. The $0$-th order equation is trivial in $x_0$: here $x_0^5 = 1$ has the trivial real solution $x_0 = 1$.
  2. The $n$-th order equation is linear in $x_n$ (except the trivial $0$-th order equation).
  3. The equations are triangular in $x_n$: the $n$-th order equation can be solved for $x_n$ given only $x_m$ for $m<n$.

This structure is what makes the perturbation theory so attractive: we can start with the trivial solution $x_0 = 1$, then linearly solve for $x_n$ order-by-order and substitute in the solutions of $x_m$ for $m<n$ obtained so far.

Here is a simple function that uses this cascading strategy to solve such a set of equations eqs for the variables xs, given a solution x₀ of the first equation eqs[begin]:

function solve_cascade(eqs, xs, x₀, ϵ)
    sol = Dict(xs[begin] => x₀) # store solutions in a map

    # verify that x₀ is a solution of the first equation
    eq0 = substitute(eqs[1], sol)
    isequal(eq0.lhs, eq0.rhs) || error("$sol does not solve $(eqs[1])")

    # solve remaining equations sequentially
    for i in 2:length(eqs)
        eq = substitute(eqs[i], sol) # insert previous solutions
        x = xs[begin+i-1] # need not be 1-indexed
        xsol = Symbolics.symbolic_linear_solve(eq, x) # solve current equation
        sol = merge(sol, Dict(x => xsol)) # store solution
    end

    return sol
end
solve_cascade (generic function with 1 method)

Let us solve our order-separated quintics for the coefficients, and substitute them into the full series for $x$:

x_coeffs_sol = solve_cascade(quintic_eqs, x_coeffs, 1, ϵ)
x_pert = substitute(x_taylor, x_coeffs_sol)

\[ \begin{equation} 1 - \frac{1}{5} \epsilon - \frac{1}{25} \epsilon^{2} - \frac{1}{125} \epsilon^{3} + \frac{21}{15625} \epsilon^{5} + \frac{78}{78125} \epsilon^{6} + \frac{187}{390625} \epsilon^{7} \end{equation} \]

The $n$-th order solution of our original quintic equation is the sum up to the $ϵ^n$-th order term, evaluated at $ϵ=1$:

for n in 0:7
    x_pert_sol = substitute(taylor(x_pert, ϵ, 0:n), ϵ => 1)
    println("$n-th order solution: x = $x_pert_sol = $(x_pert_sol * 1.0)")
end
0-th order solution: x = 1//1 = 1.0
1-th order solution: x = 4//5 = 0.8
2-th order solution: x = 19//25 = 0.76
3-th order solution: x = 94//125 = 0.752
4-th order solution: x = 94//125 = 0.752
5-th order solution: x = 11771//15625 = 0.753344
6-th order solution: x = 58933//78125 = 0.7543424
7-th order solution: x = 294852//390625 = 0.75482112

This is close to the solution from Newton's method!

Solving Kepler's Equation

Historically, perturbation methods were first invented to calculate orbits of the Moon and the planets. In homage to this history, our second example is to solve Kepler's equation, which is central to solving two-body Keplerian orbits:

@variables e E M
kepler = E - e * sin(E) ~ M

\[ \begin{equation} E - e \sin\left( E \right) = M \end{equation} \]

We want to solve for the eccentric anomaly $E$ given the eccentricity $e$ and mean anomaly $M$. This is also easy with Newton's method. With Earth's eccentricity $e = 0.01671$ and $M = \pi/2$:

vals_earth = Dict(e => 0.01671, M => π/2)
E_newton = solve_newton(substitute(kepler, vals_earth), E, π/2)
println("Newton's method solution: E = ", E_newton)
Newton's method solution: E = 1.5875039945829585

Next, let us solve the same problem with the perturbative method. It is most common to expand $E$ as a series in $M$. Repeating the procedure from the quintic example, we get these equations:

E_taylor = series(E, M, 0:5) # this auto-creates coefficients E[0:5]
E_coeffs = taylor_coeff(E_taylor, M) # get a handle to the coefficients
kepler_eqs = taylor_coeff(substitute(kepler, E => E_taylor), M, 0:5)
kepler_eqs[1:4] # for readability

\[ \begin{align} E_{0} - e \sin\left( E_{0} \right) &= 0 \\ E_{1} - E_{1} e \cos\left( E_{0} \right) &= 1 \\ \frac{1}{2} \left( 2 E_{2} - 2 E_{2} e \cos\left( E_{0} \right) + E_{1}^{2} e \sin\left( E_{0} \right) \right) &= 0 \\ \frac{1}{6} \left( 6 E_{3} - 6 E_{3} e \cos\left( E_{0} \right) + 6 E_{1} E_{2} e \sin\left( E_{0} \right) + E_{1}^{3} e \cos\left( E_{0} \right) \right) &= 0 \end{align} \]

The trivial $0$-th order solution (when $M=0$) is $E_0=0$. This gives this full perturbative solution:

E_coeffs_sol = solve_cascade(kepler_eqs, E_coeffs, 0, M)
E_pert = substitute(E_taylor, E_coeffs_sol)

\[ \begin{equation} \frac{ - M^{5} \left( \frac{ - 60 \left( \frac{-1}{-1 + e} \right)^{5} e^{2}}{6 - 6 e} - \left( \frac{-1}{-1 + e} \right)^{5} e \right)}{120 - 120 e} + \frac{ - M}{-1 + e} + \frac{ - \left( \frac{-1}{-1 + e} \right)^{3} M^{3} e}{6 - 6 e} \end{equation} \]

Numerically, the result again converges to that from Newton's method:

for n in 0:5
    println("$n-th order solution: E = ", substitute(taylor(E_pert, M, 0:n), vals_earth))
end
0-th order solution: E = 0//1
1-th order solution: E = 1.59749039123239
2-th order solution: E = 1.59749039123239
3-th order solution: E = 1.585943679042236
4-th order solution: E = 1.585943679042236
5-th order solution: E = 1.5876674054288387

But unlike Newtons method, this example shows how perturbation theory also gives us the powerful symbolic series solution for $E$ (before numbers for $e$ and $M$ are inserted). Our series matches this result from Wikipedia:

E_wiki = 1/(1-e)*M - e/(1-e)^4*M^3/factorial(3) + (9e^2+e)/(1-e)^7*M^5/factorial(5)

\[ \begin{equation} \frac{M}{1 - e} + \frac{\frac{1}{120} M^{5} \left( e + 9 e^{2} \right)}{\left( 1 - e \right)^{7}} + \frac{ - \frac{1}{6} M^{3} e}{\left( 1 - e \right)^{4}} \end{equation} \]

Alternatively, we can expand $E$ in $e$ instead of $M$, giving the solution (the trivial solution when $e = 0$ is $E_0=M$):

E_taylor′ = series(E, e, 0:5)
E_coeffs′ = taylor_coeff(E_taylor′, e)
kepler_eqs′ = taylor_coeff(substitute(kepler, E => E_taylor′), e, 0:5)
E_coeffs_sol′ = solve_cascade(kepler_eqs′, E_coeffs′, M, e)
E_pert′ = substitute(E_taylor′, E_coeffs_sol′)

\[ \begin{equation} M + e \sin\left( M \right) + e^{2} \sin\left( M \right) \cos\left( M \right) - \frac{1}{6} e^{3} \left( 3 \sin^{3}\left( M \right) - 6 \cos^{2}\left( M \right) \sin\left( M \right) \right) - \frac{1}{24} e^{4} \left( 28 \sin^{3}\left( M \right) \cos\left( M \right) + 4 \left( 3 \sin^{3}\left( M \right) - 6 \cos^{2}\left( M \right) \sin\left( M \right) \right) \cos\left( M \right) \right) - \frac{1}{120} e^{5} \left( - 5 \sin^{5}\left( M \right) + 120 \cos^{2}\left( M \right) \sin^{3}\left( M \right) + 5 \left( 28 \sin^{3}\left( M \right) \cos\left( M \right) + 4 \left( 3 \sin^{3}\left( M \right) - 6 \cos^{2}\left( M \right) \sin\left( M \right) \right) \cos\left( M \right) \right) \cos\left( M \right) - 20 \sin^{2}\left( M \right) \left( 3 \sin^{3}\left( M \right) - 6 \cos^{2}\left( M \right) \sin\left( M \right) \right) \right) \end{equation} \]

This looks very different from our first series E_pert. If they are the same, we should get $0$ if we subtract and expand both as multivariate Taylor series in $(e,M)$. Indeed:

taylor(taylor(E_pert′ - E_pert, e, 0:4), M, 0:4)

\[ \begin{equation} 0 \end{equation} \]