Newton Raphson
Newton-Raphson algorithm is widely used in scientific computing. In ADCME, the function for the algorithm is newton_raphson. And the signature is
ADCME.newton_raphson — Functionnewton_raphson(func::Function, 
    u0::Union{Array,PyObject}, 
    θ::Union{Missing,PyObject, Array{<:Real}}=missing,
    args::PyObject...) where T<:RealNewton Raphson solver for solving a nonlinear equation.  ∘ func has the signature 
- func(θ::Union{Missing,PyObject}, u::PyObject)->(r::PyObject, A::Union{PyObject,SparseTensor})(if- linesearchis off)
- func(θ::Union{Missing,PyObject}, u::PyObject)->(fval::PyObject, r::PyObject, A::Union{PyObject,SparseTensor})(if- linesearchis on)
where r is the residual and A is the Jacobian matrix; in the case where linesearch is on, the function value fval must also be supplied. ∘ θ are external parameters. ∘ u0 is the initial guess for u ∘ args: additional inputs to the func function  ∘ kwargs: keyword arguments to func
The solution can be configured via ADCME.options.newton_raphson
- max_iter: maximum number of iterations (default=100)
- rtol: relative tolerance for termination (default=1e-12)
- tol: absolute tolerance for termination (default=1e-12)
- LM: a float number, Levenberg-Marquardt modification $x^{k+1} = x^k - (J^k + \mu^k)^{-1}g^k$ (default=0.0)
- linesearch: whether linesearch is used (default=false)
Currently, the backtracing algorithm is implemented. The parameters for linesearch are supplied via options.newton_raphson.linesearch_options
- c1: stop criterion, $f(x^k) < f(0) + \alpha c_1 f'(0)$
- ρ_hi: the new step size $\alpha_1\leq \rho_{hi}\alpha_0$
- ρ_lo: the new step size $\alpha_1\geq \rho_{lo}\alpha_0$
- iterations: maximum number of iterations for linesearch
- maxstep: maximum allowable steps
- αinitial: initial guess for the step size $\alpha$
As an example, assume we want to solve
\[u_i^2 - 1 = 0, i=1,2,\ldots, 10\]
We first need to construct a function
function f(θ, u)
    return u^2 - 1, 2*spdiag(u)
endHere $2\texttt{spdiag}(u)$ is the Jacobian matrix for the equation. Then we construct a Newton Raphson solver via
nr = newton_raphson(f, constant(rand(10)))nr is a NRResult struct which is runnable and can be materialized by 
nr = run(sess, nr)The signature for NRResult is 
struct NRResult
    x::Union{PyObject, Array{Float64}} # final solution
    res::Union{PyObject, Array{Float64, 1}} # residual
    u::Union{PyObject, Array{Float64, 2}} # solution history
    converged::Union{PyObject, Bool} # whether it converges
    iter::Union{PyObject, Int64} # number of iterations
endu$\in \mathbb{R}^{p\times n}$ where p is the solution dimension and n is the number of iterations. 
Sometimes we want to construct f via some external variables $\theta$, e.g., when $\theta$ is a trainable variable and embeded in the Newton-Raphson solver, we can pass this parameter to newton_raphson via the third parameter
nr = newton_raphson(f, constant(rand(10)),θ)We can provide options to newton_raphson using ADCME.options.newton_raphson. For example
ADCME.options.newton_raphson.verbose = true 
ADCME.options.newton_raphson.tol = 1e-6
nr = newton_raphson(f, constant(rand(10)), missing)This might be useful for debugging.
In the case we want to apply a linesearch step in our Newton-Raphson solver, we can turn on the linesearch option in options. However, in this case, we must provide the function value for f (assuming we are solving a minimization problem).  
function f(θ, u)
    return sum(1/3*u^3-u), u^2 - 1, 2*spdiag(u)
endThe corresponding driver code is
ADCME.options.newton_raphson.verbose = false
ADCME.options.newton_raphson.linesearch = true
ADCME.options.newton_raphson.tol = 1e-12
ADCME.options.newton_raphson.linesearch_options.αinitial = 1.0
nr = newton_raphson(f, constant(rand(10)), missingFinally we consider the differentiable Newton-Raphson algorithm. Consider we want to construct a map $f:x\mapsto y$, which satisfies
\[y^3-x=0\]
In a later stage, we also want to evaluate $\frac{dy}{dx}$. To this end, we can use newton_raphson_with_grad, which provides a differentiable implementation of the Newton-Raphson's algorithm. 
function f(θ, x)
    x^3 - θ, 3spdiag(x^2)
end
θ = constant([2. .^3;3. ^3; 4. ^3])
x = newton_raphson_with_grad(f, constant(ones(3)), θ)
run(sess, x)≈[2.;3.;4.]
run(sess, gradients(sum(x), θ))≈1/3*[2. .^3;3. ^3; 4. ^3] .^(-2/3)