{ "cells": [ { "cell_type": "markdown", "id": "a30eaeba", "metadata": {}, "source": [ "## Nonlinear systems and automatic differentiation" ] }, { "cell_type": "markdown", "id": "df935bca", "metadata": {}, "source": [ "### Exercise on Newton-Raphson in 2D\n", "\n", "Consider the following nonlinear system:\n", "$$\n", "\\left \\{\n", " \\begin{aligned}\n", " &y = (x-1)^2 \\\\\n", " &x^2 + y^2 = 4\n", " \\end{aligned}\n", "\\right.\n", "$$\n", "\n", "1. By exploiting appropriate graphical plots to roughly visualize the area(s) containing solution(s), implement a Newton-Raphson algorithm to calculate an approximation of this (these) solution(s) by choosing a judicious initial point (or points).\n", "\n", "2. Modify the algorithm to collect all iterations and visualize the evolution of the error in terms of iteration for various initial values using a suitable y-scale. One can assume that a given solution is provided by the final value of the algorithm." ] }, { "cell_type": "code", "execution_count": null, "id": "0800dc52", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "8782e8fd", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "f9e89b43", "metadata": {}, "source": [ "### Exercise on the Babylonian Method\n", "\n", "Let $a > 0$ be a real parameter, and consider the sequence defined by\n", "\n", "$$\n", "\\tag{1}\n", "x_0 > 0 \\quad ; \\quad \\forall k \\in \\mathbb{N}, \\quad x_{k+1} = \\frac{1}{2}\\left(x_k + \\frac{a}{x_k}\\right)\n", "$$\n", "\n", "1. By expressing $x_{k+1} - \\sqrt{a}$ in terms of $x_k - \\sqrt{a}$ and then $x_{k+1} - x_k$, show that $(x_k)$ converges quadratically to $x_* = \\sqrt{a}$ for any $x_0 > 0$.\n", "\n", "1. Show that the recurrence formulation (1) is none other than the Newton-Raphson algorithm applied to a function that vanishes at $x_* = \\sqrt{a}$.\n", "\n", "1. Build a function `Babylonian` taking arguments `a` and an integer `n` defaulting to `10`, returning the vector $[x_0, x_1, \\ldots, x_n]$ by initializing the sequence with $x_0 = \\frac{1+a}{2}$.\n", "\n", "1. Test on a few values.\n", "\n", "1. Plot the error $|x_k - x_*|$ as a function of the iteration index $k$ for $a = 2$.\n", "\n", " The idea in the rest of this exercise is to apply the `Babylonian` function defined above to an argument `a` not of type `Float64` but of a new type allowing estimation of both the value of $\\sqrt{a}$ and its derivative $\\frac{1}{2\\sqrt{a}}$. For this purpose, new numbers called **dual numbers** are introduced. These are defined similarly to complex numbers, starting with a special number denoted $ɛ$, such that a dual number is written as $x = a + bɛ$ with $a$ and $b$ real. In a sense, $ɛ$ plays a role analogous to the complex $i$, but here, we set $ɛ^2=0$. The goal of such numbers is to be able to store both the value of a function and its derivative by stating\n", "\n", " \n", " $$\n", " \\tag{2}\n", " f(a+bɛ)=f(a)+f'(a)bɛ\n", " $$\n", "\n", " which implies that the derivative at $a$ of $f$ can be obtained by retrieving the component on $ɛ$ of $f(a+ɛ)$ (i.e., by setting $b=1$).\n", "\n", " In practice, it is necessary to redefine the behavior of basic functions consistently with (2). In the current application, only the operations `+`, `-`, `*`, and `/` will be needed and should be overloaded to accept dual numbers as arguments. Additionally, the functions `convert` should be implemented to convert a real number into a dual number, and `promote_rule` to express that in the presence of an operation involving two numbers, one of which is dual, both must first be expressed as dual numbers before the operation is performed. Note that overloading operators is only possible if they are explicitly imported, for example, using `import Base: +, -, ...`. Also, the `Base.show` function should be defined so that the display of a dual number explicitly has the form `a+bɛ`.\n", "\n", " The operator overloading is mathematically expressed as\n", " $$\n", " \\begin{align*}\n", " (a+bɛ)+(c+dɛ)&=(a+c)+(b+d)ɛ\\\\\n", " (a+bɛ)-(c+dɛ)&=(a-c)+(b-d)ɛ\\\\\n", " (a+bɛ)*(c+dɛ)&=ac+(bc+ad)ɛ\\\\\n", " (a+bɛ)/(c+dɛ)&=\\frac{a}{c}+\\frac{bc-ad}{c^2}ɛ\n", " \\end{align*}\n", " $$\n", "\n", " Alternatively to this last operation, we can also define $\\mathrm{inv}(a+bɛ)=\\frac{1}{a}-\\frac{b}{a^2}ɛ$ and then $u/v=u*\\mathrm{inv}(v)$.\n", "\n", "1. Study the `struct D` defined to represent a dual number and the associated lines of code. Complete the missing code sections.\n", "\n", "1. Define an instance of the number `ɛ` (\\varepsilon then TAB to display ε), in other words, the number `0+1ɛ`, and perform some operations to check the implementations, for example,\n", "\n", " ```julia\n", " (1+2ɛ)*(3+4ɛ)\n", " 1/(1+ɛ)\n", " (1+2ɛ)/(2-ɛ)\n", " ```\n", "\n", "1. Exploit the structure of dual numbers to estimate the derivative of the square root function using the Babylonian method (directly utilizing the `Babylonian` function without rewriting it) on some examples ($a=0.1$, $a=2$, $a=25$) and verify the results with the analytical solution.\n", "\n", "1. Overlay on a graph the derivative of the square root obtained by the Babylonian method on dual numbers and the analytical expression.\n", "\n", "1. Propose a similar method for calculating the $p$-th root of a number $a$, i.e., $\\sqrt[p]{a}$. Verify that the derivative of the $p$-th root can also be obtained by exploiting dual numbers without any additional lines of code.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "05eb72f1", "metadata": {}, "outputs": [], "source": [ "function Babylonian(a; n = 10)\n", " # Your code here\n", "end" ] }, { "cell_type": "code", "execution_count": null, "id": "e4f5e336", "metadata": {}, "outputs": [], "source": [ "for a in (0.1, 2, 25) \n", " # Your test code here\n", "end" ] }, { "cell_type": "code", "execution_count": null, "id": "095a9ed7", "metadata": {}, "outputs": [], "source": [ "using Plots\n", "\n", "# Your plotting code here" ] }, { "cell_type": "code", "execution_count": null, "id": "dbe4bc43", "metadata": {}, "outputs": [], "source": [ "import Base: +, -, *, /, inv, conj, convert, promote_rule\n", "using LinearAlgebra\n", "\n", "struct D <: Number\n", " f::Tuple{Float64,Float64}\n", "end\n", "+(x::D, y::D) = D(x.f .+ y.f)\n", "-(x::D, y::D) = # to fill\n", "*(x::D, y::D) = D((x.f[1]*y.f[1], (x.f[2]*y.f[1] + x.f[1]*y.f[2])))\n", "/(x::D, y::D) = # to fill\n", "convert(::Type{D}, x::Real) = D((x,zero(x)))\n", "promote_rule(::Type{D}, ::Type{<:Real}) = D\n", "Base.show(io::IO,x::D) = print(io,x.f[1],x.f[2]<0 ? \" - \" : \" + \",abs(x.f[2]),\" ε\")\n", "ε = D((0,1))" ] }, { "cell_type": "code", "execution_count": null, "id": "437e8442", "metadata": {}, "outputs": [], "source": [ "@show (1+2ɛ)*(3+4ɛ) ; @assert (1+2ɛ)*(3+4ɛ) == 3+10ɛ \"erreur\"\n", "# Add other tests\n", ";" ] }, { "cell_type": "code", "execution_count": null, "id": "15733be0", "metadata": {}, "outputs": [], "source": [ "for a in (0.1, 2, 25) \n", " # Check \n", "end" ] }, { "cell_type": "code", "execution_count": null, "id": "24c36846", "metadata": {}, "outputs": [], "source": [ "# Add plots here" ] }, { "cell_type": "code", "execution_count": null, "id": "69dfb790", "metadata": {}, "outputs": [], "source": [ "function nthrt(a, p=2; x=1, n=10)\n", " # Your code for the n-th root here \n", "end" ] }, { "cell_type": "code", "execution_count": null, "id": "fc33ab1f", "metadata": {}, "outputs": [], "source": [ "a = 2 ; p = 3\n", "# Your test code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Extension to multiple dimensions\n", "\n", "Using dual numbers, write a function `jacobian(f, x)` which returns the Jacobian of `f` at `x`.\n", "Use this function to solve again the first exercise of this notebook\n", "without calculating any derivatives by hand." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "function jacobian(f, x)\n", " # Write your code here\n", "end" ] }, { "cell_type": "markdown", "id": "b0e42447", "metadata": {}, "source": [ "### Application to optimization\n", "\n", "In Julia, automatic differentiation based on dual numbers is implemented in the `ForwardDiff` library.\n", "For example, the following code snippet\n", " ```julia\n", " using ForwardDiff\n", " f(t) = 9.81 * t^2/2\n", " ForwardDiff.derivative(f, 1)\n", " ```\n", "returns the value of $f'(1)$.\n", "Unlike our simple implementation of dual numbers above,\n", "`ForwardDiff` is able to calculate derivatives of higher orders.\n", " For example, \n", " ```julia\n", " using ForwardDiff\n", " f(x) = x -> x[1]^2 + 3x[2]^2\n", " ForwardDiff.hessian(f, [1, 2])\n", " ```\n", "returns the Hessian of the function $f(x, y) = x^2 + 3y^2$ at $(1, 2)$.\n", "Using the `ForwardDiff` library, solve the following exercises.\n", "\n", "\n", "1. We have $n$ points $(x_i, y_i)$ of an unknown function $y = f(x)$. We want to approximate the data with a function of the form\n", " $$\n", " \\widetilde f(x) = \\frac{a}{b + x}\n", " $$\n", " by minimizing\n", " $$\n", " E(a,b) := \\sum_{i=1}^{n} |\\widetilde f(x_i) - y_i|^2\n", " $$\n", " To find the minimizer, we propose to solve the nonlinear system of equations given by\n", " $$\n", " \\begin{aligned}\n", " \\partial_a E(a, b) &= 0, \\\\\n", " \\partial_b E(a, b) &= 0.\n", " \\end{aligned}\n", " $$\n", " Use the Newton-Raphson method to solve these equations, and\n", " then plot on the same graph the given points and the approximating function.\n", "\n", "1. We have $n$ new points $(x_i, y_i)$ of an unknown function $y = f(x)$, and we want to approximate $f$ with an affine function\n", " $$\n", " \\widetilde f(x) = ax+b\n", " $$\n", " by minimizing the sum of Euclidean distances between the points and the line defined by $\\widetilde f$. Given that the distance between a point $(x_i, y_i)$ and the straight line is given by\n", "\n", " $$\n", " \\frac{\\lvert y_i - a x_i - b \\rvert}{\\sqrt{1+a^2}}\n", " $$\n", "\n", " the objective function to minimize is\n", "\n", " $$\n", " J(a, b) := \\sum_{i=1}^{n} \\frac{ \\left( y_i - a x_i - b \\right)^2 }{1+a^2}\n", " $$\n", "\n", " Find the optimal parameters $a$ and $b$ using the Newton-Raphson method and plot on the same graph the line along with the points." ] }, { "cell_type": "code", "execution_count": null, "id": "4fd6ffc8", "metadata": {}, "outputs": [], "source": [ "x = [0.0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1.0]\n", "y = [0.6761488864859304; 0.6345697680852508; 0.6396283580587062; 0.6132010027973919;\n", " 0.5906142598705267; 0.5718728461471725; 0.5524549902830562; 0.538938885654085;\n", " 0.5373495476994958; 0.514904589752926; 0.49243437874655027]\n", "f(a,b) = x -> a / (b+x)\n", "# Votre code ici" ] }, { "cell_type": "code", "execution_count": null, "id": "9c580d22", "metadata": {}, "outputs": [], "source": [ "x = [0.0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1.0]\n", "y = [-0.9187980789440975; -0.6159791344678258; -0.25568734869121856;\n", " -0.14269370171581808; 0.3094396057228459; 0.6318327173549161;\n", " 0.8370437988106428; 1.0970402798788812; 1.6057799131867696;\n", " 1.869090784869698; 2.075369730726694]\n", "f(a,b) = x -> a*x+b\n", "# Votre code ici" ] } ], "metadata": { "citation-manager": { "items": {} }, "kernelspec": { "display_name": "Julia 1.9.0", "language": "julia", "name": "julia-1.9" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.9.0" } }, "nbformat": 4, "nbformat_minor": 5 }