Basics of Morse Theory and the Poincaré-Hopf Theorem

We mainly focus on Morse Theory by J. Milnor. In Chapter 6, we will use Theorem 11.27 and Lemma 11.20 from From Calculus to Cohomology (by Madsen and Tornehave) to prove the Poincaré-Hopf theorem.

0. Preliminaries

Definition 1 (Homotopy). Let \(f, g: X \to Y\) be continuous maps. We say \(f\) is homotopic to \(g\), denoted by \(f \simeq g\), if there exists a continuous map \(H : X \times I \to Y\), where \(I = [0, 1]\), such that:

\(\forall x \in X, \ H(x, 0) = f(x)\).
\(\forall x \in X, \ H(x, 1) = g(x)\).

Definition 2 (Homotopy Equivalence). Two spaces \(X\) and \(Y\) are homotopy equivalent if there exist continuous maps \(f: X \to Y\) and \(g: Y \to X\) such that:

\(g \circ f \simeq \mathrm{id}_X\)
\(f \circ g \simeq \mathrm{id}_Y\)

Definition 3 (Deformation Retract). Let \(A \subset X\). We call \(A\) a deformation retract of \(X\) if there exists a continuous map \(H: X \times I \to X\) (a deformation retraction of \(X\) onto \(A\)) such that:

\(\forall x \in X, \ H(x, 0) = x\).
\(\forall x \in X, \ H(x, 1) \in A\).
\(\forall a \in A, \forall t \in [0, 1], \ H(a, t) = a\).

Lemma 1 (Pasting Lemma). Let \(X = A \cup B\), where \(A\) and \(B\) are closed subsets of \(X\). If \(f: A \to Y\) and \(g: B \to Y\) are continuous maps such that \(f|_{A \cap B} = g|_{A \cap B}\), then the map \(h: X \to Y\) defined by \[h(x) = \begin{cases} f(x), & x \in A \\ g(x), & x \in B \end{cases}\] is continuous on \(X\).

Definition 4 (CW-complex). A CW-complex is defined by induction on its skeleta:

\(X^0\): A discrete set of points (0-cells).
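\(X^n\): Obtained by attaching \(n\)-cells \(D_\alpha^n\) to the \((n-1)\)-skeleton \(X^{n-1}\) via continuous attaching maps \(\phi_\alpha : S_\alpha^{n-1} \to X^{n-1}\), where \(S_\alpha^{n-1} = \partial D_\alpha^n\). Formally, \[X^n = \left( X^{n-1} \sqcup \bigsqcup_\alpha D_\alpha^n \right) \Big/ \sim,\] where \(\sim\) identifies each \(x \in S_\alpha^{n-1}\) with \(\phi_\alpha(x) \in X^{n-1}\). The complex itself is \(X = \bigcup_n X^n\), equipped with the weak topology.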

1. Introduction

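Let \(M\) be a torus standing on a plane, tangent to it at the point \(p\), and let \(f: M \to \mathbb{R}\) be the height function above the plane. The critical points of \(f\) are \(p, q, r, s\), in order of increasing height. We denote the sublevel set by \(M^a = \{x \in M \mid f(x) \le a\}\).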

As the height \(a\) increases, the topology of \(M^a\) changes as follows:

\(a < 0 = f(p)\): \(M^a\) is empty.
\(f(p) < a < f(q)\): \(M^a\) is homeomorphic to a disk \(D^2\), which is homotopy equivalent to a point (a 0-cell). Crossing \(f(p)\) attaches a 0-cell.
\(f(q) < a < f(r)\): \(M^a\) is homeomorphic to a cylinder \(S^1 \times I \simeq S^1\). Crossing \(f(q)\) attaches a 1-cell.
\(f(r) < a < f(s)\): \(M^a\) is a torus with an open disk removed (Diagrams 1.2 and 1.3), so \(M^a \simeq S^1 \vee S^1\). Crossing \(f(r)\) attaches a 1-cell.
\(f(s) < a\): \(M^a = T^2\) (the full torus). Crossing \(f(s)\) attaches a 2-cell.

Intuitively, near the critical points \(p, q, r, s\), the function \(f\) can be approximated locally as: \[\begin{aligned} &\text{at } p \text{ (minimum)}: \quad f = x^2 + y^2 \\ &\text{at } q, r \text{ (saddles)}: \quad f = C + x^2 - y^2 \\ &\text{at } s \text{ (maximum)}: \quad f = C - x^2 - y^2 \end{aligned}\]

Whenever we cross a critical value \(c\), it turns out that the topological change is completely determined by attaching a \(k\)-cell to \(M^{c-\epsilon}\), where \(k\) is the number of negative signs in the local quadratic form.
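The cell counts above can be checked numerically on one concrete model of the standing torus. The sketch below is only an illustration: the parametrization, the radii `R` and `r`, and the helper names `height`, `hessian` and `index` are assumptions introduced here, not part of the text. It computes the index of each of the four critical points from the Hessian in the angular coordinates and forms the alternating sum of the cell counts.

```python
import math

R, r = 2.0, 1.0  # assumed radii of the torus (R > r > 0)

def height(theta, phi):
    """Height above the tangent plane for an upright torus (assumed model)."""
    return (R + r * math.cos(theta)) * math.cos(phi) + R + r

def hessian(theta, phi):
    """Second partial derivatives (f_tt, f_tp, f_pp) of height in (theta, phi)."""
    f_tt = -r * math.cos(theta) * math.cos(phi)
    f_tp = r * math.sin(theta) * math.sin(phi)
    f_pp = -(R + r * math.cos(theta)) * math.cos(phi)
    return f_tt, f_tp, f_pp

def index(theta, phi):
    """Number of negative eigenvalues of the 2x2 Hessian at a critical point."""
    f_tt, f_tp, f_pp = hessian(theta, phi)
    det = f_tt * f_pp - f_tp ** 2
    if det < 0:
        return 1                      # one negative, one positive eigenvalue
    return 2 if f_tt + f_pp < 0 else 0

# The gradient of height vanishes exactly where sin(theta) = sin(phi) = 0.
critical = [(t, p) for t in (0.0, math.pi) for p in (0.0, math.pi)]
critical.sort(key=lambda tp: height(*tp))   # order p, q, r, s by height

indices = [index(*tp) for tp in critical]
euler = sum((-1) ** k for k in indices)     # alternating sum of cell counts
print(indices, euler)
```

In this model the indices come out in the order minimum, saddle, saddle, maximum, which matches the sequence of 0-, 1-, 1- and 2-cells described above.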

2. Definitions and Lemmas

Definition 5 (Critical point). A point \(p\) is called a critical point of a smooth function \(f\), if for a local coordinate system \((x^1, x^2, \dots, x^n)\) in a neighborhood \(U\) of \(p\), we have \[\frac{\partial f}{\partial x^1}(p) = \dots = \frac{\partial f}{\partial x^n}(p) = 0.\]

Definition 6 (Non-degenerate critical point). A critical point \(p\) is called non-degenerate if and only if the Hessian matrix evaluated at \(p\) is non-singular, i.e., \[\det \left( \frac{\partial^2 f}{\partial x^i \partial x^j}(p) \right) \neq 0.\]

Lemma 2 (Lemma 2.1). Let \(f\) be a \(C^\infty\) function in a convex neighborhood \(V\) of \(0\) in \(\mathbb{R}^n\), with \(f(0)=0\). Then \[f(x_1, \dots, x_n) = \sum_{i=1}^n x_i g_i(x_1, \dots, x_n)\] for some suitable \(C^\infty\) functions \(g_i\) defined in \(V\), satisfying \(g_i(0) = \frac{\partial f}{\partial x_i}(0)\).

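Proof. Since \(V\) is convex, the segment \(\{tx : 0 \le t \le 1\}\) lies in \(V\) for every \(x \in V\). Because \(f(0) = 0\), the fundamental theorem of calculus and the chain rule give: \[f(x_1, \dots, x_n) = \int_0^1 \frac{d}{dt}f(tx_1, \dots, tx_n) \, dt = \int_0^1 \sum_{i=1}^n \frac{\partial f}{\partial x_i}(tx_1, \dots, tx_n) \cdot x_i \, dt.\] Therefore, we can simply define \(g_i(x_1, \dots, x_n) = \int_0^1 \frac{\partial f}{\partial x_i}(tx_1, \dots, tx_n) \, dt\), which is \(C^\infty\) and satisfies \(g_i(0) = \frac{\partial f}{\partial x_i}(0)\). ◻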
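The integral formula for \(g_i\) can be tested numerically. The following sketch assumes a sample function \(f(x, y) = e^x \sin y + x\) (chosen here only because \(f(0,0) = 0\)) and approximates the integrals with a composite Simpson rule; the helper names `simpson`, `g1` and `g2` are hypothetical.

```python
import math

def f(x, y):
    return math.exp(x) * math.sin(y) + x   # sample function with f(0, 0) = 0

def df_dx(x, y):
    return math.exp(x) * math.sin(y) + 1.0

def df_dy(x, y):
    return math.exp(x) * math.cos(y)

def simpson(func, n=200):
    """Composite Simpson rule on [0, 1] with n (even) subintervals."""
    h = 1.0 / n
    total = func(0.0) + func(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(i * h)
    return total * h / 3.0

def g1(x, y):
    """g_1(x, y) = integral over [0, 1] of (df/dx)(tx, ty) dt."""
    return simpson(lambda t: df_dx(t * x, t * y))

def g2(x, y):
    """g_2(x, y) = integral over [0, 1] of (df/dy)(tx, ty) dt."""
    return simpson(lambda t: df_dy(t * x, t * y))

x, y = 0.7, -0.4
residual = f(x, y) - (x * g1(x, y) + y * g2(x, y))
print(residual)
```

The residual should be zero up to quadrature error, and `g1(0, 0)` reproduces the partial derivative of \(f\) at the origin, as the lemma states.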

Lemma 3 (Lemma 2.2, The Morse Lemma). Let \(p\) be a non-degenerate critical point for \(f\). Then there exists a local coordinate system \((y^1, \dots, y^n)\) in a neighborhood \(U\) of \(p\) with \(y^i(p)=0\) for all \(i\), such that the identity \[f = f(p) - (y^1)^2 - \dots - (y^\lambda)^2 + (y^{\lambda+1})^2 + \dots + (y^n)^2\] holds throughout \(U\). Here, the integer \(\lambda\) is called the index of \(f\) at \(p\), which equals the negative index of inertia of the Hessian matrix \(\left( \frac{\partial^2 f}{\partial x^i \partial x^j}(p) \right)\).

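Proof. We first check that \(\lambda\) is determined by \(f\). If coordinates as in the lemma exist, then \(f = f(p) - (y^1)^2 - \dots - (y^\lambda)^2 + (y^{\lambda+1})^2 + \dots + (y^n)^2\), which gives a diagonal Hessian matrix at \(p\): \[\left( \frac{\partial^2 f}{\partial y^i \partial y^j}(p) \right) = \begin{bmatrix} -2 & & & \\ & \ddots & & \\ & & 2 & \\ & & & \ddots \end{bmatrix},\] with exactly \(\lambda\) negative entries. At a critical point the Hessian changes by congruence under a change of coordinates, so by Sylvester's law of inertia \(\lambda\) is the negative index of inertia of the Hessian in any coordinate system. It remains to construct the coordinates. Without loss of generality, assume \(p\) is the origin \(0\) of \(\mathbb{R}^n\), and \(f(p) = 0\).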

Step 1: Express \(f\) as a quadratic form.
By Lemma 2.1, we can write \(f(x) = \sum_{j=1}^n x_j g_j(x)\) in some neighborhood of \(0\). Since \(0\) is a critical point, \(g_j(0) = \frac{\partial f}{\partial x_j}(0) = 0\). Applying Lemma 2.1 again to each \(g_j\), we obtain \(g_j(x) = \sum_{i=1}^n x_i h_{ij}(x)\).

Substituting this back, we get \(f(x) = \sum_{i,j=1}^n x_i x_j h_{ij}(x)\). By defining \(\bar{h}_{ij} = \frac{1}{2}(h_{ij} + h_{ji})\), we may assume without loss of generality that \(h_{ij}(x)\) is symmetric.

Assertion: \(h_{ij}(0) = \frac{1}{2} \frac{\partial^2 f}{\partial x_i \partial x_j}(0)\). Differentiating \(f = \sum_{i,j} x_i x_j h_{ij}\) with respect to \(x_k\) yields: \[\frac{\partial f}{\partial x_k} = \sum_{j} x_j h_{kj} + \sum_{i} x_i h_{ik} + \sum_{i,j} x_i x_j \frac{\partial h_{ij}}{\partial x_k}.\] Differentiating again with respect to \(x_l\) and evaluating at the origin, every term that still carries a factor \(x_i\) vanishes, which gives: \[\frac{\partial^2 f}{\partial x_k \partial x_l}(0) = h_{kl}(0) + h_{lk}(0) = 2h_{kl}(0).\] Thus, the matrix \(\left( h_{ij}(0) \right) = \frac{1}{2} \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(0) \right)\) is non-singular.

Step 2: Imitate the diagonalization process.
By induction, suppose there exist coordinates \(u_1, \dots, u_n\) in a neighborhood \(U_1\) of \(0\) such that \[f = \pm (u_1)^2 \pm \dots \pm (u_{r-1})^2 + \sum_{i,j \ge r} u_i u_j H_{ij}(u_1, \dots, u_n),\] where \((H_{ij})\) is symmetric.

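Ensuring that \(H_{rr}(0) \neq 0\): The matrix \((H_{ij}(0))_{i,j \ge r}\) is non-singular, because the Hessian of \(f\) at \(0\) is non-singular. If \(H_{rr}(0) = 0\), we can perform a linear change of coordinates to fix it:

If \(H_{kk}(0) \neq 0\) for some \(k > r\), we simply swap the coordinates \(u_r\) and \(u_k\).
If \(H_{kk}(0) = 0\) for all \(k \ge r\), then non-singularity forces \(H_{rk}(0) \neq 0\) for some \(k > r\). We apply the substitution \(u_r = \tilde{u}_r + \tilde{u}_k\), \(u_k = \tilde{u}_r - \tilde{u}_k\) (a \(45^\circ\) rotation up to scaling), which converts the cross term \(2 u_r u_k H_{rk}\) into a difference of squares, so that the new coefficient satisfies \(\tilde{H}_{rr}(0) = 2H_{rk}(0) \neq 0\).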

Now, let \(g = \sqrt{|H_{rr}|}\), which is a smooth and non-zero function in a smaller neighborhood \(U_2 \subset U_1\). We define a new set of coordinates: \[v_i = u_i \quad (\text{for } i \neq r), \quad \text{and} \quad v_r = g \left( u_r + \sum_{i>r} u_i \frac{H_{ir}}{H_{rr}} \right).\]

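\((v_1, \dots, v_n)\) is a valid coordinate system: The Jacobian matrix \(J = \left( \frac{\partial v_i}{\partial u_j} \right)\) differs from the identity only in its \(r\)-th row: \[J = \begin{bmatrix} I & 0 & 0\\ \ast & \frac{\partial v_r}{\partial u_r} & \ast\\ 0 & 0 & I\\ \end{bmatrix},\] so \(\det(J) = \frac{\partial v_r}{\partial u_r}\). Evaluated at the origin, \(\frac{\partial v_r}{\partial u_r}(0) = g(0) \neq 0\), since every other term produced by the product rule carries a factor \(u_i\). Thus \(\det(J)(0) \neq 0\), and by the Inverse Function Theorem \((v_1, \dots, v_n)\) is a coordinate system in a smaller neighborhood of \(0\).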

By completing the square, the function in a neighborhood \(U_3 \subset U_2\) becomes: \[f = \sum_{i<r} \pm (v_i)^2 + \mathrm{sgn}(H_{rr}(0)) v_r^2 + \sum_{i,j > r} v_i v_j H'_{ij}(v_1, \dots, v_n),\] which successfully isolates the \(r\)-th term. The proof follows by induction. ◻
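At the level of the constant matrix \((H_{ij}(0))\), Step 2 is the classical diagonalization of a quadratic form by completing squares. The sketch below imitates it with exact rational arithmetic: it handles the swap case and the \(u_r = \tilde{u}_r + \tilde{u}_k\), \(u_k = \tilde{u}_r - \tilde{u}_k\) case, then eliminates the \(r\)-th row and column. The function name `morse_index` is hypothetical, and the code works on a fixed symmetric matrix rather than on the functions \(H_{ij}(u)\), so it only computes the index \(\lambda\), not the coordinates.

```python
from fractions import Fraction

def morse_index(hessian):
    """Count the negative squares of a non-singular symmetric matrix."""
    n = len(hessian)
    a = [[Fraction(x) for x in row] for row in hessian]
    negatives = 0
    for r in range(n):
        if a[r][r] == 0:
            k = next((k for k in range(r + 1, n) if a[k][k] != 0), None)
            if k is not None:
                # Case 1: swap coordinates u_r and u_k (rows, then columns).
                a[r], a[k] = a[k], a[r]
                for row in a:
                    row[r], row[k] = row[k], row[r]
            else:
                k = next((k for k in range(r + 1, n) if a[r][k] != 0), None)
                if k is None:
                    raise ValueError("matrix is singular")
                # Case 2: u_r = ur' + uk', u_k = ur' - uk' (columns, then rows).
                for row in a:
                    row[r], row[k] = row[r] + row[k], row[r] - row[k]
                a[r], a[k] = ([x + y for x, y in zip(a[r], a[k])],
                              [x - y for x, y in zip(a[r], a[k])])
        pivot = a[r][r]
        if pivot < 0:
            negatives += 1
        # Complete the square in u_r: the trailing block becomes its Schur complement.
        for i in range(r + 1, n):
            for j in range(r + 1, n):
                a[i][j] -= a[i][r] * a[r][j] / pivot
        for i in range(r + 1, n):
            a[i][r] = a[r][i] = Fraction(0)
    return negatives

print(morse_index([[0, 1], [1, 0]]))   # Hessian of f = xy at the origin
```

For \(f = xy\) both diagonal entries vanish, so the second case applies and the form becomes \(\tilde{u}_1^2 - \tilde{u}_2^2\), a saddle.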

Corollary 1 (Corollary 2.3). Non-degenerate critical points are isolated.

Proof. By the Morse Lemma, there exists a neighborhood \(U\) of \(p\) where \(f = f(p) - (y^1)^2 - \dots + (y^n)^2\). Setting the gradient to zero yields \(\frac{\partial f}{\partial y^i} = \pm 2y^i = 0 \implies y^i = 0\). Thus, \(p\) (the origin) is the only critical point in \(U\). ◻

Definition 7 (1-parameter group of diffeomorphisms). A 1-parameter group of diffeomorphisms of a manifold \(M\) is a \(C^\infty\) map \(\phi: \mathbb{R} \times M \to M\), such that:

For each \(t \in \mathbb{R}\), the map \(\phi_t : M \to M\) defined by \(\phi_t(q) = \phi(t,q)\) is a diffeomorphism of \(M\) onto itself.
For all \(t,s \in \mathbb{R}, \quad \phi_{t+s} = \phi_t \circ \phi_s\).

Definition 8 (Vector field generated by the group). Given a 1-parameter group \(\phi\) of diffeomorphisms of \(M\), the vector field \(X\) generated by the group is defined as follows: for every smooth real-valued function \(f\), \[X_q(f) = \lim_{h \to 0} \frac{f(\phi_h(q)) - f(q)}{h} = \left. \frac{d}{dt} f(\phi_t(q)) \right|_{t=0}.\]

Lemma 4 (Lemma 2.4). A smooth vector field on \(M\) which vanishes outside of a compact set \(K \subset M\) generates a unique 1-parameter group of diffeomorphisms of \(M\).

Proof. The proof relies on solving the corresponding ordinary differential equation (ODE).

Step 1: Set up the ODE.
We need to find the integral curve \(\phi_t(q)\) satisfying: \[\frac{d\phi_t(q)}{dt} = X_{\phi_t(q)}\] with the initial condition \(\phi_0(q) = q\). (Here, the derivative means \(\frac{d}{dt}f(\phi_t(q)) = X_{\phi_t(q)}(f)\) for any smooth function \(f\)).

Step 2: Local coordinate representation.
For any \(q \in M\), choose a coordinate chart \(\psi: U \to \tilde{U} \subset \mathbb{R}^n\), where \(\psi(q) = (u^1(q), \dots, u^n(q))\).

Step 3: Local existence and uniqueness.
In local coordinates, the vector field equation becomes a standard system of first-order ODEs: \[\frac{d u^j(t)}{dt} = X^j(u^1(t), \dots, u^n(t)), \quad j = 1, 2, \dots, n.\] By the Picard-Lindelöf theorem, this system admits a unique smooth solution locally for \(|t| < \epsilon\).

Step 4: Global extension via compactness.
Since the vector field vanishes outside the compact set \(K\), we only need to worry about points inside \(K\). We can cover \(K\) by a finite number of such neighborhoods \(U_i\), each with a guaranteed survival time \(\epsilon_i\). Let \(\epsilon_0 = \min\{\epsilon_i\} > 0\).

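Since \(\epsilon_0\) is strictly positive, the map \(\phi_t\) is well-defined for all \(|t| < \epsilon_0\) (outside \(K\) we simply set \(\phi_t(q) = q\)), and by uniqueness \(\phi_{t+s} = \phi_t \circ \phi_s\) whenever \(|t|, |s|, |t+s| < \epsilon_0\). For an arbitrary time \(t\), write \(t = k(\epsilon_0/2) + r\) with \(k \in \mathbb{Z}\) and \(|r| < \epsilon_0/2\), and extend the flow by composing finitely many maps: \[\phi_t = \underbrace{\phi_{\epsilon_0/2} \circ \dots \circ \phi_{\epsilon_0/2}}_{k \text{ times}} \circ \phi_{r}\] (for \(k < 0\), use \(\phi_{-\epsilon_0/2}\) iterated \(-k\) times instead). This composition preserves smoothness and the group law (\(\phi_{t+s} = \phi_t \circ \phi_s\)), so each \(\phi_t\) is a globally defined diffeomorphism with inverse \(\phi_{-t}\). ◻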
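The extension in Step 4 can be sketched in code. The example below assumes a simple stand-in: the rotation vector field on the unit circle (a compact manifold, so \(K = M\)), whose local flow is deliberately restricted to \(|s| < \epsilon_0\) to mimic the local existence theorem. The names `EPS0`, `local_flow` and `global_flow` are hypothetical.

```python
import math

EPS0 = 0.5  # assumed uniform time bound from the compactness argument

def local_flow(s, q):
    """Flow of the rotation field on the unit circle, trusted only for |s| < EPS0."""
    if abs(s) >= EPS0:
        raise ValueError("local solution is only defined for |s| < EPS0")
    x, y = q
    return (x * math.cos(s) - y * math.sin(s),
            x * math.sin(s) + y * math.cos(s))

def global_flow(t, q):
    """Extend local_flow to any real t by writing t = k * (EPS0 / 2) + r."""
    half = EPS0 / 2
    k = int(t / half)          # truncates toward zero, so |r| < half
    r = t - k * half
    q = local_flow(r, q)       # innermost map: phi_r
    step = half if k > 0 else -half
    for _ in range(abs(k)):    # then |k| copies of phi_{+-EPS0/2}
        q = local_flow(step, q)
    return q

start = (1.0, 0.0)
there = global_flow(3.3, start)
back = global_flow(-3.3, there)
print(there, back)
```

Flowing forward by \(t\) and then by \(-t\) returns to the starting point, which reflects the group law \(\phi_{-t} \circ \phi_t = \phi_0\).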

Remark 1. The condition that \(X\) vanishes outside a compact set cannot be omitted.
Counterexample: Let the manifold be the open interval \(M = (0,1) \subset \mathbb{R}\), and consider the constant vector field \(X = \frac{d}{dx}\) (i.e., \(X^1(x) = 1\)).

Solving the ODE \(\frac{dx}{dt} = X^1(x)\) yields \(x(t) = x(0) + t\).
Thus, the flow would have to be \(\phi_t(x) = x + t\). However, if we start at \(x = 0.5\) and let \(t = 0.5\), the point reaches \(x = 1.0 \notin M\). The integral curve leaves \(M\) in finite time, so \(\phi_t\) cannot be defined on all of \(M\) for all \(t \in \mathbb{R}\).

3. Homotopy Type in Terms of Critical Values

Theorem 1 (Theorem 3.1). Let \(f\) be a smooth real valued function on a manifold \(M\). Let \(a < b\) and suppose that the set \(f^{-1}[a,b]\), consisting of all \(p \in M\) with \(a \le f(p) \le b\), is compact, and contains no critical points of \(f\). Then \(M^a\) is diffeomorphic to \(M^b\). Furthermore, \(M^a\) is a deformation retract of \(M^b\), so that the inclusion map \(M^a \to M^b\) is a homotopy equivalence.

(Proof omitted.)

Theorem 2 (Theorem 3.2). Let \(f: M \to \mathbb{R}\) be a smooth function, and let \(p\) be a non-degenerate critical point with index \(\lambda\). Setting \(f(p) = c\), suppose that \(f^{-1}[c-\epsilon, c+\epsilon]\) is compact, and contains no critical point of \(f\) other than \(p\), for some \(\epsilon > 0\). Then, for all sufficiently small \(\epsilon\), the set \(M^{c+\epsilon}\) has the homotopy type of \(M^{c-\epsilon}\) with a \(\lambda\)-cell attached.

Proof of Theorem 3.2

Setup:
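Choose \(\epsilon > 0\) small enough that \(f^{-1}[c-\epsilon, c+\epsilon]\) is compact and contains no critical point other than \(p\). By the Morse Lemma, choose a local coordinate system \(u^1, u^2, \dots, u^n\) in a neighborhood \(U\) of \(p\) such that: \[f = c - (u^1)^2 - \dots - (u^\lambda)^2 + (u^{\lambda+1})^2 + \dots + (u^n)^2.\] Shrinking \(\epsilon\) if necessary, we may assume that \(U\) contains the closed ball \(\overline{D(0, 2\epsilon)} = \left\{ (u^1, \dots, u^n) \mid \sum (u^i)^2 \le 2\epsilon \right\}\), which has radius \(\sqrt{2\epsilon}\).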

Let \(e^\lambda\) be the \(\lambda\)-cell attached, given by: \[e^\lambda = \left\{ \sum_{i=1}^\lambda (u^i)^2 \le \epsilon \quad \text{and} \quad u^{\lambda+1} = \dots = u^n = 0 \right\}\]

Let \(\mu: \mathbb{R} \to \mathbb{R}\) be a smooth function satisfying: \[\begin{aligned} &\mu(0) > \epsilon \\ &\mu(r) = 0 \quad \text{for } r \ge 2\epsilon \\ &-1 < \mu'(r) \le 0 \quad \text{for all } r. \end{aligned}\]

Denote \(\xi = \sum_{i=1}^\lambda (u^i)^2\) and \(\eta = \sum_{j=\lambda+1}^n (u^j)^2\). We define a new function \(F\): \[F = f - \mu(\xi + 2\eta) = c - \xi + \eta - \mu(\xi + 2\eta).\]

Assertion 1. The region \(F^{-1}(-\infty, c+\epsilon]\) coincides with the region \(M^{c+\epsilon} = f^{-1}(-\infty, c+\epsilon]\).

Proof.

If \(\xi + 2\eta > 2\epsilon\), then \(\mu(\xi + 2\eta) = 0 \implies F = f\).
If \(\xi + 2\eta \le 2\epsilon\), then: \[F \le f = c - \xi + \eta \le c + \frac{1}{2}\xi + \eta = c + \frac{1}{2}(\xi + 2\eta) \le c + \epsilon,\] so the point lies in both sublevel sets.

Thus, the sublevel sets coincide. ◻

Assertion 2. The critical points of \(F\) are the same as those of \(f\).

Proof. Outside the ellipsoid \(\xi + 2\eta \le 2\epsilon\) we have \(F = f\), so it suffices to work inside \(U\). There we compute the differential of \(F\): \[dF = \frac{\partial F}{\partial \xi} d\xi + \frac{\partial F}{\partial \eta} d\eta = \frac{\partial F}{\partial \xi} \sum_{i=1}^\lambda 2u^i du^i + \frac{\partial F}{\partial \eta} \sum_{j=\lambda+1}^n 2u^j du^j.\] By our construction of \(\mu\) (namely \(-1 < \mu' \le 0\)): \[\frac{\partial F}{\partial \xi} = -1 - \mu'(\xi+2\eta) < -1 - (-1) = 0,\] \[\frac{\partial F}{\partial \eta} = 1 - 2\mu'(\xi+2\eta) \ge 1.\] Since both coefficients are non-zero, \(dF = 0\) requires: \[u^i = 0 \ (1 \le i \le \lambda) \quad \text{and} \quad u^j = 0 \ (\lambda < j \le n).\] Hence, the origin is the only critical point of \(F\) in \(U\), exactly as for \(f\). ◻

Assertion 3. The region \(F^{-1}(-\infty, c-\epsilon]\) is a deformation retract of \(M^{c+\epsilon}\).

Proof. Since \(F \le f\), any point with \(F \ge c-\epsilon\) also has \(f \ge c-\epsilon\); together with Assertion 1 this gives \(F^{-1}[c-\epsilon, c+\epsilon] \subset f^{-1}[c-\epsilon, c+\epsilon]\). Because the latter is compact by assumption and the former is closed, the set \(F^{-1}[c-\epsilon, c+\epsilon]\) is also compact.

Suppose \(F^{-1}[c-\epsilon, c+\epsilon]\) contains a critical point. By Assertion 2, it can only be \(p\) (the origin). But at \(p\): \[F(p) = c - \mu(0) < c - \epsilon,\] so \(p\) does not lie in \(F^{-1}[c-\epsilon, c+\epsilon]\). Thus, this set contains no critical points. According to Theorem 3.1 applied to \(F\), the region \(F^{-1}(-\infty, c-\epsilon]\) is a deformation retract of \(F^{-1}(-\infty, c+\epsilon] = M^{c+\epsilon}\). ◻

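Assertion 4. Let \(H\) denote the closure of \(F^{-1}(-\infty, c-\epsilon] \setminus M^{c-\epsilon}\) (the "handle"), so that \(F^{-1}(-\infty, c-\epsilon] = M^{c-\epsilon} \cup H\). Then \(M^{c-\epsilon} \cup e^\lambda\) is a deformation retract of \(M^{c-\epsilon} \cup H\).

The deformation retraction is constructed by distinguishing three cases (further details omitted). Combined with Assertion 3, this shows that \(M^{c-\epsilon} \cup e^\lambda\) is a deformation retract of \(M^{c+\epsilon}\), which proves Theorem 3.2. ◻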

Remark 2 (Remark 3.3). More generally suppose that there are \(k\) non-degenerate critical points \(p_1, \dots, p_k\) with indices \(\lambda_1, \dots, \lambda_k\) in \(f^{-1}(c)\). Then a similar proof shows that \(M^{c+\epsilon}\) has the homotopy type of \(M^{c-\epsilon} \cup e^{\lambda_1} \cup \dots \cup e^{\lambda_k}\).

Remark 3 (Remark 3.4). A simple modification of the proof of 3.2 shows that the set \(M^c\) is also a deformation retract of \(M^{c+\epsilon}\).

Theorem 3 (Theorem 3.5). If \(f\) is a differentiable function on a manifold \(M\) with no degenerate critical points, and if each \(M^a\) is compact, then \(M\) has the homotopy type of a CW-complex, with one cell of dimension \(\lambda\) for each critical point of index \(\lambda\).

Lemma 5 (Lemma 3.6, Whitehead). Let \(\varphi_0\) and \(\varphi_1\) be homotopic maps from the sphere \(\dot{e}^\lambda\) to \(X\). Then the identity map of \(X\) extends to a homotopy equivalence \[k: X \cup_{\varphi_0} e^\lambda \to X \cup_{\varphi_1} e^\lambda.\]

Lemma 6 (Lemma 3.7). Let \(\varphi: \dot{e}^\lambda \to X\) be an attaching map. Any homotopy equivalence \(f: X \to Y\) extends to a homotopy equivalence \[F: X \cup_{\varphi} e^\lambda \to Y \cup_{f \circ \varphi} e^\lambda.\]

Proof of Theorem 3.5

Case 1: \(M\) is compact.

Let \(c_1 < c_2 < c_3 < \dots\) be the critical values of \(f: M \to \mathbb{R}\).
Since each \(M^a\) is compact, the sequence \(\{c_i\}\) has no cluster point. We proceed by induction on the critical values.

For \(a < c_1\), the set \(M^a = \emptyset\).
Assume that for some \(a\) which is not a critical value, \(M^a\) has the homotopy type of a CW-complex. Let \[h': M^a \xrightarrow{\simeq} K\] be the homotopy equivalence, where \(K\) is a CW-complex. Let \(c\) be the smallest critical value such that \(c > a\).
By Theorem 3.1, for sufficiently small \(\epsilon > 0\), there is a deformation retraction (and thus a homotopy equivalence): \[h: M^{c-\epsilon} \xrightarrow{\simeq} M^a.\]
By Theorem 3.2 and Remark 3.3, \(M^{c+\epsilon}\) is obtained, up to homotopy, by attaching cells to \(M^{c-\epsilon}\): \[M^{c+\epsilon} \simeq M^{c-\epsilon} \cup_{\varphi_1} e^{\lambda_1} \cup \dots \cup_{\varphi_j} e^{\lambda_j},\] where \(j\) is the number of critical points at level \(c\) and the attaching maps are \(\varphi_i: \dot{e}^{\lambda_i} \to M^{c-\epsilon}\) for \(i = 1, \dots, j\).
The composition \(h' \circ h \circ \varphi_i\) maps the boundary \(\dot{e}^{\lambda_i}\) into \(K\). By Cellular Approximation, this continuous map is homotopic to a cellular map: \[h' \circ h \circ \varphi_i \simeq \psi_i,\] where \(\psi_i: \dot{e}^{\lambda_i} \to (\lambda_i - 1)\text{-skeleton of } K\).
By Lemma 3.7 (applied to the homotopy equivalence \(h' \circ h\)) and Lemma 3.6 (applied to the homotopic attaching maps \(h' \circ h \circ \varphi_i \simeq \psi_i\)), the resulting spaces are homotopy equivalent. Therefore: \[M^{c+\epsilon} \simeq K \cup_{\psi_1} e^{\lambda_1} \cup \dots \cup_{\psi_j} e^{\lambda_j}.\] Since each cell is attached along the skeleton of dimension one lower, this new space is again a CW-complex.

This completes the inductive step.

Case 2: \(M\) is not compact.
(Proof omitted.)

6. Manifolds in Euclidean Space

Whitney Embedding Theorem: Any smooth \(n\)-manifold can be differentiably embedded in \(\mathbb{R}^{2n}\).

Let \(M \subset \mathbb{R}^n\) be a manifold of dimension \(k < n\).

Definition 9. We define the space \(N\): \[N = \{ (q, v) : q \in M, \ v \text{ perpendicular to } M \text{ at } q \}\] (\(N\) is an \(n\)-dimensional manifold, and can be differentiably embedded in \(\mathbb{R}^{2n}\)).

Define the endpoint map \(E: N \to \mathbb{R}^n\) by: \[(q, v) \mapsto q + v\]

A point \(e \in \mathbb{R}^n\) is a focal point of \((M, q)\) if:

\(e = q + v\), where \((q, v) \in N\).
The Jacobian matrix of \(E\) at \((q, v)\) is degenerate (singular).

The point \(e\) will be called a focal point of \(M\) if \(e\) is a focal point of \((M, q)\) for some \(q \in M\).

Theorem 4 (Theorem 6.1, Sard). If \(M_1\) and \(M_2\) are differentiable manifolds having a countable basis, of the same dimension, and \(f: M_1 \to M_2\) is of class \(C^1\), then the image of the set of critical points has measure \(0\) in \(M_2\).

Corollary 2 (Corollary 6.2). For almost all \(x \in \mathbb{R}^n\), the point \(x\) is not a focal point of \(M\).

Proof. Consider the map \(E: N \to \mathbb{R}^n\) between manifolds of the same dimension \(n\). A point \(x\) is a focal point \(\iff x \in \{ E(y) \mid y \text{ is a critical point of } E: N \to \mathbb{R}^n \}\), i.e., \(x\) is a critical value of \(E\). By Sard’s Theorem, the set of focal points has measure \(0\). ◻

Local Coordinates and Fundamental Forms:
Let \(u^1, \dots, u^k\) be local coordinates. The embedding function is given by: \[\vec{x}: M \to \mathbb{R}^n, \quad (u^1, \dots, u^k) \mapsto \big( x_1(u^1, \dots, u^k), \dots, x_n(u^1, \dots, u^k) \big)\]

Then we can naturally get:

The first fundamental form: \((g_{ij}) = \left( \frac{\partial \vec{x}}{\partial u^i} \cdot \frac{\partial \vec{x}}{\partial u^j} \right)\)
The second fundamental form: \((h_{ij}) = \left( \vec{n} \cdot \frac{\partial^2 \vec{x}}{\partial u^i \partial u^j} \right) = (\vec{n} \cdot \vec{x}_{ij})\)

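We may choose the coordinates so that \((g_{ij})\), evaluated at \(q\), is the \(k \times k\) identity matrix (as one does with an orthonormal parametrization of a surface in \(\mathbb{R}^3\)). Here \(\vec{n}\) is a fixed unit normal vector to \(M\) at \(q\) (written \(v\) below). Then the principal curvatures \(\kappa_1, \dots, \kappa_k\) of \(M\) at \(q\) in the direction \(\vec{n}\) are the eigenvalues of \((h_{ij})\).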

Lemma 7 (Lemma 6.3). The focal points of \((M, q)\) along the normal line \(l = \{q + tv : t \in \mathbb{R}\}\), where \(v\) is a unit normal vector at \(q\), are precisely the points \(q + \kappa_i^{-1} v\), where \(1 \le i \le k\) and \(\kappa_i \neq 0\). Thus there are at most \(k\) focal points of \((M, q)\) along \(l\), each being counted with its proper multiplicity.

Proof. Choose \(n-k\) vector fields \(\vec{w}_1, \dots, \vec{w}_{n-k}\) along \(M\) that are orthonormal and normal to \(M\) at each point. In the coordinates \((u^1, \dots, u^k, t^1, \dots, t^{n-k})\) on \(N\), the map \(E: N \to \mathbb{R}^n\) is written as: \[\vec{e}(u^1, \dots, u^k, t^1, \dots, t^{n-k}) = \vec{x}(u^1, \dots, u^k) + \sum_{\alpha} t^\alpha \vec{w}_\alpha(u^1, \dots, u^k).\] The partial derivatives are: \[\begin{cases} \frac{\partial \vec{e}}{\partial u^i} = \frac{\partial \vec{x}}{\partial u^i} + \sum_{\alpha} t^\alpha \frac{\partial \vec{w}_\alpha}{\partial u^i} \\ \frac{\partial \vec{e}}{\partial t^\beta} = \vec{w}_\beta \end{cases}\] Let \(J\) be the Jacobian matrix of \(E\), whose rows are these vectors, and let \(C\) be the matrix whose columns are \(\frac{\partial \vec{x}}{\partial u^1}, \dots, \frac{\partial \vec{x}}{\partial u^k}, \vec{w}_1, \dots, \vec{w}_{n-k}\). These vectors are linearly independent, so \(C\) is invertible, and \(J\) is singular \(\iff\) the product \(JC\) is singular, where: \[JC = \begin{pmatrix} \left( \frac{\partial \vec{x}}{\partial u^i} \cdot \frac{\partial \vec{x}}{\partial u^j} + \sum_\alpha t^\alpha \frac{\partial \vec{w}_\alpha}{\partial u^i} \cdot \frac{\partial \vec{x}}{\partial u^j} \right) & \left( \sum_\alpha t^\alpha \frac{\partial \vec{w}_\alpha}{\partial u^i} \cdot \vec{w}_\beta \right) \\ 0 & I \end{pmatrix}.\] Because of the block form, the degeneracy condition simplifies to: \[\det\left( g_{ij} + \sum_\alpha t^\alpha \left( \frac{\partial \vec{w}_\alpha}{\partial u^i} \cdot \frac{\partial \vec{x}}{\partial u^j} \right) \right) = 0.\] Differentiating the orthogonality condition \(\vec{w}_\alpha \cdot \vec{x}_j = 0\) (with \(\vec{x}_j = \frac{\partial \vec{x}}{\partial u^j}\)) yields \(\frac{\partial \vec{w}_\alpha}{\partial u^i} \cdot \vec{x}_j + \vec{w}_\alpha \cdot \vec{x}_{ij} = 0\). Substituting this, using \(g_{ij} = \delta_{ij}\) at \(q\), and writing \(\sum_\alpha t^\alpha \vec{w}_\alpha = t\vec{n}\) with \(\vec{n}\) a unit normal vector, the condition becomes: \[\det\left( I - \sum_\alpha t^\alpha (\vec{w}_\alpha \cdot \vec{x}_{ij}) \right) = 0 \iff \det\left( I - t (\vec{n} \cdot \vec{x}_{ij}) \right) = 0 \iff t = \frac{1}{\kappa_i} \text{ for some } i \text{ with } \kappa_i \neq 0,\] since \(\det(I - t(h_{ij})) = 0\) exactly when \(1/t\) is an eigenvalue of \((h_{ij}) = (\vec{n} \cdot \vec{x}_{ij})\). ◻
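As an illustrative check (an example added here, not taken from the source), let \(M\) be the circle of radius \(\rho\) in \(\mathbb{R}^2\), parametrized by \(\vec{x}(\theta) = \rho(\cos\theta, \sin\theta)\), with inward unit normal \(\vec{n}(\theta) = -(\cos\theta, \sin\theta)\). Then \(E(\theta, t) = \vec{x}(\theta) + t\vec{n}(\theta) = (\rho - t)(\cos\theta, \sin\theta)\), and \[\det \begin{pmatrix} \partial E / \partial \theta \\ \partial E / \partial t \end{pmatrix} = \det \begin{pmatrix} -(\rho - t)\sin\theta & (\rho - t)\cos\theta \\ -\cos\theta & -\sin\theta \end{pmatrix} = (\rho - t)(\sin^2\theta + \cos^2\theta) = \rho - t.\] The Jacobian is singular exactly when \(t = \rho\). In the arc-length parameter \(s = \rho\theta\) we have \(g = 1\) and \(h = \vec{n} \cdot \vec{x}_{ss} = 1/\rho\), so \(\kappa = 1/\rho\), and the only focal point along each normal line is \[q + \kappa^{-1}\vec{n} = \rho(\cos\theta, \sin\theta) - \rho(\cos\theta, \sin\theta) = 0,\] the center of the circle, in agreement with the lemma.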

Distance Squared Function:
Now for a fixed point \(\vec{p} \in \mathbb{R}^n\), we consider the function \(L_p = f: M \to \mathbb{R}\). \[f(\vec{x}(u^1, \dots, u^k)) = \| \vec{x}(u^1, \dots, u^k) - \vec{p} \|^2 = \vec{x} \cdot \vec{x} - 2\vec{x} \cdot \vec{p} + \vec{p} \cdot \vec{p}\] Taking the first derivative: \[\frac{\partial f}{\partial u^i} = 2 \frac{\partial \vec{x}}{\partial u^i} \cdot (\vec{x} - \vec{p})\] Then \(\vec{q}\) is a critical point of \(f \iff \vec{q} - \vec{p}\) is perpendicular to \(M\) at \(\vec{q}\).

Taking the second derivative (Hessian): \[\frac{\partial^2 f}{\partial u^i \partial u^j} = 2 \left( \frac{\partial \vec{x}}{\partial u^i} \cdot \frac{\partial \vec{x}}{\partial u^j} + \frac{\partial^2 \vec{x}}{\partial u^i \partial u^j} \cdot (\vec{x} - \vec{p}) \right).\] At a critical point \(\vec{x} = \vec{q}\), write \(\vec{p} = \vec{q} + t\vec{n}\) (where \(\vec{n}\) is a unit normal vector at \(\vec{q}\) and \(t\) is the distance from \(\vec{q}\) to \(\vec{p}\)). Then \(\vec{x} - \vec{p} = -t\vec{n}\), and the Hessian becomes: \[\frac{\partial^2 f}{\partial u^i \partial u^j}(\vec{q}) = 2(g_{ij} - t\, \vec{n} \cdot \vec{x}_{ij}) = 2(g_{ij} - t h_{ij}).\]

Then we have:

Lemma 8 (Lemma 6.5). The point \(q \in M\) is a degenerate critical point of \(f = L_p\) if and only if \(p\) is a focal point of \((M, q)\).
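A quick numerical illustration of this lemma, under assumptions made here for the example only: \(M\) is the circle of radius `RHO` in the plane with arc-length parameter \(s\), so the Hessian of \(L_p\) at \(q\) reduces to the single number \(2(1 - t\kappa)\) with \(\kappa = 1/\rho\). The helper names are hypothetical, and the second derivative is approximated by a central finite difference.

```python
import math

RHO = 2.0  # assumed radius of the circle M in R^2

def x_of(s):
    """Arc-length parametrization of the circle of radius RHO."""
    return (RHO * math.cos(s / RHO), RHO * math.sin(s / RHO))

def L(p, s):
    """Distance-squared function L_p evaluated at x(s)."""
    x, y = x_of(s)
    return (x - p[0]) ** 2 + (y - p[1]) ** 2

def second_derivative(p, s, h=1e-4):
    """Central finite-difference approximation of d^2 L_p / ds^2."""
    return (L(p, s + h) - 2.0 * L(p, s) + L(p, s - h)) / h ** 2

def point_on_normal(t):
    """p = q + t*n for q = x(0) = (RHO, 0) and inward unit normal n = (-1, 0)."""
    return (RHO - t, 0.0)

kappa = 1.0 / RHO  # curvature of the circle with respect to the inward normal
for t in (0.5, RHO, 3.0):
    numeric = second_derivative(point_on_normal(t), 0.0)
    predicted = 2.0 * (1.0 - t * kappa)   # 2 (g - t h) with g = 1, h = kappa
    print(t, numeric, predicted)
```

The second derivative vanishes only for \(t = \rho\), that is, when \(p\) is the center of the circle, which is the unique focal point along the normal line.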

Theorem 5 (Theorem 6.6). For almost all \(p \in \mathbb{R}^n\) (all but a set of measure \(0\)) the function \(L_p: M \to \mathbb{R}\) has no degenerate critical points.

Corollary 3 (Corollary 6.7). On any manifold \(M\) there exists a differentiable function, with no degenerate critical points, for which each \(M^a\) is compact.

Proof.

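Embed \(M\) in \(\mathbb{R}^{2n}\), where \(n = \dim M\), as a closed subset (Whitney Embedding Theorem).
Choose a point \(p \notin M\) that is not a focal point of \(M\); this is possible by Corollary 6.2. Define \(f(x) = L_p(x) = \|x - p\|^2\). By Lemma 6.5, \(f\) has no degenerate critical points.
\(M^a = \{x \in M \mid f(x) \le a\} = \{x \in M \mid \|x - p\|^2 \le a\}\). Being the intersection of the closed set \(M\) with a closed ball, \(M^a\) is closed and bounded, hence compact.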

◻

Application 1. A differentiable manifold has the homotopy type of a CW-complex. This follows from the above corollary and Theorem 3.5.

Application 2 (Poincaré-Hopf Index Theorem). On a compact manifold \(M\) there is a vector field \(X\) such that the sum of the indices of the critical points of \(X\) equals \(\chi(M)\), the Euler characteristic of \(M\).

Proof. 1. Topological Definitions:
In Chapter 5 of Milnor's Morse Theory, one defines, for a field \(F\):

\(R_\lambda(X, Y) = \lambda\)-th Betti number of \((X, Y) = \dim_F H_\lambda(X, Y; F)\).
\(\chi(X, Y) = \sum (-1)^\lambda R_\lambda(X, Y)\).

When \(Y = \emptyset\), we obtain the Euler characteristic of \(X\): \[\chi(X) = \chi(X, \emptyset) = \sum_{\lambda} (-1)^\lambda R_\lambda(X).\]

Theorem 6 (Theorem 5.2, Weak Morse Inequalities). If \(C_\lambda\) denotes the number of critical points of index \(\lambda\) of a function with no degenerate critical points on the compact manifold \(M\), then \(R_\lambda(M) \le C_\lambda\), and \[\sum (-1)^\lambda R_\lambda(M) = \sum (-1)^\lambda C_\lambda.\]

(Proof omitted.)

2. Vector Field Index:
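According to Theorem 11.27 in From Calculus to Cohomology, for any smooth vector field \(X\) on \(M^n\) with isolated zeros, the total index is a topological invariant: \[\text{Index}(X) = \deg(g)\] (where \(g\) is a Gauss map defined in Thm 11.27).
Since the right-hand side does not depend on \(X\), it suffices to compute the total index of one convenient vector field. We choose \(X = \text{grad}(f)\), where \(f\) is a Morse function, i.e., a smooth function with no degenerate critical points (one exists by Corollary 6.7).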

3. Local Index Calculation:
In a suitable neighborhood of each critical point of \(f\), the Morse Lemma gives coordinates in which: \[f = c - x_1^2 - x_2^2 - \dots - x_\lambda^2 + x_{\lambda+1}^2 + \dots + x_n^2.\] With respect to the Euclidean metric in these coordinates, the gradient vector field is: \[\text{grad}(f) = 2(-x_1, \dots, -x_\lambda, x_{\lambda+1}, \dots, x_n).\] Clearly, \(\text{grad}(f) = 0 \iff x = 0\). (For another Riemannian metric, the Jacobian at the zero is multiplied by a positive definite matrix, which does not change the sign of its determinant.)

The Jacobian matrix of the vector field (which is the Hessian of \(f\)) at \(0\) is: \[\left( \frac{\partial^2 f}{\partial x_i \partial x_j}(0) \right) = \text{diag}(-2, \dots, -2, 2, \dots, 2),\] with \(\lambda\) entries equal to \(-2\). From Lemma 11.20 in From Calculus to Cohomology, the local index of the vector field at a non-degenerate zero \(p_0\) is the sign of the determinant of its Jacobian: \[\text{index}(X, p_0) = \text{sign}(\det D_0 X) = \text{sign}\big((-2)^\lambda 2^{n-\lambda}\big) = (-1)^\lambda.\]

4. Conclusion:
So the sum of the indices of the critical points of \(X\) is: \[\sum \text{index}(X, p_0) = \sum (-1)^\lambda C_\lambda = \chi(M).\] ◻
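Two standard examples illustrate the formula (they are added here as a check and are not part of the source). On the sphere \(S^n \subset \mathbb{R}^{n+1}\) with \(n \ge 1\), the height function has exactly two critical points, a minimum of index \(0\) and a maximum of index \(n\), both non-degenerate, so \(C_0 = C_n = 1\) and \[\chi(S^n) = (-1)^0 + (-1)^n = 1 + (-1)^n.\] For the height function on the torus from Section 1, the critical points \(p, q, r, s\) have indices \(0, 1, 1, 2\), so \[\chi(T^2) = C_0 - C_1 + C_2 = 1 - 2 + 1 = 0.\]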

Corollary 4 (Corollary 6.8). Any bounded smooth function \(f: M \to \mathbb{R}\) can be uniformly approximated by a smooth function \(g\) which has no degenerate critical points. Furthermore \(g\) can be chosen so that the \(i\)-th derivatives of \(g\) on a given compact set \(K \subset M\) uniformly approximate the corresponding derivatives of \(f\), for \(i \le k\).

Proof. Step 1: Choose an embedding \(h: M \to \mathbb{R}^n\) of \(M\) as a bounded subset of some Euclidean space, \[h(x) = (h_1(x), h_2(x), \dots, h_n(x)),\] whose first coordinate is the given function, \(h_1(x) = f(x)\). This is possible because \(f\) is bounded.

Step 2: Choose a reference point \(p = (-c + \epsilon_1, \epsilon_2, \dots, \epsilon_n)\), where \(c\) is a large positive number and the \(\epsilon_i\) are small. By Theorem 6.6, we can choose such a \(p\) so that the distance squared function \(L_p: M \to \mathbb{R}\) has no degenerate critical points.

Step 3: Define a new function \(g(x)\): \[g(x) = \frac{L_p(x) - c^2}{2c}.\] Since \(g\) is obtained from \(L_p\) by an affine transformation with positive slope \(\frac{1}{2c}\), \(g\) has the same critical points as \(L_p\), and they are again non-degenerate. We expand the expression: \[\begin{aligned} g(x) &= \frac{1}{2c} \left[ (h_1 - (-c + \epsilon_1))^2 + \sum_{i=2}^n (h_i - \epsilon_i)^2 - c^2 \right] \\ &= \frac{1}{2c} \left[ h_1^2 + c^2 + \epsilon_1^2 + 2ch_1 - 2\epsilon_1 h_1 - 2c\epsilon_1 + \sum_{i=2}^n (h_i - \epsilon_i)^2 - c^2 \right] \\ &= h_1(x) + \sum_{i=1}^n \frac{h_i^2}{2c} - \sum_{i=1}^n \frac{\epsilon_i h_i}{c} + \sum_{i=1}^n \frac{\epsilon_i^2}{2c} - \epsilon_1 \end{aligned}\]

Step 4: Convergence Analysis.
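Recall that \(h_1(x) = f(x)\). Since the embedding is bounded, the \(h_i\) are bounded on \(M\), and their derivatives up to order \(k\) are bounded on the compact set \(K\), say \(|h_i^{(m)}| \le B\) for all \(m \le k\). When we let \(c \to +\infty\) and \(\epsilon_i \to 0\), every error term in the expansion of \(g\) carries a factor \(\frac{1}{c}\) or \(\epsilon_i\), so: \[g(x) \rightrightarrows f(x)\] (The error terms vanish, meaning \(g\) uniformly approximates \(f\)).
Likewise, differentiating the expansion term by term and taking the limits yields, on \(K\) and for \(m \le k\): \[g^{(m)}(x) \rightrightarrows f^{(m)}(x)\] This completes the proof. ◻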
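The convergence \(g \rightrightarrows f\) can be observed numerically. The sketch below rests on assumptions chosen only for illustration: \(M\) is the circle, \(f(\theta) = \cos 2\theta\), the embedding is \(h = (f, \cos\theta, \sin\theta)\), and the offsets are \(\epsilon_i = 1/c\). It does not verify that \(p\) avoids the focal points (Theorem 6.6 only guarantees this for almost all \(p\)); it only measures the sup-norm error on a grid.

```python
import math

def f(theta):
    return math.cos(2.0 * theta)   # bounded function to approximate (assumed example)

def h(theta):
    """Embedding of the circle into R^3 whose first coordinate is f."""
    return (f(theta), math.cos(theta), math.sin(theta))

def g(theta, c, eps):
    """g = (L_p - c^2) / (2c) with p = (-c + eps_1, eps_2, eps_3)."""
    p = (-c + eps[0], eps[1], eps[2])
    L_p = sum((hi - pi) ** 2 for hi, pi in zip(h(theta), p))
    return (L_p - c * c) / (2.0 * c)

def sup_error(c, samples=720):
    """Largest value of |g - f| over a uniform grid on the circle."""
    eps = (1.0 / c,) * 3   # arbitrary small offsets; focal points are not checked
    grid = [2.0 * math.pi * i / samples for i in range(samples)]
    return max(abs(g(theta, c, eps) - f(theta)) for theta in grid)

errors = [sup_error(c) for c in (10.0, 100.0, 1000.0)]
print(errors)
```

With these choices the error shrinks roughly in proportion to \(1/c\), as the expansion of \(g\) in Step 3 suggests.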