are widely used in many branches of analysis, including optimization. In optimization, our main concern in this book, they are used, among other things, to derive optimality conditions in extremal problems that are described by differentiable functions.

We treat the calculus of scalar-valued functions $f : U \to \mathbb{R}$ or vector-valued functions (maps) $f : U \to \mathbb{R}^m$, where $U$ is an open set in $\mathbb{R}^n$. The vector spaces $\mathbb{R}^n$ and $\mathbb{R}^m$ can be replaced with any finite-dimensional vector spaces over $\mathbb{R}$ without changing any of our results or methods. In fact, most of our results remain true (with some minor changes) if $\mathbb{R}^n$ and $\mathbb{R}^m$ are replaced by Banach spaces. Although this extension can be done in a straightforward manner, we prefer to keep our presentation fairly concrete, and the finite-dimensional vector space setting is sufficient for our needs. We deviate from this rule only in Chapter 3, where we consider differentiable functions in Banach spaces.

The interested reader is referred to the books by Edwards [84] and Spivak [245] for more detailed treatments of calculus in finite-dimensional vector spaces, and to Dieudonné [77] and Hörmander [140] for calculus in Banach spaces. Surveys of differential calculus in even more general vector spaces may be found in the references [11, 12].

1.1 Taylor's Formula

Taylor's formula in one or several variables is needed to obtain necessary and sufficient conditions for local optimal solutions to unconstrained and constrained optimization problems. In this section, we treat Taylor's formula for functions of a single variable. The several-variable version of the formula is treated in later sections of this chapter.

We start with Taylor's formula in Lagrange's form.

Theorem 1.1. Let $f : I = (c, d) \to \mathbb{R}$ be an $n$-times differentiable function. If $a, b$ are distinct points in $I$, then there exists a point $x$ strictly between $a$ and $b$ such that
$$
\begin{aligned}
f(b) &= f(a) + f'(a)(b-a) + \frac{f''(a)}{2}(b-a)^2 + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!}(b-a)^{n-1} + \frac{f^{(n)}(x)}{n!}(b-a)^n \\
&= \sum_{i=0}^{n-1} \frac{f^{(i)}(a)}{i!}(b-a)^i + \frac{f^{(n)}(x)}{n!}(b-a)^n.
\end{aligned}
$$

Note that the case $n = 1$ is precisely the mean value theorem.

Proof. The idea of the proof is similar to that in the case $n = 1$: create a function $g(t)$ such that $g^{(k)}(a) = 0$, $k = 0, \ldots, n-1$, and $g(b) = 0$, and then apply Rolle's theorem repeatedly.

The $(n-1)$th-degree Taylor approximation (polynomial) at $a$,
$$
P_{n-1}(t) = f(a) + f'(a)(t-a) + \frac{f''(a)}{2}(t-a)^2 + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!}(t-a)^{n-1},
$$
satisfies the conditions $P_{n-1}^{(k)}(a) = f^{(k)}(a)$, $k = 0, \ldots, n-1$. Thus, the function $h(t) := f(t) - P_{n-1}(t)$ satisfies the condition $h^{(k)}(a) = 0$, $k = 0, \ldots, n-1$. However, $h(b)$ may not vanish. We rectify the situation by defining
$$
g(t) = f(t) - P_{n-1}(t) - \frac{K}{n!}(t-a)^n,
$$
with the constant $K$ chosen such that $g(b) = 0$, that is,
$$
f(b) = P_{n-1}(b) + \frac{K}{n!}(b-a)^n. \tag{1.1}
$$
Then,
$$
g^{(k)}(a) = f^{(k)}(a) - P_{n-1}^{(k)}(a) = 0, \quad k = 0, \ldots, n-1; \qquad g(b) = 0.
$$

Rolle's theorem implies that there exists $x_1$ strictly between $a$ and $b$ such that $g'(x_1) = 0$. Since $g'(a) = g'(x_1) = 0$, Rolle's theorem applied to $g'$ implies that there exists $x_2$ strictly between $a$ and $x_1$ such that $g''(x_2) = 0$. We continue in this fashion and obtain points $\{x_i\}_{1}^{n-1}$ such that $g^{(i)}(x_i) = 0$ for $i = 1, \ldots, n-1$. Finally, $g^{(n-1)}(a) = g^{(n-1)}(x_{n-1}) = 0$, and applying Rolle's theorem once more, we obtain a point $x_n$ strictly between $a$ and $x_{n-1}$ such that $g^{(n)}(x_n) = 0$. Since $g^{(n)}(t) = f^{(n)}(t) - K$, we have $f^{(n)}(x_n) = K$. Substituting this value of $K$ into equation (1.1) gives the theorem with $x = x_n$. $\square$

This proof is adapted from [268]. In practice, the most useful cases of Taylor's theorem correspond to $n = 1$ and $n = 2$.
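As a numerical illustration of Theorem 1.1, the following minimal Python sketch computes the constant $K$ defined by equation (1.1) and then recovers an intermediate point $x$ with $f^{(n)}(x) = K$. The choice $f = \exp$, $a = 0$, $b = 1$, $n = 3$ and the helper names `taylor_polynomial` and `lagrange_constant` are assumptions made for this example and do not come from the text. The exponential is convenient because every derivative equals $\exp$ itself, so $x = \ln K$ can be written in closed form.

```python
import math


def taylor_polynomial(derivs_at_a, a, t):
    """Evaluate P_{n-1}(t) = sum_{i=0}^{n-1} f^(i)(a) / i! * (t - a)^i.

    derivs_at_a holds f(a), f'(a), ..., f^(n-1)(a).
    """
    return sum(
        d / math.factorial(i) * (t - a) ** i
        for i, d in enumerate(derivs_at_a)
    )


def lagrange_constant(f, derivs_at_a, a, b):
    """Solve equation (1.1) for K: f(b) = P_{n-1}(b) + K / n! * (b - a)^n."""
    n = len(derivs_at_a)
    remainder = f(b) - taylor_polynomial(derivs_at_a, a, b)
    return remainder * math.factorial(n) / (b - a) ** n


# Illustrative choice (an assumption of this sketch): f = exp, a = 0, b = 1, n = 3.
a, b, n = 0.0, 1.0, 3
derivs = [math.exp(a)] * n           # every derivative of exp at a equals exp(a)
K = lagrange_constant(math.exp, derivs, a, b)

# For f = exp we have f^(n)(x) = exp(x), so f^(n)(x) = K gives x = ln K.
x = math.log(K)

# Right-hand side of Taylor's formula in Lagrange's form, evaluated at this x.
rhs = taylor_polynomial(derivs, a, b) + math.exp(x) / math.factorial(n) * (b - a) ** n

print("K =", K)
print("x =", x, "lies strictly between a and b:", a < x < b)
print("f(b) - rhs =", math.exp(b) - rhs)
```

For a general $f$ the equation $f^{(n)}(x) = K$ has no closed-form solution, and a root finder on the interval between $a$ and $b$ would be needed instead of the logarithm used here.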