Introduction to Functional Analysis
Vladimir V. Kisil
Abstract: These are lecture notes for several courses on Functional Analysis at the School of Mathematics of the University of Leeds. They are based on the notes of Dr. Matt Daws, Prof. Jonathan R. Partington and Dr. David Salinger used in previous years. However, all misprints, omissions, and errors are my responsibility alone. I am very grateful to Filipa Soares de Almeida, Eric Borgnet, and Pasc Gavruta for pointing out some of them. Please let me know if you find more. The notes are also available for download in PDF.
The suggested textbooks are [, , , ]. The other nice books with many interesting problems are [, ].
Exercises with stars are not part of the mandatory material but are nevertheless worth hearing about. They are not necessarily difficult, so try to solve them!
ℤ+, ℝ+ denote non-negative integers and reals.
x,y,z,… denote vectors.
λ,µ,ν,… denote scalars.
ℜ z, ℑ z stand for the real and imaginary parts of a complex number z.
In this course, the functions we consider will be real or complex valued functions defined on the real line which are locally Riemann integrable. This means that they are Riemann integrable on any finite closed interval [a,b]. (A complex valued function is Riemann integrable iff its real and imaginary parts are Riemann-integrable.) In practice, we shall be dealing mainly with bounded functions that have only a finite number of points of discontinuity in any finite interval. We can relax the boundedness condition to allow improper Riemann integrals, but we then require the integral of the absolute value of the function to converge.
We mention this right at the start to get it out of the way. There are many fascinating subtleties connected with Fourier analysis, but those connected with technical aspects of integration theory are beyond the scope of the course. It turns out that one needs a “better” integral than the Riemann integral: the Lebesgue integral, and I commend the module, Linear Analysis 1, which includes an introduction to that topic which is available to MM students (or you could look it up in Real and Complex Analysis by Walter Rudin). Once one has the Lebesgue integral, one can start thinking about the different classes of functions to which Fourier analysis applies: the modern theory (not available to Fourier himself) can even go beyond functions and deal with generalized functions (distributions) such as the Dirac delta function which may be familiar to some of you from quantum theory.
From now on, when we say “function”, we shall assume the conditions of the first paragraph, unless anything is stated to the contrary.
Before proceeding with the abstract theory, we consider a motivating example: Fourier series.
In this part of the course we deal with functions (as above) that are periodic.
We say a function f:ℝ→ℂ is periodic with period T>0 if f(x+T)= f(x) for all x∈ ℝ. For example, \(\sin x\), \(\cos x\), \(e^{ix}\) \((=\cos x+i\sin x)\) are periodic with period 2π. For k∈ ℝ∖{0}, \(\sin kx\), \(\cos kx\), and \(e^{ikx}\) are periodic with period 2π/|k|. Constant functions are periodic with period T, for any T>0. We shall specialize to periodic functions with period 2π: we call them 2π-periodic functions, for short. Note that \(\cos nx\), \(\sin nx\) and \(e^{inx}\) are 2π-periodic for n∈ℤ. (Of course, for n≠0 these are also 2π/|n|-periodic.)
Any half-open interval of length T is a fundamental domain of a periodic function f of period T. Once you know the values of f on the fundamental domain, you know them everywhere, because any point x in ℝ can be written uniquely as x=w+nT where n∈ ℤ and w is in the fundamental domain. Thus f(x) = f(w+(n−1)T +T)=⋯ =f(w+T) =f(w).
For 2π-periodic functions, we shall usually take the fundamental domain to
be ]−π, π]. By abuse of language, we shall sometimes refer to [−π,
π] as the fundamental domain. We then have to be aware that f(π)=f(−π).
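The decomposition x = w + nT used above can be tried out numerically. The following is a minimal sketch, assuming the fundamental domain ]−T/2, T/2] (so ]−π, π] for T = 2π); the function name is ours and purely illustrative, and floating-point rounding may misplace points lying extremely close to the ends of the domain.

```python
import math

def reduce_to_fundamental_domain(x: float, period: float = 2 * math.pi):
    """Write x = w + n*period with w in ]-period/2, period/2]; return (w, n)."""
    # n is the unique integer with x/period - 1/2 <= n < x/period + 1/2.
    n = math.ceil(x / period - 0.5)
    return x - n * period, n

if __name__ == "__main__":
    w, n = reduce_to_fundamental_domain(7.0)
    # A 2*pi-periodic function takes the same value at x and at w.
    print(w, n, math.sin(7.0), math.sin(w))
```

Note that both −π and π are sent to w = π, in agreement with the convention f(π) = f(−π).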
We shall need to calculate \(\int_a^b e^{ikx}\,dx\), for k∈ℝ. Note first that when k=0, the integrand is the constant function 1, so the result is b−a. For non-zero k,
\[ \int_a^b e^{ikx}\,dx = \int_a^b (\cos kx + i\sin kx)\,dx = \frac{1}{k}\big[\sin kx - i\cos kx\big]_a^b = \frac{1}{ik}\big[\cos kx + i\sin kx\big]_a^b = \frac{1}{ik}\big[e^{ikx}\big]_a^b = \frac{1}{ik}\,(e^{ikb}-e^{ika}). \]
Note that this is exactly the result you would have got by treating i as a real constant and using the usual formula for integrating \(e^{ax}\). Note also that the cases k=0 and k≠0 have to be treated separately: this is typical.
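As a sanity check of this formula, one can compare it with a direct numerical integration. This is only a sketch: the function names and the choice of the composite midpoint rule are ours, not part of the course.

```python
import cmath

def exp_integral(k: float, a: float, b: float) -> complex:
    """Closed form of the integral of e^{ikx} over [a, b]."""
    if k == 0:
        return complex(b - a)  # the integrand is the constant 1
    return (cmath.exp(1j * k * b) - cmath.exp(1j * k * a)) / (1j * k)

def midpoint_rule(k: float, a: float, b: float, steps: int = 20000) -> complex:
    """Approximate the same integral by the composite midpoint rule."""
    h = (b - a) / steps
    return h * sum(cmath.exp(1j * k * (a + (j + 0.5) * h)) for j in range(steps))

if __name__ == "__main__":
    print(exp_integral(3.0, -1.0, 2.0))
    print(midpoint_rule(3.0, -1.0, 2.0))
```

For integer k≠0 and the interval [−π, π] the closed form vanishes up to rounding, which is the orthogonality of the complex exponentials used throughout.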
\[ \hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,e^{-inx}\,dx . \]
\[ \widehat{(cf + dg)}(n) = c\hat f(n) + d\hat g(n) . \]
\[ p(x) = \sum_{n\in\mathbb{Z}} \hat p(n)\,e^{inx} . \]
This follows immediately from Ex. 2 and Prop.4.
\[ f(x) = \sum_{n\in\mathbb{Z}} \hat f(n)\,e^{inx} . \tag{1} \]
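For real-valued functions, the introduction of complex exponentials seems artificial: indeed they can be avoided as follows. We work with (1) in the case of a finite sum: then we can rearrange the sum as
\[ \frac{a_0}{2} + \sum_{n>0} (a_n \cos nx + b_n \sin nx). \]
Here
\[ a_n = \hat f(n) + \hat f(-n) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx \]
for n>0 and
\[ b_n = i\big(\hat f(n) - \hat f(-n)\big) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx \]
for n>0. \(a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx\), the constant chosen for consistency.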
The \(a_n\) and \(b_n\) are also called Fourier coefficients: if it is necessary to distinguish them, we may call them Fourier cosine and sine coefficients, respectively.
We note that if f is real-valued, then the \(a_n\) and \(b_n\) are real numbers and so \(\Re \hat f(n) = \Re \hat f(-n)\), \(\Im \hat f(-n) = -\Im \hat f(n)\): thus \(\hat f(-n)\) is the complex conjugate of \(\hat f(n)\). Further, if f is an even function then all the sine coefficients are 0 and if f is an odd function, all the cosine coefficients are zero. We note further that the sine and cosine coefficients of the functions \(\cos kx\) and \(\sin kx\) themselves have a particularly simple form: \(a_k=1\) in the first case and \(b_k=1\) in the second. All the rest are zero.
For example, we should expect the 2π-periodic function whose value on ]−π,π] is x to have just sine coefficients: indeed this is the case: \(a_n=0\) and \(b_n = i(\hat f(n)-\hat f(-n)) = (-1)^{n+1}\,2/n\) for n>0.
The above question can then be reformulated as “to what extent is f(x) represented by the Fourier series \(a_0/2 + \sum_{n>0}(a_n\cos nx + b_n\sin nx)\)?” For instance, how well does \(\sum_{n>0} (-1)^{n+1}(2/n)\sin nx\) represent the 2π-periodic sawtooth function f whose value on ]−π, π] is given by f(x) = x? The easy points are x=0, x=π, where the terms are identically zero. This gives the ‘wrong’ value for x=π, but, if we look at the periodic function near π, we see that it jumps from π to −π, so perhaps the mean of those values isn’t a bad value for the series to converge to. We could conclude that we had defined the function incorrectly to begin with and that its value at the points (2n+1)π should have been zero anyway. In fact one can show (ref. ) that the Fourier series converges at all other points to the given values of f, but I shan’t include the proof in this course. The convergence is not at all uniform (it can’t be, because the partial sums are continuous functions, but the limit is discontinuous). In particular, at x=π/2 we get the expansion
\[ \frac{\pi}{2} = 2\left(1 - \frac13 + \frac15 - \cdots\right) \]
which can also be deduced from the Taylor series for \(\tan^{-1}\).
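The sawtooth example can be explored numerically. The sketch below (our own illustration, with hypothetical function names and a simple midpoint rule) computes the sine coefficients of f(x) = x on ]−π, π] and evaluates partial sums of the series at a given point.

```python
import math

def sine_coefficient(f, n: int, steps: int = 20000) -> float:
    """b_n = (1/pi) * integral of f(x) sin(nx) over [-pi, pi] (midpoint rule)."""
    h = 2 * math.pi / steps
    total = 0.0
    for j in range(steps):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * math.sin(n * x)
    return total * h / math.pi

def sawtooth_partial_sum(x: float, terms: int) -> float:
    """Partial sum of sum_{n>=1} (-1)^(n+1) (2/n) sin(nx)."""
    return sum((-1) ** (n + 1) * (2.0 / n) * math.sin(n * x)
               for n in range(1, terms + 1))

if __name__ == "__main__":
    for n in range(1, 4):
        print(n, sine_coefficient(lambda x: x, n), (-1) ** (n + 1) * 2.0 / n)
    print(sawtooth_partial_sum(math.pi / 2, 20001), math.pi / 2)
```

The convergence at x = π/2 is slow (the error after N terms is of order 1/N), which is typical for a function with jumps.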
In this subsection we shall discuss the formal solutions of the wave equation in a special case which Fourier dealt with in his work.
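We discuss the wave equation
\[ \frac{\partial^2 y}{\partial x^2} = \frac{1}{K^2}\,\frac{\partial^2 y}{\partial t^2}, \tag{2} \]
subject to the boundary conditions
\[ y(0, t) = y(\pi, t) = 0, \tag{3} \]
for all t≥0, and the initial conditions
\[ y(x,0) = F(x), \qquad y_t(x,0) = 0. \]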
This is a mathematical model of a string on a musical instrument (guitar, harp, violin) which is of length π and is plucked, i.e. held in the shape F(x) and released at time t=0. The constant K depends on the length, density and tension of the string. We shall derive the formal solution (that is, a solution which assumes existence and ignores questions of convergence or of domain of definition).
We first look (as Fourier and others before him did) for solutions of the form y(x,t) = f(x)g(t). Feeding this into the wave equation (2) we get
\[ f''(x)\,g(t) = \frac{1}{K^2}\,f(x)\,g''(t) \]
and so, dividing by f(x)g(t), we have
\[ \frac{f''(x)}{f(x)} = \frac{1}{K^2}\,\frac{g''(t)}{g(t)}. \tag{4} \]
The left-hand side is an expression in x alone, the right-hand side in t alone. The conclusion must be that they are both identically equal to the same constant C, say.
We have f′′(x) −Cf(x) =0 subject to the condition f(0) = f(π) =0. Working through the method of solving linear second order differential equations tells you that the only solutions occur when \(C = -n^2\) for some positive integer n and the corresponding solutions, up to constant multiples, are \(f(x) = \sin nx\).
Returning to equation (4) gives the equation \(g''(t)+K^2n^2g(t) =0\) which has the general solution \(g(t) = a_n\cos Knt + b_n\sin Knt\). Thus the solutions we get through separation of variables, using the boundary conditions but ignoring the initial conditions, are
\[ y_n(x,t) = \sin nx\,(a_n \cos Knt + b_n \sin Knt) , \]
for n≥ 1.
To get the general solution we just add together all the solutions we have got so far, thus
\[ y(x,t) = \sum_{n=1}^{\infty} \sin nx\,(a_n \cos Knt + b_n \sin Knt) \tag{5} \]
ignoring questions of convergence. (We can do this for a finite sum without difficulty because we are dealing with a linear differential equation: the iffy bit is to extend to an infinite sum.)
We now apply the initial condition y(x,0) = F(x) (note F has F(0) =F(π) =0). This gives
\[ F(x) = \sum_{n=1}^{\infty} a_n \sin nx . \]
We apply the reflection trick: the right-hand side is a series of odd functions so if we extend F to a function G by reflection in the origin, giving
\[ G(x) := \begin{cases} F(x), & \text{if } 0\le x\le \pi;\\ -F(-x), & \text{if } -\pi < x < 0, \end{cases} \]
we have
\[ G(x) = \sum_{n=1}^{\infty} a_n \sin nx , \]
for −π≤ x ≤ π.
If we multiply through by \(\sin rx\) and integrate term by term, we get
\[ a_r = \frac{1}{\pi}\int_{-\pi}^{\pi} G(x)\sin rx\,dx \]
so, assuming that this operation is valid, we find that the \(a_n\) are precisely the sine coefficients of G. (Those of you who took Real Analysis 2 last year may remember that a sufficient condition for integrating term-by-term is that the series which is integrated is itself uniformly convergent.)
If we now assume, further, that the right-hand side of (5) is differentiable (term by term) we differentiate with respect to t, and set t=0, to get
\[ 0 = y_t(x,0) = \sum_{n=1}^{\infty} b_n K n \sin nx. \tag{6} \]
This equation is solved by the choice \(b_n=0\) for all n, so we have the following result
\[ y(x,t) = \sum_{n=1}^{\infty} a_n \sin nx \cos Knt , \tag{2.11} \]
where
\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} G(x)\sin nx\,dx . \]
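To see the formal solution at work, here is a small numerical sketch. Everything specific in it is an assumption made for illustration: the triangular initial shape `pluck`, the value K = 2, the truncation of the series at 60 terms and the midpoint rule used for the sine coefficients of the odd extension G.

```python
import math

def pluck(x: float) -> float:
    """Hypothetical initial shape F on [0, pi]: a string plucked in the middle."""
    return x if x <= math.pi / 2 else math.pi - x

def odd_extension(F, x: float) -> float:
    """G(x) = F(x) for x >= 0 and -F(-x) for x < 0."""
    return F(x) if x >= 0 else -F(-x)

def sine_coefficients(F, count: int, steps: int = 4000):
    """a_n = (1/pi) * integral of G(x) sin(nx) over [-pi, pi], n = 1..count."""
    h = 2 * math.pi / steps
    coeffs = []
    for n in range(1, count + 1):
        total = 0.0
        for j in range(steps):
            x = -math.pi + (j + 0.5) * h
            total += odd_extension(F, x) * math.sin(n * x)
        coeffs.append(total * h / math.pi)
    return coeffs

def string_shape(coeffs, K: float, x: float, t: float) -> float:
    """Truncated series: sum of a_n sin(nx) cos(Knt)."""
    return sum(a * math.sin(n * x) * math.cos(K * n * t)
               for n, a in enumerate(coeffs, start=1))

if __name__ == "__main__":
    a = sine_coefficients(pluck, 60)
    print(string_shape(a, 2.0, 1.0, 0.0), pluck(1.0))  # initial shape recovered
    print(string_shape(a, 2.0, 1.0, 0.4))              # shape at a later time
```

The truncated sum satisfies the boundary conditions exactly (each term vanishes at x = 0 and x = π) and reproduces the initial shape only approximately.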
Joseph Fourier, Civil Servant, Egyptologist, and mathematician, was born in 1768 in Auxerre, France, son of a tailor. Debarred by birth from a career in the artillery, he was preparing to become a Benedictine monk (in order to be a teacher) when the French Revolution violently altered the course of history and Fourier’s life. He became president of the local revolutionary committee, was arrested during the Terror, but released at the fall of Robespierre.
Fourier then became a pupil at the Ecole Normale (the teachers’ academy) in Paris, studying under such great French mathematicians as Laplace and Lagrange. He became a teacher at the Ecole Polytechnique (the military academy).
He was ordered to serve as a scientist under Napoleon in Egypt. In 1801, Fourier returned to France to become Prefect of the Grenoble region. Among his most notable achievements in that office were the draining of some 20 thousand acres of swamps and the building of a new road across the Alps.
During that time he wrote an important survey of Egyptian history (“a masterpiece and a turning point in the subject”).
In 1804 Fourier started the study of the theory of heat conduction, in the course of which he systematically used the sine-and-cosine series which are named after him. At the end of 1807, he submitted a memoir on this work to the Academy of Science. The memoir proved controversial both in terms of his use of Fourier series and of his derivation of the heat equation and was not accepted at that stage. He was able to resubmit a revised version in 1811: this had several important new features, including the introduction of the Fourier transform. With this version of his memoir, he won the Academy’s prize in mathematics. In 1817, Fourier was finally elected to the Academy of Sciences and in 1822 his 1811 memoir was published as “Théorie de la Chaleur”.
For more details see Fourier Analysis by T.W. Körner, 475-480 and for even more, see the biography by J. Herivel Joseph Fourier: the man and the physicist.
What is Fourier analysis? The idea is to analyse functions (into sines and cosines or, equivalently, complex exponentials) to find the underlying frequencies, their strengths (and phases) and, where possible, to see if they can be recombined (synthesis) into the original function. The answers will depend on the original properties of the functions, which often come from physics (heat, electronic or sound waves). This course will give a basically mathematical treatment and so will be interested in mathematical classes of functions (continuity, differentiability properties).
A person is solely the concentration of an infinite set of interrelations with another and others, and to separate a person from these relations means to take away any real meaning of the life.
Vl. Soloviev
The space around us could be described as a three dimensional Euclidean space. To single out a point of that space we need a fixed frame of reference and three real numbers, which are the coordinates of the point. Similarly, to describe a pair of points from our space we could use six coordinates; for three points, nine; and so on. This makes it reasonable to consider Euclidean (linear) spaces of an arbitrary finite dimension, which are studied in courses of linear algebra.
The basic properties of Euclidean spaces are determined by their linear and metric structures. The linear space (or vector space) structure allows us to add and subtract vectors associated to points as well as to multiply vectors by real or complex numbers (scalars).
The metric space structure assigns a distance (a non-negative real number) to a pair of points or, equivalently, defines the length of the vector defined by that pair. A metric (or, more generally, a topology) is essential for the definition of core analytical notions like limit or continuity. The importance of linear and metric (topological) structure in analysis is sometimes encoded in the formula:
\[ \text{Analysis} = \text{Algebra} + \text{Geometry}. \tag{7} \]
On the other hand we could observe that many sets admit a sort of linear and metric structures which are linked to each other. Just a few among many other examples are:
It is a very mathematical way of thinking to declare such sets to be spaces and call their elements points.
But shall we lose all information on a particular element (e.g. a sequence {1/n}) if we represent it by a shapeless and size-less “point” without any inner configuration? Surprisingly not: all properties of an element can now be retrieved not from its inner configuration but from its interactions with other elements through linear and metric structures. Such a “sociological” approach to all kinds of mathematical objects was codified in abstract category theory.
Another surprise is that starting from our three dimensional Euclidean space and walking far away along the road of abstraction to infinite dimensional Hilbert spaces, we arrive at yet another picture of the surrounding space, this time in the language of quantum mechanics.
The distance from Manchester to Liverpool is 35 miles—just about the mileage in the opposite direction!
A tourist guide to England
The following definition generalises the notion of distance known from everyday life.
The following notion is a useful specialisation of a metric adapted to the linear structure.
The connection between norm and metric is as follows:
Proof.
This is a simple exercise to derive items 1–3 of Definition 1 from corresponding items of Definition 3. For example, see the Figure 1 to derive the triangle inequality.
Important notions known from real analysis are limit and convergence. In particular, we usually wish to have enough limiting points for all “reasonable” sequences.
For example, the set of integers ℤ and reals ℝ with the natural distance functions are complete spaces, but the set of rationals ℚ is not. The complete normed spaces deserve a special name.
\[ \|(x_1,\ldots,x_n)\|_2 = \sqrt{|x_1|^2 + \cdots + |x_n|^2}. \]
\[ \|(x_1,\ldots,x_n)\|_1 = |x_1| + \cdots + |x_n|. \]
\[ \|(x_1,\ldots,x_n)\|_\infty = \max(|x_1|, \ldots, |x_n|). \]
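For concreteness, the three norms above can be computed for vectors with real or complex coordinates as follows. This is a minimal sketch; the function names are ours.

```python
import math

def norm_1(x) -> float:
    """Sum of the absolute values of the coordinates."""
    return sum(abs(c) for c in x)

def norm_2(x) -> float:
    """Euclidean norm: square root of the sum of squared absolute values."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def norm_inf(x) -> float:
    """Largest absolute value among the coordinates."""
    return max(abs(c) for c in x)

if __name__ == "__main__":
    v = (3, -4)
    print(norm_1(v), norm_2(v), norm_inf(v))
```

On the same vector the three norms generally give different values, but one always has the chain ‖x‖∞ ≤ ‖x‖2 ≤ ‖x‖1.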
—We need an extra space to accommodate this product!
A manager to a shop assistant
Although metric and norm capture important geometric information about linear spaces, they are not sensitive enough to represent such geometric characteristics as angles (particularly orthogonality). To this end we need a further refinement.
It is known from courses of linear algebra that the scalar product \(\langle x,y\rangle = x_1 y_1 + \cdots + x_n y_n\) is important in the space \(\mathbb{R}^n\) and defines a norm by \(\|x\|^2 = \langle x,x\rangle\). Here is a suitable generalisation:
The last two properties of the scalar product are often encoded in the phrase: “it is linear in the first variable if we fix the second and anti-linear in the second if we fix the first”.
\[ l_2 = \Big\{ \text{sequences } \{x_j\}_1^\infty \;\Big|\; \sum_{j=1}^{\infty} |x_j|^2 < \infty \Big\}. \tag{8} \]
\[ \langle f,g\rangle = \int_a^b f(x)\,\bar g(x)\,dx \quad\text{and}\quad \|f\|_2 = \left(\int_a^b |f(x)|^2\,dx\right)^{1/2}. \tag{9} \]
Now we state, probably, the most important inequality in analysis.
\[ |\langle x,y\rangle| \le \|x\|\,\|y\|, \tag{10} \]
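Proof. For any x, y∈ V and any t∈ℝ we have:
\[ 0 \le \langle x+ty, x+ty\rangle = \langle x,x\rangle + 2t\,\Re\langle y,x\rangle + t^2\langle y,y\rangle. \]
Thus the discriminant of this quadratic expression in t is non-positive: \((\Re\langle y,x\rangle)^2 - \|x\|^2\|y\|^2 \le 0\), that is \(|\Re\langle x,y\rangle| \le \|x\|\,\|y\|\). Replacing y by \(e^{i\alpha}y\) for an arbitrary α∈[−π,π] we get \(|\Re(e^{i\alpha}\langle x,y\rangle)| \le \|x\|\,\|y\|\). Choosing α such that \(e^{i\alpha}\langle x,y\rangle\) is real and non-negative, we obtain the desired inequality.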
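A quick numerical illustration of the Cauchy–Schwarz inequality in \(\mathbb{C}^n\), with the inner product taken linear in the first variable and anti-linear in the second. This is a sketch with names of our own choosing; the sample vectors are arbitrary.

```python
import math

def inner(x, y) -> complex:
    """<x, y> = sum of x_k * conj(y_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x) -> float:
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)

if __name__ == "__main__":
    x = [1 + 2j, -1j, 3]
    y = [2, 1 - 1j, -1 + 4j]
    print(abs(inner(x, y)), norm(x) * norm(y))   # left side <= right side
    z = [(2 - 1j) * a for a in x]                # z is a multiple of x
    print(abs(inner(x, z)), norm(x) * norm(z))   # equality, up to rounding
```

The second pair of numbers illustrates that the inequality becomes an equality when the two vectors are proportional.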
Proof. It suffices to check items 1–3 from Definition 3.
Again, complete inner product spaces deserve a special name.
The relations between spaces introduced so far are as follows:
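\[ \begin{array}{ccccc}
\text{Hilbert spaces} & \Rightarrow & \text{Banach spaces} & \Rightarrow & \text{complete metric spaces} \\
\Downarrow & & \Downarrow & & \Downarrow \\
\text{inner product spaces} & \Rightarrow & \text{normed spaces} & \Rightarrow & \text{metric spaces.}
\end{array} \]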
How can we tell if a given norm comes from an inner product?
\[ \|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2. \tag{11} \]
Proof. Just by linearity of the inner product:
\[ \langle x+y,x+y\rangle + \langle x-y,x-y\rangle = 2\langle x,x\rangle + 2\langle y,y\rangle, \]
because the cross terms cancel out.
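The parallelogram identity gives a practical test for whether a norm can come from an inner product. The following sketch (function names are ours) computes the defect of the identity for the Euclidean norm and for the norm given by the sum of absolute values on \(\mathbb{R}^2\).

```python
import math

def norm_1(x) -> float:
    return sum(abs(c) for c in x)

def norm_2(x) -> float:
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def parallelogram_defect(norm, x, y) -> float:
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 for the given norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

if __name__ == "__main__":
    x, y = (1, 0), (0, 1)
    print(parallelogram_defect(norm_2, x, y))  # zero up to rounding
    print(parallelogram_defect(norm_1, x, y))  # non-zero
```

A non-zero defect for a single pair of vectors already shows that the norm in question is not induced by any inner product.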
|
Divide and rule!
Old but still much used recipe
To study Hilbert spaces we may use the traditional mathematical technique of analysis and synthesis: we split the initial Hilbert spaces into smaller and probably simpler subsets, investigate them separately, and then reconstruct the entire picture from these parts.
As is known from linear algebra, a linear subspace is a subset of a linear space which inherits the linear structure, i.e. the possibility to add vectors and multiply them by scalars. In this course we also need subspaces to inherit the topological structure (coming either from a norm or an inner product).
We also wish that both inherited structures (linear and topological) should be in agreement, i.e. the subspace should be complete. Such inheritance is linked to the property of being closed.
A subspace need not be closed. For example, the sequence
\[ x = (1, 1/2, 1/3, 1/4, \ldots) \in l_2 \quad\text{because}\quad \sum 1/k^2 < \infty, \]
and \(x_n = (1, 1/2, \ldots, 1/n, 0, 0, \ldots) \in c_{00}\) converges to x, thus \(x \in \overline{c_{00}} \subset l_2\), although \(x \notin c_{00}\).
Proof.
\[ \|(x_n+y_n)-(x+y)\| \le \|x_n-x\| + \|y_n-y\| \to 0, \]
Hence \(c_{00}\) is an incomplete inner product space, with inner product \(\langle x,y\rangle = \sum_{k=1}^{\infty} x_k \bar y_k\) (this is a finite sum!), as it is not closed in \(l_2\).
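The convergence of the truncated sequences to x = (1, 1/2, 1/3, …) in the \(l_2\) norm can be made quantitative: the squared distance is the tail of the series ∑1/k². The sketch below uses the classical value ∑1/k² = π²/6, which is an ingredient we bring in from outside these notes.

```python
import math

def distance_to_truncation(n: int) -> float:
    """l2-distance between x = (1, 1/2, 1/3, ...) and (1, ..., 1/n, 0, 0, ...)."""
    head = sum(1.0 / k ** 2 for k in range(1, n + 1))
    return math.sqrt(math.pi ** 2 / 6 - head)  # square root of the tail sum

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, distance_to_truncation(n))
```

The distances decrease roughly like \(n^{-1/2}\), so the limit x is approached by elements of \(c_{00}\) without itself having finitely many non-zero entries.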
Similarly C[0,1] with inner product norm \(\|f\| = \big(\int_0^1 |f(t)|^2\,dt\big)^{1/2}\) is incomplete. Take the larger space X of functions continuous on [0,1] except for a possible jump at 1/2 (i.e. left and right limits exist but may be unequal, and \(f(1/2) = \lim_{t\to 1/2+} f(t)\)). Then the sequence of functions defined in Figure 4(a) has the limit shown in Figure 4(b) since:
| ⎪⎪ ⎪⎪ | f−fn | ⎪⎪ ⎪⎪ | = |
| ⎪ ⎪ | f−fn | ⎪ ⎪ | 2 dt < |
| → 0. |
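Obviously \(f \in \overline{C[0,1]} \setminus C[0,1]\).
Similarly the space C[a,b] is incomplete for any a<b if equipped with the inner product and the corresponding norm:
\[ \langle f,g\rangle = \int_a^b f(t)\,\bar g(t)\,dt, \qquad \|f\|_2 = \left(\int_a^b |f(t)|^2\,dt\right)^{1/2}. \]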
It is practical to realise L2[a,b] as a certain space of “functions” with the inner product defined via an integral. There are several ways to do that and we mention just two:
| f(t)= | ⎧ ⎨ ⎩ |
|
| ⟨ f1,f2 ⟩= | ∫ |
| f1(z) f2(z)e |
| dz. |
Proof. Take a Cauchy sequence \(x^{(n)} \in l_2\), where \(x^{(n)} = (x_1^{(n)}, x_2^{(n)}, x_3^{(n)}, \ldots)\). Our proof will have three steps: identify the limit x; show it is in \(l_2\); show \(x^{(n)} \to x\).
\[ \big|x_k^{(n)} - x_k^{(m)}\big| \le \left(\sum_{j=1}^{\infty} \big|x_j^{(n)} - x_j^{(m)}\big|^2\right)^{1/2} = \big\|x^{(n)} - x^{(m)}\big\| \to 0. \]
\[ \big|x_k^{(n)} - x_k^{(m)}\big|^2 \le \big\|x^{(n)} - x^{(m)}\big\|^2 < \epsilon^2. \]
Consequently l2 is complete.
All good things are covered by a thick layer of chocolate (well, if something is not yet–it certainly will)
As was explained in Introduction 2, we describe “internal” properties of a vector through its relations to other vectors. For a detailed description we need sufficiently many external reference points.
Let A be a subset (finite or infinite) of a normed space V. We may wish to upgrade it to a linear subspace in order to make it subject to our theory.
Proof. Clearly \(\overline{\mathrm{Lin}(A)}\) is a closed subspace containing A, thus it should contain CLin(A). Also Lin(A)⊂ CLin(A), thus \(\overline{\mathrm{Lin}(A)} \subset \overline{\mathrm{CLin}(A)} = \mathrm{CLin}(A)\). Therefore \(\overline{\mathrm{Lin}(A)} = \mathrm{CLin}(A)\).
Consequently CLin(A) is the set of all limiting points of finite linear combinations of elements of A.
The following simple result will be used later many times without comments.
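\[ \lim_{n\to\infty} \langle x_n,y_n\rangle = \Big\langle \lim_{n\to\infty} x_n,\ \lim_{n\to\infty} y_n \Big\rangle. \]
Proof. Obviously by the Cauchy–Schwarz inequality, writing x and y for the limits of \(x_n\) and \(y_n\):
\[ |\langle x_n,y_n\rangle - \langle x,y\rangle| = |\langle x_n-x,y_n\rangle + \langle x,y_n-y\rangle| \le \|x_n-x\|\,\|y_n\| + \|x\|\,\|y_n-y\| \to 0, \]
since \(\|x_n-x\|\to 0\), \(\|y_n-y\|\to 0\), and \(\|y_n\|\) is bounded.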
Pythagoras is forever!
The catchphrase from TV commercial of Hilbert Spaces course
As was mentioned in the introduction, a Hilbert space is an analogue of our 3D Euclidean space and the theory of Hilbert spaces is similar to plane or space geometry. One of the primary results of Euclidean geometry, which still survives in the high school curriculum despite its continuous nasty de-geometrisation, is Pythagoras’ theorem, based on the notion of orthogonality.
So far we were concerned only with distances between points. Now we would like to study angles between vectors and notably right angles. Pythagoras’ theorem states that if the angle C in a triangle is right then \(c^2=a^2+b^2\), see Figure 5.
It is a very mathematical way of thinking to turn this property of right angles into their definition, which will work even in infinite dimensional Hilbert spaces.
Look for a triangle, or even for a right triangle
A universal advice in solving problems from elementary geometry.
In inner product spaces it is even more convenient to give a definition of orthogonality not from Pythagoras’ theorem but from an equivalent property of inner product.
An orthogonal sequence (or orthogonal system) en (finite or infinite) is one in which en ⊥ em whenever n≠ m.
An orthonormal sequence (or orthonormal system) en is an orthogonal sequence with ||en||=1 for all n.
\[ \Big\| \sum_{k=1}^{n} a_k e_k \Big\|^2 = \Big\langle \sum_{k=1}^{n} a_k e_k,\ \sum_{k=1}^{n} a_k e_k \Big\rangle = \sum_{k=1}^{n} |a_k|^2. \]
Proof. A one-line calculation.
The following theorem provides an important property of Hilbert spaces
which will be used many times. Recall, that a subset K of a linear
space V is convex if for all x,
y∈ K and λ∈ [0,1] the point λ x
+(1−λ)y is also in K. Particularly any subspace is convex
and any unit ball as well (see Exercise 1).
Proof. Let \(d = \inf_{y\in K} d(x,y)\), where d(x,y) is the distance coming from the norm \(\|x\| = \sqrt{\langle x,x\rangle}\), and let \(y_n\) be a sequence of points in K such that \(\lim_{n\to\infty} d(x,y_n) = d\). Then \(y_n\) is a Cauchy sequence. Indeed, from the parallelogram identity for the parallelogram generated by vectors \(x-y_n\) and \(x-y_m\) we have:
\[ \|y_n-y_m\|^2 = 2\|x-y_n\|^2 + 2\|x-y_m\|^2 - \|2x-y_n-y_m\|^2. \]
Note that \(\|2x-y_n-y_m\|^2 = 4\big\|x-\tfrac{y_n+y_m}{2}\big\|^2 \ge 4d^2\) since \(\tfrac{y_n+y_m}{2}\in K\) by its convexity. For sufficiently large m and n we get \(\|x-y_m\|^2 \le d^2+\epsilon\) and \(\|x-y_n\|^2 \le d^2+\epsilon\), thus \(\|y_n-y_m\|^2 \le 4(d^2+\epsilon) - 4d^2 = 4\epsilon\), i.e. \(y_n\) is a Cauchy sequence.
Let y be the limit of \(y_n\), which exists by the completeness of H, then y∈ K since K is closed. Then \(d(x,y) = \lim_{n\to\infty} d(x,y_n) = d\). This shows the existence of the nearest point. Let y′ be another point in K such that d(x,y′)=d, then the parallelogram identity implies:
\[ \|y-y'\|^2 = 2\|x-y\|^2 + 2\|x-y'\|^2 - \|2x-y-y'\|^2 \le 4d^2 - 4d^2 = 0. \]
This shows the uniqueness of the nearest point.
Liberte, Egalite, Fraternite!
A longstanding ideal approximated in the real life by something completely different
For the case when the convex subset is a subspace we can characterise the nearest point in terms of orthogonality.
Proof. Let z be the nearest point to x, which exists by the previous Theorem. We claim that x−z is orthogonal to any vector in M. Otherwise there exists y∈ M such that ⟨ x−z,y ⟩≠ 0; replacing y by iy if necessary, we may assume that ℜ⟨ x−z,y ⟩≠ 0. Then
\[ \|x-z-\epsilon y\|^2 = \|x-z\|^2 - 2\epsilon\,\Re\langle x-z,y\rangle + \epsilon^2\|y\|^2 < \|x-z\|^2, \]
if є is chosen to be small enough and such that є ℜ⟨ x−z,y ⟩ is positive, see Figure 6(i). Since z+єy∈ M, we get a contradiction with the statement that z is the closest point to x.
On the other hand, if x−z is orthogonal to all vectors in M then in particular (x−z)⊥ (z−y) for all y∈ M, see Figure 6(ii). Since x−y=(x−z)+(z−y) we get by Pythagoras’ theorem:
\[ \|x-y\|^2 = \|x-z\|^2 + \|z-y\|^2. \]
So \(\|x-y\|^2 \ge \|x-z\|^2\) and they are equal if and only if z=y.
Consider now a basic case of approximation: let x∈ H be fixed and e1, …, en be orthonormal and denote H1=Lin{e1,…,en}. We could try to approximate x by a vector y=λ1 e1+⋯ +λn en ∈ H1.
Proof. Let \(z = \sum_{i=1}^{n} \langle x,e_i\rangle e_i\), then \(\langle x-z,e_i\rangle = \langle x,e_i\rangle - \langle z,e_i\rangle = 0\). By the previous Theorem z is the nearest point to x.
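The nearest point z = ∑⟨x,e_i⟩e_i is easy to compute in coordinates. The following sketch works in \(\mathbb{C}^n\) with the standard inner product; the helper names and the sample vectors are our own choices for illustration.

```python
import math

def inner(x, y) -> complex:
    """<x, y> = sum of x_k * conj(y_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def distance(x, y) -> float:
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, y)))

def project(x, orthonormal):
    """z = sum of <x, e> e over the given orthonormal vectors e."""
    z = [0j] * len(x)
    for e in orthonormal:
        c = inner(x, e)
        z = [zi + c * ei for zi, ei in zip(z, e)]
    return z

if __name__ == "__main__":
    r = 1 / math.sqrt(2)
    e1, e2 = (r, r, 0.0), (0.0, 0.0, 1.0)
    x = (3.0, 1.0, 2.0)
    z = project(x, [e1, e2])
    residual = [a - b for a, b in zip(x, z)]
    print(z, inner(residual, e1), inner(residual, e2))
```

In this example z = (2, 2, 2) and the residual x − z = (1, −1, 0) is orthogonal to both e1 and e2, as the proof predicts.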
| z=⟨ x,e1 ⟩e1+⟨ x,e2 ⟩e2= | ⎛ ⎜ ⎜ ⎝ |
| ,− |
| ,0 | ⎞ ⎟ ⎟ ⎠ | + | ⎛ ⎜ ⎜ ⎝ |
| , |
| ,− |
| ⎞ ⎟ ⎟ ⎠ | = | ⎛ ⎜ ⎜ ⎝ |
| ,− |
| ,− |
| ⎞ ⎟ ⎟ ⎠ | . |
\[ e_0 = \frac{1}{\sqrt{2\pi}}, \qquad e_1 = \frac{1}{\sqrt{2\pi}}\,e^{it}, \qquad e_{-1} = \frac{1}{\sqrt{2\pi}}\,e^{-it}. \]
|
|
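\[ \|x\|^2 \ge \sum_{i=1}^{n} |\langle x,e_i\rangle|^2. \]
Proof. Let \(z = \sum_{i=1}^{n} \langle x,e_i\rangle e_i\), then x−z⊥ \(e_i\) for all i, therefore by Exercise 4 x−z⊥ z. Hence:
\[ \|x\|^2 = \|z\|^2 + \|x-z\|^2 \ge \|z\|^2 = \sum_{i=1}^{n} |\langle x,e_i\rangle|^2. \]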
—Did you say “rice and fish for them”?
A student question
When (ei) is orthonormal we call ⟨ x,en ⟩ the nth Fourier coefficient of x (with respect to (ei), naturally).
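Proof. Necessity: Let \(x_k = \sum_{n=1}^{k} \lambda_n e_n\) and \(x = \lim_{k\to\infty} x_k\). So \(\langle x,e_n\rangle = \lim_{k\to\infty} \langle x_k,e_n\rangle = \lambda_n\) for all n. By Bessel’s inequality for all k
\[ \|x\|^2 \ge \sum_{n=1}^{k} |\langle x,e_n\rangle|^2 = \sum_{n=1}^{k} |\lambda_n|^2, \]
hence \(\sum_{n=1}^{\infty} |\lambda_n|^2\) converges and the sum is at most \(\|x\|^2\).
Sufficiency: Consider \(\|x_k-x_m\| = \big\|\sum_{n=m+1}^{k} \lambda_n e_n\big\| = \big(\sum_{n=m+1}^{k} |\lambda_n|^2\big)^{1/2}\) for k>m. Since \(\sum_{n=1}^{\infty} |\lambda_n|^2\) converges, \(x_k\) is a Cauchy sequence in H and thus has a limit x. By Pythagoras’ theorem \(\|x_k\|^2 = \sum_{n=1}^{k} |\lambda_n|^2\), thus for k→ ∞ we get \(\|x\|^2 = \sum_{n=1}^{\infty} |\lambda_n|^2\) by the Lemma about the inner product limit.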
Observation: the closed linear span
of an orthonormal sequence in any Hilbert space looks like
l2, i.e. l2 is a universal model for a
Hilbert space.
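By Bessel’s inequality and the Riesz–Fischer theorem we know that the series \(\sum_{i=1}^{\infty} \langle x,e_i\rangle e_i\) converges for any x∈ H. What is its limit?
Let \(y = x - \sum_{i=1}^{\infty} \langle x,e_i\rangle e_i\), then
\[ \langle y,e_k\rangle = \langle x,e_k\rangle - \sum_{i=1}^{\infty} \langle x,e_i\rangle \langle e_i,e_k\rangle = \langle x,e_k\rangle - \langle x,e_k\rangle = 0 \quad\text{for all } k. \tag{16} \]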
A complete orthonormal sequence is also called an orthonormal basis in H.
\[ x = \sum_{n=1}^{\infty} \langle x,e_n\rangle e_n \quad\text{and}\quad \|x\|^2 = \sum_{n=1}^{\infty} |\langle x,e_n\rangle|^2. \]
Proof. By the Riesz–Fischer theorem, equation (16) and the definition of an orthonormal basis.
There are constructive existence theorems in mathematics.
An example of pure existence statement
Natural questions are: Do orthonormal sequences always exist? Could we construct them?
\[ \mathrm{Lin}\{x_1,x_2,\ldots,x_n\} = \mathrm{Lin}\{e_1,e_2,\ldots,e_n\}, \quad\text{for all } n. \]
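Proof. We give an explicit algorithm working by induction. The base of induction: the first vector is \(e_1 = x_1/\|x_1\|\). The step of induction: let \(e_1, e_2, \ldots, e_n\) be already constructed as required. Let \(y_{n+1} = x_{n+1} - \sum_{i=1}^{n} \langle x_{n+1},e_i\rangle e_i\). Then by (16) \(y_{n+1} \perp e_i\) for i=1,…,n. We may put \(e_{n+1} = y_{n+1}/\|y_{n+1}\|\) because \(y_{n+1}\ne 0\) due to the linear independence of the \(x_k\)’s. Also
\[ \mathrm{Lin}\{e_1,\ldots,e_{n+1}\} = \mathrm{Lin}\{e_1,\ldots,e_n,y_{n+1}\} = \mathrm{Lin}\{e_1,\ldots,e_n,x_{n+1}\} = \mathrm{Lin}\{x_1,\ldots,x_{n+1}\}. \]
So \((e_i)\) is an orthonormal sequence.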
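The inductive construction in this proof translates directly into a short program. The sketch below works in \(\mathbb{C}^n\) with the standard inner product; the tolerance used to detect linear dependence is an arbitrary choice of ours.

```python
import math

def inner(x, y) -> complex:
    """<x, y> = sum of x_k * conj(y_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x) -> float:
    return math.sqrt(inner(x, x).real)

def gram_schmidt(vectors, tolerance: float = 1e-12):
    """Orthonormalise linearly independent vectors as in the proof above."""
    basis = []
    for x in vectors:
        # y = x - sum of <x, e_i> e_i over the vectors constructed so far
        y = list(x)
        for e in basis:
            c = inner(x, e)
            y = [yi - c * ei for yi, ei in zip(y, e)]
        length = norm(y)
        if length < tolerance:
            raise ValueError("the vectors are linearly dependent")
        basis.append([yi / length for yi in y])
    return basis

if __name__ == "__main__":
    for e in gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)]):
        print(e)
```

In floating-point arithmetic this classical form of the procedure may lose orthogonality for nearly dependent inputs; for the small examples of this course it is adequate.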
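\[ \langle f,g\rangle = \int_{-1}^{1} f(t)\,\overline{g(t)}\,dt. \tag{17} \]
\[ \langle f,g\rangle = \int_{-1}^{1} f(t)\,\overline{g(t)}\,\frac{dt}{\sqrt{1-t^2}}. \tag{18} \]
\[ \langle f,g\rangle = \int_{0}^{\infty} f(t)\,\overline{g(t)}\,e^{-t}\,dt. \]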
See Figure 8 for the five first Legendre and Chebyshev polynomials. Observe the difference caused by the different inner products (17) and (18). On the other hand note the similarity in oscillating behaviour with different “frequencies”.
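The Gram–Schmidt procedure applied to the monomials 1, t, t², … can be carried out exactly in rational arithmetic. The sketch below assumes that the inner product behind the Legendre polynomials is the integral of f(t)g(t) over [−1,1] (the usual convention, taken here as an assumption) and, to stay within rational numbers, produces monic orthogonal polynomials rather than normalised ones. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction

def inner(p, q) -> Fraction:
    """Integral over [-1, 1] of p(t) q(t) for real coefficient lists p, q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers of t integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def monic_orthogonal_polynomials(count: int):
    """Orthogonalise 1, t, t^2, ... without normalising."""
    polys = []
    for n in range(count):
        monomial = [Fraction(0)] * n + [Fraction(1)]
        y = list(monomial)
        for p in polys:
            c = inner(monomial, p) / inner(p, p)
            for i, coeff in enumerate(p):
                y[i] -= c * coeff
        polys.append(y)
    return polys

if __name__ == "__main__":
    for p in monic_orthogonal_polynomials(4):
        print(p)
```

The output contains t² − 1/3 and t³ − (3/5)t, which are scalar multiples of the Legendre polynomials of degree 2 and 3.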
Another natural question is: When is an orthonormal sequence complete?
Proof. Clearly 1 implies 2 because \(x = \sum_{n=1}^{\infty} \langle x,e_n\rangle e_n\) is in CLin((en)) and \(\|x\|^2 = \sum_{n=1}^{\infty} |\langle x,e_n\rangle|^2\) by Theorem 15.
If (en) is not complete then there exists x∈ H such that x≠ 0 and \(\langle x,e_k\rangle = 0\) for all k, so 3 fails, consequently 3 implies 1.
Finally if ⟨ x,ek ⟩=0 for all k then ⟨ x,y ⟩=0 for all y∈Lin((en)) and moreover for all y∈CLin((en)), by the Lemma on continuity of the inner product. But then x∉CLin((en)) and 2 also fails because ⟨ x,x ⟩=0 is not possible. Thus 2 implies 1.
\[ x = \sum_{n=1}^{\infty} \langle x,e_n\rangle e_n \quad\text{and}\quad \|x\|^2 = \sum_{n=1}^{\infty} |\langle x,e_n\rangle|^2. \]
Proof. Take a countable dense set (xk), then H=CLin((xk)). Delete all vectors which are linear combinations of preceding vectors, orthonormalise the remaining set by Gram–Schmidt, and apply the previous proposition.
Most pleasant compliments are usually orthogonal to our real qualities.
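Advice based on observations
\[ M^{\perp} = \{x\in V : \langle x,m\rangle = 0 \ \ \forall m\in M\}. \]
Proof. Clearly \(M^{\perp}\) is a subspace of H because x, y∈ \(M^{\perp}\) implies ax+by∈ \(M^{\perp}\):
\[ \langle ax+by,m\rangle = a\langle x,m\rangle + b\langle y,m\rangle = 0. \]
Also if all \(x_n\in M^{\perp}\) and \(x_n\to x\) then x∈ \(M^{\perp}\) due to the inner product limit Lemma.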
Proof. For a given x there exists the unique closest point m in M by the Theorem on nearest point and by the Theorem on perpendicular (x−m)⊥ y for all y∈ M.
So x= m + (x−m)= m+n with m∈ M and n∈ M⊥. The identity ||x||2=||m||2+||n||2 is just Pythagoras’ theorem and M∩ M⊥={0} because null vector is the only vector orthogonal to itself.
Finally (M⊥)⊥=M. We have H=M⊕ M⊥=(M⊥)⊥⊕ M⊥, for any x∈(M⊥)⊥ there is a decomposition x=m+n with m∈ M and n∈ M⊥, but then n is orthogonal to itself and therefore is zero.
\[ P_M^2 = P_M, \qquad \ker P_M = M^{\perp}, \qquad P_{M^{\perp}} = I - P_M. \tag{19} \]
Proof. Let us define \(P_M(x) = m\) where x=m+n is the decomposition from the previous theorem. The linearity of this operator follows from the fact that both M and \(M^{\perp}\) are linear subspaces. Also \(P_M(m) = m\) for all m∈ M and the image of \(P_M\) is M. Thus \(P_M^2 = P_M\). Also if \(P_M(x) = 0\) then x⊥ M, i.e. \(\ker P_M = M^{\perp}\). Similarly \(P_{M^{\perp}}(x) = n\) where x=m+n and \(P_M + P_{M^{\perp}} = I\).
\[ \sum_{k=1}^{\infty} a_k e_k = \sum_{k=1}^{n} a_k e_k + \sum_{k=n+1}^{\infty} a_k e_k. \]
All bases are equal, but some are more equal than others.
As we saw already, any separable Hilbert space possesses an orthonormal basis (infinitely many of them, indeed). Are they equally good? This depends on our purposes. For the solution of differential equations which arise in mathematical physics (wave, heat, Laplace equations, etc.) there is a preferred choice. The fundamental formula \(\frac{d}{dx}e^{ax} = ae^{ax}\) reduces the derivative to a multiplication by a. We could benefit from this observation if the orthonormal basis is constructed out of exponentials. This helps to solve differential equations, as was demonstrated in Subsection 1.2.
7.40pm Fourier series: Episode II
Today’s TV listing
Now we wish to address the questions stated in Remark 9. Let us consider the space L2[−π,π]. As we saw in Example 3 there is an orthonormal sequence \(e_n(t) = (2\pi)^{-1/2}e^{int}\) in L2[−π,π]. We will show that it is an orthonormal basis, i.e.
\[ f(t)\in L_2[-\pi,\pi] \quad\Leftrightarrow\quad f(t) = \sum_{k=-\infty}^{\infty} \langle f,e_k\rangle e_k(t), \]
with convergence in the L2 norm. To do this we show that CLin{ek:k∈ℤ}=L2[−π,π].
Let CP[−π,π] denote the continuous functions f on [−π,π] such that f(π)=f(−π). We also define f outside of the interval [−π,π] by periodicity.
Proof. Let f∈L2[−π,π]. Given є>0 there exists g∈ C[−π,π] such that ||f−g||<є/2. From the continuity of g on a compact set it follows that there is M such that | g(t) |<M for all t∈[−π,π].
We can now replace g by periodic g′, which coincides with g on [−π,π−δ] for an arbitrary δ>0 and has the same bounds: | g′(t) |<M, see Figure 9. Then
\[ \|g-g'\|_2^2 = \int_{\pi-\delta}^{\pi} |g(t)-g'(t)|^2\,dt \le (2M)^2\delta. \]
So if \(\delta < \epsilon^2/(4M)^2\) then ||g−g′||<є/2 and ||f−g′||<є.
Now if we could show that CLin{ek: k ∈ ℤ} includes
CP[−π,π] then it also includes
L2[−π,π].
| fn= |
| ⟨ f,ek ⟩ ek , for n=0,1,2,… (20) |
We want to show that ||f−fn||2→ 0. To this end we define nth Fejér sum by the formula
| Fn= |
| , (21) |
and show that
\[ \|F_n-f\|_\infty\to 0. \]
Then we conclude
\[ \|F_n-f\|_2=\left(\int_{-\pi}^{\pi}|F_n(t)-f(t)|^2\,dt\right)^{1/2}\le(2\pi)^{1/2}\|F_n-f\|_\infty\to 0. \]
Since $F_n\in\mathrm{Lin}((e_n))$, we have $f\in\mathrm{CLin}((e_n))$ and hence $f=\sum_{k=-\infty}^{\infty}\langle f,e_k\rangle e_k$.
It took 19 years of his life to prove this theorem
|
Proof. From notation (20):
|
Then from (21):
|
which finishes the proof.
\[ K_n(t)=\frac{1}{n+1}\,\frac{\sin^2\frac{(n+1)t}{2}}{\sin^2\frac{t}{2}},\quad\text{for } t\notin 2\pi\mathbb{Z}. \tag{24} \]
Proof. Let z=eit, then:
|
by switch from counting in rows to counting in columns in Table 1.
Let w=eit/2, i.e. z=w2, then
if w≠ ± 1. For the value of Kn(0) we substitute w=1 into (25).
Figure 10: A family of Fejér kernels with the parameter m running from 0 to 9 is on the left picture. For a comparison unregularised Fourier kernels are on the right picture.
The first eleven Fejér kernels are shown on Figure 10, we could observe that:
Proof. The first property immediately follows from the explicit formula (24). In contrast the second property is easier to deduce from expression with double sum (23):
|
since the formula (15).
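Finally if |t|>δ then sin²(t/2) ≥ sin²(δ/2) > 0 by monotonicity of the sine on [0,π/2], so:
\[ 0\le K_n(t)\le\frac{1}{(n+1)\sin^2(\delta/2)}, \]
implying:
\[ 0\le\int_{\delta\le|t|\le\pi}K_n(t)\,dt\le\frac{2\pi}{(n+1)\sin^2(\delta/2)}\to 0\quad\text{as } n\to\infty. \]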
Therefore the third property follows from the squeeze rule.
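The properties of the Fejér kernel used above can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes the double sum (23) to be the average of the sums $\sum_{m=-k}^{k}e^{imt}$ for k=0,…,n, and compares it with the closed form (24). The function names and the midpoint-rule integrator are hypothetical helpers, not part of the notes.

```python
import math

def fejer_kernel_sum(n, t):
    """K_n(t) via the (assumed) double sum: average of sum_{m=-k}^{k} e^{imt}, k=0..n."""
    total = 0.0
    for k in range(n + 1):
        # sum_{m=-k}^{k} e^{imt} = 1 + 2*sum_{m=1}^{k} cos(mt), a real number
        total += 1.0 + 2.0 * sum(math.cos(m * t) for m in range(1, k + 1))
    return total / (n + 1)

def fejer_kernel_closed(n, t):
    """Closed form (24); only valid when t is not a multiple of 2*pi."""
    return math.sin((n + 1) * t / 2) ** 2 / ((n + 1) * math.sin(t / 2) ** 2)

def integrate(func, a, b, steps=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def tail_mass(n, delta):
    """Integral of K_n over delta <= |t| <= pi, using that K_n is even."""
    return 2 * integrate(lambda t: fejer_kernel_closed(n, t), delta, math.pi)

if __name__ == "__main__":
    print(fejer_kernel_sum(5, 0.7), fejer_kernel_closed(5, 0.7))
    print(integrate(lambda t: fejer_kernel_closed(5, t), -math.pi, math.pi), 2 * math.pi)
    print([tail_mass(n, 0.5) for n in (5, 20, 80)])
```

The first line printed compares the two expressions for the kernel, the second compares the total weight with 2π, and the third shows the tail integrals shrinking as n grows.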
Proof. Idea of the proof: if in the formula (22)
\[ F_n(x)=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(t)K_n(x-t)\,dt, \]
t is long way from x, Kn is small (see Lemma 7 and Figure 10), for t near x, Kn is big with total “weight” 2π, so the weighted average of f(t) is near f(x).
Here are details. Using property 2 and periodicity of f and Kn we could express trivially
\[ f(x)=f(x)\,\frac{1}{2\pi}\int_{x-\pi}^{x+\pi}K_n(x-t)\,dt=\frac{1}{2\pi}\int_{x-\pi}^{x+\pi}f(x)K_n(x-t)\,dt. \]
Similarly we rewrite (22) as
\[ F_n(x)=\frac{1}{2\pi}\int_{x-\pi}^{x+\pi}f(t)K_n(x-t)\,dt, \]
then
|
Given є>0 split into three intervals: I1=[x−π,x−δ], I2=[x−δ,x+δ], I3=[x+δ,x+π], where δ is chosen such that | f(t)−f(x) |<є/2 for t∈ I2, which is possible by continuity of f. So
\[ \frac{1}{2\pi}\int_{I_2}|f(x)-f(t)|\,K_n(x-t)\,dt\le\frac{\epsilon}{2}\cdot\frac{1}{2\pi}\int_{I_2}K_n(x-t)\,dt<\frac{\epsilon}{2}. \]
And
|
if n is sufficiently large due to property 3 of Kn. Hence | f(x)−Fn(x) |<є for a large n independent of x.
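The uniform convergence of Fejér sums can be observed numerically. The sketch below is an illustration only: it uses the standard rewriting of the average (21) as $F_n(x)=\sum_{|k|\le n}\bigl(1-\frac{|k|}{n+1}\bigr)c_ke^{ikx}$ with $c_k=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(t)e^{-ikt}\,dt$, and the test function f(t)=|t| is our own choice of a continuous periodic function. All function names are hypothetical.

```python
import cmath
import math

def fourier_coefficient(f, k, steps=2000):
    """c_k = (1/2pi) * integral of f(t) e^{-ikt} over [-pi, pi], midpoint rule."""
    h = 2 * math.pi / steps
    total = 0j
    for i in range(steps):
        t = -math.pi + (i + 0.5) * h
        total += f(t) * cmath.exp(-1j * k * t)
    return total * h / (2 * math.pi)

def fejer_sum(coeffs, n, x):
    """F_n(x) = sum over |k| <= n of (1 - |k|/(n+1)) c_k e^{ikx}."""
    value = sum((1 - abs(k) / (n + 1)) * coeffs[k] * cmath.exp(1j * k * x)
                for k in range(-n, n + 1))
    return value.real

def sup_error(f, n, grid_points=200):
    """Maximum of |F_n(x) - f(x)| over a uniform grid on [-pi, pi]."""
    coeffs = {k: fourier_coefficient(f, k) for k in range(-n, n + 1)}
    grid = [-math.pi + 2 * math.pi * j / grid_points for j in range(grid_points + 1)]
    return max(abs(fejer_sum(coeffs, n, x) - f(x)) for x in grid)

if __name__ == "__main__":
    for n in (8, 32):
        print(n, sup_error(abs, n))
```

The printed maximal errors decrease as n grows, in agreement with the theorem; the grid maximum only approximates the supremum norm.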
We almost finished the demonstration that en(t)=(2π)−1/2eint
is an orthonormal basis of L2[−π,π]:
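\[ \sum_{n=-\infty}^{\infty}\langle f,e_n\rangle e_n=\sum_{n=-\infty}^{\infty}c_ne^{int}\quad\text{where}\quad c_n=\frac{\langle f,e_n\rangle}{\sqrt{2\pi}}=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(t)e^{-int}\,dt. \]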
\[ \lim_{k\to\infty}\left\|f-\sum_{n=-k}^{k}c_ne^{int}\right\|_2=0. \]
Proof. This follows from the previous Theorem, Lemma 1 about density of CP in L2, and Theorem 15 on orthonormal basis.
The following result first appeared in the framework of L2[−π,π] and only later was understood to be a general property of inner product spaces.
\[ \langle f,g\rangle=\int_{-\pi}^{\pi}f(t)\overline{g(t)}\,dt=2\pi\sum_{n=-\infty}^{\infty}c_n\overline{d_n}. \tag{26} \]
More generally if f and g are two vectors of a Hilbert space H with an orthonormal basis (en)−∞∞ then
\[ \langle f,g\rangle=\sum_{n=-\infty}^{\infty}c_n\overline{d_n},\quad\text{where } c_n=\langle f,e_n\rangle,\ d_n=\langle g,e_n\rangle, \]
are the Fourier coefficients of f and g.
Proof. In fact we could just prove the second, more general, statement; the first one is its particular realisation. Let $f_n=\sum_{k=-n}^{n}c_ke_k$ and $g_n=\sum_{k=-n}^{n}d_ke_k$ be the partial sums of the corresponding Fourier series. Then from orthonormality of (en) and linearity of the inner product:
\[ \langle f_n,g_n\rangle=\left\langle\sum_{k=-n}^{n}c_ke_k,\sum_{k=-n}^{n}d_ke_k\right\rangle=\sum_{k=-n}^{n}c_k\overline{d_k}. \]
This formula, together with the facts that $f_n\to f$ and $g_n\to g$ (following from Corollary 9) and the Lemma about continuity of the inner product, implies the assertion.
Proof. The necessity, i.e. the implication f∈L2 ⇒ ⟨f,f⟩=||f||²=2π∑|ck|², follows from the previous Theorem. The sufficiency follows from the Riesz–Fischer Theorem.
| [Wf](x)=⟨ f,ex ⟩ (27) |
Heat and noise but not a fire?
Answer:
We are now going to provide a few examples which demonstrate the importance of Fourier series in many questions. The first two (Example 14 and Theorem 15) belong to pure mathematics and the last two are of a more applied nature.
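\[ \langle f,e_n\rangle=\int_{-\pi}^{\pi}te^{-int}\,dt=\begin{cases}(-1)^n\dfrac{2\pi i}{n}, & n\neq 0,\\[1ex] 0, & n=0\end{cases}\quad\text{(check!)}, \]
\[ \|f\|_2^2=\int_{-\pi}^{\pi}t^2\,dt=\frac{2\pi^3}{3}. \]
\[ \|f\|_2^2=2\pi\sum_{n\neq 0}\left|\frac{(-1)^ni}{n}\right|^2=4\pi\sum_{n=1}^{\infty}\frac{1}{n^2}. \]
\[ \sum_{n=1}^{\infty}\frac{1}{n^2}=\frac{\pi^2}{6}. \]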
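The identity obtained from Parseval's formula for f(t)=t can be sanity-checked numerically. The short Python sketch below is an illustration only (the function name is hypothetical): it compares ||f||² = 2π³/3 with the partial sums 4π∑1/n², whose difference should be positive and shrink roughly like 4π/N.

```python
import math

def parseval_gap(terms):
    """Difference between ||f||^2 = 2*pi^3/3 and 4*pi*sum_{n=1}^{terms} 1/n^2 for f(t) = t."""
    norm_squared = 2 * math.pi ** 3 / 3
    partial = 4 * math.pi * sum(1.0 / n ** 2 for n in range(1, terms + 1))
    return norm_squared - partial

if __name__ == "__main__":
    for terms in (10, 100, 10000):
        print(terms, parseval_gap(terms))
```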
Here is another important result.
Proof. Change variable: t=2π(x−(a+b)/2)/(b−a); this maps x∈[a,b] onto t∈[−π,π]. Let P denote the subspace of polynomials in C[−π,π] and $\overline{P}$ its closure in the supremum norm. Then $e^{int}\in\overline{P}$ for any n∈ℤ since its Taylor series converges uniformly on [−π,π]. Consequently $\overline{P}$ contains the closed linear span (in the supremum norm) of $e^{int}$, n∈ℤ, which is CP[−π,π] by the Fejér theorem. Thus $\overline{P}$⊇CP[−π,π], and we extend this to non-periodic functions as follows (by the way, why could we not make use of Lemma 1 here?).
For any f∈C[−π,π] let λ=(f(π)−f(−π))/(2π) then f1(t)=f(t)−λ t∈ CP[−π,π] and could be approximated by a polynomial p1(t) from the above discussion. Then f(t) is approximated by the polynomial p(t)=p1(t)+λ t.
It is easy to see that the rôle of the exponentials $e^{int}$ in the
above proof is rather modest: they can be replaced by any functions
which have a Taylor expansion. The real glory of Fourier analysis
is demonstrated in the two following examples.
Suppose we have a rod of length 2π. The temperature at a point x∈[−π,π] and a moment t∈[0,∞) is described by a function u(t,x) on [0,∞)×[−π,π]. The equation describing the dynamics of the temperature distribution is:
\[ \frac{\partial u(t,x)}{\partial t}=\frac{\partial^2u(t,x)}{\partial x^2}\quad\text{or, equivalently,}\quad\left(\partial_t-\partial_x^2\right)u(t,x)=0. \tag{28} \]
For any fixed moment t0 the function u(t0,x) depends only on x∈[−π,π] and, according to Corollary 9, can be represented by its Fourier series:
\[ u(t_0,x)=\sum_{n=-\infty}^{\infty}\langle u,e_n\rangle e_n=\sum_{n=-\infty}^{\infty}c_n(t_0)e^{inx}, \]
where
\[ c_n(t_0)=\frac{\langle u,e_n\rangle}{\sqrt{2\pi}}=\frac{1}{2\pi}\int_{-\pi}^{\pi}u(t_0,x)e^{-inx}\,dx, \]
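with Fourier coefficients cn(t0) depending on t0. We substitute that decomposition into the heat equation (28) to obtain:
\[ \left(\partial_t-\partial_x^2\right)\sum_{n=-\infty}^{\infty}c_n(t)e^{inx}=\sum_{n=-\infty}^{\infty}\left(c_n'(t)+n^2c_n(t)\right)e^{inx}=0. \tag{29} \]
Since the functions $e^{inx}$ form a basis, the last equation (29) holds if and only if
\[ c_n'(t)+n^2c_n(t)=0\quad\text{for all } n \text{ and } t. \tag{30} \]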
Equations from the system (30) have general solutions of the form:
\[ c_n(t)=c_n(0)e^{-n^2t}\quad\text{for all } t\in[0,\infty), \tag{31} \]
producing a general solution of the heat equation (28) in the form:
\[ u(t,x)=\sum_{n=-\infty}^{\infty}c_n(0)e^{-n^2t}e^{inx}=\sum_{n=-\infty}^{\infty}c_n(0)e^{-n^2t+inx}, \tag{32} \]
where the constants cn(0) can be determined from the boundary conditions. For example, if it is known that the initial distribution of temperature was u(0,x)=g(x) for a function g(x)∈L2[−π,π], then cn(0) is the n-th Fourier coefficient of g(x).
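The recipe (31)–(32) translates directly into a small computation. The Python sketch below is an illustration under assumptions: the initial temperature g(x)=2cos(2x)+1.5sin(x) is our reading of the simulation example in these notes, the series is truncated to a few modes, and the Fourier coefficients are computed by a midpoint rule. All function names are hypothetical.

```python
import cmath
import math

def initial_temperature(x):
    """Assumed initial distribution g(x) = 2cos(2x) + 1.5sin(x)."""
    return 2 * math.cos(2 * x) + 1.5 * math.sin(x)

def fourier_coefficient(g, n, steps=512):
    """c_n(0) = (1/2pi) * integral of g(x) e^{-inx} over [-pi, pi], midpoint rule."""
    h = 2 * math.pi / steps
    total = 0j
    for i in range(steps):
        x = -math.pi + (i + 0.5) * h
        total += g(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

def heat_solution(g, t, x, modes=8):
    """Truncated series (32): sum over |n| <= modes of c_n(0) e^{-n^2 t + inx}."""
    total = 0j
    for n in range(-modes, modes + 1):
        total += fourier_coefficient(g, n) * cmath.exp(-n * n * t + 1j * n * x)
    return total.real

def exact_solution(t, x):
    """Closed form for this g: each harmonic decays with its own rate e^{-n^2 t}."""
    return 2 * math.exp(-4 * t) * math.cos(2 * x) + 1.5 * math.exp(-t) * math.sin(x)

if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0):
        print(t, heat_solution(initial_temperature, t, 1.0), exact_solution(t, 1.0))
```

For this g only the modes n=±1 and n=±2 are non-zero, so the truncation is harmless; for a general g the number of modes is a parameter that would need to be chosen with care.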
The general solution (32) helps produce both the analytical study of the heat equation (28) and numerical simulation. For example, from (32) obviously follows that
An example of numerical simulation for the initial value problem with g(x)=2cos(2x)+1.5sin(x) clearly illustrates our conclusions above.
Figure 12: Two oscillations with unharmonious frequencies and the resulting dissonance. The pure harmonics are shown in blue and green, the dissonance in red.
The earliest observations are that
Figure 13: Graphs of G5 performed on different musical instruments. Samples are taken from Sound Library.
The musical tone, say G5, performed on different instruments clearly has something in common and different, see Figure 13 for comparisons. The decomposition into the pure harmonics, i.e. finding Fourier coefficient for the signal, could provide the complete characterisation, see Figure 14.
Figure 14: Fourier series for G5 performed on different musical instruments (same order and colour as on the previous Figure)
The Fourier analysis tells that:
The Fourier analysis is very useful in the signal processing and is indeed the fundamental tool. However it is not universal and has very serious limitations. Consider the simple case of the signals plotted on the Figure 15(a) and (b). They are both made out of same two pure harmonics:
These appear to be two very different signals. However, their Fourier transforms performed over the whole interval do not seem to be very different, see Figure 15(c). Both transforms (drawn in blue-green and pink) have two major peaks corresponding to the pure frequencies. It is not very easy to extract differences between the signals from their Fourier transforms (yet this should be possible according to our study).
An even better picture can be obtained if we use the windowed Fourier transform, namely use a sliding “window” of constant width instead of the entire interval for the Fourier transform. A still better analysis can be obtained by means of wavelets, already mentioned in Remark 12 in connection with Plancherel’s formula. Roughly, wavelets correspond to a sliding window of variable size: narrow for high frequencies and wide for low.
Everything has another side
An orthonormal basis allows us to reduce any question on a Hilbert space to a question on a sequence of numbers. This is a powerful but sometimes heavy technique. Sometimes we need a smaller and faster tool to study questions which are represented by a single number; for example, to demonstrate that two vectors are different it is enough to show that they have unequal values of a single coordinate. In such cases linear functionals are just what we need.
–Is it functional?
–Yes, it works!
| α(ax+by)=aα(x)+bα(y), for all x,y∈ V and a,b∈ℂ. |
We will not consider any functionals but linear ones, thus below functional always means linear functional.
Proof. Implication 1 ⇒ 2 is trivial.
Show 2 ⇒ 3. By the definition of continuity: for any є>0 there exists δ>0 such that ||v||<δ implies | α(v)−α(0) |<є . Take є=1 then | α(δ x) |<1 for all x with norm less than 1 because ||δ x||< δ. But from linearity of α the inequality | α(δ x) |<1 implies | α(x) |<1/δ<∞ for all ||x||≤ 1.
3 ⇒ 1. Let the mentioned supremum be M. For any x, y∈V such that x≠y the vector (x−y)/||x−y|| has norm 1. Thus |α((x−y)/||x−y||)|≤M. By the linearity of α this implies that |α(x)−α(y)|≤M||x−y||. Thus α is continuous.
\[ \|\alpha\|=\sup_{\|x\|\le 1}|\alpha(x)|. \tag{33} \]
Proof. Due to Exercise 6 we only need to show that X* is complete. Let (αn) be a Cauchy sequence in X*, then for any x∈ X scalars αn(x) form a Cauchy sequence, since | αm(x)−αn(x) |≤||αm−αn||·||x||. Thus the sequence has a limit and we define α by α(x)=limn→∞αn(x). Clearly α is a linear functional on X. We should show that it is bounded and αn→ α. Given є>0 there exists N such that ||αn−αm||<є for all n, m≥ N. If ||x||≤ 1 then | αn(x)−αm(x) |≤ є, let m→∞ then | αn(x)−α(x) |≤ є, so
\[ |\alpha(x)|\le|\alpha_n(x)|+\epsilon\le\|\alpha_n\|+\epsilon, \]
i.e. ||α|| is finite and ||αn−α||≤ є, thus αn→α.
Study one and get any other for free!
Hilbert spaces sale
Proof. Uniqueness: if ⟨ x,y ⟩=⟨ x,y′ ⟩ ⇔ ⟨ x,y−y′ ⟩=0 for all x∈ H then y−y′ is self-orthogonal and thus is zero (Exercise 1).
Existence: we may assume that α≢0 (otherwise take y=0), then M=kerα is a closed proper subspace of H. Since H=M⊕ M⊥, there exists a non-zero z∈ M⊥, by scaling we could get α(z)=1. Then for any x∈ H:
| x=(x−α(x)z)+α(x)z, with x−α(x)z∈ M, α(x)z∈ M⊥. |
Because ⟨ x,z ⟩=α(x)⟨ z,z ⟩=α(x)||z||2 for any x∈ H we set y=z/||z||2.
Equality of the norms ||α||H*=||y||H follows from the Cauchy–Bunyakovskii–Schwarz inequality in the form |α(x)|≤||x||·||y|| and the identity α(y/||y||)=||y||.
| ⎪⎪ ⎪⎪ | α | ⎪⎪ ⎪⎪ | = | ⎪⎪ ⎪⎪ | t2 | ⎪⎪ ⎪⎪ | = | ⎛ ⎜ ⎜ ⎝ |
| (t2)2 dt | ⎞ ⎟ ⎟ ⎠ |
| = |
| . |
All the space’s a stage,
and all functionals and operators
merely players!
All our previous considerations were only a preparation of the stage, and now the main actors come forward to perform a play. Vector spaces are not so interesting while we consider them in statics; what really makes them exciting is their transformations. The natural first step is to consider transformations which respect both the linear structure and the norm.
\[ \ker T=\{x\in X: Tx=0\},\qquad\operatorname{Im}T=\{y\in Y: y=Tx \text{ for some } x\in X\}. \]
As usual we are interested also in connections with the second (topological) structure:
\[ \|T\|=\sup\{\|Tx\|_Y: \|x\|_X\le 1\}. \tag{34} \]
T is a bounded linear operator if ||T||=sup{||Tx||: ||x||≤1}<∞.
|
Proof. Proof essentially follows the proof of similar Theorem 4.
Proof. The proof repeats the proof of Theorem 7, which is a particular case of the present theorem for Y=ℂ, see Example 3.
Proof. Clearly (ST)x=S(Tx)∈ Z, and
\[ \|STx\|\le\|S\|\,\|Tx\|\le\|S\|\,\|T\|\,\|x\|, \]
which implies norm estimation if ||x||≤1.
Proof. It is induction by n with the trivial base n=1 and the step following from the previous theorem.
| ST= IX and TS=IY. |
| ⟨ Th,k ⟩K=⟨ h,T*k ⟩H for all h∈ H, k∈ K. |
Proof. For any fixed k∈ K the expression h:→ ⟨ Th,k ⟩K defines a bounded linear functional on H. By the Riesz–Fréchet lemma there is a unique y∈ H such that ⟨ Th,k ⟩K=⟨ h,y ⟩H for all h∈ H. Define T* k =y then T* is linear:
|
So T*(λ1k1+λ2k2)=λ1T*k1+λ2T*k2. T** is defined by ⟨ k,T**h ⟩=⟨ T*k,h ⟩ and the identity ⟨ T**h,k ⟩=⟨ h,T*k ⟩=⟨ Th,k ⟩ for all h and k shows T**=T. Also:
|
which implies ||T*k||≤||T||·||k||, consequently ||T*||≤||T||. The opposite inequality follows from the identity ||T||=||T**||.
|
| D(x1,x2,…)=(λ1 x1, λ2 x2, …). |
\[ D^*(x_1,x_2,\ldots)=(\bar\lambda_1x_1,\bar\lambda_2x_2,\ldots), \]
\[ \|T\|=\sup_{\|x\|=1}|\langle Tx,x\rangle|. \]
Proof. If Tx=0 for all x∈ H, both sides of the identity are 0. So we suppose that ∃ x∈ H for which Tx≠ 0.
We see that |⟨Tx,x⟩| ≤ ||Tx|| ||x|| ≤ ||T|| ||x||², so $\sup_{\|x\|=1}|\langle Tx,x\rangle|\le\|T\|$. To get the inequality the other way around, we first write $s:=\sup_{\|x\|=1}|\langle Tx,x\rangle|$. Then for any x∈H, we have |⟨Tx,x⟩| ≤ s||x||².
We now consider
| ⟨ T(x+y),x+y ⟩ =⟨ Tx,x ⟩ +⟨ Tx,y ⟩+⟨ Ty,x ⟩ +⟨ Ty,y ⟩ = ⟨ Tx,x ⟩ +2ℜ ⟨ Tx,y ⟩ +⟨ Ty,y ⟩ |
(because T being Hermitian gives $\langle Ty,x\rangle=\langle y,Tx\rangle=\overline{\langle Tx,y\rangle}$) and, similarly,
| ⟨ T(x−y),x−y ⟩ = ⟨ Tx,x ⟩ −2ℜ ⟨ Tx,y ⟩ +⟨ Ty,y ⟩. |
Subtracting gives
\[ 4\Re\langle Tx,y\rangle=\langle T(x+y),x+y\rangle-\langle T(x-y),x-y\rangle\le s\left(\|x+y\|^2+\|x-y\|^2\right)=2s\left(\|x\|^2+\|y\|^2\right), \]
by the parallelogram identity.
Now, for x∈H such that Tx≠0, we put $y=\|Tx\|^{-1}\|x\|\,Tx$. Then ||y||=||x|| and when we substitute into the previous inequality, we get
\[ 4\|Tx\|\,\|x\|=4\Re\langle Tx,y\rangle\le 4s\|x\|^2, \]
So ||Tx||≤s||x|| and it follows that ||T||≤s, as required.
Proof. 1⇒2. Clearly unitarity of operator implies its invertibility and hence surjectivity. Also
\[ \|Ux\|^2=\langle Ux,Ux\rangle=\langle x,U^*Ux\rangle=\langle x,x\rangle=\|x\|^2. \]
2⇒3. Using the polarisation identity (cf. polarisation in equation (12)):
|
Take T=U*U and T=I, then
|
3⇒1. Indeed ⟨ U*U x,y ⟩=⟨ x,y ⟩ implies ⟨ (U*U−I)x,y ⟩=0 for all x,y∈ H, then U*U=I. Since U should be invertible by surjectivity we see that U*=U−1.
Beware of ghosts2 in this area!
As we saw, operators can be added and multiplied by each other; in some sense they behave like numbers, but are much more complicated. In this lecture we will associate to each operator a set of complex numbers which reflects certain (unfortunately not all) properties of this operator.
The analogy between operators and numbers becomes even deeper since we can construct functions of operators (called functional calculus) in the way we build numeric functions. The most important function of this sort is called the resolvent (see Definition 5). The methods of analytic functions are very powerful in operator theory, and students may wish to refresh their knowledge of complex analysis before this part.
An eigenvalue of operator T∈B(H) is a complex number λ such that there exists a nonzero x∈ H, called eigenvector with property Tx=λ x, in other words x∈ker(T−λ I).
In finite dimensions T−λ I is invertible if and only if λ is not an eigenvalue. In infinite dimensions it is not the same: the right shift operator S is not invertible but 0 is not its eigenvalue because Sx=0 implies x=0 (check!).
| ρ (T)={λ∈ℂ: T−λ I is invertible}. |
| σ(T)={λ∈ℂ: T−λ I is not invertible}. |
Even this example demonstrates that the spectrum does not provide a complete description of an operator even in the finite-dimensional case. For example, both operators in ℂ² given by the matrices $\begin{pmatrix}0&0\\0&0\end{pmatrix}$ and $\begin{pmatrix}0&0\\1&0\end{pmatrix}$ have a single point spectrum {0}, however they are rather different. The situation becomes even worse in infinite dimensional spaces.
For the proof we will need several Lemmas.
\[ (I-A)^{-1}=I+A+A^2+A^3+\ldots=\sum_{k=0}^{\infty}A^k. \tag{35} \]
Proof. Define the sequence of operators $B_n=I+A+\cdots+A^n$, the partial sums of the infinite series (35). It is a Cauchy sequence, indeed:
|
for a large m. By the completeness of B(H) there is a limit, say B, of the sequence $B_n$. It is simple algebra to check that $(I-A)B_n=B_n(I-A)=I-A^{n+1}$; passing to the limit in the norm topology, where $A^{n+1}\to 0$ and $B_n\to B$, we get:
\[ (I-A)B=B(I-A)=I\ \Leftrightarrow\ B=(I-A)^{-1}. \]
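In finite dimensions the argument above can be followed step by step on a matrix. The Python sketch below is an illustration only: the 2×2 matrix A is a hypothetical example with norm less than 1, and the helper names are our own. It forms the partial sums B_n of (35) and checks that (I−A)B_n is close to I.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neumann_partial_sum(a, n):
    """B_n = I + A + A^2 + ... + A^n."""
    total = identity(len(a))
    power = identity(len(a))
    for _ in range(n):
        power = mat_mul(power, a)
        total = mat_add(total, power)
    return total

def max_abs_entry(a):
    return max(abs(x) for row in a for x in row)

# Hypothetical example: row sums are 0.5 and column sums at most 0.7, so ||A|| < 1.
A = [[0.2, 0.3], [0.1, 0.4]]

if __name__ == "__main__":
    I2 = identity(2)
    for n in (5, 20, 60):
        residual = mat_sub(mat_mul(mat_sub(I2, A), neumann_partial_sum(A, n)), I2)
        print(n, max_abs_entry(residual))  # equals the size of A^{n+1}
```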
\[ R(\lambda,T)=(T-\lambda I)^{-1}. \tag{36} \]
Proof.
\[ R(\lambda,T)=(T-\lambda I)^{-1}=-\sum_{k=0}^{\infty}\lambda^{-k-1}T^k. \tag{37} \]
|
| R(λ,T)−R(µ,T)=(λ−µ)R(λ,T)R(µ,T) (38) |
Proof. Let us assume the opposite, σ(T)=∅; then the resolvent function R(λ,T) is well defined for all λ∈ℂ. As can be seen from the von Neumann series (37), ||R(λ,T)||→0 as λ→∞. Thus for any vectors x, y∈H the function f(λ)=⟨R(λ,T)x,y⟩ is an analytic function (see Exercise 3) tending to zero at infinity. Then by the Liouville theorem from complex analysis R(λ,T)=0, which is impossible. Thus the spectrum is not empty.
Proof.[Proof of Theorem 3] Spectrum is nonempty by Lemma 8 and compact by Corollary 6.
The following definition is of interest.
\[ r(T)=\sup\{|\lambda|: \lambda\in\sigma(T)\}. \]
From the Lemma 1 immediately follows that r(T)≤||T||. The more accurate estimation is given by the following theorem.
We start from the following general lemma:
Proof. The statement follows from the observation that for any n and m=nk+l with 0≤l<n we have $a_m\le ka_n+la_1$; thus, for big m we get $a_m/m\le a_n/n+la_1/m\le a_n/n+\epsilon$.
Proof.[Proof of Theorem 11] The existence of the limit limn→∞||Tn||1/n in (39) follows from the previous Lemma since by the Lemma 9 log||Tn+m||≤ log||Tn||+log||Tm||. Now we are using some results from the complex analysis. The Laurent series for the resolvent R(λ,T) in the neighbourhood of infinity is given by the von Neumann series (37). The radius of its convergence (which is equal, obviously, to r(T)) by the Hadamard theorem is exactly limn→∞||Tn||1/n.
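The spectral radius formula can be observed on a small matrix for which the norm and the spectral radius differ. The Python sketch below is an illustration only: the upper-triangular matrix T is a hypothetical example with the single eigenvalue 0.5 and norm greater than 1, and the operator norm of a real 2×2 matrix is computed as its largest singular value. The function names are our own.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def operator_norm_2x2(a):
    """Largest singular value of a real 2x2 matrix A.

    The eigenvalues of A^T A are the roots of s^2 - tr*s + det^2 = 0 with
    tr = sum of squared entries and det = det(A).
    """
    tr = sum(x * x for row in a for x in row)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = max(tr * tr - 4 * det * det, 0.0)
    return math.sqrt((tr + math.sqrt(disc)) / 2)

def gelfand_estimate(a, n):
    """||A^n||^{1/n}, which tends to the spectral radius r(A) as n grows."""
    power = a
    for _ in range(n - 1):
        power = mat_mul(power, a)
    return operator_norm_2x2(power) ** (1.0 / n)

# Hypothetical example: r(T) = 0.5 while ||T|| > 1.
T = [[0.5, 1.0], [0.0, 0.5]]

if __name__ == "__main__":
    print("norm:", operator_norm_2x2(T))
    for n in (1, 10, 100):
        print(n, gelfand_estimate(T, n))
```

The convergence to r(T)=0.5 is slow here because T is not normal; for a normal matrix the estimate equals the norm for every n.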
Proof. Indeed, as is known from complex analysis, the boundary of the convergence circle of a Laurent (or Taylor) series contains a singular point; the singular point of the resolvent obviously belongs to the spectrum.
Proof. If (T−λI)V=V(T−λI)=I then by taking adjoints $V^*(T^*-\bar\lambda I)=(T^*-\bar\lambda I)V^*=I$. So λ∈ρ(T) implies $\bar\lambda\in\rho(T^*)$; using the property T**=T we can invert the implication and get the statement of the proposition.
Proof.
| U=(T−iI)(T+iI)−1. |
| U−µ I=(T−iI)(T+iI)−1−(λ−i)(λ+i)−1I= 2i(λ+i)−1(T−λ I)(T+iI)−1, |
The above reduction of a self-adjoint operator to a unitary one (it
can be done on the opposite direction as well!) is an important tool
which can be applied in other questions as well, e.g. in the following
exercise.
It is not easy to study linear operators “in general” and there are many questions about operators in Hilbert spaces raised many decades ago which are still unanswered. Therefore it is reasonable to single out classes of operators which have (relatively) simple properties. Such a class of operators, closer to finite dimensional ones, will be studied here.
These operators are so compact that we even can fit them in our course
Let us recall some topological definition and results.
In the finite dimensional vector spaces ℝn or ℂn there is the following equivalent definition of compactness (equivalence of 1 and 2 is known as Heine–Borel theorem):
The set of finite rank operators is denoted by F(X,Y) and the set of compact operators by K(X,Y).
We intend to show that F(X,Y)⊂K(X,Y).
Proof. The proof is given by an explicit construction. Let N=dimZ and z1, z2, …, zN be a basis in Z. Let us define
\[ S:\ell_2^N\to Z\quad\text{by}\quad S(a_1,a_2,\ldots,a_N)=\sum_{k=1}^{N}a_kz_k, \]
then we have an estimation of norm:
|
So ||S||≤ (∑1N ||zk||2)1/2 and S is continuous.
Clearly S has the trivial kernel, in particular ||Sa||>0 if ||a||=1. By the Heine–Borel theorem the unit sphere in $\ell_2^N$ is compact, consequently the continuous function $a\mapsto\|\sum_{k=1}^{N}a_kz_k\|$ attains its lower bound, which has to be positive. This means there exists δ>0 such that ||a||=1 implies ||Sa||≥δ, or, equivalently, if ||z||<δ then $\|S^{-1}z\|<1$. The latter means that $\|S^{-1}\|\le\delta^{-1}$, which gives the boundedness of $S^{-1}$.
Proof. Let T∈F(X,Y). If $(x_n)_1^\infty$ is a bounded sequence in X then $(Tx_n)_1^\infty\subset Z=\operatorname{Im}T$ is also bounded. Let $S:\ell_2^N\to Z$ be the map constructed in the above Lemma. The sequence $(S^{-1}Tx_n)_1^\infty$ is bounded in $\ell_2^N$ and thus has a limiting point, say $a_0$. Then $Sa_0$ is a limiting point of $(Tx_n)_1^\infty$.
There is a simple condition which allows us to determine which diagonal operators are compact (in particular, the identity operator $I_X$ is not compact if dim X=∞):
Proof. If λn↛0 then there exists a subsequence λnk and δ>0 such that | λnk |>δ for all k. Now the sequence (enk) is bounded but its image T enk=λ nk enk has no convergent subsequence because for any k≠ l:
\[ \|\lambda_{n_k}e_{n_k}-\lambda_{n_l}e_{n_l}\|=\left(|\lambda_{n_k}|^2+|\lambda_{n_l}|^2\right)^{1/2}\ge\sqrt{2}\,\delta, \]
i.e. T enk is not a Cauchy sequence, see Figure 16.
For the converse, note that if λn→ 0 then we can define a finite rank operator Tm, m≥ 1—m-“truncation” of T by:
\[ T_me_n=\begin{cases}Te_n=\lambda_ne_n, & 1\le n\le m;\\ 0, & n>m.\end{cases} \tag{40} \]
Then obviously
\[ (T-T_m)e_n=\begin{cases}0, & 1\le n\le m;\\ \lambda_ne_n, & n>m,\end{cases} \]
and ||T−Tm||=supn>m| λn |→ 0 if m→ ∞. All Tm are finite rank operators (so are compact) and T is also compact as their limit—by the next Theorem.
Proof.
Take a bounded sequence (xn)1∞. From compactness
| of T1 | ⇒ ∃ | subsequence (xn(1))1∞ of (xn)1∞ | s.t. | (T1xn(1))1∞ is convergent. |
| of T2 | ⇒ ∃ | subsequence (xn(2))1∞ of (xn(1))1∞ | s.t. | (T2xn(2))1∞ is convergent. |
| of T3 | ⇒ ∃ | subsequence (xn(3))1∞ of (xn(2))1∞ | s.t. | (T3xn(3))1∞ is convergent. |
| … | … | … | … | … |
Could we find a subsequence which converges for all Tm
simultaneously? The first guess “take the intersection of all
above sequences (xn(k))1∞” does not work because the
intersection could be empty. The way out is provided by the
diagonal argument (see Table 2):
a subsequence (Tm xk(k))1∞ is convergent for
all m, because at latest after the term xm(m) it is a
subsequence of (xk(m))1∞.
We are claiming that a subsequence (T xk(k))1∞ of (T xn)1∞ is convergent as well. We use here є/3 argument (see Figure 17): for a given є>0 choose p∈ℕ such that ||T−Tp||<є/3.
Because $(T_px_k^{(k)})$ is convergent, it is a Cauchy sequence; thus there exists $n_0>p$ such that $\|T_px_k^{(k)}-T_px_l^{(l)}\|<\epsilon/3$ for all k, l>$n_0$. Then:
|
Thus T is compact.
A relation to compact operator is as follows.
Proof. Let T∈ B(H,K) have a convergent series ∑ ||T en||2 in an orthonormal basis (en)1∞ of H. We again (see (40)) define the m-truncation of T by the formula
\[ T_me_n=\begin{cases}Te_n, & 1\le n\le m;\\ 0, & n>m.\end{cases} \tag{41} \]
Then $T_m(\sum_{k=1}^{\infty}a_ke_k)=\sum_{k=1}^{m}a_kTe_k$ and each $T_m$ is a finite rank operator because its image is spanned by the finite set of vectors $Te_1,\ldots,Te_m$. We claim that $\|T-T_m\|\to 0$. Indeed by linearity and definition of $T_m$:
|
Thus:
|
so ||T−Tm||→ 0 and by the previous Theorem T is compact as a limit of compact operators.
\[ \|T\|\le\left(\sum_{n=1}^{\infty}\|Te_n\|^2\right)^{1/2}. \]
Proof. Just consider difference of T and T0=0 in (42)–(43).
\[ (Tf)(x)=\int_0^1K(x,y)f(y)\,dy,\quad f(y)\in L_2[0,1], \tag{44} \]
Proof. Let $(e_n)_{-\infty}^{\infty}$ be an orthonormal basis of L2[0,1], e.g. $(e^{2\pi int})_{n\in\mathbb{Z}}$. Let us consider the kernel $K_x(y)=K(x,y)$ as a function of the argument y depending on the parameter x. Then:
\[ (Te_n)(x)=\int_0^1K(x,y)e_n(y)\,dy=\int_0^1K_x(y)e_n(y)\,dy=\langle K_x,\bar e_n\rangle. \]
So ||T en||2= ∫01| ⟨ Kx,ēn ⟩ |2 dx. Consequently:
|
| (Tf)(x)= |
| (x−y)f(y) dy =x |
| f(y) dy − |
| yf(y) dy |
|
| Tf= |
| ⟨ f,e1 ⟩e1− |
| ⟨ f,e2 ⟩e2, |
Recall from Section 6.4 that an operator T is normal if TT*=T*T; Hermitian (T*=T) and unitary (T*=T−1) operators are normal.
Proof.
|
| λ⟨ x,y ⟩=⟨ Tx,y ⟩ =⟨ x,T*y ⟩=µ⟨ x,y ⟩ |
\[ \|Sx\|^2=\langle Sx,Sx\rangle=\langle S^2x,x\rangle\le\|S^2\|\,\|x\|^2 \]
Now we claim ||S||=||T||². From Theorem 9 and 15 we get ||S||=||T*T||≤||T||². On the other hand if ||x||=1 then
\[ \|T^*T\|\ge|\langle T^*Tx,x\rangle|=\langle Tx,Tx\rangle=\|Tx\|^2 \]
implies the opposite inequality ||S||≥||T||². And because $(T^{2^m})^*T^{2^m}=(T^*T)^{2^m}$ we get the equality
\[ \|T^{2^m}\|^2=\|(T^*T)^{2^m}\|=\|T^*T\|^{2^m}=\|T\|^{2^{m+1}}. \]
Thus:
\[ r(T)=\lim_{m\to\infty}\|T^{2^m}\|^{1/2^m}=\lim_{m\to\infty}\|T\|^{2^{m+1}/2^{m+1}}=\|T\|. \]
by the spectral radius formula (39).
| 0 | 1 |
| 0 | 0 |
Proof.
Proof.[Solution] Or straightforwardly assume the opposite: there exist δ>0 and infinitely many eigenvalues $\lambda_n$ such that $|\lambda_n|>\delta$. By the previous Theorem there is an orthonormal sequence $v_n$ of corresponding eigenvectors, $Tv_n=\lambda_nv_n$. Now the sequence $(v_n)$ is bounded but its image $Tv_n=\lambda_nv_n$ has no convergent subsequence because for any k≠l:
\[ \|\lambda_kv_k-\lambda_lv_l\|=\left(|\lambda_k|^2+|\lambda_l|^2\right)^{1/2}\ge\sqrt{2}\,\delta, \]
i.e. $Tv_n$ is not a Cauchy sequence, see Figure 16.
Proof. Assume without loss of generality that T≠0. Let λ∈σ(T); without loss of generality (multiplying by a scalar) λ=1.
We claim that if 1 is not an eigenvalue then there exist δ>0 such that
| ⎪⎪ ⎪⎪ | (I−T)x | ⎪⎪ ⎪⎪ | ≥ δ | ⎪⎪ ⎪⎪ | x | ⎪⎪ ⎪⎪ | . (46) |
Otherwise there exists a sequence of vectors $(x_n)$ with unit norm such that $(I-T)x_n\to 0$. Then from the compactness of T, for a subsequence $(x_{n_k})$ there is y∈H such that $Tx_{n_k}\to y$; then $x_{n_k}\to y$, implying Ty=y and y≠0, i.e. y is an eigenvector with eigenvalue 1.
Now we claim that Im(I−T) is closed, i.e. $y\in\overline{\operatorname{Im}(I-T)}$ implies y∈Im(I−T). Indeed, if $(I-T)x_n\to y$, then there is a subsequence $(x_{n_k})$ such that $Tx_{n_k}\to z$, implying $x_{n_k}\to y+z$; then (I−T)(z+y)=y.
Finally I−T is injective, i.e ker(I−T)={0}, by (46). By the property 1, ker(I−T*)={0} as well. But because always ker(I−T*)=Im(I−T)⊥ (check!) we got surjectivity, i.e. Im(I−T)⊥={0}, of I−T. Thus (I−T)−1 exists and is bounded because (46) implies ||y||>δ ||(I−T)−1y||. Thus 1∉σ(T).
The existence of eigenvalue λ such that | λ |=||T|| follows from combination of Lemma 13 and Theorem 3.
\[ Tx=\sum_{n=1}^{\infty}\lambda_n\langle x,e_n\rangle e_n,\quad\text{for all } x\in H. \tag{47} \]
Conversely, if T is given by a formula (47) then it is compact and normal.
Proof. Suppose T≠ 0. Then by the previous Theorem there exists an eigenvalue λ1 such that | λ1 |=||T|| with corresponding eigenvector e1 of the unit norm. Let H1=Lin(e1)⊥. If x∈ H1 then
\[ \langle Tx,e_1\rangle=\langle x,T^*e_1\rangle=\langle x,\bar\lambda_1e_1\rangle=\lambda_1\langle x,e_1\rangle=0, \tag{48} \]
thus Tx∈H1 and similarly T*x∈H1. Write T1=T|H1, which is again a normal compact operator with a norm not exceeding ||T||. We can inductively repeat this procedure for T1, obtaining a sequence of eigenvalues λ2, λ3, … with eigenvectors e2, e3, …. If Tn=0 for a finite n then the theorem is already proved. Otherwise we have an infinite sequence λn→0. Let
\[ x=\sum_{k=1}^{n}\langle x,e_k\rangle e_k+y_n\ \Rightarrow\ \|x\|^2=\sum_{k=1}^{n}|\langle x,e_k\rangle|^2+\|y_n\|^2,\quad y_n\in H_n, \]
from Pythagoras’s theorem. Then ||yn||≤ ||x|| and ||T yn||≤ ||Tn||||yn||≤ | λn |||x||→ 0 by Lemma 3. Thus
\[ Tx=\lim_{n\to\infty}\left(\sum_{k=1}^{n}\langle x,e_k\rangle Te_k+Ty_n\right)=\sum_{n=1}^{\infty}\lambda_n\langle x,e_n\rangle e_n. \]
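Conversely, if $Tx=\sum_{n=1}^{\infty}\lambda_n\langle x,e_n\rangle e_n$ then
\[ \langle Tx,y\rangle=\sum_{n=1}^{\infty}\lambda_n\langle x,e_n\rangle\langle e_n,y\rangle=\sum_{n=1}^{\infty}\langle x,e_n\rangle\,\lambda_n\overline{\langle y,e_n\rangle}, \]
thus $T^*y=\sum_{n=1}^{\infty}\bar\lambda_n\langle y,e_n\rangle e_n$. Then we get the normality of T: $T^*Tx=TT^*x=\sum_{n=1}^{\infty}|\lambda_n|^2\langle x,e_n\rangle e_n$. Also T is compact because it is a uniform limit of the finite rank operators $T_nx=\sum_{k=1}^{n}\lambda_k\langle x,e_k\rangle e_k$.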
\[ Tx=\sum_{n=1}^{\infty}\lambda_n\langle x,g_n\rangle g_n, \]
Proof. Let (en) be the orthonormal sequence constructed in the proof of the previous Theorem. Then x is perpendicular to all en if and only if it is in the kernel of T. Let (fn) be any orthonormal basis of kerT. Then the union of (en) and (fn) is the orthonormal basis (gn) we have looked for.
Proof. Operator T*T is compact and Hermitian (hence normal). From the previous Corollary there is an orthonormal basis (en) such that $T^*Tx=\sum_n\lambda_n\langle x,e_n\rangle e_n$ for some positive $\lambda_n=\|Te_n\|^2$. Let $\mu_n=\|Te_n\|$ and $f_n=Te_n/\mu_n$. Then $f_n$ is an orthonormal sequence (check!) and
\[ Tx=\sum_n\langle x,e_n\rangle Te_n=\sum_n\langle x,e_n\rangle\mu_nf_n. \]
Proof. Sufficiency follows from 9. Necessity: by the previous Corollary Tx =∑n ⟨ x,en ⟩ µn fn thus T is a uniform limit of operators Tm x=∑n=1m ⟨ x,en ⟩ µn fn which are of finite rank.
In this lecture we will study the Fredholm equation defined as follows. Let the integral operator with a kernel K(x,y) defined on [a,b]×[a,b] be defined as before:
\[ (T\phi)(x)=\int_a^bK(x,y)\phi(y)\,dy. \tag{49} \]
The Fredholm equation of the first and second kinds correspondingly are:
| Tφ=f and φ −λ Tφ=f, (50) |
for a function f on [a,b]. A special case is given by the Volterra equation, defined by an integral operator (49) T with a kernel K(x,y)=0 for all y>x, which can be written as:
\[ (T\phi)(x)=\int_a^xK(x,y)\phi(y)\,dy. \tag{51} \]
We will consider integral operators with kernels K such that $\int_a^b\int_a^b|K(x,y)|^2\,dx\,dy<\infty$; then by Theorem 15 T is a Hilbert–Schmidt operator and in particular bounded.
As a reason to study Fredholm operators we mention that solutions of differential equations in mathematical physics (notably heat and wave equations) require a decomposition of a function f as a linear combination of functions K(x,y) with “coefficients” φ. This is a continuous analogue of the discrete decomposition into Fourier series.
Using ideas from the proof of Lemma 4 we define Neumann series for the resolvent:
| (I−λ T)−1=I+λ T + λ2T2+⋯, (52) |
which is valid for all $|\lambda|<\|T\|^{-1}$.
\[ \phi(x)-\lambda\int_0^1y\,\phi(y)\,dy=x^2,\quad\text{on } L_2[0,1]. \]
| K(x,y)= | ⎧ ⎨ ⎩ |
|
|
| (Tnf)(x) = |
| y |
| dy= |
| . |
|
Among other integral operators there is an important subclass with separable kernel, namely a kernel which has a form:
\[ K(x,y)=\sum_{j=1}^{n}g_j(x)h_j(y). \tag{53} \]
In such a case:
|
i.e. the image of T is spanned by g1(x), …, gn(x) and is finite dimensional, consequently the solution of such equation reduces to linear algebra.
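The reduction to linear algebra is easiest to see for a kernel with a single term, K(x,y)=g(x)h(y). Writing the equation of the second kind as φ=f+λcg with the unknown number c=∫hφ and integrating against h gives c(1−λ∫hg)=∫hf. The Python sketch below is an illustration under assumptions: it treats only this rank-one case on [0,1], uses a midpoint rule for the integrals, and all names are hypothetical. The test data g=1, h(y)=y, f(x)=x² correspond to the equation φ(x)−λ∫₀¹yφ(y)dy=x².

```python
def integrate(func, a, b, steps=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def solve_rank_one_fredholm(g, h, f, lam, a=0.0, b=1.0):
    """Solve phi(x) - lam * g(x) * int_a^b h(y) phi(y) dy = f(x) for a rank-one kernel."""
    hf = integrate(lambda y: h(y) * f(y), a, b)
    hg = integrate(lambda y: h(y) * g(y), a, b)
    denom = 1.0 - lam * hg
    if abs(denom) < 1e-12:
        # 1/lam is then an eigenvalue of T and there is no unique solution
        raise ValueError("1 - lam * <h, g> vanishes")
    c = hf / denom  # c = int h(y) phi(y) dy
    return lambda x: f(x) + lam * c * g(x)

if __name__ == "__main__":
    phi = solve_rank_one_fredholm(lambda x: 1.0, lambda y: y, lambda x: x * x, lam=1.0)
    print(phi(0.3))  # close to 0.3**2 + 0.5
```

For a kernel with n terms the single unknown c becomes a vector and the scalar equation becomes an n×n linear system; that general case is not implemented here.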
|
|
|
We develop some Hilbert–Schmidt theory for integral operators.
\[ T\phi=\sum_{n=1}^{\infty}\lambda_n\langle\phi,v_n\rangle v_n\quad\text{where}\quad\phi=\sum_{n=1}^{\infty}\langle\phi,v_n\rangle v_n \]
Proof.
|
|
| ⎪ ⎪ | vn(x1)−vn(x2) | ⎪ ⎪ | ≤ |
| ⎪⎪ ⎪⎪ | vn | ⎪⎪ ⎪⎪ | 2 |
| ⎪ ⎪ | K(x1,y)−K(x2,y) | ⎪ ⎪ | dy |
\[ \phi=\sum_{n=1}^{\infty}\frac{\langle f,v_n\rangle}{1-\lambda\lambda_n}v_n. \tag{54} \]
Proof. Let φ=∑1∞an vn where an=⟨ φ,vn ⟩, then
\[ \phi-\lambda T\phi=\sum_{n=1}^{\infty}a_n(1-\lambda\lambda_n)v_n=f=\sum_{n=1}^{\infty}\langle f,v_n\rangle v_n \]
if and only if an=⟨ f,vn ⟩/(1−λ λn) for all n. Note 1−λ λn≠ 0 since λ−1∉σ(T).
Because $\lambda_n\to 0$, the convergence of $\sum_{n=1}^{\infty}|a_n|^2$ follows by comparison with $\sum_{n=1}^{\infty}|\langle f,v_n\rangle|^2=\|f\|^2$; thus the solution exists and is unique by the Riesz–Fischer Theorem.
See Exercise 30 for an example.
|
Proof.
| (I−λ T)φ= |
| (1−λ λn)⟨ φ,vn ⟩vn = |
| (1−λ λn)⟨ φ,vn ⟩vn. |
| φ= |
|
| vn +φ0, for any φ0∈Lin(v1,…,vN), |
| (Tφ)(x)= |
| (2xy−x−y+1)φ(y) dy. |
| (Tφ)(x)=x |
| (2y−1)φ(y) dy+ |
| (−y+1)φ(y) dy, |
| or T is given by the matrix | ⎛ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎝ |
| ⎞ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎠ |
|
|
| φ=f+ |
| ⟨ f,v2 ⟩v2+Cv1=f+ |
| ⟨ f,v2 ⟩v2+Cv1, C∈ℂ. |
| φ=f+ |
| ⟨ f,v1 ⟩v1+Cv2=f− |
| ⟨ f,v2 ⟩v2+Cv2, C∈ℂ. |
We will work with either the field of real numbers ℝ or the complex numbers ℂ. To avoid repetition, we use K to denote either ℝ or ℂ.
Recall, see Defn. 3, a norm on a vector space V is a map ||·||:V→[0,∞) such that
A norm induces a metric, see Defn. 1, on V by setting d(u,v)=||u−v||. When V is complete, see Defn. 6, for this metric, we say that V is a Banach space.
We will use the following simple inequality:
\[ |ab|\le\frac{|a|^p}{p}+\frac{|b|^q}{q}, \tag{57} \]
Proof. Consider the function $\varphi(t)=t^m-mt$ for 0<m<1. From its derivative $\varphi'(t)=m(t^{m-1}-1)$ we find the only critical point t=1 on [0,∞), which is its maximum. Thus write the inequality φ(t)≤φ(1) for $t=a^p/b^q$ and m=1/p. After transformation we get $ab^{-q/p}-1\le\frac{1}{p}(a^pb^{-q}-1)$, and multiplication by $b^q$ with rearrangement leads to the desired result.
\[ \sum_{j=1}^{n}|u_jv_j|\le\left(\sum_{j=1}^{n}|u_j|^p\right)^{1/p}\left(\sum_{j=1}^{n}|v_j|^q\right)^{1/q}. \]
Proof. For reasons that will become clear soon we use the notation $\|u\|_p=(\sum_{j=1}^{n}|u_j|^p)^{1/p}$ and $\|v\|_q=(\sum_{j=1}^{n}|v_j|^q)^{1/q}$ and define for 1≤i≤n:
\[ a_i=\frac{u_i}{\|u\|_p}\quad\text{and}\quad b_i=\frac{v_i}{\|v\|_q}. \]
Summing up for 1≤ i ≤ n all inequalities obtained from (57):
\[ |a_ib_i|\le\frac{|a_i|^p}{p}+\frac{|b_i|^q}{q}, \]
we get the result.
Using Hölder inequality we can derive the following one:
\[ \left(\sum_{j=1}^{n}|u_j+v_j|^p\right)^{1/p}\le\left(\sum_{j=1}^{n}|u_j|^p\right)^{1/p}+\left(\sum_{j=1}^{n}|v_j|^p\right)^{1/p}. \]
Proof. For p>1, the triangle inequality $|x_k+y_k|\le|x_k|+|y_k|$ gives:
\[ \sum_{k=1}^{n} |x_k+y_k|^p \le \sum_{k=1}^{n} |x_k|\,|x_k+y_k|^{p-1} + \sum_{k=1}^{n} |y_k|\,|x_k+y_k|^{p-1}. \tag{58} \]
By Hölder's inequality
\[ \sum_{k=1}^{n} |x_k|\,|x_k+y_k|^{p-1} \le \left(\sum_{k=1}^{n} |x_k|^p\right)^{1/p} \left(\sum_{k=1}^{n} |x_k+y_k|^{q(p-1)}\right)^{1/q}. \]
Adding a similar inequality for the second term on the right-hand side of (58), noting that $q(p-1)=p$, and dividing by $\left(\sum_{k=1}^{n}|x_k+y_k|^{q(p-1)}\right)^{1/q}$ yields the result.
Minkowski’s inequality shows that for $1\le p<\infty$ (the case p=1 is easy) we can define a norm $\|\cdot\|_p$ on $K^n$ by
\[ \|u\|_p = \left(\sum_{j=1}^{n} |u_j|^p\right)^{1/p} \qquad \left(u=(u_1,\ldots,u_n)\in K^n\right). \]
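To see why the restriction p ≥ 1 matters, the following sketch checks the triangle inequality for several values of p and then shows one way it can fail for p = 1/2. The names `p_norm` and `triangle_gap` and the sample vectors are hypothetical choices made for this illustration.

```python
def p_norm(u, p):
    """(sum_j |u_j|^p)^(1/p); this is a norm only when p >= 1."""
    return sum(abs(x) ** p for x in u) ** (1.0 / p)

def triangle_gap(u, v, p):
    """||u||_p + ||v||_p - ||u + v||_p, non-negative whenever Minkowski applies."""
    w = [a + b for a, b in zip(u, v)]
    return p_norm(u, p) + p_norm(v, p) - p_norm(w, p)

u = [1.0, -2.0, 3.0]
v = [0.5, 4.0, -1.0]

# Minkowski's inequality: the gap is non-negative for p >= 1.
gaps = {p: triangle_gap(u, v, p) for p in (1.0, 1.5, 2.0, 4.0)}

# For p = 1/2 the same formula is not a norm: with u = (1, 0), v = (0, 1)
# the "norm" of the sum is (1 + 1)^2 = 4, which exceeds 1 + 1 = 2.
bad_gap = triangle_gap([1.0, 0.0], [0.0, 1.0], 0.5)
```

The negative value of `bad_gap` is the reason the definition is restricted to p ≥ 1.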
We can define an infinite analogue of this. Let 1≤ p<∞, let ℓp be the space of all scalar sequences (xn) with ∑n | xn |p < ∞. A careful use of Minkowski’s inequality shows that ℓp is a vector space. Then ℓp becomes a normed space for the ||·||p norm.
Recall that a Cauchy sequence, see Defn. 5, in a normed space is bounded: if (xn) is Cauchy then we can find N with ||xn−xm||<1 for all n,m≥ N. Then ||xn|| ≤ ||xn−xN|| + ||xN|| < ||xN||+1 for n≥ N, so in particular, ||xn|| ≤ max( ||x1||,||x2||,⋯,||xN−1||,||xN||+1).
Proof. Most completeness proofs are similar to this one (see Thm. 24), so we shall prove this result in detail. Let (x(n)) be a Cauchy sequence in ℓp; we wish to show that it converges to some vector in ℓp.
For each n, x(n)∈ℓp so is a sequence of scalars, say (xk(n))k=1∞. As (x(n)) is Cauchy, for each є>0 there exists Nє so that ||x(n) − x(m)||p ≤ є for n,m≥ Nє.
For k fixed,
\[ \left| x_k^{(n)} - x_k^{(m)} \right| \le \left(\sum_{j=1}^{\infty} \left| x_j^{(n)} - x_j^{(m)} \right|^p\right)^{1/p} = \left\| x^{(n)} - x^{(m)} \right\|_p \le \epsilon, \]
when n,m≥ Nє. Thus the scalar sequence (xk(n))n=1∞ is Cauchy in K and hence converges, to yk say.
Let y=(yk), so that y is a candidate for the limit of (x(n)). Firstly, we check that y∈ℓp. We calculate,
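\[ \sum_{k=1}^{\infty} |y_k|^p = \lim_{M\to\infty}\sum_{k=1}^{M} |y_k|^p = \lim_{M\to\infty}\lim_{n\to\infty}\sum_{k=1}^{M} \left|x_k^{(n)}\right|^p \le \sup_{n} \left\|x^{(n)}\right\|_p^p < \infty, \]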
as (x(n)) is Cauchy, and hence bounded.
Finally, we check that x(n)→ y in ℓp. For є>0, let n≥ Nє, so that
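\[ \left\|x^{(n)}-y\right\|_p^p = \lim_{M\to\infty}\sum_{k=1}^{M} \left|x_k^{(n)}-y_k\right|^p = \lim_{M\to\infty}\lim_{m\to\infty}\sum_{k=1}^{M} \left|x_k^{(n)}-x_k^{(m)}\right|^p \le \sup_{m\ge N_\epsilon}\left\|x^{(n)}-x^{(m)}\right\|_p^p \le \epsilon^p, \]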
as n≥ Nє. Hence ||x(n)−y||p→0.
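The structure of this proof can be illustrated on one concrete Cauchy sequence in ℓ2. The sketch below uses the truncations x(n) = (1, 1/2, …, 1/n, 0, 0, …), an example chosen here for illustration and not taken from the notes; the helper names `x` and `dist_p` are hypothetical.

```python
import math

def x(n):
    """Truncation (1, 1/2, ..., 1/n); all later coordinates are zero and not stored."""
    return [1.0 / k for k in range(1, n + 1)]

def dist_p(a, b, p):
    """l_p distance between two finitely supported sequences."""
    length = max(len(a), len(b))
    a = a + [0.0] * (length - len(a))   # pad with the implicit zeros
    b = b + [0.0] * (length - len(b))
    return sum(abs(s - t) ** p for s, t in zip(a, b)) ** (1.0 / p)

m, n = 100, 1000
d = dist_p(x(m), x(n), 2)

# The tail sum_{k>m} 1/k^2 is below 1/m, so the sequence is Cauchy in l_2.
tail_bound = math.sqrt(1.0 / m)

# Each coordinate difference is dominated by the l_2 distance,
# which is the first step of the proof (coordinate-wise Cauchy).
coordinate_diff = abs(x(n)[m] - 0.0)    # coordinate m + 1 of x(n) minus that of x(m)
```

Here the coordinate-wise limit is y = (1/k), which lies in ℓ2 because the series of 1/k² converges.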
For p=∞, there are two analogues of the ℓp spaces. The first is arguably more natural, but we write c0 for it: c0 is the space of all scalar sequences (xn) which converge to 0. We equip c0 with the sup norm,
\[ \left\|(x_n)\right\|_\infty = \sup_{n\in\mathbb{N}} |x_n| \qquad \left((x_n)\in c_0\right). \]
This is defined, as if xn→0, then (xn) is bounded. Similarly, we define ℓ∞ to be the vector space of all bounded scalar sequences, with the ||·||∞ norm. Hence c0 is a subspace of ℓ∞, and we can check that c0 is closed.
Proof. This will be a variant of the previous proof: it’s shorter, but the “trick” is maybe harder to remember. We do the ℓ∞ case. Again, let (x(n)) be a Cauchy sequence in ℓ∞, and for each n, let x(n)=(xk(n))k=1∞. For є>0 we can find N such that ||x(n)−x(m)||∞< є for n,m≥ N. Thus, for any k, we see that | xk(n) − xk(m) | < є when n,m≥ N. So (xk(n))n=1∞ is Cauchy, and hence converges, say to xk∈K. Let x=(xk).
Let m≥ N, so that for any k, we have that
\[ \left| x_k - x_k^{(m)} \right| = \lim_{n\to\infty} \left| x_k^{(n)} - x_k^{(m)} \right| \le \epsilon. \]
As k was arbitrary, we see that supk | xk−xk(m) | ≤ є. So, firstly, this shows that (x−x(m))∈ℓ∞, and so also x = (x−x(m)) + x(m) ∈ ℓ∞. Secondly, we have shown that ||x−x(m)||∞≤ є when m≥ N, so x(m)→ x in norm.
\[ \|f\|_p = \left(\int |f(t)|^p\,dt\right)^{1/p}. \]
Recall what a linear map is, see Defn. 1. A linear map is often called an operator. A linear map T:E→ F between normed spaces is bounded if there exists M>0 such that ||T(x)|| ≤ M ||x|| for all x∈ E, see Defn. 3. We write B(E,F) for the set of bounded operators from E to F. For the natural operations, B(E,F) is a vector space. We norm B(E,F) by setting
\[ \|T\| = \sup\left\{ \frac{\|T(x)\|}{\|x\|} : x\in E,\ x\neq 0 \right\}. \tag{59} \]
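For a finite-dimensional illustration of (59), the sketch below estimates the operator norm of a 2×2 real matrix acting on ℝ² with the Euclidean norm. By homogeneity the ratio only depends on the direction of x, so it samples unit vectors on a half circle. The matrix `A`, the step count and the function names are assumptions made for this example; the exact value used for comparison is the square root of the largest eigenvalue of AᵀA, which for this particular A equals √(3+√5).

```python
import math

def apply(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def norm2(x):
    """Euclidean norm of a finite vector."""
    return math.sqrt(sum(t * t for t in x))

def estimate_operator_norm(A, steps=10000):
    """Approximate sup ||Ax|| / ||x|| over x != 0 in R^2 by sampling directions.

    The ratio is unchanged under x -> -x, so angles in [0, pi) suffice.
    """
    best = 0.0
    for i in range(steps):
        theta = math.pi * i / steps
        x = [math.cos(theta), math.sin(theta)]
        best = max(best, norm2(apply(A, x)) / norm2(x))
    return best

A = [[2.0, 1.0], [0.0, 1.0]]
estimate = estimate_operator_norm(A)

# A^T A = [[4, 2], [2, 2]] has eigenvalues 3 +/- sqrt(5).
exact = math.sqrt(3.0 + math.sqrt(5.0))
```

Sampling can only underestimate the supremum, so `estimate` approaches `exact` from below as the number of steps grows.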