
In mathematics, approximation theory is concerned with how functions can best be approximated with simpler functions, and with quantitatively characterizing the errors introduced thereby. What is meant by best and simpler will depend on the application.
A closely related topic is the approximation of functions by generalized Fourier series, that is, approximations based upon summation of a series of terms based upon orthogonal polynomials.
One problem of particular interest is that of approximating a function in a computer mathematical library, using operations that can be performed on the computer or calculator (e.g. addition and multiplication), such that the result is as close to the actual function as possible. This is typically done with polynomial or rational (ratio of polynomials) approximations.
The objective is to make the approximation as close as possible to the actual function, typically with an accuracy close to that of the underlying computer's floating point arithmetic. This is accomplished by using a polynomial of high degree, and/or narrowing the domain over which the polynomial has to approximate the function. Narrowing the domain can often be done through the use of various addition or scaling formulas for the function being approximated. Modern mathematical libraries often reduce the domain into many tiny segments and use a low-degree polynomial for each segment.
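For instance, a routine for exp(x) can use the identity $e^x = 2^k e^r$, choosing the integer k so that the remainder r satisfies |r| ≤ (ln 2)/2; the polynomial then only has to cover that narrow interval. The following is a minimal sketch in Python with NumPy; the degree, the fit (interpolation at Chebyshev nodes rather than a true minimax fit), and the name exp_approx are illustrative choices, not any particular library's implementation:

```python
import numpy as np

LN2 = np.log(2.0)

# Fit a low-degree polynomial to exp on the reduced interval
# [-ln(2)/2, ln(2)/2].  Interpolating at Chebyshev nodes gives
# near-minimax behavior; a production library would use a true
# minimax polynomial instead (see "Optimal polynomials" below).
deg = 5
nodes = np.cos(np.pi * (np.arange(deg + 1) + 0.5) / (deg + 1)) * (LN2 / 2)
coeffs = np.polyfit(nodes, np.exp(nodes), deg)

def exp_approx(x):
    k = np.round(x / LN2)      # exponent of the power-of-two factor
    r = x - k * LN2            # remainder in [-ln(2)/2, ln(2)/2]
    return np.exp2(k) * np.polyval(coeffs, r)

x = np.linspace(-10.0, 10.0, 1001)
rel_err = np.abs(exp_approx(x) - np.exp(x)) / np.exp(x)
print(rel_err.max())           # on the order of 1e-7 for degree 5
```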
[Figure: Error between the optimal polynomial and log(x) (red), and between the Chebyshev approximation and log(x) (blue), over the interval [2, 4]. Vertical divisions are $10^{-5}$. Maximum error for the optimal polynomial is $6.07 \times 10^{-5}$.]
[Figure: Error between the optimal polynomial and exp(x) (red), and between the Chebyshev approximation and exp(x) (blue), over the interval [−1, 1]. Vertical divisions are $10^{-4}$. Maximum error for the optimal polynomial is $5.47 \times 10^{-4}$.]
Optimal polynomials
Once the domain (typically an interval) and degree of the polynomial are chosen, the polynomial itself is chosen in such a way as to minimize the worst-case error. That is, the goal is to minimize the maximum value of $|P(x) - f(x)|$, where P(x) is the approximating polynomial, f(x) is the actual function, and x varies over the chosen interval. For well-behaved functions, there exists an Nth-degree polynomial that will lead to an error curve that oscillates back and forth between $+\varepsilon$ and $-\varepsilon$ a total of N + 2 times, giving a worst-case error of $\varepsilon$. It is seen that there exists an Nth-degree polynomial that can interpolate N + 1 points in a curve. That such a polynomial is always optimal is asserted by the equioscillation theorem. It is possible to make contrived functions f(x) for which no such polynomial exists, but these occur rarely in practice.
For example, the graphs shown to the right show the error in approximating log(x) and exp(x) for N = 4. The red curves, for the optimal polynomial, are level, that is, they oscillate between $+\varepsilon$ and $-\varepsilon$ exactly. In each case, the number of extrema is N + 2, that is, 6. Two of the extrema are at the end points of the interval, at the left and right edges of the graphs.
[Figure: Error P(x) − f(x) for the level polynomial (red) and for a purported better polynomial (blue).]
To prove this is true in general, suppose P is a polynomial of degree N having the property described, that is, it gives rise to an error function that has N + 2 extrema, of alternating signs and equal magnitudes. The red graph to the right shows what this error function might look like for N = 4. Suppose Q(x) (whose error function is shown in blue to the right) is another Nth-degree polynomial that is a better approximation to f than P. In particular, Q is closer to f than P for each value $x_i$ where an extremum of P − f occurs, so
$$|Q(x_i) - f(x_i)| < |P(x_i) - f(x_i)|.$$
When a maximum of P − f occurs at $x_i$, then
$$Q(x_i) - f(x_i) \le |Q(x_i) - f(x_i)| < |P(x_i) - f(x_i)| = P(x_i) - f(x_i),$$
and when a minimum of P − f occurs at $x_i$, then
$$f(x_i) - Q(x_i) \le |Q(x_i) - f(x_i)| < |P(x_i) - f(x_i)| = f(x_i) - P(x_i).$$
So, as can be seen in the graph, [P(x) − f(x)] − [Q(x) − f(x)] must alternate in sign for the N + 2 values of $x_i$. But [P(x) − f(x)] − [Q(x) − f(x)] reduces to P(x) − Q(x), which is a polynomial of degree at most N. This function changes sign at least N + 1 times, so by the intermediate value theorem it has at least N + 1 zeroes; that is impossible for a polynomial of degree at most N unless it is identically zero, and P − Q cannot be identically zero because it alternates in sign at the $x_i$. Hence no better Nth-degree approximation Q exists.
Chebyshev approximation
One can obtain polynomials very close to the optimal one by expanding the given function in terms of Chebyshev polynomials and then cutting off the expansion at the desired degree. This is similar to the Fourier analysis of the function, using the Chebyshev polynomials instead of the usual trigonometric functions.
If one calculates the coefficients in the Chebyshev expansion for a function:
$$f(x) \sim \sum_{i=0}^{\infty} c_i T_i(x)$$
and then cuts off the series after the $T_N$ term, one gets an Nth-degree polynomial approximating f(x).
The reason this polynomial is nearly optimal is that, for functions with rapidly converging power series, if the series is cut off after some term, the total error arising from the cutoff is close to the first term after the cutoff. That is, the first term after the cutoff dominates all later terms. The same is true if the expansion is in terms of Chebyshev polynomials. If a Chebyshev expansion is cut off after $T_N$, the error will take a form close to a multiple of $T_{N+1}$. The Chebyshev polynomials have the property that they are level – they oscillate between +1 and −1 in the interval [−1, 1]. $T_{N+1}$ has N + 2 level extrema. This means that the error between f(x) and its Chebyshev expansion out to $T_N$ is close to a level function with N + 2 extrema, so it is close to the optimal Nth-degree polynomial.
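As a concrete sketch (Python with NumPy; oversampling at Chebyshev nodes to recover the expansion coefficients, and the node count M, are illustrative assumptions), one can truncate the Chebyshev expansion of exp(x) after $T_4$ and check that the resulting error is close to the optimal value quoted in the figure caption above:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

N = 4                                     # degree of the truncated expansion
M = 32                                    # oversampling; coefficients of exp decay fast
k = np.arange(M)
nodes = np.cos(np.pi * (k + 0.5) / M)     # Chebyshev nodes of the first kind
coeffs = C.chebfit(nodes, np.exp(nodes), M - 1)   # (nearly) the full expansion
p = C.Chebyshev(coeffs[:N + 1])           # cut the series off after T_N

x = np.linspace(-1.0, 1.0, 1001)
print(np.max(np.abs(p(x) - np.exp(x))))   # about 6e-4, vs. the optimal 5.47e-4
```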
In the graphs above, the blue error function is sometimes better than (inside of) the red function, but sometimes worse, meaning that it is not quite the optimal polynomial. The discrepancy is less serious for the exp function, which has an extremely rapidly converging power series, than for the log function.
Chebyshev approximation is the basis for Clenshaw–Curtis quadrature, a numerical integration technique.
Remez's algorithm
The Remez algorithm (sometimes spelled Remes) is used to produce an optimal polynomial P(x) approximating a given function f(x) over a given interval. It is an iterative algorithm that converges to a polynomial that has an error function with N+2 level extrema. By the theorem above, that polynomial is optimal.
Remez's algorithm uses the fact that one can construct an Nth-degree polynomial that leads to level and alternating error values, given N+2 test points.
Given N + 2 test points $x_1, x_2, \ldots, x_{N+2}$ (where $x_1$ and $x_{N+2}$ are presumably the end points of the interval of approximation), these equations need to be solved:
$$\begin{aligned}
P(x_1) - f(x_1) &= +\varepsilon \\
P(x_2) - f(x_2) &= -\varepsilon \\
P(x_3) - f(x_3) &= +\varepsilon \\
&\;\;\vdots \\
P(x_{N+2}) - f(x_{N+2}) &= \pm\varepsilon
\end{aligned}$$
The right-hand sides alternate in sign. That is,
$$\begin{aligned}
P_0 + P_1 x_1 + P_2 x_1^2 + P_3 x_1^3 + \dots + P_N x_1^N - f(x_1) &= +\varepsilon \\
P_0 + P_1 x_2 + P_2 x_2^2 + P_3 x_2^3 + \dots + P_N x_2^N - f(x_2) &= -\varepsilon \\
&\;\;\vdots
\end{aligned}$$
Since $x_1, \ldots, x_{N+2}$ were given, all of their powers are known, and $f(x_1), \ldots, f(x_{N+2})$ are also known. That means that the above equations are just N + 2 linear equations in the N + 2 variables $P_0, P_1, \ldots, P_N$, and $\varepsilon$. Given the test points $x_1, \ldots, x_{N+2}$, one can solve this system to get the polynomial P and the number $\varepsilon$.
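A minimal sketch of this first step in Python with NumPy, using the example discussed just below (f(x) = $e^x$, N = 4, and the six test points quoted there); the matrix layout is one straightforward encoding of the equations above, not a prescribed one:

```python
import numpy as np

N = 4
f = np.exp
x = np.array([-1.0, -0.7, -0.1, 0.4, 0.9, 1.0])   # the N + 2 test points

# Row i encodes  P_0 + P_1*x_i + ... + P_N*x_i**N - (-1)**i * eps = f(x_i),
# so the unknown vector is (P_0, ..., P_N, eps).
A = np.zeros((N + 2, N + 2))
A[:, :N + 1] = x[:, None] ** np.arange(N + 1)
A[:, N + 1] = -(-1.0) ** np.arange(N + 2)          # alternating sign on epsilon
sol = np.linalg.solve(A, f(x))
P, eps = sol[:N + 1], sol[N + 1]
print(abs(eps))   # about 4.43e-4, the value quoted below
```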
The graph below shows an example of this, producing a fourth-degree polynomial approximating $e^x$ over [−1, 1]. The test points were set at −1, −0.7, −0.1, +0.4, +0.9, and +1. Those values are shown in green. The resultant value of $\varepsilon$ is $4.43 \times 10^{-4}$.
[Figure: Error of the polynomial produced by the first step of Remez's algorithm, approximating $e^x$ over the interval [−1, 1]. Vertical divisions are $10^{-4}$.]
The error graph does indeed take on the values $\pm\varepsilon$ at the six test points, including the end points, but those points are not extrema. If the four interior test points had been extrema (that is, if the function P(x) − f(x) had maxima or minima there), the polynomial would be optimal.
The second step of Remez's algorithm consists of moving the test points to the approximate locations where the error function had its actual local maxima or minima. For example, one can tell from looking at the graph that the point at −0.1 should have been at about −0.28. The way to do this in the algorithm is to use a single round of Newton's method. Since one knows the first and second derivatives of P(x) − f(x), one can calculate approximately how far a test point has to be moved so that the derivative will be zero.
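In symbols, with error function E(x) = P(x) − f(x), one Newton step on E′ moves a test point x to x − E′(x)/E″(x). A minimal sketch, assuming hypothetical callables dP, ddP, df, ddf for the first and second derivatives of P and f:

```python
def newton_step(x, dP, ddP, df, ddf):
    # One Newton step toward a zero of E'(x) = P'(x) - f'(x),
    # i.e. toward a nearby local extremum of the error E = P - f.
    return x - (dP(x) - df(x)) / (ddP(x) - ddf(x))
```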
Calculating the derivatives of a polynomial is straightforward. One must also be able to calculate the first and second derivatives of f(x). Remez's algorithm requires an ability to calculate $f(x)$, $f'(x)$, and $f''(x)$ to extremely high precision. The entire algorithm must be carried out to higher precision than the desired precision of the result.
After moving the test points, the linear equation part is repeated, getting a new polynomial, and Newton's method is used again to move the test points again. This sequence is continued until the result converges to the desired accuracy. The algorithm converges very rapidly. Convergence is quadratic for well-behaved functions: if the test points are within $10^{-15}$ of the correct result, they will be approximately within $10^{-30}$ of the correct result after the next round.
Remez's algorithm is typically started by choosing the extrema of the Chebyshev polynomial $T_{N+1}$ as the initial points, since the final error function will be similar to that polynomial.
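Putting the pieces together, the following is a bare-bones sketch of the whole iteration in Python with NumPy (a fixed iteration count, end points held fixed, and caller-supplied derivatives are simplifying assumptions; a robust implementation would also use extended precision, as noted above):

```python
import numpy as np

def remez(f, df, ddf, N, iters=10):
    # Initial test points: the N + 2 extrema of T_{N+1} on [-1, 1].
    x = np.cos(np.pi * np.arange(N + 2) / (N + 1))[::-1]
    for _ in range(iters):
        # Solve the linear system for P_0..P_N and epsilon (see above).
        A = np.zeros((N + 2, N + 2))
        A[:, :N + 1] = x[:, None] ** np.arange(N + 1)
        A[:, N + 1] = -(-1.0) ** np.arange(N + 2)
        sol = np.linalg.solve(A, f(x))
        P = np.polynomial.Polynomial(sol[:N + 1])
        dP, ddP = P.deriv(), P.deriv(2)
        # One Newton step moves each interior point toward the nearest
        # extremum of E = P - f; the end points stay at the interval ends.
        xi = x[1:-1]
        x[1:-1] = xi - (dP(xi) - df(xi)) / (ddP(xi) - ddf(xi))
    return P, sol[N + 1]

P, eps = remez(np.exp, np.exp, np.exp, N=4)
print(abs(eps))   # about 5.47e-4, the optimal error quoted in the caption above
```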
Main journals
- Journal of Approximation Theory
- Constructive Approximation
- East Journal on Approximations
See also
- Estimation theory
- Fourier series
- Function approximation
- Numerical analysis
- Orthonormal basis
- Padé approximant
- Schauder basis
- Kalman filter
References
- Achiezer (Akhiezer), N.I. (2013) [1956]. Theory of approximation. Translated by Hyman, C.J. Dover. ISBN 978-0-486-15313-1. OCLC 1067500225.
- Timan, A.F. (2014) [1963]. Theory of approximation of functions of a real variable. International Series in Pure and Applied Mathematics. Vol. 34. Elsevier. ISBN 978-1-4831-8481-4.
- Hastings, Jr., C. (2015) [1955]. Approximations for Digital Computers. Princeton University Press. ISBN 978-1-4008-7559-7.
- Hart, J.F.; Cheney, E.W.; Lawson, C.L.; Maehly, H.J.; Mesztenyi, C.K.; Rice, Jr., J.R.; Thacher, H.C.; Witzgall, C. (1968). Computer Approximations. Wiley. ISBN 0-471-35630-1.
- Fox, L.; Parker, I.B. (1968). Chebyshev Polynomials in Numerical Analysis. Oxford mathematical handbooks. Oxford University Press. ISBN 978-0-19-859614-1. OCLC 9036207.
- Press, WH; Teukolsky, S.A.; Vetterling, W.T.; Flannery, B.P. (2007). "§5.8 Chebyshev Approximation". Numerical Recipes: The Art of Scientific Computing (3rd ed.). Cambridge University Press. ISBN 978-0-521-88068-8.
- Cody, Jr., W.J.; Waite, W. (1980). Software Manual for the Elementary Functions. Prentice-Hall. ISBN 0-13-822064-6. OCLC 654695035.
- Remes (Remez), E. (1934). "Sur le calcul effectif des polynomes d'approximation de Tschebyschef". C. R. Acad. Sci. (in French). 199: 337–340.
- Steffens, K.-G. (2006). The History of Approximation Theory: From Euler to Bernstein. Birkhäuser. doi:10.1007/0-8176-4475-X. ISBN 0-8176-4353-2.
- Erdélyi, T. (2008). "Extensions of the Bloch-Pólya theorem on the number of distinct real zeros of polynomials" (PDF). Journal de théorie des nombres de Bordeaux. 20: 281–7.
- Erdélyi, T. (2009). "The Remez inequality for linear combinations of shifted Gaussians". Math. Proc. Camb. Phil. Soc. 146: 523–530. doi:10.1017/S0305004108001849.
- Trefethen, L.N. (2020). Approximation theory and approximation practice. SIAM. ISBN 978-1-61197-594-9. Ch. 1–6 of 2013 edition
External links
- History of Approximation Theory (HAT)
- Surveys in Approximation Theory (SAT)