NUMERICAL ANALYSIS: Number Representation in the Computer. Errors
Scientific computing is a discipline concerned with the development and study of numerical algorithms for solving mathematical problems that arise in various disciplines in science and engineering.
Numerical methods are an essential part of an engineer's life. They make it possible to solve a much wider range of physical and engineering problems than analytical methods do. Numerical methods are, however, only one part of the solution: a good physical understanding of the problem is essential. Numerical methods are the elementary pieces of more complex codes and software used for scientific simulation. Studying numerical methods allows one to
- understand how more complex codes work
- modify an existing code or create a new one
- understand the errors and limits introduced by the numerical simulation
2. NUMBER REPRESENTATION IN THE COMPUTER: base 10 and base 2
Numbers that have a finite expansion in one number system may have an infinite expansion in another: 1.1, for example, can never be represented exactly in the binary system.
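The claim about 1.1 can be checked directly. The following is a minimal sketch, assuming standard Python floats (IEEE-754 doubles); the helper name `stored_value` is hypothetical and only used for illustration.

```python
from decimal import Decimal
from fractions import Fraction


def stored_value(x: float) -> Decimal:
    """Exact decimal expansion of the binary double that actually stores x."""
    return Decimal(x)


# 1.1 has an infinite binary expansion, so the stored value is only close to 1.1.
print(stored_value(1.1))
# 1.5 = 1 + 1/2 has a finite binary expansion, so it is stored exactly.
print(stored_value(1.5))
# The same fact expressed with exact rational arithmetic.
print(Fraction(1.1) == Fraction(11, 10))
```

The first printed value differs from 1.1 far to the right of the decimal point, which is the round-off discussed later in these slides.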
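Convert Base 10 Integer to Binary Representation
Converting the base-10 integer 11 to binary representation by repeated division by 2:
Division  Quotient  Remainder
11/2      5         1
5/2       2         1
2/2       1         0
1/2       0         1
Reading the remainders from last to first gives $(11)_{10} = (1011)_2$.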
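The repeated-division procedure can be sketched in a few lines. This is an illustrative implementation, not part of the original slides; the function name `int_to_binary` is an assumption.

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative base-10 integer to a binary string
    by repeated division by 2, collecting the remainders."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)      # quotient and remainder of n / 2
        remainders.append(str(r))
    # The last remainder is the most significant bit.
    return "".join(reversed(remainders))


print(int_to_binary(11))
```

The result can be compared with Python's built-in `bin(11)`, which returns the same digits with a `0b` prefix.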
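Fractional Decimal Number to Binary
Converting a base-10 fraction to binary representation: multiply by 2 repeatedly; at each step the number before the decimal point is the next binary digit, and the number after the decimal point is carried into the next multiplication. The doubling sequence in the table (columns: number, number after decimal, number before decimal) runs 0.375, 0.75, 1.5 (0.5 after the decimal point), 1.0 (0.0 after the decimal point). The fractional part reaches 0.0, so the binary expansion terminates.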
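Decimal Number to Binary
A number with an integer part and a fractional part is converted by converting each part separately and joining the two results at the binary point; the integer part uses the result $(11)_{10} = (1011)_2$ and the fractional part uses the result of the multiplication table.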
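Not All Fractional Decimal Numbers Can Be Represented Exactly
Converting a base-10 fraction to an approximate binary representation. The doubling sequence in the table (columns: number, number after decimal, number before decimal) runs 0.6, 1.2 (0.2 after the decimal point), 0.4, 0.8, 1.6. The fractional part 0.6 then recurs, so the binary expansion never terminates and must be cut off after a finite number of bits.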
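The multiply-by-2 procedure for fractions can be sketched as follows. This is a minimal illustration using exact rational arithmetic; the function name `fraction_to_binary` and the inputs 3/8 and 3/10 are assumptions chosen for the demonstration, not values confirmed by the slides.

```python
from fractions import Fraction


def fraction_to_binary(x: Fraction, max_bits: int):
    """Return (bits after the binary point, terminated?) for 0 <= x < 1.

    Each step doubles x; the number before the decimal point is the next
    binary digit and the number after it is carried to the next step."""
    if not 0 <= x < 1:
        raise ValueError("x must satisfy 0 <= x < 1")
    bits = []
    for _ in range(max_bits):
        if x == 0:
            break
        x *= 2
        digit = int(x)           # number before the decimal point
        bits.append(str(digit))
        x -= digit               # number after the decimal point
    return "".join(bits), x == 0


print(fraction_to_binary(Fraction(3, 8), 8))    # terminating expansion
print(fraction_to_binary(Fraction(3, 10), 8))   # cut off after 8 bits
```

Exact fractions are used instead of floats so that the demonstration is not itself affected by the round-off it illustrates.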
Floating Point Representation (decimal) Floating Decimal Point : Scientific Form
Floating-Point Arithmetic (cont.)
A computer number has three parts:
- the sign (+ or -)
- the fraction part (called the mantissa)
- the exponent part
There are three levels of precision, which differ in the number of bits used for the mantissa and the exponent.
Precision  Length  Sign  Mantissa  Exponent  Range
Single     32      1     23        8         10^±38
Double     64      1     52        11        10^±308
Extended   80      1     64        15        10^±4931
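The form is $\text{sign} \times \text{mantissa} \times 10^{\text{exponent}}$, or $\sigma \times m \times 10^{e}$. Example: for a number such as $-2.5678 \times 10^{2}$, $\sigma = -1$, $m = 2.5678$, $e = 2$.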
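Floating Point Format for Binary Numbers
A binary floating-point number has the form $y = \sigma \times m \times 2^{e}$, with mantissa $m = (1.b_1 b_2 \ldots)_2$, so that $1 \le m < 2$. The leading 1 is not stored, as it is always given to be 1.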
Example: a hypothetical 9-bit word. The first bit is used for the sign of the number, the second bit for the sign of the exponent, the next four bits for the mantissa, and the last three bits for the exponent.
Bit layout: sign of the number (1 bit) | sign of the exponent (1 bit) | mantissa (4 bits) | exponent (3 bits)
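Machine Epsilon
Defined as the measure of accuracy, and found as the difference between 1 and the next number that can be represented.
Example: a ten-bit word with 1 bit for the sign of the number, 1 bit for the sign of the exponent, the next four bits for the exponent, and the next four bits for the mantissa. With the leading 1 not stored, the number 1 is $(1.0000)_2 \times 2^{0}$ and the next representable number is $(1.0001)_2 \times 2^{0} = 1.0625$, so $\epsilon_{mach} = 1.0625 - 1 = 2^{-4}$.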
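The same definition can be applied to the floating-point type actually in use. The sketch below assumes Python floats are IEEE-754 doubles with round-to-nearest arithmetic; the function name `machine_epsilon` is hypothetical.

```python
import sys


def machine_epsilon() -> float:
    """Gap between 1.0 and the next representable float, found by halving
    a candidate until adding half of it to 1.0 no longer changes 1.0."""
    eps = 1.0
    while 1.0 + eps / 2.0 > 1.0:
        eps /= 2.0
    return eps


print(machine_epsilon())
print(sys.float_info.epsilon)   # value reported by the Python runtime
```

For the ten-bit word above, the same reasoning gives $2^{-4}$ because only four mantissa bits are stored.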
Relative Error and Machine Epsilon
The absolute relative true error in representing a number will be less than the machine epsilon.
Example: 10-bit word (sign of the number, sign of the exponent, 4 bits for the exponent, 4 bits for the mantissa).
Subtracting two almost equal numbers
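A short illustration of why subtracting two almost equal numbers is dangerous. This example is not taken from the slides: the function $(1 - \cos x)/x^2$ and the names `naive` and `stable` are assumptions chosen to show the effect. For small $x$ the true value is close to 0.5, but $\cos x$ is so close to 1 that the subtraction loses all significant digits.

```python
import math


def naive(x: float) -> float:
    """(1 - cos x) / x^2 computed directly: subtracts two almost equal numbers."""
    return (1.0 - math.cos(x)) / x**2


def stable(x: float) -> float:
    """Same quantity rewritten with 1 - cos x = 2 sin^2(x/2): no cancellation."""
    return 2.0 * math.sin(x / 2.0) ** 2 / x**2


x = 1e-8
print(naive(x))    # dominated by round-off
print(stable(x))   # close to the limit value 0.5
```

Rewriting the expression so that the subtraction disappears is a common remedy; it changes the algorithm, not the mathematical problem.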
IEEE-754 Floating Point Standard Standardizes representation of floating point numbers on different computers in single and double precision. Standardizes representation of floating point operations on different computers.
IEEE-754 Format: Single and Double Precision
Single precision (32 bits): Sign (s), 1 bit | Biased Exponent (e'), 8 bits | Mantissa (m), 23 bits
Value $= (-1)^{s} \times (1.m)_2 \times 2^{e' - 127}$
Double precision (64 bits): Sign (s), 1 bit | Exponent, 11 bits | Fraction, 52 bits
Example #1: a 32-bit pattern with the fields Sign (s) = 1 | Biased Exponent (e') | Mantissa (m)
Example #2: Represent $-5.5834 \times 10^{10}$ as a single precision floating point number, i.e. find the Sign (s), Biased Exponent (e') and Mantissa (m) fields.
Exponent for 32 Bit IEEE-754
8 bits would represent $0 \le e' \le 255$. The bias is 127, so subtract 127 from the representation: $-127 \le e \le 128$.
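The field layout and the bias of 127 can be explored with Python's standard `struct` module. This is a sketch for illustration; the helper names `float32_fields` and `float32_value` are hypothetical, and `float32_value` handles only normalized numbers (biased exponent 1 to 254).

```python
import struct


def float32_fields(x: float):
    """Split x, rounded to single precision, into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF          # the 23 stored mantissa bits
    return sign, biased_exponent, fraction


def float32_value(sign: int, biased_exponent: int, fraction: int) -> float:
    """Value of a normalized single-precision number: (-1)^s * (1.m)_2 * 2^(e' - 127)."""
    if not 1 <= biased_exponent <= 254:
        raise ValueError("only normalized numbers are handled in this sketch")
    mantissa = 1.0 + fraction / 2**23   # restore the leading 1 that is not stored
    return (-1) ** sign * mantissa * 2.0 ** (biased_exponent - 127)


s, e, m = float32_fields(-5.5834e10)
print(s, format(e, "08b"), format(m, "023b"))
print(float32_value(s, e, m))
```

Printing the fields in binary makes it easy to compare against a hand calculation of the sign, biased exponent and mantissa.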
Exponent for Special Cases
Actual range of $e'$: $1 \le e' \le 254$; $e' = 0$ and $e' = 255$ are reserved for special numbers.
Actual range of $e$: $-126 \le e \le 127$.
Special Exponents and Numbers
s       e'         m          Represents
0       all zeros  all zeros  0
1       all zeros  all zeros  -0
0       all ones   all zeros  +∞
1       all ones   all zeros  -∞
0 or 1  all ones   non-zero   NaN
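IEEE-754 Format (single precision)
The largest number by magnitude: $(1.11\ldots1)_2 \times 2^{127} \approx 3.40 \times 10^{38}$
The smallest (normalized) number by magnitude: $(1.00\ldots0)_2 \times 2^{-126} \approx 1.18 \times 10^{-38}$
Machine epsilon: $\epsilon_{mach} = 2^{-23} \approx 1.19 \times 10^{-7}$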
Significant Digits
Significant digits are those digits that can be used with confidence.
Single precision: 7 significant digits, $1.175494\ldots \times 10^{-38}$ to $3.402823\ldots \times 10^{38}$
Double precision: 15 significant digits, $2.2250738\ldots \times 10^{-308}$ to $1.7976931\ldots \times 10^{308}$
Using Floating Point Numbers
Beware of meaningless precision! In 1424, Jamshid Masud al-Kashi published π = 3.141 592 653 589 793 25…, but noted that the error in computing the perimeter of a circle with a radius 600,000 times that of the earth would be less than the thickness of a horse's hair. At that time, the radius of the earth was not known exactly. The radius of the circle corresponding to the error quoted by Jamshid Masud (0.7 mm) is 20 times larger than the semi-major axis of the orbit of Pluto.
Donald E. Knuth: "Floating point arithmetic is by nature inexact, and it is not difficult to misuse it so that the computed answers consist almost entirely of 'noise'. One of the principal problems of numerical analysis is to determine how accurate the results of certain numerical methods will be."
Using Floating Point Numbers
Never test two floating point numbers for equality. Use a special method to compare them instead: define an acceptable error for the specific problem and check that the difference is within it.
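One possible comparison method is sketched below. The helper `nearly_equal` and its default tolerances are assumptions for illustration; the acceptable error has to be chosen for the specific problem. Python's standard library offers `math.isclose` with the same kind of relative and absolute tolerance.

```python
import math


def nearly_equal(a: float, b: float, rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """True if a and b differ by no more than a relative or an absolute tolerance."""
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


a = 0.1 + 0.2
b = 0.3
print(a == b)                 # exact equality fails because of round-off
print(nearly_equal(a, b))     # comparison with a tolerance succeeds
print(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12))
```

The absolute tolerance matters when the numbers being compared are close to zero, where a purely relative test becomes too strict.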
3. Errors
The solution of a problem obtained with numerical methods always differs from the exact solution because of the presence of errors. Various sources can cause errors in the numerical results.
Sources of errors:
- Measurements from physical or chemical devices and mechanisms (data that inherently contain errors); these are called initial errors.
- Error of the mathematical problem, or error of the mathematical description. It arises from turning the problem into a mathematical problem, which involves simplifications or even omissions.
- Errors from ill-conditioned problems: numerical problems that are very sensitive to small changes in the data.
Accuracy and Precision
Accuracy refers to how closely a computed or measured value agrees with the true value, while precision refers to how closely individual computed or measured values agree with each other (a measure of the ability to distinguish between nearly equal values).
Figure panels: a) inaccurate and imprecise b) accurate and imprecise c) inaccurate and precise d) accurate and precise
Four possible sources of errors ➡ Measurement error ➡ Modeling error ➡ Truncation error ➡ Round-off error
A. Measurement error ➡ Any instrument has a limit on its precision, and an experimental result is always obtained with a tolerance: e.g. 20.2 ± 0.1 cm B. Modeling error ➡ Difference between the real system and the simplified description used. ➡ Example of the bridge: representing the elements as homogeneous or with a simplified geometry.
C. Truncation error
➡ This error arises due to the discrete or iterative nature of the numerical methods used.
➡ For example, an iterative scheme can be developed to obtain the physical quantity G.
- If the scheme is well-designed then G(n) approaches the true value G0 = G(∞) when n→∞.
- However, we always have to stop at a finite value n = N. The truncation error is the difference between G(N) and G(∞).
➡ A truncation error also arises when approximating a continuous quantity by a discrete form, or a derivative by a finite difference, e.g. $f'(x) \approx \frac{f(x+h) - f(x)}{h}$ with a small but finite $h$.
D. Round-off error
This error is intrinsic to the use of a computer. A computer does not store real numbers exactly but uses finite-precision numbers (e.g. some decimals are discarded). The propagation of rounding errors from one floating point operation to the next is the most frequent source of numerical instabilities.
Sources of Numerical Errors
Round-off error: arises from the need to represent numbers with a finite number of digits.
Truncation error: created by the algorithm that is used and the approximation that is chosen (when an exact computational procedure is replaced by an approximate one).
Round off Error Caused by representing a number approximately
Rounding errors are random
Problems created by round-off error
28 Americans were killed on February 25, 1991 by an Iraqi Scud missile in Dhahran, Saudi Arabia. The Patriot defense system failed to track and intercept the Scud. Why?
Problem with the Patriot missile
The clock cycle of 1/10 second was represented in a 24-bit fixed point register, which created an error of $9.5 \times 10^{-8}$ seconds per cycle. The battery was on for 100 consecutive hours, thus causing an inaccuracy of
$9.5 \times 10^{-8}\ \frac{\text{s}}{0.1\ \text{s}} \times 100\ \text{hr} \times 3600\ \frac{\text{s}}{\text{hr}} = 0.342\ \text{s}$.
The shift calculated in the ranging system of the missile was 687 meters. The target was considered to be out of range at a distance greater than 137 meters.
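The size of the per-cycle error can be reproduced with exact rational arithmetic. The sketch below rests on an assumption that is not stated on the slide: that 1/10 was chopped (not rounded) to 23 bits after the binary point. The names `FRACTION_BITS`, `error_per_tick` and `drift_seconds` are hypothetical.

```python
from fractions import Fraction

FRACTION_BITS = 23            # assumption: bits kept after the binary point

tenth = Fraction(1, 10)
# Chop 1/10 to FRACTION_BITS binary places (1/10 has an infinite binary expansion).
stored = Fraction(int(tenth * 2**FRACTION_BITS), 2**FRACTION_BITS)
error_per_tick = tenth - stored

ticks = 100 * 3600 * 10       # number of 0.1 s clock cycles in 100 hours
drift_seconds = float(error_per_tick * ticks)

print(float(error_per_tick))  # error per 0.1 s cycle, in seconds
print(drift_seconds)          # accumulated clock error after 100 hours
```

Under this assumption the per-cycle error is close to the quoted $9.5 \times 10^{-8}$ s and the accumulated drift is about a third of a second.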
Example: quadrature of a circle
A large collection of software bugs Careless numerical computing does occasionally lead to disasters http://wwwzenger.informatik.tu-muenchen.de/persons/huckle/bugse.html
Exercise (in MATLAB)
Why measure errors? 1) To determine the accuracy of numerical results. 2) To develop stopping criteria for iterative algorithms.
True Error (absolute error)
Defined as the difference between the true value in a calculation and the approximate value found using a numerical method.
True Error ($E_t$) = True Value - Approximate Value
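Example: True Error
The derivative $f'(x)$ of a function $f(x)$ can be approximated by the equation $f'(x) \approx \frac{f(x+h) - f(x)}{h}$. For a given function $f(x)$, point $x$ and step size $h$: a) find the approximate value of $f'(x)$; b) find the true value of $f'(x)$; c) find the true error for part (a).
Example (cont.) Solution: a) Substituting the given $x$ and $h$ into $f'(x) \approx \frac{f(x+h) - f(x)}{h}$ gives the approximate value.
Example (cont.) Solution: b) The exact value of $f'(x)$ can be found by using our knowledge of differential calculus, which gives the true value of $f'(x)$. c) The true error is calculated as $E_t$ = True Value - Approximate Value.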
Relative True Error
Defined as the ratio between the true error and the true value.
Relative True Error ($\epsilon_t$) = True Error / True Value
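Example: Relative True Error
Following from the previous example for true error, find the relative true error for $f'(x)$ at the same point and with the same step size. From the previous example, the true error $E_t$ and the true value are known. The relative true error is defined as $\epsilon_t = \frac{E_t}{\text{True Value}}$; as a percentage, $\epsilon_t \times 100\%$.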
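The true error and relative true error can be computed as follows. The function and point used here are assumptions: the extracted slides do not state them, and $f(x) = 7e^{0.5x}$ at $x = 2$ is chosen because it reproduces the approximations listed in the table of values later in this section to about three significant digits.

```python
import math


def f(x: float) -> float:
    """Assumed example function (not stated in the extracted slides)."""
    return 7.0 * math.exp(0.5 * x)


def forward_difference(func, x: float, h: float) -> float:
    """Approximate f'(x) by (f(x + h) - f(x)) / h."""
    return (func(x + h) - func(x)) / h


x, h = 2.0, 0.3                                   # assumed point and step size
approximate_value = forward_difference(f, x, h)
true_value = 3.5 * math.exp(0.5 * x)              # exact derivative of the assumed f
true_error = true_value - approximate_value
relative_true_error = true_error / true_value

print(approximate_value, true_value)
print(true_error, relative_true_error * 100.0)    # last value in percent
```

The true error is negative here because the forward difference overestimates the derivative of this convex function.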
Approximate Error
What can be done if true values are not known or are very difficult to obtain? The approximate error is defined as the difference between the present approximation and the previous approximation.
Approximate Error ($E_a$) = Present Approximation - Previous Approximation
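Example: Approximate Error
For the derivative $f'(x)$ of the previous example, find the following: a) $f'(x)$ using $h = 0.3$; b) $f'(x)$ using $h = 0.15$; c) the approximate error for the value of $f'(x)$ in part b).
Solution: a) For $h = 0.3$, $f'(x) \approx 10.263$.
Solution (cont.): b) For $h = 0.15$, $f'(x) \approx 9.8800$.
Solution (cont.): c) The approximate error is $E_a$ = Present Approximation - Previous Approximation = 9.8800 - 10.263 = -0.383.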
Relative Approximate Error (relative error)
Defined as the ratio between the approximate error and the present approximation.
Relative Approximate Error ($\epsilon_a$) = Approximate Error / Present Approximation
Example: Relative Approximate Error
Find the relative approximate error for the derivative of the previous example, using the values from $h = 0.3$ and $h = 0.15$.
Solution: From Example 3, the approximate value is 10.263 using $h = 0.3$ and 9.8800 using $h = 0.15$.
$E_a$ = Present Approximation - Previous Approximation = 9.8800 - 10.263 = -0.383
$\epsilon_a$ = Approximate Error / Present Approximation = -0.383 / 9.8800 = -0.03877; as a percentage, -3.877%.
Absolute relative approximate errors may also need to be calculated: $|\epsilon_a| = 0.03877$, or 3.877%.
How is the Absolute Relative Approximate Error used as a stopping criterion?
If $|\epsilon_a| \le \epsilon_s$, where $\epsilon_s$ is a pre-specified tolerance, then no further iterations are necessary and the process is stopped. If at least $m$ significant digits are required to be correct in the final answer, then $|\epsilon_a| \le 0.5 \times 10^{2-m}\,\%$.
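Table of Values
For the derivative of the previous examples with varying step size $h$ ($m$ is the number of significant digits at least correct):
h      f'(x)    |εa|     m
0.3    10.263   N/A      0
0.15   9.8800   3.877%   1
0.10   9.7558   1.273%   1
0.01   9.5378   2.285%   1
0.001  9.5164   0.2249%  2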
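The table can be regenerated with a short script. As an assumption, the function is taken to be $f(x) = 7e^{0.5x}$ at $x = 2$, which is not stated in the extracted slides but matches the tabulated values; the first two rows differ slightly in the last digit, presumably because the slides rounded intermediate results. The number of significant digits uses the criterion $|\epsilon_a| \le 0.5 \times 10^{2-m}\,\%$; the helper names are hypothetical.

```python
import math


def f(x: float) -> float:
    """Assumed example function (not stated in the extracted slides)."""
    return 7.0 * math.exp(0.5 * x)


def forward_difference(func, x: float, h: float) -> float:
    return (func(x + h) - func(x)) / h


def significant_digits(abs_rel_error_percent: float, cap: int = 15) -> int:
    """Largest m >= 0 with |eps_a| <= 0.5 * 10**(2 - m) percent (capped)."""
    m = 0
    while m < cap and abs_rel_error_percent <= 0.5 * 10.0 ** (2 - (m + 1)):
        m += 1
    return m


rows = []            # (h, approximation, |eps_a| in percent or None, m)
previous = None
for h in (0.3, 0.15, 0.10, 0.01, 0.001):
    present = forward_difference(f, 2.0, h)
    if previous is None:
        rows.append((h, present, None, 0))
    else:
        eps_a = abs((present - previous) / present) * 100.0
        rows.append((h, present, eps_a, significant_digits(eps_a)))
    previous = present

for row in rows:
    print(row)
```

Note that the relative approximate error does not decrease monotonically here, because the step size is not reduced by a constant factor between rows.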
Theorem
If a positive number $x$ has $n$ correct digits in the narrow sense, the relative error $E_R$ of this number does not exceed $\left(\frac{1}{10}\right)^{n-1}$ divided by the first significant digit of the given number, or $E_R \le \frac{1}{a_m} \left(\frac{1}{10}\right)^{n-1}$, where $a_m$ is the first significant digit of the number $x$.
Example
How many digits are to be taken in computing $\sqrt{20}$ so that the error does not exceed 0.1%?
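A possible worked solution using the theorem above, assuming the quantity in question is $\sqrt{20} \approx 4.47\ldots$ (the root sign appears to have been lost in the slide text), so that the first significant digit is $a_m = 4$:

```latex
\[
E_R \le \frac{1}{a_m}\left(\frac{1}{10}\right)^{n-1}
    = \frac{1}{4}\cdot 10^{-(n-1)} \le 0.001
\quad\Longrightarrow\quad
10^{\,n-1} \ge 250
\quad\Longrightarrow\quad
n \ge 4 .
\]
```

Under this assumption four digits are enough, for example $\sqrt{20} \approx 4.472$.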
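Error, Absolute Error and Significant Digits
If the value $\hat{x}$ is an approximation of the value $x$, then:
Error: $E_x = x - \hat{x}$
Absolute error: $|E_x| = |x - \hat{x}|$
Relative error: $R_x = \frac{|x - \hat{x}|}{|x|}$, $x \ne 0$
The number $\hat{x}$ is said to approximate the true value $x$ to $d$ significant digits if $d$ is the largest positive integer for which the following inequality holds: $\frac{|x - \hat{x}|}{|x|} < \frac{1}{2} 10^{-d}$
Application for $\hat{x} = 3.14$ and $x = 3.141592$: the relative error is about $5.07 \times 10^{-4}$, which is less than $\frac{1}{2} 10^{-2}$ but not less than $\frac{1}{2} 10^{-3}$, so $d = 2$.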
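The definition of significant digits translates directly into a small search for the largest $d$. This is a minimal sketch; the function name `significant_digits` and the cap `max_digits` (which prevents an endless loop when the approximation is exact) are assumptions.

```python
def significant_digits(x_true: float, x_approx: float, max_digits: int = 17) -> int:
    """Largest d with |x - x_hat| / |x| < 0.5 * 10**(-d); returns 0 if none exists."""
    if x_true == 0:
        raise ValueError("the relative error is undefined for x = 0")
    relative_error = abs(x_true - x_approx) / abs(x_true)
    d = 0
    while d < max_digits and relative_error < 0.5 * 10.0 ** -(d + 1):
        d += 1
    return d


print(significant_digits(3.141592, 3.14))
print(significant_digits(3.141592, 3.1416))
```

The first call corresponds to the application on the slide; the second shows how a better approximation of the same value earns more significant digits.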
Problem conditioning and algorithm stability The problem is ill-conditioned if a small perturbation in the data may produce a large difference in the result. The problem is well-conditioned otherwise. The algorithm is stable if its output is the exact result of a slightly perturbed input.
An unstable algorithm
A stable algorithm
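A classic pair of algorithms illustrating the difference is sketched below. The example is an assumption chosen for illustration and may differ from the one on the original slides: the integrals $I_n = \int_0^1 x^n e^{x-1}\,dx$ satisfy $I_n = 1 - n I_{n-1}$ with $I_0 = 1 - 1/e$. Running the recurrence forward multiplies the initial round-off error by $n!$, while running it backward divides the error at every step.

```python
import math


def forward(n: int) -> float:
    """Unstable: I_k = 1 - k * I_(k-1), starting from I_0 = 1 - 1/e."""
    value = 1.0 - 1.0 / math.e
    for k in range(1, n + 1):
        value = 1.0 - k * value          # the error is multiplied by k
    return value


def backward(n: int, extra: int = 15) -> float:
    """Stable: I_(k-1) = (1 - I_k) / k, starting from the crude guess I_(n+extra) = 0."""
    value = 0.0
    for k in range(n + extra, n, -1):
        value = (1.0 - value) / k        # the error is divided by k
    return value


n = 20
lower, upper = 1.0 / (math.e * (n + 1)), 1.0 / (n + 1)   # bounds on the true I_n
print(lower, upper)
print(forward(n))     # far outside the bounds
print(backward(n))    # inside the bounds
```

The bounds follow from $e^{-1} \le e^{x-1} \le 1$ on $[0, 1]$. Both functions solve the same well-conditioned problem; only the algorithm differs.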
Effect of Carrying Significant Digits in Calculations
Find the contraction in the diameter
$T_a = 80\,^{\circ}\text{F}$; $T_c = -108\,^{\circ}\text{F}$; $D = 12.363$ in
$\alpha = a_0 + a_1 T + a_2 T^2$
Thermal Expansion Coefficient vs Temperature
T (°F)  α (μin/in/°F)
-340    2.45
-300    3.07
-220    4.08
-160    4.72
-80     5.43
0       6.00
40      6.24
80      6.47
Regressing Data in Excel (general format)
$\alpha = -1\text{E-}05\,T^2 + 0.0062\,T + 6.0234$
Observed and Predicted Values, $\alpha = -1\text{E-}05\,T^2 + 0.0062\,T + 6.0234$
T (°F)  α given (μin/in/°F)  α predicted (μin/in/°F)
-340    2.45                 2.76
-300    3.07                 3.26
-220    4.08                 4.18
-160    4.72                 4.78
-80     5.43                 5.46
0       6.00                 6.02
40      6.24                 6.26
80      6.47                 6.46
Regressing Data in Excel (scientific format)
$\alpha = -1.2360\text{E-}05\,T^2 + 6.2714\text{E-}03\,T + 6.0234$
Observed and Predicted Values, $\alpha = -1.2360\text{E-}05\,T^2 + 6.2714\text{E-}03\,T + 6.0234$
T (°F)  α given (μin/in/°F)  α predicted (μin/in/°F)
-340    2.45                 2.46
-300    3.07                 3.03
-220    4.08                 4.05
-160    4.72                 4.70
-80     5.43                 5.44
0       6.00                 6.02
40      6.24                 6.25
80      6.47                 6.45
Observed and Predicted Values
Scientific format: $\alpha = -1.2360\text{E-}05\,T^2 + 6.2714\text{E-}03\,T + 6.0234$
General format: $\alpha = -1\text{E-}05\,T^2 + 0.0062\,T + 6.0234$
T (°F)  α given  α predicted (scientific)  α predicted (general)
-340    2.45     2.46                      2.76
-300    3.07     3.03                      3.26
-220    4.08     4.05                      4.18
-160    4.72     4.70                      4.78
-80     5.43     5.44                      5.46
0       6.00     6.02                      6.02
40      6.24     6.25                      6.26
80      6.47     6.45                      6.46
(α in μin/in/°F)
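The effect of the dropped digits can be checked by evaluating both sets of coefficients against the given data. This is a small sketch using the values from the tables above; the names `alpha`, `GENERAL`, `SCIENTIFIC` and `max_abs_error` are hypothetical.

```python
TEMPERATURES = [-340, -300, -220, -160, -80, 0, 40, 80]          # deg F
GIVEN = [2.45, 3.07, 4.08, 4.72, 5.43, 6.00, 6.24, 6.47]         # microin/in/deg F

GENERAL = (-1e-05, 0.0062, 6.0234)             # coefficients as displayed in general format
SCIENTIFIC = (-1.2360e-05, 6.2714e-03, 6.0234) # same fit displayed with more digits


def alpha(T: float, a2: float, a1: float, a0: float) -> float:
    """Second-order polynomial alpha = a2*T^2 + a1*T + a0."""
    return a2 * T * T + a1 * T + a0


def max_abs_error(coefficients) -> float:
    """Largest deviation between predicted and given values over the data set."""
    return max(abs(alpha(T, *coefficients) - g) for T, g in zip(TEMPERATURES, GIVEN))


for T, g in zip(TEMPERATURES, GIVEN):
    print(T, g, round(alpha(T, *SCIENTIFIC), 2), round(alpha(T, *GENERAL), 2))
print(max_abs_error(SCIENTIFIC), max_abs_error(GENERAL))
```

The rounding of the quadratic coefficient to one significant digit has the largest effect at the coldest temperatures, where $T^2$ is largest.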
Truncation error Error caused by truncating or approximating a mathematical procedure.
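Example of Truncation Error
Taking only a few terms of a Maclaurin series to approximate $e^x$:
$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$
If only 3 terms are used, Truncation Error $= e^x - \left(1 + x + \frac{x^2}{2!}\right)$.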
Calculate the value of $e^{1.2}$ with an absolute relative approximate error of less than 1%.
n  e^1.2   Ea        |εa| %
1  1       __        ___
2  2.2     1.2       54.545
3  2.92    0.72      24.658
4  3.208   0.288     8.9776
5  3.2944  0.0864    2.6226
6  3.3151  0.020736  0.62550
6 terms are required. How many are required to get at least 1 significant digit correct in your answer?
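The table can be generated by summing the Maclaurin series for $e^x$ until the absolute relative approximate error drops below the tolerance. This is a minimal sketch; the function name `exp_maclaurin` and the `max_terms` safeguard are assumptions.

```python
def exp_maclaurin(x: float, tolerance_percent: float, max_terms: int = 100):
    """Sum 1 + x + x^2/2! + ... until |eps_a| < tolerance_percent.

    Returns (approximation, number of terms used, last |eps_a| in percent)."""
    total = 1.0      # first term of the series
    term = 1.0
    for n in range(1, max_terms):
        term *= x / n                    # x^n / n! built from the previous term
        previous = total
        total += term
        eps_a = abs((total - previous) / total) * 100.0
        if eps_a < tolerance_percent:
            return total, n + 1, eps_a
    raise RuntimeError("tolerance not reached within max_terms")


value, terms, eps_a = exp_maclaurin(1.2, 1.0)
print(value, terms, eps_a)
```

Each new term is obtained from the previous one, which avoids computing large powers and factorials separately.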
Taylor Series (1)
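Taylor Series (2)
To gain insight, consider a mathematical formulation that is widely used in numerical methods: the TAYLOR SERIES. A Taylor series provides a means to predict a function value at one point in terms of the function value and its derivatives at another point.
Some examples of Taylor series which you must have seen:
$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$
$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$
$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$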
General Taylor Series
The general form of the Taylor series is given by
$f(x+h) = f(x) + f'(x)\,h + \frac{f''(x)}{2!}h^2 + \frac{f'''(x)}{3!}h^3 + \cdots$
provided that all derivatives of f(x) exist and are continuous in the interval [x, x+h]. As Archimedes would have said, "Give me the value of the function at a single point, and the value of all (first, second, and so on) its derivatives at that single point, and I can give you the value of the function at any other point."
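Example: Taylor Series
Find the value of $f(x+h)$ given the values of $f(x)$ and of its first few derivatives at $x$, when all other higher order derivatives of $f$ at $x$ are zero.
Solution: $f(x+h) = f(x) + f'(x)\,h + \frac{f''(x)}{2!}h^2 + \frac{f'''(x)}{3!}h^3 + \cdots$
Example (cont.) Since the higher order derivatives are zero, the series has finitely many non-zero terms and their sum is exact. Note that to find $f(x+h)$ exactly, we only need the value of the function and all its derivatives at some other point, in this case $x$.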
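Derivation of the Maclaurin Series for $e^x$
Derive the Maclaurin series $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$
The Maclaurin series is simply the Taylor series about the point x = 0:
$f(0+h) = f(0) + f'(0)\,h + \frac{f''(0)}{2!}h^2 + \frac{f'''(0)}{3!}h^3 + \cdots$
Derivation (cont.) Since $f(x) = e^x$, $f'(x) = e^x$, $f''(x) = e^x$, ..., $f^{(n)}(x) = e^x$ and $f^{(n)}(0) = e^0 = 1$, the Maclaurin series is then
$f(h) = 1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + \cdots$
So, $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$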
$e^x$ when x = 0.5 (εt: true percentage relative error; εa: approximate percentage relative error)
Terms  Result       εt                              εa
1      1            (1.6487-1)/1.6487 = 39.3%
2      1.5          (1.6487-1.5)/1.6487 = 9.02%     (1.5-1)/1.5 = 33.3%
3      1.625        1.44%                           (1.625-1.5)/1.625 = 7.69%
4      1.645833333  0.175%                          1.27%
5      1.648437500  0.0172%                         0.158%
6      1.648697917  0.00142%                        0.0158%
How many terms should we use?
Terms  Result       εt        εa
1      1            39.3%
2      1.5          9.02%     33.3%
3      1.625        1.44%     7.69%
4      1.645833333  0.175%    1.27%
5      1.648437500  0.0172%   0.158%
6      1.648697917  0.00142%  0.0158%
Computation stops when |εa| < εs, where εs is a pre-determined acceptable percentage relative error.
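Error in Taylor Series
The Taylor polynomial of order n of a function f(x) with (n+1) continuous derivatives in the domain [x, x+h] is given by
$f(x+h) = f(x) + f'(x)\,h + \frac{f''(x)}{2!}h^2 + \cdots + \frac{f^{(n)}(x)}{n!}h^n + R_n$
where the remainder is given by
$R_n = \frac{h^{n+1}}{(n+1)!} f^{(n+1)}(c)$
where $x < c < x+h$, that is, c is some point in the domain [x, x+h].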
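Example: Error in Taylor Series
The Taylor series for $e^x$ at the point $x = 0$ is given by
$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$
It can be seen that as the number of terms used increases, the error bound decreases and hence a better estimate of the function can be found. How many terms would it require to get an approximation of $e^1$ with a magnitude of true error of less than $10^{-6}$?
Example (cont.) Solution: Using $(n+1)$ terms of the Taylor series gives an error bound of
$R_n = \frac{h^{n+1}}{(n+1)!} f^{(n+1)}(c)$ with $x = 0$, $h = 1$, $f(x) = e^x$, so that $R_n = \frac{e^c}{(n+1)!}$.
Since $0 < c < 1$, $\frac{1}{(n+1)!} < |R_n| < \frac{e}{(n+1)!}$.
Example (cont.) Solution (cont.): To get an approximation of $e^1$ with a magnitude of true error of less than $10^{-6}$, require $\frac{e}{(n+1)!} < 10^{-6}$, i.e. $(n+1)! > 10^{6} e$, which is guaranteed if $(n+1)! > 3 \times 10^{6}$ (as we do not know the value of $e$ but it is less than 3). This gives $n \ge 9$, so the terms up to order 9 (10 terms including the constant term) are needed to get a true error less than $10^{-6}$.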
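The bound can be checked numerically. The sketch below finds the smallest order $n$ with $3/(n+1)! < 10^{-6}$ and compares the actual error of the partial sums with the tolerance; the function names `order_needed` and `partial_sum` are hypothetical.

```python
import math


def order_needed(tolerance: float, bound_on_e: float = 3.0) -> int:
    """Smallest n with bound_on_e / (n + 1)! < tolerance (remainder bound for e^1)."""
    n = 0
    while bound_on_e / math.factorial(n + 1) >= tolerance:
        n += 1
    return n


def partial_sum(n: int) -> float:
    """Taylor polynomial of order n for e^1, i.e. the first n + 1 terms."""
    return sum(1.0 / math.factorial(k) for k in range(n + 1))


n = order_needed(1e-6)
print(n)
print(abs(math.e - partial_sum(n)))       # below the tolerance
print(abs(math.e - partial_sum(n - 1)))   # one term fewer is not enough
```

The remainder bound is slightly pessimistic in general, but in this case one order less already violates the tolerance.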
Exercise 1
Use a Taylor series to find the value of the given function. Compare this value with the one obtained from a calculator (0.909296723).