Applied Numerical Methods Using MATLAB. Won Y. Yang


       (2) The smallest normalized range (with the value of hidden bit bh = 1), where E = −1022 so that 2^-1022 ≤ |f| ≤ (2 − 2^-52) × 2^-1022 (1.2.6.2a) (1.2.6.2b)

       (3) Basic normalized range (with the value of hidden bit bh = 1), where −1021 ≤ E ≤ 1022 so that 2^E ≤ |f| ≤ (2 − 2^-52) × 2^E (1.2.6.3a) (1.2.6.3b)

       (4) The largest normalized range (with the value of hidden bit bh = 1), where E = 1023 so that 2^1023 ≤ |f| ≤ (2 − 2^-52) × 2^1023 (1.2.6.4a) (1.2.6.4b)

       (5) ± ∞ (Inf) with Exp = 2^11 − 1 = 2047, E = Exp − 1023 = 1024 (meaningless); these boundary bit patterns can be probed directly, as in the sketch right after this list.
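      The boundary values of these ranges can be checked from their 64-bit patterns with MATLAB's hex2num function; the following lines are an illustrative sketch added here, not a script from the book:

       >hex2num('0000000000000001') % 4.9407e-324 = 2^-1074: the smallest positive (denormalized) number
       >hex2num('0010000000000000') % 2.2251e-308 = 2^-1022 = realmin: the smallest normalized number
       >hex2num('7fefffffffffffff') % 1.7977e+308 = (2-2^-52)*2^1023 = realmax: the largest normalized number
       >hex2num('7ff0000000000000') % Inf: Exp = 2047 with an all-zero mantissa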

      From what has been mentioned earlier, we know that the minimum and maximum positive numbers are, respectively,

      (1.2.7a)   $f_{\min} = \texttt{eps} \times \texttt{realmin} = 2^{-52} \times 2^{-1022} = 2^{-1074}$

      (1.2.7b)   $f_{\max} = \texttt{realmax} = (2 - 2^{-52}) \times 2^{1023}$

      where the three MATLAB constants, i.e. eps, realmin, and realmax, represent 2^-52, 2^-1022, and (2 − 2^-52) × 2^1023, respectively. This can be checked by running the script “nm109.m” in Section 1.I.
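      These values can also be confirmed directly at the MATLAB prompt, independently of “nm109.m” (a minimal check added here for illustration):

       >eps==2^-52, realmin==2^-1022, realmax==(2-2^-52)*2^1023
        ans= 1(true)
        ans= 1(true)
        ans= 1(true)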

      Now, in order to gain some idea of the arithmetic computational mechanism, let us see how the addition of two numbers, 3 and 14, represented in the IEEE 64-bit floating-point number system, is performed.

(Figure: the addition of 3 and 14 in the IEEE 64-bit floating-point number system.)
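      As a supplementary check on what the figure depicts, the 64-bit patterns of the two operands and of their sum can be listed with MATLAB's num2hex; this snippet is only an added illustration, not the book's figure or script:

       >num2hex(3)    % 4008000000000000: (1+0.5)*2^1,    i.e. Exp=1024, E=1
       >num2hex(14)   % 402c000000000000: (1+0.75)*2^3,   i.e. Exp=1026, E=3
       >num2hex(3+14) % 4031000000000000: (1+0.0625)*2^4, i.e. Exp=1027, E=4 -> 17

      Before the mantissas can be added, the exponents must be aligned to the larger one. If the two operands differ in magnitude by more than 52 bits, the smaller mantissa is shifted entirely out of the 52-bit field, so the addition leaves the larger operand unchanged, as the following run shows.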

       >x=2^30; x+2^-22==x, x+2^-23==x
        ans= 0(false)
        ans= 1(true)

      1 (cf) Each range has a different minimum unit (LSB value) described by Eq. (1.2.5). It implies that the numbers are uniformly distributed within each range. The closer the range is to 0, the denser the numbers in the range are. Such a number representation makes the absolute quantization error large/small for large/small numbers, decreasing the possibility of large relative quantization error.
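      The gradation of the LSB value from range to range can be observed with MATLAB's eps(x), which returns the spacing between x and the next larger number of the same floating-point format (an illustrative check added here, not from the book):

       >eps(1)     % 2^-52 ~ 2.2204e-16
       >eps(2^30)  % 2^-22 ~ 2.3842e-07: coarser spacing far from 0
       >eps(2^-30) % 2^-82 ~ 2.0679e-25: finer spacing close to 0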

      There are various kinds of errors that we encounter when using a computer for computation.

       Truncation error: Caused by adding only a finite number of terms when, in theory, infinitely many terms should be added to get the exact answer.

       Round‐off error: Caused by representing/storing numeric data in finite bits.

       Overflow/underflow: Caused by numbers too large or too small to be represented/stored properly in a finite number of bits; more specifically, by numbers whose absolute values are larger/smaller than the maximum (fmax)/minimum (fmin) number that can be represented in MATLAB (a short sketch after this list reproduces both cases).

       Negligible addition: Caused by adding two numbers whose magnitudes differ by more than 52 bits, as seen in the previous section.

       Loss of significance: Caused by a ‘bad subtraction’, that is, a subtraction of one number from another that is almost equal to it.

       Error magnification: Caused, and then magnified/propagated, by multiplying a number containing a small error by a large number or dividing it by a small number.

       Errors depending on the numerical algorithms, step size, and so on.
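      Some of these error types can be reproduced in a short MATLAB session; the lines below are an added sketch for illustration only:

       >realmax*2      % overflow: the result is Inf
       >2^-1080        % underflow: smaller than 2^-1074, so it is stored as 0
       >(2^53+1)-2^53  % negligible addition: the 1 is lost, so the result is 0
       >1-3*(4/3-1)    % round-off error: 2.2204e-16 instead of the exact 0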

      For instance, consider the following two formulas:

      (1.2.8)   $f_1(x) = \sqrt{x}\left(\sqrt{x+1}-\sqrt{x}\right), \qquad f_2(x) = \frac{\sqrt{x}}{\sqrt{x+1}+\sqrt{x}}$
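      Their equivalence follows from rationalizing the difference of square roots; the one-line derivation below is added here for clarity:

      $f_1(x) = \sqrt{x}\left(\sqrt{x+1}-\sqrt{x}\right) = \frac{\sqrt{x}\left(\sqrt{x+1}-\sqrt{x}\right)\left(\sqrt{x+1}+\sqrt{x}\right)}{\sqrt{x+1}+\sqrt{x}} = \frac{\sqrt{x}\left((x+1)-x\right)}{\sqrt{x+1}+\sqrt{x}} = \frac{\sqrt{x}}{\sqrt{x+1}+\sqrt{x}} = f_2(x)$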

      These are theoretically equivalent, whence we expect them to give exactly the same value. However, running the following MATLAB script “nm122.m” to compute the values of the two formulas, we see a surprising result: as x increases, the value of f1(x) wanders erratically, while f2(x) approaches 1/2 at a steady pace. We might feel betrayed by the computer and begin to doubt its reliability. Why does such a disconcerting thing happen with f1(x)? It is because the number of significant bits abruptly decreases when the subtraction √(x+1) − √x is performed for large values of x, which is called ‘loss of significance’. In order to take a close look at this phenomenon, let x = 10^15. Then we have

      $\sqrt{x+1} = \sqrt{10^{15}+1} \approx 31622776.60168381, \qquad \sqrt{x} = \sqrt{10^{15}} \approx 31622776.60168379$

      These two numbers have 52 significant bits, or equivalently 16 significant digits (2^52 ≈ 10^(52×3/10) ≈ 10^15.6), so that their significant digits range from 10^8 to 10^-8. Accordingly, the least significant digit (LSD) of their sum and difference is also the eighth digit after the decimal point (10^-8).

      $\sqrt{x+1} - \sqrt{x} \approx 0.0000000158113883, \qquad \sqrt{x+1} + \sqrt{x} \approx 63245553.20336760$

      %nm122.m
      f1=@(x)sqrt(x)*(sqrt(x+1)-sqrt(x));  % numerically unstable form
      f2=@(x)sqrt(x)./(sqrt(x+1)+sqrt(x)); % rationalized, stable form
      x=1; format long e
      for k=1:15
         fprintf('At x=%15.0f, f1(x)=%20.18f, f2(x)=%20.18f\n', x,f1(x),f2(x));
         x=10*x;
      end
      sx1=sqrt(x+1); sx=sqrt(x); d=sx1-sx; s=sx1+sx;
      fprintf('sqrt(x+1)=%25.13f, sqrt(x)=%25.13f\n',sx1,sx);
      fprintf(' diff=%25.23f, sum=%25.23f\n',d,s);

      >nm122
      At x=              1, f1(x)=0.414213562373095150, f2(x)=0.414213562373095090
      At x=             10, f1(x)=0.488088481701514750, f2(x)=0.488088481701515480
      At x=            100, f1(x)=0.498756211208899460, f2(x)=0.498756211208902730
      At x=           1000, f1(x)=0.499875062461021870, f2(x)=0.499875062460964860
      At x=          10000, f1(x)=0.499987500624854420, f2(x)=0.499987500624960890
      At x=         100000, f1(x)=0.499998750005928860, f2(x)=0.499998750006249940
      At x=        1000000, f1(x)=0.499999875046341910, f2(x)=0.499999875000062490
      At x=       10000000, f1(x)=0.499999987401150920, f2(x)=0.499999987500000580
      At x=      100000000, f1(x)=0.500000005558831620, f2(x)=0.499999998749999950
      At x=     1000000000, f1(x)=0.500000077997506340, f2(x)=0.499999999874999990
