Beruflich Dokumente
Kultur Dokumente
Number Formats
I
University of Toronto
I
1 / 29
fixed-point format
floating-point format
Number Formats
Number Formats
Fixed-Point Format
I
I
2 / 29
simplest scheme
number is represented as an integer or fraction using a fixed
number of bits
An n-bit fixed-point signed integer 2n1 x 2n1 1 is
represented as:
represented with [1 0 0 0 0]
2n1
2n1 1
represented with [0 1 1 1 1]
3 / 29
4 / 29
Number Formats
Number Formats
Example: n = 3
n1
2
231
22
4
x
x
x
x
n1
2
1
31
2
1
2
2 1
3 = x {4, 3, 2, 1, 0, 1, 2, 3}
x = s 22 + b1 21 + b0
x = 1 22 + 1 21 + 0 20
5 / 29
Number Formats
Number Formats
6 / 29
= 1 20 + 0 21 + + 0 2(n2) + 0 2(n1)
=
I
x = s 20 + b1 21 + b2 22 +
+ b(n2) 2(n2) + b(n1) 2(n1)
represented with [1 0 0 0]
= 0 20 + 1 21 + + 1 2(n2) + 1 2(n1)
=
1 2(n1)
represented with [0 1 1 1]
resolution
7 / 29
8 / 29
Number Formats
Number Formats
Example: n = 3
How do you represent the number x = 14 ?
1 x (1 2(n1) )
1 x (1 2(31) )
1 x (1 22 ) = 1 x
3
4
x = s 20 + b1 21 + b2 22
1
= 0 20 + 0 21 + 1 22
4
[ 0 0 1 ]
3 1 1
1 1 3
x {1, , , , 0, , , }
4 2 4
4 2 4
x = s 20 + b1 21 + b2 22
1
4
1
2n1
9 / 29
Number Formats
10 / 29
Number Formats
Fixed-Point Format
Fixed-Point Format
Q: Why would one use fractional representation instead of integer
representation of numbers?
...
(a) Fixed-point format to represent signed integers
implied
binary point
11 / 29
...
implied
binary point
12 / 29
Number Formats
Number Formats
increase its size (i.e., the number of bits n); doubling the size
substantially increases the range of numbers represented
I
I
for n = 3, 4 x 3
for n = 6, 261 x 261 1 = 32 x 31
A: floating-point!
13 / 29
Number Formats
Number Formats
Floating-Point Format
Floating-Point Format
I
14 / 29
xy = Mx My 2Ex +Ey
x = Mx 2Ex
where Mx is called the mantissa and Ex is called the exponent.
15 / 29
16 / 29
Number Formats
Number Formats
Floating-Point Format
I
Floating-Point Format
-bits
implied
binary point
17 / 29
Significand
Number Formats
18 / 29
Number Formats
Floating-Point Format
Floating-Point Format
Example:
Find the decimal equivalent of the floating-point binary number with
bias = 23 1 = 7:
1
|{z}
sign
1011000011100
0110
00011100
|{z}
| {z }
biased exponent
2
significand
19 / 29
20 / 29
Dynamic Range
Dynamic Range
Example: n = 24 (fixed-point format)
2n1 x 2n1 1
2241 x 2241 1
8, 388, 608 x 8, 388, 607
maxx {|x|}
minx6=0 {|x|}
where
x = s 2n1 + bn2 2n2 + bn3 2n3 + + b1 21 + b0 20 ,
21 / 29
8, 388, 608
= 8, 388, 608
1
= 20 log10 (8, 388, 608) = 138 dB
dynamic range =
Dynamic Range
Dynamic Range
bias = 128.
with
8 1bias
xmax = 1 22
dynamic range =
22 / 29
maxx {|x|}
minx6=0 {|x|}
22 1bias
2bias
xmin = 1 20bias 1
!
8 1
= 20 log10 (22
24 / 29
Resolution
I
Precision
Resolution =
2k1
25 / 29
I
I
1
2k1
100%
26 / 29
Precision
1
2n1
(E bias)
x = (1) 2
Dynamic Range
Precision
(1 + F )
Fixed-Point
Floating-Point
138 dB
1535 dB
1.2 105 % 3.0 103 %
Precision =
1
100 = 3.0 103 %
15
2
27 / 29
28 / 29
Dynamic Range:
I
I
Resolution/Precision:
I
I
29 / 29