Learning Objectives
Describe how data is represented and stored within computer hardware
Describe how nonnumeric data is represented
Describe how simple data types are used as building blocks to create more complex data structures (e.g., arrays, records)
Outline
Goals of data representation
Primitive data types
Integer
Real number
Character
Boolean
Memory address
Data structures
Arrays and lists
Records and files
Classes and objects
Compactness
Accuracy
Range
Ease of manipulation
Standardization
Accuracy
Precision of representation increases with number of data bits used
Standardization
Ensures correct and efficient data transmission
Provides flexibility to combine hardware from different vendors with minimal data communication problems
68
Binary: 0100 0100 = 2^6 + 2^2 = 64 + 4 = 68
BCD: 0110 1000 (0110 = 2^2 + 2^1 = 6; 1000 = 2^3 = 8)
99 (largest 8-bit BCD)
Binary: 0110 0011 = 2^6 + 2^5 + 2^1 + 2^0 = 64 + 32 + 2 + 1 = 99
BCD: 1001 1001 (1001 = 2^3 + 2^0 = 9; 1001 = 2^3 + 2^0 = 9)
255 (largest 8-bit binary)
Binary: 1111 1111 = 2^8 - 1 = 255
BCD: 0010 0101 0101 (0010 = 2^1 = 2; 0101 = 2^2 + 2^0 = 5; 0101 = 2^2 + 2^0 = 5)
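The comparison above can be reproduced with a short sketch. The helper names `to_bcd` and `to_binary` are illustrative, not part of the slides, and only unsigned values are handled:

```python
def to_bcd(n: int) -> str:
    """Encode a non-negative integer as BCD: one 4-bit group per decimal digit."""
    if n < 0:
        raise ValueError("this sketch handles unsigned values only")
    return " ".join(format(int(digit), "04b") for digit in str(n))


def to_binary(n: int, bits: int = 8) -> str:
    """Encode a non-negative integer as plain binary, shown in 4-bit groups."""
    raw = format(n, f"0{bits}b")
    return " ".join(raw[i:i + 4] for i in range(0, len(raw), 4))


for value in (68, 99, 255):
    print(value, "binary:", to_binary(value), "BCD:", to_bcd(value))
```

Note that 255 needs three BCD digit groups (12 bits) but only 8 bits in binary, which is the compactness trade-off the slide illustrates.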
Bits   BCD Range                    Binary Range
4      0-9 (1 digit)                0-15 (1+ digit)
8      0-99 (2 digits)              0-255 (2+ digits)
12     0-999 (3 digits)             0-4,095 (3+ digits)
16     0-9,999 (4 digits)           0-65,535 (4+ digits)
20     0-99,999 (5 digits)          0-1 million (6+ digits)
24     0-999,999 (6 digits)         0-16 million (7+ digits)
32     0-99,999,999 (8 digits)      0-4 billion (9+ digits)
64     0-(10^16 - 1) (16 digits)    0-18 quintillion (19+ digits)
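The ranges in the table follow from two formulas: n bits hold n/4 BCD digits (maximum 10^(n/4) - 1) or an unsigned binary value up to 2^n - 1. A minimal sketch, with hypothetical helper names `bcd_max` and `binary_max`:

```python
def bcd_max(bits: int) -> int:
    """Largest unsigned BCD value: each decimal digit uses 4 bits."""
    return 10 ** (bits // 4) - 1


def binary_max(bits: int) -> int:
    """Largest unsigned binary value in the given number of bits."""
    return 2 ** bits - 1


for bits in (4, 8, 12, 16, 20, 24, 32, 64):
    print(f"{bits:>2} bits  BCD 0-{bcd_max(bits):,}  binary 0-{binary_max(bits):,}")
```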
precision
Supported by business-oriented languages such as COBOL
Signed Integers
Sign-and-magnitude representation
Excess-N notation
Two's complement (most common)
Excess-8 Notation
How to represent?
Add the offset (8) to the decimal value of the signed integer, then store the sum as an unsigned 4-bit number.
For example:
0 is represented by 0 + 8 = 8 = 1000
-5 is represented by -5 + 8 = 3 = 0011
+5 is represented by 5 + 8 = 13 = 1101
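The three examples above can be checked with a small sketch. The function names are illustrative, and the 4-bit width is an assumption taken from the bit patterns shown:

```python
def to_excess(value: int, offset: int = 8, bits: int = 4) -> str:
    """Encode a signed integer in excess-N notation (default: excess-8, 4 bits)."""
    stored = value + offset
    if not 0 <= stored < 2 ** bits:
        raise ValueError(f"{value} is outside the excess-{offset} range for {bits} bits")
    return format(stored, f"0{bits}b")


def from_excess(pattern: str, offset: int = 8) -> int:
    """Decode an excess-N bit pattern back to the signed integer."""
    return int(pattern, 2) - offset


for value in (0, -5, 5):
    print(value, "->", to_excess(value))
```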
Complementary Representation
Sign of the number does not have to be handled separately
Consistent for all different signed combinations of input numbers
Two's Complement
Most commonly used signed-integer representation in computers.
A fixed number of bit positions are used.
Consistent for all different signed combinations of input numbers
Subtraction can be performed as addition of a negative value: a - b = a + (-b)
Numbers                     Negative                Positive
Representation method       Complement              Number itself
Range of decimal numbers    -128 to -1              +0 to +127
Calculation                 Inversion + 1           None
Representation example      10000000 to 11111111    00000000 to 01111111
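The "inversion + 1" rule in the table can be sketched as follows. The function names are illustrative, and 8 bits are assumed to match the table:

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a signed integer as a two's complement bit string."""
    low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    if value >= 0:
        return format(value, f"0{bits}b")       # positive: the number itself
    mask = 2 ** bits - 1
    inverted = (-value) ^ mask                  # invert every bit of the magnitude
    return format((inverted + 1) & mask, f"0{bits}b")  # then add 1


def from_twos_complement(pattern: str) -> int:
    """Decode a two's complement bit string to a signed integer."""
    value = int(pattern, 2)
    if pattern[0] == "1":                       # leading 1 marks a negative value
        value -= 2 ** len(pattern)
    return value


for n in (-128, -1, 0, 127):
    print(n, "->", to_twos_complement(n))
```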
Subtraction: converted to addition
Add two 8-bit numbers with different signs:
45 - 58 = 45 + (-58) = -13
For comparison, 45 + 58:
  0010 1101 =  45
+ 0011 1010 = +58
  0110 0111 = 103
45 + (-58), using the two's complement of 58:
  0010 1101 =  45
+ 1100 0110 = -58
  1111 0011 = -13
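A minimal sketch of the same calculation on raw 8-bit patterns, using only addition and bit inversion. The helper names `negate`, `subtract`, and `as_signed` are hypothetical:

```python
def negate(pattern: int, bits: int = 8) -> int:
    """Two's complement negation of a raw bit pattern: invert, then add 1."""
    mask = (1 << bits) - 1
    return ((pattern ^ mask) + 1) & mask


def subtract(a: int, b: int, bits: int = 8) -> int:
    """Compute a - b as a + (-b), keeping only the low `bits` bits."""
    mask = (1 << bits) - 1
    return (a + negate(b, bits)) & mask


def as_signed(pattern: int, bits: int = 8) -> int:
    """Interpret a raw bit pattern as a signed two's complement value."""
    return pattern - (1 << bits) if pattern & (1 << (bits - 1)) else pattern


result = subtract(45, 58)
print(format(negate(58), "08b"), format(result, "08b"), as_signed(result))
```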
Integer Overflow
A fixed word size gives a fixed range of representable values.
Occurs when absolute value of a computational result contains too many bits to fit into fixed-width data format, i.e., the value is too large to be stored.
Example of Overflow
8-bit signed number
256 different values
Data range: -128 to +127
Add:
  0100 0000 =  64
+ 0100 0001 = +65
  1000 0001 = -127
2 positive inputs produced negative result: overflow! Wrong answer!
To get the magnitude of 1000 0001, invert and add 1: 0111 1111 = 127 (decimal)
Avoiding Overflow
Overflow can be avoided by increasing the number of bits representing the data.
In the previous example, adding one more bit gives a new data range of -256 to +255 for 9-bit signed numbers.
Each number is represented in 9-bit notation.
  0 0100 0000 =  64
+ 0 0100 0001 = +65
  0 1000 0001 = +129
2 positive inputs produced positive result. Right answer!
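The 8-bit and 9-bit additions above can be simulated with a short sketch. Python integers do not overflow, so the fixed width is emulated by masking; `add_signed` is a hypothetical helper:

```python
def add_signed(a: int, b: int, bits: int) -> tuple[int, bool]:
    """Add two signed integers in a fixed-width two's complement format.

    Returns the stored result and a flag that is True when overflow occurred.
    """
    mask = (1 << bits) - 1
    raw = (a + b) & mask                          # keep only the low `bits` bits
    stored = raw - (1 << bits) if raw & (1 << (bits - 1)) else raw
    return stored, stored != a + b                # mismatch means overflow


print(add_signed(64, 65, bits=8))   # 8-bit format: the sum does not fit
print(add_signed(64, 65, bits=9))   # 9-bit format: the sum fits
```

A common hardware-style check is equivalent: overflow occurred if both inputs have the same sign and the result has the opposite sign.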
Representation formats:
Fixed radix point format
Floating point notation (commonly used)
-6.35790 x 10^-6
Mantissa: 6.35790 (negative sign)
Location of decimal point: after the first digit of the mantissa
Base: 10
Exponent: -6
-1.110010 x 2^-6
Mantissa: 1.110010 (negative sign)
Location of radix point: after the first digit of the mantissa
Base: 2
Exponent: -6
8-bit exponent
23-bit mantissa
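Exponent encoding (figure): stored bit patterns 00000000 (0), 01111111 (127), 10000000 (128), and 11111111 (255), in order of increasing value, represent exponents from -127 to +128.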
Exponent
1000 0000
Mantissa
1011 0000 0000 0000 0000 000
= +1.1011 x 2^1 = +11.011 (binary) = 3.375 (decimal)
1000 0011
0111 1101
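The worked example (exponent 1000 0000, mantissa 1011 followed by zeros) can be decoded with a short sketch. It assumes the slides use the IEEE 754 single-precision layout (1 sign bit, 8-bit exponent stored with an offset of 127, 23-bit mantissa with an implied leading 1), and a sign bit of 0 since the result is positive. The manual decoder ignores the special exponent patterns 00000000 and 11111111:

```python
import struct


def decode_manual(sign: str, exponent: str, mantissa: str) -> float:
    """Decode a normalized 32-bit float by hand from its three bit fields."""
    power = int(exponent, 2) - 127                 # remove the exponent offset
    significand = 1 + int(mantissa, 2) / 2 ** 23   # restore the implied leading 1
    return (-1) ** int(sign) * significand * 2.0 ** power


def decode_with_struct(sign: str, exponent: str, mantissa: str) -> float:
    """Cross-check: let the standard library interpret the same 32 bits."""
    word = int(sign + exponent + mantissa, 2)
    return struct.unpack(">f", word.to_bytes(4, "big"))[0]


exponent = "10000000"
mantissa = "1011" + "0" * 19                       # 23 bits in total
print(decode_manual("0", exponent, mantissa))
print(decode_with_struct("0", exponent, mantissa))
```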
Truncation
Stores numeric value in the mantissa until available bits are consumed; discards remaining bits
Causes an error or approximation that can be magnified by subsequent calculations
Avoid by using integer types
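A small sketch of the effect: 0.1 has no finite binary expansion, so a 23-bit mantissa cannot hold it exactly, and repeated arithmetic accumulates the error. The helper `to_float32` is illustrative; it round-trips a value through a 32-bit float using the standard `struct` module. The integer alternative assumes the quantity can be counted in whole units (here, tenths):

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


stored = to_float32(0.1)
print(stored == 0.1)        # the 32-bit value differs from the 64-bit one

total = 0.0
for _ in range(10):
    total += 0.1            # each addition carries a small representation error
print(total == 1.0)         # the accumulated error makes the comparison fail

tenths = sum([1] * 10)      # integer arithmetic on whole units stays exact
print(tenths == 10)
```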
Processing Complexity
Floating point formats
Optimized for processing efficiency
Require complex processing circuitry (translates to difference in speed)
Programming Considerations
Integer advantages
Programming Considerations
Real numbers
Variable or constant has fractional part
Numbers take on very large or very small values outside integer range
Program should use least precision sufficient for the task
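ASCII table (figure): 7-bit character codes, indexed by hexadecimal digits (LSD = least significant digit).
Control characters: NUL, SOH, STX, ETX, EOT, ENQ, ACK, BEL, BS, HT, LF, VT, FF, CR, SO, SI, DLE, DC1, DC2, DC3, DC4, NAK, SYN, ETB, CAN, EM, SUB, ESC, FS, GS, RS, US, DEL
Printable characters include space, &, <, and >.
Example: 74 (hex) = 111 0100 (the letter t)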
ASCII strings:
S = 1010011, a = 1100001, m = 1101101
Hexadecimal encoding: 53 61 6D
Decimal encoding: 83 97 109
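The "Sam" example can be reproduced with Python's built-in ASCII codec. This is a minimal sketch; the variable names are illustrative:

```python
text = "Sam"

codes = list(text.encode("ascii"))              # one 7-bit code per character
binary = [format(code, "07b") for code in codes]
hexadecimal = [format(code, "02X") for code in codes]

for char, code, bits, hx in zip(text, codes, binary, hexadecimal):
    print(char, code, bits, hx)
```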
ASCII Limitations
Insufficient range
Uses 7-bit code, providing 128 table entries
English-based
Unicode
The prevalent code today is Unicode.
Multilingual character encoding standard encompassing all of the world's written languages.
Defines codes for
Nearly every character-based alphabet
Large set of ideographs for Chinese, Japanese, and Korean
Composite characters for vowels and syllabic clusters required by some languages
Unicode (cont.)
Unicode was originally a 2-byte character set.
Each character is coded using a 16-bit binary string.
65,536 (2^16) characters can be represented.
ASCII as a subset
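A short sketch of both points: characters in the original 16-bit range occupy two bytes, and ASCII characters keep the same code values in Unicode. The helper `describe` is hypothetical; it uses Python's UTF-16 big-endian codec as a stand-in for the original 2-byte encoding:

```python
def describe(char: str) -> tuple[int, str]:
    """Return the Unicode code point and the 2-byte UTF-16 encoding as hex."""
    return ord(char), char.encode("utf-16-be").hex()


# "A" is an ASCII letter; "\u4e2d" is a Chinese ideograph (U+4E2D).
for char in ("A", "\u4e2d"):
    code_point, encoded = describe(char)
    print(f"U+{code_point:04X} -> {encoded}")

# ASCII as a subset: the ASCII code and the Unicode code point are identical.
print("A".encode("ascii")[0] == ord("A"))
```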
Segmented memory address: the segment identifies the page, and the offset identifies the byte within the page.
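A minimal sketch of splitting an address into segment and offset. The 12-bit offset (4,096-byte pages) is an assumption chosen for illustration, not a value from the slides or from any particular CPU:

```python
OFFSET_BITS = 12                       # hypothetical: 4,096 bytes per page
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def split_address(address: int) -> tuple[int, int]:
    """Split a flat address into (segment, offset)."""
    return address >> OFFSET_BITS, address & OFFSET_MASK


def join_address(segment: int, offset: int) -> int:
    """Rebuild the flat address from segment and offset."""
    return (segment << OFFSET_BITS) | offset


segment, offset = split_address(0x3A7F)
print(hex(segment), hex(offset))
```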
Data Structures
Related groups of primitive data elements organized for a type of common processing
Defined and manipulated within software
Commonly used data structures: arrays, linked lists, records, tables, files, indices, and objects
Many use pointers to link primitive data components
Linked Lists
A pointer is a data element that contains the address of another data element.
A linked list is a data structure that uses pointers so list elements can be scattered among nonsequential storage locations
Easier to expand or shrink than an array
Adding a new element to an existing linked list is easy:
1. Allocate an empty storage unit for the new element C.
2. Copy the pointer from element B, which precedes C, into the pointer field of C.
3. Set the pointer field of B to the address of C, linking B to C.
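The three steps can be sketched with a singly linked list. The `Node` class and the helper names are illustrative; Python object references play the role of pointers:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    next: Optional["Node"] = None      # "pointer" to the following element


def insert_after(prev: Node, value: str) -> Node:
    """Insert a new element after `prev`, following the three steps above."""
    new = Node(value)                  # 1. allocate storage for the new element
    new.next = prev.next               # 2. copy the predecessor's pointer into it
    prev.next = new                    # 3. point the predecessor at the new element
    return new


def to_list(head: Optional[Node]) -> list[str]:
    """Walk the chain of pointers and collect the values in order."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


d = Node("D")
b = Node("B", d)
a = Node("A", b)
insert_after(b, "C")
print(to_list(a))
```

No existing element is moved; only two pointers change, which is why a linked list is easier to expand than an array.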
Files or Databases
Sequence of records on secondary storage.
Indexed files
Records need not be stored in contiguous storage locations.
Uses an index, which is an array of pointers to records.
Efficient record insertion, deletion, and retrieval. Each time records change, the index must be updated, but because the index is a small array, updating it is fast.
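A minimal in-memory sketch of the idea, assuming integer record keys. The class `IndexedFile` is hypothetical: records stay wherever they were written, and only the small sorted index of (key, position) pairs is rearranged:

```python
import bisect


class IndexedFile:
    """Toy indexed file: unordered record storage plus a sorted index."""

    def __init__(self) -> None:
        self.storage: list[str] = []            # records in arrival order
        self.index: list[tuple[int, int]] = []  # sorted (key, position) pairs

    def _find(self, key: int) -> int:
        """Return the index slot for `key`, or -1 if the key is absent."""
        slot = bisect.bisect_left(self.index, (key,))
        if slot < len(self.index) and self.index[slot][0] == key:
            return slot
        return -1

    def insert(self, key: int, record: str) -> None:
        self.storage.append(record)             # the record itself is not sorted
        bisect.insort(self.index, (key, len(self.storage) - 1))

    def lookup(self, key: int):
        slot = self._find(key)
        return None if slot < 0 else self.storage[self.index[slot][1]]

    def delete(self, key: int) -> None:
        slot = self._find(key)
        if slot >= 0:
            del self.index[slot]                # only the index entry is removed


f = IndexedFile()
f.insert(30, "record C")
f.insert(10, "record A")
f.insert(20, "record B")
print(f.index, f.lookup(20))
```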
Objects
One instance, or variable, of the class
A Class Example
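As a purely illustrative sketch (the class name `Student` and its fields are hypothetical, not taken from the slides), a class bundles data elements with the operations on them, and each object is one instance of that class with its own data:

```python
class Student:
    """A class: a template describing data (attributes) and behavior (methods)."""

    def __init__(self, name: str, credits: int = 0) -> None:
        self.name = name
        self.credits = credits

    def add_credits(self, amount: int) -> int:
        """Update this object's own data and return the new total."""
        self.credits += amount
        return self.credits


# Two objects (instances) of the same class, each holding separate data.
alice = Student("Alice")
bob = Student("Bob", 30)
alice.add_credits(15)
print(alice.name, alice.credits, bob.name, bob.credits)
```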
Summary
Understanding data representation is key to understanding hardware and software technology
All data, including nonnumeric data, are represented within a modern computer system as strings of binary digits, or bits.
Each bit string has a specific data format and coding method.
Summary (Cont.)
Numeric data is stored using integer, real number, and floating point formats.
Characters are converted to numbers by means of a coding table.
Boolean values can have only two values, true and false.
Data structures are used by programs to define and manipulate data in larger and more complex units than primitive CPU data types.