
When is –32,104 equal to +25,201?

Many non-programmers process massive amounts of data on a daily basis. Often the files
come from sources outside their company and go to users who are likewise outside the
organization. Along the way, records are added or dropped and formats are changed. It has
been our observation that when these employees do not understand data representation,
quality suffers!

Many of these employees risk corrupting their data when they transfer files from the
mainframe to the PC. They often aren't aware that they are dealing with two different
coding schemes, or of the impact this has on the file transfer process.

These coding schemes are EBCDIC and ASCII. Generally, we can think of EBCDIC as
the mainframe code and ASCII as the PC code. Technically, the word "code" is incorrect
here, as these schemes are really "substitution ciphers". Even Morse code is really a
cipher! You may have worked with ciphers as a kid. A common one substitutes each
letter with its ordinal position in the alphabet: A=1, B=2, C=3, …, Z=26. As a kid, I
could encipher CAB as 3-1-2.

EBCDIC and ASCII are just different ciphers. In EBCDIC, CAB would be 195-193-194
(or hexadecimal X'C3C1C2'), whereas in ASCII, CAB would be 67-65-66 (or
hexadecimal X'434142').
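
The two encodings of CAB can be checked with a few lines of Python. This is only a sketch: it assumes Python's built-in cp037 codec (EBCDIC US/Canada) as a stand-in for "EBCDIC"; other EBCDIC code pages differ for some characters, although the basic letters shown here are the same.

```python
# Encode the same three letters under an EBCDIC code page and under ASCII.
# Assumption: cp037 stands in for "EBCDIC" in this illustration.
text = "CAB"

ebcdic_bytes = text.encode("cp037")
ascii_bytes = text.encode("ascii")

# Show each encoding as decimal byte values and as a hex string.
print(list(ebcdic_bytes), ebcdic_bytes.hex().upper())  # [195, 193, 194] C3C1C2
print(list(ascii_bytes), ascii_bytes.hex().upper())    # [67, 65, 66] 434142
```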

The file transfer process is really a very dumb process; that is, it has no idea what is being
transferred. The file transfer program transfers – and translates – one byte at a time.
This is not a problem if the file being transferred contains "text" data only. But if the file
contains binary or packed decimal fields, problems will likely occur. Binary and packed
decimal fields must be converted to "text" before file transfers!
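
One way to picture "converting to text first" is the sketch below. It assumes a big-endian signed halfword, the cp037 code page as a stand-in for EBCDIC, and a hypothetical fixed-width layout (sign plus five digits) for the text field; a real file would follow whatever layout its record description specifies.

```python
import struct

# A binary halfword as it might sit in a mainframe record.
raw = bytes.fromhex("8298")
(value,) = struct.unpack(">h", raw)       # big-endian signed 16-bit integer

# Hypothetical text layout: sign followed by five digits.
text_field = f"{value:+06d}"
ebcdic_text = text_field.encode("cp037")  # text as stored before the transfer

# A translating transfer maps each EBCDIC character to its ASCII equivalent.
ascii_text = ebcdic_text.decode("cp037").encode("ascii")

# The PC program reads the digits back and gets the original number.
recovered = int(ascii_text.decode("ascii"))
print(text_field, recovered)
```

Because every byte in the text field is a printable character, the character-by-character translation leaves its meaning intact.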

Consider the following example. Assume –32,104 is stored on the mainframe as a binary
halfword. This would occupy two bytes as X'8298'. This file needs to be downloaded to a
PC. But, as stated before, the file transfer program will translate one byte at a time. In
EBCDIC, X'82' is the lower-case letter 'b'. The file transfer program attempts to find the
equivalent character in ASCII. In ASCII, the lower-case letter 'b' is X'62'. Likewise, in
EBCDIC, X'98' is the lower-case letter 'q'. In ASCII, the lower-case letter 'q' is X'71'. So
X'8298' on the mainframe becomes X'6271' on the PC.
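
The byte-at-a-time translation just described can be imitated in Python. This is a rough sketch rather than how any particular file transfer program is written, and it again assumes cp037 as the EBCDIC code page.

```python
# The two bytes of the binary halfword as stored on the mainframe.
mainframe_bytes = bytes.fromhex("8298")

translated = bytearray()
for byte in mainframe_bytes:
    # Treat the byte as an EBCDIC character, whatever it really was...
    char = bytes([byte]).decode("cp037")
    # ...and emit the ASCII byte for that same character.
    translated += char.encode("ascii")

pc_bytes = bytes(translated)
print(mainframe_bytes.decode("cp037"))  # bq
print(pc_bytes.hex().upper())           # 6271
```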

On the mainframe side, X'8298' was the number –32,104, but it was also the letters 'bq';
which one it is depends on context. On the PC side, the letters 'bq' are represented as
X'6271', which, if read as a binary halfword, is the number +25,201. If the PC program
that processes the downloaded data expects a binary halfword, it will find that the value
of that halfword has changed!
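
The change in value is easy to confirm. The sketch below reads both byte pairs as signed 16-bit integers, assuming the same big-endian byte order on both sides, as the example above does.

```python
import struct

# Interpret each pair of bytes as a big-endian signed halfword.
(before,) = struct.unpack(">h", bytes.fromhex("8298"))  # as stored on the mainframe
(after,) = struct.unpack(">h", bytes.fromhex("6271"))   # as it arrives on the PC

print(before)  # -32104
print(after)   # 25201
```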
