
TERM PAPER

ON
CSE 318

TOPIC: DISASSEMBLER

SUBMITTED TO: MS. N. PRIYANKA

SUBMITTED BY: GUNJA KUMARI
R.NO.-RB1801A10
REG.NO.-10810231

TABLE OF CONTENTS
• Abstract
• Introduction
• System Requirements
   ◦ Memory
   ◦ Input devices
   ◦ Output devices
   ◦ Software
• Input
   ◦ Memory
   ◦ Disk sectors
   ◦ Disk boot files
   ◦ Binary load files
• Output
   ◦ Screen
   ◦ Printer
   ◦ Disk
   ◦ Cassette
• Working
• Unrecognized instructions
• Examples of disassemblers
• Disassembler issues
   ◦ Separating code from data
   ◦ Lost information
• Conclusion
• Flow chart of disassembler
• References

ABSTRACT

A “disassembler” is a well-recognized computer system program. It translates machine language into assembly language, the inverse operation of an assembler. A disassembler differs from a decompiler, which targets a high-level language rather than an assembly language. Disassembly, the output of a disassembler, is often formatted for human readability rather than for suitability as input to an assembler, making it principally a reverse-engineering tool.

KEYWORDS:
DISK SECTORS
DISK BOOT FILES
BINARY LOAD FILES
CASSETTE
IDA 3.7
IDA Pro Freeware
PVDasm
diStorm64
INTRODUCTION

A disassembler unassembles a compiled executable into assembly language statements. A good disassembler will show the logic between the various sections of code, comment on the code, pull out the visible ASCII strings, and separate the program into instructions and stored data (often a disassembler’s most confounding task).

It is software that converts machine language back into assembly language. Since there is no way to determine the human thinking behind the logic of the instructions, the resulting assembly language routines and variables are named and numbered sequentially (A001, A002, etc.). Disassembled code can be very difficult to maintain in its original state; however, the code can be manually renamed for future maintenance.

Hypothetical                 Hypothetical
Human-Written                Machine-Created
Assembler Code               Disassembler Code

start in quant               R001 in A001
if quant>100                 if A001>100
goto bigorder                goto R002
print "end"                  print "end"
stop                         stop

DISASSEMBLER 6502 was written for the hardcore assembler language programmer who is seldom happy with an existing piece of software. When no source code is available, modifying machine language programs can be extremely difficult. DISASSEMBLER 6502 creates source code from machine language that can be modified, reassembled, and executed.

SYSTEM REQUIREMENTS:

MEMORY: 24k minimum and 48k desirable.

INPUT DEVICES: Keyboard and one or more disk drives.


OUTPUT DEVICES: Screen and one disk drive. A second disk drive or cassette drive is needed for machine-readable output of disk boot files or large disassemblies. Double density can be very beneficial.

SOFTWARE: An assembler/editor is required if you wish to reassemble the output. DISASSEMBLER 6502 output has been tested on the ATARI ASSEMBLER/EDITOR cartridge and MAC/65 from Optimized Systems Software.

INPUT: Input to the disassembler can be from the following sources:

MEMORY - Any range of memory addresses from 0 to $FFFF can be disassembled. Addresses can be entered in either decimal or hex (e.g. 100 or $64).

DISK SECTORS - Any range of sector numbers from 1 to 720 can be disassembled. Sector numbers may be entered in either decimal or hex. The last three bytes of a DOS format sector are control bytes. They contain the directory entry number, the next sector number and the number of bytes used in the sector. Options are available to disassemble sectors including or excluding these control bytes.

DISK BOOT FILES - By simply placing the disk in the specified drive after disk boot option selection, any boot file on that disk will be disassembled. This does not work on double density disks. The ATARI operating system initialization code only supports double density disk boot in half sectors. Caution - Many times the disk boot file is simply a loader to load the remainder of the program. Use the sector disassembly option to disassemble the remainder of the program once it has been determined what sectors are being loaded.

BINARY LOAD FILES - DISASSEMBLER 6502 will disassemble any DOS 2.0 or OS/A+ Version 2 format binary load file. Compound structures are also supported.

OUTPUT:

There are four output options available for DISASSEMBLER 6502. Any one or more in combination can be used. You may continue to select options until the <RETURN> key is pressed. At that time remaining information may be obtained and disassembly begun.

SCREEN - Output is directed to the screen editor. A line of output includes the hex machine instruction, the 6502 assembler language instruction, and the hex address of the instruction.

PRINTER - Output is directed to the printer. You will be prompted for an optional page heading to be printed at the top of each page. A line of output includes the hex instruction, the line sequence number (as it would appear on a disk output file), the assembler instruction and the hex address of the instruction.

DISK - Output is directed to a specified disk file. The output is in LIST format and includes a line sequence number, assembler instruction, and hex address. The file can be entered into the assembler/editor for modification and reassembly. Since most assemblers can only assemble about 1800-2100 lines of instructions, the output is put into multiple files of 1600 lines or less. This allows room for modifications. An extender of X01 - Xnn is appended to the file name. All of the output files can be reassembled as a unit by using the .INCLUDE facility and/or disk assembly facility which are available in many good assemblers. They can also be assembled separately and combined using the binary save feature of DUP.SYS. A 1600 line file will use from 120 to 150 single density disk sectors. Thus an empty disk can hold two files. Double density disks have double the capacity.

CASSETTE - Output is directed to a cassette recorder in the same format as that directed to disk. The files are also split into 1600 line files. This will require most of one side of a sixty minute cassette.

WORKING:

The disassembler only disassembles object code that is present in memory (ROM included), so the first menu allows us to load the code from external sources. Select the option by pressing the number key. Except for binary disk files, which contain their own load addresses, all load options require that you specify a starting address for the load. The program asks if we wish to use a string. If we don't, then just press return, and input the address.

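
Binary load files need no starting address because each segment carries its own. A minimal sketch of reading such a file follows. It assumes the commonly documented Atari DOS 2.0 layout (a $FF $FF marker, then a little-endian start and end address ahead of each segment's data); the function name `parse_binary_load` is hypothetical and is not taken from DISASSEMBLER 6502.

```python
import struct

def parse_binary_load(data: bytes):
    """Split an Atari-style binary load file into (start, end, payload) segments.

    Assumed layout (an assumption, not quoted from the DISASSEMBLER 6502 manual):
      $FF $FF marker (required at file start, optional before later segments),
      2-byte little-endian start address, 2-byte little-endian end address,
      then end - start + 1 bytes of data.
    """
    if data[:2] != b"\xff\xff":
        raise ValueError("missing $FFFF header")
    segments = []
    pos = 0
    while pos < len(data):
        # Skip the marker when it is present.
        if data[pos:pos + 2] == b"\xff\xff":
            pos += 2
        if pos + 4 > len(data):
            raise ValueError("truncated segment header")
        start, end = struct.unpack_from("<HH", data, pos)
        pos += 4
        length = end - start + 1
        if length <= 0 or pos + length > len(data):
            raise ValueError("bad segment bounds")
        segments.append((start, end, data[pos:pos + length]))
        pos += length
    return segments

# A compound file with two segments: 3 bytes at $0600 and 2 bytes at $2000.
sample = (b"\xff\xff" + b"\x00\x06\x02\x06" + b"\xa9\x00\x60"
          + b"\x00\x20\x01\x20" + b"\xea\x60")
for start, end, payload in parse_binary_load(sample):
    print(f"${start:04X}-${end:04X}: {payload.hex(' ').upper()}")
```

Each tuple gives the address range a disassembler could use as the origin for that segment, which is why no address prompt is needed for this load option.
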
Any free RAM, including page six, can be used. If we need to reserve low memory, do that before loading the BASIC program. The device spec is optional. The disassembler uses d1: as a default. Disk files may be either data files (handy for those weird character strings) or regular binary files. Multi-stage binary loads will ask for permission to poke the bytes. When the entire file has been read, input the starting address for disassembly.

The Data option reads DATA statements, which should be entered after the disassembler has been loaded. Make sure the line numbers are above 1540. The program will read the whole block, poking bytes starting at the first address specified.

The Keyboard option lets us type in programs directly. Typing 999 backs up five bytes to correct typos. Any minus number starts the disassembly.

Once the screen is full, select an option from the menu by pressing the appropriate letter key.

C (or return) continues disassembly inline.
N shifts to a new address (Ex: to check a jump instruction).
P dumps the current screen to the printer (the screen is turned off for this and all other I/O to speed things up).
E goes to the exit menu. From this menu:
   R starts over from the original starting address.
   M goes back to the top menu.
   Q ends the program.
   S writes a source file.

When writing a source file, the first choice is whether to write a regular source file or a byte file, which is used for text, tables, and such. Files may be written to the printer. Input p or p: for a filename. A line count option is provided for those with single sheet printers. If using continuous feed, input something like 10000 at the prompt. When the count is reached, the program will halt and beep to signal that it's waiting for a key press to continue.

Disk files are in listed format, so they can be entered into asm/ed, mac/65, or any other line oriented assembler which uses standard opcodes. Line numbers correspond to addresses, so if you see a 'jmp 1608', you can 'list 1608' to see what's there. No .org address is included in the file. Once you have the file, don't renumber it until you've provided labels for all the appropriate references.

UNRECOGNIZED INSTRUCTIONS:

If an opcode is encountered that is not recognized as a valid 6502 opcode, a .BYTE instruction is generated. Up to three unrecognized bytes will be included in a single .BYTE record. A BRK instruction ($00) most often occurs as a data byte rather than an instruction, so the disassembler treats binary zeros as .BYTE data.

Since data bytes that are valid opcodes cannot be distinguished by a disassembler as data bytes, they will be interpreted as instructions. The logic flow of the program should indicate which of these instructions are actually data bytes. This misinterpretation of data bytes will not prevent the reassembled program from looking just like the original.

EXAMPLES OF DISASSEMBLERS:

Any interactive debugger will include some way of viewing the disassembly of the program being debugged. Often, the same disassembly tool will be packaged as a standalone disassembler distributed along with the debugger. For example, objdump, part of GNU Binutils, is related to the interactive debugger gdb.

• IDA 3.7: A DOS GUI tool that behaves very much like IDA Pro, but is considerably more limited. It can disassemble code for the Z80, 6502, Intel 8051, Intel i860, and PDP-11 processors, as well as x86 instructions up to the 486.
• IDA Pro: It is the most popular disassembler for tracing malware. It has a fantastic GUI and feature set, and it supports most Windows executable types, including EXE, NE, and PE files. It automatically detects the data and code portions of a program. It will auto-comment code, show graphical relationships between code jumps, document local variables, and automatically recognize the standard library functions generated by popular C compilers. The purchase cost includes a year of free e-mail support and updates. Its feature set and popularity have resulted in a broad community to support questions, active plug-in development, and training classes.

[Figure: An IDA Pro logic diagram]

• ILDASM: It is a tool contained in the .NET Framework SDK. It can be used to disassemble PE files containing Common Intermediate Language code.
• OllyDbg: It is a 32-bit assembler-level analyzing debugger.
• PVDasm: It is a free, interactive, multi-CPU disassembler.
• SIMON: A test/debugger/animator with integrated disassembler for Assembler, COBOL and PL/1.
• BeaEngine: BeaEngine is a complete disassembler library for IA-32 and intel64 architectures (coded in C and usable in various languages: C, Python, Delphi, PureBasic, WinDev, masm, fasm, nasm, GoAsm).
• HT Editor: An analyzing disassembler for Intel x86 instructions. The latest version runs as a console GUI program on Windows, but there are versions compiled for Linux as well.
• Texe: It is a free, 32-bit disassembler and Windows PE file analyzer.
• diStorm64: diStorm is an open source, highly optimized stream disassembler library for 80x86 and AMD64.
• PE Explorer: Another great commercial disassembler alternative is PE Explorer. Like IDA Pro (but with fewer features), it is very Windows-friendly and easy to navigate. It extracts APIs, shows dependencies, pulls out code sections, and disassembles compiled programs into commented code dumps.

The figure below shows PE Explorer disassembling Netlog1.exe (renamed Netlog1.vir). As we can see, it offers a thorough look at the PE file structure and all of the resources in the file, and tells us just about every little detail we could possibly want to know about a PE file.

[Figure: PE Explorer disassembling Netlog1.exe]

DISASSEMBLER ISSUES:

There are a number of issues and difficulties associated with the disassembly process. The two most important difficulties are the division between code and data, and the loss of text information.

Separating Code from Data

The problem wouldn't be as difficult if data were limited to the .data section of an executable and if executable code were limited to the .code section of an executable, but this is often not the case. Data may be inserted directly into the code section (e.g. jump address tables, constant strings), and executable code may be stored in the data section (although new systems are working to prevent this for security reasons).

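
To see why data inside the code section causes trouble, consider the sketch below. It compares a linear sweep with a recursive traversal on an 11-byte 6502 fragment that has the string "HELLO" embedded right after a jump. The opcode table is a hypothetical six-entry subset, and the names `linear_sweep` and `recursive_traversal` are illustrative; neither is claimed to be what any tool named in this paper implements.

```python
# Hypothetical, deliberately tiny opcode table: opcode -> (mnemonic, length).
OPCODES = {
    0x00: ("BRK", 1), 0x45: ("EOR", 2), 0x48: ("PHA", 1),
    0x4C: ("JMP", 3), 0x60: ("RTS", 1), 0xA9: ("LDA", 2),
}

ORG = 0x0600
# JMP $0608 / the string "HELLO" / LDA #$00 / RTS
IMAGE = bytes([0x4C, 0x08, 0x06]) + b"HELLO" + bytes([0xA9, 0x00, 0x60])

def linear_sweep(image, org):
    """Decode every byte in order, trusting that code follows code."""
    found, pc = [], 0
    while pc < len(image):
        # Bytes outside the table are reported as one-byte .BYTE records.
        name, size = OPCODES.get(image[pc], (".BYTE", 1))
        found.append((org + pc, name))
        pc += size
    return found

def recursive_traversal(image, org, entry):
    """Decode only addresses reachable by following control flow."""
    found, todo, seen = [], [entry], set()
    while todo:
        addr = todo.pop()
        while org <= addr < org + len(image) and addr not in seen:
            op = image[addr - org]
            if op not in OPCODES:
                break
            name, size = OPCODES[op]
            if addr - org + size > len(image):
                break  # instruction would run past the end of the image
            seen.add(addr)
            found.append((addr, name))
            if name == "JMP":
                # Absolute jump: continue at the little-endian target.
                lo, hi = image[addr - org + 1], image[addr - org + 2]
                todo.append(lo | (hi << 8))
                break
            if name in ("RTS", "BRK"):
                break  # control does not fall through
            addr += size
    return sorted(found)

sweep = linear_sweep(IMAGE, ORG)
walk = recursive_traversal(IMAGE, ORG, ORG)
print([f"${a:04X} {n}" for a, n in sweep])
print([f"${a:04X} {n}" for a, n in walk])
```

The linear sweep decodes the string bytes as PHA, EOR and JMP and, having lost synchronization, never reports the LDA at $0608. The traversal finds only the three real instructions and leaves the string undecoded as presumed data. A traversal like this still cannot follow computed jumps, which is one reason the problem stays hard in general.
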
Many interactive disassemblers will give the user the option to render segments of code as either code or data, but non-interactive disassemblers will make the separation automatically. Disassemblers often will provide the instruction and the corresponding hex data on the same line, to reduce the need for decisions to be made about the nature of the code. Some disassemblers (e.g. ciasdis) will allow you to specify rules about whether to disassemble as data or code and invent label names, based on the content of the object under scrutiny. Scripting your own "crawler" in this way is more efficient; for large programs interactive disassembling may be impractical to the point of being infeasible.

The general problem of separating code from data in arbitrary executable programs is equivalent to the halting problem. As a consequence, it is not possible to write a disassembler that will correctly separate code and data for all possible input programs. Reverse engineering is full of such theoretical limitations, although by Rice's theorem all non-trivial questions about program behavior are undecidable (so compilers and many other tools that deal with programs in any form run into such limits as well). In practice a combination of interactive and automatic analysis and perseverance can handle all but programs specifically designed to thwart reverse engineering, for example by encrypting code and decrypting it just prior to use, or by moving code around in memory.

Lost Information

All text-based identifiers, such as variable names, label names, and macros are removed by the assembly process. They may still be present in generated object files, for use by tools like debuggers and relocating linkers, but the direct connection is lost and re-establishing that connection requires more than a mere disassembler. These identifiers, in addition to comments in the source file, help to make the code more readable to a human, and can also offer some clues to the purpose of the code. Without these comments and identifiers, it is harder to understand the purpose of the source code, and it can be difficult to determine the algorithm being used by that code. When you combine this problem with the possibility that the code you are trying to read may, in reality, be data, then it can be even harder to determine what is going on.

CONCLUSION

Assembly language source code generally permits the use of constants and programmer comments. These are usually removed from the assembled machine code by the assembler. If so, a disassembler operating on the machine code would produce disassembly lacking these constants and comments; the disassembled output becomes more difficult for a human to interpret than the original annotated source code. Some disassemblers make use of the symbolic debugging information present in object files such as ELF. The Interactive Disassembler allows the human user to make up mnemonic symbols for values or regions of code in an interactive session: human insight applied to the disassembly process often parallels human creativity in the code writing process.

Flow chart of disassembler
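
In code form, the basic flow of a disassembler of the kind described under UNRECOGNIZED INSTRUCTIONS might look like the sketch below: fetch a byte, look it up, emit either an instruction or a .BYTE record, and advance. The five-entry opcode table, the function name `disassemble`, and the exact `.BYTE` operand syntax are assumptions made for illustration, not the behavior of DISASSEMBLER 6502 itself.

```python
# Hypothetical subset of the 6502 opcode table:
# opcode -> (mnemonic, operand byte count, operand format).
OPCODES = {
    0xA9: ("LDA", 1, "#${:02X}"),   # immediate
    0x8D: ("STA", 2, "${:04X}"),    # absolute
    0x4C: ("JMP", 2, "${:04X}"),    # absolute
    0xE8: ("INX", 0, ""),           # implied
    0x60: ("RTS", 0, ""),           # implied
}

def disassemble(code: bytes, origin: int):
    """Return (address, raw bytes, text) rows for one linear pass over code."""
    rows, pc = [], 0
    while pc < len(code):
        addr = origin + pc
        entry = OPCODES.get(code[pc])
        # $00 (BRK) is deliberately absent from the table, so it is emitted
        # as data, as is any opcode whose operand would run past the input.
        if entry is None or pc + 1 + entry[1] > len(code):
            start = pc
            pc += 1
            # Gather up to three consecutive unrecognized bytes per record.
            while pc < len(code) and pc - start < 3 and code[pc] not in OPCODES:
                pc += 1
            raw = code[start:pc]
            text = ".BYTE " + ",".join(f"${b:02X}" for b in raw)
        else:
            name, nbytes, fmt = entry
            raw = code[pc:pc + 1 + nbytes]
            operand = int.from_bytes(raw[1:], "little")
            text = f"{name} {fmt.format(operand)}".strip()
            pc += 1 + nbytes
        rows.append((addr, raw, text))
    return rows

sample = bytes([0xA9, 0x01, 0x8D, 0x00, 0xD4, 0x00, 0xFF, 0x02, 0x03, 0xE8, 0x60])
for addr, raw, text in disassemble(sample, 0x0600):
    # Column order follows the SCREEN output: hex bytes, instruction, address.
    print(f"{raw.hex().upper():<8}{text:<20}${addr:04X}")
```

A full 6502 table has 151 documented opcodes; the loop itself would not change, only the table.
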
REFERENCES:

http://www.ebook-search-engine.com/disassembler-ebook-doc.html
http://en.wikibooks.org/wiki/X86_Disassembly/Disassemblers_and_Decompilers
http://www.answers.com/topic/disassembler
http://www.digitalmars.com/ctg/obj2asm.html
http://www.heaventools.com/PE_Explorer_disassembler.html
