Doug Burger
Todd M. Austin
SimpleScalar LLC
2395 Timbercrest Court
Ann Arbor, MI 48105
*Contact: info@simplescalar.com
http://www.simplescalar.com
1 Overview
Modern processors are incredibly complex marvels of engineering that are becoming increasingly hard to evaluate. This report describes the SimpleScalar tool set (release 2.0), which performs fast, flexible, and accurate simulation of modern processors that implement the SimpleScalar architecture (a close derivative of the MIPS architecture [4]). The tool set takes binaries compiled for the SimpleScalar architecture and simulates their execution on one of several provided processor simulators. We provide sets of precompiled binaries (including SPEC95), plus a modified version of GNU GCC (with associated utilities) that allows you to compile your own SimpleScalar test binaries from FORTRAN or C code.
The advantages of the SimpleScalar tools are high flexibility, portability, extensibility, and performance. We include five execution-driven processor simulators in the release. They range from an extremely fast functional simulator to a detailed, out-of-order issue, superscalar processor simulator that supports non-blocking caches and speculative execution.
The tool set is portable, requiring only that the GNU tools can be installed on the host system. The tool set has been tested extensively on many platforms (listed in Section 2). The tool set is easily extensible. We designed the instruction set to support easy annotation of instructions, without requiring a retargeted compiler for incremental changes. The instruction definition method, along with the ported GNU tools, makes new simulators easy to write, and the old ones even simpler to extend. Finally, the simulators have been aggressively tuned for performance, and can run codes approaching real sizes in tractable amounts of time. On a 200-MHz Pentium Pro, the fastest, least detailed simulator simulates about four million machine cycles per second, whereas the most detailed processor simulator simulates about 150,000 per second.
The current release (version 2.0) of the tools is a major improvement over the previous release. Compared to version 1.0 [2], this release includes better documentation, enhanced performance, compatibility with more platforms, precompiled SPEC95 SimpleScalar binaries, cleaner interfaces, two new processor simulators, option and statistic management packages, a source-level debugger (DLite!), and a tool to trace the out-of-order pipeline.
The rest of this document contains information about obtaining, installing, running, using, and modifying the tool set. In Section 2 we provide a detailed procedure for downloading the release, installing it, and getting it up and running. In Section 3, we describe the SimpleScalar architecture and details about the target (simulated) system. In Section 4, we describe the SimpleScalar processor simulators and discuss their internal workings. In Section 5, we describe two tools that enhance the utility of the tool set: a pipeline tracer and a source-level debugger (for stepping through the program being simulated). In Section 6, we provide the history of the tools' development, describe current and planned efforts to extend the tool set, and conclude. Appendix A and Appendix B contain detailed definitions of the SimpleScalar instructions and system calls, respectively.
Permission is granted to anyone to make or distribute copies of this tool set, either as received or modified, in any medium, provided that all copyright notices, permission and nonwarranty notices are preserved, and that the distributor grants the recipient permission for further redistribution as permitted by this document.
http://www.simplescalar.com/license.html
Note the tar.gz suffix: by requesting the file without the .gz suffix, the ftp server uncompresses it automatically. To get the compressed version, simply request the file with the .gz suffix.
The five distribution files in the directory (which are symbolic links to the files containing the latest version of the tools) are:
simpleutils.tar.gz - contains the GNU binutils source (version 2.5.2), retargeted to the SimpleScalar architecture. These utilities are not required to run the simulators themselves, but are required to compile your own SimpleScalar benchmark binaries (e.g., test programs other than the ones we provide). The compressed file is 3 MB, the uncompressed file is 14 MB, and the build requires 52 MB.
tar xf filename.tar
If you download and unpack all of the release files, you should have the following subdirectories with the following contents:
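[Figure: tool set overview. FORTRAN benchmark source is converted by f2c; C benchmark source is compiled by SimpleScalar GCC into SimpleScalar assembly, assembled by SimpleScalar GAS into object files, and linked by SimpleScalar GLD with SS libc.a, SS libm.a, and SS libF77.a into SimpleScalar executables. Simulator source (e.g., sim-outorder.c) is compiled by the host C compiler into the simulator, which runs the SimpleScalar executables or the precompiled SS binaries (test, SPEC95) to produce RESULTS.]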
We provide pre-built copies of the necessary libraries in ssbig-na-sstrix/lib/, so you do not need to build the code in
glibc-1.09, unless you change the library code. Building these
libraries is tricky, and we do not recommend it unless you have a
specific need to do so. In that event, to build the libraries:
cd $IDIR/binutils-2.5.2
configure --host=$HOST --target=ssbig-na-
cd $IDIR/glibc-1.09
configure --prefix=$IDIR/ssbig-na-sstrix ssbig-na-sstrix
setenv CC $IDIR/bin/ssbig-na-sstrix-gcc
unsetenv TZ
unsetenv MACHINE
make
make install
Note that you must have already built the SimpleScalar simulators to build this library, since the glibc build requires a compiled
simulator to test target machine-specific parameters such as
endian-ness.
If you have FORTRAN benchmarks, you will need to build
f2c:
cd $IDIR/f2c-1994.09.27
make
make install
The entire tool set should now be ready for use. We provide precompiled test binaries (big- and little-endian) and their sources in
$IDIR/simplesim-2.0/tests. To run a test:
cd $IDIR/simplesim-2.0
sim-safe tests/bin.big/test-math
The test should generate about a page of output, and will run very quickly. The release has been ported to, and should run on, the following systems:
- gcc/AIX 413/RS6000
- xlc/AIX 413/RS6000
- gcc/HPUX/PA-RISC
- gcc/SunOS 4.1.3/SPARC
- gcc/Linux 1.3/x86
- gcc/Solaris 2/SPARC
- gcc/Solaris 2/x86
- gcc/DEC Unix 3.2/Alpha
- c89/DEC Unix 3.2/Alpha
- gcc/FreeBSD 2.2/x86
- gcc/WindowsNT/x86
lw/a $r6,4($r7)
The annotation in this example is /a. It specifies that the first bit of the annotation field should be set. Bit annotations /a through /p set bits 0 through 15, respectively. Field annotations are written in the form:
lw/6:4(7) $r6,4($r7)
This annotation sets the specified 3-bit field (from bit 4 to bit 6 within the 16-bit annotation field) to the value 7.
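The two annotation forms can be illustrated with a small C sketch. It only models how a 16-bit annotation value would be computed from /a through /p and from a /hi:lo(value) field annotation; the function names are hypothetical and are not taken from the SimpleScalar assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Bit annotation: '/a' .. '/p' sets bit 0 .. 15 of the 16-bit field.
 * Hypothetical helper, not part of the tool set. */
uint16_t annote_set_bit(uint16_t annote, char letter)
{
    assert(letter >= 'a' && letter <= 'p');
    return (uint16_t)(annote | (1u << (letter - 'a')));
}

/* Field annotation: '/hi:lo(value)' stores value in bits lo..hi, inclusive.
 * Hypothetical helper, not part of the tool set. */
uint16_t annote_set_field(uint16_t annote, unsigned hi, unsigned lo,
                          unsigned value)
{
    assert(lo <= hi && hi <= 15);
    unsigned width = hi - lo + 1;                 /* at most 16 bits */
    unsigned mask = ((1u << width) - 1u) << lo;   /* ones over bits lo..hi */
    assert(value <= (mask >> lo));                /* value must fit the field */
    return (uint16_t)((annote & ~mask) | ((value << lo) & mask));
}
```

Under these assumptions, lw/a corresponds to annote_set_bit(0, 'a'), which yields 0x0001, and lw/6:4(7) corresponds to annote_set_field(0, 6, 4, 7), which yields 0x0070.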
System calls in SimpleScalar are managed by a proxy handler (located in syscall.c) that intercepts system calls made by the simulated binary, decodes the system call, copies the system call arguments, makes the corresponding call to the host's operating system, and then copies the results of the call into the simulated program's memory. If you are porting SimpleScalar to a new platform, you will have to code the system call translation from SimpleScalar to your host machine in syscall.c. A list of all SimpleScalar system calls is provided in Appendix B.
- There are no architected delay slots: loads, stores, and control transfers do not execute the succeeding instruction.
- A square-root instruction, which implements both single- and double-precision floating point square roots.
SimpleScalar uses a 31-bit address space, and its virtual memory is laid out as follows:
0x00000000    Unused
0x00400000    Start of text segment
0x10000000    Start of data segment
0x7fffc000    Stack base (grows down)
The top of the data segment (which includes init and bss) is held in mem_brk_point. The areas below the text segment and above the stack base are unused.
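As an illustration of the layout above, the following C sketch classifies a simulated address by region. The names are hypothetical, and two details are assumptions rather than statements from this report: everything between the text and data bases is treated as text, and the address held in the break point is treated as the first address past the data segment.

```c
#include <assert.h>
#include <stdint.h>

#define SS_TEXT_BASE  0x00400000u   /* start of text segment */
#define SS_DATA_BASE  0x10000000u   /* start of data segment */
#define SS_STACK_BASE 0x7fffc000u   /* stack base, grows down */

/* Hypothetical region labels for this sketch. */
enum ss_region { SS_UNUSED, SS_TEXT, SS_DATA, SS_STACK_OR_GAP };

/* brk_point plays the role of mem_brk_point (top of the data segment). */
enum ss_region ss_classify(uint32_t addr, uint32_t brk_point)
{
    assert(brk_point >= SS_DATA_BASE && brk_point <= SS_STACK_BASE);
    if (addr < SS_TEXT_BASE)  return SS_UNUSED;        /* below text */
    if (addr < SS_DATA_BASE)  return SS_TEXT;
    if (addr < brk_point)     return SS_DATA;          /* init, bss, heap */
    if (addr < SS_STACK_BASE) return SS_STACK_OR_GAP;  /* stack or free gap */
    return SS_UNUSED;                                  /* at or above stack base */
}
```

The sketch cannot tell the live stack from the free gap below it, because that depends on the current stack pointer, which is not modeled here.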
4 Simulator internals
In this section, we describe the functionality of the processor
simulators that accompany the tool set. We describe each of the
simulators, their functionality, command-line arguments, and
internal structures.
The compiler outputs binaries that are compatible with the
MIPS ECOFF object format. Library calls are handled with the
ported version of GNU GLIBC and POSIX-compliant Unix system calls. The simulators currently execute only user-level code.
All SimpleScalar-related extensions to GCC are contained in the
config/ss subdirectory of the GCC source tree that comes
Control:
    j - jump
    jal - jump and link
    jr - jump register
    jalr - jump and link register
    beq - branch == 0
    bne - branch != 0
    blez - branch <= 0
    bgtz - branch > 0
    bltz - branch < 0
    bgez - branch >= 0
    bct - branch FCC TRUE
    bcf - branch FCC FALSE
Load/Store:
    lb - load byte
    lbu - load byte unsigned
    lh - load half (short)
    lhu - load half (short) unsigned
    lw - load word
    dlw - load double word
    l.s - load single-precision FP
    l.d - load double-precision FP
    sb - store byte
    sbu - store byte unsigned
    sh - store half (short)
    shu - store half (short) unsigned
    sw - store word
    dsw - store double word
    s.s - store single-precision FP
    s.d - store double-precision FP
    addressing modes:
        (C)
        (reg+C) (with pre/post inc/dec)
        (reg+reg) (with pre/post inc/dec)
Integer Arithmetic:
Miscellaneous:
    nop - no operation
    syscall - system call
    break - declare program error
Software Name    Description
$zero            zero-valued source/sink
$at              reserved by assembler
$v0-$v1          fn return result regs
$a0-$a3          fn argument value regs
$t0-$t7          temp regs, caller saved
$s0-$s7          saved regs, callee saved
$t8-$t9          temp regs, caller saved
$k0-$k1          reserved by OS
$gp              global pointer
$sp              stack pointer
$s8              saved regs, callee saved
$ra              return address reg
$hi              high result register
$lo              low result register
$f0-$f31         floating point registers
$fcc             floating point condition code
Register format:
    bits 63-32: 16-annote | 16-opcode    bits 31-0: 8-rs | 8-rt | 8-rd | 8-ru/shamt
Immediate format:
    bits 63-32: 16-annote | 16-opcode    bits 31-0: 8-rs | 8-rt | 16-imm
Jump format:
    bits 63-32: 16-annote | 16-opcode    bits 31-0: 6-unused | 26-target
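The three 64-bit formats can be made concrete with a short C sketch that packs and unpacks the fields. It assumes the fields are laid out from the most significant bit in the order the figure lists them (annote, opcode, then the low-word fields); the type and function names are hypothetical and do not come from the simulator sources.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the register format; hypothetical struct for this sketch. */
struct ss_reg_fields {
    uint16_t annote, opcode;
    uint8_t rs, rt, rd, shamt;
};

/* Assumed layout: annote 63-48, opcode 47-32, rs 31-24, rt 23-16,
 * rd 15-8, ru/shamt 7-0. */
uint64_t ss_pack_reg(struct ss_reg_fields f)
{
    return ((uint64_t)f.annote << 48) | ((uint64_t)f.opcode << 32) |
           ((uint64_t)f.rs << 24) | ((uint64_t)f.rt << 16) |
           ((uint64_t)f.rd << 8) | (uint64_t)f.shamt;
}

struct ss_reg_fields ss_unpack_reg(uint64_t inst)
{
    struct ss_reg_fields f;
    f.annote = (uint16_t)(inst >> 48);
    f.opcode = (uint16_t)(inst >> 32);
    f.rs     = (uint8_t)(inst >> 24);
    f.rt     = (uint8_t)(inst >> 16);
    f.rd     = (uint8_t)(inst >> 8);
    f.shamt  = (uint8_t)inst;
    return f;
}

/* Immediate format: sign-extended 16-bit immediate in bits 15-0. */
int32_t ss_imm(uint64_t inst)
{
    int32_t v = (int32_t)(inst & 0xffffu);
    if (v & 0x8000) {
        v -= 0x10000;   /* manual sign extension */
    }
    return v;
}

/* Jump format: 26-bit target in bits 25-0. */
uint32_t ss_target(uint64_t inst)
{
    return (uint32_t)(inst & 0x03ffffffu);
}
```

A round trip through ss_pack_reg and ss_unpack_reg returns the original fields, which is a quick way to check that the field widths add up to 64 bits.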
-cache:dl1 <config>    configures a level-one data cache.
-cache:dl2 <config>    configures a level-two data cache.
-cache:il1 <config>    configures a level-one instr. cache.
-cache:il2 <config>    configures a level-two instr. cache.
-tlb:dtlb <config>     configures the data TLB.
-tlb:itlb <config>     configures the instruction TLB.
-flush <boolean>       flush all caches on a system call (<boolean> = 0 | 1 | true | TRUE | false | FALSE).
-icompress             remap SimpleScalar's 64-bit instructions to a 32-bit equivalent in the simulation (i.e., model a machine with 4-word instructions).
-pcstat <stat>         generate a text-based profile, as described in Section 4.3.
The cache configuration (<config>) is formatted as follows:
<name>:<nsets>:<bsize>:<assoc>:<repl>
-cache:il1 il1:128:64:1:l -cache:il2 dl2
-cache:dl1 dl1:256:32:1:l -cache:dl2 ul2:1024:64:2:l
il1:256:32:1:l      (8 KB)
dl1:256:32:1:l      (8 KB)
ul2:1024:64:4:l     (256 KB)
itlb:16:4096:4:l    (64 entries)
dtlb:32:4096:4:l    (128 entries)
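The <name>:<nsets>:<bsize>:<assoc>:<repl> format and the sizes quoted next to the configurations can be checked with a small C sketch. The parser below is a hypothetical illustration, not the option parser shipped with the simulators; it assumes that capacity is nsets x bsize x assoc bytes and that a TLB holds nsets x assoc entries, which is consistent with the figures listed above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for one parsed <config> string. */
struct cache_cfg {
    char name[32];
    unsigned nsets, bsize, assoc;
    char repl;
};

/* Returns 0 on success, -1 if the string is not exactly five fields. */
int cache_cfg_parse(const char *s, struct cache_cfg *out)
{
    char tail;
    assert(s != NULL && out != NULL);
    /* The trailing %c only matches if extra characters follow <repl>. */
    int n = sscanf(s, "%31[^:]:%u:%u:%u:%c%c", out->name, &out->nsets,
                   &out->bsize, &out->assoc, &out->repl, &tail);
    return (n == 5) ? 0 : -1;
}

/* Capacity in bytes: sets x block size x associativity. */
unsigned long cache_cfg_bytes(const struct cache_cfg *c)
{
    return (unsigned long)c->nsets * c->bsize * c->assoc;
}

/* Entry count, useful for TLBs where bsize is the page size. */
unsigned long cache_cfg_entries(const struct cache_cfg *c)
{
    return (unsigned long)c->nsets * c->assoc;
}
```

With these definitions, dl1:256:32:1:l works out to 8192 bytes and ul2:1024:64:4:l to 262144 bytes. A bare name such as dl2, which appears in place of a full configuration in one of the lines above, is rejected by this parser and would need separate handling.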
-pcstat <stat>
-R [lru | opt]
-a <sets>
-b <sets>
-l <line>
-n <assoc>
-in <interval>
-M <size>
-C <size>
4.3 Profiling
The distribution comes with a functional simulator that produces voluminous and varied profile information. sim-profile
can generate detailed profiles on instruction classes and
addresses, text symbols, memory accesses, branches, and data
segment symbols.
sim-profile takes the following command-line arguments,
which toggle the various profiling features:
-iclass       instruction class profiling (e.g., ALU, branch).
-iprof        instruction profiling (e.g., bnez, addi).
-brprof       branch class profiling (e.g., direct, calls, conditional).
-amprof       addr. mode profiling (e.g., displaced, R+R).
-segprof      load/store segment profiling (e.g., data, heap).
-tsymprof     execution profile by text symbol (functions).
-dsymprof     reference profile by data segment symbol.
-taddrprof    execution profile by text address.
-all          turn on all profiling listed above.
Three of the simulators (sim-profile, sim-cache, and sim-outorder) support text segment profiles for statistical integer
counters. The supported counters include any added by users, so
long as they are correctly registered with the SimpleScalar
stats package included with the simulator code (see Section 4.5).
To use the counter profiles, simply add the command-line flag:
-pcstat <stat>
ruu_init();
for (;;) {
ruu_commit();
ruu_writeback();
lsq_refresh();
ruu_issue();
ruu_dispatch();
ruu_fetch();
}
[Figure: example text-based profile of strtod.c (source lines 79, 87, and 89; text addresses 00401a10 through 00401a40), annotating each address with its execution count, from "never executed" to "executed 13 times".]
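[Figure: pipeline stages Fetch, Dispatch, Scheduler and Memory scheduler, Exec and Mem, Writeback, and Commit; Fetch is connected to the I-Cache, and Mem to the D-Cache and D-TLB, backed by virtual memory.]
Figure 5. Pipeline for sim-outorder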
load is sent to the memory system.
The execute stage is also handled in ruu_issue(). Each cycle, the routine gets as many ready instructions as possible from the scheduler queue (up to the issue width). The functional units' availability is also checked, and if they have available access ports, the instructions are issued. Finally, the routine schedules writeback events using the latency of the functional units (memory operations probe the data cache to obtain the correct latency of the operation). Data TLB misses stall the issue of the memory operation, are serviced in the commit stage of the pipeline, and currently assume a fixed latency. The functional units' latencies are hardcoded in the definition of fu_config[] in sim-outorder.c.
The writeback stage resides in ruu_writeback(). Each
cycle it scans the event queue for instruction completions. When
it finds a completed instruction, it walks the dependence chain of
instruction outputs to mark instructions that are dependent on the
completed instruction. If a dependent instruction is waiting only
for that completion, the routine marks it as ready to be issued.
The writeback stage also detects branch mispredictions; when it
determines that a branch misprediction has occurred, it rolls the
state back to the checkpoint, discarding the erroneously issued
instructions.
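The wakeup step described for the writeback stage can be sketched in C as follows. This is a simplified illustration under assumed data structures: each instruction keeps a count of outstanding inputs and a short list of dependent instructions. The names are hypothetical and the real ruu_writeback() is organized differently; misprediction rollback is not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONSUMERS 4

/* Hypothetical in-flight instruction record for this sketch. */
struct sketch_inst {
    int pending_inputs;                        /* operands not yet produced */
    int ready;                                 /* 1 once all inputs arrived */
    struct sketch_inst *consumers[MAX_CONSUMERS];
    size_t n_consumers;
};

/* Record that `consumer` reads a result produced by `producer`. */
void sketch_add_dep(struct sketch_inst *producer, struct sketch_inst *consumer)
{
    assert(producer->n_consumers < MAX_CONSUMERS);
    producer->consumers[producer->n_consumers++] = consumer;
    consumer->pending_inputs++;
    consumer->ready = 0;
}

/* Called when `done` completes: walk its dependence chain and mark any
 * instruction that was waiting only on this result as ready to issue.
 * Returns the number of instructions that became ready. */
int sketch_writeback(struct sketch_inst *done)
{
    int woken = 0;
    for (size_t i = 0; i < done->n_consumers; i++) {
        struct sketch_inst *c = done->consumers[i];
        assert(c->pending_inputs > 0);
        if (--c->pending_inputs == 0) {
            c->ready = 1;
            woken++;
        }
    }
    done->n_consumers = 0;   /* chain has been consumed */
    return woken;
}
```

An instruction with two producers becomes ready only after the second producer's writeback, which mirrors the "waiting only for that completion" condition in the text.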
ruu_commit() handles the instructions from the writeback
stage that are ready to commit. This routine does in-order committing of instructions, updating of the data caches (or memory)
with store values, and data TLB miss handling. The routine keeps
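[Figure: two-level branch predictor. The branch address selects one of l1size pattern history entries, each hist_size bits wide; the pattern history selects one of l2size 2-bit predictors, which supplies the branch prediction.]
predictor    l2_size
GAg          2^W
GAp          >2^W
PAg          2^W
PAp          2^(N+W)
gshare       2^W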
-bpred:ras <size>             set the return stack size to <size> (0 entries means no return stack). The default is 8 entries.
-bpred:btb <sets> <assoc>     configure the BTB to have <sets> sets and an associativity of <assoc>. The defaults are 512 sets and an associativity of 4.
-bpred:spec_update <stage>    allow speculative updates of the branch predictor in the decode or writeback stages (<stage> = [ID|WB]). The default is non-speculative updates in the commit stage.
Visualization
-pcstat <stat>
pipeview.pl <ptrace_file>
5 Utilities
In this section we describe the utilities that accompany the
SimpleScalar tool set: pipeline tracing and a source-level debugger.
Printing information:
print [modifiers] <expr>      print the value of <expr> using optional modifiers.
display [modifiers] <expr>    display the value of <expr> using optional modifiers.
option <string>               print the value of option <string>.
options                       print the values of all options.
stat <string>                 print the value of a statistical variable.
stats                         print the values of all statistical variables.
whatis <expr>                 print the type of <expr>.
regs                          print all register contents.
iregs                         print all instruction register contents.
[Figure: annotated example of pipeline trace output.]
@ 610                                  new cycle indicator
gf = 0x0040d098: addiu r2, r4, -1      new instruction definitions
gg = 0x0040d0a0: beq r3, r5, 0x30
[IF]    [DA]    [EX]    [WB]    [CT]   current pipeline state
gf      gb      fy      fr\     fq
gg      gc      fz      fs
        gd\     ga+     ft
        ge              fu
[IF]  inst. being fetched, or in fetch queue
[DA]  inst. being decoded, or awaiting issue
[EX]  inst. executing
[WB]  inst. writing results into RUU, or awaiting retire
[CT]  inst. retiring results to register file
A marker following an instruction tag denotes a pipeline event (e.g., misprediction detected); see the output header for event defs.
fpregs                        print all floating point register contents.
mstate [string]               print machine-specific state.
dump <addr> [count]           dump memory at <addr> (optionally for <count> words).
dis <addr> [count]            disassemble instructions at <addr> (optionally for <count> instructions).
symbols                       print the value of all program symbols.
tsymbols                      print the value of all program text symbols.
dsymbols                      print the value of all program data symbols.
symbol <string>               print the value of symbol <string>.
Legal arguments:
Arguments <addr>, <cnt>, <expr>, and <id> may be any legal expression:
<expr>     <- <factor> +|- <expr>
<factor>   <- <term> *|/ <factor>
<term>     <- ( <expr> )
              | - <term> | <const> | <symbol> | <file:loc>
<symbol>   <- <literal> | <function name> | <register>
<literal>  <- [0-9]+ | 0x[0-9,a-f]+ | 0[0-7]+
<register> <- $r[0-31] | $f[0-31] | $pc | $fcc | $hi | $lo
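To show how the expression grammar above nests, here is a small recursive-descent evaluator in C. It is only a sketch: it handles parentheses, unary minus, and numeric literals (decimal, 0x hex, and leading-0 octal via strtol), and it leaves out symbols, registers, and file:loc references. It follows the grammar's right recursion literally, so 10-3-2 evaluates as 10-(3-2); whether DLite! itself associates this way is not stated here. All names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Parser state: cursor into the text and an error flag. */
struct dl_parser { const char *p; int err; };

static long dl_expr(struct dl_parser *s);

static void dl_skip(struct dl_parser *s)
{
    while (isspace((unsigned char)*s->p)) s->p++;
}

/* <term> <- ( <expr> ) | - <term> | <literal> */
static long dl_term(struct dl_parser *s)
{
    dl_skip(s);
    if (*s->p == '(') {
        s->p++;
        long v = dl_expr(s);
        dl_skip(s);
        if (*s->p == ')') s->p++; else s->err = 1;
        return v;
    }
    if (*s->p == '-') { s->p++; return -dl_term(s); }
    if (!isdigit((unsigned char)*s->p)) { s->err = 1; return 0; }
    char *end;
    long v = strtol(s->p, &end, 0);   /* base 0: decimal, 0x hex, 0 octal */
    s->p = end;
    return v;
}

/* <factor> <- <term> *|/ <factor> */
static long dl_factor(struct dl_parser *s)
{
    long lhs = dl_term(s);
    dl_skip(s);
    if (*s->p == '*' || *s->p == '/') {
        char op = *s->p++;
        long rhs = dl_factor(s);
        if (op == '*') return lhs * rhs;
        if (rhs == 0) { s->err = 1; return 0; }   /* division by zero */
        return lhs / rhs;
    }
    return lhs;
}

/* <expr> <- <factor> +|- <expr> */
static long dl_expr(struct dl_parser *s)
{
    long lhs = dl_factor(s);
    dl_skip(s);
    if (*s->p == '+' || *s->p == '-') {
        char op = *s->p++;
        long rhs = dl_expr(s);
        return (op == '+') ? lhs + rhs : lhs - rhs;
    }
    return lhs;
}

/* Returns 0 and stores the value on success, -1 on a malformed expression. */
int dl_eval(const char *text, long *out)
{
    assert(text != NULL && out != NULL);
    struct dl_parser s = { text, 0 };
    long v = dl_expr(&s);
    dl_skip(&s);
    if (s.err || *s.p != '\0') return -1;
    *out = v;
    return 0;
}
```

Note that in this grammar <factor> is the multiplicative level and <term> is the atom, which is the reverse of the more common naming.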
6 Summary
The SimpleScalar tool set was written by Todd Austin over
about one and a half years, between 1994 and 1996. He continues
to add improvements and updates. The ancestors of the tool set
date back to the mid to late 1980s, to tools written by Manoj
Franklin. At the time the tools were developed, both individuals
were research assistants at the University of Wisconsin-Madison
Computer Sciences Department, supervised by Professor Guri
Sohi. Scott Breach provided valuable assistance with the implementation of the proxy system calls. The first release was assembled, debugged, and documented by Doug Burger, also a
research assistant at Wisconsin, who is the maintainer of the second release as well. Kevin Skadron, currently at Princeton,
implemented many of the more recent branch prediction mechanisms.
Many exciting extensions to SimpleScalar are both underway
and planned. Efforts have begun to extend the processor simula-
Legal ranges:
<range>       <- <address> | <instruction> | <cycle>
<address>     <- @<function name>:{+<literal>}
<instruction> <- {<literal>}:{<literal>}
<cycle>       <- #{<literal>}:{<literal>}
Omitting optional arguments to the left of the colon will default
to the smallest value permitted in that range. Omitting an
optional argument at the right of the colon will default to the
largest value permitted in that range.
Legal command modifiers:
b print a byte
h print a half (short)
Semantics: SET_NPC((CPC&0xf0000000) | (TARGET<<2))
           SET_GPR(31, CPC + 8)
JR:
Opcode:
Format:
Semantics: TALIGN(GPR(RS))
           SET_NPC(GPR(RS))
JALR:
Opcode:
Format:
Semantics: TALIGN(GPR(RS))
           SET_GPR(RD, CPC + 8)
           SET_NPC(GPR(RS))
References
[1]
[2]
[3]
[4]
[5]
[6]
Operator/operand         Semantics
FS                       same as field RS
FT                       same as field RT
FD                       same as field RD
UIMM                     IMM field zero-extended to word value
IMM                      IMM field sign-extended to word value
OFFSET                   IMM field sign-extended to word value
CPC                      PC value of executing instruction
NPC                      next PC value
SET_NPC(V)               Set next PC to value V
GPR(N)                   General purpose register N
SET_GPR(N,V)             Set general purpose register N to value V
FPR_F(N)                 Floating point register N single-precision value
SET_FPR_F(N,V)           Set floating point register N to single-precision value V
FPR_D(N)                 Floating point register N double-precision value
SET_FPR_D(N,V)           Set floating point register N to double-precision value V
FPR_L(N)                 Floating point register N literal word value
SET_FPR_L(N,V)           Set floating point register N to literal word value V
HI                       High result register value
SET_HI(V)                Set high result register to value V
LO                       Low result register value
SET_LO(V)                Set low result register to value V
READ_SIGNED_BYTE(A)      Read signed byte from address A
READ_UNSIGNED_BYTE(A)    Read unsigned byte from address A
WRITE_BYTE(V,A)          Write byte value V at address A
READ_SIGNED_HALF(A)      Read signed half from address A
READ_UNSIGNED_HALF(A)    Read unsigned half from address A
WRITE_HALF(V,A)          Write half value V at address A
READ_WORD(A)             Read word from address A
WRITE_WORD(V,A)          Write word value V at address A
TALIGN(T)                Check target T is aligned to 8 byte boundary
FPALIGN(N)               Check register N is wholly divisible by 2
OVER(X,Y)                Check for overflow when adding X to Y
UNDER(X,Y)               Check for overflow when subtracting Y from X
DIV0(V)                  Check for division by zero error with divisor V
BEQ: Branch if equal.
Opcode: 0x05
Format: BEQ rs,rt,offset
Semantics: if (GPR(RS) == GPR(RT))
               SET_NPC(CPC + 8 + (OFFSET << 2))
           else
               SET_NPC(CPC + 8)
BNE:
Opcode:
Format:
Semantics: if (GPR(RS) != GPR(RT))
               SET_NPC(CPC + 8 + (OFFSET << 2))
           else
               SET_NPC(CPC + 8)
BLEZ:
Opcode:
Format:
Semantics: if (GPR(RS) <= 0)
               SET_NPC(CPC + 8 + (OFFSET << 2))
           else
               SET_NPC(CPC + 8)
BGTZ:
Opcode:
Format:
Semantics: if (GPR(RS) > 0)
               SET_NPC(CPC + 8 + (OFFSET << 2))
           else
               SET_NPC(CPC + 8)
BLTZ:
Opcode:
Format:
Semantics: if (GPR(RS) < 0)
               SET_NPC(CPC + 8 + (OFFSET << 2))
           else
               SET_NPC(CPC + 8)
BGEZ:
Opcode:
Format:
Semantics: if (GPR(RS) >= 0)
               SET_NPC(CPC + 8 + (OFFSET << 2))
           else
               SET_NPC(CPC + 8)
BC1F:
Opcode: 0x0b
Format: BC1F offset
Semantics: if (!FCC)
               SET_NPC(CPC + 8 + (OFFSET << 2))
           else
               SET_NPC(CPC + 8)
BC1T:
Opcode:
Format:
Semantics:
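The next-PC arithmetic shared by the branch definitions, together with the jump target formula (CPC&0xf0000000) | (TARGET<<2), can be written as two small C helpers. This is a sketch with hypothetical names; the constant 8 is the instruction size in bytes implied by the CPC + 8 terms in the semantics.

```c
#include <assert.h>
#include <stdint.h>

#define SS_INST_BYTES 8u   /* CPC + 8 is the fall-through address */

/* Next PC for a conditional branch: fall through, or fall through plus
 * the sign-extended offset shifted left by 2. */
uint32_t ss_branch_npc(uint32_t cpc, int32_t offset, int taken)
{
    uint32_t fallthrough = cpc + SS_INST_BYTES;
    if (!taken) {
        return fallthrough;
    }
    /* Unsigned arithmetic wraps, so negative offsets branch backwards. */
    return fallthrough + ((uint32_t)offset << 2);
}

/* Next PC for a jump: top four bits of CPC combined with the 26-bit
 * target shifted left by 2. */
uint32_t ss_jump_npc(uint32_t cpc, uint32_t target)
{
    assert(target < (1u << 26));
    return (cpc & 0xf0000000u) | (target << 2);
}
```

For example, a taken branch at 0x00400000 with an offset of 4 continues at 0x00400018, while the same branch not taken continues at 0x00400008.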
LBU:
Opcode:
Format:
Semantics:
LBU:
Opcode:
Format:
Semantics:
SET_GPR(RT, READ_SIGNED_BYTE(GPR(RS)
+ OFFSET))
LH:
Opcode:
Format:
Semantics:
LH:
Opcode:
LB:
Opcode:
Format:
SET_GPR(RT,
READ_SIGNED_BYTE(GPR(RS)+GPR(RD)))
SET_GPR(RT,
READ_UNSIGNED_BYTE(GPR(RS)+OFFSET))
SET_GPR(RT,
READ_UNSIGNED_BYTE(GPR(RS)+GPR(RD)
))
SET_GPR(RT,
READ_SIGNED_HALF(GPR(RS)+OFFSET))
Format:
Semantics:
LHU:
Opcode:
Format:
Semantics:
SET_GPR(RT,
READ_SIGNED_HALF(GPR(RS)+GPR(RD)))
LW:
Opcode:
Format:
Semantics:
LW:
Opcode:
Format:
Semantics:
DLW:
Opcode:
Format:
Semantics:
DLW:
Opcode:
Format:
Semantics:
Opcode:
Format:
Semantics:
L.S:
Opcode:
Format:
Semantics:
L.D:
Opcode:
L.D:
SET_GPR(RT,
READ_UNSIGNED_HALF(GPR(RS)+OFFSET))
LHU:
Opcode:
Format:
Semantics:
L.S:
Format:
Semantics:
LH rt,(rs+rd) inc_dec
Opcode:
Format:
Semantics:
SET_GPR(RT,
READ_UNSIGNED_HALF(GPR(RS)+GPR(RD)
))
SET_GPR(RT, READ_WORD(GPR(RS)+OFFSET))
SET_GPR(RT,
READ_WORD(GPR(RS)+GPR(RD)))
SET_GPR(RT, READ_WORD(GPR(RS)+OFFSET))
SET_GPR(RT+1,
READ_WORD(GPR(RS)+OFFSET+4))
SET_GPR(RT,
READ_WORD(GPR(RS)+GPR(RD)))
SET_GPR(RT+1,
READ_WORD(GPR(RS)+GPR(RD)+4))
LWL:
Opcode:
Format:
Semantics:
LWR:
Opcode:
Format:
Semantics:
SB:
Opcode:
Format:
Semantics:
SB:
Opcode:
Format:
Semantics:
SH:
Opcode:
Format:
Semantics:
SH:
Opcode:
Format:
Semantics:
SW:
Opcode:
Format:
Semantics:
SW:
Opcode:
Format:
Semantics:
DSW:
Opcode:
Format:
Semantics:
WRITE_BYTE(GPR(RT), GPR(RS)+OFFSET)
WRITE_BYTE(GPR(RT), GPR(RS)+GPR(RD))
WRITE_HALF(GPR(RT), GPR(RS)+OFFSET)
WRITE_HALF(GPR(RT), GPR(RS)+GPR(RD))
WRITE_WORD(GPR(RT), GPR(RS)+OFFSET)
WRITE_WORD(GPR(RT), GPR(RS)+GPR(RD))
WRITE_WORD(GPR(RT), GPR(RS)+OFFSET)
Format:
Semantics:
WRITE_WORD(GPR(RT+1), GPR(RS)+OFFSET+4)
DSW:
Opcode:
Format:
Semantics:
DSZ:
Opcode:
Format:
Semantics:
S.S:
Opcode:
Format:
Semantics:
S.S:
Opcode:
Format:
Semantics:
S.D:
Opcode:
Format:
Semantics:
S.D:
Opcode:
Format:
Semantics:
SWL:
Opcode:
Format:
Semantics:
SWR:
Opcode:
WRITE_WORD(GPR(RT), GPR(RS)+GPR(RD))
WRITE_WORD(GPR(RT+1),
GPR(RS)+GPR(RD)+4)
DSZ:
Opcode:
Format:
Semantics:
Format: SWR rt,offset(rs)
Semantics: See ss.def for a detailed description of this instruction's semantics. NOTE: SWR does not support pre-/post- inc/dec.
WRITE_WORD(0, GPR(RS)+OFFSET)
WRITE_WORD(0, GPR(RS)+OFFSET+4)
WRITE_WORD(0, GPR(RS)+GPR(RD))
WRITE_WORD(0, GPR(RS)+GPR(RD)+4)
S.D: Store double word from floating point register file, displaced addressing.
Opcode: 0x37
Format: S.D ft,offset(rs) inc_dec
Semantics: WRITE_WORD(FPR_L(FT), GPR(RS)+OFFSET)
           WRITE_WORD(FPR_L(FT+1), GPR(RS)+OFFSET+4)
S.D: Store double word from floating point register file, indexed addressing.
Opcode: 0xd2
Format: S.D ft,(rs+rd) inc_dec
Semantics: WRITE_WORD(FPR_L(FT), GPR(RS)+GPR(RD))
           WRITE_WORD(FPR_L(FT+1), GPR(RS)+GPR(RD)+4)
ADD:
Opcode:
Format:
Semantics:
ADDI:
check).
Opcode:
Format:
Semantics:
OVER(GPR(RS),GPR(RT))
SET_GPR(RD, GPR(RS) + GPR(RT))
0x41
ADDI rt,rs,imm
OVER(GPR(RS),IMM)
SET_GPR(RT, GPR(RS) + IMM)
ADDU:
Opcode:
Format:
Semantics:
ADDIU:
check).
Opcode:
Format:
Semantics:
0x43
ADDIU rt,rs,imm
SET_GPR(RT, GPR(RS) + IMM)
SUB:
Opcode:
Format:
Semantics:
SUBU:
check).
Opcode:
Format:
Semantics:
UNDER(GPR(RS),GPR(RT))
SET_GPR(RD, GPR(RS) - GPR(RT))
0x45
SUBU rd,rs,rt
SET_GPR(RD, GPR(RS) - GPR(RT))
MULT: Multiply signed.
Opcode: 0x46
Format: MULT rs,rt
Semantics:
MULTU: Multiply unsigned.
Opcode: 0x47
Format: MULTU rs,rt
Semantics: SET_HI(((unsigned)RS * (unsigned)RT)/(1<<32))
           SET_LO(((unsigned)RS*(unsigned)RT) % (1<<32))
DIV: Divide signed.
Opcode: 0x48
Format: DIV rs,rt
Semantics: DIV0(GPR(RT))
           SET_LO(GPR(RS) / GPR(RT))
           SET_HI(GPR(RS) % GPR(RT))
DIVU: Divide unsigned.
Opcode: 0x49
Format: DIVU rs,rt
Semantics: DIV0(GPR(RT))
           SET_LO((unsigned)GPR(RS)/(unsigned)GPR(RT))
           SET_HI((unsigned)GPR(RS)%(unsigned)GPR(RT))
MFHI:
Opcode:
Format:
Semantics: SET_GPR(RD, HI)
MTHI: Move to HI register.
Opcode: 0x4b
Format: MTHI rs
Semantics: SET_HI(GPR(RS))
MFLO:
Opcode:
Format:
Semantics: SET_GPR(RD, LO)
MTLO: Move to LO register.
Opcode: 0x4d
Format: MTLO rs
Semantics:
AND: Logical AND.
Opcode: 0x4e
Format: AND rd,rs,rt
Semantics:
ANDI:
Opcode:
Format:
Semantics:
OR: Logical OR.
Opcode: 0x50
Format: OR rd,rs,rt
Semantics:
ORI: Logical OR immediate.
Opcode: 0x51
Format: ORI rd,rt,imm
Semantics:
XOR: Logical XOR.
Opcode: 0x52
Format: XOR rd,rs,rt
Semantics:
XORI:
Opcode:
Format:
Semantics:
NOR: Logical NOR.
Opcode: 0x54
Format: NOR rd,rs,rt
SLL:
Opcode:
Format:
Semantics:
SLLV:
Opcode:
Format:
Semantics:
SRL:
Opcode:
Format:
Semantics:
SRLV:
Opcode:
Format:
Semantics:
SRA:
Opcode:
Format:
Semantics:
SRAV:
Opcode:
Format:
Semantics:
SLT:
Opcode:
Format:
Semantics:
SLTI:
Opcode:
Format:
Semantics:
SLTU:
Opcode:
Format:
Semantics:
SLTIU:
Opcode:
Format:
Semantics:
SET_LO(GPR(RS))
SET_GPR(RD,
((unsigned)GPR(RS)<(unsigned)GPR(RT)) ? 1 : 0)
SET_GPR(RD,
((unsigned)GPR(RS)<(unsigned)GPR(RT)) ? 1 : 0)
ADD.S:
Opcode:
Format:
Semantics:
Semantics:
FPALIGN(FS)
FPALIGN(FT)
SET_FPR_F(FD, FPR_F(FS) + FPR_F(FT))
ADD.D:
Opcode:
Format:
Semantics:
SUB.S:
Opcode:
Format:
Semantics:
SUB.D:
Opcode:
Format:
Semantics:
MUL.S:
Opcode:
Format:
Semantics:
MUL.D:
Opcode:
Format:
Semantics:
DIV.S:
Opcode:
Format:
Semantics:
DIV.D:
Opcode:
Format:
Semantics:
ABS.S:
Opcode:
Format:
FPALIGN(FD)
FPALIGN(FS)
FPALIGN(FT)
SET_FPR_D(FD, FPR_D(FS) + FPR_D(FT))
FPALIGN(FD)
FPALIGN(FS)
FPALIGN(FT)
SET_FPR_F(FD, FPR_F(FS) - FPR_F(FT))
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_F(FD, fabs((double)FPR_F(FS)))
ABS.D:
Opcode:
Format:
Semantics:
MOV.S:
Opcode:
Format:
Semantics:
MOV.D:
Opcode:
Format:
Semantics:
NEG.S:
Opcode:
Format:
Semantics:
NEG.D:
sion.
Opcode:
Format:
Semantics:
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_D(FD, fabs(FPR_D(FS)))
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_F(FD, FPR_F(FS))
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_D(FD, FPR_D(FS))
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_F(FD, -FPR_F(FS))
0x7d
NEG.D fd,fs
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_D(FD, -FPR_D(FS))
CVT.S.D:
Opcode:
Format:
Semantics:
CVT.S.W:
Opcode:
Format:
Semantics:
CVT.D.S:
Opcode:
Format:
Semantics:
CVT.D.W:
Opcode:
Format:
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_D(FD, -FPR_D(FS))
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_F(FD, (float)FPR_L(FS))
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_D(FD,(double)FPR_F(FS))
Semantics:
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_D(FD,(double)FPR_L(FS))
CVT.W.S:
Opcode:
Format:
Semantics:
CVT.W.D:
Opcode:
Format:
Semantics:
C.EQ.S:
Opcode:
Format:
Semantics:
FPALIGN(FS)
SET_FPR_F(FD,sqrt((double)FPR_F(FS)))
SQRT.D:
Opcode:
Format:
Semantics:
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_L(FD, (long)FPR_F(FS))
C.EQ.D:
Opcode:
Format:
Semantics:
C.LT.S:
Opcode:
Format:
Semantics:
FPALIGN(FD)
FPALIGN(FS)
SET_FPR_L(FD, (long)FPR_D(FS))
FPALIGN(FS)
FPALIGN(FT)
SET_FCC(FPR_F(FS) == FPR_F(FT))
C.LT.D:
Opcode:
Format:
Semantics:
C.LE.S:
Opcode:
Format:
Semantics:
C.LE.D:
Opcode:
Format:
Semantics:
SQRT.S:
Opcode:
Format:
Semantics:
NOP: No operation.
Opcode: 0x00
Format: NOP
Semantics: None
SYSCALL: System call.
Opcode: 0xa0
Format: SYSCALL
Semantics: See Appendix B for details
BREAK:
Opcode:
Format:
Semantics:
LUI:
Opcode:
Format:
Semantics:
MFC1:
Opcode:
Format:
Semantics:
MTC1:
FPALIGN(FS)
FPALIGN(FT)
SET_FCC(FPR_D(FS) < FPR_D(FT))
Opcode:
Format:
Semantics:
SET_GPR(RT, FPR_L(FS))
FPALIGN(FS)
FPALIGN(FT)
SET_FCC(FPR_F(FS) <= FPR_F(FT))
FPALIGN(FS)
FPALIGN(FT)
SET_FCC(FPR_D(FS) <= FPR_D(FT))
0x03 is loaded into $v0, fd is loaded into $a0, buf into $a1, and
nbyte into $a2.
FPALIGN(FD)
EXIT: Exit process.
Syscode: 0x01
Interface: void exit(int status);
Semantics: See exit(2).
READ:
Syscode:
Interface:
Semantics:
WRITE:
Syscode:
Interface:
Semantics:
OPEN: Open a file.
Syscode: 0x05
Interface: int open(char *fname, int flags, int mode);
Semantics: See open(2).
CLOSE: Close a file.
Syscode: 0x06
Interface: int close(int fd);
Semantics: See close(2).
CREAT: Create a file.
Syscode: 0x08
Interface: int creat(char *fname, int mode);
Semantics: See creat(2).
UNLINK: Delete a file.
Syscode: 0x0a
Interface: int unlink(char *fname);
Semantics: See unlink(2).
CHDIR:
Syscode:
Interface:
Semantics:
CHMOD:
Syscode:
Interface:
Semantics:
CHOWN:
Syscode:
Interface:
Semantics:
BRK:
Syscode:
Interface:
Semantics:
LSEEK:
Syscode:
Interface:
Semantics:
GETPID:
Syscode:
Interface:
Semantics:
See getpid(2).
GETUID:
Syscode:
Interface:
Semantics:
ACCESS:
Syscode:
Interface:
Semantics:
STAT:
Syscode:
Interface:
Semantics:
LSTAT:
Syscode:
Interface:
Semantics:
DUP:
Syscode:
Interface:
Semantics:
PIPE:
Syscode:
Interface:
Semantics:
GETGID:
Syscode:
Interface:
Semantics:
IOCTL:
Syscode:
Interface:
Semantics:
FSTAT:
Syscode:
Interface:
Semantics:
GETPAGESIZE:
Syscode:
Interface:
Semantics:
Syscode:
Interface:
Semantics:
FCNTL: File control.
Syscode: 0x5c
Interface: int fcntl(int fd, int cmd, int arg);
Semantics: See fcntl(2).
SELECT:
Syscode:
Interface:
Semantics:
UTIMES:
Syscode:
Interface:
Semantics:
GETRLIMIT:
Syscode:
Interface:
Semantics:
SETRLIMIT:
Syscode: 0x91
Interface: int setrlimit(int res, struct rlimit *rlp);
Semantics: See setrlimit(2).