
Inside the K Virtual Machine (KVM)
Frank Yellin
Sun Microsystems, Inc.

TS-1507: Inside the K Virtual Machine

Goals of This Talk


You should be able to
Read the source code to the KVM without difficulty
Port the KVM
Debug your port
Understand some of the design decisions
You should want to get a copy of the sources right away


KVM Design Goals


A small footprint virtual machine for the Java platform (Java virtual machine) for resource constrained devices
Build a Java VM that would:
Be easy to understand and maintain
Be highly portable
Be small without sacrificing features of the Java programming language


K Virtual Machine
KVM is a low tech Java virtual machine
No dynamic compilation or other advanced performance optimization techniques
Easy to read and port

KVM is based on the Spotless system developed originally at Sun Labs


KVM Technical Facts


Implemented in C
Core of the VM about 24,000 lines of well-commented source code

Static size of VM executable
40-80 kbytes depending on the platform and compilation options (w/o romizing)
On Palm and Win32 about 60 kB

Runs anywhere from 30-80% of the speed of JDK 1.1 software without a JIT

KVM Portability
Source code available under SCSL (Sun Community Source License)

Release includes three KVM ports
Win32
PalmOS (3.01 or newer)
Solaris operating environment

Sun's early access partners and customers have ported the KVM to various other platforms
More than 25 ports have been done so far

KVM Design
Unlike some other small Java technology implementations
KVM supports dynamic class loading and regular class files
Supports JAR file format

Supports full byte code set (basic word size 32 bits)


JVM/KVM Compatibility
General goal
Full Java language and Java virtual machine specification compatibility

Main language-level difference
Floating point not supported in CLDC 1.0
No hardware floating point support on most devices due to space limitations

Other differences (next slide) are mainly because
Libraries included in CLDC are limited
Can't afford to use full J2SE platform security model

Virtual Machine Compatibility


VM implementation differences

No Java native interface (JNI)


No reflection
No thread groups
No weak references
No finalization
Limited error handling support
New implementation of bytecode verification


Features of the VM


Memory model
Garbage collector
Interpreter
Frames
Threads and monitors
File loading
Verification
Security
Native methods


Romizer
Palm-romizer
Inflater
Restartability
Port specific changes
Compiler flags
Porting dangers
In the future

Important Data Structures


cell
CLASS

OBJECT
INSTANCE
ARRAY
STRING_INSTANCE

INSTANCE_CLASS
ARRAY_CLASS


FIELD
METHOD
FIELDTABLE
METHODTABLE


FRAME
THREAD
BYTEARRAY
SHORTARRAY
POINTERLIST
HASHTABLE

Variable Length
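typedef struct instanceStruct {
    INSTANCE_CLASS ofClass;    /* class of this instance */
    MONITOR monitor;           /* monitor used for synchronization */
    union {
        cell *cellp;
        cell cell;
    } data[1];                 /* first of n data cells */
} *INSTANCE;

#define SIZEOF_INSTANCE(n) \
    (StructSizeInCells(instanceStruct) + ((n) - 1))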
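
The data[1] idiom above lets one struct describe instances with any number of fields. The following is a minimal sketch of how such a struct could be allocated, assuming plain calloc in place of the KVM heap allocator (the KVM itself does not call malloc). The stub types, the ByteSizeToCellSize helper, the definition of StructSizeInCells, and newInstance are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef long cell;                       /* assumption: one machine word */

/* Hypothetical stand-ins for the KVM's INSTANCE_CLASS and MONITOR types. */
typedef struct classStub   *INSTANCE_CLASS;
typedef struct monitorStub *MONITOR;

typedef struct instanceStruct {
    INSTANCE_CLASS ofClass;
    MONITOR        monitor;
    union {
        cell *cellp;
        cell  cell;
    } data[1];                           /* first of n fields */
} *INSTANCE;

/* Assumed definitions: round a byte size up to whole cells. */
#define ByteSizeToCellSize(n)  (((n) + sizeof(cell) - 1) / sizeof(cell))
#define StructSizeInCells(s)   ByteSizeToCellSize(sizeof(struct s))
#define SIZEOF_INSTANCE(n)     (StructSizeInCells(instanceStruct) + ((n) - 1))

/* Hypothetical helper: allocate a zeroed instance with room for n fields. */
INSTANCE newInstance(INSTANCE_CLASS clazz, size_t n)
{
    INSTANCE obj;
    assert(n >= 1);
    obj = calloc(SIZEOF_INSTANCE(n), sizeof(cell));
    if (obj != NULL) {
        obj->ofClass = clazz;
    }
    return obj;
}
```

An instance created with newInstance(clazz, 4) can then use data[0] through data[3], even though the struct declares only one element.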


Hash Tables
Three system hashtables
ClassTable
Package/base to CLASS

InternStringTable
Char*/length to unique java.lang.String

UTFStringTable
Char*/length to unique instance


UTFStringTable and UString


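[Figure: the UTFStringTable hash table. The table starts with a bucket count; each bucket chains UString entries with the fields next, len, and key. Example entries: "abcde....", "java.lang", "Connector".]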


UString Creation
getUString(const char *string)
getUStringX(const char *string,
int length)
strcmp(x,y) == 0 iff
getUString(x) == getUString(y)
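
The guarantee above (equal strings always yield the same pointer) is what a string interning table provides. Below is a minimal sketch of the idea using a chained hash table. The struct layout, the bucket count, the hash function, and the sequential key assignment are assumptions made for illustration, not the KVM's actual definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUCKET_COUNT 32                  /* assumption: small fixed table */

typedef struct UStringStruct {
    struct UStringStruct *next;          /* next entry in the same bucket */
    unsigned short length;
    unsigned short key;                  /* assumed: sequential unique id */
    char string[1];                      /* length bytes plus a '\0' */
} *UString;

static UString buckets[BUCKET_COUNT];
static unsigned short nextKey = 1;

/* Return the unique entry for the first `length` bytes of `string`. */
UString getUStringX(const char *string, int length)
{
    unsigned int hash = 0;
    UString entry;
    int i;

    assert(string != NULL && length >= 0 && length <= 0xffff);
    for (i = 0; i < length; i++) {
        hash = hash * 37 + (unsigned char)string[i];
    }
    hash %= BUCKET_COUNT;

    /* Reuse an existing entry if the same bytes were interned before. */
    for (entry = buckets[hash]; entry != NULL; entry = entry->next) {
        if (entry->length == length &&
            memcmp(entry->string, string, (size_t)length) == 0) {
            return entry;
        }
    }

    /* Otherwise create a new entry at the head of the bucket chain. */
    entry = malloc(sizeof(*entry) + (size_t)length);
    if (entry == NULL) {
        return NULL;
    }
    memcpy(entry->string, string, (size_t)length);
    entry->string[length] = '\0';
    entry->length = (unsigned short)length;
    entry->key = nextKey++;
    entry->next = buckets[hash];
    buckets[hash] = entry;
    return entry;
}

UString getUString(const char *string)
{
    return getUStringX(string, (int)strlen(string));
}
```

Because every distinct string has exactly one entry, later comparisons reduce to a pointer (or key) comparison instead of strcmp.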


ClassTable and Classes


[Figure: the ClassTable hash table (header field: count) with three chained class entries. Every entry shows ofClass, monitor, base name, package name, and next. Two entries additionally show superclass, field table, method table, constant pool, interfaces, and key; the third shows primitiveType, itemSize, and GC type.]


Interned String Table


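[Figure: the interned string table. Each bucket chains entries with the fields next and string. Each string refers to a java.lang.String instance (monitor, charArray, offset, length), whose charArray refers to an array of class [C (monitor, length).]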


Key Spaces
Classes (and field types)
Raw classes
Array classes

Names
Field and method names
Class packages
Class base names

Method signatures


Class Keys
0 - 0xff: primitive type
int = I, boolean = Z, etc.

0x100 - 0x1fff: instance class
Find the item in the class table

0x2000 - 0xdfff: array class, dim <= 6
High three bits give the dimension

0xe000 - 0xffff: array class, dim >= 7
Look up low 13 bits in the class table
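
The four key ranges can be told apart with a few comparisons and shifts. The following is a minimal sketch of that decoding. The enum and function names are hypothetical, and what the low 13 bits of a small-array key identify is not stated on the slide, so the sketch only extracts them.

```c
#include <assert.h>

typedef enum {
    KEY_PRIMITIVE,        /* 0x0000 - 0x00ff */
    KEY_INSTANCE_CLASS,   /* 0x0100 - 0x1fff */
    KEY_SMALL_ARRAY,      /* 0x2000 - 0xdfff: dimension 1..6 stored in the key */
    KEY_BIG_ARRAY         /* 0xe000 - 0xffff: dimension >= 7, needs a table lookup */
} KeyKind;

/* Classify a 16-bit class key by the range it falls into. */
KeyKind classifyClassKey(unsigned int key)
{
    assert(key <= 0xffff);
    if (key <= 0x00ff) return KEY_PRIMITIVE;
    if (key <= 0x1fff) return KEY_INSTANCE_CLASS;
    if (key <= 0xdfff) return KEY_SMALL_ARRAY;
    return KEY_BIG_ARRAY;
}

/* High three bits of a small-array key: the array dimension, 1..6. */
unsigned int keyDimension(unsigned int key)
{
    assert(classifyClassKey(key) == KEY_SMALL_ARRAY);
    return key >> 13;
}

/* Low 13 bits of the key. */
unsigned int keyLow13(unsigned int key)
{
    return key & 0x1fff;
}
```

Note that the ranges line up with the high three bits: 000 covers primitives and instance classes, 001 through 110 encode dimensions 1 through 6, and 111 marks keys that must be looked up.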


Name Keys
1. Look up name in UTFStringTable
2. Get the key

getUString(name)->key


Method Signature
1. Encode signature into a string
Algorithm
1 byte gives number of arguments
1-3 bytes for each argument
1-3 bytes for the return value

Encoding
Primitive types encoded as A - Z
Non-primitives encoded as high/low, unless high is in the range A - Z, then use L

2. getUStringX(str, len)->key

Signature Encoding
class com.sun.cldc.io.j2me.datagram.Protocol {
Datagram newDatagram(byte[], int, String);
}
([BILjava/lang/String;)Ljavax/microedition/io/Datagram;

[Figure: the encoded signature as a byte sequence. Legible values: 0x20 (start of the byte[] key), 0x01 0x19 (java.lang.String), 0x01 0x70 (javax.microedition.io.Datagram).]
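
The encoding rules can be sketched as a small routine that writes the argument count followed by one to three bytes per type. This is an illustration under assumptions: the function names are hypothetical, and the reading of "use L" as a one-byte escape prefix placed before the high/low pair is an interpretation of the slide, not a confirmed detail.

```c
#include <assert.h>
#include <stddef.h>

/* Append one type key to out; returns the number of bytes written (1 to 3). */
static size_t encodeTypeKey(unsigned int key, unsigned char *out)
{
    unsigned char high = (unsigned char)(key >> 8);
    unsigned char low  = (unsigned char)(key & 0xff);
    size_t n = 0;

    assert(key <= 0xffff);
    if (high == 0) {                 /* primitive type: a single letter A - Z */
        out[n++] = low;
        return n;
    }
    if (high >= 'A' && high <= 'Z') {
        out[n++] = 'L';              /* assumed escape marker */
    }
    out[n++] = high;
    out[n++] = low;
    return n;
}

/* Encode a whole signature: count byte, arguments, then the return type. */
size_t encodeSignature(const unsigned int *argKeys, size_t argCount,
                       unsigned int returnKey, unsigned char *out)
{
    size_t n = 0;
    size_t i;

    assert(argCount <= 255);
    out[n++] = (unsigned char)argCount;
    for (i = 0; i < argCount; i++) {
        n += encodeTypeKey(argKeys[i], out + n);
    }
    n += encodeTypeKey(returnKey, out + n);
    return n;
}
```

For the newDatagram example, and assuming the class keys 0x2042 for byte[] (0x20 from the figure plus 'B'), 0x0119 for java.lang.String, and 0x0170 for Datagram, the encoded signature is eight bytes long, far shorter than the original descriptor string.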


Why Keys?
Saves space for C strings
Makes comparisons faster
Much less space
But. . . .
Makes debugging more complicated
Debug functions exist to convert keys
to strings, and vice versa


Memory Model


Flat address space


32kb - 64meg (but 2meg, realistically)
Special code for the Palm
Every heap-allocated object has a header
No explicit use of malloc() or free()


Object Header
[Figure: the 32-bit header word, with the fields Size (24 bits, in words), Type, Marked, and Reserved.]
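
A header word like this is usually built and inspected with shifts and masks. The sketch below assumes one possible bit layout: size in the top 24 bits, a 6-bit type field (enough for 64 types), one reserved bit, and the mark bit in bit 0. The slide does not give the actual bit positions, and all macro and function names here are hypothetical.

```c
#include <assert.h>

typedef unsigned long cell;          /* at least 32 bits wide */

/* Assumed layout: [ size:24 | type:6 | reserved:1 | mark:1 ] */
#define MARK_BIT    0x1UL
#define TYPE_SHIFT  2
#define TYPE_MASK   0x3fUL
#define SIZE_SHIFT  8
#define SIZE_MASK   0xffffffUL

/* Build an unmarked header for an object of the given size and type. */
cell makeHeader(unsigned long sizeInWords, unsigned int type)
{
    assert(sizeInWords <= SIZE_MASK && type <= TYPE_MASK);
    return (sizeInWords << SIZE_SHIFT) | ((cell)type << TYPE_SHIFT);
}

unsigned long headerSize(cell header)
{
    return (header >> SIZE_SHIFT) & SIZE_MASK;
}

unsigned int headerType(cell header)
{
    return (unsigned int)((header >> TYPE_SHIFT) & TYPE_MASK);
}

int headerIsMarked(cell header)
{
    return (header & MARK_BIT) != 0;
}

cell headerMark(cell header)
{
    return header | MARK_BIT;
}

cell headerUnmark(cell header)
{
    return header & ~MARK_BIT;
}
```

With a 24-bit size field measured in 4-byte words, a single object can span up to 64 megabytes, which matches the upper heap bound quoted on the Memory Model slide.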

Header Types
Objects that are visible to the user
GCT_INSTANCE
GCT_ARRAY
GCT_OBJECTARRAY
GCT_INSTANCE_CLASS
GCT_ARRAY_CLASS


More Header Types


Internal objects allocated in heap
GCT_FIELDTABLE
GCT_METHODTABLE
GCT_MONITOR
GCT_GLOBAL_ROOTS
GCT_POINTERLIST
GCT_FREE
GCT_HASHTABLE

20 in total; we have space for 64



Memory Layout
[Figure: heap memory layout, showing FreePointer and three Free chunks.]


Creating Roots
Global roots (permanent)
Temporary roots (stack discipline)
Transient roots (non stack discipline)


Global Roots
Created using makeGlobalRoot
Cannot be undone
Native code can create new roots

makeGlobalRoot(&globalVariable)


Temporary Roots
Roots used in a stack-like manner
Can be nested
START_TEMPORARY_ROOTS
MAKE_TEMPORARY_ROOT(x)
...
MAKE_TEMPORARY_ROOT(y)
...
END_TEMPORARY_ROOTS
START_TEMPORARY_ROOT(x)
code
END_TEMPORARY_ROOT
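
The stack discipline can be implemented by remembering the root count on entry to a scope and restoring it on exit. The following is a minimal sketch of that idea. The array, its size limit, and the macro bodies are assumptions for illustration; only the macro names come from the slide.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TEMP_ROOTS 16                       /* assumption: arbitrary limit */

/* Addresses of the variables the collector must treat as roots. */
static void **temporaryRoots[MAX_TEMP_ROOTS];
static int temporaryRootsLength = 0;

/* Open a scope and remember how many roots existed on entry. */
#define START_TEMPORARY_ROOTS   { int _savedLength = temporaryRootsLength;

/* Register the address of x as a root for the rest of the scope. */
#define MAKE_TEMPORARY_ROOT(x) \
    (assert(temporaryRootsLength < MAX_TEMP_ROOTS), \
     temporaryRoots[temporaryRootsLength++] = (void **)&(x))

/* Close the scope and drop every root created inside it. */
#define END_TEMPORARY_ROOTS       temporaryRootsLength = _savedLength; }

/* Demonstration: returns the number of roots live inside the inner scope. */
int demoNestedRoots(void)
{
    void *a = NULL;
    void *b = NULL;
    int inner = 0;

    START_TEMPORARY_ROOTS
        MAKE_TEMPORARY_ROOT(a);
        START_TEMPORARY_ROOTS
            MAKE_TEMPORARY_ROOT(b);
            inner = temporaryRootsLength;
        END_TEMPORARY_ROOTS
    END_TEMPORARY_ROOTS
    return inner;
}
```

Because each scope restores the saved length, a nested scope can never leak roots into its caller, which is what makes nesting safe.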

Transient Roots
Non-stack like behavior
Special handling of NULL
i = makeTransientRoot(x)
makeTransientRoot(y)
. . . . .
removeTransientRootByIndex(i);
or
removeTransientRootByValue(y);


Allocating Objects
mallocBytes()
mallocHeapObject(size, type)
mallocObject(size, type)
callocObject(size, type)
instantiate(instance_class)
instantiateArray(arrayclass, count)
instantiateMultiArray(class, dims, count)
instantiateString(string, length)


Garbage Collecting
Can happen any time an object is
allocated
EXCESSIVE_GARBAGE_COLLECTION


KVM Design
Garbage collector

Based on a non-copying collector


Small and simple
Non-moving, non-incremental, single-space
Mark-and-sweep algorithm
Optimized for small heaps (32 - 512 kb)
Designed to limit recursion

Will have an alternative, more advanced collector later

Non-Compacting Garbage Collectors
Advantages
Does not move objects => simple and clean codebase
Single space => uses less memory

Disadvantages
Object allocation not so fast
Memory fragmentation can cause it to run out of heap
Non-incremental => long GC pauses when using large heaps

Interpreter
Straightforward bytecode interpreter with five VM registers:
UP: thread pointer (current thread)
IP: instruction pointer
SP: stack pointer (top of stack)
FP: frame pointer (current frame)
LP: locals pointer (locals of the current frame)

[Figure: the queue of Runnable Threads, CurrentThread, and the Java stack of the current thread, with IP, SP, FP, and LP pointing into it.]
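
The role of IP, SP, and LP can be illustrated with a toy interpreter loop. This is a sketch, not the KVM interpreter: it handles only six bytecodes of a single method, has no frames or threads (so FP and UP are omitted), and lets SP point at the next free slot rather than at the top element. The opcode values are the standard Java virtual machine ones.

```c
#include <assert.h>

typedef long cell;

/* A few standard Java bytecodes. */
enum {
    BIPUSH  = 0x10,
    ILOAD_0 = 0x1a,
    ILOAD_1 = 0x1b,
    IADD    = 0x60,
    IMUL    = 0x68,
    IRETURN = 0xac
};

/* Run one method body and return the value it leaves for ireturn. */
cell interpret(const unsigned char *code, cell *locals, cell *stack)
{
    const unsigned char *ip = code;   /* IP: next bytecode to execute */
    cell *sp = stack;                 /* SP: next free operand slot (sketch convention) */
    cell *lp = locals;                /* LP: local variables of this method */

    for (;;) {
        switch (*ip++) {
        case BIPUSH:  *sp++ = (signed char)*ip++;        break;
        case ILOAD_0: *sp++ = lp[0];                     break;
        case ILOAD_1: *sp++ = lp[1];                     break;
        case IADD:    sp--; sp[-1] = sp[-1] + sp[0];     break;
        case IMUL:    sp--; sp[-1] = sp[-1] * sp[0];     break;
        case IRETURN: assert(sp > stack); return sp[-1];
        default:      assert(!"unsupported bytecode");   return 0;
        }
    }
}
```

Keeping the registers in local variables of one function, as here, is a common way to let the C compiler hold them in machine registers.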

Optimizations
Space optimizations
System class preloading (romizing) using
JavaCodeCompact
Runtime simonizing of immutable structures
Chunky stacks and segmented heap

Performance optimizations
Quick byte codes
Monomorphic inline caching

Optimizations configurable via conditional compilation, same as with debugging support

Class Loading
Early interning of Strings
Early binding of Classes to Class structure
Removal of all UTF Strings from constant pool


Stack Frames
[Figure: layout of a stack frame on the Java stack, top of stack first]
operands: operand stack starts here; SP points to the top of the Java stack
locals: pointer to local variables inside the frame
syncObject: pointer to locked object (if synchronized call)
previousIp: previous instruction pointer
previousFp: pointer to previous stack frame
returnCode: what to do upon returning from method
thisMethod
constPool: pointer to the current constant pool
locals, params, this: parameters + local variables
FP (frame pointer) points to the frame; LP points to the parameters + local variables

Thread Design
KVM supports platform-independent multithreading (green threads)
Fully deterministic; no complex mutual exclusion issues
This makes porting the KVM to new consumer devices easier and faster
Substantially more portable and much cleaner code base than with native threads
All active threads kept in a simple linked queue, and given execution time based on priority

Threads and Monitors


[Figure: a Monitor with its Owner, Waiters, and Condvars, together with the Timer Queue.]

File Loading
Loaderfile.c


Generic code for reading from classpath


Reads classes from directories or zip files
Supports resources from the classpath
Uses standard IO
Zip files read using standard IO


Generic File Loading


struct filePointerStruct;
typedef struct filePointerStruct *FILEPOINTER;
FILEPOINTER openClassfile(const char* className);
loadByte(FILEPOINTER);
loadShort(FILEPOINTER);
loadCell(FILEPOINTER);
loadBytes(FILEPOINTER, char *buffer, int len);
skipBytes(FILEPOINTER, unsigned int len);
closeClassfile(FILEPOINTER);
initializeClassPath()
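
Class files store multi-byte values in big-endian order, so loadShort and loadCell can be layered on loadByte regardless of the host byte order. The following is a minimal sketch of that layering over an in-memory buffer. The contents of filePointerStruct and the return types are assumptions; the slide only declares the struct as opaque and omits return types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: a "file" that is just a byte buffer in memory. */
struct filePointerStruct {
    const unsigned char *data;
    size_t length;
    size_t position;
};
typedef struct filePointerStruct *FILEPOINTER;

/* Read one byte and advance. */
unsigned char loadByte(FILEPOINTER fp)
{
    assert(fp->position < fp->length);
    return fp->data[fp->position++];
}

/* Read a 16-bit big-endian value. */
unsigned short loadShort(FILEPOINTER fp)
{
    unsigned int high = loadByte(fp);
    unsigned int low  = loadByte(fp);
    return (unsigned short)((high << 8) | low);
}

/* Read a 32-bit big-endian value. */
unsigned long loadCell(FILEPOINTER fp)
{
    unsigned long high = loadShort(fp);
    unsigned long low  = loadShort(fp);
    return (high << 16) | low;
}

/* Skip len bytes without reading them. */
void skipBytes(FILEPOINTER fp, unsigned int len)
{
    assert(len <= fp->length - fp->position);
    fp->position += len;
}
```

Hiding the representation behind FILEPOINTER is what lets the same class loader read from a directory, a JAR file, or a buffer like the one used here.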


Reading Zip Files


findJARdirectories()
loadJARfile()
inflate()


Zip Inflater
Old implementation
Two new implementations
FILE* contains compressed bytes
Char* containing compressed bytes

Inflater can be used outside the KVM


64-Bit Support
#define COMPILER_SUPPORTS_LONGS 1
long64, ulong64
NEED_LONG_ALIGNMENT
NEED_DOUBLE_ALIGNMENT

#define COMPILER_SUPPORTS_LONGS 0
ll_mul, ll_div, ll_rem,
ll_shl, ll_shr, ll_ushr
BIG_ENDIAN, LITTLE_ENDIAN
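
When the compiler has no 64-bit integer type, a port has to supply the ll_ operations itself. Below is a minimal sketch of one of them, a left shift over a pair of 32-bit halves. The struct layout of long64 is an assumption (a real port would presumably order the halves according to BIG_ENDIAN or LITTLE_ENDIAN), and the function signature is hypothetical. Masking the shift count to six bits follows the Java virtual machine's lshl semantics.

```c
#include <assert.h>

/* Assumed representation: two 32-bit halves held in unsigned longs. */
typedef struct {
    unsigned long high;
    unsigned long low;
} long64;

#define HALF_MASK 0xffffffffUL

/* Shift a 64-bit value left, using only 32-bit-safe operations. */
long64 ll_shl(long64 a, int count)
{
    long64 result;

    assert(a.high <= HALF_MASK && a.low <= HALF_MASK);
    count &= 63;                         /* Java uses the low 6 bits of the count */
    if (count == 0) {
        return a;
    }
    if (count >= 32) {
        /* Everything in the low half moves into the high half. */
        result.high = (a.low << (count - 32)) & HALF_MASK;
        result.low  = 0;
    } else {
        /* Bits shifted out of the low half carry into the high half. */
        result.high = ((a.high << count) | (a.low >> (32 - count))) & HALF_MASK;
        result.low  = (a.low << count) & HALF_MASK;
    }
    return result;
}
```

The explicit masks keep the sketch correct on hosts where unsigned long is wider than 32 bits.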


Security
Cannot support full J2SE platform security model in CLDC target devices:
J2SE platform security model is much larger than the entire CLDC implementation

CLDC security discussion consists of two parts:
1. Low-level virtual machine security
An application running in the VM must not be able to harm the device in any way
Guaranteed by the bytecode verifier (discussed later)
2. Application-level security
Sandbox model


Security
CLDC sandbox model requires that:
Classfiles have been properly verified and guaranteed to be valid
Only a limited, predefined set of APIs is available to the application programmer (as defined by the CLDC, profiles and licensee open classes)
Downloading and management of applications takes place at the native code level, and the programmer cannot override the standard class loading mechanisms or the system classes of the virtual machine

Security
Sandbox requirements continued
The set of native functions accessible to the virtual machine is closed, meaning that the application programmer cannot download any new libraries containing native functionality, or access any native functions that are not part of the APIs provided by CLDC, profiles and licensee open classes


Class File Verification


Standard JVM class file verifier is too large for a typical CLDC target device
The size of the JVM verifier is larger than KVM itself
Dynamic memory consumption is excessive (>100 kb for typical applications)

CLDC/KVM introduces a new, two-pass class file verifier


Class File Verification


[Figure: on the development workstation, javac compiles MyApp.java to MyApp.class, and the preverifier produces a preverified MyApp.class. After download to the target device (KVM runtime), the verifier checks the class file before the interpreter runs it.]

Class File Verification


Features of the new verifier
Space-intensive processing (stack map generation, removal of JSR-RET byte codes) performed off-device
Results checked for correctness on device; cannot be spoofed
Code signing not required
On-device footprint ~10 kb; constant space (<100 bytes) at runtime; linear time (one pass, no recursion)
5% overhead in class file size

Old Verifier
Theorem prover
Consistent set of values for the stack
Consistent set of values for each register
Consistent use of jsr/ret instruction


Old Verifier
Required no information outside the class
Complex data flow analysis
Can require multiple passes
Can require large amount of space
Jsr and ret instructions cause massive difficulties


New Verifier


Theorem verifier
Single pass through the code
Very little space needed
Theorem proving occurs off line
May improve garbage collection in the future!


Old vs. New Verifier


static void test(Long x) {
    Number y = x;
    while (y.intValue() != 0) {
        y = nextValue(y);
    }
    return;
}


Old Verifier
[Figure: the old verifier's data-flow analysis of test(), showing the types it infers for the locals and the operand stack (Long, Number, int, and <> for an empty stack) at each instruction until it is done.]
0. aload_0
1. astore_1
2. goto 10
5. aload_1
6. invokestatic nextValue(Number)
9. astore_1
10. aload_1
11. invokevirtual intValue()
14. ifne 5
17. return
Done!

New Verifier
[Figure: the new verifier's single pass over test(), showing the types of the locals and the operand stack (Long, Number, int, and <> for an empty stack) that it checks at each instruction until it is done.]
0. aload_0
1. astore_1
2. goto 10
5. aload_1
6. invokestatic nextValue(Number)
9. astore_1
10. aload_1
11. invokevirtual intValue()
14. ifne 5
17. return
Done!

Native Methods
void <JNI name for Function> (void) {
pop arguments off stack;
pop <this> off stack (if instance)
perform calculations
push result (if any) onto stack;
return;
}
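
The pattern above can be made concrete with a toy operand stack. This is a sketch under assumptions: the stack array, the pushStack and popStack helpers, and the native method Java_Example_max are hypothetical and are not taken from the KVM sources.

```c
#include <assert.h>

typedef long cell;

#define STACK_SIZE 32
static cell javaStack[STACK_SIZE];
static cell *sp = javaStack;             /* next free slot (sketch convention) */

/* Push one value onto the operand stack. */
static void pushStack(cell value)
{
    assert(sp < javaStack + STACK_SIZE);
    *sp++ = value;
}

/* Pop the topmost value from the operand stack. */
static cell popStack(void)
{
    assert(sp > javaStack);
    return *--sp;
}

/* Native implementation of a hypothetical static method
 * int max(int a, int b): it takes no C parameters and returns nothing,
 * because arguments and result travel on the Java stack. */
void Java_Example_max(void)
{
    cell b = popStack();                 /* last argument is on top */
    cell a = popStack();
    pushStack(a > b ? a : b);            /* leave the result for the caller */
}
```

Popping exactly as many values as the method declares matters here: one pop too few or too many silently corrupts the caller's operand stack.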


Native Function Table


Automatically generated
Romized build
Non-romized build

Full documentation in porting guide


Catches missing implementations
No support for runtime loading of object code


JNI Caveats
Object allocation functions
Using the right push/pop macro
Don't use type coercion!

Popping the right number of values


Garbage collection issues


Asynchronous JNI
Readers and writers
Two implementations
Call back with continuation
Start a separate thread to perform the task

Allows KVM to use the most wanted features of OS-level threads, without the hassle


Event Handling
Synchronous notification
Polling in Java programming language code
Polling in the interpreter
Asynchronous notification


Romizer
Based on romizer used in PersonalJava API implementation
Generates data structures that JVM would have generated
Classes are loaded but not initialized


Palm Romizer
Problems
64K limit per resource
Resources are relocatable
Native code is independently relocatable

Solutions


Multiple small resources


Expandable resources
Runtime relocation
Reset checking


Restartability


Required for embedded systems


Romizer issues
Global variable issues
Other issues


Starting a Port
machine_md.h
runtime_md.c
main.c


Porting: Functions
AlertUser()
allocateHeap(), freeHeap()
InitializeNativeCode(), FinalizeNativeCode()
InitializeVM(), FinalizeVM()
CurrentTime_md()
ulong64, long64
BIG_ENDIAN or LITTLE_ENDIAN


Porting: More Functions


C library calls
Minimal support for stdio
Native support for any generic connections needed by your port
Asynchronous support


Port-Specific Changes


Where class files are located


Where memory is located
What to do in case of severe errors
Window system startup/bringdown
Startup linking


Compiler Flags


Tracing flags
Floating point
Alignment issues
Romizing
Generic networking
Generic storage


Directory Structure


api
bin
build
doc
jam
samples
tools
kvm

Subdirectories of tools/
tools/jcc
tools/palm

Subdirectories of kvm/
kvm/VMCommon/h, kvm/VMCommon/src
kvm/VMExtra/h, kvm/VMExtra/src
kvm/VM<port>/h, kvm/VM<port>/src, kvm/VM<port>/build


Porting
Create machine_md.h
Create runtime_md.c
Determine values of compiler flags


Application Representation
Public representation of applications
Whenever applications intended for a CLDC device are stored publicly, compressed JAR format must be used
Class files must have been preverified and contain the preverification information

At the implementation or network transport level
Alternative formats can be used, as long as the observable semantics of applications remain the same as with standard class files

Java Application Manager


KVM/CLDC includes an optional component known as Java application manager
Helps in integrating KVM with a microbrowser
Allows the downloading of applications from the internet via HTTP
Allows the management (installing, launching, deletion) of applications on devices without a file system
Intended to facilitate porting efforts