Beruflich Dokumente
Kultur Dokumente
0 - Rationales
Yes, that means the kernel, the CPU and all the rest
Tooling criteria
...
Testing Methodology
Document your investigation (Again!)
Testing Methodology
Document your investigation (Again!)
Check again
Be very patient
Use a process
Others need to know you have not escaped the country (yet)
Profiling Objectives
Don't ever optimize everything
Be clear on purpose
Optimization
memory, permanent storage, net thruput, speed, etc
Debugging
memory leaks
Testing
coverage, load testing
Tools Lie
Interfering with normal operation
Common results
I Gory Details
Less is more
11
12
Pentium 4:
hardware prefetch
branching errors
data dependencies (micro-ops, leverage pipelines)
thruput and latency
13
loop unrolling
contrasting objectives
keep it small (p4 caches decoded instructions)
Pause (new)
delayed NOOP that reduces polling to RAM bus speed (__mm_pause)
14
Precomputed table
maximize cache hits from table
keep it small (or access it cleverly)
Fetching
top point to optimize
Numeric exceptions
denormals
integer rounding (x86) vs truncation (c)
partial flag stalls
Kernel Details
Caching and buffering
18
19
us : user mode
sy : system (kernel/)
ni : niced
id : idle
wa : I/O wait (blocking)
hi : irq handlers
si : softirq handlers
>
>
20
in : interrupts
cs : context switches
us : total CPU in user (including nice)
sy : total CPU in system (including irq and softirq)
wa : total CPU waiting
id : total CPU idle
CPU: /proc
/proc
/proc/interrupts
/proc/ioports
/proc/iomem
/proc/cpuinfo
procinfo
mpstat
21
CPU: others
sar
oprofile
22
RAM: free
free
/proc
/proc/meminfo
>
/proc/slabinfo
slabtop
23
RAM: others
vmstat, top, procinfo, sar
24
I/O: vmstat
vmstat
25
D : all
>
d : individual disk
>
p : partition
output
>
>
>
>
>
I/O: iostat
iostat
output
>
>
>
>
>
>
>
>
>
>
26
27
output (continued)
>
>
>
>
>
>
>
svctm : average service time (await includes service and queue time)
I/O: others
sar
hdparm
also, great way to make sure your disks are being used to their
full potential
bonnie
28
Network I/O
iptraf
29
31
32
PID : pid
PR : priority
NI : niceness
S : Process status (S | R | Z | D | T)
WCHAN : which I/O op blocked on (if any) , aka sleeping function
TIME : total (system + user) spent since spent since startup
COMMAND : command that started the process
#C : last CPU seen executing this process
FLAGS : task flags (sched.h)
USER : username of running process
VIRT : virtual image size
RES : resident size
SHR : shared mem size
...
33
34
c : profile
>
>
>
p : trace PID
>
>
>
v : verbose
35
c : profile
>
>
p : trace PID
>
environment controlled
36
libs
display library search paths
reloc
display relocation processing
files
display progress for input file
symbols display symbol table processing
bindings display information about symbol binding
versions display version dependencies
all
all previous options combined
statistics display relocation statistics
unused
determined unused DSOs
>
gcov
37
similar process
process RAM
/proc
/proc/<PID>/status
>
>
>
>
>
>
>
/proc/<PID>/maps
use of virtual address space
memprof
38
Valgrind
valgrind
addrcheck
faster, but less errors caught
massif
heap profiler
helgrind
race conditions
cachegrind / kcachegrind
cache profiler
39
Interprocess communication
ipcs
40
t : time of creation
>
>
>
>
a : all information
41
IV VM test Dummies
VM Details
Garbage collection
floating garbage
stop-the world
compacting
...
Profiling APIs:
43
JVMTI (current)
JVMPI (since 1.1) legacy, flaky on hotspot, deprecated in 1.5
JVMDI (since 1.1) legacy, also targeted at removal for 1.6
Netbeans profiler
44
much less information than jprofiler, but can redefine instrumentation at runtime (!)
telemetry has zero footprint, and can be used to identify moment for a snapshot
generation information provides great hints for really small leaks (bad signal/noise)
extremely stable
Roll-your-own
V - Conclusion
There are a lot of tools out there, start picking them up and
go get your hands dirty!
46
Resources:
Books
Code Optimization: Effective Memory Usage (Kris
Karspersky)
The Software Optimization Cookbook (Richard Gerber)
Optimizing Linux Performance (Philip Ezolt)
Linux Debugging and Performance Tuning (Steve Best)
Linux Performance Tuning and Capacity Planning (Jason
Fink and Matthew Sherer)
Self Service Linux (Mark Wilding and Dan Behman)
Graphics Programming Black Book (Michael Abrash)
Other
Kernel Optimization / Tuning (Gerald Pfeifer - Brainshare)
47
Any Questions ?
<flucifredi@acm.org>
48
49