
Real-time Programming in RTCore

FSMLabs, Inc. Copyright Finite State Machine Labs Inc. 2001-2004


All rights reserved.

17th January 2005


Contents

1 Introduction 11
1.1 Some background . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 How the book works . . . . . . . . . . . . . . . . . . . . . . . 13

I RTCore Basics 15
2 Introductory Examples 17
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 Using RTCore . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.1 Hello world . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2.2 Multithreading . . . . . . . . . . . . . . . . . . . . . . 19
2.2.3 Basic communication . . . . . . . . . . . . . . . . . . . 20
2.2.4 Signalling and multithreading . . . . . . . . . . . . . . 23
2.3 Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3 Real-time Concepts and RTCore 27


3.1 RTOS kingdom/phylum/order . . . . . . . . . . . . . . . . . . 27
3.1.1 Non-real-time systems . . . . . . . . . . . . . . . . . . 27
3.1.2 Soft real-time . . . . . . . . . . . . . . . . . . . . . . . 28
3.1.3 Hard real-time . . . . . . . . . . . . . . . . . . . . . . 29
3.2 The RTOS design dilemma . . . . . . . . . . . . . . . . . . . . 30
3.2.1 Expand an RTOS . . . . . . . . . . . . . . . . . . . . . 30
3.2.2 Make a general purpose OS real-time capable . . . . . 31
3.2.3 The RTCore approach to the problem . . . . . . . . . . 32
3.3 Interrupt emulation . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3.1 Flow of control on interrupt . . . . . . . . . . . . . . . 33
3.3.2 Limits of interrupt emulation . . . . . . . . . . . . . . 34


3.4 Services Available to Real-Time Code . . . . . . . . . . . . . . 35


3.4.1 Memory management . . . . . . . . . . . . . . . . . . . 35
3.4.2 Networking - Ethernet and FireWire . . . . . . . . . . 36
3.4.3 Integration with other services . . . . . . . . . . . . . . 36
3.4.4 What’s next . . . . . . . . . . . . . . . . . . . . . . . . 37

4 The RTCore API 39


4.1 POSIX compliance . . . . . . . . . . . . . . . . . . . . . . . . 39
4.1.1 The POSIX PSE 51 standard . . . . . . . . . . . . . . 40
4.1.2 Roadmap to future API development . . . . . . . . . . 40
4.2 POSIX threading functions . . . . . . . . . . . . . . . . . . . . 40
4.2.1 Thread creation . . . . . . . . . . . . . . . . . . . . . . 41
4.2.2 Thread joining . . . . . . . . . . . . . . . . . . . . . . 43
4.2.3 Thread destruction . . . . . . . . . . . . . . . . . . . . 44
4.2.4 Thread management . . . . . . . . . . . . . . . . . . . 44
4.2.5 Thread attribute functions . . . . . . . . . . . . . . . . 45
4.3 Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.3.1 POSIX spinlocks . . . . . . . . . . . . . . . . . . . . . 46
4.3.2 Comments on SMP safe/unsafe functions . . . . . . . . 47
4.3.3 Asynchronously unsafe functions . . . . . . . . . . . . 47
4.3.4 Cancel handlers . . . . . . . . . . . . . . . . . . . . . . 48
4.4 Mutexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.4.1 Locking and unlocking mutexes . . . . . . . . . . . . . 50
4.4.2 Mutex creation and destruction . . . . . . . . . . . . . 51
4.4.3 Mutex attributes . . . . . . . . . . . . . . . . . . . . . 52
4.5 Condition variables . . . . . . . . . . . . . . . . . . . . . . . . 52
4.5.1 Creation and destruction . . . . . . . . . . . . . . . . . 53
4.5.2 Condition waiting and signalling . . . . . . . . . . . . . 53
4.5.3 Condition variable attribute calls . . . . . . . . . . . . 54
4.6 Semaphores . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.6.1 Creation and destruction . . . . . . . . . . . . . . . . . 55
4.6.2 Semaphore usage calls . . . . . . . . . . . . . . . . . . 55
4.6.3 Semaphores and Priority . . . . . . . . . . . . . . . . . 56
4.7 Clock management . . . . . . . . . . . . . . . . . . . . . . . . 56
4.8 Extensions to POSIX (*_np()) . . . . . . . . . . . . . . . . . . 57
4.8.1 Advance timer . . . . . . . . . . . . . . . . . . . . . . . 57
4.8.2 CPU affinity calls . . . . . . . . . . . . . . . . . . . . . 58
4.8.3 Enabling FPU access . . . . . . . . . . . . . . . . . . . 59

4.8.4 CPU reservation . . . . . . . . . . . . . . . . . . . . . 59


4.8.5 Concept of the extensions . . . . . . . . . . . . . . . . 60
4.9 "Pure POSIX" - writing code without the extensions . . . . . 60
4.10 The RTCore API and communication models . . . . . . . . . 60

5 More concepts 61
5.1 Copying synchronization objects . . . . . . . . . . . . . . . . . 61
5.2 API Namespace . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3 Resource cleanup . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4 Deadlocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.5 Synchronization-induced priority inversion . . . . . . . . . . . 63
5.6 Memory management . . . . . . . . . . . . . . . . . . . . . . . 63
5.7 Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.7.1 Methods and safety . . . . . . . . . . . . . . . . . . . . 64
5.7.2 One-way queues . . . . . . . . . . . . . . . . . . . . . . 65
5.7.3 Atomic operations . . . . . . . . . . . . . . . . . . . . 70

6 Communication between RTCore and the GPOS 73


6.1 printf() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.2 rtl_printf() . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.3 Real-time FIFOs . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.3.1 Using FIFOs from within RTCore . . . . . . . . . . . . 74
6.3.2 Using FIFOs from the GPOS . . . . . . . . . . . . . . 75
6.3.3 A simple example . . . . . . . . . . . . . . . . . . . . . 75
6.3.4 FIFO allocation . . . . . . . . . . . . . . . . . . . . . . 78
6.3.5 Limitations . . . . . . . . . . . . . . . . . . . . . . . . 79
6.4 Shared memory . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.4.1 mmap() . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . 81
6.4.3 Limitations . . . . . . . . . . . . . . . . . . . . . . . . 86
6.5 Soft interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.5.1 The API . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.5.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . 89

7 Debugging in RTCore 93
7.1 Enabling the debugger . . . . . . . . . . . . . . . . . . . . . . 93
7.2 An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.3 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

7.3.1 Overhead . . . . . . . . . . . . . . . . . . . . . . . . . 98
7.3.2 Intercepting unsafe FPU use . . . . . . . . . . . . . . . 99
7.3.3 Remote debugging . . . . . . . . . . . . . . . . . . . . 99
7.3.4 Safely stopping faulted applications . . . . . . . . . . . 100
7.3.5 GDB notes . . . . . . . . . . . . . . . . . . . . . . . . 100

8 Tracing in RTCore 101


8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
8.2 Basic Usage of the Tracer . . . . . . . . . . . . . . . . . . . . 102
8.3 POSIX Events . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

9 IRQ Control 105


9.1 Interrupt handler control . . . . . . . . . . . . . . . . . . . . . 105
9.1.1 Requesting an IRQ . . . . . . . . . . . . . . . . . . . . 105
9.1.2 Releasing an IRQ . . . . . . . . . . . . . . . . . . . . . 106
9.1.3 Pending an IRQ . . . . . . . . . . . . . . . . . . . . . . 106
9.1.4 A basic example . . . . . . . . . . . . . . . . . . . . . . 106
9.1.5 Specifics when running on NetBSD . . . . . . . . . . . . 108
9.2 IRQ state control . . . . . . . . . . . . . . . . . . . . . . . . . 109
9.2.1 Disabling and enabling all interrupts . . . . . . . . . . 109
9.3 Spinlocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

10 Writing Device Drivers 113


10.1 Real-time FIFOs . . . . . . . . . . . . . . . . . . . . . . . . . 113
10.2 POSIX files . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
10.2.1 Error values . . . . . . . . . . . . . . . . . . . . . . . . 117
10.2.2 File operations . . . . . . . . . . . . . . . . . . . . . . 117
10.3 Reference counting . . . . . . . . . . . . . . . . . . . . . . . . 117
10.3.1 Reference counting and userspace . . . . . . . . . . . . 119

II RTLinuxPro Technologies 121


11 Real-time Networking 123
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

12 PSDD 125
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
12.2 Hello world with PSDD . . . . . . . . . . . . . . . . . . . . . . 125

12.3 Building and running PSDD programs . . . . . . . . . . . . . 127


12.4 Programming with PSDD . . . . . . . . . . . . . . . . . . . . 127
12.5 Standard Initialization and Cleanup . . . . . . . . . . . . . . . 129
12.6 Input and Output . . . . . . . . . . . . . . . . . . . . . . . . . 130
12.7 Example: User-space PC speaker driver . . . . . . . . . . . . . 131
12.8 Safety Considerations . . . . . . . . . . . . . . . . . . . . . . . 133
12.9 Debugging PSDD Applications . . . . . . . . . . . . . . . . . 135
12.10 PSDD API . . . . . . . . . . . . . . . . . . . . . . . . . . 136
12.11 Frame Scheduler . . . . . . . . . . . . . . . . . . . . . . . 138
12.11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 138
12.11.2 Command-line interface to the scheduler . . . . . . . . 140
12.11.3 Building Frame Scheduler Programs . . . . . . . . . . . 141
12.11.4 Running Frame Scheduler Programs . . . . . . . . . . . 142
12.12 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

13 Controls Kit (CKit) 145


13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
13.2 Operation of the Ckit . . . . . . . . . . . . . . . . . . . . . . . 148
13.3 PD Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
13.3.1 Entity Registration . . . . . . . . . . . . . . . . . . . . 149
13.3.2 Program Execution . . . . . . . . . . . . . . . . . . . . 153
13.4 XML-RPC API . . . . . . . . . . . . . . . . . . . . . . . . . . 157
13.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

14 RTLinuxPro Optimizations 161


14.1 General optimizations . . . . . . . . . . . . . . . . . . . . . . 161
14.2 RTCore-internal optimizations . . . . . . . . . . . . . . . . . . 162
14.3 CPU management . . . . . . . . . . . . . . . . . . . . . . . . . 163
14.3.1 Targeting specific CPUs . . . . . . . . . . . . . . . . . 163
14.3.2 Reserving CPUs . . . . . . . . . . . . . . . . . . . . . . 163
14.3.3 Interrupt focus . . . . . . . . . . . . . . . . . . . . . . 164
14.3.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 165

15 SlickEdit IDE for RTCore 167


15.1 Creating a RTCore application project . . . . . . . . . . . . . 167
15.2 Compiling RTCore applications . . . . . . . . . . . . . . . . . 168

III Appendices 171


A List of abbreviations 173

B Terminology 177

C Familiarizing with RTLinuxPro 189


C.1 Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
C.1.1 Self and cross-hosted development . . . . . . . . . . . . 190
C.2 Loading and unloading RTCore . . . . . . . . . . . . . . . . . 190
C.2.1 Running the examples . . . . . . . . . . . . . . . . . . 191
C.3 Using the root filesystem . . . . . . . . . . . . . . . . . . . . . 191
C.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

D Important system commands 195

E Things to Consider 201


E.0.1 System Management Interrupts (SMIs) . . . . . . . . . 201
E.0.2 Drivers that have hard coded cli/sti . . . . . . . . . . 202
E.0.3 Power management (APM) . . . . . . . . . . . . . . . 202
E.0.4 Hardware platforms . . . . . . . . . . . . . . . . . . . . 202
E.0.5 Floppy drives . . . . . . . . . . . . . . . . . . . . . . . 203
E.0.6 ISA devices . . . . . . . . . . . . . . . . . . . . . . . . 203
E.0.7 DAQ cards . . . . . . . . . . . . . . . . . . . . . . . . 204

F RTCore Drivers 205


F.1 Digital IO Device Common API . . . . . . . . . . . . . . . . . 205
F.2 Intel 82C55 Digital IO . . . . . . . . . . . . . . . . . . . . . . 207
F.2.1 Driver specifics . . . . . . . . . . . . . . . . . . . . . . 207
F.3 Marvell GT64260 and GT64360 Digital IO Driver . . . . . . . 207
F.3.1 Driver specifics . . . . . . . . . . . . . . . . . . . . . . 207
F.4 Video Framebuffer Driver . . . . . . . . . . . . . . . . . . . . 208
F.4.1 Calling Contexts . . . . . . . . . . . . . . . . . . . . . 208
F.4.2 Operations on the Framebuffer . . . . . . . . . . . . . 208
F.4.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 209
F.5 IEEE-1284 – Parallel Port Digital IO Driver . . . . . . . . . . 209
F.6 Power Management Driver . . . . . . . . . . . . . . . . . . . . 209
F.7 Frequency changing . . . . . . . . . . . . . . . . . . . . . . . . 210
F.8 CPU Idle calls . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

F.9 Additional Uses . . . . . . . . . . . . . . . . . . . . . . . . . . 210


F.10 PPS driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
F.11 Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
F.12 Timing and how it works . . . . . . . . . . . . . . . . . . . . . 211
F.13 Using the driver . . . . . . . . . . . . . . . . . . . . . . . . . . 212
F.13.1 Starting the driver . . . . . . . . . . . . . . . . . . . . . 212
F.13.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . 212
F.14 Caveats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
F.14.1 Jitter value . . . . . . . . . . . . . . . . . . . . . . . . 213
F.14.2 SMP systems . . . . . . . . . . . . . . . . . . . . . . . 214
F.15 Serial driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
F.16 VME driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
F.16.1 Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . 217
F.16.2 Slave memory regions . . . . . . . . . . . . . . . . . . . 217
F.16.3 Master memory regions . . . . . . . . . . . . . . . . . . 218
F.16.4 DMA transfers . . . . . . . . . . . . . . . . . . . . . . 219
F.16.5 Performance . . . . . . . . . . . . . . . . . . . . . . . . 219

G The RTCore POSIX namespace 221


G.1 Clean applications . . . . . . . . . . . . . . . . . . . . . . . . 221
G.2 Polluted applications . . . . . . . . . . . . . . . . . . . . . . . 222
G.3 PSDD users . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
G.4 Include hierarchies and rules . . . . . . . . . . . . . . . . . . . 225
G.4.1 app/ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
G.4.2 rtcore/ . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
G.4.3 gpos_bridge/ . . . . . . . . . . . . . . . . . . . . . . . . 226
G.5 Including GPOS files . . . . . . . . . . . . . . . . . . . . . . . 227
G.6 Quick rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
G.6.1 Older apps that must be polluted . . . . . . . . . . . . 227
G.6.2 Older users that want to avoid pollution . . . . . . . . 228
G.6.3 Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . 228

H System Testing 231


H.1 Running the regression test . . . . . . . . . . . . . . . . . . . 231
H.1.1 Stress testing . . . . . . . . . . . . . . . . . . . . . . . 232
H.2 Jitter measurement . . . . . . . . . . . . . . . . . . . . . . . . 233

I Sample programs 235


I.1 Hello world . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
I.2 Multithreading . . . . . . . . . . . . . . . . . . . . . . . . . . 235
I.3 FIFOs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
I.3.1 Real-time component . . . . . . . . . . . . . . . . . . . 236
I.3.2 Userspace component . . . . . . . . . . . . . . . . . . . 238
I.4 Semaphores . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
I.5 Shared Memory . . . . . . . . . . . . . . . . . . . . . . . . . . 240
I.5.1 Real-time component . . . . . . . . . . . . . . . . . . . 240
I.5.2 Userspace application . . . . . . . . . . . . . . . . . . . 243
I.6 Cancel Handlers . . . . . . . . . . . . . . . . . . . . . . . . . . 244
I.7 Thread API . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
I.8 One Way queues . . . . . . . . . . . . . . . . . . . . . . . . . 246
I.9 Processor reserve/optimization . . . . . . . . . . . . . . . . . . 248
I.10 Soft IRQs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
I.11 PSDD sound speaker driver . . . . . . . . . . . . . . . . . . . 251
Chapter 1

Introduction

Real-time software is needed to run multimedia systems, telescopes, machine


tools, robots, communication devices, and many other kinds of systems. The
RTCore hard real-time operating system has been used to control the
mechanical animals in the movie Dr. Dolittle, perform jet engine testing for
the Joint Strike Fighter, aim the telescope at Kitt Peak, run flight simula-
tors, collect weather data for NASA, balance magnetic bearings, milk cows,
control Fujitsu’s humanoid robot, and more. The system is flexible enough
that for one customer, it can control an engine, while for another it just as
easily mimics the human hand playing a violin.
RTCore is designed to make real-time programming more convenient and
less mysterious. Real-time programming is still pretty challenging, but once
you start to understand the basic ideas, if you have some C and UNIX back-
ground, programming RTCore applications should be, if not simple, at least
feasible. In this book we will cover the basic principles of the OS and general
real-time programming, stressing examples and practical methods as much
as possible.
RTCore follows the UNIX philosophy of making it convenient to build
complex applications by connecting existing pieces of software. One way to
think of RTCore is as a small operating system that runs a second operating
system as its lowest priority task. All the non-time-critical applications can
be put in the second operating system. For most programmers it’s probably
more useful to think of RTCore as a special real-time process that runs
within a non-real-time operating system. It schedules itself and can always
pre-empt both the operating system and any applications. The non-real-time
operating system is usually Linux (RTLinux), although it can also be BSD


UNIX (RTCoreBSD) or even a Java VM.[1]


The real-time system is multi-threaded using the POSIX threads API.
Real-time applications are written as threads and signal handlers that can
be installed in the real-time process. These threads and signal handlers can
be scheduled with great precision and can respond to interrupts with very
low latency.
On a 1.2GHz Athlon (lower end hardware these days), an RTCore in-
terrupt handler runs within 9 microseconds of the assertion of a hardware
interrupt, under heavy load. So if we have a device that generates an in-
terrupt when the temperature gets too high, at the worst case, the signal
handler connected to that device will start running 9 microseconds after the
interrupt is generated. On the same system, a periodic thread scheduled to
run every millisecond will run at most 13 microseconds late after the sum
of interrupt latency, scheduling overhead, and context switch. So if we have
a data acquisition device we can poll it at a regular rate, and know that
the polling thread starts up within 13 microseconds of the scheduled time.[2]
RTCore is a hard real-time system, so these are absolute worst-case times,
not average or "typical" times. Be wary of tests that demonstrate other
approaches: many are run against quiescent systems or for short periods of
time, or quote numbers that have no bearing on real situations and are in
no way indicative of real-world results. We will discuss what numbers to
look for in later examples.
Of course, all this speed does no good if programming the system is too
complicated. So we have designed RTCore to meet two goals. First, the time
critical software can be written in the familiar and well documented POSIX
threads/signals API. And second, it’s pretty easy to put non-time-critical
software into the application operating system. Our favorite example of how
to write a data logging program makes use of a single line of shell script on
the UNIX side:
./rtcore_app > mylogfile
This runs the real-time application and logs output to a non-real-time Linux
file. For those who have used UNIX at all, this should look very familiar.
[1] Generally, we use the term "GPOS", or General Purpose Operating System,
to generically refer to the non-real-time system. The RTCore API and behavior
remain the same regardless of which GPOS is being used.
[2] In later chapters, we will see how to reduce this down to 0 microseconds,
bypassing hardware jitter.

This book starts with some background and simple examples and then
takes a detour for an in-depth introduction to the basic concepts of RTCore
and an overview of the API. Next, the available communication models for
exchanging data between real-time threads and the non-real-time domain are
presented. The sample programs then use these mechanisms to show how they
apply to simple problems. These chapters are devoted to stepping through
these programs, making every step as clear as possible, and require little
prior knowledge. Following that, several chapters are devoted to the more
advanced features of RTLinux Professional, or RTLinuxPro. After having
covered the basic concepts in a few sample programs, we then provide a
basic model for writing real-time drivers.

1.1 Some background


RTLinux began as a research project in 1995 to investigate a simple method
of providing hard real-time services within the context of a general purpose
operating system. Soon after, it began to be used in a variety of domains.
FSMLabs was formed to provide a dedicated effort to improving the
technology, and to provide top-tier support for commercial users of the product.
RTLinuxPro was developed out of this effort, and is licensed for commercial
use. FSMLabs continues to move the technology forward via RTLinuxPro
and the RTCore OS, having dedicated many man-years to providing
a solid and integrated hard real-time component for commercial customers.
The RTLinuxFree project, based on the GPL-released code, is community
supported and developed. FSMLabs continues to provide the necessary
resources to support the RTLinuxFree community.

1.2 How the book works


The main body of each chapter discusses the principles of how the software
examples work. In each chapter, side notes describe how to implement the
examples or test behavior in RTLinuxPro. There is an appendix with a basic
usage guide for RTLinuxPro. As mentioned, the RTCore OS can run different
non-real-time operating systems, but Linux and BSD UNIX will generally be
referred to by default. Ports to other operating systems are in development.
The target audience for this book is the engineer who is interested in

learning how to write real-time applications using the RTCore OS. The book
focuses on getting the user up to speed on each facet so they can become
productive quickly, rather than having to intuit facts from scattered sources.
Experience in developing real-time applications is helpful but not necessary,
as RTCore uses the standard POSIX API. Users with some knowledge of
POSIX and UNIX should feel right at home.
The full sources of the programs referenced here can be found in Appendix
I and are provided with the RTLinuxPro development kit.
Part I

RTCore Basics

Chapter 2

Introductory Examples

2.1 Introduction
The RTCore OS is a small, hard real-time operating system that can run
Linux or BSD UNIX as an application server. This allows a standard oper-
ating system to be used as a component of a real-time application. In this
part, we will provide an overview of RTCore capabilities, introducing basic
concepts, the API, and some of the add-on components. This book starts
assuming you have already installed RTLinuxPro, RTCoreBSD or RTLin-
uxFree - refer to the installation instructions that came with your package
for details. This chapter will assume an RTLinuxPro environment, but the
procedures apply equally to a BSD host.

2.2 Using RTCore


RTCore extends the UNIX “design with components” philosophy to real-
time. A typical RTCore application consists of one or more real-time
components that run under the direct control of the real-time kernel, and a set
of non-real-time components that run as user-space programs. Let’s start off
with a couple of simple programs.
At this point, we assume some very basic familiarity with RTCore con-
cepts as discussed so far. If you would like more information to get up
to speed before continuing, please refer to the whitepaper in Appendix ??.
A basic guide to RTLinuxPro (Appendix C) is also provided to help you learn
your way around the system. And if you are working through these examples and need


more grounding, skip ahead to Chapter 3 for background information.


For this example, you will need to have the core RTCore OS loaded as
described in Appendix C, and we assume that your current working directory
as the root user is the rtlinuxpro directory of RTLinuxPro (or the
appropriate installation point for your RTLinuxFree installation). If you don't see
the referenced files in the directory, type make to ensure that everything is
up to date. Now, on with the code:

2.2.1 Hello world


As with any other system, it makes sense to start things off with a simple
"hello world" application, and RTCore is no exception. The real-time merits of
such an application are dubious, but it does serve to show how simple the API is.
Without further ado, here is the standard introductory program:

#include <stdio.h>

int main(void)
{
printf("Hello from the RTL base system\n");
return 0;
}

Surprised? This is all that is involved - nothing more than what you
would see in a normal C introduction. Running the example (./hello.rtl)
forces the RTCore OS to load the application, and enter the main() context.
Here it prints a message out through standard I/O for the user to see, and
exits.
Those familiar with older RTLinux versions are used to these messages
silently appearing in the kernel’s ring buffer, but now they print through
stdout just like any other application. Also, there is a standard printf(),
rather than the rtl_printf() some users have seen. This printf() is fully
capable, and can handle any format that a normal printf() can.
Once the message has been printed, the program exits, RTCore unloads
the application, and we’re done. Now, let’s move on to something a little bit
more useful.

2.2.2 Multithreading
If you’re familiar with POSIX threading, you’ll feel at home with RTCore.
If you’re not familiar with it, there are many solid references on the subject,
such as the O’Reilly book on Pthreads Programming. Let’s start with a
basic example of the pthread model here, with a task that operates on a 1
millisecond interval.

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

pthread_t thread;

void *thread_code(void *t)
{
    struct timespec next;
    int count = 0;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000*1000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        count++;
        if (!(count % 1000))
            printf("woke %d times\n", count);
    }

    return NULL;
}

int main(void)
{
    pthread_create(&thread, NULL, thread_code, (void *)0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);

    return 0;
}

Again, everything starts with a normal main() function. A standard


thread is spawned right away (pthread attributes will be covered later), and
the code calls rtl_main_wait(). This is simply a blocking function that keeps
the application suspended until it is shut down. For those of
you who have ever done graphical applications with a main event loop, the
same concept applies here.
If the application is killed (via CTRL-C or otherwise), the waiting call
will complete, and the rest of the function will cancel the thread, join its
resources, and return.
The thread itself is a hard real-time thread running under RTCore that
executes on an exact 1 millisecond period. It samples the current time,
adds 1 millisecond to that value, and sleeps until that time arrives. It
counts the number of wakeups, and prints a count every 1000 iterations.
(1000 printf() calls per second clutters the terminal pretty quickly.) This
thread will execute indefinitely, until the application is actively unloaded.
Details follow later, but it is important to note that code in the main()
routine is inherently non-real-time. Any potentially non-real-time activity
should be done here, such as memory allocation and other initialization tasks.
(We’ll cover why memory allocation is a potentially non-real-time activity in
a later chapter.)

2.2.3 Basic communication


There needs to be some communication from one real-time thread to another,
and also between real-time threads and non-real-time threads, such as Linux
processes. Later chapters will discuss this in more detail, but here we’ll just
look at the simplest of mechanisms, the FIFO.
Real-time FIFOs are just like any other FIFO device: a producer (whether
it is a real-time thread or a userspace application) pushes data in, and a
consumer receives it in the order it was submitted. Real-time FIFOs are
constructed such that real-time threads never block on data submission;
they always perform the write() and move on as quickly as possible. This
way real-time applications can never be stalled by the FIFO's state.
First, here is the real-time component:

#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>

pthread_t thread;
int fd1;

void *thread_code(void *t)
{
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000*1000*1000);

        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);

        write(fd1, "a message\n", strlen("a message\n"));
    }

    return NULL;
}

int main(void)
{
    mkfifo("/communicator", 0666);

    fd1 = open("/communicator", O_RDWR | O_NONBLOCK);

    ftruncate(fd1, 16<<10);

    pthread_create(&thread, NULL, thread_code, (void *)0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);

    close(fd1);
    unlink("/communicator");

    return 0;
}

This code starts up and creates the FIFO with standard POSIX calls.
mkfifo() creates the FIFO with permissions such that a device will appear
in the GPOS filesystem dynamically. We then open the file normally and
call ftruncate() to size it; this sets the 'depth' of the FIFO.
A thread is spawned, we wait to be shut down, and the main code is done.
Once rtl_main_wait() completes, we need to close and unlink the FIFO in
addition to the thread cleanup, just like any normal file. RTCore will catch
dangling devices and clean them up for the user, but good programming
practice is to do the work right in the first place.
Our thread in this instance sleeps on a one second interval and writes
to the FIFO every time it wakes up. As before, it will do this indefinitely.
There are no real surprises here, so let’s look at the userspace code:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int fd, len;
    char buf[255];

    fd = open("/communicator", O_RDONLY);
    while (1) {
        len = read(fd, buf, sizeof(buf) - 1);
        if (len > 0) {
            buf[len] = '\0';  /* terminate before printing as a string */
            printf("%s", buf);
        }
        sleep(1);
    }
}

Again, there should be no surprises here: this is a normal non-real-time
userspace application. It opens the other end of the FIFO and reads
periodically, getting the message from the other end. This could have been
some device data protocol from the RTOS, the userspace application could
write data up to the RTOS to direct thread execution, or FIFOs could be
used between real-time threads. In any case, they provide a simple file-based
means of exchanging data.

2.2.4 Signalling and multithreading


Communication between threads is also done via standard POSIX mecha-
nisms. Again, all of the different means are covered later, but here let’s look
at semaphores, which are a very convenient method of signalling between
threads. Here’s an example:

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <time.h>
#include <semaphore.h>

pthread_t wait_thread;
pthread_t post_thread;
sem_t sema;

void *wait_code(void *t)
{
	while (1) {
		sem_wait(&sema);
		printf("Waiter woke on a post\n");
	}
}

void *post_code(void *t)
{
	struct timespec next;

	clock_gettime(CLOCK_REALTIME, &next);

	while (1) {
		timespec_add_ns(&next, 1000*1000*1000);
		clock_nanosleep(CLOCK_REALTIME,
				TIMER_ABSTIME, &next, NULL);
		printf("Posting to the semaphore\n");
		sem_post(&sema);
	}

	return NULL;
}

int main(void)
{
	sem_init(&sema, 1, 0);

	pthread_create(&wait_thread, NULL, wait_code, (void *)0);
	pthread_create(&post_thread, NULL, post_code, (void *)0);

	rtl_main_wait();

	pthread_cancel(post_thread);
	pthread_join(post_thread, NULL);

	pthread_cancel(wait_thread);
	pthread_join(wait_thread, NULL);

	sem_destroy(&sema);

	return 0;
}

Instead of a single thread, two are spun up once the semaphore is
initialized. One thread waits on the semaphore, while the other sleeps
and periodically performs the sem_post() operation. Before the post
occurs, and after the waiter wakes, a message is printed to indicate the
sequence of events.
Semaphores really are that easy - we’ll see how they can be used later on
to very easily handle synchronization problems.

2.3 Perspective
At this point, take a step back and look at what we’ve just covered. In a
short introduction, you’ve seen code that performs standard output, POSIX
threads, communication through real-time devices, and synchronization through
standard POSIX semaphores. None of it required much experience beyond
basic knowledge of C and POSIX, and a little bit of UNIX background. In
fact, these applications are no different than what you would see in a normal
C environment under another UNIX. The difference here is that you get hard
real-time response in your threads.
The point of this was to get you, the reader, handling useful code as
quickly as possible, easing the stigma surrounding real-time programming.
Now that you see that it doesn’t involve occult knowledge, we’ll step back
and take a broader view of RTCore, the API and some of the grounding
principles of real-time programming.
Chapter 3

Real-time Concepts and RTCore

You’ve now seen some basic RTCore code, and can see that real-time pro-
gramming isn’t as mystifying as it sounds. However, before we dive into
detailed coverage of the API, some basic concepts need to be demonstrated.
For those familiar with RTOS concepts, most of this chapter should be
review, but skimming is recommended as we will be explaining how RTCore
handles real-time problems.

3.1 RTOS kingdom/phylum/order


The definition of real-time varies greatly based on its use. Anything from
stock quotes to stepper motors can be said to be real-time. Within the
computing industry, real-time has many different meanings depending on
the requisite service level. Here is a simple breakdown of operating systems
in relation to real-time applicability.

3.1.1 Non-real-time systems


”Non-real-time” systems are the operating systems most often used. These
systems have no hard guarantees and are able to utilize optimization
strategies contradictory to real-time requirements, such as caching and
buffering. Non-real-time systems have the following characteristics:

27

• No guaranteed worst-case scheduling jitter. Under heavy system load,
the system may defer scheduling of a task as long as it deems necessary.

• No theoretical limit on interrupt response times. System load may
result in delayed interrupt response. Also, running with interrupts
disabled for considerable periods, while considered to be bad form, is
not catastrophic.

• No guarantee that an event will be handled. Varying the system load
affects the number of events it intercepts, such as interrupts.

• System response is strongly load-dependent. Tasks that take x amount
of time under one system load will take y amount of time under a
different load. Response prediction with any surety is generally
impossible.

• System timing is an unmanaged resource. This means that timing data
is not considered important to system execution, and is not tracked
with precision.
Non-real-time systems are unpredictable even at a statistical level, as
system reaction is highly dependent on system load. Rough predictions can
be made if the error window is opened widely, but the results cannot be
proven to fall inside the predicted range.

3.1.2 Soft real-time


In cases where missing an event is not critical, as in a video application where
a missed frame or two is not fatal, a ”soft real-time” system may do. Such a
system is characterized by the following criteria:
• The system can guarantee a rough worst-case average jitter, but not
an absolute worst case scenario.
• Events may still be missed occasionally. This is better than the non
real-time system, as there is more control over response, but as the
absolute worst case is unknown, events such as interrupts may still be
lost. This may occur even when not in a worst case situation.
Soft real-time systems are statistically predictable for the average case,
but a single event can not be predicted reliably. Soft real-time systems are
generally not suited for handling mission-critical events.

3.1.3 Hard real-time


The following list of requirements defines a hard real-time system. The
absence of predictability for any one of these items disqualifies a
system from being a hard real-time system.

• System time is a managed resource. Timing resources are managed with
the highest possible level of precision.

• Guaranteed worst-case scheduling jitter. If a task needs to happen
within a certain deviation, it is guaranteed to occur.

• Guaranteed maximum interrupt response time. As with scheduling
latency, interrupts are guaranteed to be acknowledged and handled
within a certain window.

• No real-time event is ever missed. This is important. Under no
circumstances will a scheduled task fail to run on time, an interrupt
be missed, or any other event the real-time code is interested in be
lost.

• System response is load-independent. Execution of real-time tasks is
guaranteed to fall within the worst-case value range, regardless of
the system load factor. A thrashing database process will not delay
movement of a robotic arm.

A system that can fulfill these criteria is fully deterministic and considered
to be ”hard” real-time. Of course, there are varying levels of service, as some
hard real-time systems might have a worst case jitter of 2 seconds, while
others provide 25 microseconds. Both qualify according to the definition, but
only one is usable for a wide range of applications. The RTCore approach
qualifies on all of these counts, as response time is near the limits of the
underlying hardware.
Hard real-time systems will generally have slightly lower average
performance than soft real-time systems, which in turn are generally not
as efficient with resources as non-real-time systems. This is because
non-real-time systems are concerned with throughput - if an Ethernet
transfer is delayed a little in order to burst out several disk
transfers, this results in higher system output, and has no significant
repercussions in a non-real-time environment. In a hard real-time
system, not performing this optimization results in lower overall
throughput, but it maintains determinism. This determinism is what

makes the difference between getting your task done without fail and doing
a ”best effort” based on available system resources.

3.2 The RTOS design dilemma


The fundamental problem of an RTOS is that users have conflicting demands
with respect to system design. On one hand, an RTOS should obviously be
capable of real-time operations. On the other hand, users want access to
the same rich feature sets found in general-purpose operating systems which
run on desktop PCs and workstations. To resolve this dilemma, two general
concepts have traditionally been used.

3.2.1 Expand an RTOS


Design guidelines for an RTOS include the following: It needs to be compact,
predictable and efficient; it should not need to manage an excessive number
of resources, and it should not be dependent on any dynamically allocated
resources. If one expands a compact RTOS to incorporate the features of
typical desktop systems, it is hard (if not impossible) to fulfill the demands
of the core RTOS. Problems that arise from this approach include:

• The OS becomes very complex. This makes it difficult to ensure
determinism, since ALL core capabilities must be fully preemptive. This
means that all developers must now take into account every possible
real-time demand in addition to solving problems in their specific
domain.

• Drivers for hardware become very complex. Since priority inversion
must not occur, drivers must be able to handle situations in which they
are not being serviced. Again, this forces all developers to deal with
additional possibilities outside their domain.

• Since the core system is an RTOS, the vast amount of available
software cannot (in most cases) be used without modification or at
least significant analysis with respect to real-time demands. It is
almost impossible to determine interactions between the software and
the RTOS.

• Many mechanisms for efficiency, like caching and queuing, become
problematic. This prohibits usage of many typical optimization
strategies for the non-real-time applications in the system.
• Maintenance costs of such a system are considerable for both developers
and customers. Since every component of the system can influence
the entire system’s behavior, it is very hard to evaluate updates and
modifications with respect to how they will influence real-time behavior
of the rest of the system. Engineering costs skyrocket as reliability
becomes questionable.

3.2.2 Make a general purpose OS real-time capable


The seemingly natural alternative strategy would be to add real-time
capabilities to a general purpose OS. In practice, this approach meets
constraints similar to those noted above, as both are converging on the
same idea from different directions. Problems that arise with such an
approach include:
• General purpose operating systems are event-driven, not time-triggered.
• General Purpose OSs are (generally) not fully preemptive systems.
Making them fully preemptive requires modifications to all hardware
drivers and to all resource handling code. For a constantly evolving
system, tracking these modifications is prohibitive from a manpower
perspective, and becomes even more difficult as the OS is patched and
modified in the field. In addition, preemption has been found to reduce
throughput and response characteristics in many common scenarios.

• Lack of built-in high-resolution timing functions entails substantial
system modification.
• Modifying existing applications to be preemptive is very costly and
error-prone.
• The use of modified applications would also greatly increase
maintenance costs.
• Optimization strategies used in general purpose OSes can contradict
the real-time requirements. For example, removing all caching and
queueing from an OS would substantially degrade performance in areas
where there are no real-time demands.

• Because such systems are very complex (and often not well-documented),
it is extremely difficult to reliably achieve full preemption, especially
without performance degradation in many usage scenarios. Add in the
fact that the system is constantly developing, and the problem worsens.

General purpose operating systems are efficient with resources. Because
they don't manage time as an explicit resource, trying to modify the system
to do so violates many of its design goals, and causes components to be used
in ways they were never designed for. This is in principle a bad strategy,
especially when there are many developers, all with different visions of what
the exact behavior of the machine should be.

3.2.3 The RTCore approach to the problem


To resolve these conflicting demands, a simple solution has been
developed. RTCore splits the OS entirely, so that one kernel (Linux or
BSD UNIX) runs as a general purpose system (GPOS) with no hard real-time
capabilities but with a large capability set, and a second kernel
(RTCore), designed around real-time capabilities, efficiently handles
real-time work. The real-time kernel
allows the GPOS to run when there are no real-time demands. This approach
allows the non-real-time side of the OS to provide all the capabilities that
desktop users are used to, while the real-time side can be kept small, fast,
deterministic, and verifiable.
Three major attributes make RTCore work:

• It disables all hardware interrupts in the GPOS.

• It provides interrupts via interrupt emulation.

• It runs full featured non-real-time Linux (or BSD) as the lowest priority
task. It is the ”idle task” of the RTOS, meaning that it is run whenever
the real-time system has nothing else to execute.

3.3 Interrupt emulation


The main problem in adding hard real-time capabilities to a general purpose
operating system is that the disabling of interrupts is widely used in the
kernel for synchronization purposes. The strategy of disabling interrupts in

critical code sequences (as opposed to using synchronization mechanisms
like semaphores or mutexes) is quite efficient. It also makes code
simpler, since it does not need to be designed to be reentrant. But
disabling interrupts for long periods results in lost events.
To maintain the structure of the GPOS kernel while providing real-time
capabilities, one must provide an ”interrupt interface” that gives full
control over interrupts, but at the same time appears to the rest of the
system like regular hardware interrupts. This interrupt interface is
essentially an interrupt emulation layer, and is one of the core
concepts in RTCore. Interrupt emulation is achieved by replacing all
occurrences of sti and cli with emulation code. This introduces a
software layer between the hardware interrupt controller and the GPOS
kernel, allowing the real-time kernel to handle interrupts as needed by
real-time code, but still allowing the general purpose OS to handle them
if there is a need.
Interrupts that are not destined for a real-time task must be passed on
to the GPOS kernel for proper handling when there is time to deal with
them. In other words, RTCore has full control over the hardware, and the
non-real-time GPOS sees soft interrupts, not the ”real” interrupts.
Hardware interrupt interaction is simply emulated in the GPOS. This
means that there is no need to recode GPOS drivers, provided there are
no hard-coded instructions in binary-only drivers that bypass the
emulation. (See E.0.2 for details.)

3.3.1 Flow of control on interrupt


What happens when an interrupt occurs in RTCore? The following
pseudocode shows how RTCore handles such an event.

if (there is an RT-handler for the interrupt) {
	call the RT-handler
}
if (there is a GPOS-handler for the interrupt) {
	call the GPOS handler for this interrupt
} else {
	mark the interrupt as pending
}

This pseudocode represents the priority introduced by the emulation
layer between hardware and the GPOS kernel. If there is a real-time
handler available, it is called. After this handler is processed, the
GPOS handler is called. This calling of the GPOS handler is done
indirectly: it runs as the idle task of the RTCore kernel, so the GPOS
handler will be called as soon as there is time to do so, but a GPOS
interrupt handler cannot block RTCore. That is, the interrupt handler
for the GPOS is called from within the GPOS, not from RTCore. If the
interrupt is deferred to the GPOS and its interrupt handler is executing
when a real-time task must run, the real-time kernel will suspend the
handler's execution, and the real-time code will execute as needed.

3.3.2 Limits of interrupt emulation


Interrupt emulation does have limits. Even for non-real-time interrupts,
the system must take the time to acknowledge the interrupt controller
and record the fact that the interrupt has happened. The hardware
interrupts have priority over the real-time tasks, and so a GPOS
hardware interrupt may disturb the real-time scheduling. Fortunately,
the actual code is well optimized and has very little impact even on
older platforms. Also, the system works in such a way that a particular
GPOS interrupt cannot preempt the real-time system more than once per
period of real-time activity. Therefore, the worst-case scheduling
jitter that can be attributed to non-real-time hardware interrupts is
bounded by the number of such interrupts that can be received by the
current CPU, multiplied by the maximum time to acknowledge an interrupt
and record its occurrence. The CPU reservation facility (see Section
4.8.4) can eliminate this and minimize other sources of scheduling
jitter, making for excellent real-time performance on SMP systems. The
RTCore advance timer option (Section 4.8.1) may be used to improve
performance on both uniprocessor and multiprocessor systems.
Since non-real-time activity may have an effect on worst-case timings in
the system (e.g. ping flooding a system while running a critical
real-time task may shift its timings), the worst possible conditions
should be used to test a system's worst-case scheduling jitter and
interrupt response time.
In later chapters, and Appendix H, we will cover some basic testing
environments you can use to stress your hardware.

3.4 Services Available to Real-Time Code


Code run in the RTCore real-time kernel does not exist in a vacuum. Services
are available to real-time code, although applicability may vary depending
on system configuration, real-time demands, and RTLinuxPro components
available. In later chapters, we will cover these components in detail, but a
few notes are in order to aid in the understanding of the examples.

3.4.1 Memory management


Strictly speaking, there is no memory management from within RTCore. The
reason for this is that memory allocation is difficult to manage
deterministically with respect to real-time demands. Simple memory
allocators can be written for known usage patterns, and some users have
written basic systems for their own applications. When applied to the
generic case, however, the problem becomes difficult to provably handle.
There are alternatives for real-time code:

• Allocate memory in initialization code, during execution of main().
As this is in the startup context, not in a real-time thread, interrupt
handler, or otherwise, it is perfectly safe. Also, the memory could be
declared as static to the module.

• Soft-IRQs: We will discuss this in detail later, but this approach
involves creating a virtual IRQ that is visible to the GPOS. Using this
IRQ, real-time code can signal to the non-real-time system that it
needs memory, and a handler on the other side safely takes care of the
possible blocking when allocating the memory. When this operation
completes, a signal is sent back to the RTCore code. There may be any
amount of delay before this handler gets to do the work, though.

• Simple memory allocators/deallocators. As mentioned previously, it is
possible to write a deterministic memory allocator if you know the
usage patterns of the code that will need it.

This point is important, and should be considered carefully. Many users
need to do work involving memory allocation, and do not always
understand what is safe in which context. Whenever possible, perform
allocations within main(), along with any potentially blocking kernel
calls, such as PCI device initialization. This also includes RTCore
calls such as ftruncate(fd, size) on FIFOs and shared memory, which
involves a kernel memory allocation to create space for device data.
Put simply, you cannot safely perform allocations from within real-time
code, including threads and interrupt handlers. Calls chaining directly
from main() are safe to allocate from.
Later on, in Chapter 4, we will demonstrate how to use preallocated
memory in order to spawn new threads from within real-time code. This
involves exactly what we describe here - allocating a block of memory before
you enter the real-time system, and then using it safely later on, when in the
context of real-time code.

3.4.2 Networking - Ethernet and FireWire


RTLinuxPro offers a component called Light Net (LNet) allowing real-time
networking, from raw packets up to UDP, over Ethernet or FireWire. This
allows one to easily create and send raw packets destined for the network, for
hard real-time communication with other machines. Both transport mediums
are in heavy field use by FSMLabs’ customers.
Of course, you can still interact with other machines without LNet, but
the networking stacks are all dependent on the GPOS. The traditional path
is for data to be collected by real-time code, pushed over a FIFO or shared
memory to userspace, which then does any packaging work and pushes it
through the network stack via a socket. With LNet, your real-time data
can be collected and dumped to the hardware through a zero-copy interface
immediately, allowing deterministic network transfers between machines, and
saving the trouble of going to userspace and back through kernelspace. Later
chapters will cover this in detail, but for now begin to consider the idea
that individual real-time systems do not have to operate without real-time
assistance from other processing nodes.

3.4.3 Integration with other services


As we will cover in a later chapter, the Controls Kit offers a means of
integrating low-level real-time systems with the rest of your
organization. Components that use the Controls Kit can be directed
through web interfaces, Excel spreadsheets, and other systems. This
simplifies integration with existing infrastructure - for example, now
it's easy to let your Oracle database retain statistical information on
how your machine floor devices are doing. Later on, we'll cover the
capabilities of this package in detail.

3.4.4 What’s next


At this point, let’s shift gears and cover the RTCore programming API. As it
is POSIX-based, it provides few surprises, but it needs some coverage before
diving into more advanced topics and techniques. We recommend at least
skimming the API sections even if you are familiar with POSIX, as there are
some areas that RTCore’s API covers but POSIX does not handle. After this
chapter, there will be many more examples and techniques for more advanced
work.
Chapter 4

The RTCore API

The RTCore API is POSIX-based with some extensions. The API continues to
evolve to reflect new needs in the industry, but compatibility with
previous releases is provided. Current efforts include continued POSIX
compliance, along with some extensions to cover needs either not
mentioned by POSIX, or not sufficiently addressed in current standards
specifications.

4.1 POSIX compliance


To ease the real-time learning curve, FSMLabs long ago moved RTLinux
(and thus RTCore) to a POSIX-compliant API. Most developers learning
a real-time system have a solid programming background, and only need
to adjust to the specific API set provided by the RTOS. With RTCore, this
adjustment comes for free, as code under RTCore looks familiar to just about
anyone who has used a UNIX.
It should be noted that the POSIX standard has evolved and will continue
to do so, but in a controlled manner. FSMLabs will continue to maintain
POSIX compliance in light of new developments. Existing POSIX-based
systems are easily moved to RTCore, although source-code compatibility
with other RTOSs should not be expected. Source compatibility is provided
when moving between RTLinuxPro and RTCoreBSD, as both use the POSIX
API.
RTCore provides POSIX extensions when needed (indicated by an np suffix
in the name) to implement features that fall outside of the POSIX
domain.


These are mainly relegated to performance improvements in areas such as
SMP where POSIX does not provide full guidance. Some of these may not be
an option for those in strict development environments, but it is up to
the programmer to determine the best approach.

4.1.1 The POSIX PSE 51 standard


The guiding standard for the RTCore API was POSIX PSE 51, a minimum set
of POSIX threading functions for real-time and embedded systems.
Programmers who have learned the various pthread_*() calls for normal
threading and synchronization will have the same function set they are
used to. The major shift is to keep in mind the constraints of
timing-sensitive real-time code, such as scheduling, minimalism, and
other real-time-specific demands, but the programmer will not be
burdened with learning a new API.
RTCore’s API is designed to be used from within real-time code, so as
a user, calling a POSIX function means that you’re entering a hard real-
time function that was designed to be used in this fashion. However, due
to some interactions with the GPOS, a few calls can only be used from an
initialization context. Please refer to the RTCore man pages for specifics.

4.1.2 Roadmap to future API development


The RTCore OS will continue to follow the POSIX standard in order to
maintain a proper model for the developer community. At the same time,
there will continue to be a need for extensions, as POSIX does not cover
all of the possible industry needs. Sometimes these ideas are moved into
later versions of the POSIX standard. Some are specific to a certain system
configuration, such as SMP systems, where CPU affinity calls are needed for
performance reasons. In some cases, extensions are added in order to simplify
work that could be done with standard calls. These extensions are presented
as an option that may facilitate development, but most work revolves around
the POSIX calls.

4.2 POSIX threading functions


Here we present the POSIX functions available from RTCore, a brief
description of what they do, and some notes with respect to real-time
usage. These calls are used throughout the examples in the book, and you
should be able to get a good practical grasp of their usage from these.
For specific notes, refer to the man pages, provided in various forms
with RTLinuxPro. 1

4.2.1 Thread creation


int pthread_create(pthread_t *thread, pthread_attr_t *attr,
void *(*start_routine)(void *), void *arg);

This will create a thread whose handle is stored in *thread. The
thread's execution will begin in the start_routine() function with the
argument arg. Attributes controlling the thread are specified by attr;
if this value is NULL, the defaults are used and a stack is created
internally.
Note that pthread_create() calls are generally limited to being within
the initialization context of main(). If the call is needed during
normal real-time operation, threads can be created with preallocated
stack space. Otherwise, calling pthread_create() from another real-time
thread would at worst cause deadlock, and at best delay the first
real-time thread an unknown amount while memory is allocated for the
stack.
There is an attribute function (pthread_attr_setstackaddr()) that
allows a thread to be prepared with a preallocated stack for operation.
Let's look at an example:

#include <time.h>
#include <pthread.h>
#include <stdio.h>

pthread_t thread1, thread2;
void *thread_stack;

void *handler(void *arg)
{
	printf("Thread %d started\n", arg);
	if (arg == 0) { //first thread spawns the second
		pthread_attr_t attr;
		pthread_attr_init(&attr);
		pthread_attr_setstacksize(&attr, 32768);
		pthread_attr_setstackaddr(&attr, thread_stack);
		pthread_create(&thread2, &attr, handler, (void *)1);
	}

	return 0;
}

int main(int argc, char **argv)
{
	thread_stack = rtl_gpos_malloc(32768);

	if (!thread_stack)
		return -1;

	pthread_create(&thread1, NULL, handler, (void *)0);

	rtl_main_wait();

	pthread_cancel(thread1);
	pthread_join(thread1, NULL);
	pthread_cancel(thread2);
	pthread_join(thread2, NULL);
	rtl_gpos_free(thread_stack);
	return 0;
}

1 For a full description of the POSIX threading API concepts and usage,
refer to the O'Reilly book on PThreads Programming or the POSIX
standard directly.

This again demonstrates the point that anything outside of the main()
call cannot directly allocate memory. Instead, we allocate a stack with
rtl_gpos_malloc() (which uses the correct malloc() available on the host
GPOS) in main(), where it is safe to block while the system handles any
work associated with the allocation, such as defragmentation. Note that
on some architectures a global static value may not be a safe place to
store the stack of a running thread.
Next, a real-time thread is spawned. Within the handler function, it
initializes an attribute and configures it to use our preallocated area
for the stack. Finally, we spawn the thread, and execution occurs just
as you would expect POSIX calls to behave, the exception being that the
stack is already present. Note: a thread created with pthread_create()
is not guaranteed to have started when the call returns; it is just
slated for initial scheduling.
Note that thread stacks in RTCore are static, and will not grow as
needed depending on call sequence. Users need to make sure that they
create enough stack space for the thread, and prevent too many large
structures from being placed on the stack. In a system that allows for
dynamic memory management and the possible delays incurred by doing so,
stacks can dynamically grow as the application needs space. Under
RTCore, growing the stack would require the program to wait while proper
memory is found, possibly destroying real-time performance. Instead, the
stack is allocated at thread creation and does not grow.
This stack is generally only a couple dozen kilobytes in size, but users
with large data structures in function contexts need to understand that
these structures can soak up available stack space very quickly, causing
an overflow. If a thread has a 20K stack, and calls a function 3 times
recursively, with a local structure of 7K per invocation, an overflow
will occur. Smaller structures should be used, or large structures
should be kept off the stack, or the thread's stack should be enlarged
to compensate.

4.2.2 Thread joining


int pthread_join(pthread_t thread, void **arg);

This joins with a running thread, storing the return value into arg, and
has no restriction on the length of time it takes to complete. If the
thread has already completed, this call returns immediately; otherwise
it blocks until the intended thread exits. As expected, this frees
resources associated with the thread, such as the stack, if it was not
configured by hand. If you look at our previous example, you can see
that we use this call to join both a preallocated-stack thread and a
normal thread, and it cleans up the resources for both, except for the
stack on the second thread, which we explicitly have to free.

int pthread_detach(pthread_t thread);

The pthread detach() call will ’unhook’ a running thread whose status
was previously joinable. After the thread is detached, it is no longer joinable,
and needs no further management. Its resources will be cleaned up on thread
completion.

4.2.3 Thread destruction


int pthread_cancel(pthread_t thread);
This will cancel the thread specified by the given parameter. There are
many caveats to this as specified in the full man page, such as the fact that
as a cancelled thread works through its cancel handlers, it is not required to
release any mutex locks it holds at the point of cancellation. (Though this
is a good idea to do if you want a stable system.) Also, it may not cancel
immediately, depending on the state the thread is in at the point of the call.
The target thread will continue to execute until it enters a cancellation point,
when it will begin to unwind itself through its registered cancel handlers.
For most users, pthread_cancel() followed by a pthread_join() is most
effective as a means of shutting down real-time code from within the tail end
of main().
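As a sketch of that shutdown pattern - written here as a plain userspace POSIX program with our own illustrative names, where under RTCore the same calls would typically sit at the tail end of main():

```c
#include <pthread.h>
#include <time.h>
#include <stddef.h>

/* Thread body: loops forever. nanosleep() is a POSIX cancellation
 * point, so a deferred cancel is delivered there. */
static void *spin(void *arg)
{
    (void)arg;
    for (;;) {
        struct timespec ts = { 0, 1000000 };  /* 1 ms */
        nanosleep(&ts, NULL);
    }
    return NULL;
}

/* Cancel-then-join shutdown: request cancellation, then join to
 * reclaim the stack and other per-thread resources. */
int shutdown_demo(void)
{
    pthread_t t;
    void *ret;

    if (pthread_create(&t, NULL, spin, NULL) != 0)
        return -1;
    if (pthread_cancel(t) != 0)
        return -1;
    if (pthread_join(t, &ret) != 0)
        return -1;
    /* A cancelled thread reports PTHREAD_CANCELED as its result. */
    return ret == PTHREAD_CANCELED ? 0 : -1;
}
```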

4.2.4 Thread management


pthread_t pthread_self(void);
This is a very simple function, generally used by threads to get their own
thread handle for further calls.
int pthread_setcancelstate(int state, int *oldstate);
int pthread_setcanceltype(int type, int *oldtype);
Threads may use pthread_setcancelstate() to disable cancella-
tion for themselves. The previous state is stored in the oldstate vari-
able. Likewise, the pthread_setcanceltype() call is used to determine
the type of cancellation used, either PTHREAD_CANCEL_DEFERRED or
PTHREAD_CANCEL_ASYNCHRONOUS. In real-time environments, how-
ever, most systems have a minimal set of simple, continuous threads, and do
not make heavy use of cancellation calls.
void pthread_testcancel(void);
This call ensures that any pending cancellation requests are delivered to
the thread. It has little use in real-time applications, as cancellation must be
a deterministic call in the first place. If there are ambiguities present in the
code, it may be better to remove them, rather than being forced to check if
the real-time thread should continue.

int pthread_kill(pthread_t thread, int signo);

pthread_kill() sends the signal specified by signo to the specified
thread. This is fast and deterministic if called on a thread running on the local
CPU, but there can be a delay when signalling a thread on a remote CPU.

4.2.5 Thread attribute functions


In addition to the normal thread calls, RTCore also exposes the pthread_attr_*()
functions, which control attributes of a thread. These functions behave as
they would in any other situation, and we refer you to the standard docu-
mentation for more detail.

int pthread_attr_init(pthread_attr_t *attr);


int pthread_attr_destroy(pthread_attr_t *attr);

These two functions initialize and destroy attribute objects, respectively.


Attribute objects should be created or destroyed with these calls, not by
hand.

int pthread_attr_setstacksize(pthread_attr_t *attr,


size_t stacksize);
int pthread_attr_getstacksize(pthread_attr_t *attr,
size_t *stacksize);

Programmers can use these calls to manipulate the stack size of the thread
the attribute is tied to. Note that this must be done within the main()
context, where memory management is possible. Refer back to our example
in Section 4.2.1 for details, both on this and the pthread_attr_setstackaddr()
call. If these attributes are not set, the RTCore OS will handle the stack
manipulation internally.
Again, note that thread stacks under RTCore are static, and will not
grow as needed based on what functions are called. Users need to ensure
that they have enough stack space for their thread from the start. Section
4.2.1 has more details.

int pthread_attr_setschedparam(pthread_attr_t *attr,


const struct sched_param *param);
int pthread_attr_getschedparam(pthread_attr_t *attr,
struct sched_param *param);

As with normal POSIX threads, these two routines determine scheduling


parameters as driven by the contents of the param parameter. Also, as usual,
use the sched_get_priority_min() and sched_get_priority_max() calls
with the appropriate scheduling policy to get the priority ranges. SCHED_FIFO
is the default scheduling mechanism, and while it does not have to be
specified, doing so helps ensure forward compatibility.

int pthread_attr_setstackaddr(pthread_attr_t *attr,


void *stackaddr);
int pthread_attr_getstackaddr(pthread_attr_t *attr,
void **stackaddr);

These calls are important when creating threads from within the real-time
kernel. As there is no memory management, threads need to be spawned us-
ing preallocated memory areas. By using these calls to manage the stack
address, one can create threads from inside the real-time kernel. We’ve al-
ready seen this used in the thread creation example, and as you can see, it
is not difficult to manage.
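For illustration, here is the same preallocated-stack pattern in plain userspace POSIX. This sketch uses pthread_attr_setstack(), which combines the setstackaddr()/setstacksize() pair; the buffer size, page alignment, and function names are our own assumptions:

```c
#include <pthread.h>
#include <stddef.h>

#define STACK_SIZE (64 * 1024)

/* Statically allocated stack area. Alignment requirements are
 * platform-specific; page alignment is a safe common choice. */
static char stack_area[STACK_SIZE] __attribute__((aligned(4096)));

static void *worker(void *arg) { return arg; }

/* Spawn a thread on a caller-provided stack - the pattern needed
 * when threads are created where no allocator is available. */
int prealloc_stack_demo(void)
{
    pthread_attr_t attr;
    pthread_t t;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    /* Hand the preallocated region to the thread as its stack. */
    if (pthread_attr_setstack(&attr, stack_area, STACK_SIZE) != 0)
        return -1;
    if (pthread_create(&t, &attr, worker, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}
```

Note that a given stack region can back only one thread at a time; joining the thread does not free a static region, which is why this works without an allocator.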

int pthread_attr_setdetachstate(pthread_attr_t *attr,


int detachstate);
int pthread_attr_getdetachstate(const pthread_attr_t *attr,
int *detachstate);

Use these two calls to switch a thread's joinable state from
PTHREAD_CREATE_JOINABLE to PTHREAD_CREATE_DETACHED. Alternatively,
the pthread_detach() call can be used to alter a running thread's state.

4.3 Synchronization
4.3.1 POSIX spinlocks
RTCore provides support for the POSIX spinlock functions too. The API is
much like other POSIX objects - there is an initialization/destruction set:

int pthread_spin_init(pthread_spinlock_t *lock, int pshared);


int pthread_spin_destroy(pthread_spinlock_t *lock);

As with other similar calls, these initialize or destroy a given spinlock -


no surprises there. The following calls are also supported:

int pthread_spin_lock(pthread_spinlock_t *lock);
int pthread_spin_trylock(pthread_spinlock_t *lock);
int pthread_spin_unlock(pthread_spinlock_t *lock);

Again, no surprises - these calls allow you to take a lock, try to take it but
return if the lock is already held, and unlock a given spinlock, respectively.
These behave like other spinlocks - they will spin a given thread in a busy
loop waiting for the resource, rather than putting it on a wait queue to be
woken up later.
As a result, the same spinlock caveats apply - they are generally only
preferable to other synchronization methods when the given thread will spin
a shorter amount of time waiting than the sum of the work involved in putting
it on a queue (and any associated locking), and waking it up appropriately
when the resource becomes available. In a real-time system, it is also of
course important that the resource is available quickly so the thread does
not lose determinism due to a faulty locking chain in other subsystems.

4.3.2 Comments on SMP safe/unsafe functions


The functions described here are inherently safe in SMP situations, although
there are real-time considerations. For calls that target threads running on
other CPUs, there may be a delay in getting the signal to the running code.
pthread_cancel() and pthread_kill() are two examples of this - when
sending a signal to code on the current CPU, the code is fast and deterministic,
but may delay slightly when targeting a 'remote' thread. While in
normal situations this is unimportant, the incurred delay may have reper-
cussions for real-time code. Keep these factors in mind when writing the
real-time component of your application - it may help to reconfigure which
CPUs run which threads.

4.3.3 Asynchronously unsafe functions


Some functions are not asynchronously safe, at least in a real-time environment.
To ensure correct behavior, pthread_cancel() is not recommended
for threads that use any of these functions. By 'asynchronously unsafe' we

mean calls that may leave the system in an unknown state if the call is in-
terrupted in the middle of execution. An example would be a function that
locks several mutexes in order to do work, and installs no cleanup handlers.
If the call is halfway through and is cancelled by a remote pthread_cancel()
call, that thread will exit while holding some mutexes, potentially blocking
other threads indefinitely.
It is possible to handle mutex cleanups in a safe manner if one pushes
cleanup handlers for all shared resources, but this is complicated. Extreme
care must be taken to ensure that held resources are freed in a manner
that doesn’t incur locking, and that everything is cleaned properly for every
possible means of failure. Failing to get this correct will leave all waiting
threads blocked forever, as the cancelled thread will terminate with locked
resources left behind.

4.3.4 Cancel handlers


We’ve already mentioned these a couple of times, and will continue to do so
as we cover more of the API. These calls are difficult to get right in all cases,
and many developers don’t come into contact with them too often. In the
interest of sidestepping future confusion and grounding the discussion, we
will now diverge into a short example.
Put simply, cancel handlers are hooks attached to a running thread, as
functions, and are executed in the case that a thread is cancelled while a
resource is held. The handlers are pushed onto a stack, so that if the thread
is cancelled, the handlers are executed in the reverse of the order in which
they were pushed.
Also, a cancelled thread does not execute cleanup functions at the time the
cancel is received; rather, it continues execution until it enters a 'cancellation
point', which is generally a blocking function. Refer to the POSIX specification
for specific cancellation points, but this generally means that code will
continue to execute until it hits a blocking call like pthread_cond_wait().
Let’s look at an example:

#include <time.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>

pthread_t thread;
pthread_mutex_t mutex;

void cleanup_handler(void *mutex)
{
        pthread_mutex_unlock((pthread_mutex_t *)mutex);
}

void *thread_handler(void *arg)
{
        pthread_cleanup_push(cleanup_handler, &mutex);
        pthread_mutex_lock(&mutex);
        while (1) { usleep(1000000); }
        pthread_cleanup_pop(0);
        pthread_mutex_unlock(&mutex);
        return 0;
}

int main(int argc, char **argv)
{
        pthread_mutex_init(&mutex, NULL);
        pthread_create(&thread, NULL, thread_handler, 0);

        rtl_main_wait();

        pthread_cancel(thread);
        pthread_join(thread, NULL);
        pthread_mutex_destroy(&mutex);
        return 0;
}

This code correctly handles the cancellation problem. In our initialization


code, we create a mutex and spawn a thread. This thread correctly pushes
a cleanup handler on the stack before it locks the mutex, and then enters
a useless loop. (Yes, it should do something useful and unlock, but this is
only for illustrative purposes.) Now the mutex is locked indefinitely, and
any cancellation must cause the mutex to be unlocked. If we cancel the

application with CTRL-C at the command line, it induces the cancel and
cleanup handler, causing a proper exit.
Note again the concept of a cancellation point - if the code pushes the cancel
handler on, but the thread is cancelled asynchronously before it actually
locks the mutex, the thread will continue to run until it enters a cancellation
point. It will keep executing through the code after the cleanup handler push
and through the mutex lock. Once it reaches a cancellation point (here, the
usleep() call), the signal will be delivered, and the handler will be called from a
known point. Think of cancellation points as being places where the system
checks to see if it should stop and clean up.
Consider this case without the cleanup handler, even where the code
wasn’t infinitely blocked. Once the thread locks the mutex, and another
process asynchronously cancels the thread, the thread will still wait for a
cancellation point, but without the handler, it will exit with the mutex held,
and any other code that depends on it will be blocked indefinitely. Now
imagine what happens if you have multiple resources held at various times,
depending on the call chain. Any lockable resource that isn't properly
attached to a cleanup handler can cause a deadlock if the holding thread
is cancelled.
As you can see, while there are mechanisms to avoid cancellation prob-
lems, care must be taken to make sure that everything is handled properly.
Failure to do so in every possible cancel situation will result in system dead-
lock. With a real-time system, this can be disastrous, and it is for this reason
we’ve taken this time to demonstrate how careful one must be.

4.4 Mutexes
The POSIX-style mutexes are also available to real-time programmers as a
means of controlling access to shared resources. As timing is critical, it is
important that mutexes are handled in such a way that blocking will not
impede correct operation of the real-time application.

4.4.1 Locking and unlocking mutexes


int pthread_mutex_lock(pthread_mutex_t *mutex);
As with the standard POSIX call, this locks a mutex, allowing the caller
to know that it is safe to work on whatever resources the mutex protects. In

a real-time context, locks around mutexes must be short, as long locks could
cause serious delays in other waiting threads.

int pthread_mutex_trylock(pthread_mutex_t *mutex);

The pthread mutex trylock() call will attempt to lock a mutex, and
will return immediately, whether it gets the lock or not. Based on the return
value, one can tell whether the lock is held, and take appropriate action. For
some applications that may not be able to wait for a lock indefinitely, this is
a way to avoid long delays.
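A short sketch of the trylock pattern (plain userspace POSIX, with illustrative names): the caller inspects the return value - EBUSY means the lock is held - and can fall back to other work instead of blocking:

```c
#include <pthread.h>
#include <errno.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

/* Demonstrate both outcomes of pthread_mutex_trylock(). */
int trylock_demo(void)
{
    int rc;

    pthread_mutex_lock(&m);          /* simulate a lock held elsewhere */
    rc = pthread_mutex_trylock(&m);  /* returns immediately... */
    if (rc != EBUSY)                 /* ...reporting the lock is busy */
        return -1;
    pthread_mutex_unlock(&m);

    rc = pthread_mutex_trylock(&m);  /* lock is now free: succeeds */
    if (rc != 0)
        return -1;
    pthread_mutex_unlock(&m);
    return 0;
}
```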

int pthread_mutex_timedlock(pthread_mutex_t *mutex,


const struct timespec *abstime);

Similar to the pthread_mutex_trylock() function above, pthread_mutex_timedlock()
provides a way to attempt to grab a lock, with an upper bound
on the length of the wait. If the mutex is made available and locked by the
caller before the allotted time has passed, the mutex will be locked. If the al-
lowed time passes and the mutex cannot be locked by the caller, the function
returns with an error so that the caller can recover appropriately.

int pthread_mutex_unlock(pthread_mutex_t *mutex);

As you would guess, this unlocks a held mutex. It signals a wakeup on


those threads that are blocking on the mutex.

4.4.2 Mutex creation and destruction


int pthread_mutex_init(pthread_mutex_t *mutex,
const pthread_mutexattr_t *attr);
int pthread_mutex_destroy(pthread_mutex_t *mutex);

As with the normal POSIX calls, the first function initializes a given mutex.
If a pthread_mutexattr_t object is provided, it will be used; otherwise a
default attribute set will be created and attached. The second call of course
destroys an existing mutex, assuming that it is in a proper state and not
already locked. Destroying a mutex that is in use will result in an error to
the caller.

4.4.3 Mutex attributes


int pthread_mutexattr_init(pthread_mutexattr_t *attr);
int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);

These calls initialize a given mutex attribute object with the default values,
or destroy an already existing attribute object.

int pthread_mutexattr_settype(
pthread_mutexattr_t *attr,
int type);
int pthread_mutexattr_gettype(
pthread_mutexattr_t *attr,
int *type);

The first call allows you to set the type of mutex used. For example, the
type can be either PTHREAD_MUTEX_NORMAL, which implies normal
mutex blocking, or PTHREAD_MUTEX_SPINLOCK_NP, which will force
the mutex to use spinlock semantics when attempting to grab a lock. The
second call will return the type previously set, or the default value.

int pthread_mutex_setprioceiling(
pthread_mutex_t *mutex,
int prioceiling,
int *old_ceiling);
int pthread_mutex_getprioceiling(
const pthread_mutex_t *mutex,
int *prioceiling);

This call sets the priority ceiling for the given mutex, returning the old
value in old ceiling. This call blocks until the mutex can be locked for
modification. The second call returns the current ceiling. More detail on
priority ceilings will follow later on.

4.5 Condition variables


These calls are the same as those used in normal POSIX environments. Keep
in mind that if a thread waiting on a condition variable is cancelled while

blocked in either pthread_cond_wait() or pthread_cond_timedwait(), the
associated mutex is reacquired by the cancelled thread. To prevent deadlocks,
a cleanup handler that will unlock all acquired mutexes must be installed.
Reacquiring the associated mutex will take place before the cleanup handlers
are called.

4.5.1 Creation and destruction


int pthread_cond_init(pthread_cond_t *cond,
const pthread_condattr_t *attr);
int pthread_cond_destroy(pthread_cond_t *cond);

A condition variable must be created and destroyed just like any other
object. Note that there is an attribute object that is specific to condition
variables, and can be used to drive the behavior of the variable.

4.5.2 Condition waiting and signalling


int pthread_cond_wait(pthread_cond_t *cond,
pthread_mutex_t *mutex);

This behaves as one would expect. The caller waits on a condition to


happen specified by cond, and coordinates usage with the mutex parameter.
The mutex must be held at the point of the call, at which point it is released
for other threads to cause the condition to occur, also using the mutex. When
the call returns, signalling that the condition has occurred, the mutex is again
held by the caller. The associated mutex must be released after the critical
section is complete.

int pthread_cond_timedwait(pthread_cond_t *cond,


pthread_mutex_t *mutex,
const struct timespec *abstime);

As with pthread cond wait(), this call waits for a condition to happen,
locked by a mutex. In this version, however, it will only wait the amount
of time specified by abstime. Based on the return value, the caller can
determine whether the call succeeded and the condition occurred, or if time
ran out.
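A sketch of the canonical wait pattern (plain userspace POSIX, with illustrative names). Note the while loop around the wait: the predicate is always re-checked, since condition waits may wake spuriously:

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static int ready;   /* the predicate, guarded by the mutex */

static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    ready = 1;                 /* make the condition true... */
    pthread_cond_signal(&cv);  /* ...then wake one waiter */
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Hold the mutex, wait for the predicate, re-checking in a loop. */
int condvar_demo(void)
{
    pthread_t t;

    pthread_mutex_lock(&m);
    pthread_create(&t, NULL, producer, NULL);
    while (!ready)
        pthread_cond_wait(&cv, &m);  /* atomically unlocks and waits */
    pthread_mutex_unlock(&m);        /* release after the critical section */
    pthread_join(t, NULL);
    return ready;
}
```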

int pthread_cond_broadcast(pthread_cond_t *cond);


int pthread_cond_signal(pthread_cond_t *cond);
These functions broadcast a condition signal to all waiting threads, or signal
a single thread waiting on the condition variable, respectively. Note that the
caller of these functions does not need to hold the mutex that waiting threads
have associated with the condition variable.

4.5.3 Condition variable attribute calls


int pthread_condattr_init(pthread_condattr_t *attr);
int pthread_condattr_destroy(pthread_condattr_t *attr);
The attribute object calls appropriate for condition variables are no differ-
ent than any other attribute calls. The same object creation and destruction
methods apply.
int pthread_condattr_getpshared(
const pthread_condattr_t *attr,
int *pshared);
int pthread_condattr_setpshared(
pthread_condattr_t *attr,
int pshared);
Relative to threads and other object types, there is not much that can be
modified in condition variable attributes. These calls toggle a condition
variable's process-shared status. No other methods apply to this type.

4.6 Semaphores
Again, RTCore semaphores look just like POSIX semaphores. As with
condition variables, if a thread is cancelled while blocked on a process-shared
semaphore, that semaphore will never be released, and consequently a deadlock
situation can occur. It is the programmer's responsibility to ensure that
semaphores are handled properly in cleanup handlers.
Signals that interrupt sem_wait() and sem_post() will terminate these
functions, so that neither acquiring nor releasing the semaphore is accomplished.
The function call interrupted by a signal will return with the value
EINTR.

4.6.1 Creation and destruction


int sem_init(sem_t *sem, int pshared,
unsigned int value);
int sem_destroy(sem_t *sem);

These functions operate properly on semaphores. As with the mutex


functions, these functions will detect in-use semaphores and other problems
that could cause unpredictable behavior. Refer to the examples and full
documentation for more details on their use, but in general they will behave
as they would in any other environment.

4.6.2 Semaphore usage calls


int sem_getvalue(sem_t *sem, int *sval);

This function will store the current value of the semaphore in the sval
variable.

int sem_post(sem_t *sem);

sem_post() increases the count of the semaphore, and never blocks, although
it may induce an immediate switch if posting to a semaphore that a
higher priority thread is waiting for.

int sem_wait(sem_t *sem);


int sem_trywait(sem_t *sem);
int sem_timedwait(sem_t *sem,
const struct timespec *abs_timeout);

These are the calls used to force a wait until the semaphore reaches a
non-zero count, and they operate in the same way the mutex wait calls do. The
sem_wait() call blocks the caller until a non-zero count is reached, and
sem_trywait() does the same without blocking, returning EAGAIN if the
count was 0. The sem_timedwait() call blocks up to the amount of time
specified by abs_timeout.
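A small sketch of these calls in plain userspace POSIX (the function name is our own): sem_trywait() fails with EAGAIN on a zero count and succeeds once sem_post() has raised it:

```c
#include <semaphore.h>
#include <errno.h>

/* Walk a semaphore through the count transitions 0 -> 1 -> 0,
 * checking each call's result. Returns the final count. */
int sem_demo(void)
{
    sem_t s;
    int v;

    if (sem_init(&s, 0, 0) != 0)      /* count starts at 0 */
        return -1;
    if (sem_trywait(&s) != -1 || errno != EAGAIN)
        return -1;                    /* nothing to take yet */
    sem_post(&s);                     /* count: 0 -> 1 */
    if (sem_trywait(&s) != 0)
        return -1;                    /* now the take succeeds */
    sem_getvalue(&s, &v);             /* observe the count */
    sem_destroy(&s);
    return v;                         /* back to 0 */
}
```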

4.6.3 Semaphores and Priority


Semaphores must be handled with care in the context of real-time code. If
you have low priority code that does a sem_post(), you must keep in mind
that if a higher priority thread was waiting on that semaphore, the post will
induce an immediate transfer of control to the higher priority thread.
This comes as a surprise to some users, but you must keep in mind that in
real-time systems, speed is of course the most important factor. If this means
that your real-time thread suspends the moment it does the post, that’s all
right - the alternative is to further block the high priority thread that needs
the semaphore.
Aside from ensuring the best possible performance, semaphores are also
used in this way to simplify driver development. Interrupt handlers can
be kept very simple and succinct, with semaphore posts after the minimal
amount of work is done. This will cause a switch to the handling thread,
which can perform the rest of the work. As threads are more capable than
interrupt handlers (being able to use the FPU, the debugger, etc), the data
can be handled in a simple thread context rather than building complex
interrupt handlers.
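The handler-posts/thread-works pattern can be sketched in plain userspace POSIX, with an ordinary function standing in for the interrupt handler (all names here are our own illustrations, not RTCore APIs):

```c
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static sem_t work_ready;
static int events_handled;

/* Stand-in for an interrupt handler: do the minimum, then post.
 * The post wakes the handling thread, which does the real work. */
static void fake_irq(void)
{
    sem_post(&work_ready);
}

static void *handler_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < 3; i++) {
        sem_wait(&work_ready);  /* sleep until an "interrupt" posts */
        events_handled++;       /* heavy lifting happens in thread
                                 * context, not in the handler */
    }
    return NULL;
}

/* Deliver three simulated interrupts and let the thread drain them. */
int irq_demo(void)
{
    pthread_t t;

    events_handled = 0;
    sem_init(&work_ready, 0, 0);
    pthread_create(&t, NULL, handler_thread, NULL);
    for (int i = 0; i < 3; i++)
        fake_irq();
    pthread_join(t, NULL);
    sem_destroy(&work_ready);
    return events_handled;
}
```

Because the semaphore counts posts, no event is lost even if all three "interrupts" arrive before the thread runs.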

4.7 Clock management


RTCore provides standard POSIX mechanisms for managing the clock, thread
sleeps, delays, and similar tasks. Examples include clock_nanosleep(),
clock_gettime(), and so on. For detailed information on these functions,
refer to the Single UNIX Specification provided with RTLinuxPro.
One additional piece of information worth noting here is the addition
of an advance timer to clock_nanosleep(). Virtually every system has an
inherent amount of jitter, depending on hardware load. Some applications
require determinism below the threshold of this jitter. For these applications,
RTCore provides the advance timer. Threads generally sleep with:
struct timespec t;
clock_gettime(CLOCK_REALTIME, &t);
timespec_add_ns(&t, 500000);
clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &t, NULL);
This way, the thread will be woken at the absolute time specified in t,
which is the current time plus 500 microseconds. If there is inherent hardware
jitter, though, the thread may be delayed by a couple of microseconds. Please
refer to Section 4.8.1 for details on reducing this jitter to zero.

4.8 Extensions to POSIX (*_np())


There are some calls available to developers that are specific to RTCore.
These calls are not part of the POSIX specification, but do fill in some of the
gaps left by it, and may make their way in some form into future revisions
of the standard. We list them here as an option to developers. In order to
properly handle some situations, such as SMP environments, these calls may
make life much easier.

4.8.1 Advance timer


For some applications, the allowable worst case hardware jitter and schedul-
ing deviation may run very close to what RTCore and the underlying hard-
ware is capable of delivering. Suppose your application has a worst case jitter
allowance of 13 microseconds, meaning that if you schedule thread X to run
at a certain time, under a worst case load, execution of thread X cannot
deviate from that time by more than 13 microseconds. If the hardware, un-
der load, deviates by 10 microseconds, and the RTCore scheduling takes 3,
and the context switch time takes 2, you are already outside of the allowable
range.
For some users, the application might not be too cost-sensitive, and it is
just a matter of getting faster hardware. But for low cost systems, or where
there is no faster hardware, RTCore offers the advance timer. This allows
you to compensate for things like scheduling and hardware jitter.
The advance timer works as follows: when you make a call to clock_nanosleep()
to sleep until your next scheduling period, RTCore allows an extra flag and
structure to perform early scheduling. So instead of:

clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);

the code uses:

clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME|TIMER_ADVANCE,
                &next, &ts_advance);

The ts_advance structure is a normal struct timespec, with the tv_nsec
field used to indicate how much deviation you would like to account for, in
nanoseconds. So for our example, if we had a worst case latency of 9 microseconds
on our hardware, but the application demanded 4 microseconds,
tv_nsec would be set to something like 13000.
This tells the scheduler to account for 13 microseconds of latency. RTCore
will then switch the thread in early and spin it in a busy wait until the exact
scheduling moment occurs. It then releases the thread to run, and as it
is already prepared, the problem of scheduling deviation is averted. It is
important to ensure that hard interrupts are disabled upon entering this
call: if your thread's advance point has occurred, and it is in a busy-wait
until the real scheduling point occurs, another interrupt could come in and
interrupt the thread's execution just before it needs to actually run. So in
order to be entirely safe, this call must be surrounded by
rtl_stop_interrupts() and rtl_allow_interrupts() calls.
With this method, a developer can get much better scheduling resolution
than would normally be possible, given the underlying hardware. This option
is only available in RTLinuxPro.

4.8.2 CPU affinity calls


int pthread_attr_setcpu_np(pthread_attr_t *attr,
int cpu);
int pthread_attr_getcpu_np(pthread_attr_t *attr,
int *cpu);

These two functions modify a given thread attribute in order to get a


thread to run on a specific CPU. By default, the thread is run on the same
CPU as it was created on. This is a means of ensuring that work is specifically
distributed throughout the system.
When developing real-time applications, it is generally required that dif-
ferent threads operate in phase with each other, so that thread x is doing
some work so that data will be ready when thread y needs it. On an SMP
system, this may mean that the first thread must be bound to one CPU while
the second is on the second CPU, and they operate in tandem to provide the
highest throughput. Without these calls, both threads may end up on the
same CPU, and the correct phase relationship may not be possible.
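RTCore's pthread_attr_setcpu_np() is not available in ordinary userspace, but the idea can be sketched with the analogous GNU/Linux call pthread_attr_setaffinity_np() (a glibc extension, not POSIX; the names and the choice of CPU 0 here are illustrative):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Report which CPU the pinned thread actually ran on. */
static void *where(void *arg)
{
    *(int *)arg = sched_getcpu();
    return NULL;
}

/* Pin a thread to CPU 0 before it starts, so related threads can
 * be placed on distinct CPUs and kept in phase with each other. */
int affinity_demo(void)
{
    pthread_attr_t attr;
    pthread_t t;
    cpu_set_t set;
    int cpu = -1;

    CPU_ZERO(&set);
    CPU_SET(0, &set);   /* restrict the thread to CPU 0 */
    pthread_attr_init(&attr);
    if (pthread_attr_setaffinity_np(&attr, sizeof(set), &set) != 0)
        return -1;
    if (pthread_create(&t, &attr, where, &cpu) != 0)
        return -1;
    pthread_join(t, NULL);
    pthread_attr_destroy(&attr);
    return cpu;   /* the thread should report CPU 0 */
}
```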

Refer to the CPU reservation capabilities (pthread_attr_setreserve_np(),
Section 4.8.4) for more advanced calls.

4.8.3 Enabling FPU access


int pthread_setfp_np(pthread_t thread, int flag);
By default, real-time threads do not have access to the CPU's floating
point unit, since the system's context switch times are faster if it doesn't
have to restore floating point registers. This call will enable or disable that
access. For threads running on another CPU, pthread_attr_setfp_np() is
the proper way of enabling FPU support for the thread.

4.8.4 CPU reservation


int pthread_attr_setreserve_np(pthread_attr_t *attr, int reserve);
int pthread_attr_getreserve_np(pthread_attr_t *attr, int *reserve);
In SMP applications, especially very high speed systems, it is benefi-
cial to reserve a CPU for only real-time applications, whether they are
threads or interrupt handlers. By using this thread attribute along with
the pthread_attr_setcpu_np() call, one can spawn a real-time thread on
a CPU such that the GPOS cannot run on that CPU. The benefit is that
the real-time code can then in many cases live entirely in cache, and achieve
more deterministic results at high speeds, as the GPOS cannot run on that
CPU and disturb the cache usage.
Tests on larger scale systems with significant bus traffic indicate that
reserve CPU capabilities can reduce jitter by an order of magnitude.

Affinity on NetBSD
RTCore on NetBSD also allows users to pin userspace processes to a specific
CPU. (Generally, they can be scheduled to run on any CPU.) To pin a
process, user programs must include the rtl_pthread.h header file, and
call:
rtl_setproc_affinity(cpu_num);
A positive number should be used to bind to a specific CPU - if this
succeeds, the call returns 0. To unpin the process, pass -1 to the function
from the reserved process.

4.8.5 Concept of the extensions


The idea behind the extensions is simple - they are there to provide easy
means of handling aspects of real-time programming not covered (or not
covered well) by the POSIX standard. In some situations, they provide
an easy way to get something done that could be done with standard calls,
but would be much more work and would result in convoluted code. In other
cases, the standard doesn't specify certain aspects of real-time operations
in detail, and the extra calls are there to work around the ambiguities.
Most of these situations relate to how certain operations are carried out
in SMP mode, and handle ambiguities associated with targeting code on
another CPU in real-time. The RTCore extensions take what could be a
non-deterministic situation and remove execution ambiguities.

4.9 ”Pure POSIX” - writing code without the extensions
There are users who don’t want to use the non-POSIX extensions. In these
cases, there is usually some need for all of the code to be POSIX-compliant,
usually based on an internal coding standard. If you are in a similar situation,
it is possible to write code without the extensions, although there may be
performance issues as a result. RTCore does not force you to deviate from
the standard, it simply offers some solutions to improve performance.

4.10 The RTCore API and communication models
We’ve focused so far on demonstrating the API used in RTCore to commu-
nicate between threads and other code living in the real-time kernel. A later
chapter will focus more on the auxiliary communication models, such as FI-
FOs and shared memory, when you need to communicate with the GPOS.
Surprises are few and far between, though, as you’ll see just as many POSIX
examples there as you do here.
Chapter 5

More concepts

So far, we’ve looked at some basic real-time concepts, introduced some ex-
amples, and walked through the basics of the API. Given that the API is
POSIX, much of the learning curve is gone, and we can now hop back into
some general programming practices and concepts, and how they work in
RTCore. Let’s start off with some basic practices:

5.1 Copying synchronization objects


Do not copy any objects of type mutex, condition variable, or semaphore,
as operations on a copied synchronization object can result in unpredictable
behavior. All synchronization objects should be initialized and destroyed
with the appropriate function for the data type.
The same holds true for attributes associated with synchronization objects.
They should never be copied; instead, initialize them with the appropriate
calls. The following is wrong:

pthread_mutex_t mutex1, mutex2;


pthread_mutex_init(&mutex1,0);
memcpy(&mutex2, &mutex1, sizeof(pthread_mutex_t));

Instead, this should be used:

pthread_mutex_t mutex1, mutex2;


pthread_mutex_init(&mutex1,0);
pthread_mutex_init(&mutex2,0);


5.2 API Namespace


The RTCore API is POSIX, which we have reiterated enough times by now.
However, the RTCore API is also available with an rtl_ prefix. This means
that pthread_create() can also be referenced as rtl_pthread_create().
This is an added feature so that users can explicitly reference RTCore
functions when needed, if there is any ambiguity. In PSDD, as we will
see later, real-time applications exist inside of normal GPOS applications.
In these situations, an ambiguity exists - pthread_create() will by default
refer to the normal userspace GPOS function, rather than the RTCore
pthread_create(). In situations such as these, the rtl_ prefix is needed.

5.3 Resource cleanup


RTCore will clean up some unfreed resources for you if your application
doesn’t explicitly catch everything on cleanup. As we saw in the first exam-
ples, devices and open file descriptors are cleaned up automatically. If you
exit your program and forget to call close(), RTCore will detect this and
make the call for you. This will allow file usage counts to remain in proper
order. This also holds true for POSIX I/O-based devices that your code may
have registered - it will do the proper deregistration.
However, some resources are not handled at this time. Threads are not
cleaned up automatically, so it is up to the programmer to make sure that
each thread belonging to an application is cancelled and joined properly.
The same goes for memory allocated through rtl_gpos_malloc() - the caller
must free these areas with rtl_gpos_free() to prevent memory leaks.

5.4 Deadlocks
When using synchronization primitives, it is the programmer's responsibility
to ensure either that all shared resources are correctly freed if
asynchronous signals are enabled, or that those signals are blocked. Make
sure to use thread cleanup handlers to safely free resources if the thread
is cancelled while holding a resource.

5.5 Synchronization-induced priority inversion


If a high priority thread blocks on a mutex (or any other synchronization
object) that was previously locked by a low priority thread, this leads to
priority inversion: the low priority thread must gain a higher priority in
order to guarantee execution time. Otherwise, a thread of intermediate
priority may come along and preempt the low priority thread, preventing
the mutex release and stalling both the low and high priority threads. The
high priority thread is waiting for the low priority thread to release, and
the low priority thread is waiting for execution time. The mutex will never
be unlocked.
Any scenario that allows a lower priority task to block a higher priority
task is an implicit priority inversion. Theories abound on what the correct
mechanism is to handle this problem, and FSMLabs has found that analysis of
code is the best means of avoiding it. Based on internal and external
experience, it follows that if you don't know what resources your code
might or might not hold at a given point, the chances of a dangerous
situation arising are very high.
Protocols such as priority inheritance exist to solve this problem, but in
turn can induce potentially unbounded suspension. Inheritance involves
lower priority threads being promoted to higher priority levels, such as
when a higher priority task is waiting on the lower. Consider a high
priority thread that is waiting on a lower priority thread that holds a
lock. The lower priority thread is promoted so it can execute and release.
However, this thread now needs a lock held by an even lower priority
thread. This third thread is then raised so that it can execute, and so on.
In the meantime, the high priority thread may not be considered 'real-time'
anymore, as it can easily lose its deterministic characteristic.
RTCore provides optional support for the ’priority ceiling protocol’, in
which resources are given a ceiling priority they cannot exceed. This still
requires analysis and is not perfect, but does provide a middle ground for
users.

5.6 Memory management


As we have mentioned, general purpose memory management is not available
to real-time threads. If your application does have a need for a memory pool,
it is best to allocate it during initialization and then allocate pieces from that
pool by hand during execution.
The reason for this approach is simple - bounded time allocation in a
general purpose memory allocator is difficult to prove. For non-real-time
applications, a generic allocator is fine - the user calls malloc(), and the call
may return immediately if a chunk is available, or it may block indefinitely.
On an active system, it is entirely possible that memory may be extremely
fragmented, and the allocator might have to do a lot of work in order to
defragment existing pieces enough that the user request can be handled. In
a real-time system, this may mean that your thread is indefinitely blocked.
Users can allocate a large chunk during initialization (in main()), where
the code does not have real-time demands. This will ensure that a pool is
around for real-time use. If the usage pattern of the pool is known, a simple
allocator/deallocator could be implemented on top of this pool that would
allow for memory management calls in bounded time.
Bounded-time allocators will return with an answer in a specific time
frame, but may not use memory as efficiently as they could, while generic
allocators will take extra time when needed to use every last bit of
memory. Some bounded-time allocation mechanisms and algorithms do exist,
and FSMLabs is evaluating and testing some of these options. Future
releases of RTCore will likely include a bounded-time allocator of some
type as a convenience to users.

5.7 Synchronization
This is possibly the most important concept in real-time systems engineering.
While synchronization is important as a protection mechanism in normal,
non-real-time threaded applications, it can make or break a real-time system.
In a normal application, a waiting thread will do just that - wait. In a hard
real-time system, a waiting thread might mean that a fuel pump isn’t being
properly regulated, as it is waiting on a mutex that another thread has held
too long.

5.7.1 Methods and safety


Safe synchronization relies on several things: judicious use of it, code analysis,
and above all, understanding of the code at hand. No amount of software
protection will save the system from a programmer who doesn't understand
or care which locks are held at what point in a real-time system. In fact,
the presence of such protection may invite carelessness on the part of the
developer.
RTCore offers the standard POSIX synchronization methods, such as
semaphores, mutexes, and spinlocks, but also focuses on other, higher per-
formance synchronization methods. In fact, much of RTCore is designed
in such a way that synchronization is not necessary, or is very lightweight.
Heavy synchronization methods such as spinlocks can disable interrupts and
interfere with other activity in the system. Lighter mechanisms such as
atomic operations create very little bus traffic and have a minimal impact.
Of course, an entirely lock-free mechanism is even better, if possible.
An example of this is the RTCore POSIX I/O subsystem. The original
Free versions were very fast but had no locking mechanisms whatsoever.
While the performance was good, it didn’t hold up to industrial use. It
needed proper locking in order to traverse the layers properly. The layer also
needed to stay as fast as it was before - users want it to be fast and safe. A
simple and effective method would be to put mutexes around each contended
piece, locking and unlocking as needed.
While simple, this would severely slow the system down, as mutexes in-
volve waiting on queues, switching threads while others complete, and so
on. Instead, FSMLabs added a light algorithm based on atomic operations
(Please refer to section 5.7.3). As requests come in to add a device name
to the pool, atomic operations such as rtl_xchg are used to grab pieces of
the pool. This prevents interrupt disabling, and allows other threads to use
other areas of the pool at the same time.
Some other restructuring was also done, resulting in a more flexible
architecture that is just as fast as it had always been, except that it is
now safe. Other systems require different approaches, from heavier
synchronization to none at all, but it is very important that the correct
method is chosen, not just one that works.
Now that we’ve briefly covered the topic (synchronization is a very broad
topic in real-time systems), let’s look at a specific example of a light syn-
chronization method in RTCore.

5.7.2 One-way queues


As we have said, POSIX provides several synchronization methods, but other
approaches are sometimes called for - usually, when a very light and quick
method is needed. The one-way queues provided by RTCore handle many of
these situations.

The Basic Idea


Many usage patterns require that one thread sends messages to another, in
real-time. In one form or another, this results in a queue. As queues are
simple, many users write their own, and rather than leaving it open, protect
queue operations with a lock. While the lock will rarely be contended, the act
of grabbing and releasing the lock may interfere with other system activity.
In light of this, RTCore provides a ’one-way queue’ implementation. This
allows a user to declare specific message queues, shared between a single
consumer and a single producer thread. Each queue declaration implicitly
defines a set of functions to operate specifically on that distinct queue, so
that all code using the queue interacts with it using a specific function name.
These queues are lock-free, meaning that there is no locking on sends and
receives. The API can handle concurrent enqueue and dequeue operations,
but the user must wrap the calls with a lock if there are multiple consumers
or multiple producers. (A locking version also exists that offers a
built-in lock.) The result is a very fast mechanism for exchanging data
that needs
very little, if any, management overhead. Let’s look at an example:

#include <time.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <onewayq.h>

pthread_t thread1, thread2;

DEFINE_OWQTYPE(our_queue,32,int,0,-1);
DEFINE_OWQFUNC(our_queue,32,int,0,-1);

our_queue Q;

void *queue_thread(void *arg)
{
    int count = 1;
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);

        if (our_queue_enq(&Q,count)) {
            printf("warning: queue full\n");
        }
        count++;
    }
}

void *dequeue_thread(void *arg)
{
    int read_count;
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 500000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        read_count = our_queue_deq(&Q);
        if (read_count) {
            printf("dequeued %d\n", read_count);
        } else {
            printf("queue empty\n");
        }
    }
}

int main(int argc, char **argv) {
    our_queue_init(&Q);
    pthread_create(&thread1, NULL,
                   queue_thread, 0);
    pthread_create(&thread2, NULL,
                   dequeue_thread, 0);

    rtl_main_wait();

    pthread_cancel(thread1);
    pthread_join(thread1, NULL);
    pthread_cancel(thread2);
    pthread_join(thread2, NULL);
    return 0;
}

This requires some explanation, as the syntax hides much of the work.
There are two threads, spawned as normal, where one enqueues data and the
other dequeues. Both are periodic, and as a quick method of preventing the
queue from overflowing, the dequeueing thread defines a period half as long
as the enqueueing thread. Half of the dequeue calls result in an empty queue
being found, but this is acceptable for our purposes.
Now let’s break down the interesting part into discrete steps, starting
with the initial declarations.

Declarations
We need to define a queue for data to flow between the threads. The syntax
involves two steps; let's look at step 1 first:

DEFINE_OWQTYPE(our_queue,32,int,0,-1);

This first step creates a datatype for our queue. (Using
DEFINE_OWQFUNC_LOCKED instead will define an automatically locking version
of the queue.) Think of this as the backing for the queue operations - it
defines the queue, its properties, and structure. Parameter 1 is the name
that will be provided so that we can instantiate the queue itself, and
parameter 2 defines the length of the queue. Parameter 3 defines the type
of unit the queue is made of - here we use an integer as the base element,
but we could have used pointers or anything else. As the queue operations
copy data into the queue, light units such as pointers are favored over
large structures. Parameters 4 and 5 are not used at the moment.
We now have a queue structure named our_queue containing 32 elements,
each the size of an int. If you were passing characters or structures
through the queue, you would use char or struct x as parameter 3.
Now let's look at step 2:

DEFINE_OWQFUNC(our_queue,32,int,0,-1);

This defines functions to be used explicitly on the queue type defined in
step 1. (Again, specifying DEFINE_OWQFUNC_LOCKED will set up an
automatically locking queue.) The parameters work in a similar fashion:
Parameter 1 defines both the prepending name of the new queue operations
and the type of queue structure that the functions will work on. Parameter
2 again defines the length of the queue, and 3 determines the element type.
The last two parameters are used in this case: One defines the return
value for a dequeue call on an empty queue, and the other is the return value
for an enqueue call on a full queue. Values such as 0 and -1 are generally
safe, but are configurable in light of situations where 0 is a valid value to be
pushing through the queue. If you enqueue 0, and the other end dequeues it,
there must be some means of determining that the value of 0 was intended,
and is not a result of a call on an empty queue. Select a value that is known
to be unique from your valid queue values.
Lastly, we define an instance of our queue structure to be used in our
threads with the line:

our_queue Q;

Usage
Looking at the thread code, you can see that the actual usage of the queue
is simple: One thread calls our_queue_enq(&Q,count), which is the enqueue
function created in step 2 above, using our defined structure Q, and
pushing a value of count into it. The other end does an our_queue_deq(&Q),
which returns a correctly typed value off of the queue for usage in the
other thread. Note that step 2 also defines a few other simple calls for
the queue: our_queue_full() to see if the queue is full, our_queue_init()
to initialize a queue structure,
and also our_queue_top(), which will return the current head of the queue
without removing it. (This also serves as an isempty() function.)
Queue interaction, as you can see, is very easy. It is also extremely
fast, and doesn't require locking for most cases. The code is safe when
one thread enqueues while another dequeues at the same time, which is the
common case. The user needs to add an external lock only when two or more
threads are enqueueing data at the same time, or when a set are dequeueing
data at the same time. Otherwise, no additional locking is needed.
This is only one example of a light synchronization method. RTCore pro-
vides this for the user’s convenience, and the user is encouraged to closely
analyze their synchronization needs to ensure that the right approach is cho-
sen.

5.7.3 Atomic operations


We’ve mentioned atomic operations a few times now, and it’s high time we
look at them in some detail. In general, atomic operations include any type
of operation that cannot be further subdivided, and can be viewed as a single
distinct operation. (There are other definitions too.) For our purposes, we
will be looking at atomic bit operations - work that is done on a specific
memory location in a single step.
Depending on the system at hand, simple steps like setting a variable to
a specific value may appear to be atomic but can be very far from it. A
write to a plain variable may end up left in cache but not synchronized
with main memory, which on an SMP system can wreak havoc. Consider writing
to a simple integer that another thread on a different CPU is waiting for -
the write may make it only to the cache and not to main memory, allowing
the first thread to continue on with other work while the second is
working from stale data. Even worse, both threads could update the value
at the same time.
Atomic operations allow you to say 'There is a value at this address. Set
bit 3 of it without ambiguity.' The operation will be carried out in a
single atomic step, and will fail if someone else tried to do the same
thing at the same time, signalling that you have to try again or take
another route.
RTCore provides some simple API calls to handle these problems. They
are custom, as POSIX does not define functions related to this problem,
but they are meant to be easy to understand. As each architecture handles
atomic operations differently, these functions were designed to do the right
5.7. SYNCHRONIZATION 71

thing depending on the architecture at hand. Let’s take a look:

rtl_a_set(int bit, volatile void *word);


rtl_a_clear(int bit, volatile void *word);

These two atomically set or clear a bit within a word, respectively. The
first parameter specifies which bit should be toggled, and the second
specifies the address of the word to be used for the operation.

rtl_a_test_and_set(int bit, volatile void *word);


rtl_a_test_and_clear(int bit, volatile void *word);

Implementations of the standard test-and-set and test-and-clear operations
are provided here. The call will atomically set (or clear) a specific bit
within a word and return the previous value of that bit.

rtl_a_incr(unsigned long *w);


rtl_a_decr(unsigned long *w);

Operating on a single long, these two will simply increment or decrement
the current value safely.
These are simple operations with a simple interface, and can be used
to build very elegant and high performance synchronization methods. While
other, more common mechanisms abound, most of them involve locks, queues,
and other structures. With atomic operations, synchronization can be as
simple as a single bit operation on an address.
Chapter 6

Communication between
RTCore and the GPOS

The two components of a complete RTCore system, real-time and the user-
space, generally run in two separate, protected address spaces. The real-time
component lives in the RTCore kernel, while the rest of the code lives as a
normal process within the GPOS. In order to manage each side, there has
to be some kind of communication between the two. RTCore offers several
mechanisms to facilitate this.

6.1 printf()
printf() is probably the simplest means of communicating from a real time
thread down to non-real-time applications. When an RTCore application
starts up, it creates a ’stdout’ device to communicate to the calling envi-
ronment, usually a terminal device of some kind. Calls to printf() in the
real-time application appear in the calling terminal the same way a printf()
call would in a normal application. This allows you to log real-time output
the following way:

./rtcore_app > log_file

The printf() implementation is fully capable, and can handle any normal
data type and format. It is also lightly synchronized compared to some of
the other methods we will present here, and very fast as a result, without
impacting other core activity.


6.2 rtl_printf()


This can be thought of as a simple method of dropping information into the
GPOS's kernel ring buffer. rtl_printf() is a normal printf() call that
exists within the real-time kernel, and works the exact same way as
printf() or printk(), but is safe to call from a real-time context.
For simplicity and speed in the kernel, this call does not support all format
types that a standard printf() call does. Most notably, it does not handle
formatting of floating point types.
While the overhead of rtl_printf() is minimal, it is important to note
that there are implications. In order to safely synchronize with the GPOS,
interrupts must be briefly disabled. This means that you should avoid heavy
use of it, especially in a tight loop. Any operation that affects timing must
be carefully considered with respect to real-time goals, so make sure that
your debug output isn’t causing more problems than it is helping you solve.
This call is a very useful method of logging via the kernel buffer, but most
users will probably find the normal printf() call to be more convenient and
flexible.

6.3 Real-time FIFOs


Generally, there is a need for bidirectional communication between the real-
time module and the user-space code. The most straightforward mechanism
for this is the real-time FIFO. Applications can instruct RTCore to create
FIFO devices at runtime via POSIX calls, as we will see. The real-time
module reads or writes data to this device in a non-blocking manner, and on
the Linux side, a process can open it and make read()/write() calls on it
to exchange data with the real-time kernel (non-real-time applications can
be blocking or non-blocking).

6.3.1 Using FIFOs from within RTCore


For every FIFO that is used, initialization code must do:

mkfifo("/mydevice", 0777);
fd = open("/mydevice", O_NONBLOCK);
ftruncate(fd, 8192);

An important factor to remember is that the FIFO creation calls involve
memory management in the ftruncate() operation, which is not available
from within real-time threads. As such, these calls must be made from within
the main() context in order to be safe. This is unlikely to be a problem, as
in nearly all cases, you need to set up your FIFOs before starting real-time
operations. This is only for calls performing initialization, though - real-time
threads that call open("/mydevice", O_NONBLOCK) do not invoke memory
management, but instead just attach to the previously created device, and
are safe from within real-time threads.
In the example above, 0777 was used in the mkfifo() call. This indicates
to RTCore that the device should also be present in the GPOS filesystem.
In the process of the call, a device of that name and permissions (masked
with the caller’s umask) will be created. To create FIFOs that are to be used
strictly between real-time threads, specify 0 for the mask. This will register
the device so that real-time threads can use it, but it will not be visible to
the GPOS. More documentation on this can be found in the Arbitrary FIFO
device article provided in PDF form with RTCore.

6.3.2 Using FIFOs from the GPOS


On the user-space side, the FIFO appears to be a normal file. As such, any
normal file operation is usable on the FIFO. For example, the user-space
code could be a perl script, or maybe just a logging utility comprised of:

cat /mydevice > logfile

6.3.3 A simple example


FIFOs are extraordinarily simple to work with, but it might be helpful to see
some of the calls described here in a single application:

#include <time.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <fcntl.h>

pthread_t thread;
int fd0, fd1, fd2;

void *start_routine(void *arg)
{
    int ret, status = 1;
    int read_int;

    while (status) {
        usleep(1000000);
        ret = read(fd1, &read_int, sizeof(int));
        if (ret) {
            printf("/mydev1: %d (%d)\n",
                   read_int, ret);
            write(fd2, &read_int, ret);
        }
        ret = read(fd0, &read_int, sizeof(int));
        if (ret) status = 0;
    }
    return 0;
}

int main(int argc, char **argv) {
    mkfifo("/mydev0", 0777);
    fd0 = open("/mydev0", O_NONBLOCK);
    ftruncate(fd0, 4096);

    mkfifo("/mydev1", 0777);
    fd1 = open("/mydev1", O_NONBLOCK);
    ftruncate(fd1, 4096);

    mkfifo("/mydev2", 0777);
    fd2 = open("/mydev2", O_NONBLOCK);
    ftruncate(fd2, 4096);

    pthread_create(&thread, NULL, start_routine, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    close(fd0);
    close(fd1);
    close(fd2);
    return 0;
}

We’ve already seen most of this code in other examples, but this succinctly
shows you how to use POSIX I/O from within RTCore. As usual, we spawn
a real-time thread from within main(), but we first have to explicitly create,
open, and size our FIFOs with the proper amount of preallocated space. 1
In the thread, we read() from fd0 to see if it’s time to shut down, and
otherwise read() from fd1 and write received data to fd2. These calls are
non-blocking for a reason - if the real-time thread ended up waiting for a
GPOS application that rarely got scheduling time, it would not be determin-
istic. So in this case, we just sleep and attempt to read from the devices.
There isn’t much to look at on the user-space side, but for the sake of
completeness, here it is:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int fd0, fd1, fd2, i, read_int;

    fd0 = open("/mydev0", O_WRONLY);
    fd1 = open("/mydev1", O_RDWR);
    fd2 = open("/mydev2", O_RDWR);

    for (i = 0; i < 10; i++) {
        write(fd1, &i, sizeof(int));
        read(fd2, &read_int, sizeof(int));
        printf("Received %d from RTCore\n",
               read_int);
    }
    write(fd0, &i, sizeof(int));
    close(fd2);
    close(fd1);
    close(fd0);
    return 0;
}

Looks pretty much like any other userspace application, doesn’t it? That’s
because it is. All we do is open the FIFOs, dump data over them, and read
it back. After we’re done, we write to a third FIFO to signal the real-time
thread that it’s time to shut down, and then we close the files. One minor
difference is that on this end, we didn’t open the devices as non-blocking,
although it can easily be done that way.

6.3.4 FIFO allocation


There are some rules as to how to handle FIFO allocation. When using the
POSIX interface, it is possible to do a normal

ftruncate(fd,32768);

style of call, but only in the following situations:

1. You are running in the main() context, and not in a thread. This way,
if the call determines that there is no preallocated space to use for the
device, it is safe to block while the memory allocation work is handled.

2. You are in the real-time context, and RTCore is running with preallo-
cated buffers for your data. In this case, even if you never performed an
explicit O_CREAT, the open is safe, because RTCore has space set aside
for use by the FIFO. In this case, you will be forced to use the default
FIFO size that RTCore was built to use. This is a legacy option for the
/dev/rtf* devices and does not apply to arbitrarily named devices. It
also depends on specific compilation settings in RTCore, and as such,
using arbitrarily-named devices with proper sizing during initialization
is recommended.

6.3.5 Limitations
The real-time kernel is not bound to operate synchronously with the normal
operating system thread. If the real-time kernel is under heavy load, it may
not be able to schedule time for the GPOS to pull the data from the FIFO.
Since the FIFO is implemented as a buffer, it is feasible that the buffer
might fill from the real-time side before the user-space thread gets a chance
to catch up. In this case, it is advisable to increase the size of the buffer
(with ftruncate()) or to flush the buffer from the real-time code to prevent
the user-space application from receiving invalid data.
The inverse of this problem is that the FIFO cannot be a deterministic
means of getting command data to the real-time module. The real-time
kernel is not forced to run the GPOS thread with any regularity, as it may
have more important things to do. A command input from a graphical
interface on the OS side through the FIFO may not get across immediately,
and determinism should never be assumed.
A subtler problem that must be overcome by the programmer is that
the data passed through the FIFO is completely unstructured. This means
that if the real-time code pushes a structure into the FIFO with something
like write(fd,&x,sizeof(struct x));, the user-space code should pull it
out on the other side by reading the same amount of data into an identical
structure. There has to be some kind of coordination between the two in
order to determine a protocol for the data, as otherwise it will appear to
be a random stream of bits. For many applications, a simple structure will
suffice, possibly with a timestamp in order to determine when the data was
sampled and placed in the FIFO.

6.4 Shared memory


FIFOs provide serialized access to data, which is appropriate for applications
that operate with data in a queued manner. However, many applications
require both userspace and real-time code to work with large chunks of data,
and this is not always convenient to stream in and out of a FIFO. RTCore
provides an option for these workloads: shared memory with mmap().

6.4.1 mmap()
If you are not familiar with mmap(), please refer to the RTCore or standard
man page for full details. The basic idea is that you open a file descriptor,
call mmap() on it with a given range, and it returns a pointer to an area in
this file or device. Under RTCore, this is used with a device. As we shall
see, the real-time module and the user-space application both open the
same device, call mmap(), and can subsequently access the same area of
memory.
The shared memory devices themselves are created with the POSIX
shm_open(), destroyed with shm_unlink(), and sized with ftruncate().
Please refer to the man pages for specific details - only an overview will
be given here.
First, the device must be created. This is done with shm_open(), which
takes the name of the device, open flags, and optionally a set of
permission bits. If you are the first user and are creating the device,
use RTL_O_CREAT. Furthermore, if you want this device to be automatically
visible in the GPOS filesystem, specify a non-zero value for the
permission bits. For example, the following call creates a node named
/dev/rtl_shm_region that is visible to the GPOS with permission 0600, and
returns a usable file descriptor attached to the device:

int shm_fd = shm_open("/dev/rtl_shm_region",


RTL_O_CREAT, 0600);

Now you have a handle to a shared region - however, it doesn't have a
default size. This must be set via a call to ftruncate(), as in:

ftruncate(shm_fd,400000);

Note that this will round up the size of the shared region in order to align
it on a page boundary (page size is dependent on architecture but generally
4096 bytes). Also, as it does perform memory allocation, it must occur in
the initialization segment. Now you can use mmap() from either real-time
code or user-space code, as in:

addr = (char*)mmap(0,MMAP_SIZE,PROT_READ|PROT_WRITE,
MAP_SHARED,shm_fd,0);

The resulting addr can be used to address anything in that region up to
the size specified by the value passed to ftruncate().
Once the code is done with the area, it can call close() on the file
descriptor. The last user calls shm_unlink() on the name of the device to
destroy the area and unlink it from the GPOS filesystem:

close(shm_fd);
shm_unlink("/dev/rtl_shm_region");

It is worth noting that these need not occur in that order: if a thread is still
using the area and another calls shm_unlink(), the region will remain valid
until the last user calls close() on the file descriptor. RTCore does reference
counting on devices like shared memory and FIFOs in order to allow this
behavior.

6.4.2 An Example
The theory and practice are very simple, so without further discussion, let’s
look at an example. First, the real-time application:

#include <time.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <sys/mman.h>

#define MMAP_SIZE 5003

pthread_t rthread, wthread;
int rfd, wfd;
unsigned char *raddr, *waddr;

void *writer(void *arg)
{
    struct timespec next;
    struct sched_param p;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    waddr = (char *)mmap(0, MMAP_SIZE, PROT_READ|PROT_WRITE,
                         MAP_SHARED, wfd, 0);
    if (waddr == MAP_FAILED) {
        printf("mmap failed for writer\n");
        return (void *)-1;
    }

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        waddr[0]++;
        waddr[1]++;
        waddr[2]++;
        waddr[3]++;
    }
}

void *reader(void *arg)
{
    struct timespec next;
    struct sched_param p;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    raddr = (char *)mmap(0, MMAP_SIZE, PROT_READ|PROT_WRITE,
                         MAP_SHARED, rfd, 0);
    if (raddr == MAP_FAILED) {
        printf("failed mmap for reader\n");
        return (void *)-1;
    }

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        printf("rtl_reader thread sees "
               "0x%x, 0x%x, 0x%x, 0x%x\n",
               raddr[0], raddr[1], raddr[2], raddr[3]);
    }
}

int main(int argc, char **argv)
{
    wfd = shm_open("/dev/rtl_mmap_test", RTL_O_CREAT, 0600);
    if (wfd == -1) {
        printf("open failed for write on "
               "/dev/rtl_mmap_test (%d)\n", errno);
        return -1;
    }

    rfd = shm_open("/dev/rtl_mmap_test", 0, 0);
    if (rfd == -1) {
        printf("open failed for read on "
               "/dev/rtl_mmap_test (%d)\n", errno);
        return -1;
    }

    ftruncate(wfd, MMAP_SIZE);

    pthread_create(&wthread, NULL, writer, 0);
    pthread_create(&rthread, NULL, reader, 0);

    rtl_main_wait();

    pthread_cancel(wthread);
    pthread_join(wthread, NULL);
    pthread_cancel(rthread);
    pthread_join(rthread, NULL);
    munmap(waddr, MMAP_SIZE);
    munmap(raddr, MMAP_SIZE);

    close(wfd);
    close(rfd);
    shm_unlink("/dev/rtl_mmap_test");
    return 0;
}

First, we create and open a device twice, once for a reader thread and
once for a writer. A thread is spawned for each task, and each thread performs
its own mmap(). Note that the ftruncate() call is in the main() context, as it
needs to perform memory allocation to back the shared area. Further calls
such as mmap() that don’t cause allocations can happen anywhere.
The result of the mmap() call is a reference to the shared area, so once we
have the handles we need, we can reference the area freely. One thread updates
the area every second, and the other reads it. Now we have an area that
is shared between real-time threads, but what about userspace? The same
mechanism applies, as you can see here:

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>

#define MMAP_SIZE 5003

int main(void)
{
    int fd;
    unsigned char *addr;

    if ((fd = open("/dev/rtl_mmap_test", O_RDWR)) < 0) {
        perror("open");
        exit(-1);
    }

    addr = mmap(0, MMAP_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        printf("return was %d\n", errno);
        perror("mmap");
        exit(-1);
    }

    while (1) {
        printf("userspace: the rtl shared area contains"
               " : 0x%x, 0x%x, 0x%x, 0x%x\n",
               addr[0], addr[1], addr[2], addr[3]);
        sleep(1);
    }

    munmap(addr, MMAP_SIZE);
    close(fd);
    return 0;
}

There isn’t much work involved here. The code opens the device as a
normal file and calls mmap() on it just as before. This piece of code performs
the same action as the reader in the real-time space, dumping the values of
the first few bytes of data every second or so. As the writer updates the area,
both the real-time reader and the user-space program see the same changes.
As with other RTCore mechanisms, it is assumed that the real-time side
does the initial work of creating the shared area. This ensures that the real-
time code has a handle on what exists, and doesn’t have to wait for some
user-space application to get around to doing the work first. If you attempt
to start the user-space code first, it will fail: the device isn’t there to be
opened until shm_open() is called from real-time code, and even if the node
exists, no hooks have been registered for the device yet.

6.4.3 Limitations
With shared memory, there is no inherent coordination between userspace
and real-time code, as you can see in the example. Any rules governing usage of
the area must be added by your code. At any point, user code can overwrite
an area in which a real-time thread needed to retain data. In addition, one
can’t write to the area from real-time code and then block waiting for it to be
read and cleared when Linux gets around to scheduling your user-space
process. This would delay your real-time code indefinitely.
A little bit of synchronization can solve this type of problem. For example,
if you are using the area to get frames of data over to user-space, the real-time
thread could write the blocks at a given interval across the shared space, and
prepend each segment with a status byte indicating the state of the data.
The user-space program, when it is done reading or analyzing each segment,
can update that status byte to show that the segment has been consumed.
This way the real-time side can easily tell which areas are safe to overwrite.
This by-hand coordination can also easily allow you to direct real-time
code from user-space. One simple use is to allow control of real-time threads.
If both ends know that a certain area is meant to direct the actions of a real-
time thread, userspace code can easily flip a bit and indicate that a certain
thread should be suspended, resumed, or even spawned. This can be used
to (non-deterministically) direct nearly anything that the real-time code is
doing, or vice-versa.

6.5 Soft interrupts


On x86 platforms running Linux, you will normally only find interrupts
numbered from 0 to 15, plus the NMI, as in the following:

CPU0
0: 75636868 XT-PIC timer
1: 6 XT-PIC keyboard
2: 0 XT-PIC cascade
4: 106 XT-PIC serial
5: 157842206 XT-PIC eth0
8: 1 XT-PIC rtc
13: 1 XT-PIC fpu
14: 13637083 XT-PIC ide0
 15: 12966 XT-PIC ide1
NMI: 0

On systems running RTCore, higher interrupt numbers, ranging from 16 to
223, also show up in /proc/interrupts: 2

CPU0
0: 1398262 RTLinux virtual irq timer
1: 4 RTLinux virtual irq keyboard
2: 0 RTLinux virtual irq cascade
11: 4902708 RTLinux virtual irq usb-uhci, eth0
12: 0 RTLinux virtual irq PS/2 Mouse
14: 29546 RTLinux virtual irq ide0
15: 5 RTLinux virtual irq ide1
219: 12178 RTLinux virtual irq softirq jitter test
220: 0 RTLinux virtual irq RTLinux Scheduler
221: 26 RTLinux virtual irq RTLinux FIFO
222: 1293626 RTLinux virtual irq RTLinux CLOCK_GPOS
223: 5124 RTLinux virtual irq RTLinux printf
NMI: 0
ERR: 0

The interrupts above IRQ 15 are the software interrupts provided by
RTCore, although they still appear to be real hardware interrupts as far
as Linux is concerned. The handler for these interrupts is executed in the
GPOS’s kernel context, permitting a real-time thread to indirectly call func-
tions within the GPOS kernel safely.
This demands a little explanation - you cannot safely call GPOS kernel
functions from within the real-time kernel, as many of those calls will block
while the Linux kernel performs various tasks. This generally leads to dead-
lock, and has obvious implications for code that is supposed to be executing
deterministically. A safe way around this is to register a software interrupt
handler in the Linux kernel that waits for a certain interrupt. When the
real-time code requires a service to be done asynchronously in the GPOS
space, it signals an interrupt for this handler. The handler will not execute
in real-time, so the real-time code is not blocked in any way, but there is no
guaranteed worst-case delay between pending the soft interrupt and its
actual execution. This is due to the same reason as before: the real-time
kernel may prevent the GPOS kernel from running for some time, depending
on the current set of demands. However, for soft-real-time tasks, this is
generally a sufficient approach.

2 RTCoreBSD systems have a limit of 32 soft IRQs.
Again, it must be stressed that the GPOS is only seeing RTCore virtual
IRQs. The handlers the GPOS had installed before RTCore was loaded are
not affected but are now managed by the interrupt emulation layer, and thus
have become soft interrupts. This process of insertion is handled transpar-
ently to GPOS drivers.
This mechanism underlies many inter-kernel communication facilities.
As previously discussed, rtl_printf() uses it to pass data to
the kernel ring buffer. It could also serve as a way for real-time code to
allocate memory, by signalling a GPOS handler to safely perform the memory
management asynchronously.

6.5.1 The API

#include <stdlib.h>

To set up a software interrupt, only a few functions are needed. Interrupts
are registered and deregistered with rtl_get_soft_irq() and
rtl_free_soft_irq():

int rtl_get_soft_irq(void (* handler)(int, void *, struct rtl_frame *),
                     const char * devname);
void rtl_free_soft_irq(unsigned int irq);

The string passed as the second argument to rtl_get_soft_irq() is the
name that will be associated with the IRQ, which on Linux will be
displayed in /proc/interrupts. It is a good idea to make this something
meaningful, especially if you are making heavy use of the soft IRQ handlers.
The interrupt number assigned is the first free interrupt number from
the top down. As such, there is little risk that it will ever collide with a real
hardware interrupt. rtl_get_soft_irq() will return -1 on failure,
and otherwise returns the number of the IRQ registered.

void rtl_global_pend_irq(int irq);

To actually signal the interrupt to Linux, rtl_global_pend_irq() is
called with the soft interrupt number. When the Linux kernel next runs, it
will see this interrupt as pending and execute your Linux handler.
The interrupt handler declaration is just like the one you would use for a
regular Linux interrupt handler:

static void my_handler(int irq, void *ignore,
                       struct rtl_frame *ignore_frame);

The same restrictions that apply to Linux-based hardware interrupt han-
dlers apply to soft interrupt handlers, with respect to things like synchro-
nization with Linux kernel resources from within an interrupt handler, etc.

6.5.2 An Example
This section wouldn’t be complete without a simple example. The soft IRQ
API is fairly small, so let’s look at a piece of code that uses all of the calls:

#include <time.h>
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>

pthread_t thread;
static int our_soft_irq;

void *start_routine(void *arg)
{
    struct sched_param p;
    struct timespec next;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(),
                          SCHED_FIFO, &p);

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 500000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        rtl_global_pend_irq(our_soft_irq);
    }
    return 0;
}

static int soft_irq_count;

void soft_irq_handler(int irq, void *ignore,
                      struct rtl_frame *ignore_frame) {
    soft_irq_count++;
    printf("Received soft IRQ #%d\n", soft_irq_count);
}

int main(int argc, char **argv) {
    soft_irq_count = 0;
    our_soft_irq = rtl_get_soft_irq(soft_irq_handler,
                                    "Simple SoftIRQ");
    if (our_soft_irq == -1)
        return -1;
    pthread_create(&thread, NULL, start_routine, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    rtl_free_soft_irq(our_soft_irq);
    return 0;
}

On initialization, we get a soft IRQ, providing the function that should
act as the handler, and a short name. If this call is successful, we spawn a
thread.
From this point on, our soft_irq_handler() is registered in the Linux
kernel as an interrupt handler, and we have a real-time thread in an infinite
loop. In this loop, it wakes at half-second intervals, pending our soft
IRQ each time. These interrupts are caught by Linux, which executes our
soft_irq_handler(), which in turn prints the current interrupt count.
On exit, the tail end of main() destroys our real-time thread as
usual, and then deregisters the soft IRQ handler.
As you can see, it isn’t very hard to interact with the Linux kernel in this
fashion. By simply pending interrupts, you can trigger your own handlers to
do some dirty work in the GPOS kernel, without sacrificing determinism in
your real-time code.
Chapter 7

Debugging in RTCore

No one likes to admit it, but most developers spend a large chunk of time
debugging code, rather than writing it. Bugs in RTCore can be even more
difficult to track down: by inserting any debug traces or other mechanisms,
the system is changed, and all of a sudden the bug won’t trigger. (Timing-
dependent bugs are of course possible in other systems, but are more prevalent
in real-time development.)
Additionally, all real-time code, if it is running inside the RTCore kernel,
has the potential to halt the machine (PSDD threads live in external address
spaces). Debugging userspace applications is simpler, as a failure will simply
result in the death of the process, not the kernel. Trying to tackle the bug
is usually just a matter of cleaning up and trying the program again. These
luxuries are harder to come by in the kernel.
Fortunately, RTCore provides a debugger that can often prevent pro-
gramming errors from bringing the system down. Loaded with the rest of
RTCore (it can be disabled through recompilation, with the source kit), the
RTCore debugger watches for exceptions of any kind, and stops the thread
that caused the problem before the system goes down.

7.1 Enabling the debugger


The debugger is enabled during configuration of RTCore, under selective
component building options.


7.2 An example
There are some important things to know about the debugger, but before
getting into the details, let’s walk through a simple example to describe ex-
actly what we are talking about. As with anything else, the first step is a
hello world application:

#include <time.h>
#include <stdio.h>
#include <pthread.h>

pthread_t thread;
pthread_t thread2;

void *start_routine(void *arg)
{
    int i;
    struct sched_param p;
    struct timespec next;

    volatile pthread_t self;
    self = pthread_self();

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    if (((long) arg) == 1) {
        /* cause a memory access error */
        *(unsigned long *)0 = 0x9;
    }

    clock_gettime(CLOCK_REALTIME, &next);

    for (i = 0; i < 20; i++) {
        timespec_add_ns(&next, 500000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        printf("I'm here; my arg is %ld\n", (long) arg);
    }
    return 0;
}

int main(int argc, char **argv)
{
    pthread_create(&thread, NULL, start_routine, (void *) 1);
    pthread_create(&thread2, NULL, start_routine, (void *) 2);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    pthread_cancel(thread2);
    pthread_join(thread2, NULL);
    return 0;
}

As with our other examples, we have an initialization context and a
cleanup context, with real-time code that is run in between. In our initial-
ization, we spawn two real-time threads running the same function, with an
error (an access of illegal memory) that the first thread will hit, as its argument
is 1.
What happens when this module is loaded is that the first thread is
spawned and causes a memory access error. The debugger catches this, and
halts all real-time threads. This means that the second thread is also halted,
so there is no stream of ”I’m here” messages from the second thread, even
though it doesn’t have a problem. This is to allow for a completely known
system state that the developer can step through at will.
The debugger prints a notice of the exception to the console, so that run-
ning ’dmesg’ will produce a line detailing which thread caused the exception,
where it was and how to begin debugging. Now we can start the debugger
and analyze the running code.
RTLinuxPro provides the real-time debugger module, and also GDB to
be used from userspace. Other debuggers are also usable, such as DDD, but
we will assume GDB for this example. Now that we have real-time code that
has hit an exception, we can run GDB on the object file that was saved for
us during compilation for debugging:

# gdb hello.o.debug
(gdb)

The next step is to connect GDB to the real-time system. This is accom-
plished using the remote debugging facility of GDB. The real-time system
provides a real-time FIFO for debugger communication:

(gdb) target remote /dev/rtf10
Remote debugging using /dev/rtf10

The RTCore debugger uses three consecutive real-time devices: /dev/rtf10,
/dev/rtf11, and /dev/rtf12. The starting FIFO can be changed with the
source version of the kit. Future versions of RTCore may use the named
FIFO capability of RTCore rather than the older /dev/rtf devices.
Now, in our case, we expect to see a memory access violation. Once the
target remote /dev/rtf10 command is entered, we should see GDB display
the following:

Remote debugging using /dev/rtf10
[New Thread 1123450880]
start_routine (arg=0x1) at test.c:25
25          *(unsigned long *)0 = 0x9;
(gdb)

The above message tells us that we are indeed debugging through /dev/rtf10,
that the thread ID that faulted is 1123450880 and that the fault was in the
function start routine which was passed 1 argument named arg with value
0x1. This is all contained in source file test.c on source line 25. GDB also
displays the actual source line in question. The error that was generated was
indeed where we placed it.
Now, we examine the function call history. This may be necessary in
complex applications in order to determine the source of an error. Typing
bt will cause GDB to print the stack backtrace that led to this point.

(gdb) bt
#0 start_routine (arg=0x1) at test.c:25
#1 0xd1153227 in ?? ()

Perhaps it is not clear what type of variables are being operated on. If
you wish to examine the types and values of some variables, use the following
commands:

(gdb) whatis arg
type = void *
(gdb) print arg
$1 = (void *) 0x1

To get a better idea of what other operations are being performed in this
function, one can list the source code for any function name or any set of
line numbers with:

(gdb) list start_routine
16
17 volatile pthread_t self;
18 self = pthread_self();
19
20          p.sched_priority = 1;
21 pthread_setschedparam (pthread_self(), SCHED_FIFO, &p);
22
23 if (((long) arg) == 1) {
24 /* cause a memory access error */
25 *(unsigned long *)0 = 0x9;

It is also possible to disassemble the executable code in any region of
memory. For example, to view the start_routine function:

(gdb) disassemble start_routine
Dump of assembler code for function start_routine:
0xd1137060 <start_routine>: push %ebp
0xd1137061 <start_routine+1>: mov %esp,%ebp
0xd1137063 <start_routine+3>: sub $0x10,%esp
0xd1137066 <start_routine+6>: lea 0xfffffff8(%ebp),%eax
0xd1137069 <start_routine+9>: push %edi
0xd113706a <start_routine+10>: push %esi
0xd113706b <start_routine+11>: push %ebx
0xd113706c <start_routine+12>: mov 0xd116f8a0,%edx
0xd1137072 <start_routine+18>: mov %edx,0xfffffffc(%ebp)
0xd1137075 <start_routine+21>: movl $0x1,0xfffffff8(%ebp)
0xd113707c <start_routine+28>: push %eax
0xd113707d <start_routine+29>: push $0x1
0xd113707f <start_routine+31>: push %edx
0xd1137080 <start_routine+32>: call 0xd11540e4
0xd1137085 <start_routine+37>: add $0xc,%esp
0xd1137088 <start_routine+40>: cmpl $0x1,0x8(%ebp)
0xd113708c <start_routine+44>: jne 0xd1137098 <start_routine+56>
0xd113708e <start_routine+46>: movl $0x9,0x0
0xd1137098 <start_routine+56>: lea 0xfffffff0(%ebp),%ebx
0xd113709b <start_routine+59>: push %ebx
0xd113709c <start_routine+60>: mov 0xd1161e8c,%eax
0xd11370a1 <start_routine+65>: push %eax

Once you are done debugging, you may exit the debugger and stop exe-
cution of the process being debugged.

(gdb) quit
The program is running. Exit anyway? (y or n) y

RTCore will resume execution of all threads, but will leave the application
that was being debugged stopped. To actually remove the application or
module, you must stop it through the usual means - either by sending it a
signal (perhaps by typing control-c in its window) or by removing the
application module.

7.3 Notes
There are a few items to keep in mind when using the RTCore debugger.
Most of these are short but important, and remembering them will make
your debugging sessions more effective.

7.3.1 Overhead
The debugger module, when loaded, catches all exceptions raised, regardless
of whether it is related to real-time code, GPOS, or otherwise. This incurs
some overhead: Consider for example the case where a userspace program
causes several page faults as it is working through some data. These page
faults cause the debugger to do at least some minor work to see if the fault
is real-time related. This may lead to a slight degradation of the GPOS
performance, so if the GPOS really needs some extra processing, the debug-
ger module may be removed. In practice, however, the benefits of having
protection against misbehaving RT programs usually outweigh the overhead
incurred by the debugger.
For those who wish to avoid this overhead, the source version of RTCore
allows you to reconfigure the OS without the debugger for production use.

7.3.2 Intercepting unsafe FPU use

Real-time threads that use the FPU must enable floating point operations
with pthread_attr_setfpu_np(). If they do not do this, they cannot safely
use the floating point unit on the CPU, as the FPU context will not be
maintained for them.
On PPC systems, the debugger will detect threads that use the FPU
without enabling it, and raise a fault.

7.3.3 Remote debugging

Sometimes it is helpful to debug code remotely. This usually occurs when
the remote machine is a different architecture, and you don’t want to run
GDB on the target machine itself. (RTLinuxPro provides GDB, but there
may not be enough room on the target device, you may need some additional
tools, etc.) In this case, netcat is the preferred option.
Netcat provides the ability to pipe file data over a given port. In the
context of the RTCore debugger, this means that we can start netcat on the
target such that it essentially exports /dev/rtf10 over the network. Here is
an example of how to start netcat on the target machine:

nc -l -p 5000 >/dev/rtf10 </dev/rtf10 &
This starts netcat on the device, listening on port 5000, feeding data from
the network listener into the FIFO, and also pushing data coming out of the
FIFO out onto that same listener. In GDB running on the development
host, you can connect to the remote real-time system with target remote
targethost:5000, where targethost is the target machine name.
Netcat will exit when the user detaches from the socket, so if you are
going to do many debugging runs, it is helpful to run it in a loop, as in:

while :; do nc -l -p 5000 >/dev/rtf10 </dev/rtf10 ; done

7.3.4 Safely stopping faulted applications

Once you are done analyzing the state of the system, the faulty application
must be stopped. This can be done with the following series of commands:

(gdb) CTRL-Z
[1]+ Stopped gdb
# killall app_name
# kill %1

Make sure to not trigger any GDB commands that would cause the real-
time code to continue, as it would just execute the faulty code again.

7.3.5 GDB notes
GDB has a problem with examining data in the bss section, so any variables
that were not explicitly initialized are not viewable from GDB. This may be
fixed in a later release, but in the meantime, it is simplest to initialize any
variables that will be analyzed with GDB.
The RTCore debugger can also be used to debug user-space (PSDD) RT
threads.1 Debugging threads running under the userspace frame scheduler is
also supported.
Under NetBSD, the RTCore symbols must be explicitly loaded. This can
be done with:

gdb hello.o
(gdb) symbol-file /var/run/rtlmod/ksyms
(gdb) target remote /dev/rtf10

1 At present, only with Linux.
Chapter 8

Tracing in RTCore

8.1 Introduction
Real-time programs can be challenging to debug because traditional debug-
ging techniques such as setting breakpoints and step-by-step execution are
not always appropriate. This is mainly due to two reasons:

• Some errors are in the timing of the system. Stopping the program
changes the timing, so the system cannot be analyzed without modi-
fying its behavior.

• If the real-time program controls certain hardware, suspending the pro-
gram for analysis may cause the hardware to malfunction or even break.

RTCore implements a subset of the POSIX trace facilities. Using them, it
is possible to analyze and evaluate real-time performance while a real-time
program is running. An introduction to POSIX tracing as well as the
API definitions can be found in the Single UNIX Specification.1
The tracer aims to follow the POSIX Tracing API reasonably closely. One
notable difference is that most functions and constants have an RTL_TRACE_
or rtl_trace_ prefix rather than posix_trace_. The API functions are de-
clared in the include/rtl_trace.h file. To use the tracer, the CONFIG_RTL_TRACER
option (”RTLinux tracer support”) must be enabled during configuration of
the system.

1 http://www.unix.org/version3/


examples/tracer/rtl_trace_default.o is a module that creates a trace
stream for each CPU and starts the tracing.
To see a quick demonstration, recompile the system with CONFIG_RTL_TRACER
enabled, load RTCore along with the examples/tracer/rtl_trace_default.o
and examples/tracer/testmod.o modules, and run examples/tracer/tracedump 0,
where 0 can be replaced with the desired CPU number. You should see a
dump of the stream of events on the target CPU.

8.2 Basic Usage of the Tracer

There are two parties involved in tracing: the program being analyzed and
the analyzer process. When the program to be analyzed is instrumented for
tracing, it records information about events encountered during execu-
tion. For each event, information about the current CPU, the current thread
id, a timestamp, and optional user data is recorded into an in-memory buffer.
The RTCore tracer provides built-in trace points for certain system events,
such as context switches. The list of currently supported system events is
provided in the next section. In addition, an RTCore program can trace
user-defined events by invoking the rtl_trace_event function with
RTL_TRACE_UNNAMED_USEREVENT as the event id.
Before the tracing can be started, a POSIX trace stream must be created.
For an example of creating a trace stream, please see the examples/tracer/
rtl_trace_default.o module.
The analyzer process is a GPOS (userspace) process that reads the event
records made by the trace subsystem. This is done with the functions
rtl_trace_trygetnext_event, rtl_trace_getnext_event, and
rtl_trace_timedgetnext_event. An example of a trace analyzer process
can be found in examples/tracer/tracedump.c.

8.3 POSIX Events

For every event, the following members of struct rtl_trace_event_info
are filled in:

• posix_event_id is the event identifier.

• posix_timestamp is a struct timespec representing the time of the
event; the clock used does not necessarily correspond to any of the
system clocks.

• posix_thread_id is the thread id of the current thread.

The list of currently supported events includes:

• RTL_TRACE_OVERFLOW – The system detected an overflow. Some events
have been lost. It is necessary to reset any profiling in progress to avoid
getting incorrect results.

• RTL_TRACE_RESUME – The system has recovered from an overflow con-
dition.

• RTL_TRACE_SCHED_CTX_SWITCH – A context switch. The accompanying
data is a void * pointer to the new thread.

• RTL_TRACE_CLOCK_NANOSLEEP – The thread invoked the clock_nanosleep
call.

• RTL_TRACE_BLOCK – The thread voluntarily blocks itself (e.g., as a result
of a clock_nanosleep call).

• RTL_TRACE_UNNAMED_USEREVENT – This is a user-defined event. The data
can be arbitrary.

Events may be selectively enabled for tracing with the rtl_trace_set_filter
function. For best performance, it is advisable to disable unneeded event
types.
It is possible to perform function call tracing with the help of the tracer.
To do this, the program to be analyzed must be compiled with the
-finstrument-functions option to gcc. For an example, please see
examples/tracer/testmod.c in the RTCore distribution. For modules
compiled with -finstrument-functions, two special events are generated:

• RTL_TRACE_FUNC_ENTER – Function entry. event->posix_prog_address
represents the address in the program from which the function call has
been made. The data that accompanies this event is a void * pointer
to the function that has been called.

• RTL_TRACE_FUNC_EXIT – Function exit. event->posix_prog_address
is a pointer to the function that has exited. The data that accompanies
this event is a void * pointer to the place from which the function call
has been made.
Chapter 9

IRQ Control

Once RTCore is loaded, the GPOS does not have any direct control over
hardware IRQs - manipulation is handled through RTCore when there are
no real-time demands. However, RTCore applications can manipulate IRQs
for real-time control. We’ll now cover the basic usage of the IRQ control
routines.

9.1 Interrupt handler control

First, let’s look at the calls needed to manage interrupt handlers. Unless
otherwise specified, incoming interrupts are handled only by the original
GPOS interrupt handlers, once there are no real-time demands. Here we
cover how to set up your own real-time interrupt handlers.

9.1.1 Requesting an IRQ

An RTCore application can install an IRQ handler with the call
rtl_request_irq(irq_num, irq_handler), where the irq_handler
parameter is a function of type:

unsigned int handler(unsigned int irq, struct rtl_frame *regs);

This will hook the function passed as the second argument to rtl_request_irq()
so that it is called when IRQ irq_num occurs, much like any other IRQ handler.
When that function is invoked, it will run in interrupt context. This means
that some functions may not be callable from the handler, and all interrupts
will be disabled. This handler is not directly debuggable, but as threads are,
it is safe to post to a semaphore that a thread is waiting on. The thread
will be switched to immediately, so that operations can be performed in a
real-time thread context. When that thread performs any operation that
causes a thread switch, control will return to the interrupt handler.

9.1.2 Releasing an IRQ

An IRQ can be released with rtl_free_irq(irq_num). This will unhook the
handler given to rtl_request_irq(), and it will not be called again. However,
it is possible that this interrupt handler is still executing on the current or
another CPU, so care should be taken by the application programmer to
ensure this is not the case.

9.1.3 Pending an IRQ


Many applications require that the GPOS interrupt handler receive an IRQ
in addition to the handler installed by rtl_request_irq, once that handler
is done with its work. The RTCore application might only be interested in
keeping track of when IRQs are coming in, or in some simple statistic, before
allowing the GPOS to proceed and handle the work.

In these cases, the rtl_global_pend_irq(irq_num) function should be
used. This will pend the IRQ for the GPOS, and once the RTOS is finished
the GPOS will process it as a pending IRQ.

9.1.4 A basic example


Let's look at a basic example of an application that tracks incoming IRQs for
the GPOS. This grabs an IRQ with rtl_request_irq(), pends it during op-
eration with rtl_global_pend_irq(), and releases it with rtl_free_irq().

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#include <semaphore.h>

pthread_t thread;
sem_t irqsem;
int irq = -1;

void *thread_code(void *t) {
    static int count = 0;

    while (1) {
        sem_wait(&irqsem);
        count++;
        printf("IRQ %d has occurred %d times\n", irq, count);
    }

    return NULL;
}

unsigned int intr_handler(unsigned int irq, struct rtl_frame *regs) {
    rtl_global_pend_irq(irq);
    sem_post(&irqsem);
    return 0;
}

int main(int argc, char **argv)
{
    int ret;

    if ((argc != 2) || strncmp(argv[1], "irq=", 4)) {
        printf("Usage: %s: irq=#\n", argv[0]);
        return -1;
    }

    irq = atoi(&argv[1][4]);

    sem_init(&irqsem, 1, 0);

    pthread_create(&thread, NULL, thread_code, (void *)0);

    if ((ret = rtl_request_irq(irq, intr_handler)) != 0) {
        printf("failed to get irq %d\n", irq);
        ret = -1;
        goto out;
    }

    rtl_main_wait();

    rtl_free_irq(irq);

out:
    pthread_cancel(thread);
    pthread_join(thread, NULL);

    return ret;
}

This code initializes and pulls the requested IRQ for tracking from the
passed-in arguments. It then spawns a thread that waits on a semaphore -
this thread prints the IRQ count as interrupts occur. As mentioned, the
handler is invoked in interrupt context, and as such is fairly limited in what
it can do. Instead, the handler is hooked up but does no real work except
for the rtl_global_pend_irq() for the GPOS and the sem_post() for the
thread.

As with the other examples, this can continue indefinitely. If it is hooked
to the interrupt for a hard drive, it will trigger a message with a count for
each interrupt triggered by the device. When the application is stopped with
a CTRL-C, it will release the IRQ handler, kill the thread, and unload as
usual. The GPOS IRQ handler will then be the only handler for the device.

9.1.5 Specifics when on NetBSD


RTCore on BSD UNIX also requires that you call rtl_map_gpos_irq(bsd_irq)
to obtain the IRQ identifier prior to using any RTCore interrupt control func-
tions. This function transforms the NetBSD IRQ identifier into an RTCore
IRQ.

NetBSD's interrupt scheme changed considerably with the addition of
SMP support. This call maintains compatibility with RTCore interrupt han-
dling, and must be called before functions like rtl_request_irq().

The IRQ can be an ISA IRQ number (e.g., IRQ7 for LPT), or the return
value from the PCI interrupt lookup function pci_intr_map(). On success,
the function returns a value that can be used with rtl_request_irq() or other
RTCore IRQ functions. On error, a negative value is returned.

9.2 IRQ state control


Besides interrupt handlers, applications commonly need to control interrupt
states - specifically, whether interrupts are enabled or disabled. This is a
common means of synchronization for some tasks, although less intrusive
means of mutual exclusion are generally possible. Here we cover how to
enable and disable interrupts, save state, and similar tasks.

9.2.1 Disabling and enabling all interrupts


Generally, interrupts are disabled with a "hard" disable and enabled with a
"hard" enable. When RTCore is running, any enable and disable calls made
by the GPOS are virtualized so that they do not actually disable real inter-
rupts. RTCore applications can directly disable hardware interrupts with
rtl_stop_interrupts and enable them again with rtl_allow_interrupts.
These function calls enable and disable interrupts unconditionally. Some-
times, it is preferable to save the current state, disable interrupts, perform
some critical work and then restore the saved state. This can be done with
the sequence below:

#include <rtl_sync.h>

void function(void)
{
    rtl_irqstate_t flags;

    /* save state and disable interrupts */
    rtl_no_interrupts(flags);

    /* perform some critical operation... */

    /* restore the previous interrupt state */
    rtl_restore_interrupts(flags);
}

These calls do disable the real interrupts, so they must be used with
care. Interrupts should never be disabled longer than absolutely necessary,
as events may be missed. The system may also run out of control if the ap-
plication never re-enables the interrupts. However, some applications cannot
handle any kind of jitter during certain operations, even the minimal over-
head of receiving an Ethernet IRQ, and must disable all interrupts for short
periods.
While this is a simple mechanism for synchronization, it cannot be stressed
enough that lighter mechanisms that do not disable interrupts are almost al-
ways favorable. Even if you think that the code protected with disabled
interrupts is not on an important path, it may be running on the same hard-
ware with another application that cannot tolerate that kind of behavior.
Please see section 5.7.3 for more details.
Specific IRQs can be enabled or disabled with rtl_hard_enable_irq(irq_num)
and rtl_hard_disable_irq(irq_num) respectively. This allows the user to
target a specific IRQ rather than the entire set.

9.3 Spinlocks
pthread_spin_lock includes an implicit save of the IRQ state and an inter-
rupt disable, and pthread_spin_unlock includes an implicit restore of the
interrupt state saved at the time of the corresponding pthread_spin_lock
call.

This can be a problem in cases where the locks are released in a different
order than they were taken. For example:

#include <pthread.h>

void function(void)
{
    pthread_spinlock_t lock1, lock2;

    /* initialize the locks */
    pthread_spin_init(&lock1, 0);
    pthread_spin_init(&lock2, 0);

    /* ...assume interrupts are enabled here... */

    /* acquire lock 1 */
    pthread_spin_lock(&lock1);
    /* ...interrupts are now disabled here... */

    /* acquire lock 2 */
    pthread_spin_lock(&lock2);
    /* the state saved in lock2 is interrupts "disabled" */

    /* release lock 1 */
    pthread_spin_unlock(&lock1);
    /* interrupts are now enabled, since the state saved in lock1
       was "enabled" - while lock2 is still held! */

    /* release lock 2 */
    pthread_spin_unlock(&lock2);
    /* restored to an interrupt-disabled state */
}

Note that the interrupt state restored when releasing lock1 and lock2 is
incorrect, since the locks were not released in the reverse of the order in
which they were acquired.
Chapter 10

Writing Device Drivers

This chapter presents examples of several classes of RTCore drivers and how
they interact with user-level programs and other RTCore applications.
Writing RTCore device drivers is very similar to writing normal RTCore
applications. Since all memory, including device memory, is accessible to
RTCore applications, every RTCore program can potentially function as a
driver. Where drivers and normal RTCore applications differ is in how they
communicate with user-space (GPOS) applications and other RTCore pro-
grams.

10.1 Real-time FIFOs


The simplest way of communicating with a driver is through a real-time
FIFO. This is the simplest type of driver and is best used when one-way
communication with the driver is needed, since FIFOs only perform read()
or write() operations. An example is a motor controller that only receives
commands (such as motor speed), or a simple data acquisition device that
sends information (such as the temperature of a probe).

FIFO operations and how to use them in RTCore applications are well
covered in previous chapters, so they will not be covered here.
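To make the one-way byte-stream semantics concrete, here is a minimal plain-C model of such a FIFO. The struct and function names are illustrative only; this is not the RTCore FIFO implementation, just a sketch of the read()/write() contract a FIFO-based driver relies on.

```c
/* Hypothetical model of a one-way command FIFO: a fixed-size ring buffer
 * where the producer only writes and the consumer only reads.
 * FIFO_SIZE must be a power of two for the unsigned index math to work. */
#define FIFO_SIZE 64

struct byte_fifo {
    unsigned char buf[FIFO_SIZE];
    unsigned int head;   /* next position to write */
    unsigned int tail;   /* next position to read  */
};

/* Returns the number of bytes accepted; short if the FIFO fills up. */
static int fifo_write(struct byte_fifo *f, const unsigned char *p, int n)
{
    int written = 0;
    while (written < n && (f->head - f->tail) < FIFO_SIZE) {
        f->buf[f->head++ % FIFO_SIZE] = p[written++];
    }
    return written;
}

/* Returns the number of bytes copied out; 0 if the FIFO is empty. */
static int fifo_read(struct byte_fifo *f, unsigned char *p, int n)
{
    int nread = 0;
    while (nread < n && f->tail != f->head) {
        p[nread++] = f->buf[f->tail++ % FIFO_SIZE];
    }
    return nread;
}
```

A motor-controller driver in this style would simply fifo_read() command bytes in its real-time loop, while the GPOS side fifo_write()s them; neither side ever needs the other half of the interface.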

10.2 POSIX files


A more advanced, and more full featured, interface is through POSIX file
operations. Drivers can advertise their services to other RTCore applications,


and only RTCore applications, through files, just as with a standard UNIX
system. These files are managed by RTCore and are not directly accessible
from the GPOS environment. For example, a Linux application that opens
/dev/lpt0 is communicating with the Linux (non-real-time) parallel port
driver and not the RTCore driver. Conversely, an RTCore application that
opens /dev/lpt0 is communicating with the RTCore driver and not with the
Linux driver.
The example driver below provides a /dev/lpt0 file that can be used
through POSIX open(), read(), write(), ioctl(), mmap() and close()
calls from RTCore applications. Two files, /dev/lpt0 and /dev/lpt1, are
created. When an RTCore application performs any operation on these files,
the driver prints a message.
#include <stdio.h>
#include <sys/types.h>
#include <rtl_posixio.h>

static rtl_ssize_t rtl_par_read(struct rtl_file *filp, char *buf,
                                rtl_size_t count, rtl_off_t *ppos)
{
    printf("read() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static rtl_ssize_t rtl_par_write(struct rtl_file *filp, const char *buf,
                                 rtl_size_t count, rtl_off_t *ppos)
{
    printf("write() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static int rtl_par_ioctl(struct rtl_file *filp,
                         unsigned int request, unsigned long l)
{
    printf("ioctl() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static int rtl_par_open(struct rtl_file *filp)
{
    printf("open() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static int rtl_par_release(struct rtl_file *filp)
{
    printf("close() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static rtl_off_t rtl_par_llseek(struct rtl_file *filp,
                                rtl_off_t off, int flag)
{
    printf("lseek() called on file /dev/lpt%d, offset %d and flag %d\n",
           filp->f_priv, (int)off, flag);
    return 0;
}

int rtl_par_mmap(struct rtl_file *filp, void *a, rtl_size_t b,
                 int c, int d, rtl_off_t e, rtl_caddr_t *f)
{
    return 0;
}

int rtl_par_munmap(struct rtl_file *filp, void *a, rtl_size_t length)
{
    return 0;
}

int rtl_par_unlink(const char *filename, unsigned long i)
{
    printf("unlink() called on %s, should be /dev/lpt%lu\n",
           filename, i);
    return 0;
}

int rtl_par_poll_handler(const struct rtl_sigaction *sigact)
{
    printf("sigaction() with SIGPOLL called\n");
    return 0;
}

int rtl_par_ftruncate(struct rtl_file *filp, rtl_off_t off)
{
    printf("ftruncate() called on file /dev/lpt%d\n",
           filp->f_priv);
    return 0;
}

void rtl_par_destroy(int minor)
{
    printf("destroy() called on minor %d, last user done\n",
           minor);
}

static struct rtl_file_operations rtl_par_fops =
{
    open:                 rtl_par_open,
    release:              rtl_par_release,
    read:                 rtl_par_read,
    write:                rtl_par_write,
    ioctl:                rtl_par_ioctl,
    munmap:               rtl_par_munmap,
    mmap:                 rtl_par_mmap,
    unlink:               rtl_par_unlink,
    install_poll_handler: rtl_par_poll_handler,
    ftruncate:            rtl_par_ftruncate,
    destroy:              rtl_par_destroy
};

int main(int argc, char **argv)
{
    rtl_register_dev("/dev/lpt0", &rtl_par_fops, 0);
    rtl_register_dev("/dev/lpt1", &rtl_par_fops, 1);

    rtl_main_wait();

    rtl_unregister_dev("/dev/lpt0");
    rtl_unregister_dev("/dev/lpt1");

    return 0;
}

10.2.1 Error values


Drivers should report errors to the caller through handler return values for
each operation. For example, a driver that wishes to report a failure during
a write() when there is no space remaining should return -RTL_ENOSPC. The
POSIX file layer of RTCore will treat any return value less than 0 as an
error and will set errno appropriately. So, RTCore applications making
this write() call will receive a -1 return value, and rtl_errno will contain
RTL_ENOSPC. The application can print the errno value through rtl_perror.
A complete list of errno values can be found in include/rtl_errno.h.
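The translation convention can be sketched in plain C. The names drv_write, posix_write, model_errno and MODEL_ENOSPC below are hypothetical stand-ins for a driver handler, the POSIX file layer, rtl_errno and RTL_ENOSPC; this is a model of the convention, not RTCore's code.

```c
#define MODEL_ENOSPC 28   /* stand-in for RTL_ENOSPC */

/* stand-in for the per-thread rtl_errno */
static int model_errno;

/* A driver write handler that reports "no space left": it returns the
 * error as a negative errno value, as described above. */
static int drv_write(const char *buf, int count)
{
    (void)buf;
    (void)count;
    return -MODEL_ENOSPC;
}

/* The file layer: any negative handler return value is translated into
 * a -1 return to the caller, with the error code stored in errno. */
static int posix_write(const char *buf, int count)
{
    int ret = drv_write(buf, count);
    if (ret < 0) {
        model_errno = -ret;
        return -1;
    }
    return ret;
}
```

The caller therefore only ever sees the POSIX-style -1/errno pair, never the raw negative return code from the driver.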

10.2.2 File operations


Any file operation that a driver does not wish to handle can be safely set to
NULL. The RTCore POSIX file layer will check for NULL handlers and will
report the appropriate error to the caller.
A list of ioctl() flags is in include/rtl_ioctl.h. There are many flags
for specific devices and for general use. It is recommended that you not
create new flags unless none of those found in rtl_ioctl.h fits your needs.

Just as there is a list of ioctl() flags, there is also a list of mmap() flags
in sys/mman.h. Only create new flags if you absolutely need them and none
of the pre-existing flags will fit.
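A dispatch layer that tolerates NULL handlers can be sketched as below. The struct, function names and error value are illustrative stand-ins, not RTCore's actual internals; the point is only that the layer checks for NULL before calling through the pointer and reports an error instead of crashing.

```c
#include <stddef.h>

#define MODEL_ENOSYS 38   /* stand-in error code: operation not implemented */

/* A cut-down file-operations table with a single optional handler. */
struct model_fops {
    int (*ioctl)(unsigned int request, unsigned long arg);
};

/* The dispatch layer: NULL handlers become a clean error return. */
static int dispatch_ioctl(const struct model_fops *fops,
                          unsigned int request, unsigned long arg)
{
    if (fops->ioctl == NULL)
        return -MODEL_ENOSYS;        /* unimplemented: report, don't crash */
    return fops->ioctl(request, arg);
}

/* An example handler, to show the non-NULL path. */
static int sample_ioctl(unsigned int request, unsigned long arg)
{
    (void)arg;
    return (int)request;             /* echo the request for illustration */
}
```

A driver can therefore leave any operation it does not support as NULL in its table and rely on the layer above to produce the error for it.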

10.3 Reference counting


Devices registered with the RTCore kernel are reference counted. If you
have poked into include/rtl_posixio.h, you have already seen an extra
callback named destroy(). RTLinuxPro as of version 1.2 has added the
capability of internally handling reference counts for all devices.
From a developer’s standpoint, this generally doesn’t require any extra
work in most situations, but it is worth stepping through the rules of how
RTCore handles these operations. We’ll use our previous simple example as
a reference point.
First, when you register a device with rtl_register_dev(), it registers
the name and sets the usage count to one. The usage count drops back
to 0 when you call rtl_unregister_dev(). Also, any open() call increments
the device's usage count, while close() decrements it again.

For devices that allocate and destroy areas, it is important that when the
last user detaches from the device, any resources associated with that device
are destroyed. Let's look at an example where the device driver maintains a
pointer to a shared region of memory, initialized to NULL, for the user. When
the first user calls open(), memory is allocated for use by the threads. When
the last user detaches from this device through close(), it is important that
the area is deallocated.
This is the reason for the destroy() callback in include/rtl_posixio.h.
If the device has work that needs to be done when the last user exits on a
device, this hook is called. For the shared memory example, we would have
added a destroy callback defined as:

void example_destroy(int minor) {
    rtl_gpos_free(array_ptr);
}
This would have been passed in the fops structure with everything else.
When the last user exits, RTCore will call this function so that memory is
safely deallocated, and not while other threads may be using it. Otherwise,
if some code was using the area when another called rtl_unregister_dev(),
the memory would be freed out from under active code.

RTCore provides a couple of routines to allow you to control these counts
by hand if needed: incr_dev_usage(int minor) and decr_dev_usage(int
minor). This is helpful if you need to work with device resources and want
to make sure that the last user doesn't exit and cause a destruction of all
device resources while this work is occurring. An alternative is to perform
a normal open() on the device, do the work, and then close(). This is
the simplest method, but some drivers may still derive some use from the
incr/decr routines.

There is one more factor to keep in mind when using these calls: the
rtl_namei() call performs an implicit incr_dev_usage(). This is done in
order to simplify the process of safely allocating a device. For functions
that use rtl_namei(), there must be a symmetric decr_dev_usage() call to
prevent an artificially raised usage count.
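The counting rules above can be condensed into a small model. The model_* names are illustrative only, not RTCore APIs; the model simply encodes "register starts at 1, open increments, close and unregister decrement, destroy fires at 0".

```c
/* A toy reference-counted device, modeling the RTCore rules described
 * above.  The destroyed flag stands in for the destroy() callback. */
struct model_dev {
    int usage;      /* current usage count */
    int destroyed;  /* 1 once the destroy() hook would have run */
};

static void model_register(struct model_dev *d)
{
    d->usage = 1;            /* rtl_register_dev() sets the count to one */
    d->destroyed = 0;
}

static void model_put(struct model_dev *d)
{
    if (--d->usage == 0)
        d->destroyed = 1;    /* last user gone: destroy() callback runs */
}

static void model_open(struct model_dev *d)       { d->usage++; }
static void model_close(struct model_dev *d)      { model_put(d); }
static void model_unregister(struct model_dev *d) { model_put(d); }
```

Note the ordering this model makes visible: if a user still holds the device open when the driver unregisters it, the resources survive until that user's close(), which is exactly the behavior the destroy() hook is designed to guarantee.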

10.3.1 Reference counting and userspace


This reference count concept extends to devices available to userspace pro-
cesses. Consider the API call rtl_gpos_mknod(), which allows RTCore code
to create devices visible in the GPOS filesystem. If you create a real-time
device and a userspace-visible counterpart, there may also be userspace pro-
cesses bound to the area. With respect to reference counting, these processes
are treated the same way. Each GPOS open() raises the device count, and a
GPOS close() decrements it. Even if all of the real-time threads close and
exit while one userspace process maintains a handle, the destruction of the
resource waits until the last user closes. When the userspace code exits, the
callbacks will find that it was the last user, and will free any resources just
as if the device had been opened by a real-time thread.
For example, if we had added an rtl_gpos_mknod() call to the creation
of a shared memory device (and an rtl_gpos_unlink() to the cleanup), let a
userspace application also access the area, and then shut down our real-time
threads, the userspace application would still be able to access the area. Once
it exits, the close() would occur and bring our usage count to 0, causing
the destroy callback to execute and clean up.
Of course, this is a fairly simple example, but it doesn’t get much more
complicated in a real-world system. One difference is that most drivers en-
capsulate information on a per-device basis, so the destroy() logic needs to
use the minor parameter in order to determine what should be cleaned up.
However, all of the basic concepts apply, and RTCore does all of the work
for you internally. This allows for greater flexibility and simplicity in the
common driver.
Part II

RTLinuxPro Technologies

Chapter 11

Real-time Networking

11.1 Introduction
For many applications, a simple machine running real-time code will solve
a problem sufficiently. Common problems are generally self contained, and
there usually is no need to refer to external sources in real-time for informa-
tion. The configuration data comes from a user application in the general
purpose OS that is either interacting with the user or with some normal data
source.
However, more complex systems are appearing on the market that need
to access real-time data that may not be contained on the local system. An
example would be a robot with multiple embedded boards connected by an
internal network, where each machine needs to transfer processed information
between components in real-time. Visual information that has been processed
and converted into motion commands needs to get to the board driving the
robot's legs quickly, or the robot may stumble on an obstacle ahead.
RTLinuxPro offers zero copy, hard real-time networking over both Eth-
ernet and FireWire, through a set of common UNIX network APIs. This
allows users to communicate over FireWire links or Ethernet segments with
the same calls one would use anywhere else.
For more information on this package, please refer to the LNet documen-
tation or email business@fsmlabs.com.

Chapter 12

PSDD

12.1 Introduction
The standard RTLinuxPro (RTCore) execution model may be described as
running multiple hard real-time threads in the context of a general purpose
OS kernel. This model is very simple and efficient. However, it also implies
no memory protection boundaries between real-time tasks and the OS kernel.
For some applications, the single name space for all processes may also be a
problem. This is where Process Space Development Domain (PSDD) comes
into play.
In PSDD, real-time threads execute in the context of an ordinary userspace
process and thus have the benefits of memory protection, extended libc
support, and easier development and debugging. It is also possible to use
PSDD for prototyping ordinary in-kernel RTCore modules.

12.2 Hello world with PSDD


Let's look at a PSDD "hello world" application (Figure 12.1). The main()
function locks all the process's pages in RAM, creates an RTCore thread,
and sleeps. The real-time thread prints a message to the system log every
second. This periodic mode of execution is accomplished by obtaining the
current time and using it as a base for the rtl_clock_nanosleep(3) absolute
timeout value.

There are a couple of interesting things about this program. First of all,
we need to use mlockall(2) to make sure we don't get a page fault while


#include <rtl_pthread.h>
#include <rtl_time.h>
#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>

rtl_pthread_t thread;

void *thread_code(void *param) {
    int i = 0;
    struct rtl_timespec next;

    rtl_clock_gettime(RTL_CLOCK_REALTIME, &next);
    next.tv_sec++;
    while (1) {
        rtl_clock_nanosleep(RTL_CLOCK_REALTIME,
                            RTL_TIMER_ABSTIME, &next, NULL);
        rtl_printf("hello world %d\n", i++);
        next.tv_sec++;
    }
    return NULL;
}

int main(void) {
    if (mlockall(MCL_CURRENT | MCL_FUTURE)) {
        perror("mlockall");
        return -1;
    }
    rtl_pthread_create(&thread, NULL, &thread_code, NULL);
    while (1)
        sleep(1);
    return 0;
}

Figure 12.1: PSDD ”hello world” program



all: psddhello

include rtl.mk

psddhello: psddhello.c
	$(USER_CC) $(USER_CFLAGS) -o psddhello psddhello.c \
	    -L$(RTL_LIBS_DIR) -lpsdd -N -static

Figure 12.2: A Makefile for building the PSDD "hello world" program

in real-time mode. Second, rtl_/RTL_ prefixes are added to the names of
all RTCore POSIX functions and constants to distinguish them from other
userspace POSIX threads implementations, e.g. LinuxThreads/glibc.

12.3 Building and running PSDD programs


The above example program can be built using the Makefile shown in Fig-
ure 12.2. rtl.mk is a small makefile fragment found in the top-level directory
of the RTCore distribution. It contains assignments of variables that are
useful in building RTCore applications. We need to link our program against
the PSDD library libpsdd.a.
To run the program, execute ./psddhello as root. You can use dmesg(8)
to view the messages from the program.

12.4 Programming with PSDD


The RTCore paradigm of strict separation between real-time and non-real-
time application components still holds with PSDD. Typically, the main()
program performs application-specific initialization, locks down process pages
in memory, creates some RT threads using rtl_pthread_create(), and then
proceeds to interact with them or just sleeps. Note that real-time threads
execute in the same address space as the process, so shared memory is auto-
matically available.
As with kernel-level RTCore, you are restricted in what you can do in
the real-time threads. First of all, no GPOS system calls are allowed in RT
threads. If a function that results in a system call, for example sleep(3),
is called from a real-time thread, RTCore issues a warning message of the
following form to the syslog:

Attempt to execute syscall NN from an RT-thread!

You can use reentrant functions from libc and other libraries, for example,
sprintf(3), and RTCore API functions.
In GPOS context (meaning non-real-time threads in userspace, as op-
posed to the hard real-time threads in userspace controlled by PSDD), RTCore
API functions are also allowed, as long as they are non-blocking. For ex-
ample, rtl_clock_nanosleep and rtl_sem_wait are not allowed in GPOS
context, while rtl_sem_post is OK.
Running hard real-time threads in user process context requires the pro-
cess memory map to be fixed while real-time threads are running. RTCore
enforces this by making all attempts to change the memory mappings fail
after the first real-time thread has been created in a process. Let's consider
the ways in which a user space process memory map may potentially change.

• Automatic stack growth. Ordinarily, the GPOS will attempt to auto-
matically map new pages into the process stack as it grows. For a PSDD
program, a fixed amount of stack is allocated for the main() routine at
the time the first RT thread is created. An attempt to use more stack
than the allocated amount will cause a segmentation fault.

• Dynamic memory allocation routines, e.g. malloc(), free(), rtl_gpos_malloc(),
etc., can only be used before the first RT thread is started. (These are
system calls.)

• Memory remapping calls: mmap(), shmat(), etc. The same restrictions
as for malloc() apply.

• fork(), exec(), and the calls based on them, such as popen() and
system(), should not be used in PSDD processes – neither before, nor
after the first RT thread is started.

An implication of the above concerns PSDD RT-thread stacks. There is
an implicit malloc() call done in rtl_pthread_create() if the RT thread's
userspace stack has not been provided with rtl_pthread_attr_setstackaddr().
(The default main() stack is 20480 bytes; this can be changed with
rtl_growstack(int stacksize) before the first RT thread is created.)
Therefore, one has to use rtl_pthread_attr_setstackaddr() to provide
stack space for all RT threads (with a possible exception for the first thread).
Given the above, the correct initialization sequence of a PSDD application
is as follows.

1. Make an mlockall(MCL_CURRENT|MCL_FUTURE) call to lock down the
process memory.

2. Allocate all needed memory (including memory for the RT thread
stacks), and establish shared memory and other mappings.

3. Optionally call the rtl_growstack(stacksize) function to specify the
amount of stack in bytes for the main() function.

4. Possibly perform additional application initialization.

5. Create the application's real-time threads. When the first RT thread
in a program is created, the main() stack will be allocated and then
the process memory map will be fixed. Subsequent malloc(), free(),
mmap(), etc. calls will fail.

RTCore API functions have an rtl_ prefix added to their names to avoid
ambiguity. This can still result in confusion. For example, both nanosleep()
and rtl_nanosleep() are available in the PSDD environment. nanosleep()
should only be used in GPOS context (functions called from main()). On
the other hand, rtl_nanosleep() should only be called from RT threads
and never from GPOS context. A single program may use both functions in
different contexts.

12.5 Standard Initialization and Cleanup


PSDD can be used for prototyping in-kernel RTCore modules. With PSDD,
it is often possible to enjoy the convenience and safety of user space devel-
opment, and then simply recompile the code for inclusion into the kernel
for improved performance. To this end, PSDD provides helper routines to
facilitate migration between user and kernel spaces. The provided psddmain.o
object file provides a standard main() routine that arranges locking of the
process memory, installs signal cleanup handlers, calls the module's
init_module() routine and enters an infinite sleep. On process exit, the
user's cleanup_module() is called. Depending on the way you compile the
program source, you can get either a kernel module or a user program – or
both.

(Note on the mlockall() step in the initialization sequence above: FreeBSD
does not have a working mlockall() implementation. The RTCore system
uses mlock() internally to emulate the effect of mlockall() for code, data
and stack pages. This emulation only works if the program is built statically
(the -static option to gcc). In addition, mlock() must be used for other
mapped memory.)

12.6 Input and Output


PSDD programs have access to all of the available real-time devices. This
is accomplished with the standard POSIX IO functions. The PSDD ver-
sions of those are rtl_open, rtl_read, rtl_write, rtl_ioctl, rtl_mmap,
rtl_ftruncate, and rtl_close. Most devices currently do not implement
blocking IO, and thus require the O_NONBLOCK flag to open them. The notable
exception is /dev/irq. Commonly available devices include:

• /dev/rtfN real-time FIFOs


These are FIFO channels that can be used for communication between
RT and non-RT components of the system. To create a FIFO, use
rtl_open.

rt_fd=rtl_open("/dev/rtf0",O_WRONLY|O_CREAT|O_NONBLOCK);

To set the size of an RT-FIFO to 4000, use:

rtl_ioctl(rt_fd, RTF_SETSIZE, 4000);

After that, the rtl_write call can be used to put data into the RT-FIFO.
The user side will use ordinary userspace open/read/write functions to
access the FIFO.
(Note: the psddmain.o helper described in section 12.5 is only needed when
using the init_module()/cleanup_module() interfaces to kernel threads - if
main() is used, the PSDD main library is not needed.)

• /dev/irqN interrupt devices


These are intended for handling RT interrupts in userspace context.
A blocking read from /dev/irqN blocks execution of the calling thread
until the next interrupt N is received. The RTL_IRQ_ENABLE rtl_ioctl
must be called on the irq file descriptor to enable receiving of further
interrupts.

• /dev/ttySN RTCore serial driver

• /dev/lptN RTCore parallel driver

RTCore also provides rtl_inb() and rtl_outb() functions for accessing
the x86 IO space.

As of RTLinuxPro 2.1, PSDD applications also have access to named
FIFOs (rtl_mkfifo(), rtl_unlink()), shared memory (rtl_shm_open(),
rtl_shm_unlink()), and named semaphores (rtl_sem_open(), rtl_sem_unlink(),
rtl_sem_close()). PSDD applications have access to many of the RTCore
API calls in userspace.

12.7 Example: User-space PC speaker driver


Let us consider a larger example, a PC speaker driver written with PSDD.
This example demonstrates interrupt handling in user space processing and
x86-style IO.
IBM PC compatible computers have a speaker that can be turned on
and off by switching a bit in IO port 0x61. So the idea is to convert the
incoming audio stream to a series of 1-bit samples to turn this bit on and off
to make the speaker produce the sound. Appendix I contains the full source
of a userspace PC speaker driver. Here we're going to examine the interesting
parts.
The input for our sound driver is a stream of 1-byte logarithmically en-
coded (ulaw-encoded) sound samples. The most common sampling rate for
such files is 8000 Hz. Rather than using a periodic thread, we will drive the
speaker using interrupts from the so-called Real-Time Clock (RTC) avail-
able on x86 PCs. We program the RTC to interrupt the CPU at 8192 Hz,
which is a close enough match for the sampling frequency. The example uses
RT-FIFO 3 to buffer samples.

int main(int argc, char **argv) {
    char ctemp;
    char devname[30];

    sprintf(devname, "/dev/rtf%d", FIFO_NO);
    fd_fifo = rtl_open(devname, RTL_O_WRONLY|RTL_O_CREAT|RTL_O_NONBLOCK);
    if (fd_fifo < 0) {
        rtl_printf("open of %s returned %d; errno = %d\n",
                   devname, fd_fifo, rtl_errno);
        return -1;
    }
    rtl_ioctl(fd_fifo, RTF_SETSIZE, 4000);
    fd_irq = rtl_open("/dev/irq8", RTL_O_RDONLY);
    if (fd_irq < 0) {
        rtl_printf("open of /dev/irq8 returned %d; errno = %d\n",
                   fd_irq, rtl_errno);
        rtl_close(fd_fifo);
        return -1;
    }
    rtl_pthread_create(&thread, NULL, sound_thread, NULL);

    /* program the RTC to interrupt at 8192 Hz */
    save_cmos_A = RTL_RTC_READ(RTL_RTC_A);
    save_cmos_B = RTL_RTC_READ(RTL_RTC_B);
    /* 32kHz time base, 8192 Hz interrupt frequency */
    RTL_RTC_WRITE(0x23, RTL_RTC_A);
    ctemp = RTL_RTC_READ(RTL_RTC_B);
    ctemp &= 0x8f;   /* clear */
    ctemp |= 0x40;   /* periodic interrupt enable */
    RTL_RTC_WRITE(ctemp, RTL_RTC_B);
    (void) RTL_RTC_READ(RTL_RTC_C);

    return 0;
}

Figure 12.3: PSDD sound driver initialization



The user module initialization function (Figure 12.3) creates and opens
RT-FIFO 3, sets the FIFO size to 4000, and opens the /dev/irq8 device.
Interrupt 8 is the RTC interrupt. Then it starts up the thread that is going
to do all data processing and programs the RTC to interrupt at the needed
frequency.
The real-time thread (Figure 12.4) enters an infinite loop. First it calls
rtl_read on the /dev/irq8 file descriptor. This causes the thread to block
until the next interrupt from the RTC is received. Once this happens, an
attempt to get a sample from the RT-FIFO is made. If successful, the data
is converted from the logarithmic encoding, and the speaker bit is flipped
accordingly.
An important point here is that the interrupt processing code has to
signal the device that it can generate more interrupts ("clear device irq").
This code is device specific. In addition, the interrupt line needs to be
reenabled in the interrupt controller. The latter is accomplished by using
the RTL_IRQ_ENABLE ioctl in the driver.
The cleanup routine (see the listing in the Appendix) cancels and
joins the thread and closes the file descriptors to deallocate the interrupt
and FIFO resources.

12.8 Safety Considerations


The PSDD environment provides a safe execution environment for hard real-time
programs. All arguments of the RTCore API functions are checked for valid-
ity, and memory protection is enforced. A hard real-time program, however,
can still bring the system down simply by consuming all available
CPU time. To ensure that this does not happen, RTCore provides an op-
tional software watchdog that stops all real-time tasks in such an
event. The watchdog may be enabled during configuration of the system.
Normally, root privilege is required to use PSDD facilities. Memory-lock-
ing functions in the GPOS also require root privilege. It is possible to allow non-
root users to run PSDD applications by reconfiguring the RTCore kernel;
however, this is potentially insecure and cannot normally be recommended.

void *sound_thread(void *param) {

    char data;
    char temp;
    struct rtl_siginfo info;

    while (1) {
        rtl_read(fd_irq, &info, sizeof(info));
        (void) RTL_RTC_READ(RTL_RTC_C);  /* clear IRQ */
        rtl_ioctl(fd_irq, RTL_IRQ_ENABLE);

        if (rtl_read(fd_fifo, &data, 1) > 0) {
            data = filter(data);
            temp = rtl_inb(0x61);
            temp &= 0xfc;
            if (data) {
                temp |= 3;
            }
            rtl_outb(temp, 0x61);
        }
    }
    return 0;
}

Figure 12.4: PSDD sound driver real-time thread



12.9 Debugging PSDD Applications


The RTCore debugger (described in the "Debugger" chapter) allows pro-
grammers to debug PSDD applications in the same way normal RTCore
applications are debugged; please refer to that chapter for details. Here,
we will discuss some PSDD-specific features of the debugger.
One of the most common uses of the debugger in large PSDD programs is
to trace the location of an illegal (non-PSDD) system call. As an
example, we will walk through the program in rtlinuxpro/examples/psdd_debug.
This application creates a real-time thread that eventually executes an ille-
gal system call, readv(). Normally the RTCore system will print a message
telling the user that the thread has executed an illegal system call, disallow
the system call, and then allow the thread to continue executing. Below is an
example of what is displayed on the console:

Attempt to execute syscall 145 from a RTLinux thread


PC: 0x080518c1

This allows applications that do make non-real-time system calls to continue
executing without stopping the application (and preventing it from perform-
ing its assigned duties). However, this warning message does not provide
very useful information for debugging the application and finding where the
offending system call is being made. To do this, one must configure
RTCore with the option "Put PSDD tasks in a debuggable state
when executing syscalls" enabled and recompile RTCore. With this option enabled,
a breakpoint is inserted wherever a PSDD task executes a non-PSDD
system call. The programmer can then attach the debugger to the applica-
tion and obtain a source listing, backtrace, or any other information useful
in determining where the call was made.
When the same example is run with the above debugging option enabled,
the console output looks like:

Attempt to execute syscall 145 from a RTLinux thread


PC: 0x080518c1
Inserted breakpoint in PSDD task where system call was made.
rtl_debug: exception 0x3 in psdd_debug (pid 21276, EIP=0x80518c2), psdd
thread id 0; (re)start GDB to debug

To debug this, one runs GDB as normal and connects to the debugger
FIFO:
root@host115<psdd_debug>$ gdb psdd_debug
GNU gdb (5.3)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb) target remote /dev/rtf10
Remote debugging using /dev/rtf10
[New Thread 0]
0x080518c2 in __readv (fd=0, vector=0x0, count=0) at ../sysdeps/unix/sysv/linux/readv.c:
51 ../sysdeps/unix/sysv/linux/readv.c: No such file or directory.
in ../sysdeps/unix/sysv/linux/readv.c
Now you can see that the call occurred in readv(), part of libc. This
information by itself is not very useful, since one already knows that the error
occurred inside libc: the system printed a message notifying us that a system
call was made. To find out where that call was made from, a backtrace listing
shows:
(gdb) back
#0 0x080518c2 in __readv (fd=0, vector=0x0, count=0) at ../sysdeps/unix/sysv/linux/read
#1 0x080481e7 in start_routine ()
#2 0x0804831a in psdd_startup ()
From this one can tell that the function that called readv() was start_routine(),
which is the function that makes this call in the example.

12.10 PSDD API


Table 12.1 describes the functions provided by the PSDD API. The functions
are broken into groups according to functionality. For each function, it is
specified whether or not it can be used in RT threads and in GPOS context.
Detailed descriptions of these functions may be found in the manpages.

Table 12.1: The PSDD API

Function                          RT    GPOS  Group

rtl_clock_gettime                 Y     Y     Clock and sleep functions
rtl_clock_nanosleep               Y     N
rtl_nanosleep                     Y     N
rtl_usleep                        Y     N
rtl_open                          Y[4]  Y     File IO
rtl_close                         Y     Y
rtl_ioctl                         Y     Y
rtl_lseek                         Y     Y
rtl_ftruncate                     N     Y
rtl_read                          Y     Y
rtl_write                         Y     Y
rtl_mkfifo                        N     Y
rtl_unlink                        Y     Y
rtl_shm_open                      N     Y
rtl_shm_unlink                    Y     Y
rtl_cpu_exists                    Y     Y     SMP support
rtl_getcpuid                      Y     Y
rtl_pthread_attr_init             Y     Y     Thread creation attributes
rtl_pthread_attr_destroy          Y     Y
rtl_pthread_attr_setcpu_np        Y     Y
rtl_pthread_attr_getcpu_np        Y     Y
rtl_pthread_attr_setfp_np         Y     Y
rtl_pthread_attr_getfp_np         Y     Y
rtl_pthread_attr_setschedparam    Y     Y
rtl_pthread_attr_getschedparam    Y     Y
rtl_pthread_attr_setstackaddr     Y     Y
rtl_pthread_attr_getstackaddr     Y     Y
rtl_pthread_attr_setstacksize     Y     Y
rtl_pthread_attr_getstacksize     Y     Y
rtl_pthread_create                Y[5]  Y     Thread control functions
rtl_pthread_cancel                Y     Y
rtl_pthread_exit                  Y     N
rtl_pthread_join                  Y     Y

[4] rtl_open cannot be called from RT if O_CREAT is specified for a FIFO
[5] Only if a preallocated stack is specified with rtl_pthread_attr_setstackaddr

Table 12.1: The PSDD API (continued)

Function                          RT    GPOS   Group

rtl_pthread_equal                 Y     Y
rtl_pthread_kill                  Y     Y
rtl_sched_get_priority_max        Y     Y
rtl_sched_get_priority_min        Y     Y
rtl_pthread_self                  Y     Y
rtl_pthread_idle                  Y     Y
rtl_pthread_testcancel            Y     N
rtl_pthread_setschedparam         Y     N
rtl_sem_init                      Y     Y      Semaphore support
rtl_sem_destroy                   Y     Y
rtl_sem_getvalue                  Y     Y
rtl_sem_post                      Y     Y
rtl_sem_trywait                   Y     Y
rtl_sem_wait                      Y     Y[6]
rtl_sem_timedwait                 Y     Y
rtl_sem_open                      Y     Y
rtl_sem_close                     Y     Y
rtl_sem_unlink                    Y     Y
rtl_printf                        Y     Y      Message logging
rtl_growstack                     N     Y      Stack allocation for main()
rtl_virt2phys[7]                  N     Y      Address translation

12.11 Frame Scheduler


12.11.1 Introduction
Many real-time tasks contain periodic loops that do not require the sophisticated
scheduling that RTCore is capable of providing. It is also often convenient
to separate scheduling details from program logic. This allows the real-
time systems developer to experiment with different scheduling parameters
without recompiling application programs. For such cases, PSDD provides a
userspace frame scheduler.[8]

[6] Unlike for kernel applications, in PSDD the use of rtl_sem_wait and
rtl_sem_timedwait is not allowed. This is because the whole of the GPOS
on the current CPU may be blocked by these functions, making it possible to freeze
the system.
[7] rtl_virt2phys is currently not supported on x86 Linux systems that have more than
4 GB of physical memory and use PAE addressing.
The frame scheduler supports hard real-time scheduling of user space
tasks in terms of frames and minor cycles. There is a fixed number of minor
cycles per frame. Minor cycles can be either time-driven or interrupt-driven.
For each task, it is possible to specify the task priority, the CPU to schedule
the task on, the starting minor cycle number within the frame, and the run
frequency in terms of minor cycles. (For example, if there are 10 minor cycles
in a frame, the starting minor cycle is 2, and the run frequency is 3, the task
will run at minor cycles 2, 5, 8, 2, 5, 8, ....) If multiple tasks are
ready at the start of a minor cycle, the task with the higher priority is
run first.
The tasks running under a frame scheduler are UNIX processes with the
following structure:

void rt_thread(void *arg) {
    /* thread executed in hard real-time */
    while (1) {
        /* block the execution until the next run */
        fsched_block();
        user_code();
    }
}

int main(int argc, char **argv) {
    struct fsched_task_struct task_desc;

    application_init();
    /* initialize RT subsystem */
    fsched_init(argc, argv, &task_desc, NULL);
    /* start real-time thread */
    fsched_run(rt_thread, &task_desc);

    /* main thread sleeps forever; hard RT thread is running */
    while (1) {
        sleep(1);
    }
}

[8] The frame scheduler is only available for Linux systems.

The hard real-time part of the user process is a PSDD RT thread and is
therefore subject to the same restrictions; e.g., it cannot use UNIX system
calls or non-reentrant library functions.
The task code itself does not contain any scheduling information. This
information is supplied when attaching a new task to the scheduler via the
command-line interface. This approach allows the user to change schedules
without recompiling.
The power of PSDD can be seen from the fact that the frame scheduler
itself is implemented using the hard real-time user space facilities of PSDD. Thus,
quite complicated real-time applications can be developed using the framework.

12.11.2 Command-line interface to the scheduler


The user manipulates the frame scheduler via the "fsched" command. The
supported invocations and their meanings are described below.

fsched create
- create and initialize the frame scheduler subsystem. This command
has to be issued before any other commands can be used. In the
current implementation, this starts the userspace scheduler process,
rtl_fsched, and thus the directory containing rtl_fsched must be
present in the user's PATH variable.

fsched delete
- destroy the frame scheduler subsystem.

fsched config -mpf minor_cycles_per_frame -dt dt_per_minor_cycle
[ -s sched_id ] [ -i interrupt_source ]
- configure a scheduler. Must be issued before the scheduler can be
used.

fsched [ -s sched_id ] start
- start the frame scheduler.

fsched [ -s sched_id ] stop
- stop the frame scheduler. If there are any user tasks attached to the
scheduler, they are detached and killed.

fsched [ -s sched_id ] pause|resume
- pause or resume execution at the next minor cycle.

fsched attach [ -s sched_id ] -n program -p priority -rf run_freq
-smc starting -cpu cpu_number -args "arguments passed to user
process"
- attach a program to the frame scheduler. "program" is the name or
path of the executable to start. "priority" can lie between 1 (min) and
255 (max). If the CPU is not specified, the default CPU is used. The
task starts execution at the "starting" minor cycle number of the
next frame, running with frequency "run_freq".

fsched info [ -s sched_id ] [ -n average_runs ]
- display information about the schedulers and tasks. More infor-
mation is provided in Section 12.11.4.

fsched reset [ -s sched_id ]
- reset scheduler statistics.

fsched debug -p pid
- break in the user process "pid". The break
happens at the next minor cycle, and all scheduling activity stops. Af-
ter that, it is possible to attach to the process with GDB and perform
source-level debugging. Please refer to the GDB example in the distri-
bution and to Chapter 7 for more information.

If sched_id is omitted, the default scheduler id of 1 is used. There may
be several frame schedulers running concurrently on the same machine. It is
up to the user to ensure that there are no conflicts between the schedules.

12.11.3 Building Frame Scheduler Programs


A typical Makefile structure for frame scheduler user programs is as
follows:

all: engine

include fsched.mk

engine: engine.c
$(USER_CC) $(FSCHED_CFLAGS) -o engine engine.c $(FSCHED_LIBS)
fsched.mk is a small makefile fragment that is provided with the frame
scheduler. It contains assignments of various variables that encapsulate in-
clude paths, compiler switches, and libraries.

12.11.4 Running Frame Scheduler Programs


First, it is necessary to make sure the fsched directory is in the PATH:
export PATH=$PATH:/directory_that_contains_fsched
Typically, running frame scheduler programs is accomplished with a shell
script like the following:
fsched create
sleep 1
fsched config -mpf 10 -dt 50
fsched attach -n user1 -rf 3 -smc 1 -p 1
fsched start
Here we create a frame scheduler and configure it with 10 minor cycles per
frame and a 50-millisecond period per minor cycle. Then we attach a user
program that executes starting at minor cycle 1 of each frame with a run
frequency of 3 minor cycles. The task runs at priority 1. Finally, the whole
system is started with the fsched start command.
It is often useful to keep a continuously updating window with the sched-
uler status display. This can be accomplished with the following command:
watch -n 1 fsched info -s 1
This will run the fsched info command every second and display its output
full screen. An example of such a screen is displayed in Figure 12.5.
For each task, fsched info displays execution statistics: the last, running
average, minimum, and maximum execution times in microseconds, the total
number of execution cycles, and the number of overruns. The percentage of
the current CPU time used by the RT tasks is also displayed.

Every 1s: ./fsched info -s 1 Wed Aug 28 19:54:54 2002

FS: 1 baraban IRQ=0 MPF=10 DT=50ms started


CPU0 load 1%
PID CPU PRI FREQ LAST MIN MAX AVG(us) TOTAL OVR CMD
3474 0 1 3 43.7 30.9 43.7 37.8 15 75 ./user1
3477 0 2 2 17.1 15.0 39.6 18.4 120 0 ./user2

Figure 12.5: Example of a frame scheduler monitoring window

12.12 Conclusion
PSDD offers a simple means of writing complex real-time code in user space,
while still allowing for the normal RTCore approach of splitting real-time
logic from management code. Users with no knowledge of GPOS kernel
programming can use it for rapid prototyping and deployment of real-time
applications. Others may use it as a testbed for code that will eventually
run in kernel mode.
Chapter 13

Controls Kit (CKit)

This chapter provides an overview of CKit by working through a simple PID
example. For more in-depth documentation, please refer to the CKit Manual.

13.1 Introduction
During the implementation of controllers and control algorithms, one finds
oneself needing to handle parameter updates and alarms in a well-behaved,
controlled manner. Moreover, these may sometimes need to be handled in the
context of a distributed application, as would be the case in dangerous
environments. For example, a fully automated assembly plant may need to be
centrally monitored and tuned from a remote location.
FSMLabs has addressed this problem with the FSMLabs Con-
trols Kit (CKit). It is a collection of utilities for building control systems and
control interfaces, using XML to describe control objects. The Controls Kit pro-
vides software for exporting RTLinux control variables, including methods
for defining composite objects, setting alarms and triggers, and updating and ex-
porting control information to either a local or a remote machine. CKit makes
it easy to develop both localized and distributed applications via a set of
API interfaces and libraries as well as the highly portable XML document
standard.
The FSMLabs Controls Kit (CKit) is the subcomponent of RTCore that
gives developers a mechanism for manipulating both parameters and
alarms from the Linux command line and over the network. Additional
tools and libraries from both FSMLabs and FSMLabs partners interface to
CKit to allow for:

• distributed control and logging

• control of legacy hardware

• controller algorithms

• graphical user interface creation and manipulation

• asynchronous alarm messaging

The core CKit subsystem is divided into the following main subcompo-
nents. Please refer to Figure 13.1 for a visual overview:

1. the hard real-time component, ckit_module.rtl: this component pro-
vides the communications interface and services that connect the RTCore
programs to the user space programs. All RTCore applications which
use the CKit services must use the CKit hard real-time API as described
in the CKit Manual to:

• register the parameters/entities of interest
• assign description information to each parameter
• assign attributes and limits to each parameter
• attach each parameter to a global logic tree – this is especially useful
to differentiate between similarly named parameters belonging
to different controllers. For example, a given RTCore program
may have two PID algorithms (PID1 and PID2), both of which
may have a parameter named Kp.
• request, from RTCore programs, that shell commands be executed
in the GPOS shell
• send messages to the GPOS from the RTCore programs
• send alarms of varying degrees of criticality to the GPOS from the
RTCore programs
• write third party libraries that enhance the functionality of RTCore.
For example, third party libraries may include control algorithms,
hardware drivers, networking algorithms, and legacy hardware in-
terfaces.

Figure 13.1: CKit Design



2. the user space CKit Daemon, ckitd: this is the main user space server
which monitors the real-time side and performs all types of actions
on behalf of both the real-time component (above) and the user space
utilities.
3. the user space real-time utilities: these are a collection of user space
programs that interface to the CKit Daemon. These tools are used not
only to interpret all messages and shell commands generated within
the RTCore programs, but also to set and read the information of all
registered parameters. Please refer to the CKit Manual for a
complete listing of the CKit utilities. The user will use these utilities to
interface to the CKit Daemon and query parameters and alarms. The
remainder of this chapter will use some of these utilities.
4. the user space C++ libraries: these libraries can be linked against the
user's C++ programs and are used to perform all the same
functionality as the user utilities. In addition, they can be used to parse the
XML responses from the CKit Daemon. Again, please refer to the CKit
Manual for a more in-depth description of this library.

13.2 Operation of the CKit


To use the CKit, the user must do the following:

• execute ckit_module.rtl: This is only needed once while the computer
continues running, and it assumes that RTCore is already running. This
enables the hard real-time infrastructure of the CKit.
• execute ckitd: This is only needed once while the computer continues
running. This enables the soft real-time infrastructure of the CKit and
monitors the hard real-time module, above.
• write CKit-capable RT programs: For this, use the RT API to both
register critical parameters with CKit and identify alarm conditions.
• execute the user's RTCore program: see Figure 13.3.1 for a description of
one such hard real-time program.
• use the CKit soft real-time utilities: These are a set of user
space utilities designed to, from user space:

– set parameter values in the hard real-time module,
– read parameter values from the hard real-time module,
– view alarms generated on the user space side, the real-time side,
or both,
– subscribe to asynchronous alarms of varying levels,
– execute shell commands whenever a subscribed asynchronous alarm
occurs.

Please refer to the CKit Manual for a more thorough description of the
CKit user space utilities.
• use the CKit C++ library: Write your own user space applications
which can mimic all of the aforementioned user space utilities. Please
refer to the CKit Manual for a full description of the same.
That's it!
Optionally, you can also use (or write your own) hard real-time and non-
real-time libraries. See the appropriate chapters in the CKit
Manual for a complete description of how to use and write your own libraries,
which you can share with your co-workers or clients in binary form.
The next sections demonstrate a simple hard real-time programming
example and its execution. This example uses the core CKit to register
entities which can be used to implement a simple PD controller.

13.3 PD Controller Implementation Using Core
CKit Entities

In this section, we present a simple example which registers the parameters
for a simple PD controller, as well as the setpoint variable for the PD
controller.

13.3.1 Entity Registration


The entities for this project will be housed within a toplevel group entity:
"Toplevel". Please refer to Figure 13.3.1 for the source code listing.
To the "Toplevel" entity, we are going to link, as children entities,
two additional entities:

• a group entity, PD, which groups two additional subentities:

– a float entity which acts as the proportional gain, Kp; in this
case it models the stiffness of the valve controller.
– a float entity which acts as the derivative gain, Kd; in this
case it models the damping provided by the valve controller.

• an integer entity, Valve, which acts as the setpoint to the PD con-
troller. In this case, this variable denotes the desired valve opening,
specified as a percentage of the total gap.

For this example, all non-group entities will be updateable in real time from
the user space side.
Optionally, we choose to provide for our entities a set of attributes and
"suggestions" for the graphical utility. The GUI has the option of either
ignoring these suggestions or acting on them. The settable attributes or
suggestions include, among others, any of the following:

• type of widget to use when displaying the entity

• units to use for the widget

• display string to use

• is the minimum value locked?

• is the maximum value locked?

• is the current value locked?

• is the minimum value auto-ranged?

• is the maximum value auto-ranged?

In this case, when the GUI displays the Kd and Kp entities, we will request
that it use a dial widget. We are also going to request that the GUI display
the values of Kp and Kd using the C format strings "%.1e" and "%.3f",
respectively. Last but not least, we are going to request that the GUI use
the units "N/m" to display the Kp entity and the units "Ns/m" to display
the Kd entity.

Finally, at the end, when we are supposed to clean up after ourselves
(after rtl_main_wait()), we are going to destroy the toplevel entity "Top-
level", which will automatically unlink it. Not only will it unlink
the "Toplevel" entity, but it will also recursively unlink all offspring of
"Toplevel".

/**********************************************************************
 * Include the appropriate CKit header, declare our entities, and make
 * some definitions
 **********************************************************************/
#include <ckit/rtmodule.h>

#ifndef TRUE
#define TRUE (1)
#define FALSE (0)
#endif /* TRUE */

static CK_entity Toplevel, PD, Kp, Kd, Valve;

/**********************************************************************
 * Main routine
 **********************************************************************/
int main(void)
{
    /*
     * TOPLEVEL GROUP
     */
    CK_group_init(&Toplevel,
                  "Controller",
                  "Factory floor's conveyor belt controller",
                  NULL);

    /*
     * PD GROUP, CHILD OF TOPLEVEL
     */
    CK_group_init(&PD,
                  "PD",
                  "Proportional + Derivative Controller",
                  &Toplevel);

    /*
     * CHILDREN OF PD GROUP
     */

    /* proportional gain and attributes */
    CK_scalar_float_init(&Kp,
                         "Kp",
                         "This sets the loop gain for the controller",
                         &PD,
                         0.1, 10.0, 4.2);
    CK_entity_set_sugg_str(ckWidget,         &Kp, "ckit::dial");
    CK_entity_set_sugg_str(ckRepresentation, &Kp, "%.1e");
    CK_entity_set_sugg_str(ckUnits,          &Kp, "N/m");

    /* derivative gain and attributes */
    CK_scalar_float_init(&Kd,
                         "Kd",
                         "This sets the derivative gain for the controller",
                         &PD,
                         1.0, 3.0, 2.2);
    CK_entity_set_sugg_str(ckWidget,         &Kd, "ckit::dial");
    CK_entity_set_sugg_str(ckRepresentation, &Kd, "%.3f");
    CK_entity_set_sugg_str(ckUnits,          &Kd, "Ns/m");

    /*
     * ADDITIONAL CHILDREN OF TOPLEVEL GROUP
     */

    /* valve controller (set point for PD controller) and attributes */
    CK_scalar_int_init(&Valve,
                       "Coolant",
                       "Set the desired opening (percent) of valve",
                       &Toplevel,
                       0, 100, 30);
    CK_entity_set_sugg_str(ckWidget,         &Valve, "ckit::dial");
    CK_entity_set_sugg_str(ckUnits,          &Valve, "%");
    CK_entity_set_sugg_str(ckRepresentation, &Valve, "%.1d");

    /*
     * WAIT UNTIL WE ARE SHUT DOWN
     */
    rtl_main_wait();

    /*
     * DESTROY THE TOPLEVEL ENTITY
     */
    CK_entity_destroy(&Toplevel);

    return 0;
}

Fig. 13.3.1: Programming example which demonstrates the initialization
and use of CKit entities.

13.3.2 Program Execution


To execute this program, we need to do the following:
1. if you haven't done so already, start up RTCore
2. if you haven't done so already, start up ckit_module.rtl
3. if you haven't done so already, start up ckitd
4. execute our RT program. In this case, the name of the source file is
"mycontroller.c". Using our Makefile, we obtain the
executable "mycontroller.rtl". To execute it, we simply type:

./mycontroller.rtl

Now we are ready to begin querying and setting parameters.


5. query the parameter tree by typing:

ck_hrt_op -L

This should give you:

+-#> Controller # group #


| +-#> Coolant # integer # 30
| +-#> PD # group #
| | +-#> Kd # float # 2.200
| | +-#> Kp # float # 4.2e+00
| |
|
154 CHAPTER 13. CONTROLS KIT (CKIT)

Note that in this case each entity is displayed along with its type and
current value. Additional verbosity can be obtained by specifying the
"-n#" option; the larger the number, the greater the degree of verbosity
used to display the tree:

ck_hrt_op -L -n3

In this case, a verbosity level of 3 adds not only the current values of
the entities, but also the minimum/maximum bracket for each entity,
if applicable:

+-#> Controller # group # #


| +-#> Coolant # integer # 30 # [0,100]
| +-#> PD # group # #
| | +-#> Kd # float # 2.200 # [1.000,3.000]
| | +-#> Kp # float # 4.2e+00 # [1.0e-01,1.0e+01]
| |
|

You can also obtain the XML version of the same output by providing the "-x"
option on the command line as follows:

ck_hrt_op -L -x

Please refer to the CKit Manual for a more complete description of the
ck_hrt_op utility.

6. set the value of Kp by typing:

ck_hrt_op -s 2.5 -v -p Controller::PD::Kp

which states that we want to set the current value ("-v") of the
parameter entity Kp ("-p Controller::PD::Kp") to 2.5 ("-s 2.5").
You should see a synchronous alarm appear on the screen stating
whether or not the command was successful.

7. obtain the maximum allowable value of Kp by typing:

ck_hrt_op -g -u -p Controller::PD::Kp

which states that we want to query ("-g") the maximum
value ("-u") of the parameter entity Kp ("-p Controller::PD::Kp").
The response from the CKit is:

1.0e+01

Note that the string format is consistent with the "%.1e" syntax used
in the source file.

8. query the description and type of a given entity by using the -d and
-t options as follows:

athena% ck_hrt_op -d -p Controller::PD::Kp
This sets the loop gain for the controller

where in this case we show both the shell command prompt (athena%)
and the subsequent output. Similarly for the type:

athena% ck_hrt_op -t -p Controller::PD::Kp
float

9. query parameters on a remote target machine. To do so, first, on your
target machine, type the following:

ck_xmlrpc_server

Then, on your host machine, type any of the aforementioned commands
but add the "-X" option to specify the remote target machine. For
example, assuming that the target machine name is "coyote.hilton.net",
on your host machine you would type:

ck_hrt_op -X http://coyote.hilton.net:3134/RPC2 -L

This should give you:

+-#> Controller # group #
| +-#> Coolant # integer # 30
| +-#> PD # group #
| | +-#> Kd # float # 2.200
| | +-#> Kp # float # 4.2e+00
| |
|

which is exactly the same output as before.
In this case, note that the URL is of the form:

http://machineName:3134/RPC2

where port number 3134 is the port configured by default in the CKit.

10. subscribe to asynchronous alarms of all levels using ck_alarm:

ck_alarm -s all

Note that in our example, we did not explicitly specify any alarm mes-
sages, although alarms can be generated for many reasons within
the CKit infrastructure.

11. subscribe to level 2 and level 3 asynchronous alarms using ck_alarm,
and, when an alarm occurs, execute the script "myAction.sh", which
accepts a single argument denoting the alarm level:

ck_alarm -s 2,3 -e "myAction.sh %L"

Note that in this example, %L is a key token which is automati-
cally replaced by the actual alarm level each time that myAction.sh
is executed. Please refer to the CKit Manual for a more thorough de-
scription of this command.

12. view the parameter trees on the local machine using the graphical user
interface. This interface is written in Perl:

ck_hrt_op_GUI

Note that for this command to work, you must make sure that you
have both Gtk and GtkPerl installed on your machine. If not, you can
usually obtain them directly from your Linux distribution CDs or from
CPAN, the central Perl repository.

13. view and manipulate the parameter trees on a remote machine (assume
once again coyote.hilton.net) using the graphical user interface:

ck_hrt_op_GUI http://coyote.hilton.net:3134/RPC2

Note that for this last command to work, you must make sure that you
have both Gtk and GtkPerl installed on your local machine (not the
target machine). If not, you can usually obtain them directly from your
Linux distribution CDs or from CPAN, the central Perl repository.
Also, you need to make sure that ck_hrt_op is in your path.

You can also write your own C++ programs which take advantage of the CKit
user space library to query parameters and alarms, both locally and on
remote machines. Within your C++ programs, you can easily query
parameters, set parameters, subscribe to alarms, etc. Please refer to the
CKit Manual for a more thorough description of this library. Alternatively,
you can write XML-RPC programs in any language which will query the
XML-RPC server over the network. The following section describes one
such example.

13.4 XML-RPC API


It is possible to make XML-RPC queries over the network using any
language that is XML-RPC capable and can interpret the
resulting XML. For example, users have created interfaces to Microsoft Excel
which they then use to query RTLinux boxes from their Microsoft
Windows machines. In that case, Visual Basic was used not only to perform the
XML-RPC calls, but also to interpret the resulting XML.
We now present a simple example which will query the parameter tree
running on a remote target machine. In this case, we’ll reuse the machine

used in the previous section, coyote.hilton.net. It is assumed at this point
that ck_xmlrpc_server is already running on the target machine.
The example code is presented in Figure 13.4. In this example, several
headers are first included. Then, we initialize some constants which we later
use during the call to the remote server.

#!/usr/bin/perl -w
use Frontier::Client;
use MIME::Base64;
use strict;

# Let's initialize some constants
use constant TRUE => 1;
use constant FALSE => 0;
use constant ROOTPATH => "hrt";
use constant NODEPATH => "Controller";
use constant TREEDEPTH => 1024;
use constant SHOWHIDDEN => FALSE;

# Set the target address:
my $target = "http://coyote.hilton.net:3134/RPC2";

# Initialize the client
my $rpc = new Frontier::Client ( url => $target )
    || die "Unable to connect for whatever reason";

# Do the query
my $response = $rpc->call('fsmlabs.ckit.getTree',
                          NODEPATH, ROOTPATH,
                          TREEDEPTH, SHOWHIDDEN);

# Print out the response
printf("%s\n\n", $response);

Fig. 13.4 Perl program that queries the parameter tree on a remote
target machine.

Notice the simplicity of the code. It is then the responsibility of the
user to parse (if necessary) the XML response from the server. However,
the point has been made: it is quite easy to query the remote target machine.

13.5 Conclusion
You are now on your way to understanding the utility of the Controls Kit.
Its strength lies in helping you manipulate parameter trees in real time, while
at the same time monitoring asynchronous alarms. In addition, please note
that there are now two graphical front ends for the CKit. The first is a Perl
based interface, and the second one is a Java based interface. Please refer to
the CKit documentation for more detail. Both of these are designed to help
greatly in the development of interfaces for use in industrial control environ-
ments. In short, the Controls Kit is designed to help in your deployment and
creation of control algorithms. Happy Controlling!
Chapter 14

RTLinuxPro Optimizations

Optimizations are of course very important in real-time systems. However,
many are detrimental to the development process. With RTLinuxPro, it
is very easy to enable the proper optimizations during development and
disable them on deployment. This chapter will cover general optimizations
and techniques useful in developing RTCore applications.

14.1 General optimizations


First off, let’s cover the basic optimizations that can make the difference
between having real-time response and having a non-real-time system.
Primarily, these can be grouped into the following categories, mainly
targeted at the x86 architecture:

• Power management

• System Management Interrupts (SMIs)

• Interrupt controllers

Power management is generally the simplest of the three to solve - it
should be disabled in the BIOS for nearly all systems. Contact FSMLabs
for information on systems that absolutely require both power management
and real-time response.
The reason power management is a problem is that it will dynamically
shift the CPU’s clock speed, so that an operation that took a given amount


of time at one point may take a different amount of time later on, if the clock
speed has been changed in the meantime. Disabling the feature will ensure
a constant clock speed for the hardware, and a reliable execution time for
real-time code.
For SMIs, the hardware responds to certain events by essentially taking
the CPU offline while it manages internal work. This appears to software
as a long-delayed execution of whatever was happening at the time of the
SMI. If a real-time thread is executing when an SMI occurs, the code may
be delayed by tens of milliseconds or more.
RTLinuxPro provides a tool to disable SMIs for some hardware, in the
utilities directory. This does not cover every possible SMI source, but it
will help on many configurations. The application will disable SMIs while it
is running, and reenable them on exit. The result is that while it is running,
a system that would not ordinarily have acceptable real-time performance
will be capable of standard response times.
Interrupt management is also important - when using Linux as the GPOS,
it is important that APIC support is enabled if the hardware is capable. This
allows for a much higher performance interrupt controller, which results in
better real-time performance.

14.2 RTCore-internal optimizations


Users that have purchased the source of the RTCore OS have several build
options available for optimization and error checking. These are made
available through the RTCore build system, usually entered with:

make menuconfig

Under the internal debugging section, users can enable or disable several
options. First is a paranoid mode - this enables more extensive internal
checks within RTCore. While helpful for debugging, it does add overhead,
and should be disabled on deployment for users that need every last cycle
from their hardware. There is also a similar error and sanity check mode
that checks for valid file descriptors and so on - the impact is minimal, but
may be important in very tight cases.
Additionally, under ’Selective building of RTLinux modules’ is another
means of improving performance. By default, the debugger is built into
the system, which will catch faults and other problems. However, this does

add overhead to the normal exception handling of faults in the GPOS, as
the RTCore debugger is always run first on any exception. This overhead
can be slightly detrimental to GPOS throughput. If the best possible
performance is needed from the GPOS on deployment, and there is no need
for the debugger, RTCore can be rebuilt without the module.

14.3 CPU management


Proper management of the processor is essential to getting the best perfor-
mance out of the system. RTCore offers several means of managing this
resource, which we will cover here. After each of the aspects have been cov-
ered, we will put them together into a single example of how to effectively
manage CPU resources. This section is primarily interested in SMP systems.

14.3.1 Targeting specific CPUs


By default, real-time threads are spawned on the same processor that the
loading program is run on. On an SMP system, the GPOS may be running
the loading program on any CPU at any given time, so real-time threads
may be started on CPU 0, CPU 1, or any other, depending on the current
scheduling considerations for the GPOS.
If the application at hand has 2 real-time threads, and each of them
needs more than 50% of the CPU’s bandwidth, it is obvious that they must
be directed to different CPUs in order to be able to handle the workload.
(By default, they would both attempt to start on the processor they were
created on.)
This is handled easily with a pthread attribute and a call to
pthread_attr_setcpu_np(), which takes the attribute for the thread and
the CPU number the thread should run on. The example at the end of this
section will
demonstrate the entire sequence in action.

14.3.2 Reserving CPUs


When there is no real-time activity on a specific CPU (all threads are waiting,
no interrupts waiting, etc), RTCore allows the GPOS to execute non-real-
time code. For example, this means that Linux can then let cron run, or let
some scripts execute on that processor.

This allows GPOS throughput to increase, but it does so at the expense
of the cache. As the GPOS runs tasks on the processor, it dirties the cache,
pushing real-time thread code and data out. When it comes time to run a
thread again, it may be necessary to get some of its data back into the cache,
resulting in a slight delay. The amount of delay is slight, but it reduces the
effective bandwidth of the CPU, as it has to wait while the cache refills.
RTCore allows the user to reserve a CPU such that the GPOS is not
allowed to execute code on that specific CPU. This is done again with a
pthread attribute and a call to pthread_attr_setreserve_np(), providing
the attribute object and the CPU to be reserved. This call has the effect of
refocusing interrupts away from that processor. Please refer to the documen-
tation for this call for full details.
Once the given CPU is reserved, real-time threads can generally stay
mostly in cache, allowing for performance at the hardware limit.

14.3.3 Interrupt focus


Interrupts can be a source of some latency, if there are a large number of
them coming into a processor while there is a real-time thread working. The
overhead is minimal, but it can cause slight disturbances. In these cases, it is
generally best to refocus the non-real-time interrupts to another processor.
This is done in RTCore with rtl_irq_set_affinity(int irq, unsigned
long *mask, unsigned long *oldmask). Callers provide the IRQ number
to be refocused and a mask indicating which CPUs should receive that IRQ.
The previously used mask is returned to the caller through the third argument.
By setting a thread on a specific processor, disallowing the GPOS from
running on that processor, and focusing non-real-time IRQs away from that
processor, real-time threads can execute at the very limit of the hardware.
The thread and data can generally live entirely within cache, and the only
interrupts seen will be those related to any real-time interrupts that are still
focused to that specific CPU.
The act of reserving a processor as described above will automatically
refocus interrupts away from the targeted processor. After this thread is
started, any real-time interrupts can be refocused back to the reserved CPU.

14.3.4 Example
Let’s put the three together into an example. Here, we want to start a thread
on CPU 1, disallow the GPOS from running on that CPU, and we want to
focus an interrupt to that CPU. Since the act of reserving the processor
refocuses interrupts away from it, we need to refocus the interrupt back.

#include <pthread.h>
#include <stdio.h>
#include <semaphore.h>

pthread_t thread;
unsigned long mask = 0x2, oldmask;
sem_t irq_sem;

unsigned int irq_handler(unsigned int irq, struct rtl_frame *regs) {
    rtl_global_pend_irq(irq);
    sem_post(&irq_sem);
    return 0;
}

void *thread_code(void *t) {
    rtl_irq_set_affinity(12, &mask, &oldmask);

    while (1) {
        sem_wait(&irq_sem);
        printf("Got IRQ 12\n");
    }
    return NULL;
}

int main(void) {
    pthread_attr_t attr;

    sem_init(&irq_sem, 1, 0);

    pthread_attr_init(&attr);
    pthread_attr_setcpu_np(&attr, 1);
    pthread_attr_setreserve_np(&attr, 1);

    rtl_request_irq(12, irq_handler);

    pthread_create(&thread, &attr, thread_code, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);

    rtl_free_irq(12);

    rtl_irq_set_affinity(12, &oldmask, &mask);

    sem_destroy(&irq_sem);

    return 0;
}

That’s all you need to do for all three optimizations. Let’s step through it
to make sure everything is clear. First, we set up the semaphore between
the interrupt handler and the thread, and initialize the pthread attribute -
this is used for two of the three steps: the thread is targeted to CPU 1,
and that CPU gets reserved. (The reservation does not actually take place
until the thread has started.)
Next, the IRQ handler is installed, and the thread is spun - all that is left
to do is refocus the interrupt back to the CPU we’re on. This is done in the
thread, saving the old mask so we can restore it when we’re done. That’s all
there is to the third step.
As you can see, it’s not difficult - in roughly 50 lines of code, all three
factors have been integrated into the system. For more details, please refer
to the RTLinuxPro examples and documentation.
Chapter 15

SlickEdit IDE for RTCore

The RTCore Integrated Development Environment (IDE) is based on the


SlickEdit IDE. This allows developers to edit source code, compile, run and
debug RTCore applications through a single interface.
This chapter is intended to provide a quick introduction to using SlickEdit
with RTCore. For more detailed information on SlickEdit, the SlickEdit in-
terface or customizing SlickEdit please see the documents in /opt/rtldk-x.y/
doc/vslick/.

15.1 Creating a RTCore application project


Make sure that your PATH environment variable is set appropriately (see the
“Getting Started Guide”). Run the command vs to start SlickEdit.
To create a new project, click on Project and then New. The next window
will show “RTLinuxPro Application” already selected; this does not need
to be changed. Fill in the project name for your application in the “Project
name:” field. Enter the directory where you wish your project to be put in
the “Location:” field. Do not fill out the “Executable name:” entry - it will
be automatically filled out for RTCore applications.
The next menu will ask for “Project Type” and “Source Type”. Select
“Executable” (the default) for RTCore applications and choose the language
you plan to use for your application. The following window will ask if you
would like to begin with an empty project or a source-code file created. Gen-
erally, selecting “An application with a main() function” is the best option.
The next menu will ask how to create and manage build rules for your


Figure 15.1: SlickEdit Startup Screen

project. For RTLinuxPro applications you must always select “Build with
an auto-generated, auto-maintained makefile”.
The screen should now look like Fig. 15.1.

15.2 Compiling RTCore applications


To build an application, just click the “Build” pulldown, and click the “Build”
or “Rebuild” option. With the example main() function that is autogenerated,
the resulting binary will not do too much. However, once built, the program

Figure 15.2: A simple example

can be run by opening a shell window and changing directories to where the
project was built. Beneath this directory, there will be a “Debug” directory
that contains the binary. With the example main() function, it will simply
return.
By default, debugging information will be built into the resulting
application. This can be disabled by hitting “Build”, “Set Active
Configuration” followed by “Release”.
From here on, this behaves just like any other IDE. Taking the autogen-
erated example, a simple real-time thread can be added with ease. Fig. 15.2
shows the example with the additional code. Right clicking on the thread
function and selecting “Show thread func in Class Browser” will open a
standard object view for the project, allowing the user to drill down into
functions, include files, variable types, etc. Any remaining work is just a matter
of customizing the environment, according to personal preferences - editor
modes, code beautification, etc.
Part III

Appendices

Appendix A

List of abbreviations

• AGP: Accelerated Graphics Port

• API: Application Programming Interface

• APIC: Advanced Programmable Interrupt Controller

• APM: Advanced Power Management

• BIOS: Basic Input Output System

• CLI: CLear Interrupt flag

• CPU: Central Processing Unit

• DA/AD: Digital to Analog / Analog to Digital conversion

• DAQ: Data AcQuisition

• DMA: Direct Memory Access

• DRAM: Dynamic RAM

• EDF: Earliest Deadline First

• FAQ: Frequently Asked Questions

• FIFO: First In First Out

• FP: Floating Point


• GNU: GNU’s Not Unix (a recursive acronym)

• GPOS: General Purpose Operating System

• GUI: Graphical User Interface

• IDE: Integrated Device Electronics / Integrated Development Environment

• IP: Internet Protocol

• IPC: Inter Process Communication

• IRQ: Interrupt ReQuest

• ISA: Industry Standard Architecture / Instruction Set Architecture

• ISR: Interrupt Service Routine

• NVRAM: Non-Volatile RAM

• OS: Operating System

• PCI: Peripheral Component Interconnect

• PIC: Programmable Interrupt Controller

• PLIP: Parallel Line Internet Protocol

• POSIX: Portable Operating System Interface

• RAM: Random Access Memory

• RFC: Request For Comment

• RMS: Rate Monotonic Scheduler

• ROM: Read Only Memory

• RPM: Red Hat Package Manager

• RT: Real Time

• RTOS: Real Time Operating System



• SCSI: Small Computer System Interface

• SHM: SHared Memory

• SLIP: Serial Line Internet Protocol

• SMI: System Management Interrupt

• SMM: System Management Mode

• SMP: Symmetric Multi Processor

• SRAM: Static RAM

• STI: SeT Interrupt flag

• TCP: Transmission Control Protocol

• TCP/IP: Transmission Control Protocol / Internet Protocol

• TLB: Translation Lookaside Buffer

• UDP: User Datagram Protocol

• UP: Uni Processor

• XT-PIC: Old XT (Intel 8086) Programmable Interrupt Controller


Appendix B

Terminology

• GPOS : General Purpose Operating System - The non-real-time operating
system that RTCore runs as its lowest priority thread.

• RTCore : The core technology that powers RTLinuxPro and RTCoreBSD.
Viewing the system as running two operating systems, RTCore is the
RTOS that provides the deterministic control needed for real-time
applications.

• EDF-scheduler : In this scheduling strategy, rather than using the
priority of a task to direct scheduling, the scheduler selects the task
with the closest deadline. In other words, it selects the task with the
least time left until it should be run. This scheduling strategy has
a “flat” priority and is optimal for systems that handle asynchronous
events and non-periodic real-time tasks.

• FIFO Scheduler (SCHED_FIFO): A First In First Out scheduler is one
in which all processes/threads at the same priority level are scheduled
in the order they arrived on the queue. When the scheduler is called,
the queue for the highest priority level is checked first. If there is
no runnable thread at the highest priority level, the next level is
checked, and so forth. A job scheduled with a policy of SCHED_FIFO
can monopolize the CPU if it is always ready to run and if there is no
mechanism to preempt it.

• Frontside Bus : This is the high speed bus that exists between the CPU
and memory.


• Host Bridge : The host bridge acts as a hub between most major
subsystems in a PC. It acts as an interface between CPUs, memory,
video, and other busses, such as PCI.

• North Bridge : The north bridge of a machine is the controller
responsible for high speed operations. Bus components that require high
speed access, such as CPU to memory, PCI interaction, etc., are considered
part of the north bridge.

• PCI-Bridge: A logic chip (controller) connecting PCI busses. Access to
PCI devices runs over the PCI-Bridge from another subsystem to the
PCI bus where the peripheral device is located.

• PCI-ISA-Bridge: To support legacy ISA devices, most PCs have an
ISA-bus available via the PCI bus. The connecting controller is referred
to as the PCI-ISA-Bridge.

• Rate Monotonic Scheduler (RMS) : An optimized scheduling policy
that is applicable if all tasks are periodic; the criterion is that all
tasks fit the requirement

\sum_{i=1}^{n} \frac{C_i}{T_i} < n \left( 2^{1/n} - 1 \right)

with C being the worst case execution time and T the period of each
task. As the task number n increases, the utilization converges to about
69%, which is not as efficient as other schedulers, but is preferable in
situations requiring static scheduling.

• South Bridge : The south bridge is the collection of controllers that
deal with slower component systems, such as serial controllers, floppy,
PCI-ISA bridges, etc.

• Asynchronous Signals: All signals that reach a thread from an external
source, meaning that a different thread of execution is posting a signal
via pthread_kill(). Not all thread functions are async safe, as signals
may come at any time, even when the thread is not ready to be
interrupted. An asynchronous signal is delivered to the process and not
to a specific thread within a multithreaded process.

• Async Safe : Thread functions that can handle asynchronous signals
without leading to race conditions or synchronisation problems (like
blocking other threads indefinitely, leading to inconsistency in global
variables, etc.) are considered to be async safe functions. Functions
that are not async safe should be used with these possible side effects
in mind, meaning that the points at which they are safe to call should
be chosen appropriately. If a thread has a cancellation state of
PTHREAD_CANCEL_ENABLE and the cancellation type set to
PTHREAD_CANCEL_ASYNCHRONOUS, then only async-safe functions
should be used, or signal handlers must be installed.

• Atomic Operation : An operation during which a context switch can
occur but state is preserved. During atomic operations it is legal to
assume that conditional variables, mutexes, etc. will be unchanged, as
proper locking has taken place. An atomic operation behaves as if it
were completed as a single instruction.

• Barrier : A thread synchronisation primitive based on conditional
variables. A barrier is a point in the execution stream at which a set
of threads will wait until all threads requiring synchronisation have
reached it. After all threads have reached the barrier, the condition
predicate is set TRUE and execution of all threads can continue.

• Busy wait loop : This is the act of waiting for an event in a running
process, using the CPU during the wait. Rather than being put to
sleep and rescheduled when the event occurs, the process spins doing
useless activity during the wait. This saves the overhead of scheduling
another process in and then having to reschedule the first.

• Cache flush : A cache flush involves writing the content of the cache to
memory or to whatever media is appropriate. This is only necessary on
hardware that does not support write through caching, or on SMP sys-
tems when a task moves between CPUs. Generally cache flushes have a
noticeable influence on performance, especially for real-time operations.
As the flushed data must be refetched, the resulting delay from a flush
may result in jitter.

• Conditional Variable : A condition variable is a complex synchronisation
mechanism composed of the condition variable itself and its predicate,
as well as an associated mutex. A thread acquires the mutex, waits
until the condition is signaled, then performs the task depending on the
condition, releasing the mutex afterward.

• Context Switch : Removing the currently running thread from the
processor and starting a different thread on this CPU. A context switch
in RTCore will only save the state of the integer registers unless floating
point is enabled. (See pthread_attr_setfp_np.)

• Deadlock : Deadlock occurs if synchronisation primitives are used
inconsistently, such that different threads of control are waiting for
each other to release resources. An example of such a setup is two
threads that each acquire a mutex that the other is waiting for. Since
both threads are blocked, neither will free the mutex it holds, and thus
both are blocked infinitely.

• Detached thread : When creating a thread with pthread_create, the
attributes passed will by default make the thread joinable, meaning
that another thread can call join on it. This is commonly done to
catch the return status and to finish the cleanup of the joinable thread.
If the thread’s state is set to PTHREAD_CREATE_DETACHED, then
all resources of this thread will be released when the thread exits, so
there is no return status and no further synchronization needed.

• Embedded system : Operating systems and software for systems that
perform non-traditional tasks are referred to as embedded systems.
These systems span a wide range, but in general, embedded systems
are low memory systems and have restrictions with respect to available
mass-storage devices as well as minimal power.

• Global Variables : Global variables are those that are visible through-
out the application, rather than being restricted to a specific thread.
The variables themselves are not protected against concurrent activity,
and usually require some kind of synchronization primitives to ensure
safe handling.

• Handler : If an event should be handled in a specific way, a function or
thread will be programmed that can respond to this event (e.g. update
the pixels on the screen if the mouse moves). The association of this
function or thread with a specific event makes it the handler of this
event. The handler must be explicitly registered with whatever will
detect the event, which is generally the operating system.

• Hard Real-time : Systems capable of guaranteeing worst case jitter and
a worst case response time, regardless of system load, qualify as being
hard real-time systems.

• InterProcess Communication : Commonly referred to as IPC, this refers
to any mechanism by which multiple processes can coordinate their
action. These mechanisms include files, shared memory, semaphores,
and other shared resources.

• Interrupt : All processors have the capability to receive external signals
via dedicated interrupt lines. If an interrupt line is set, the processor
will halt execution and jump to an interrupt handling routine. Interrupts
are electric signals caused by some hardware (peripherals like network
cards or IDE disk controllers) and have a software counterpart that is
part of the operating system.

• Interrupt Interception : In RTCore, no interrupt will directly reach
Linux’s interrupt handlers, as every interrupt is handled by the interrupt
interception code first. If there is a real-time handler available, this
handler will be called; otherwise the interrupt will be passed on to
Linux for handling when there is time.

• Interrupt Handler : The action that should be taken when an interrupt
occurs is defined in a kernel thread that is called upon receiving that
interrupt. The mapping of interrupt service routines to an interrupt
handler can be done by the GPOS kernel as well as by a real-time
thread. This means that there can be two handlers for the same
interrupt in RTCore: in this case the real-time handler is called first,
and only if the interrupt is not destined for the real-time handler will
it be passed to the GPOS interrupt handler for execution.

• Interrupt Mask : An interrupt mask determines which interrupts can
actually reach the system. A bit mask is used to enable/disable
interrupts.

• Interrupt Response Time: On asserting a hardware interrupt, the
system will call the associated interrupt service routine. The time from
the assertion of the interrupt (the electric signal being active on the
interrupt pin) to the point where this interrupt service routine is called
is defined as the interrupt response time. In practice, the interrupt
response time is measured from asserting the interrupt until the system
acknowledges it or responds with a noticeable action. This time is
therefore a little longer than the “theoretical” interrupt response time.

• Instruction Set: To communicate with a specific piece of hardware, a
set of operations is used to manipulate it directly (i.e. manipulate
register content). This instruction set is hardware specific and directly
maps to machine code.

• Jitter : Jitter values represent the time variance in completion of an
event. This can represent anything from task completion variance to
real-time scheduling variance.

• Kernel : The kernel is the core of an operating system, providing the
basic resources and controlling access to these resources.

• Kernel Module : Modules are dynamically loaded capabilities,
represented as object code that is linked into the kernel as needed. Once
a kernel module is loaded, it is no different from a statically compiled
in kernel function.

• Kernel Thread : A kernel thread is similar to a normal thread in that
it represents a specific execution path, although in this case it runs
within the kernel. Kernel threads can have more restrictions than
normal threads, such as stack space, but offer the advantage of access
to kernel structures and subsystems.

• Latency : The time between requesting an action and the actual oc-
currence.

• Local Variables : As opposed to global variables, local variables are
only visible to a single thread or single execution scope.

• Multithreaded : A process that has more than one flow of control (in
general, there are also shared resources between these control paths).

• Mutex (Mutual Exclusion Object): A mutex is an object that allows
multiple threads to synchronize access to shared resources. A mutex
has two states: locked and unlocked. Once a mutex has been locked
by a thread, all other threads that try to lock it will block until the
thread that acquired the mutex unlocks it. After this, one of the blocked
threads will acquire it.

• Polling : Polling is the strategy of checking a condition or a condition
change while in a loop. Generally, polling is an expensive way to test
for conditions or condition changes.

• Priority Inversion : If a high priority thread blocks on a mutex (or
any other synchronisation object) that was previously locked by a low
priority task, priority inversion results: the low priority thread must
temporarily gain a higher priority in order to guarantee execution time.
Otherwise, a third thread of intermediate priority may come along and
prevent the low priority task from running, so the mutex is never freed
and the high priority thread is stalled as well. This scenario, in which
a lower priority task blocks a higher priority task, is priority inversion.

• Process : An entity composed of at least one thread of execution and
a set of resources managed by the operating system that are assigned
to this entity.

• Race Condition : If two executing entities compete for a resource and
there is no control ensuring safe access to the resource, unpredictable
behavior can occur. Race conditions can occur with any shared resource
if appropriate synchronization is not done by all entities that require
access to this resource.

• Re-entrant Function : A re-entrant function will behave in a predictable
way even if multiple threads are using it at the same time. Any
synchronisation or access of global data is handled in a way that makes
it safe to call these functions multiple times concurrently without fear
of data corruption.

• RR Scheduler : In Round Robin scheduling, there are different priority
levels available, and the ordering of threads/processes is the same as
in SCHED_FIFO. The difference is that each scheduling entity has a
defined time-slice. If it does not exit or block before the time-slice
expires, it will be preempted by the kernel and the next runnable thread
will be scheduled.

• Scheduler : The thread that handles the task-queue of the system; it
decides which process is to be run next after a process gives up the
CPU (either by exiting or blocking). The order in which the scheduler
will grant control of the CPU is described by the scheduling policy and
the priority assigned to each task.

• Scheduling Jitter : The variance of time between the point at which
a process requested scheduling and the time at which it actually runs.
In the common literature, scheduling jitter will sometimes refer to the
absolute deviation from the requested timing.

• Semaphore : The simplest form of a semaphore, the binary semaphore,
is equivalent to a mutex. Associated with a semaphore is a counter that
defines the number of threads that can access the protected resource via
the semaphore. On access of the protected resource a thread acquires
the semaphore by decrementing the counter. If the counter reaches
0 no other threads can access the protected resource. When a thread
releases the protected resource it increments the semaphore again. The
underlying mechanism is a condition variable with the condition
counter > 0.

• Shared Memory : Memory accessed by more than one process. Shared
memory can be accessed from a real-time process as well as from
non-real-time (GPOS) processes for data exchange or for process
synchronisation. RTCore offers this mechanism, although there are
many types of shared memory systems.

• Signal : A numeric value delivered to a process via system call, de-
scribing an action to be taken by the process. The process may accept
a signal or mask it. If a process has a signal handler installed for the
signal number sent, this handler will be executed on arrival of the signal.
Signals issued from a thread within a process can be posted to a spe-
cific thread (via the thread id), while signals sent between processes are
received at the process level and are not directed to a specific thread.

• Signal Handler : To manage asynchronous signals at a process level,
signal handlers are installed. These can then be called by the thread
that received the signal. Note that signal handlers are installed at the
process level and not at the thread level, so if an asynchronous signal is
received, it cannot be directed at a specific thread. Only signals issued
from within the process can be sent to specific thread IDs that exist
within that specific process.

• Sigaction : The sigaction call controls the actions taken upon reception
of a given set of signals; among other things, it installs signal handlers
for those signals.

• Soft Interrupts : All GPOS interrupts in RTCore are soft interrupts.
These interrupts are not delivered directly by the hardware; rather,
they are hardware events that the real-time kernel has passed on to the
GPOS for management because no real-time interrupt handler was
associated with the interrupt.

• Soft Realtime : Systems that can provide guaranteed average response
times to a class of events, but cannot provide a guaranteed maximum
scheduling variance.

• Spinlock : Waiting on a lock can be done in an infinite loop, probing
for the lock on every iteration. Spinlocks keep the CPU busy while
waiting, and are thus "expensive" operations if it is not ensured that
the thread will only spin for a very short time. A spinlock is efficient
only if it is not held longer than the amount of time it would take to
perform a context switch.

• Spurious Wakeup : If a thread is waiting on a condition variable and
receives a signal, it can be woken up and could return from the wait
even if the condition the code was waiting for hasn't actually occurred.
To prevent race conditions due to spurious wakeups, the condition is
re-evaluated in a loop around the condition wait. This way a spurious
wakeup will be caught and the code will continue to wait for the
condition.

• Stack : To pass arguments and context information for function calls,
each process and thread has a stack associated with it. This stack is
private to each process or thread.

• Symbol Table : A symbol table exists to map a mnemonic symbol to
an address or location where the contents are stored. In the context of
this book, this generally refers to the kernel symbol table, which maps
the addresses of kernel structures. In the general case, this can be used
anywhere, such as in a custom application.

• Synchronous Signal : Any signal that is the result of the thread's
action, and occurs in direct reaction to that action. This is opposed to
asynchronous signals, which may arrive at any time and may not be
related to a thread action. An example of a synchronous signal would
be a thread that does a division by zero causing an FPE INTDIV.
Synchronous signals are delivered to the process that caused it and not
to the specific thread.

• Task : In the process model a task represents a process, while in the
thread model a task can be a process (single threaded process) or a
thread of execution within a process. The task concept is used in the
V1 API of RTCore and was replaced by the thread based POSIX API.
Usage of the V1 API is not recommended.

• Task Priority : Every task will be called by the scheduler to execute in
an order specified by its priority level. POSIX specifies a minimum of
32 priority levels. Besides the priority of a task, its scheduling policy
(SCHED FIFO, SCHED RR, etc.) will influence when it is run as well.
A priority level only specifies a task's rank within its scheduling policy,
in relation to other tasks in that same scheduling class.

• Thread : An independent flow of control within a process, having an
execution context associated with an instruction sequence that can be
executed. A thread is fully described by its context, instruction
sequence, and state.

• Thread Context : Each thread exists within the context of a process,
and this context is comprised of a set of resources, such as register
context, stack, private storage area, attributes and the instructions to
execute. It also includes the structures through which the thread is
accessible (thread structures and other management constructs).

• Ticks : Each cycle executed by a processor counts as a tick. Processors
such as the Pentium maintain a count of the number of these ticks that
have occurred since boot time. This value is useful as an indicator of
the length of time of a task, among other things.

• Timers : Hardware components on the motherboard or integrated into
the CPU that measure time, and can be the source of a periodic trigger.

• User-space Thread : These are threads created, synchronized, and
terminated using the threads API running in user space. They are
associated with a kernel-scheduled entity (a process) and are not visible
to the kernel, but are rather scheduled by a separate scheduling entity
that lives within the process. The kernel only sees a single process and
will not distinguish between the different threads. Note that this differs
from userspace POSIX threads, in which each thread appears to the
kernel as a schedulable process.

• User Mode : A mode of operation where access is restricted to the
subset of functions available to normal user-space processes. Kernel-
level subsystems and special processor modes are not available to the
userspace code. A thread executing application code will do this in user
mode until it issues a system call, which the kernel will then execute on
behalf of the process. Once the system call completes and returns, user
mode is reentered.

• User-space : The memory space a user process exists in. Execution
of user code and all resources associated with a user mode operation
reside in user space. User space is left when a privileged operation is
executed (syscall).
Appendix C

Familiarizing with RTLinuxPro

This chapter is intended to provide a simple overview of how to interact with
RTLinuxPro. It is assumed that the kit is already installed as defined by the
instructions provided with the CD or download.
RTLinuxPro installs into a root directory whose location is defined by the version,
and cannot be safely moved from that point. This is because all of the tools
are built against a known location, and depend on the existence of that
point for configuration, libraries, and other information. This allows the kit
to be installed on any distribution, regardless of host glibc version, installed
utilities, etc.
This installation directory is different in every version, so that you can
keep multiple installations on the same machine. The root path is /opt/rtldk-x.y,
where x is the major release number, and y is the minor release number. The
remainder of this chapter will assume this path as the root location of all
commands.

C.1 Layout
The installation guide should walk you through the specifics of each directory,
so we will focus only on the important ones here:

1. bin - This is where all of the tool binaries exist. Make sure that the full
path to this directory is first in your $PATH variable, so that you use
these tools before any others in your path. (Running gcc -v should
report information including the /opt/rtldk-x.y path if this is
configured properly.)


2. rtlinux kernel x y - This is the prepatched kernel to be loaded on the
real-time system, whether that is the development host or a target
board. You'll need to take the precompiled image and install it like any
other kernel or rebuild it to suit your environment. (The x y value will
correspond to the major and minor kernel version numbers provided
with the release.)

3. rtlinuxpro - All of the RTCore components (and optionally, code),
scripts, examples, drivers, and other RTCore-specific tools are con-
tained here.

There are many other directories, but these are central to our uses here.
Also of note is the docs directory, which contains API documentation and
more information on getting started with RTCore.

C.1.1 Self and cross-hosted development


There is a large divergence in ways that the kit is used, as each embedded
system has its own set of specific requirements. In many cases, the develop-
ment kit will be installed on an x86 machine, but built with compilers and
real-time code targeting a different architecture. In this case, you will need
to find the correct way of getting the kernel image, real-time modules, and
filesystem to the embedded device, whether it is a flash procedure, a BOOTP
configuration with an NFS root, or whatever is appropriate. For simplicity’s
sake, we will assume that the installation of the development kit is on the
machine that will be used for the actual real-time execution. (Although this
is not always an optimal solution.)
If you are doing cross-hosted development, refer to the installation in-
structions provided with RTLinuxPro. The installation manual provides
some example procedures for getting a kernel built and transferred to the
target board. As each board varies slightly, we won't be covering the specifics
of this procedure here.

C.2 Loading and unloading RTCore


RTCore must be loaded in order for any real-time services to be available.
The process is simple, once you have compiled and installed the patched

kernel. This procedure is the same as with any other kernel, and as the
procedure is beyond the scope of this book, we suggest the normal
Kernel-HOWTO for details. Essentially, it involves changing to the
rtlinux kernel 2 4 directory, building a kernel image suited to your
device needs (or using the
provided stock image), and installing that image. This may be a local LILO
or GRUB update, or it might be a matter of making the image available for
TFTP by an embedded board. Again, we assume a self hosted development
environment for this example.
Once the system is running the correct kernel, RTCore can be loaded
with the following commands:

cd /opt/rtldk-x.y/rtlinuxpro
./modules/rtcore &

This will load the RTCore OS found in the installation, which will vary
based on any additional components installed. Unloading the OS consists of:

killall rtcore

C.2.1 Running the examples


Now you can run some of the examples, or run the regression test with
scripts/regression.sh in the rtlinuxpro directory. In order to get a feel
for the steps needed to load and run real-time code, it is worth stepping
through the examples provided. Each of the examples is built to be self-
explanatory, and can be run by just executing the local binary.
Once the application code is running, the test will generally continue
indefinitely. After you are done running it, the application can be stopped
with a CTRL-C.

C.3 Using the root filesystem


Included with the development kit is a root filesystem, built for the intended
target of the kit. This means that if you are using the generic PowerPC
version of the kit, there is a root filesystem containing a set of binaries built
for generic PowerPC root. This will provide a solid Linux installation for
use by the development system. For a generic PowerPC version, there is a
ppc6xx_root directory inside the development kit tree, but this name will
vary by architecture.
For a generic x86 system as we described in the installation section, you
will likely use the host filesystem already present. However, if you intend
to use separate systems for development and testing (as advised) or are
targeting a different architecture completely, this option should help speed the
development process. For many embedded systems, it is much simpler to NFS
root mount a remote filesystem, at least for testing, rather than rebuilding
an image every time you generate new binary code for the target.
For most distributions, exporting this tree is a very simple exercise,
and is no different than exporting any other NFS mount point. Edit your
/etc/exports and run exportfs -a, and the tree will be available to the
embedded system. In many environments, it is also advisable to simply have
the device retrieve its kernel image from DHCP, and build the image such
that it automatically mounts the root filesystem from your development ma-
chine. If this is useful for your environment, the kernel build offers an option
to build the boot parameters in as automatic arguments to the bootstrap
process. For example, under a PowerPC build, under the kernel’s ’General
setup’ option, you can set the boot options to be (as one variable):

root=/dev/nfs nfsroot=10.0.0.2:${RTLDK_ROOT}/ppc6xx_root ip=bootp

The setting defined here sets the root filesystem to be NFS, and that this
root system lives on the machine at 10.0.0.2, under ${RTLDK_ROOT}/ppc6xx_root.
Be sure to replace ${RTLDK_ROOT} with the correct location of your de-
velopment kit. The IP setting configures the device to use bootp in order to
configure itself, although there are many options that may be used in order to
configure the interface. These arguments are built into the kernel image, and
are passed as normal parameters to the boot process during a TFTP-based
boot, just as if they were typed in at a LILO prompt. For more information
on these options, refer to Documentation/nfsroot.txt inside the Linux ker-
nel tree. Many users need read/write access to the root filesystem, at least
for testing. Add a rw after the root=/dev/nfs to use this, or remount your
NFS root as read/write on the target, with:

mount -o remount,rw 10.0.0.2:${RTLDK_ROOT}/ppc_root /

Once these options are configured and your remote device is using the
NFS mount as its root filesystem, you can do all development on the host
machine with the development kit, and move the resulting images under
the NFS mount point. For simplicity, it is often useful to simply copy the
rtlinuxpro directory from the kit under the NFS root mount. While some
of these pieces should be removed for the final system, this simple copy will
allow access to all of the targeted real-time code needed for the embedded
device.

C.4 Summary
This chapter on development kit usage might come across as being rather
light, and there is a reason for that. The development kit is intended to be
simple to use, and to allow a programmer to install a stable build environment
for producing real-time code. As such, this involves installing the kit, placing
the tools in your path, and then using the various components (such as the
root filesystem and modules) as needed. The intent is that configuration and
use is as simple as possible, allowing the programmer to concentrate on the
task at hand, and not have to be distracted by development tool problems
in the build environment. Specific details such as board configuration for
network boot are described in more detail in the devkit manual.pdf document
provided with RTLinuxPro.
Appendix D

Important system commands

This is an overview with some usage examples that might be helpful when
working with RTCore, and most UNIXes in general.

bunzip2
The bzip2 and bunzip2 commands are for compressing and decompressing
.bz2 files. Bzip2 offers better compression rates than gzip, and is becoming
more popular on FTP sites and other distribution locations.

bunzip2 linux-2.4.0.tar.bz2
Decompress the compressed archive.
bzip2recover file.bz2
Recover data from a damaged archive.
bunzip2 -t file.bz2
Test if the file could be decompressed, but don’t do it.

dmesg
The kernel logs important messages in a ring buffer. To view the contents
of this buffer you can use the dmesg command.

dmesg
Dump the entire ring-buffer content to the terminal.
dmesg -c
Dump it to the terminal and then clear it.
dmesg -n level
Set the level at which the kernel will print a message to the console. Setting
dmesg -n 1 will only allow panic messages through to the console, but
all messages are logged via syslog.

find
find can be used to find a specific fileset in a directory hierarchy, and
optionally execute a command on these files.

find .
List all files in the current directory and below.
find . -name "*.[ch]"
List all files in the directory and below that end in .c or .h (c-sources and
header files)
find . -type f -exec ls -l {} \;
Find all regular files and display a long listing (ls -l) of them.
find . -name "*.[ch]" -exec grep -le min ipl {} \;
List all files in the directory hierarchy that contain the string "min ipl"
in them.
find /usr/src/ -type f -exec grep -lie MONOTONIC {} \;
List all files below /usr/src/ that contain the string MONOTONIC, using
a case-insensitive search. (MONoToniC will also match.)

grep
grep is for searching strings using regular expressions. Regular expres-
sions are comprised of characters, wildcards, and modifiers. Refer to the grep
man page or a book on regular expression syntax for details.

grep -e STRING *.c


Display all lines in all files ending with .c that contain STRING.
grep -ie STRING *
Display all lines in all files of the local directory that contain STRING in
upper or lower case. (e.g. StrInG)
grep -ie "void pthread" *.c
Find the string "void pthread" in any .c files. The quotation marks are
required to enclose the blank in the string.
grep -e "char \*msg" *.c
Find the declaration of "char *msg" in the .c files of the local directory.
The "*" must be escaped so that it is not interpreted as a wild card.

gunzip
The gunzip command will decompress .gz files. You will not need any
options for decompression. For compressing files use gzip.
gunzip FILE.gz
Decompress FILE.gz which will rename it to FILE in the process.
gunzip -c FILE.gz
Decompress FILE.gz and send the decompressed output to standard out-
put, and not to a file.
gzip FILE
Compress FILE renaming it to FILE.gz with the default compression
speed.
gzip -9 FILE
Compress FILE with the best compression ratio. (This will be slow)

init
Init is the master resource controller of a SysV-type Unix system. While
testing an RTCore system it is advisable to do this in runlevel 1, which
is a single user mode without networking and with a minimum set of system
resources.

init 1
Put the computer into runlevel 1. (No networking, single user mode)
init 2
After tests in init 1 ran successfully, bring the box back up to a multiuser
networking system. This need not be runlevel 2, and will vary depending on
which UNIX you are running. Check /etc/inittab to see which runlevel
is the system default runlevel. It should be safe to run it back up to the
runlevel set as default.
init 6
Reboot the system.
init 0
Halt the system.

locate
Many but not all Linux systems have the locate database available, which
caches all filenames on the system and makes it easier to locate a specific file.

locate irq.c

List all files on the system that have irq.c in them. (alpha irq.c and
irq.c.rej will also match.)
locate rtlinux | more
If the search is too general, output will be more than a screen. By piping
the output into the "more" program a paged listing is displayed.

make
GNU make is one of the primary tools of any development under modern
UNIXes. Given a makefile, Makefile, or GNUmakefile, which are the default
names make will look for, make will build a source tree, resolving
dependencies based on the information and macros given in the makefile.

make -f my_makefile
This will run make with the provided my_makefile, if the name isn't one
of the default names that GNU make will search for.
make -n
This will instruct make only to report what it would do, but will not
actually process any source files.
make -k
Normally make will terminate on the first fatal error it encounters. With
the -k flag make can be forced to continue. This makes sense if within a
source tree multiple independent executables are to be built, and one wants
to build the rest even if the first fails.
make -p -f /dev/null
Show the database settings that make will apply by default without actu-
ally compiling anything. This will list all implied rules and variable settings.

objdump
objdump allows you to view symbol information in object files, such as
kernel modules. It also allows you to disassemble object files. This is helpful
when trying to locate what could be causing system hangs with a module.
The output is not very user friendly, but if short functions were used it should
not be too hard to read. If long functions with many flow control statements
were used, it can be close to unreadable.

tar
Archives ending in .tar (compressed tar files will end in .tar.gz, .tar.bz2,
or .tgz) can be unpacked with tar. To make this operation safe, check what
is in the archive and where it will be unpacked to first!

tar -tf rtlpro cd.tar


List the files contained in the archive.
tar -tvf rtlpro cd.tar
Gives you more details on the files than the above command.
tar -xvf rtlpro cd.tar
Unpack the rtlpro cd.tar archive in verbose mode. This will list every file
as it is handled.
tar -cvf mycode.tar mycode
This will pack up the content of the directory "mycode" into the archive
mycode.tar, naming every file as it is processed.

uname
To get the exact system name of the running kernel, use the uname
command. A common problem is that one has the wrong kernel running and
runs into "funny" problems this way, such as symbol problems on module
load. Running uname should clear up any question of what kernel is active.

uname
Print the system type. (e.g. "Linux")
uname -m
Print the system hardware type. (e.g. "i586")
uname -r
Print the kernel release name of the running system. (e.g. 2.4.16-rtl)
uname -a
Print the full system string, dumping all known information about the
running kernel.
Appendix E

Things to Consider

There are limits to RTCore introduced by underlying hardware that in
principle cannot be bypassed in software. These limits need to be considered
during a project’s planning stage, or at the latest, when selecting a hardware
platform. Example code provided in the RTLinuxPro package will perform
some basic tests on your system in order to judge its appropriateness for
real-time work, but here are some common stumbling blocks that developers
run into.

E.0.1 System Management Interrupts (SMIs)


Essentially all Pentium-class systems have the capability to use SMIs, but
it has only rarely been done. Some platforms, though, make heavy usage
of SMIs to control peripheral devices like sound cards or VGA controllers.
SMIs are interrupts that can’t be intercepted from software. Consequently,
RTCore will be prevented from operating correctly during SMI execution.
Preventing SMIs from controlling hardware is generally not a problem: Sim-
ply select peripheral devices that don’t require SMIs. This is a simple choice
for almost all ISA/PCI/AGP cards, although it is not necessarily true for
onboard controllers. In rare cases, SMIs have been ”used” to correct design
bugs in the hardware, so make sure to keep away from such hardware when
selecting components for a real-time system. Check with your vendor for
details.


E.0.2 Drivers that have hard coded cli/sti


There are drivers available for Linux which may have hard coded cli/sti (clear
interrupt flag/set interrupt flag), that will cause problems in conjunction with
RTCore. To make sure a driver is not using cli/sti, use the command objdump
to check for cli instructions. Good candidates for such hard-coded cli/sti’s
are binary released drivers for Linux. Vendors of such drivers most likely did
not take real-time requirements into account when designing their drivers. It
is very important to perform this check on binary drivers - if you don’t see
delays during normal execution, it is not safe to assume that they are not
there, as that code path may not have been triggered yet.

E.0.3 Power management (APM)


Most laptops and some desktop PCs now have power management hardware
included, which optimizes power consumption by reducing system clock fre-
quency, memory timings and bus frequencies (and probably other things as well).
This has clear implications for real-time systems; if timers change their be-
havior during operations, consequences are at best hard to predict. In gen-
eral, a system that is using power management will not be very good for
real-time operations, unless these effects have been explicitly addressed by
drivers and the core real-time system. If this is not the case, power manage-
ment must be disabled.

E.0.4 Hardware platforms


RTCore is dependent on certain hardware behavior for successful operation.
This might be most obvious for peripheral devices like data acquisition boards
or stepper motor controller boards, but "standard" hardware dependencies
are often overlooked.
Depending on application demands, hardware platform selection can make
or break the project. It is important to find a platform that can provide the
performance and accuracy you need for your application. With RTLinuxPro,
a targeted evaluation is recommended to ensure that the machine can pro-
vide appropriate accuracy, followed by a strict analysis of program demands
to see if the specifications of both hardware and software can be met.
There may be a lot of flexibility here, depending on need. For a very high
performance application, the range of possible architectures may be limited
to a small handful of target systems. But for others, such as a few low
frequency sampling threads, a much slower system will likely be cheaper and
still provide ample resources.
A prime example of this is the National Semiconductor Geode processor.
While it is x86-compatible, many operations are virtualized on the chip,
meaning that performance may degrade during certain time windows. (Video
and audio are two known problem areas.) While the chip goes into System
Management Mode (SMM) to handle this activity, hardware-induced jitter
may spike as high as 5 milliseconds. For many applications, this is the kiss
of death - but others may be fine with this level of jitter. For these lower
bandwidth applications, the Geode is a cheap x86-compatible solution for
the field, and the jitter is within specification.
It is because of these situations that FSMLabs recommends evaluation
and testing with the RTLinuxPro test suite, followed by hard analysis of
application demands. If the target hardware will suit the application, it may
not matter if there is potentially 5 millisecond jitter - in this case, a Geode
is perfectly suitable. The important part is that requirements are built and
understood so that the proper hardware and software configuration can be
selected.

E.0.5 Floppy drives


Typical PCs include a floppy drive. For historic reasons, the floppy drive is
able to change the bus speeds, and floppy drivers do CMOS calls to select
the floppy type. The consequence for RTCore is that scheduling jitter can
substantially increase if the floppy is accessed. The simplest solution is not
to have a floppy drive on a real-time system. If a floppy drive is absolutely
necessary, these effects must be taken into account. That is, you must test
your real-time threads while accessing the floppy drive, to ensure that it is
not disturbing real-time operation in an unacceptable manner.

E.0.6 ISA devices


In a PC-based system, compatibility with older hardware is available only
at a relatively high performance penalty. A typical example of this is the
PCI-ISA bridge that can be the dominating cause of worst-case system jitter
in a system. When making the decision of which hardware to select for a
real-time system, careful consideration should be made concerning the ISA
bus. If a system can be designed without an ISA-bus, it is the preferable
choice.
RTCore will not be able to compensate for slow hardware in all cases:
If the bus is controlled by an ISA device, RTCore will have to wait. When
an ISA DMA request occurs, everything is clocked down to the speed of
the ISA bus and waits until the transfer finishes. Thankfully, the ISA bus
is being removed entirely from many modern designs, so unless you have
specific hardware that is ISA-only, the entire issue can be avoided.

E.0.7 DAQ cards


Data acquisition is one of the more common tasks where one would use an
RTCore-based PC system. When designing such a system, it is important
to carefully consider which data acquisition peripherals should be used. De-
pending on project demands, there are a variety of cards offering varying
levels of capabilities.
Depending on the included hardware, some cards will sample data au-
tonomously, buffering into their own internal storage, and will only notify
the system when a large amount of data has been collected. For acquisition
rates that outpace the timing capabilities of the host machine, this can be
beneficial, but it usually comes with some kind of cost.
On the other hand, some cards operate in a polling manner, allowing you
to set up the sampling rate purely from real-time code. This has advantages
in that you can use simpler hardware without internal buffering, but it re-
quires the host machine to be capable of performing the requested sampling
rate. For most applications, the best choice is somewhere in the middle,
allowing some work to be done on the board, and some in RTCore, without
raising costs too much.
Appendix F

RTCore Drivers

This appendix covers specific details on drivers provided with RTCore.

F.1 Digital IO Device Common API


Digital IO devices advertise their services through files that can be operated
on with open, read, write, ioctl and close.
IO devices must first be opened with a call to open using the appropriate
flags. The read/write mode of the returned file descriptor only affects write
and read calls. ioctl calls completely ignore the read/write status of the
file descriptor and allow reading or writing as requested. Once operations on
the device are complete a normal call to close is necessary.
Most operations are done through ioctl. There are a number of ioctl
calls that operate on devices listed below.
Setting or clearing a single bit is done with:
int fd;
/* set bit #3 */
ioctl( fd, RTL_SETBIT, 3 );
/* clear bit #10 */
ioctl( fd, RTL_CLEARBIT, 10 );
One can clear specific bits and set other bits atomically. That is, clear
some and set some in a single operation.
int fd;
unsigned long mask[2];

/* clear bits #4 and #8 */
mask[0] = (1<<4) | (1<<8);
/* set bit #2 */
mask[1] = 1<<2;
ioctl( fd, RTL_CLEARSETBITMASK, &mask );
Sometimes one wishes to change specific bits without altering the input
or output state of other bits. Writing a specific mask of bits without changing
the state of any other bits is shown below.
int fd;
unsigned long mask[2];

/* wish to write bits 4 and 8 */
mask[0] = (1<<4) | (1<<8);
/* set bit 4 to a 0 (low) and bit 8 to a 1 (high) */
mask[1] = (1<<8);
ioctl( fd, RTL_WRITEBITMASK, &mask );
The code below shows reading from certain bits without changing any
output of other bits.
int fd;
unsigned long mask[2];

/* wish to read bits 4 and 8 */
mask[0] = (1<<4) | (1<<8);
ioctl( fd, RTL_READBITMASK, &mask );

/* mask[1] contains the value of bits 4 and 8 */



F.2 Intel 82C55 Digital IO


This driver supports the Measurement Computing PCI-24 and most boards
that use the Intel 82C55 chip.
This driver allows control of the Intel 82C55 GPIO digital lines. The
driver follows the standard API conventions for digital IO in RTCore. Please
see the section that describes this for details.

F.2.1 Driver specifics


The Intel 82C55 provides 24 input and output lines that are configurable as
input or output in 4 different banks. The banks are described below:
• /dev/dio1024_0 — 8 bits

• /dev/dio1024_1 — 8 bits

• /dev/dio1024_2 — 4 bits

• /dev/dio1024_3 — 4 bits
This driver only supports mode 0 (basic input and output) as described
in the Intel 82C55 manual. It does not allow data strobing or parallel data
communication.

F.3 Marvell GT64260 and GT64360 Digital IO Driver


This driver allows control of the Marvell GT64260 and GT64360 GPIO digital lines. The
driver follows the standard API conventions for digital IO in RTCore. Please
see the section that describes this for details.

F.3.1 Driver specifics


The Marvell chipset allows each individual bit to be configured as either input
or output regardless of the state of other bits. Up to 32 lines are available
but many board configurations do not actually run all the lines out from
the chip to connectors on the outside. It is often the case that the Marvell
chip bit 0 does not correspond to the output pin on the board. Read the

documentation on your board to be certain of which pins are run out from
the Marvell and to where.

F.4 Video Framebuffer Driver


This section describes the video framebuffer driver that allows real-time video
display.
The framebuffer driver exports an interface through the device files /dev/fb0,
/dev/fb1 and so on.

F.4.1 Calling Contexts


The /dev/fb* devices allow access to the framebuffer device inside of RTCore
threads and GPOS routines (inside of main() functions in RTCore applica-
tions). Operations in interrupt handlers are not allowed due to the normal
limits of ioctl(), open() and close().
These devices are only accessible from kernel space as the device only
exists in RTCore, not in the standard Linux environment. Linux applications
that operate on /dev/fb* will instead be acting on the Linux framebuffer
device.

F.4.2 Operations on the Framebuffer


Applications may open /dev/fb* devices and then pass that file descriptor
to any of the RTCore graphics functions.
The Linux kernel must be configured with a working framebuffer device
for this interface to work. The kernel distributed by FSMLabs already in-
cludes support for this.
Framebuffer devices support these ioctl() calls:

• RTL_FB_GET_XRES - returns the X resolution

• RTL_FB_GET_YRES - returns the Y resolution

• RTL_FB_GET_VIRTXRES - returns the virtual X resolution

• RTL_FB_GET_VIRTYRES - returns the virtual Y resolution



• RTL_FB_GET_SCREEN_BASE - takes a pointer to an unsigned
long and sets it to the base address of the framebuffer, allowing direct
modification by applications

• RTL_FB_GET_SCREEN_BASE_SIZE - returns the size (in bytes)
of the framebuffer

• RTL_FB_SET_SCREEN_OFFSET - takes a pointer to 2 unsigned
longs. These represent the X and Y offsets to set. Used to pan the
display or re-orient it.

See the man page for rtl_put_pixel for further operations on framebuffer
devices.

F.4.3 Examples
Before operating on these devices, the framebuffer device must be opened and
the screen must be configured for the proper resolution and bit depth. This
interface does not allow changing these parameters at this time. To do this,
one should use the fbset utility from the command line before opening the
device.
It is generally recommended that framebuffers be configured to 16 or 24-
bit depth, since palette management routines are not supported under this
interface yet. The interface is also geared towards dealing with colors as
R/G/B triples, since all calls take them as arguments.

F.5 IEEE-1284 – Parallel Port Digital IO Driver


This driver supports most parallel port devices for digital IO only. It does
not support parallel communication. The driver follows the standard API
conventions for digital IO in RTCore. Please see the section that describes
this for details.

F.6 Power Management Driver


The RTCore power management driver provides the ability to change the
processor frequency (when the hardware supports this) on a per-thread basis

or immediately, and allows power saving by putting the CPU into an idle state
when the system is not active.

F.7 Frequency changing


This driver allows per-thread and immediate changing of CPU frequency.
Please see the man page for gpos_freq_list for details on this feature.

F.8 CPU Idle calls


When it starts, the power management driver creates an RTCore thread that is
lower priority than all other RTCore tasks and lower priority than the GPOS
itself. This low-priority thread enters the processor “sleep” or power-saving
mode when it executes. How it enters this mode is configured via the driver.
When the RTCore system is idle (no realtime threads need to execute)
the GPOS finishes any processing that it must do and then notifies the power
management driver that it is now idle. The driver then changes the priority
of the power saving thread so that it executes at higher priority than the
GPOS. Once executing, the power-saving thread enters the CPU specific
power saving (idle) mode and waits for an interrupt. If a realtime interrupt
or realtime thread becomes active and needs to execute during this time, it
is immediately scheduled with no delay. The power-saving thread only
affects execution of the GPOS.
When the GPOS is allowed to execute again is configured in the power
management driver. It can be much more efficient to not allow the GPOS to
execute again for a given period of time rather than switching between the
GPOS and the power-saving mode rapidly. The driver has a configurable
parameter to set a minimum amount of time that the system will spend in
power-saving mode once it enters it before allowing the GPOS to run again.

F.9 Additional Uses


For information on additional features or functionality please email us at:
support@fsmlabs.com. The power management driver is designed to be
a power management infrastructure that is highly flexible, allowing you to
meet your performance and power consumption needs. To achieve this we
have provided the power management tools but not a complete and generic
solution to every problem “out of the box”. We’re happy to assist you in
configuring and setting up your power management, so please contact us.

F.10 PPS driver


The pulse-per-second (PPS) driver included with RTCore creates a clock
(CLOCK_PPS) that is synchronized with an external time source.

F.11 Input
The PPS driver takes as input a signal from a digital IO device that
transitions from a low to a high state once per second, on the boundary of
every second.
A second input signal can be used for cross-checking. For example, the
driver is currently set up to allow a #define change in the source code to allow
the primary PPS signal (for example, a rubidium clock) and a secondary PPS
signal (for example, a GPS) to be checked against one another. If the two
differ by more than 20 microseconds, the discrepancy is reported.

F.12 Timing and how it works


The PPS driver re-calculates how many processor timer ticks elapse between
each PPS signal. This gives a calibration of how “off” the on-chip timer
is from the PPS source. The PPS driver then adjusts its estimate of how
long each timer tick takes. It then hides all this by presenting the user with
a CLOCK_PPS abstraction. When an application requests an operation on
CLOCK_PPS, the calculation of “corrected” time takes place transparently.
For example, when making a call to read the current time with:

struct timespec next;


clock_gettime( CLOCK_PPS, &next );

The call returns the nearest estimated time that is synchronized with the
PPS clock. At each PPS pulse the driver recalibrates its current
estimate of time by comparing that estimate against the moment the PPS
transition actually occurs. This deviation from the PPS time is then removed by

slowing or speeding the estimate of actual time during the next second by no
more than 1/4 second per-second and no less than 1/4 the difference between
the actual PPS transition and the estimated transition.
In practice the estimated clock differs by no more than 6 microseconds
with typical hardware and a stable PPS source (Rubidium clock or GPS).

F.13 Using the driver


The PPS driver can be used anywhere by replacing CLOCK_REALTIME
with CLOCK_PPS.

F.13.1 Starting the driver


To start the driver, run it as any other application. The compiled-in default
IO device to poll for the PPS signal can be overridden with a command-line
argument, for example: pps.rtl /dev/dio1024_0.

F.13.2 Applications
It is often the case that an application will wish to set the absolute time before
using the PPS driver, since the PPS driver only keeps CLOCK_PPS in sync
with the PPS signal but is not initialized with any particular absolute time. Ini-
tially, CLOCK_PPS starts with the value returned by CLOCK_REALTIME.
If an external time source is available with the PPS (a GPS date/time
string, for example) an application can parse it and then set the absolute
CLOCK_PPS time with:

struct timespec ts;


/* ...set ts from some source... */
clock_settime( CLOCK_PPS, &ts );

At the next PPS signal the PPS driver will set CLOCK_PPS to the time
passed in via ts. Between the call to clock_settime() and that next PPS
signal, the driver will report CLOCK_PPS as being invalid.
To check the state of CLOCK_PPS and wait until it is valid and
ready for use (the set-date operation has completed, the PPS signal has been
acquired and setup is complete):

while ( !rb_pps_is_valid ) {
    printf("Rb clock is not valid yet, waiting 10 seconds.\n");
    usleep(10*1000*1000); /* 10 seconds */
}
The same can be done with gps_pps_is_valid to check the secondary
PPS signal source.
Apart from that, one can use CLOCK_PPS just as one would use CLOCK_REALTIME.
To set up a period of 625 microseconds that is aligned with an
even PPS boundary, one can use:
struct timespec next, period = {0, 625000};

/* get the current time and set up so the first wakeup is on a PPS transition */
clock_gettime( CLOCK_PPS, &next );
next.tv_nsec = 0;
next.tv_sec += 1;

while ( 1 ) {

    /* sleep */
    clock_nanosleep( CLOCK_PPS, TIMER_ABSTIME,
                     &next, NULL);

    if ( !rb_pps_is_valid ) {
        printf("Lost Rb pulse, shutting down.\n");
        return NULL;
    }

    /* setup for the next cycle */
    timespec_add( &next, &period );
}

F.14 Caveats
F.14.1 Jitter value
The PPS driver polls the digital IO device that the PPS signal comes in on.
Since missing a pulse would result in inaccurate time calculation, the thread
that polls the device runs at the highest priority and does the polling with
interrupts disabled. This means no other thread can run while the driver is
waiting for the PPS transition, so the time spent polling must
be kept to a minimum to allow other threads to run normally.
The driver source includes an estimated jitter value that can be adjusted.
The polling thread will estimate when the next PPS signal will arrive and
schedule itself to wake up and start polling at 1/2 this jitter value before it
expects the PPS. This ensures that scheduling jitter will not prevent the
thread from catching the PPS signal. The polling thread will wait for the
full estimated jitter time for the PPS signal. If it does not see the signal in
that time it will either abort time synchronization or estimate when
it expected the PPS and continue on, hoping to catch the next PPS signal
(depending on which behavior is configured in the source of the driver).
This jitter value is critical to the performance of the system and varies
greatly with different hardware. It is suggested that the end-user adjust this
to reflect their own system.

F.14.2 SMP systems


The PPS driver works just as well on multi-processor systems as it does on
single-processor systems as long as the time on processors is synchronized.
This is the case with most modern systems.
The polling thread (mentioned above) will only need to run on a single
CPU and can synchronize time for all processors assuming that every pro-
cessor timer runs at the same rate. Even minor differences in timer speed
among processors can cause CLOCK_PPS estimates to become inaccurate
for some processors over a long time. It is suggested that end-users make
sure that the processor clocks run at the same rate, are phase-locked, or that
some other measure is taken to keep the time in sync.

F.15 Serial driver


RTCore supports a driver that controls real-time serial hardware through a
POSIX interface. This includes tcgetattr(), tcsetattr(), and the normal
I/O functions open(), close(), read(), and write().
Applications that use the serial driver must include the following headers
at least:

#include <time.h>
#include <unistd.h>

This provides knowledge about struct termios, which is needed to set
up the serial port. Once the user has an open file descriptor for the serial
device, various flags should be set to configure the port. The following is
example setup code for a user:

struct rtl_termios term;

fd = open("/dev/ttyS0", O_NONBLOCK);
if (fd < 0) {
printf("Unable to open serial device\n");
return -1;
}

tcgetattr (fd, &term);


term.c_cflag = (CS8 | CREAD | CLOCAL);
term.c_use_fifo = 1;
term.c_fifo_depth = 4;
cfsetospeed (&term, B38400);

tcsetattr (fd, TCSANOW, &term);

First, the user opens the device with open() to get a valid file descriptor.
With this, they can now call tcgetattr() to get a filled-out termios structure
with details on the port configuration.
Using this, the user must then set the correct settings for their hardware.
Most settings can be left as above. Most users can leave the hardware FIFOs
enabled at the given depth. Any speed supported by the hardware can be
specified, and the specific settings can be found in the termios.h header.
Speed values can also be set explicitly with cfsetospeed(), which takes
a struct termios pointer and a speed value, such as B115200.
Once the settings are configured, they are applied to the port with tcsetattr().
This enables/disables any settings specified, and prepares the port for work.
Now normal read()/write() calls can be used to read and write data to and
from the port. Calls to read() will return any data already buffered in from
the driver, and calls to write() will write what is possible at the moment,

and buffer the rest so the calling thread can continue on with other work. The
size of the write buffer can be modified if needed with RTL_SERIAL_BUFSIZE.
When the user is done with the device, the file should be closed with a
normal close() call.
Note that the serial driver is intended to be as simple as
possible. It covers a wide range of hardware on multiple architectures,
but may require minimal changes to work on certain hardware. As serial
hardware rarely varies, such modifications are rarely needed.
For example code, please refer to the drivers/serial directory provided
with RTCore. This contains an example that will route data from point to
point between two real-time threads, using FIFOs to interact with non-real-
time data providers and receivers on both ends.

F.16 VME driver


Provided with RTCore is a VME driver for the Tundra Universe II PCI/VME
bridge. This provides hard real-time support for VME activity - as of this
release the driver supports:

• VME interrupts

• Access to A16, A24, and A32 address spaces, using the access methods
defined by the VME specification - D8, D16, D32 and D64.

• Master and slave configurations for the above

• Supervisor and user mode accesses

• DMA transfers

• BLT transfers and, for D64 operations, MBLT transfers

Examples are provided in drivers/examples/vme - these demonstrate


how to handle VME interrupts, set up master and slave windows to the
various address spaces, including how to perform DMA transfers in the A32
address space.
The driver’s operation is very simple - all access is done through the device
/dev/vme_0 with open(), close(), mmap(), munmap(), and ioctl() calls.
Users open the device with open() in order to get a file descriptor that can

be used to set up interrupt handlers (via ioctl()) or get access to an address


space (via mmap()).
If users require multiple windows onto the VME address space, such as
one A16 window and one A32 window, separate open() calls are
required. Each window is accessed via a separate mmap() call.
Please see drivers/examples/vme/ for specific examples of how to use
the VME interface.
To open the VME device:

int fd;

fd = open("/dev/vme_0", O_RDWR);

F.16.1 Interrupts
To register an interrupt handler for VME interrupts, once the device has
been opened:

void int_handler(int level)


{
}

ioctl(fd, RTL_VME_REG_INT, (unsigned long)int_handler);

To generate the VME interrupt corresponding to the variable num:

int num = 1;

ioctl( fd, RTL_VME_TRIGGER_INT, num );

F.16.2 Slave memory regions


One must allocate local host memory before advertising it on the VME bus.
This can be done with rtl_gpos_malloc(), or a shared memory region can
be used, which can then be shared with user processes or other RTCore
threads. Below is an example using shared memory:

int shm_fd;
void *raddr;
unsigned long size = 4<<10;

shm_fd = shm_open("/dev/vme_super_shm", RTL_O_CREAT | RTL_O_DMA, 0777);


ftruncate(shm_fd, size);
raddr = mmap( 0, size, PROT_READ|PROT_WRITE, MAP_SHARED, shm_fd, 0 );

Once the memory is allocated, it can be advertised as a slave region on
the VME bus. The example below shows the memory being advertised on the
VME bus in the A24 space allowing 8-bit access. It is possible to use any
combination of A32, A24, A16 with D64, D32, D16, D8 and supervisor/user
mode.

char *vme_ptr;

vme_ptr = mmap( (void *)raddr, size, 0,


MMAP_VME_A24|MMAP_VME_D8|MMAP_VME_SUPER|MMAP_VME_SLAVE|MMAP_VME_DATA,
fd, vme_addr );

Any VME device on the bus may access this memory at the address
represented by the variable vme_addr. However, the local host cannot access
this memory through the VME bus pointer. It must be accessed as local
memory (the variable raddr in the example above).

F.16.3 Master memory regions


The code below shows how to get a pointer to a master region of VME memory.
The VME bus address is represented by the variable vme_addr and the size of
the window by size. The flags passed into mmap() can be chosen to select
any combination of 32/16/8-bit access and supervisor/user mode.

char *vme_ptr;

vme_ptr = rtl_mmap( 0, size, 0,


RTL_MMAP_VME_A32|RTL_MMAP_VME_D32|RTL_MMAP_VME_SUPER|RTL_MMAP_VME_DATA,
fd, vme_addr );

F.16.4 DMA transfers


DMA transfers into local memory from remote VME memory, and from local
memory to a remote VME address, are supported. The example below shows a
transfer from local memory to a memory address on the VME bus. The
variable buffer points to a local region of memory that is DMA-transferable.

#include <vme.h>
struct vme_dma_desc_s desc;
char *buffer;
unsigned long vme_addr;

desc.vme_addr = (void *)vme_addr;


desc.local_addr = (void *)buffer;
desc.count = size;
desc.flags = RTL_MMAP_VME_A32|RTL_MMAP_VME_D32;
rtl_ioctl( fd, RTL_VME_DMA_TOVME, &desc );

The following example shows a transfer from a remote region of memory


to a local buffer.

struct vme_dma_desc_s desc;
char *buffer;
unsigned long vme_addr;

desc.vme_addr = (void *)vme_addr;


desc.local_addr = (void *)buffer;
desc.count = size;
rtl_ioctl( fd, RTL_VME_DMA_FROMVME, &desc );

F.16.5 Performance
Every effort has been made in this driver to take advantage of the hardware
to provide the fastest and lowest latency transfers and interrupts. However,
optimizing for every case and configuration in a general driver is difficult.

Small changes in the driver or your application may cause performance dif-
ferences.
If you have any questions about optimizing for your application or if
you see performance problems (or less than what you would expect) please
contact us via email at support@fsmlabs.com.
For the fastest transfers possible one should try to use DMA mode wher-
ever possible. In addition, it is better to use D64 when the remote device
supports it. If that is not possible, the largest data operation size is preferable
to smaller ones since this can make a huge performance difference.
Appendix G

The RTCore POSIX namespace

RTCore now provides a fully decoupled and clean POSIX namespace for
real-time applications. Historically, it has provided POSIX names to users,
in addition to names from the Linux or BSD namespace. This means that
users also brought Linux and BSD kernel structures and functions into their
application. As of RTLinuxPro 2.1, this behavior is deprecated by default,
and users do not get GPOS headers unless explicitly requested.
Users are encouraged to read this section to fully understand the system,
but can refer to section G.6 for quick details.
This appendix details the usage of these new headers, and implications for
users porting applications from versions before RTLinuxPro 2.1. The impact
for existing users has been minimized as much as possible while providing a
clean POSIX environment. It is recommended that both new and existing
users review this chapter to ensure familiarity with how to handle clean and
’polluted’ applications.

G.1 Clean applications


Clean applications are ones that use only the RTCore-provided POSIX envi-
ronment, and do not depend on any names, functions, etc., from the GPOS.
These applications can build with the usual CFLAGS and include paths pro-
vided by the rtl.mk file. For example, consider the following simple app:
#include <stdio.h>


#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
int main(int argc, char **argv) {
int ret;
ret = mkfifo("/test", 0777);
if (ret == ENOSPC) {
printf("Error, no space for FIFO\n");
return -1;
}
rtl_main_wait();
unlink("/test");
return 0;
}
This creates a fifo and waits, then unlinks it on exit. It is entirely POSIX
and does not require services from the GPOS, whether it is functions, defined
values, etc. It can be built with a simple Makefile:
include path_to/rtl.mk

all: test.rtl

clean:
rm -f *.rtl

include $(RTL_DIR)/Rules.make
Clean applications can use pure POSIX names as above, or rtl_ prefixes
for all POSIX functions and RTL_ prefixes for all POSIX defined values. For exam-
ple, an application can use RTL_ENOSPC instead of ENOSPC, or rtl_mkfifo() instead of
mkfifo(). These can be used interchangeably in a clean application.

G.2 Polluted applications


Polluted applications are those that use the POSIX environment for real-
time applications, but also need to use supporting services provided by the
non-real-time system. Here is the previous clean program under Linux, using
Linux function calls to set up a non-real-time interrupt handler.

#include <gpos_bridge/sys/gpos.h>
#include <linux/sched.h>
#include <rtl_stdio.h>
#include <sys/rtl_types.h>
#include <sys/rtl_stat.h>
#include <rtl_unistd.h>

void *dev_id = "test";


void gpos_handler(int irq, void *dev, struct pt_regs *regs) {
return;
}

int main(int argc, char **argv) {


int ret;
ret = rtl_mkfifo("/test", 0777);
if (ret == RTL_ENOSPC) {
rtl_printf("Error, no space for FIFO\n");
return -1;
}
request_irq(4, gpos_handler, SA_SHIRQ, "test", dev_id);
rtl_main_wait();
free_irq(4, dev_id);
rtl_unlink("/test");
return 0;
}

The program now requires some Linux headers. RTCore provides a main
file that users can include to get most of the GPOS namespace by default
- gpos_bridge/sys/gpos.h. This does not include all of the namespace,
but it does get a large portion of it. For this example, we include that file
and Linux’s sched.h for the request_irq() and free_irq() function prototypes,
along with everything else they need. If you include gpos.h first, it will define
__KERNEL__ for you, but if you are only including a few Linux headers by
hand, you will need to add a #define __KERNEL__ before any Linux headers.
The other major difference is that since there is known pollution, any
code dealing with RTCore is changed to use an RTL_ or rtl_ prefix (including
POSIX include files). This ensures that you get the RTCore function, and
avoid any overlap with Linux-provided names. In this example, it was not

necessary - the include files needed for Linux support do not overlap with
anything being used in the real-time space, and the names could have re-
mained unchanged. However, this is not always the case, and using the rtl_
prefix ensures that you always get the right name, regardless of what the
GPOS has defined, what other patches may have added, etc. This ambiguity
only arises in polluted applications.
This program can be built the same as the previous example - it does not
require any extra build logic to do so. By default, though, it will generate a
warning - please see section G.4.3 for details on how to suppress this.
For some applications, it may be better to avoid this pollution, and split
the application into two components - one that is a GPOS kernel module,
entirely polluted, and one that is RTCore-based only, non-polluted. This
prevents any possible confusion about naming ambiguities, and provides a
clean separation between real-time kernel components and non-real-time ker-
nel components. The build system does not enforce this, though, as it may
not be desirable for some users.

G.3 PSDD users


PSDD users are by definition polluted, as they are sharing their POSIX
namespace with userspace applications. However, this does not mean that
there is ambiguity - the same simple rules apply. As with previous releases
of RTLinuxPro, PSDD users build applications with USER CFLAGS, which
provides the correct include information. The difference now is that all pieces
of code that use RTCore-provided services must use rtl or RTL for the
function.
For example, a normal userspace application uses stdio.h:

#include <stdio.h>

An application with a PSDD component must now include the rtl_ pre-
fixed version of the file for RTCore services:

#include <rtl_stdio.h>

Other standard POSIX includes also follow the same formula:

#include <sys/types.h>
#include <sys/rtl_types.h>

In each case, the rtl_ prefixed file provides the rtl_ prefixed POSIX func-
tions and names. When writing code, the same rules apply - here are two
lines which open two files - one GPOS file, and one real-time device:

int fd, rtl_fd;


fd = open("/gpos_file", O_NONBLOCK);
rtl_fd = rtl_open("/rtcore_device", RTL_O_NONBLOCK);

The same mechanism applies for read(), write(), pthread_create(), and
so on, for the rest of the POSIX functions provided to PSDD by RTCore.
Note that a file descriptor associated with a GPOS file (fd in this example) is not
interchangeable with an RTCore file descriptor - each system has its own
set of file descriptors, thread identifiers, etc.

G.4 Include hierarchies and rules


RTCore provides 3 main hierarchies of headers:

• app

• rtcore

• gpos_bridge

Each one provides a specific set of information:

G.4.1 app/
This provides the pure POSIX environment for users. In that directory is
a set of POSIX-compliant files that provide standard names and functions.
Below that is an rtl directory that provides those same names with the rtl_
prefix. In-kernel users get this and the subdirectory in their include path by
default; PSDD users only get the subdirectory (with the rtl_ prefixes).
A user that does this:

#include <string.h>

gets standard functions like ’strcmp()’. This also includes the rtl_string.h
file, which provides the real ’rtl_strcmp()’ function. This allows the two to
be interchangeable in clean code, but polluted code can include ’rtl_string.h’
and get the RTCore name only. This also allows users to include ’rtl_errno.h’
and get RTCore’s errno set, without conflicting with any other provided set
of values.
Users get this directory in their include path by default, along with the
rtl subdirectory, so no include path additions are needed.

G.4.2 rtcore/
This directory is internal to RTCore, and will not be visible to most users.
It contains internal header information specific to RTCore, and does not
provide supported interfaces to RTCore. Users who do get this directory tree
and need files in it can uniquely reference them with:

#include <rtcore/file_x.h>

G.4.3 gpos bridge/


Users who need to use GPOS-provided names can use facilities provided in
this directory. For example, a user who simply wants to get as much as
possible from the GPOS namespace can do:

#include <gpos_bridge/sys/gpos.h>

As we saw earlier, this gets a lot of names from the GPOS without having
to explicitly name them. You will see a warning that GPOS headers are being
used, but this can be suppressed by adding this to the top of the file:

#define __RTCORE_POLLUTED_APP__

This must be done before including any files, so the entire compilation
knows that GPOS pollution is intended. When you do include GPOS files,
the build will see this, but will not warn you about the fact that pollution
has been detected.

G.5 Including GPOS files


GPOS files can be included directly, in addition to the gpos.h file discussed
above. However, the GPOS generally does not expect the namespace to be
populated. Because of this, any GPOS-specific files (such as sched.h in our
earlier example) must be included first, before including RTCore files. The
RTCore header system will determine if the GPOS has created any names,
and handle them.
Users who explicitly include GPOS kernel files without including gpos.h
must define __KERNEL__ before including the file, as most GPOS kernel
headers expect this to be defined. (gpos.h defines it for you.)
Users can also include rtcore_app.h - this identifies the user as an appli-
cation, and enables build checks to detect GPOS pollution. However, the
RTCore-provided POSIX files in app/ include this by default, so the check
is generally done transparently.

G.6 Quick rules


G.6.1 Older apps that must be polluted
Users with older applications that would like to remain polluted should follow
these steps:

• Use #include <rtl_stdio.h> and similarly prefixed names for POSIX
includes, instead of #include <stdio.h>, for example.

• Use the rtl_ and RTL_ prefixes on RTCore-based function calls,
constants, etc.

• Include GPOS-specific files before RTCore include files. If you are
not going to include gpos_bridge/sys/gpos.h, you will need to add a
#define __KERNEL__ before including GPOS headers.

• Add #define __RTCORE_POLLUTED_APP__ to the front of all polluted .c
files, before any includes, to suppress build warnings.

• #include <rtl.h> can be used to easily get much of the pollution
that was previously present.
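
Putting those rules together, the top of an older polluted .c file might
look like this sketch (the particular GPOS header is only an example):

```c
/* Before any includes, so the whole compilation unit knows
   pollution is intended and build warnings are suppressed. */
#define __RTCORE_POLLUTED_APP__

/* GPOS-specific headers come first; __KERNEL__ is needed
   because gpos_bridge/sys/gpos.h is not used here. */
#define __KERNEL__
#include <linux/sched.h>

/* RTCore headers afterwards. */
#include <rtl.h>
#include <rtl_stdio.h>
```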

G.6.2 Older users that want to avoid pollution


Users who want to avoid pollution can simply split the pieces that depend
on GPOS-specific headers into a separate .c file, compile it in a separate step,
and link it into the application.
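
As a sketch of such a split (the file and function names here are purely
illustrative, not part of the RTCore API):

```c
/* gpos_glue.c -- compiled in a separate step, against GPOS headers only */
#include <signal.h>

int glue_install_handler(void (*fn)(int))
{
    struct sigaction act;

    sigemptyset(&act.sa_mask);
    act.sa_handler = fn;
    act.sa_flags = 0;
    return sigaction(SIGCONT, &act, NULL);
}

/* app.c -- the RTCore application calls the glue function and
   never sees the GPOS headers itself */
extern int glue_install_handler(void (*fn)(int));
```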

G.6.3 Exceptions
The RTCore header system can handle most of what a GPOS may define
for it, but there are some rare cases where the GPOS may provide a very
corrupted set of names. These usually result in POSIX names being redefined
to version-specific internal names from the GPOS, which may cause problems
during a build.
An example of this is signal.h - this may redefine POSIX names to in-
ternal structures, and can corrupt the namespace in applications that need
to be polluted. In PSDD applications, the non-real-time GPOS thread may
need to use signal.h and the functions it provides, so for this rare case, it is
recommended that users do the following:

#include <stdio.h>
#include <fcntl.h>
...
#include <signal.h>

/* Handle any GPOS specific signalling functions here */


void cleanup(void (*clean)(int)) {
struct sigaction sigact;
sigset_t set;

sigemptyset(&set);
sigact.sa_mask = set;
sigact.sa_handler = clean;
sigact.sa_flags = 0;

sigaction(SIGCONT, &sigact, NULL);


}

/* Now include RTCore headers */



#include <rtl_signal.h>
...

It should be stated that this is a very rare case, and will not appear in
most applications. However, if it does, and pollution is a requirement, such
as in a userspace real-time application, the best method is to include GPOS
headers, perform work with those headers, and then include RTCore headers.
Appendix H

System Testing

When selecting a platform for RTCore, the only way to know whether it will
really do the job for you is to test on the actual hardware. While the test
environment does not have to mirror the target environment exactly, the closer
you get to the final system, the more reliable the results will be. In general,
the outcome of these tests provides answers to three essential questions:

1. Can I run RTCore on this system at all, or is it simply not suited?

2. What is the worst case scheduling jitter to be expected in this system
setup?

3. What interrupt response may I expect from my peripheral devices?

This will not eliminate the requirement to evaluate the final system setup you
wish to deploy, but it will minimize the risk of running into hardware-related
problems during project development.

H.1 Running the regression test


The regression test will tell you whether RTCore will operate properly on the
selected system. If the regression test fails, the system is either not installed
correctly or is simply not suitable for real-time work; in that case, please
contact support@fsmlabs.com.
After you compile and load the updated kernel, change to the rtlinuxpro
directory and issue the following command:


bash scripts/regression.sh

This will then run a set of tests, which MUST all return the status [ OK ].
If any of the tests fail, contact support@fsmlabs.com. If the first pass
completes without any errors, it is generally helpful to run the regression
test for a while as well. To run the test in an infinite loop, issue the
following command, again from the rtlinuxpro directory:

bash scripts/long_regression.sh

(This is the normal regression script run in an infinite loop, printing the
number of runs completed as it goes.)

H.1.1 Stress testing


The idea of testing under heavy load cannot be stressed enough. It is
important to see how the real-time system behaves in terms of scheduling when
placed under varying loads. Some jitter shift will occur due to hardware load,
but this should be minimal.
Running the jitter test is easy - change to the rtlinuxpro/measurement
directory. There you will find a 'jitter.rtl' binary. Run it, and it will print
the worst case timings seen so far on each CPU, printing a message only
when a new worst case value is seen.
At this point, the real-time threads are running, and it’s time to place the
machine under load. This can vary greatly by hardware, but here is a basic
start. It is important to put the machine under heavy interrupt, memory,
and CPU load. First, change to the kernel directory and run:

make dep
make clean
make -j 60

And/or on another console, log in and run several instances of:

find / > /dev/null 2>&1 &

This will add to the thrashing of the GPOS VM. Increase the number
of find processes running, preferably staggered in time so that the buffer
cache is cycled through. Add other applications until swapping is induced,

and the system is under heavy load. For SMP machines, it helps to have
more instances running, as each CPU thrashes over the PCI bus. For some
embedded boards, running make on the kernel is not feasible, but a high
number of finds is a good approximation, when done in conjunction with the
next step.
Finally, run a ping flood (ping -f machine) from another machine on
the network, at least over a 100Mbit wire. This, in addition to the disk work,
will put the machine under heavy interrupt load. Feel free to add more work,
as RTCore will handle the load. It is important that you determine what
your hardware is capable of doing with respect to real-time demands.
Many test applications from other vendors do very short tests, either in
time or number of interrupts (some as short as a minute). Due to potential
cache interactions and other factors, it is important that a test machine be
placed under load for a long time, preferably days. FSMLabs performs all
testing under heavy load for a period of at least 48 hours before releasing
any kind of performance numbers.

H.2 Jitter measurement


We just ran the jitter test, but let’s take a closer look at the mechanics of what
we’re after. Scheduling jitter is defined as the difference between the time
that code was scheduled to run and the actual point at which it executes.
Scheduling overhead and hardware latencies contribute to this value, and
while some jitter will nearly always happen, it is important to get a worst
case value for your hardware.
Note that most companies provide worst case numbers in terms of context
switch times. This number is in most cases useless except from a marketing
standpoint. Consider an absolute worst case in the real world, where a thread
needs to execute at time x. The context switch is only a small part of the
work that must happen here. First, the timer interrupt needs to occur,
indicating that it is time to work. Then the scheduler needs to be woken in
order to determine what gets executed next. Finally, there has to be a context
switch into the context of the thread that should run.
RTCore is well optimized for these situations. When FSMLabs quotes
worst case numbers, it is the sum total of not just context switch, but all
three factors:

interrupt latency + scheduling overhead + context switch = worst case

The previously run test schedules a real-time thread on each CPU to run
every 1000 microseconds. At each scheduling point, it calculates the delta
between the expected and the actual scheduling point. The code performs
1000 samples per second, and pushes the results to a handler that may dump
them through to the controlling terminal.
In general, the load on the machine will not affect the running of the
real-time code, although high interrupt rates will cause a shift in the worst
case value. As we just covered in H.1.1, the machine should be placed under
heavy load in order to get an accurate worst case value. Once you have
gathered the data you need, kill the userspace application and unload the
real-time module.
Appendix I

Sample programs

Here we've collected the source code for all of the examples used in the book.
They are also provided in the RTLinuxPro distribution.

I.1 Hello world


#include <stdio.h>

int main(void)
{
printf("Hello from the RTL base system\n");
return 0;
}

I.2 Multithreading
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

pthread_t thread;

void *thread_code(void *t)
{
    struct timespec next;
    int count = 0;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000*1000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        count++;
        if (!(count % 1000))
            printf("woke %d times\n", count);
    }

    return NULL;
}

int main(void)
{
    pthread_create(&thread, NULL, thread_code, (void *)0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);

    return 0;
}

I.3 FIFOs
I.3.1 Real-time component
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>

pthread_t thread;
int fd1;

void *thread_code(void *t)
{
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000*1000*1000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        write(fd1, "a message\n", strlen("a message\n"));
    }

    return NULL;
}

int main(void)
{
    mkfifo("/communicator", 0666);

    fd1 = open("/communicator", O_RDWR | O_NONBLOCK);

    ftruncate(fd1, 16<<10);

    pthread_create(&thread, NULL, thread_code, (void *)0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);

    close(fd1);
    unlink("/communicator");

    return 0;
}

I.3.2 Userspace component


#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int fd, n;
    char buf[255];

    fd = open("/communicator", O_RDONLY);
    while (1) {
        n = read(fd, buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';   /* NUL-terminate before printing */
            printf("%s", buf);
        }
        sleep(1);
    }
}

I.4 Semaphores
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <semaphore.h>

pthread_t wait_thread;
pthread_t post_thread;
sem_t sema;

void *wait_code(void *t)
{
    while (1) {
        sem_wait(&sema);
        printf("Waiter woke on a post\n");
    }
}

void *post_code(void *t)
{
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000*1000*1000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        printf("Posting to the semaphore\n");
        sem_post(&sema);
    }

    return NULL;
}

int main(void)
{
    sem_init(&sema, 1, 0);

    pthread_create(&wait_thread, NULL, wait_code, (void *)0);
    pthread_create(&post_thread, NULL, post_code, (void *)0);

    rtl_main_wait();

    pthread_cancel(post_thread);
    pthread_join(post_thread, NULL);
    pthread_cancel(wait_thread);
    pthread_join(wait_thread, NULL);

    sem_destroy(&sema);

    return 0;
}

I.5 Shared Memory


I.5.1 Real-time component
#include <time.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <sys/mman.h>

#define MMAP_SIZE 5003

pthread_t rthread, wthread;
int rfd, wfd;
unsigned char *raddr, *waddr;

void *writer(void *arg)
{
    struct timespec next;
    struct sched_param p;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    waddr = (unsigned char *)mmap(0, MMAP_SIZE, PROT_READ|PROT_WRITE,
                                  MAP_SHARED, wfd, 0);
    if (waddr == MAP_FAILED) {
        printf("mmap failed for writer\n");
        return (void *)-1;
    }

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        waddr[0]++;
        waddr[1]++;
        waddr[2]++;
        waddr[3]++;
    }
}

void *reader(void *arg)
{
    struct timespec next;
    struct sched_param p;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

    raddr = (unsigned char *)mmap(0, MMAP_SIZE, PROT_READ|PROT_WRITE,
                                  MAP_SHARED, rfd, 0);
    if (raddr == MAP_FAILED) {
        printf("failed mmap for reader\n");
        return (void *)-1;
    }

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        printf("rtl_reader thread sees "
               "0x%x, 0x%x, 0x%x, 0x%x\n",
               raddr[0], raddr[1], raddr[2], raddr[3]);
    }
}

int main(int argc, char **argv)
{
    wfd = shm_open("/dev/rtl_mmap_test", RTL_O_CREAT, 0600);
    if (wfd == -1) {
        printf("open failed for write on "
               "/dev/rtl_mmap_test (%d)\n", errno);
        return -1;
    }

    rfd = shm_open("/dev/rtl_mmap_test", 0, 0);
    if (rfd == -1) {
        printf("open failed for read on "
               "/dev/rtl_mmap_test (%d)\n", errno);
        return -1;
    }

    ftruncate(wfd, MMAP_SIZE);

    pthread_create(&wthread, NULL, writer, 0);
    pthread_create(&rthread, NULL, reader, 0);

    rtl_main_wait();

    pthread_cancel(wthread);
    pthread_join(wthread, NULL);
    pthread_cancel(rthread);
    pthread_join(rthread, NULL);
    munmap(waddr, MMAP_SIZE);
    munmap(raddr, MMAP_SIZE);

    close(wfd);
    close(rfd);
    shm_unlink("/dev/rtl_mmap_test");
    return 0;
}

I.5.2 Userspace application


#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>

#define MMAP_SIZE 5003

int main(void)
{
    int fd;
    unsigned char *addr;

    if ((fd = open("/dev/rtl_mmap_test", O_RDWR)) < 0) {
        perror("open");
        exit(-1);
    }

    addr = mmap(0, MMAP_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        printf("return was %d\n", errno);
        perror("mmap");
        exit(-1);
    }

    while (1) {
        printf("userspace: the rtl shared area contains"
               " : 0x%x, 0x%x, 0x%x, 0x%x\n",
               addr[0], addr[1], addr[2], addr[3]);
        sleep(1);
    }

    munmap(addr, MMAP_SIZE);
    close(fd);
    return 0;
}

I.6 Cancel Handlers


#include <time.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>

pthread_t thread;
pthread_mutex_t mutex;

void cleanup_handler(void *mutex)
{
    pthread_mutex_unlock((pthread_mutex_t *)mutex);
}

void *thread_handler(void *arg)
{
    pthread_cleanup_push(cleanup_handler, &mutex);
    pthread_mutex_lock(&mutex);
    while (1) { usleep(1000000); }
    pthread_cleanup_pop(0);
    pthread_mutex_unlock(&mutex);
    return 0;
}

int main(int argc, char **argv)
{
    pthread_mutex_init(&mutex, NULL);

    pthread_create(&thread, NULL, thread_handler, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    pthread_mutex_destroy(&mutex);
    return 0;
}

I.7 Thread API


#include <time.h>
#include <pthread.h>
#include <stdio.h>

pthread_t thread1, thread2;
void *thread_stack;

void *handler(void *arg)
{
    printf("Thread %d started\n", (int)(long)arg);
    if (arg == 0) { /* first thread spawns the second */
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 32768);
        pthread_attr_setstackaddr(&attr, thread_stack);
        pthread_create(&thread2, &attr, handler, (void *)1);
    }

    return 0;
}

int main(int argc, char **argv)
{
    thread_stack = rtl_gpos_malloc(32768);
    if (!thread_stack)
        return -1;

    pthread_create(&thread1, NULL, handler, (void *)0);

    rtl_main_wait();

    pthread_cancel(thread1);
    pthread_join(thread1, NULL);
    pthread_cancel(thread2);
    pthread_join(thread2, NULL);
    rtl_gpos_free(thread_stack);
    return 0;
}

I.8 One Way queues


#include <time.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <onewayq.h>

pthread_t thread1, thread2;

DEFINE_OWQTYPE(our_queue, 32, int, 0, -1);
DEFINE_OWQFUNC(our_queue, 32, int, 0, -1);
our_queue Q;

void *queue_thread(void *arg)
{
    int count = 1;
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        if (our_queue_enq(&Q, count)) {
            printf("warning: queue full\n");
        }
        count++;
    }
}

void *dequeue_thread(void *arg)
{
    int read_count;
    struct timespec next;

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 500000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        read_count = our_queue_deq(&Q);
        if (read_count) {
            printf("dequeued %d\n",
                   read_count);
        } else {
            printf("queue empty\n");
        }
    }
}

int main(int argc, char **argv)
{
    our_queue_init(&Q);
    pthread_create(&thread1, NULL,
                   queue_thread, 0);
    pthread_create(&thread2, NULL,
                   dequeue_thread, 0);

    rtl_main_wait();

    pthread_cancel(thread1);
    pthread_join(thread1, NULL);
    pthread_cancel(thread2);
    pthread_join(thread2, NULL);
    return 0;
}

I.9 Processor reserve/optimization


#include <pthread.h>
#include <stdio.h>
#include <semaphore.h>

pthread_t thread;
unsigned long mask = 0x2, oldmask;
sem_t irq_sem;

unsigned int irq_handler(unsigned int irq, struct rtl_frame *regs)
{
    rtl_global_pend_irq(irq);
    sem_post(&irq_sem);
    return 0;
}

void *thread_code(void *t)
{
    rtl_irq_set_affinity(12, &mask, &oldmask);

    while (1) {
        sem_wait(&irq_sem);
        printf("Got IRQ 12\n");
    }
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;

    sem_init(&irq_sem, 1, 0);

    pthread_attr_init(&attr);
    pthread_attr_setcpu_np(&attr, 1);
    pthread_attr_setreserve_np(&attr, 1);

    rtl_request_irq(12, irq_handler);

    pthread_create(&thread, &attr, thread_code, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);

    rtl_free_irq(12);

    rtl_irq_set_affinity(12, &oldmask, &mask);

    sem_destroy(&irq_sem);

    return 0;
}

I.10 Soft IRQs


#include <time.h>
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>

pthread_t thread;
static int our_soft_irq;

void *start_routine(void *arg)
{
    struct sched_param p;
    struct timespec next;

    p.sched_priority = 1;
    pthread_setschedparam(pthread_self(),
                          SCHED_FIFO, &p);

    clock_gettime(CLOCK_REALTIME, &next);

    while (1) {
        timespec_add_ns(&next, 500000000);
        clock_nanosleep(CLOCK_REALTIME,
                        TIMER_ABSTIME, &next, NULL);
        rtl_global_pend_irq(our_soft_irq);
    }
    return 0;
}

static int soft_irq_count;

void soft_irq_handler(int irq, void *ignore,
                      struct rtl_frame *ignore_frame)
{
    soft_irq_count++;
    printf("Received soft IRQ #%d\n", soft_irq_count);
}

int main(int argc, char **argv)
{
    soft_irq_count = 0;
    our_soft_irq = rtl_get_soft_irq(soft_irq_handler,
                                    "Simple SoftIRQ\n");
    if (our_soft_irq == -1)
        return -1;
    pthread_create(&thread, NULL, start_routine, 0);

    rtl_main_wait();

    pthread_cancel(thread);
    pthread_join(thread, NULL);
    rtl_free_soft_irq(our_soft_irq);
    return 0;
}

I.11 PSDD sound speaker driver


#define RTL_RTC_A 10
#define RTL_RTC_B 11
#define RTL_RTC_C 12
#define RTL_RTC_D 13

#define RTL_RTC_PORT(x) (0x70 + (x))
#define RTL_RTC_WRITE(val, port) do { rtl_outb_p((port),RTL_RTC_PORT(0)); \
        rtl_outb_p((val),RTL_RTC_PORT(1)); } while(0)
#define RTL_RTC_READ(port) ({ rtl_outb_p((port),RTL_RTC_PORT(0)); \
        rtl_inb_p(RTL_RTC_PORT(1)); })

#include <rtl_pthread.h>
#include <rtl_unistd.h>
#include <rtl_time.h>
#include <rtl_signal.h>
#include <rtl_errno.h>
#include <rtl_stdio.h>
#include <sys/rtl_io.h>
#include <sys/rtl_types.h>
#include <sys/rtl_stat.h>
#include <rtl_fcntl.h>
#include <sys/rtl_ioctl.h>
#include <unistd.h>

#define FIFO_NO 3
#define RTC_IRQ 8

int fd_fifo;
int fd_irq;
rtl_pthread_t thread;
char save_cmos_A;
char save_cmos_B;

static int filter(int x)
{
    static int oldx;
    int ret;

    if (x & 0x80) {
        x = 382 - x;
    }
    ret = x > oldx;
    oldx = x;
    return ret;
}

void *sound_thread(void *param)
{
    char data;
    char temp;
    struct rtl_siginfo info;

    while (1) {
        rtl_read(fd_irq, &info, sizeof(info));
        (void) RTL_RTC_READ(RTL_RTC_C); /* clear IRQ */
        rtl_ioctl(fd_irq, RTL_IRQ_ENABLE);

        if (rtl_read(fd_fifo, &data, 1) > 0) {
            data = filter(data);
            temp = rtl_inb(0x61);
            temp &= 0xfc;
            if (data) {
                temp |= 3;
            }
            rtl_outb(temp, 0x61);
        }
    }
    return 0;
}

int main(int argc, char **argv)
{
    char ctemp;
    char devname[30];

    sprintf(devname, "/dev/rtf%d", FIFO_NO);
    fd_fifo = rtl_open(devname, RTL_O_WRONLY|
                       RTL_O_CREAT|RTL_O_NONBLOCK);
    if (fd_fifo < 0) {
        rtl_printf("open of %s returned %d; errno = %d\n",
                   devname, fd_fifo, rtl_errno);
        return -1;
    }
    rtl_ioctl(fd_fifo, RTF_SETSIZE, 4000);

    fd_irq = rtl_open("/dev/irq8", RTL_O_RDONLY);
    if (fd_irq < 0) {
        rtl_printf("open of /dev/irq8 returned %d; errno = %d\n",
                   fd_irq, rtl_errno);
        rtl_close(fd_fifo);
        return -1;
    }

    rtl_pthread_create(&thread, NULL, sound_thread, NULL);

    /* program the RTC to interrupt at 8192 Hz */
    save_cmos_A = RTL_RTC_READ(RTL_RTC_A);
    save_cmos_B = RTL_RTC_READ(RTL_RTC_B);

    /* 32kHz Time Base, 8192 Hz interrupt frequency */
    RTL_RTC_WRITE(0x23, RTL_RTC_A);
    ctemp = RTL_RTC_READ(RTL_RTC_B);
    ctemp &= 0x8f; /* Clear */
    ctemp |= 0x40; /* Periodic interrupt enable */
    RTL_RTC_WRITE(ctemp, RTL_RTC_B);

    (void) RTL_RTC_READ(RTL_RTC_C);

    rtl_ioctl(fd_irq, RTL_IRQ_ENABLE);

    while (1) {
        sleep(1000);
    }

    rtl_pthread_cancel(thread);
    rtl_pthread_join(thread, NULL);

    RTL_RTC_WRITE(save_cmos_A, RTL_RTC_A);
    RTL_RTC_WRITE(save_cmos_B, RTL_RTC_B);
    rtl_close(fd_irq);
    rtl_close(fd_fifo);
    return 0;
}
