
Introduction to Linux Device Drivers

Recreating Life One Driver At a Time

Muli Ben-Yehuda

mulix@mulix.org, muli@il.ibm.com

IBM Haifa Research Lab and Haifux - Haifa Linux Club

Linux Device Drivers, BIUX, April 2006 – p.1/50

Why Write Linux Device Drivers?

For fun,

For profit (Linux is hot right now, especially embedded Linux),

To scratch an itch.

Because you can!

OK, but why Linux drivers?

Because the source is available.

Because of the community’s cooperation and involvement.

Have I mentioned it’s fun yet?

klife - Linux kernel game of life

klife is a Linux kernel Game of Life implementation. It is a software device driver, developed specifically for this talk.

The game of life is played on a square grid, where some of the cells are alive and the rest are dead.

Each generation, based on each cell’s neighbors, we mark the cell as alive or dead.

With time, amazing patterns develop.

The only reason to implement the game of life inside the kernel is for demonstration purposes.

Software device drivers are very common on Unix systems and provide many services to the user. Think about /dev/null, /dev/zero, /dev/random, /dev/kmem...
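To make the per-generation rule concrete, here is a minimal userspace sketch of one Game of Life step using Conway's standard rules. It is not klife's code: the 8x8 grid size and the names `life_neighbors` and `life_step` are assumptions made for illustration, and cells beyond the grid edge are simply treated as dead.

```c
#include <assert.h>
#include <stddef.h>

#define LIFE_ROWS 8   /* hypothetical size; klife's real dimensions are not shown */
#define LIFE_COLS 8

/* Count the live neighbours of (r, c); cells outside the grid count as dead. */
static int life_neighbors(const unsigned char *g, int r, int c)
{
    int n = 0;

    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            int rr = r + dr, cc = c + dc;

            if (dr == 0 && dc == 0)
                continue;
            if (rr < 0 || rr >= LIFE_ROWS || cc < 0 || cc >= LIFE_COLS)
                continue;
            n += g[rr * LIFE_COLS + cc] ? 1 : 0;
        }
    }
    return n;
}

/* Compute one generation from cur into next (row-major, nonzero = alive). */
static void life_step(const unsigned char *cur, unsigned char *next)
{
    assert(cur != NULL && next != NULL && cur != next);

    for (int r = 0; r < LIFE_ROWS; r++) {
        for (int c = 0; c < LIFE_COLS; c++) {
            int n = life_neighbors(cur, r, c);
            int alive = cur[r * LIFE_COLS + c] != 0;

            /* a live cell survives with 2 or 3 neighbours; a dead one is born with 3 */
            next[r * LIFE_COLS + c] = alive ? (n == 2 || n == 3) : (n == 3);
        }
    }
}
```

Writing into a second buffer matters: every cell of the new generation must be computed from the old one, so updating the grid in place would give wrong results.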

Anatomy of a Device Driver

A device driver has three sides: one side talks to the rest of the kernel, one talks to the hardware, and one talks to the user:

[Figure: the Device Driver sits between the Kernel, the Hardware, and the User; the User reaches the driver through a Device File.]

Kernel Interface of a Device Driver

In order to talk to the kernel, the driver registers with subsystems to respond to events. Such an event might be the opening of a file, a page fault, the plugging in of a new USB device, etc.

[Figure: the Kernel's event list (File Open, Page Fault, Interrupt, Hotplug) dispatching to the Device Driver.]

User Interface of a Device driver

Since Linux follows the UNIX model, and in UNIX everything is a file, users talk with device drivers through device files.

Device files are a mechanism, supplied by the kernel, precisely for this direct User-Driver interface.

klife is a character device, and thus the user talks to it through a character device file.

The other common kind of device file is a block device file. We will only discuss character device files today.

Anatomy of klife device driver

The user talks with klife through the /dev/klife device file.

When the user opens /dev/klife, the kernel calls klife’s open routine.

When the user closes /dev/klife, the kernel calls klife’s release routine.

When the user reads or writes from or to /dev/klife - you get the idea...

klife talks to the kernel through its initialization function...

...and through register_chrdev...

...and through hooking into the timer interrupt.

We will elaborate on all of these later.

Driver Initialization Code

static int __init klife_module_init(void)
{
        int ret;

        pr_debug("klife module init called\n");

        /* register_chrdev returns a negative errno on failure */
        if ((ret = register_chrdev(KLIFE_MAJOR_NUM, "klife", &klife_fops)) < 0)
                printk(KERN_ERR "register_chrdev: %d\n", ret);

        return ret;
}

Driver Initialization

One function (init) is called on the driver’s initialization.

One function (exit) is called when the driver is removed from the system.

Question: what happens if the driver is compiled into the kernel, rather than as a module?

The init function will register hooks that will get the driver’s code called when the appropriate event happens.

Question: what if the init function doesn’t register any hooks?

There are various hooks that can be registered: file operations, PCI operations, USB operations, network operations - it all depends on what kind of device this is.

Registering Chardev Hooks

struct file_operations klife_fops = {
        .owner = THIS_MODULE,
        .open = klife_open,
        .release = klife_release,
        .read = klife_read,
        .write = klife_write,
        .mmap = klife_mmap,
        .ioctl = klife_ioctl
};

...

if ((ret = register_chrdev(KLIFE_MAJOR_NUM, "klife", &klife_fops)) < 0)
        printk(KERN_ERR "register_chrdev: %d\n", ret);

User Space Access to the Driver

We saw that the driver registers a character device tied to a given major number, but how does the user create such a file?

# mknod /dev/klife c 250 0

And how does the user open it?

if ((kfd = open("/dev/klife", O_RDWR)) < 0) {
        perror("open /dev/klife");
        exit(EXIT_FAILURE);
}

And then what?

File Operations

...and then you start talking to the device. klife uses the following device file operations:

open for starting a game (allocating resources).

release for finishing a game (releasing resources).

write for initializing the game (setting the starting positions on the grid).

read for generating and then reading the next state of the game’s grid.

ioctl for querying the current generation number, and for enabling or disabling hooking into the timer interrupt (more on this later).

mmap for potentially faster but more complex direct access to the game’s grid.
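Seen from userspace, the operations listed above map onto ordinary system calls. The following is a rough sketch of one session, with comments noting which driver routine each call would reach. The function name `klife_session` is made up for illustration, and the `pattern` payload is a placeholder: the format klife expects for starting positions is not shown in the talk.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Drive one game: open -> write -> read -> close.
 * Returns the number of bytes read into out, or -1 on failure.
 */
static ssize_t klife_session(const char *path, const char *pattern,
                             char *out, size_t outlen)
{
    ssize_t n;
    int kfd;

    assert(path != NULL && pattern != NULL && out != NULL);

    kfd = open(path, O_RDWR);                 /* reaches the driver's open */
    if (kfd < 0) {
        perror(path);
        return -1;
    }

    /* placeholder payload; reaches the driver's write */
    if (write(kfd, pattern, strlen(pattern)) < 0) {
        perror("write");
        close(kfd);
        return -1;
    }

    n = read(kfd, out, outlen);               /* reaches the driver's read */
    if (n < 0)
        perror("read");

    close(kfd);                               /* reaches the driver's release */
    return n;
}
```

Each further read() on the same descriptor would ask the driver for another generation, so a real client would loop on the read step.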

The open and release Routines

open and release are where you perform any setup not done in initialization time and any cleanup not done in module unload time.

klife_open

klife’s open routine allocates the klife structure which holds all of the state for this game (the grid, starting positions, current generation, etc).

static int klife_open(struct inode *inode, struct file *filp)
{
        struct klife *k;
        int ret;

        ret = alloc_klife(&k);
        if (ret)
                return ret;

        /* stash the per-open game state for the other file operations */
        filp->private_data = k;

        return 0;
}

klife_open - alloc_klife

static int alloc_klife(struct klife **pk)
{
        int ret;
        struct klife *k;

        k = kmalloc(sizeof(*k), GFP_KERNEL);
        if (!k)
                return -ENOMEM;

        ret = init_klife(k);
        if (ret) {
                kfree(k);
                k = NULL;
        }

        *pk = k;
        return ret;
}

klife_open - init_klife

static int init_klife(struct klife *k)
{
        int ret;

        memset(k, 0, sizeof(*k));

        spin_lock_init(&k->lock);

        ret = -ENOMEM;
        /* one page to be exported to userspace */
        k->grid = (void *)get_zeroed_page(GFP_KERNEL);
        if (!k->grid)
                goto done;

        k->tmpgrid = kmalloc(sizeof(*k->tmpgrid), GFP_KERNEL);
        if (!k->tmpgrid)
                goto free_grid;

klife_open - init_klife cont’

        k->timer_hook.func = klife_timer_irq_handler;
        k->timer_hook.data = k;

        return 0;

free_grid:
        free_page((unsigned long)k->grid);
done:
        return ret;
}
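init_klife uses the common kernel idiom of unwinding with goto labels: each failure jumps to a label that releases only what has been acquired so far. Below is a small userspace sketch of the same idiom, with malloc-family calls standing in for the kernel allocators. `struct game`, `game_init`, `game_destroy` and the `fail_at` parameter (which forces a chosen allocation to fail so the unwind path can be exercised) are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct game {                    /* hypothetical stand-in for struct klife */
    unsigned char *grid;
    unsigned char *tmpgrid;
};

/*
 * Allocate both buffers or neither. fail_at forces the Nth allocation
 * (1-based) to fail; pass 0 for normal behaviour.
 */
static int game_init(struct game *g, size_t size, int fail_at)
{
    int ret = -ENOMEM;

    assert(g != NULL);
    g->grid = NULL;
    g->tmpgrid = NULL;

    g->grid = (fail_at == 1) ? NULL : calloc(1, size);
    if (!g->grid)
        goto done;

    g->tmpgrid = (fail_at == 2) ? NULL : calloc(1, size);
    if (!g->tmpgrid)
        goto free_grid;

    return 0;

free_grid:                       /* undo in reverse order of acquisition */
    free(g->grid);
    g->grid = NULL;
done:
    return ret;
}

static void game_destroy(struct game *g)
{
    free(g->tmpgrid);
    free(g->grid);
    g->tmpgrid = NULL;
    g->grid = NULL;
}
```

The labels are ordered so that falling through from one to the next releases resources in the reverse of the order they were taken, which keeps a single exit path no matter how many steps are added.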

klife_release

klife’s release routine frees the resource allocated during open time.

static int klife_release(struct inode *inode, struct file *filp)
{
        struct klife *k = filp->private_data;

        if (k->timer)
                klife_timer_unregister(k);

        if (k->mapped) {
                /* undo setting the grid page to be reserved */
                ClearPageReserved(virt_to_page(k->grid));
        }

        free_klife(k);

        return 0;
}

Commentary on open and release

Beware of races if you have any global data; many a driver author stumbles on this point.

Note also that release can fail, but almost no one checks errors from close(), so it’s better if it doesn’t.

Question: what happens if the userspace program crashes while holding your device file open?

write

For klife, I “hijacked” write to mean “please initialize the grid to these starting positions”.

There are no hard and fast rules to what write has to mean, but it’s good to KISS (Keep It Simple, Silly...)

klife_write - 1

static ssize_t klife_write(struct file *filp, const char __user *ubuf,
                           size_t count, loff_t *f_pos)
{
        size_t sz;
        char *kbuf;
        struct klife *k = filp->private_data;
        ssize_t ret;

        /* never copy more than one page, whatever the user asks for */
        sz = count > PAGE_SIZE ? PAGE_SIZE : count;

        kbuf = kmalloc(sz, GFP_KERNEL);
        if (!kbuf)
                return -ENOMEM;

Not trusting users: checking the size of the user’s buffer.

klife_write - 2

        ret = -EFAULT;
        if (copy_from_user(kbuf, ubuf, sz))
                goto free_buf;

        ret = klife_add_position(k, kbuf, sz);
        if (ret == 0)
                ret = sz;

free_buf:
        kfree(kbuf);
        return ret;
}

Use copy_from_user in case the user is passing a bad pointer.

Commentary on write

Note that even for such a simple function, care must be exercised when dealing with untrusted users.

Users are always untrusted.

Always be prepared to handle errors!

read

For klife, read means “please calculate and give me the next generation”.

The bulk of the work is done in two other routines:

klife_next_generation calculates the next generation based on the current one, according to the rules of the game of life.

klife_draw takes a grid and “draws” it as a single string in a page of memory.
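The talk does not show klife_draw itself, so the following is only a guess at the shape of such a routine: a userspace sketch that renders a row-major grid into a caller-supplied text buffer. The name `draw_grid`, the '*' and '.' characters, and the one-line-per-row layout are assumptions, not klife's actual output format.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Render a rows x cols grid (row-major, nonzero = alive) as text:
 * '*' for alive, '.' for dead, '\n' after each row, NUL-terminated.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small to hold the whole picture.
 */
static long draw_grid(const unsigned char *grid, size_t rows, size_t cols,
                      char *buf, size_t buflen)
{
    size_t need = rows * (cols + 1) + 1;   /* cells + newlines + NUL */
    size_t pos = 0;

    assert(grid != NULL && buf != NULL);
    if (buflen < need)
        return -1;

    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++)
            buf[pos++] = grid[r * cols + c] ? '*' : '.';
        buf[pos++] = '\n';
    }
    buf[pos] = '\0';
    return (long)pos;
}
```

Returning the length (or a negative value on error) mirrors how the read path treats the draw result: a negative length is passed on as an error, and a positive one bounds the copy to userspace.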

klife_read - 1

static ssize_t
klife_read(struct file *filp, char *ubuf, size_t count, loff_t *f_pos)
{
        struct klife *klife;
        char *page;
        ssize_t len;
        ssize_t ret;
        unsigned long flags;

        klife = filp->private_data;

        /* special handling for mmap */
        if (klife->mapped)
                return klife_read_mapped(filp, ubuf, count, f_pos);

        if (!(page = kmalloc(PAGE_SIZE, GFP_KERNEL)))
                return -ENOMEM;

klife_read - 2

        spin_lock_irqsave(&klife->lock, flags);
        klife_next_generation(klife);
        len = klife_draw(klife, page);
        spin_unlock_irqrestore(&klife->lock, flags);

        if (len < 0) {
                ret = len;
                goto free_page;
        }

        /* len can't be negative */
        len = min(count, (size_t)len);

Note that the lock is held for the shortest possible time. We will see later what the lock protects us against.

klife_read - 3

        if (copy_to_user(ubuf, page, len)) {
                ret = -EFAULT;
                goto free_page;
        }

        *f_pos += len;
        ret = len;

free_page:
        kfree(page);
        return ret;
}

Use copy_to_user in case the user is passing us a bad page.

klife_read - 4

static ssize_t
klife_read_mapped(struct file *filp, char *ubuf, size_t count, loff_t *f_pos)
{
        struct klife *klife;
        unsigned long flags;

        klife = filp->private_data;

        spin_lock_irqsave(&klife->lock, flags);
        klife_next_generation(klife);
        spin_unlock_irqrestore(&klife->lock, flags);

        return 0;
}

Again, mind the short lock holding time.

Commentary on read

There’s plenty of room for optimization in this code... can you see where?

ioctl

ioctl is a “special access” mechanism, for operations that do not cleanly map anywhere else.

It is considered extremely bad taste to use ioctls in Linux where not absolutely necessary.

New drivers should use either sysfs (a /proc-like virtual file system) or a driver specific file system (you can write a Linux file system in fewer than 100 lines of code).

In klife, we use ioctl to get the current generation number, for demonstration purposes only.

klife_ioctl - 1

static int klife_ioctl(struct inode *inode, struct file *file,
                       unsigned int cmd, unsigned long data)
{
        struct klife *klife = file->private_data;
        unsigned long gen;
        int enable;
        int ret;
        unsigned long flags;

        ret = 0;
        switch (cmd) {
        case KLIFE_GET_GENERATION:
                spin_lock_irqsave(&klife->lock, flags);
                gen = klife->gen;
                spin_unlock_irqrestore(&klife->lock, flags);
                if (copy_to_user((void *)data, &gen, sizeof(gen))) {
                        ret = -EFAULT;
                        goto done;
                }

klife_ioctl - 2

                break;
        case KLIFE_SET_TIMER_MODE:
                if (copy_from_user(&enable, (void *)data, sizeof(enable))) {
                        ret = -EFAULT;
                        goto done;
                }
                pr_debug("user request to %s timer mode\n",
                         enable ? "enable" : "disable");
                if (klife->timer && !enable)
                        klife_timer_unregister(klife);
                else if (!klife->timer && enable)
                        klife_timer_register(klife);
                break;
        }
done:
        return ret;
}

memory mapping

The read-write mechanism, previously described, involves an overhead of a system call and related context switching and of memory copying.

mmap maps pages of a file into memory, thus enabling programs to access the memory directly and save the overhead, but:

fast synchronization between kernel space and user space is a pain (why do we need it?),

and Linux read and write are really quite fast.

mmap is implemented in klife for demonstration purposes, with read() calls used for synchronization and triggering a generation update.
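To illustrate the difference between the two access paths from the userspace side, here is a small sketch that stores a byte through a shared mapping (no system call) and then fetches it back with pread() (one system call). It uses a scratch regular file rather than /dev/klife, so it says nothing about klife's own mapping; the function name `mmap_roundtrip` is made up for illustration.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 0 if a byte written through the mapping is visible via pread(),
 * -1 on any error.
 */
static int mmap_roundtrip(void)
{
    int ret = -1;
    unsigned char seen = 0;
    unsigned char *map;
    long page = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();              /* scratch file, removed automatically */

    if (!f)
        goto done;
    if (page <= 0)
        goto close_file;
    if (ftruncate(fileno(f), (off_t)page) != 0)
        goto close_file;

    map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED,
               fileno(f), 0);
    if (map == MAP_FAILED)
        goto close_file;

    map[0] = 42;                      /* plain memory store, no system call */

    /* the same byte, fetched the read()-style way */
    if (pread(fileno(f), &seen, 1, 0) == 1 && seen == 42)
        ret = 0;

    munmap(map, (size_t)page);
close_file:
    fclose(f);
done:
    return ret;
}
```

With a device such as klife, the mapped page would change under the program whenever the driver computes a new generation, which is why some form of synchronization is still needed.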

klife_mmap

...
        SetPageReserved(virt_to_page(klife->grid));

        ret = remap_pfn_range(vma, vma->vm_start,
                              virt_to_phys(klife->grid) >> PAGE_SHIFT,
                              PAGE_SIZE, vma->vm_page_prot);

        pr_debug("remap_pfn_range returned %d\n", ret);

        if (ret == 0)
                klife->mapped = 1;

        return ret;
}

klife Interrupt Handler

What if we want a new generation on every raised interrupt?

Since we don’t have a hardware device to raise interrupts for us, let’s hook into the one piece of hardware every PC has - the clock - and steal its interrupt!

Usual Request For an Interrupt Handler

Usually, interrupts are requested using request_irq():

/* claim our irq */
rc = -ENODEV;
if (request_irq(card->irq, &trident_interrupt, SA_SHIRQ,
                card_names[pci_id->driver_data], card)) {
        printk(KERN_ERR "trident: unable to allocate irq %d\n",
               card->irq);
        goto out_proc_fs;
}

klife Interrupt Handler

It is impossible to request the timer interrupt.

Instead, we will directly modify the kernel code to call our interrupt handler, if it’s registered.

We can do this, because the code is open.

Aren’t Timers Good Enough For You?

“Does every driver which wishes to get periodic notifications need to hook the timer interrupt?” - Nope.

Linux provides an excellent timer mechanism which can be used for periodic notifications.

The reason for hooking into the timer interrupt in klife is that we wish to be called from hard interrupt context, also known as top half context, whereas timer functions are called in softirq (bottom half) context.

Why insist on getting called from hard interrupt context? So we can demonstrate deferring work.

The Timer Interrupt Hook Patch

The Timer Interrupt Hook Patch The patch adds a hook which a driver can register for,

The patch adds a hook which a driver can register for, to be called directly from the timer interrupt handler. It also creates two functions:

register_timer_interruptthe timer interrupt handler. It also creates two functions: unregister_timer_interrupt Linux Device Drivers, BIUX, April

unregister_timer_interrupthandler. It also creates two functions: register_timer_interrupt Linux Device Drivers, BIUX, April 2006 – p.39/50


Hook Into The Timer Interrupt Routine 1

'+' marks the lines added to the kernel.

+struct timer_interrupt_hook *timer_hook;
+
+static void call_timer_hook(struct pt_regs *regs)
+{
+        struct timer_interrupt_hook *hook = timer_hook;
+
+        if (hook && hook->func)
+                hook->func(hook->data);
+}
@@ -851,6 +862,8 @@ void do_timer(struct pt_regs *regs)
         update_process_times(user_mode(regs));
 #endif
         update_times();
+
+        call_timer_hook(regs);
 }


Hook Into The Timer Interrupt Routine 2

+int register_timer_interrupt(struct timer_interrupt_hook *hook)
+{
+        printk(KERN_INFO "registering a timer interrupt hook %p "
+               "(func %p, data %p)\n", hook, hook->func,
+               hook->data);
+
+        xchg(&timer_hook, hook);
+        return 0;
+}
+
+void unregister_timer_interrupt(struct timer_interrupt_hook *hook)
+{
+        printk(KERN_INFO "unregistering a timer interrupt hook\n");
+
+        xchg(&timer_hook, NULL);
+}


Commentary - The Timer Interrupt Hook

Note that the register and unregister calls use xchg(), to ensure atomic replacement of the pointer to the handler. Why use xchg() rather than a lock?

What context (hard interrupt, bottom half, process context) will we be called in?

Which CPU’s timer interrupts would we be called in?

What happens on an SMP system?
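
One way to think about the xchg() question: the hook is a single pointer-sized word, so a reader sees either the old pointer or the new one, never a mixture. The following is a minimal user-space sketch of that idea using C11 atomics. It is not kernel code: atomic_exchange() stands in for the kernel's xchg(), and the names hook_slot, set_hook, call_hook and count_call are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* User-space stand-in for the patch's struct timer_interrupt_hook. */
struct timer_interrupt_hook {
    void (*func)(void *data);
    void *data;
};

/* The single published pointer; NULL means "no hook registered". */
static _Atomic(struct timer_interrupt_hook *) hook_slot = NULL;

/* Atomically install a new hook (or NULL) and return the previous one. */
static struct timer_interrupt_hook *set_hook(struct timer_interrupt_hook *hook)
{
    return atomic_exchange(&hook_slot, hook);
}

/* Models call_timer_hook(): read the pointer once, then use that copy. */
static int call_hook(void)
{
    struct timer_interrupt_hook *hook = atomic_load(&hook_slot);

    if (hook && hook->func) {
        hook->func(hook->data);
        return 1;   /* a hook ran */
    }
    return 0;       /* nothing registered */
}

/* Example hook body (hypothetical): counts how often it was called. */
static void count_call(void *data)
{
    int *counter = data;

    assert(counter != NULL);
    (*counter)++;
}
```

Registering corresponds to set_hook(&h) and unregistering to set_hook(NULL). Note that the exchange only protects the pointer itself; it does not wait for a handler that is already running on another CPU, which is part of what the SMP question above is getting at.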


Deferring Work

You were supposed to learn in class about bottom halves, softirqs, tasklets and other such curse words.

The timer interrupt (and every other interrupt) has to happen very quickly. Why?

The interrupt handler (top half, hard irq) usually just sets a flag which says “there is work to be done”.

The work is then deferred to a bottom half context, where it is done by an (old style) bottom half, softirq, or tasklet.

For klife, we defer the work we wish to do (updating the grid) to a bottom half context by scheduling a tasklet.
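
The "set a flag now, do the work later" pattern can be modelled in plain user-space C. This is only a sketch under stated assumptions: top_half, bottom_half, work_pending and next_generation_stub are hypothetical names, and a real driver would use tasklet_schedule() or a similar kernel facility rather than a hand-rolled flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Set by the "interrupt handler", consumed by the deferred worker. */
static atomic_bool work_pending = false;

/* Counts how many times the slow work actually ran. */
static int generations_done = 0;

/* Stand-in for the slow work (for klife: updating the grid). */
static void next_generation_stub(void)
{
    generations_done++;
}

/* Top half: must be fast, so it only records that work is needed. */
static void top_half(void)
{
    atomic_store(&work_pending, true);
}

/*
 * Bottom half: runs later, outside the interrupt handler. It clears the
 * flag atomically, so several interrupts that arrive before it runs
 * collapse into a single pass of work.
 */
static bool bottom_half(void (*work)(void))
{
    assert(work != NULL);
    if (!atomic_exchange(&work_pending, false))
        return false;   /* nothing was requested */

    work();
    return true;
}
```

Calling top_half() twice and then bottom_half(next_generation_stub) once runs the work a single time. Tasklets behave similarly in this respect: scheduling a tasklet that is already scheduled but has not yet run does not queue it a second time.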


Preparing The Tasklet

DECLARE_TASKLET_DISABLED(klife_tasklet, klife_tasklet_func, 0);

static void klife_timer_register(struct klife *klife)
{
        unsigned long flags;
        int ret;

        spin_lock_irqsave(&klife->lock, flags);

        /* prime the tasklet with the correct data - ours */
        tasklet_init(&klife_tasklet, klife_tasklet_func,
                     (unsigned long)klife);

        ret = register_timer_interrupt(&klife->timer_hook);
        if (!ret)
                klife->timer = 1;

        spin_unlock_irqrestore(&klife->lock, flags);

        pr_debug("register_timer_interrupt returned %d\n", ret);
}


The klife Tasklet

Here’s what our klife tasklet does:

First, it derives the klife structure from the parameter it gets.

Then, it locks it, to prevent concurrent access on another CPU. What are we protecting against?

Then, it generates the new generation.

What must we never do here?

Hint: can tasklets block?

Last, it releases the lock.
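
For readers who have not seen it spelled out, "generating the new generation" is a pure computation over the grid: it only reads and writes memory, and never sleeps. Below is a small user-space sketch of one Game of Life step. It is not klife's actual klife_next_generation(); the fixed 8x8 grid, the choice to treat cells beyond the border as dead, and the names life_step and live_neighbours are all assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define GRID 8  /* assumed grid size; klife's real dimensions may differ */

/*
 * Count live cells around (row, col). Cells hold 0 (dead) or 1 (live);
 * anything outside the grid is treated as dead.
 */
static int live_neighbours(unsigned char g[GRID][GRID], int row, int col)
{
    int count = 0;

    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            int r = row + dr, c = col + dc;

            if ((dr || dc) && r >= 0 && r < GRID && c >= 0 && c < GRID)
                count += g[r][c];
        }
    }
    return count;
}

/* Advance the grid in place by one generation (Conway's standard rules). */
static void life_step(unsigned char g[GRID][GRID])
{
    unsigned char next[GRID][GRID];

    assert(g != NULL);
    for (int r = 0; r < GRID; r++) {
        for (int c = 0; c < GRID; c++) {
            int n = live_neighbours(g, r, c);

            /* A live cell survives with 2 or 3 neighbours;
             * a dead cell is born with exactly 3. */
            next[r][c] = g[r][c] ? (n == 2 || n == 3) : (n == 3);
        }
    }
    memcpy(g, next, sizeof(next));
}
```

A horizontal row of three live cells (a "blinker") turns vertical after one life_step() call and horizontal again after the next, which makes a convenient sanity check.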


Deferring Work - The klife Tasklet

static void klife_timer_irq_handler(void *data)
{
        struct klife *klife = data;

        /* 2 times a second */
        if (klife->timer_invocation++ % (HZ / 2) == 0)
                tasklet_schedule(&klife_tasklet);
}

static void klife_tasklet_func(unsigned long data)
{
        struct klife *klife = (void *)data;

        spin_lock(&klife->lock);
        klife_next_generation(klife);
        spin_unlock(&klife->lock);
}
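
To see how the modulo test turns HZ timer interrupts per second into two tasklet runs per second, here is a small user-space sketch. ASSUMED_HZ is fixed at 1000 purely as an assumption (the kernel's real HZ depends on the configuration), and should_schedule and schedules_in are hypothetical helpers, not part of klife.

```c
#include <assert.h>
#include <stddef.h>

#define ASSUMED_HZ 1000  /* assumed tick rate; the kernel's HZ is config-dependent */

/* Mirrors the handler's test: true on every (HZ / 2)-th invocation. */
static int should_schedule(unsigned long *timer_invocation)
{
    assert(timer_invocation != NULL);
    return (*timer_invocation)++ % (ASSUMED_HZ / 2) == 0;
}

/* Count how often the tasklet would be scheduled over `ticks` interrupts. */
static unsigned long schedules_in(unsigned long ticks)
{
    unsigned long invocation = 0, scheduled = 0;

    for (unsigned long i = 0; i < ticks; i++)
        scheduled += should_schedule(&invocation);
    return scheduled;
}
```

With these assumptions, schedules_in(ASSUMED_HZ) is 2: one second's worth of ticks schedules the tasklet twice. Because the counter starts at zero, the very first interrupt already schedules it.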


Adding klife To The Build System

Building the module in kernel 2.6 is a breeze. All that’s required to add klife to the kernel’s build system are these tiny patches:

In drivers/char/Kconfig:

+config GAME_OF_LIFE
+        tristate "kernel game of life"
+        help
+          Kernel implementation of the Game of Life.

In drivers/char/Makefile:

+obj-$(CONFIG_GAME_OF_LIFE) += klife.o

 

Summary

Writing Linux drivers is easy and fun!

Most drivers do fairly simple things, which Linux provides APIs for.

The real fun is when dealing with the hardware’s quirks.

It gets easier with practice, but it never gets boring.

Questions?


Where To Get Help

Google

Community resources: web sites and mailing lists.

Distributed documentation (books, articles, magazines)

Use The Source, Luke!

Your fellow kernel hackers.


Bibliography

kernelnewbies - http://www.kernelnewbies.org

linux-kernel mailing list archives - http://marc.theaimsgroup.com/?l=linux-kernel&w=2

Understanding the Linux Kernel, by Bovet and Cesati

Linux Device Drivers, 3rd edition, by Rubini et al.

Linux Kernel Development, 2nd edition, by Robert Love

/usr/src/linux-xxx/
