These slides are made available to you under a Creative Commons Share-Alike 3.0 license. The full terms of this license are here: https://creativecommons.org/licenses/by-sa/3.0/ Attribution requirements and misc., PLEASE READ: This slide must remain as-is in this specific location (slide #2), everything else you are free to change; including the logo :-) Use of figures in other documents must feature the below Originals at URL immediately under that figure and the below copyright notice where appropriate. You are free to fill in the Delivered and/or customized by space on the right as you see fit. You are FORBIDEN from using the default About the instructor slide as-is or any of its contents. (C) Copyright 2005-2012, Opersys inc. These slides created by: Karim Yaghmour Originals at: www.opersys.com/training/linux-device-drivers
Course structure and presentation
1. About the instructor
2. Goals
3. Presentation format
4. Expected knowledge
5. Day-by-day outline
6. Courseware
1. About the instructor
Author of:
2. Goals
3. Presentation format
Course has two main tracks:
4. Expected knowledge
5. Day-by-day outline
Day 1:
1. Introduction
2. Hardware and Linux, a view from userspace
3. Writing modules
4. Driver types, subsystem APIs and driver skeletons
5. Hooking up with and using key kernel resources
Day 2:
6. Locking mechanisms
7. Interrupts and interrupt deferral
8. Timely execution and time measurement
9. Memory resources
10. Hardware access
Day 3:
11. Char drivers
12. Block drivers
13. Network drivers
14. PCI drivers
15. USB drivers
16. TTY drivers
6. Courseware
Introduction
1. System architecture review
2. Userspace vs. kernel space
3. In the kernel world
4. Drivers
5. Hands-on work environment
1. System architecture review
Kernel
Drivers
Libraries
Applications
2. Userspace vs. kernel space
Separate address space:
Memory protection amongst processes:
Memory protection between processes and kernel:
Crossing between userspace and kernel space is through specific events (will see later)
3. In the kernel world
Use modules
Once loaded, core API available:
Kernel API changes:
http://lwn.net/Articles/2.6-kernel-api/
4. Drivers
Built-in vs. modularized
Userspace drivers?
  A concept more than a reality
  X window, libusb, gadgetfs, CD writers, ...
  Hard to map hardware resources (RAM, interrupts, etc.)
  Slow (swapping, context switching, etc.)
Security issues
  Must have parameter checking in drivers
  Preinitialize buffers prior to passing to uspace
Licensing reminder
Although the use of binary-only modules is widespread, kernel modules are not immune to the kernel's GPL. See LDD3, p. 11.
Many kernel developers have come out rather strongly against binary-only modules.
Have a look at BELS appendix C for a few copies of notices on binary-only modules.
If you are linking a driver as built-in, then you are most certainly forbidden from distributing the resulting kernel under any license other than the GPL.
Firmware:
5. Hands-on work environment
Hardware and Linux, a view from userspace
1. Device files
2. Types of devices
3. Major and minor numbers
4. /proc and procfs
5. /sys and sysfs
6. /dev and udev
7. The tools
1. Device files
2. Types of devices
What userspace sees:
Abstractions provided by kernel
3. Major and minor numbers
The glue between userspace device files and the device drivers in the kernel.
Userspace accesses devices through device nodes ... special entries in the filesystem.
Each device instance has a major number and a minor number.
Each char and block driver that registers with the kernel has a major number.
When userspace attempts to access a device node with that same major number, all accesses result in actions/callbacks to the driver.
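As a userspace illustration of the major/minor pairing described above, here is a minimal sketch that reads the device number stored in a device node and splits it into its two halves. It assumes a Linux/glibc system where `major()`, `minor()` and `makedev()` come from `<sys/sysmacros.h>`; the helper names `format_dev_numbers()` and `device_node_numbers()` are made up for this example and are not kernel or libc APIs.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/*
 * Write "<major>:<minor>" for a device number into buf.
 * Returns the number of characters written (snprintf semantics).
 * Hypothetical helper, named for this example only.
 */
int format_dev_numbers(dev_t dev, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%u:%u", major(dev), minor(dev));
}

/*
 * Look up the device number behind a device node.
 * Returns 0 on success, -1 if the path cannot be stat()ed or is not
 * a char or block device node.
 */
int device_node_numbers(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISCHR(st.st_mode) && !S_ISBLK(st.st_mode))
        return -1;
    /* st_rdev holds the device number the node refers to. */
    *maj = major(st.st_rdev);
    *min = minor(st.st_rdev);
    return 0;
}
```

Calling `device_node_numbers()` on a node under /dev yields the same pair that `ls -l` prints in place of the file size; the major number selects the driver and the minor number is left for the driver to interpret.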
4. /proc and procfs
/proc is a virtual filesystem of type procfs
All files and directories in /proc exist only in memory.
Reads/writes result in callback invocation
/proc is used to export information about a lot of things in the kernel.
Used by many subsystems, drivers, and core functionality.
Typically regarded by kernel developers as a mess.
Example /proc entries:
5. /sys and sysfs
6. /dev and udev
/dev is the main repository for device nodes
Distros used to ship with thousands of entries in /dev; for every possible hardware out there.
Became increasingly difficult to use as devices became more and more mobile.
With the arrival of sysfs and the related hotplug functionality: udev.
udev automagically creates the appropriate entries in /dev dynamically.
Can be configured to provide a persistent view
7. The tools
Many tools to see or control hardware
Examples:
lspci: list PCI devices
lsusb: list USB devices
fdisk: partition disk
hdparm: set disk parameters
ifconfig, iwconfig: configure network interface
insmod, modprobe, rmmod, lsmod: manage modules
halt, reboot: control system
hotplug: manage the adding/removal of hardware
Writing modules
1. Setting up your test system
2. Kernel modules versus applications
3. Compiling and loading
4. The kernel symbol table
5. Preliminaries
6. Initialization and shutdown
7. Module parameters
8. /sys/module and /proc/modules
1. Setting up your test system
2. Kernel modules versus applications
init and cleanup
kspace vs. uspace:
Concurrency
The "current" process
3. Compiling and loading
Compiling modules:
Loading and unloading modules:
Version dependency:
Version
CPU build flags
vermagic tested against targeted kernel at load time
KERNELDIR specifies version
linux/module.h includes linux/version.h, which has:
#ifdefs when needed
Use low-level/high-level macros to hide details
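The version-dependent #ifdef technique can be tried outside the kernel. The sketch below reproduces the classic 2.6-era packing of `KERNEL_VERSION()` and defines `LINUX_VERSION_CODE` by hand; in a real module both come from `<linux/version.h>` and must not be redefined. The 2.6.24 target, the 2.6.20 cutoff and the `demo_`/`built_for_at_least` names are assumptions chosen purely for illustration.

```c
#include <assert.h>

/* Same packing as the classic KERNEL_VERSION() macro: the minor and
 * patch numbers each get one byte. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption: stand-in for the value the kernel build generates;
 * here we pretend to build against 2.6.24. */
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 6, 24)

/* Hypothetical cutoff: pretend some API changed in 2.6.20. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 20)
#define DEMO_API_FLAVOUR "new"
#else
#define DEMO_API_FLAVOUR "old"
#endif

/* Which branch of the #if above was compiled in. */
const char *demo_api_flavour(void)
{
    return DEMO_API_FLAVOUR;
}

/* Runtime version of the same comparison, handy for experimenting. */
int built_for_at_least(int a, int b, int c)
{
    /* The packing only orders versions correctly if b and c fit in a byte. */
    assert(b >= 0 && b < 256 && c >= 0 && c < 256);
    return LINUX_VERSION_CODE >= KERNEL_VERSION(a, b, c);
}
```

Keeping such checks inside a few wrapper macros or helpers, rather than scattering them through the driver, is what the last bullet above recommends.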
4. The kernel symbol table
EXPORT_SYMBOL():
Macro used outside any function scope in module to export symbols for use by other modules.
EXPORT_SYMBOL_GPL():
Same as EXPORT_SYMBOL(), but symbols are only available to modules licensed as "GPL".
5. Preliminaries
Must have:
If need module parameters:
MODULE_LICENSE("...");
"Dual MPL/GPL"
"Proprietary"
6. Initialization and shutdown
Initialization:
devices, filesystems, line disciplines, /proc entries, etc.
The cleanup function:
__exit to specify that cleanup is for unloading only
void => no return value
Use module_exit()
In exit, release resources in the reverse order of their allocation in init.
No cleanup = no unloading
Error handling during initialization:
Registration/allocation may fail
Failure to deallocate on error = unstable
Use reverse-order goto instead of per-error rollback
Use proper return code in case of error (<linux/errno.h>)
Use of custom cleanup:
Module loading races:
In case of init failure, some kernel parts may already be using functions registered prior to failure.
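The reverse-order goto idiom recommended for init error handling can be sketched in plain userspace C. Here `malloc()` stands in for kernel registrations and allocations; the `demo_dev` structure, its three resources and the `fail_at` failure-injection parameter are all invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical driver state: three resources acquired in order. */
struct demo_dev {
    char *regs;     /* stands in for a mapped register window */
    char *buffer;   /* stands in for an I/O buffer */
    char *table;    /* stands in for a lookup table */
};

/*
 * Acquire the resources in order. fail_at (1..3) forces that step to
 * fail so the rollback path can be exercised; 0 forces no failure.
 * Returns 0 on success or a negative errno value.
 */
int demo_init(struct demo_dev *dev, int fail_at)
{
    int err = -ENOMEM;

    assert(dev != NULL);
    dev->regs = dev->buffer = dev->table = NULL;

    dev->regs = (fail_at == 1) ? NULL : malloc(64);
    if (!dev->regs)
        goto fail_regs;

    dev->buffer = (fail_at == 2) ? NULL : malloc(4096);
    if (!dev->buffer)
        goto fail_buffer;

    dev->table = (fail_at == 3) ? NULL : malloc(256);
    if (!dev->table)
        goto fail_table;

    return 0;

    /* Each label undoes the steps that succeeded before the failure,
     * in the reverse order of acquisition. */
fail_table:
    free(dev->buffer);
    dev->buffer = NULL;
fail_buffer:
    free(dev->regs);
    dev->regs = NULL;
fail_regs:
    return err;
}

/* Mirror of demo_init(): release everything in reverse order. */
void demo_exit(struct demo_dev *dev)
{
    free(dev->table);
    free(dev->buffer);
    free(dev->regs);
    dev->table = dev->buffer = dev->regs = NULL;
}
```

Because every failure jumps into a single unwinding sequence, adding a fourth resource means adding one label and one release call rather than editing every earlier error branch.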
7. Module parameters
May need to specify some params at load time
Specified at load time:
insmod
modprobe (/etc/modprobe.conf)
module_param(): Used outside of any function scope
Var types:
module_param_array() for array parameters:
S_IRUGO => read-only for world
S_IRUGO|S_IWUSR => writable by root only
Writable parameters do not generate a signal to the module; changes must be detected live.
8. /sys/module and /proc/modules
Types of drivers, subsystem APIs and driver skeletons
1. Char device driver
2. Block device driver
3. Network device driver
4. MTD map file
5. Framebuffer driver
1. Writing a char device driver
Register char dev during module initialization
Char dev registration: include/linux/fs.h
int register_chrdev(unsigned int, const char *, struct file_operations *);
struct file_operations {
    struct module *owner;
    loff_t (*llseek) (struct file *, loff_t, int);
    ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
    ssize_t (*aio_read) (struct kiocb *, char __user *, size_t, loff_t);
    ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
    ssize_t (*aio_write) (struct kiocb *, const char __user *, size_t, loff_t);
    int (*readdir) (struct file *, void *, filldir_t);
    unsigned int (*poll) (struct file *, struct poll_table_struct *);
    int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
    long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
    long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
    int (*mmap) (struct file *, struct vm_area_struct *);
    int (*open) (struct inode *, struct file *);
    int (*flush) (struct file *);
    int (*release) (struct inode *, struct file *);
    int (*fsync) (struct file *, struct dentry *, int datasync);
    int (*aio_fsync) (struct kiocb *, int datasync);
    int (*fasync) (int, struct file *, int);
    int (*lock) (struct file *, int, struct file_lock *);
    ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, loff_t *);
    ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, loff_t *);
    ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t, void *);
    ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
    unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned ...
    int (*check_flags)(int);
    int (*dir_notify)(struct file *filp, unsigned long arg);
    int (*flock) (struct file *, int, struct file_lock *);
};
2. Writing a block device driver
Register block dev during module initialization
Block dev registration: include/linux/fs.h
int register_blkdev(unsigned int, const char *);
Block queue registration: include/linux/blkdev.h
extern request_queue_t *blk_init_queue(request_fn_proc *, spinlock_t *);
Queue of pending I/O operations for device
3. Writing a network device driver
Register net dev during module initialization
Net dev registration: include/linux/netdevice.h
int register_netdevice(struct net_device *dev);
Param: net device ops
4. Writing an MTD map file
First param: Type of probe (ex: cfi_probe)
Second param: map info
struct map_info {
    char *name;
    unsigned long size;
    unsigned long phys;
#define NO_XIP (-1UL)
    void __iomem *virt;
    void *cached;
    int bankwidth;
#ifdef CONFIG_MTD_COMPLEX_MAPPINGS
    map_word (*read)(struct map_info *, unsigned long);
    void (*copy_from)(struct map_info *, void *, unsigned long, ssize_t);
    void (*write)(struct map_info *, const map_word, unsigned long);
    void (*copy_to)(struct map_info *, unsigned long, const void *, ssize_t);
#endif
    void (*inval_cache)(struct map_info *, unsigned long, ssize_t);
    /* set_vpp() must handle being reentered -- enable, enable, disable
       must leave it enabled. */
    void (*set_vpp)(struct map_info *, int);
    unsigned long map_priv_1;
    unsigned long map_priv_2;
    void *fldrv_priv;
    struct mtd_chip_driver *fldrv;
};
Once located, use add_mtd_partitions() to provide partition information to MTD subsystem.
add_mtd_partitions() is in include/linux/mtd/partitions.h
int add_mtd_partitions(struct mtd_info *, struct mtd_partition *, int);
5. Writing a framebuffer driver
Register framebuffer during module init
Framebuffer registration: include/linux/fb.h
int register_framebuffer(struct fb_info *fb_info);
Param: fb info
Defined in include/linux/fb.h
Contains callbacks for all framebuffer operations
struct fb_info {
    int node;
    int flags;
    struct fb_var_screeninfo var;    /* Current var */
    struct fb_fix_screeninfo fix;    /* Current fix */
    struct fb_monspecs monspecs;     /* Current Monitor specs */
    struct work_struct queue;        /* Framebuffer event queue */
    struct fb_pixmap pixmap;         /* Image hardware mapper */
    struct fb_pixmap sprite;         /* Cursor hardware mapper */
    struct fb_cmap cmap;             /* Current cmap */
    struct list_head modelist;       /* mode list */
    struct fb_ops *fbops;
    struct device *device;
#ifdef CONFIG_FB_TILEBLITTING
    struct fb_tile_ops *tileops;     /* Tile Blitting */
#endif
    char __iomem *screen_base;       /* Virtual address */
    unsigned long screen_size;       /* Amount of ioremapped VRAM or 0 */
    void *pseudo_palette;            /* Fake palette of 16 colors */
#define FBINFO_STATE_RUNNING 0
#define FBINFO_STATE_SUSPENDED 1
    u32 state;                       /* Hardware state i.e suspend */
    void *fbcon_par;                 /* fbcon use-only private area */
    /* From here on everything is device dependent */
    void *par;
};
Hooking up with and using key kernel resources
1. printk
2. /proc
3. Introduction to sysfs
4. Sysfs entry types
5. Sysfs layout
6. Kobjects, ksets, and subsystems
7. Low-level sysfs operations
8. Hotplug event generation
9. Buses, devices, and drivers
1. printk
Basics
KERN_EMERG:
KERN_ALERT:
KERN_CRIT:
KERN_ERR:
KERN_WARNING:
KERN_NOTICE:
KERN_INFO:
KERN_DEBUG:
When klogd and syslogd are running => append to /var/log/messages, regardless of value of console_loglevel.
klogd does not save consecutive identical lines, just their count.
If klogd not running, must manually read /proc/kmsg.
console_loglevel set to DEFAULT_CONSOLE_LOGLEVEL.
console_loglevel set through sys_syslog().
Read loglevel config from /proc/sys/kernel/printk; see LDD3 (p. 77).
Redirecting console messages:
How messages get logged:
klogd dispatches messages to syslogd, which checks /etc/syslog.conf to figure out how to deal with such messages.
syslogd logs messages by facility; the kernel's facility is LOG_KERN.
May want to customize /etc/syslog.conf for dispatching kernel messages.
syslogd messages immediately flushed to disk => performance issue.
syslogd may be made to avoid this by prefixing log filename with hyphen.
Turning the messages on and off:
Rate limiting:
Use:
if (printk_ratelimit()) printk(...);
Printing device numbers:
int print_dev_t(char *buffer, dev_t dev);
char *format_dev_t(char *buffer, dev_t dev);
Difference in return value (character count vs. buffer pointer)
Buffer of 20+ bytes
2. /proc
Avoiding syslog overhead
Using the /proc filesystem:
For easy implementation of large entries in /proc: <linux/seq_file.h>
Prototypes:
Sequence of calls from start to stop is atomic.
Functions to be used by show():
int seq_puts(struct seq_file *sfile, const char *s);
  Equivalent of userspace puts()
int seq_escape(struct seq_file *m, const char *s, const char *esc);
  Print "esc" characters found in "s" in octal form
  Typical value for "esc": "\t\n\\"
int seq_path(struct seq_file *sfile, struct vfsmount *m, struct dentry *dentry, char *esc);
  Print out filename associated with directory entry
  Not usually used in drivers.
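The octal escaping that seq_escape() is described as performing can be mimicked in userspace to see what the output looks like. The function below is a sketch written for this purpose: `escape_octal()` is a hypothetical name, it writes into a caller-supplied buffer rather than a `struct seq_file`, and its return convention is an assumption of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, replacing every character that appears in esc
 * with a backslash followed by its three-digit octal code.
 * Returns the length written (excluding the NUL), or -1 if dst is
 * too small.
 */
int escape_octal(char *dst, size_t len, const char *src, const char *esc)
{
    size_t out = 0;

    assert(dst != NULL && src != NULL && esc != NULL);
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (strchr(esc, c) == NULL) {
            /* Ordinary character: copy it, leaving room for the NUL. */
            if (out + 1 >= len)
                return -1;
            dst[out++] = (char)c;
            continue;
        }
        /* Escaped character: needs four bytes plus the NUL. */
        if (out + 4 >= len)
            return -1;
        dst[out++] = '\\';
        dst[out++] = (char)('0' + ((c >> 6) & 03));
        dst[out++] = (char)('0' + ((c >> 3) & 07));
        dst[out++] = (char)('0' + (c & 07));
    }
    if (out >= len)
        return -1;
    dst[out] = '\0';
    return (int)out;
}
```

With the typical "esc" value quoted above, a tab in the input comes out as the four characters `\011`, which keeps each record on one line and unambiguous to parse.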
Declare functions in struct seq_operations:
static struct seq_operations my_seq_ops = {
    .start = my_seq_start,
    .next  = my_seq_next,
    .stop  = my_seq_stop,
    .show  = my_seq_show
};
Declare file ops for connecting seq ops:
static struct file_operations my_proc_ops = {
    .owner   = THIS_MODULE,
    .open    = my_proc_open,
    .read    = seq_read,
    .llseek  = seq_lseek,
    .release = seq_release
};
seq_read, seq_lseek, seq_release provided by kernel
my_proc_open:
static int my_proc_open(struct inode *inode, struct file *file)
{
    return seq_open(file, &my_seq_ops);
}
Register entry with /proc:
entry = create_proc_entry("scullseql", 0, NULL); if (entry) entry->proc_fops = &scull_proc_ops;
3. Introduction to sysfs
Power management and system shutdown:
Communications with userspace:
Hotpluggable devices:
Device classes:
Tell apps which types of peripherals are present
Object relationships and refcounting
Object lifecycles:
4. Sysfs entry types
Directories: ksets or kobjects
Files: kobject attributes
Attribute types:
Symlinks: Relationships between kobjects
5. Sysfs layout
/sys/devices:
/sys/bus:
/sys/class:
/sys/block:
/sys/firmware:
/sys/kernel:
/sys/module:
/sys/power:
6. Kobjects, ksets, and subsystems
Introduction:
kobject is smallest element of device model
struct kobject: <linux/kobject.h>
Kobject basics:
Embedding kobjects:
kobjects always tied to something else
Almost never exist independently
kobject
struct kobject {
    char *k_name;
    char name[KOBJ_NAME_LEN];
    struct kref kref;
    struct list_head entry;
    struct kobject *parent;
    struct kset *kset;
    struct kobj_type *ktype;
    struct dentry *dentry;
};
kobject initialization:
1. Must memset() kobject instance to 0. Otherwise serious crashes.
2. Initialize object and set refcount to 1:
3. Set object's name:
Reference count manipulation:
struct kobject *kobject_get(struct kobject *kobj);
void kobject_put(struct kobject *kobj);
Release functions and kobject types:
Get proper kobj_type:
struct kobj_type *get_ktype(struct kobject *kobj);
Kobject hierarchies, ksets, and subsystems:
Basics:
Ksets:
Object container/aggregation
struct kset: <linux/kobject.h>
struct kset {
    struct subsystem *subsys;
    struct kobj_type *ktype;
    struct list_head list;
    spinlock_t list_lock;
    struct kobject kobj;
    struct kset_hotplug_ops *hotplug_ops;
};
Helper function for kobject_init() and kobject_add():
int kobject_register(struct kobject *kobj);
Removing kobject from kset:
Helper function for kobject_del() and kobject_put():
Relationship summary:
Operations on ksets:
Basic kset manipulation:
kset refcounting:
struct kset *kset_get(struct kset *kset);
void kset_put(struct kset *kset);
Subsystems:
Subsystem can contain multiple ksets
struct subsystem {
    struct kset kset;
    struct rw_semaphore rwsem;
};
Subsystem helper functions:
7. Low-level sysfs operations
Basics:
Default attributes:
Reminder:
struct kobj_type {
    void (*release)(struct kobject *);
    struct sysfs_ops *sysfs_ops;
    struct attribute **default_attrs;
};
"default_attrs" is array of sysfs attribute pointers
struct attribute
struct attribute {
    char *name;
    struct module *owner;
    mode_t mode;
};
"name": name of attribute as seen in sysfs
"owner": owner module
"mode": file access mode in sysfs
struct sysfs_ops
struct sysfs_ops {
    ssize_t (*show)(struct kobject *kobj, struct attribute *attr, char *buffer);
    ssize_t (*store)(struct kobject *kobj, struct attribute *attr, const char *buffer, size_t size);
};
Attributes implemented by sysfs_ops:
read() on sysfs attribute generates call to show()
write() on sysfs attribute generates call to store()
Nondefault attributes:
Default attributes usually enough
Adding attribute to kobject:
Removing attribute:
Binary attributes:
struct bin_attribute
struct bin_attribute {
    struct attribute attr;
    size_t size;
    ssize_t (*read)(struct kobject *kobj, char *buffer, loff_t pos, size_t size);
    ssize_t (*write)(struct kobject *kobj, char *buffer, loff_t pos, size_t size);
};
Create/destruction:
Symbolic links:
Create/destroy symbolic links:
The sysfs directory tree by itself has no way of linking a driver to the device it controls; symbolic links express that relationship.
/sys/bus contains entries for each bus. Each of these entries contains at least 2 entries: "devices" and "drivers":
/sys/devices entries themselves contain symbolic links back to the pertinent /sys/bus entries.
8. Hotplug event generation
Basics:
kobject_add()
kobject_del()
Loading a driver
Creating a device node
Mounting partitions
Hotplug operations:
9. Buses, devices, and drivers
Buses:
Basics:
"name": Bus name, like "pci" or "usb"
Bus registration:
Deregister:
Bus methods:
match:
hotplug:
Iterating over devices and drivers:
"start": first device to start from. If NULL, start from first dev on bus.
"fn": function to call with dev ptr and "data". If retval is nonzero, iteration stops and bus_for_each_dev() returns retval.
int bus_for_each_drv(struct bus_type *bus, struct device_driver *start, void *data, int (*fn)(struct device_driver *, void *));
Same as bus_for_each_dev(), but iterates over drivers
Bus attributes:
struct attribute already described
show/store already explained as part of struct sysfs_ops
Statically creating bus_attribute structures:
BUS_ATTR(name, mode, show, store);
Actual name is "bus_attr_" concatenated with "name"
Attribute registration:
int bus_create_file(struct bus_type *bus, struct bus_attribute *attr);
void bus_remove_file(struct bus_type *bus, struct bus_attribute *attr);
Devices:
Basics:
Every device represented using struct device
struct device: <linux/device.h>
struct device {
    /* Most important fields */
    struct device *parent;
    struct kobject kobj;
    char bus_id[BUS_ID_SIZE];
    struct bus_type *bus;
    struct device_driver *driver;
    void *driver_data;
    void (*release)(struct device *dev);
};
Device registration:
Device attributes:
struct device_attribute: <linux/device.h>
struct device_attribute {
    struct attribute attr;
    ssize_t (*show)(struct device *dev, char *buf);
    ssize_t (*store)(struct device *dev, const char *buf, size_t count);
};
Statically creating device attribute declarations:
DEVICE_ATTR(name, mode, show, store);
Actual name is "dev_attr_" concatenated with "name"
Attribute registration:
Device structure embedding:
Device drivers:
Basics:
Object model tracks device drivers in order to match them to new devices.
Registration/deregistration:
Driver attributes:
Statically creating driver attribute declarations:
DRIVER_ATTR(name, mode, show, store);
Registering/deregistering attributes:
Driver structure embedding:
10. Classes
Basics:
High-level representation of what is being worked with, instead of how it's implemented.
Basically a class is an aggregate of all devices of a certain type.
See /sys/class for most classes found on system
/sys/block is the only class with its own topmost entry (historical).
Class ownership handled by subsystems, no need for driver to care.
Drivers should care about classes mainly for exporting data to userspace.
Interfaces exported by driver core for class manipulation:
class_simple
Full class interface
The class_simple interface:
Create simple class
Must test retval using IS_ERR()
void class_simple_destroy(struct class_simple *cs);
Destroy class
void class_simple_device_remove(dev_t dev);
Remove "dev" from class
Set up a hotplug handler for a class
The full class interface:
Managing classes:
struct class
struct class {
    /* Most important fields */
    char *name;
    struct subsystem subsys;
    struct list_head children;
    struct list_head interfaces;
    struct class_attribute *class_attrs;
    struct class_device_attribute *class_dev_attrs;
    int (*hotplug)(struct class_device *dev, char **envp, int num_envp, char *buffer, int buffer_size);
    void (*release)(struct class_device *dev);
    void (*class_release)(struct class *class);
};
"release": A device is released from class
"class_release": Class is released
int class_register(struct class *cls);
void class_unregister(struct class *cls);
struct class_attribute
struct class_attribute {
    struct attribute attr;
    ssize_t (*show)(struct class *cls, char *buf);
    ssize_t (*store)(struct class *cls, const char *buf, size_t count);
};
Class devices:
struct class_device
struct class_device {
    /* Most important fields */
    struct kobject kobj;
    struct class *class;
    struct device *dev;
    void *class_data;
    char class_id[BUS_ID_SIZE];
};
"dev": If set, symlink created to corresponding /sys/devices entry.
struct class_device_attribute
struct class_device_attribute {
    struct attribute attr;
    ssize_t (*show)(struct class_device *cls, char *buf);
    ssize_t (*store)(struct class_device *cls, const char *buf, size_t count);
};
Class interfaces:
struct class_interface
struct class_interface {
    struct class *class;
    int (*add)(struct class_device *cd);
    void (*remove)(struct class_device *cd);
};
11. Putting it all together
PCI bus is declared using:
struct bus_type pci_bus_type = {
    .name      = "pci",
    .match     = pci_bus_match,
    .hotplug   = pci_hotplug,
    .suspend   = pci_device_suspend,
    .resume    = pci_device_resume,
    .dev_attrs = pci_dev_attrs,
};
pci_bus_type registered using bus_register() at startup
Entries created in /sys/bus/pci: devices and drivers
PCI drivers are struct pci_driver, which contains a struct device_driver.
When PCI driver registered, struct device_driver is initialized by PCI code.
PCI ktype is set to pci_driver_kobj_type
Driver registered using driver_register()
When a device is found on the bus, new struct pci_dev created. pci_dev contains a struct device entry.
After struct pci_dev is initialized, device registered with device_register().
Device added to list of devices in pci_bus_type
Code then walks list of drivers to find a "match" for device.
When match found (see earlier explanation), driver's probe() function is invoked to see if driver will accept responsibility for device.
Once driver acks, driver and device are tied together and the necessary symlinks are created in sysfs.
Remove a device:
Hotplug
Removal done through pci_remove_bus_device(), which calls device_unregister().
device_unregister():
Add a driver:
Initializes struct device_driver in struct pci_driver
Calls driver_register()
Follow earlier description
Match and probe called to match driver with device
Remove a driver:
pci_unregister_driver():
Calls driver_unregister()
12. Hotplug
Dynamic devices:
The /sbin/hotplug utility:
Called by kernel upon hotplug event
Small shell script
Default environment variables passed to hotplug:
DEVPATH:
SEQNUM:
SUBSYSTEM:
Subsystem-specific environment variables (see LDD3 for full detail):
IEEE1394 (FireWire): SUBSYSTEM="ieee1394"
  VENDOR_ID MODEL_ID GUID SPECIFIER_ID VERSION
Networking: SUBSYSTEM="net"
  INTERFACE
PCI: SUBSYSTEM="pci"
  PCI_CLASS PCI_ID PCI_SUBSYS_ID PCI_SLOT_NAME
Input: SUBSYSTEM="input"
  PRODUCT NAME PHYS
  EV, KEY, REL, ABS, MSC, LED, SND, FF
USB: SUBSYSTEM="usb"
  PRODUCT TYPE INTERFACE DEVICE
SCSI: SUBSYSTEM="scsi"
  No specific environment variables
  There is a SCSI-specific script invoked in userspace
Laptop docking stations: SUBSYSTEM="dock"
S/390 and zSeries: SUBSYSTEM="dasd"
Using /sbin/hotplug:
Linux hotplug scripts:
Try to find driver matching added device
Use of maps generated via MODULE_DEVICE_TABLE macros.
/lib/modules/KERNEL_VERSION/modules.*map
Maps for PCI, USB, IEEE1394, INPUT, ISAPNP and CCW.
Keep loading every relevant module found; the kernel decides the best match.
On shutdown, scripts do not remove the driver, since other devices may have been put under its responsibility since first load.
udev:
<major>:<minor>
13. Copy to/from user
Can sleep. Code calling should be:
Retval => amount of memory still to be copied
If access error, retval != 0
14. Dealing with firmware
Basics:
The kernel firmware interface:
Use appropriate kernel function instead:
Having sent firmware to device, it can be released:
void release_firmware(struct firmware *fw);
If can't sleep waiting on firmware:
How it works:
Locking mechanisms
1. Concurrency and its management
2. Semaphores and mutexes
3. Completions
4. Spinlocks
5. Locking traps
6. Alternatives to locking
7. Summary
1. Concurrency and its management
Avoid shared resources when possible (ex. global variables).
Must "manage" concurrent access whenever resources are shared to guarantee atomicity.
Use locks or similar mechanisms to implement "critical sections".
No code instance should use "object" until it is properly initialized for all.
Must keep track of "object" instances to free when appropriate.
2. Semaphores and mutexes
P, V, and semaphore int
Lock with P:
Unlock with V:
When semaphore initially set to "1" => Mutex
The Linux semaphore implementation:
struct semaphore: <asm/semaphore.h>
Basic initialization:
Static declare and init mutex:
Dynamic init mutex:
In Linux:
Versions of "down":
int down_interruptible(struct semaphore *sem);
int down_trylock(struct semaphore *sem);
Only one "up":
void up(struct semaphore *sem);
Reader/writer semaphores:
void init_rwsem(struct rw_semaphore *sem);
Read-only access:
void down_read(struct rw_semaphore *sem);
int down_read_trylock(struct rw_semaphore *sem);
Write access:
Change a write lock to a read lock.
3. Completions
Dynamic initialization:
Waiting on completion (uninterruptible wait):
Completion:
void complete(struct completion *c);
  Wake up just one waiting thread
void complete_all(struct completion *c);
  Wake up all waiting threads
Reinitializing for reuse after complete_all():
Completion in kernel thread:
4. Spinlocks
Introduction to spinlocks:
Most often used mechanism in the kernel
Unlike semaphores, can be used in code that cannot sleep.
Usually better performance than semaphores.
Typically meant for protecting concurrent access on SMP systems.
Typically defaults to nothing on UP (except for IRQ spinlocks).
Either locked or unlocked
If lock already taken, spin in tight loop waiting for resource.
Introduction to the spinlock API:
spinlock_t <linux/spinlock.h>
Static initialization:
Dynamic initialization:
Enter critical section:
Leave critical section:
Many more functions
Spinlocks and atomic context:
Never create situation where control may be lost while holding a spinlock, otherwise => deadlock.
Code using spinlocks should be atomic
Carefully examine which kernel services you call while holding a spinlock (copy_to/from_user, kmalloc, etc. may sleep).
Use special locks if atomic section could be interrupted by an interrupt serviced by a routine requiring that same lock.
Hold for as short a time as possible
The spinlock functions:
Locking:
void spin_lock_irq(spinlock_t *lock);
void spin_lock_bh(spinlock_t *lock);
3 levels of "priorities" with spinlocks:
3. User => spin_lock()
2. Software interrupt => spin_lock_bh()
1. Interrupt => spin_lock_irq*()
Trylocking:
Reader/writer spinlocks:
rwlock_t my_rwlock = RW_LOCK_UNLOCKED;
Dynamic initialization:
void rwlock_init(rwlock_t *lock);
For readers:
void read_lock(rwlock_t *lock);
void read_lock_irqsave(rwlock_t *lock, unsigned long flags);
void read_lock_irq(rwlock_t *lock);
void read_lock_bh(rwlock_t *lock);
void read_unlock(rwlock_t *lock);
void read_unlock_irqrestore(rwlock_t *lock, unsigned long flags);
void read_unlock_irq(rwlock_t *lock);
void read_unlock_bh(rwlock_t *lock);
For writers:
void write_lock(rwlock_t *lock);
void write_lock_irqsave(rwlock_t *lock, unsigned long flags);
void write_lock_irq(rwlock_t *lock);
void write_lock_bh(rwlock_t *lock);
int write_trylock(rwlock_t *lock);
void write_unlock(rwlock_t *lock);
void write_unlock_irqrestore(rwlock_t *lock, unsigned long flags);
void write_unlock_irq(rwlock_t *lock);
void write_unlock_bh(rwlock_t *lock);
5. Locking traps
Ambiguous rules:
Lock ordering rules:
Fine- versus coarse-grained locking:
6. Alternatives to locking
Lock-free algorithms:
<linux/kfifo.h>
Atomic variables:
Dynamic initialization:
Reading:
Arithmetic ops without retval:
Arithmetic ops with retval:
Arithmetic ops with test:
int atomic_sub_and_test(int i, atomic_t *v);
int atomic_inc_and_test(atomic_t *v);
int atomic_dec_and_test(atomic_t *v);
int atomic_add_negative(int i, atomic_t *v);
Bit operations:
Basic ops:
Toggle
Nonatomic bit val retrieval:
int test_bit(nr, void *addr);
Atomic test then modify:
int test_and_set_bit(nr, void *addr);
int test_and_clear_bit(nr, void *addr);
int test_and_change_bit(nr, void *addr);
Entering critical section:
while (test_and_set_bit(nr, addr) != 0) wait_a_little();
If already set, this loop will wait until the other bit of code already holding the lock does a test_and_clear_bit().
If multiple threads are competing, one of them will have its test_and_set_bit() succeed, and the others will continue looping.
Leaving critical section:
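Entering and leaving a bit-guarded critical section can be tried out in userspace with C11 atomics. The functions below borrow the kernel helper names but are analogues written for this sketch, operating on an `atomic_ulong` instead of a raw memory address; `wait_a_little()` is an empty placeholder and `bit_lock()`/`bit_unlock()` are hypothetical wrappers.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Userspace analogues built on C11 atomics; each returns the bit's
 * previous value (0 or 1). */
int test_and_set_bit(unsigned int nr, atomic_ulong *addr)
{
    unsigned long mask;

    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    mask = 1UL << nr;
    return (atomic_fetch_or(addr, mask) & mask) != 0;
}

int test_and_clear_bit(unsigned int nr, atomic_ulong *addr)
{
    unsigned long mask;

    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    mask = 1UL << nr;
    return (atomic_fetch_and(addr, ~mask) & mask) != 0;
}

/* Placeholder backoff hook; a real caller might yield or sleep here. */
static void wait_a_little(void)
{
}

/* Enter the critical section: spin until we are the one who set the bit. */
void bit_lock(unsigned int nr, atomic_ulong *addr)
{
    while (test_and_set_bit(nr, addr) != 0)
        wait_a_little();
}

/* Leave the critical section. Returns 0 if the bit was unexpectedly
 * clear, which would mean the lock was not actually held. */
int bit_unlock(unsigned int nr, atomic_ulong *addr)
{
    return test_and_clear_bit(nr, addr);
}
```

Checking the return value of `bit_unlock()` is a cheap way to catch unbalanced lock/unlock pairs during development.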
seqlocks:
Appropriate for situations where:
Readers get "free" access, but must test for collision with writers, and retry in those cases.
Dynamic initialization:
For readers:
unsigned int read_seqbegin(seqlock_t *lock);
Conduct simple computation
Test if concurrent write occurred
int read_seqretry(seqlock_t *lock, unsigned int seq);
If so, discard result and repeat
Interrupt-protected read:
unsigned int read_seqbegin_irqsave(seqlock_t *lock, unsigned long flags);
int read_seqretry_irqrestore(seqlock_t *lock, unsigned int seq, unsigned long flags);
For writers:
void write_seqlock(seqlock_t *lock);
void write_sequnlock(seqlock_t *lock);
Variants for writers:
void write_seqlock_irqsave(seqlock_t *lock, unsigned long flags);
void write_seqlock_irq(seqlock_t *lock);
read-copy-update:
To change data:
Functions found in <linux/rcupdate.h>
Reader macros:
rcu_read_lock()
  Disable preemption
rcu_read_unlock()
  Enable preemption
Complicated part is to know when to free "old copy":
Other CPUs may still have references to old copy
Writer must wait until it knows no other instance has pointer to old copy.
Since all code paths referencing resource are atomically protected, it is assumed that once every processor on the system has been scheduled at least once, then no other processor still holds a copy of the old data pointer.
Hence, we can free the old copy.
The kernel RCU mechanism provides a way for registering a callback to be issued once all processors have been scheduled to clean up the old copy.
Function to register RCU callback:
void call_rcu(struct rcu_head *head, void (*func)(void *arg), void *arg);
Callback obtains same "arg" as passed to call_rcu().
Typically callback issues a kfree().
Full detail of API and algorithm in <linux/rcupdate.h>
7. Summary
Semaphores/Mutexes => Servicing userspace calls
Completions => End signal shared by routines servicing userspace.
Spinlocks => SMP systems / disabling interrupts.
Atomic ops => Single arithmetic op
Bit op => Single bit op
Seqlocks => Few writers / lots of readers.
RCU => Pointer struct modifications.
Interrupts and interrupt deferral
1. Installing an interrupt handler
2. Implementing a handler
3. Top and bottom halves
4. Interrupt sharing
5. Interrupt-driven I/O
1. Installing an interrupt handler
The basics:
Interrupt bitmask:
SA_SHIRQ:
SA_SAMPLE_RANDOM:
Interrupts generated by device can contribute to entropy pool for random number generation (/dev/random and /dev/urandom).
Unregister handler:
void free_irq(unsigned int irq, void *dev_id);
The /proc interface:
Do "cat /proc/interrupts" to see:
Interrupt lines that currently have registered handlers
The number of times each interrupt occurred on each CPU.
The PIC (Programmable Interrupt Controller) configuration for the interrupt.
The driver(s) that have registered handlers for the given interrupt, as provided by the "dev_name" parameter of request_irq().
Do "cat /proc/stat" and look for the "intr" line to see:
The total number of all interrupts that occurred since boot
The total number of interrupts of a given type that occurred since boot, each entry being separated by a space.
Autodetecting the IRQ number:
Only for nonshared interrupts
Usually for ISA only
<linux/interrupt.h>
unsigned long probe_irq_on(void);
  retval is bitmask of unassigned interrupts
  Record retval for passing to probe_irq_off()
  Enable interrupts after this call
  Configure device to emit interrupt
int probe_irq_off(unsigned long);
  Call to ask kernel which interrupt occurred
  Disable interrupts before this call
  May need to insert delay prior to calling this function to give time for the interrupt to occur.
  retval is > 0 if only one interrupt occurred
  retval is 0 if no interrupt occurred
  retval is < 0 if more than one interrupt occurred
Probing can take a lot of time (20 ms for frame grabber)
Best to do probing only once at module load time
Most non-PC platforms don't need probing and above functions are placeholders (including most PPC and MIPS implementations).
Do-it-yourself probing:
Loop on all possible interrupts
Record interrupt handler for a given interrupt
Configure device generating an interrupt
Wait for interrupt to occur
Check to see if handler was called
Free interrupt handler
Fast and slow interrupts:
Old kernel abstraction
do_IRQ does:
handle_IRQ_event:
Check for scheduling (processes may have been woken up as a result of interrupt).
2. Implementing a handler
The basics
Role:
Restrictions as to what handler can do
Restrictions similar to those of timer:
Can't access userspace
Handler arguments and return value:
retval is status of interrupt handling:
IRQ_RETVAL(var);
Enabling and disabling interrupts:
Disabling a single interrupt:
<asm/irq.h>
void disable_irq(int irq);
void disable_irq_nosync(int irq);
Disabling all interrupts:
void local_irq_save(unsigned long flags);
void local_irq_disable(void);
void local_irq_restore(unsigned long flags);
void local_irq_enable(void);
local_irq_disable()/local_irq_enable() do not nest: use local_irq_save()/local_irq_restore() when interrupts may already be disabled.
3. Top and bottom halves
Tasklets: Overview
Runs in software interrupt context
Only one tasklet of a given type will ever be running at the same time in the entire system.
Even if rescheduled multiple times, will only run once.
Interrupt may occur while tasklet is running => use appropriate locks.
Tasklets run on the same CPU where they were scheduled
API reminder:
DECLARE_TASKLET()
tasklet_init()
tasklet_schedule()
Workqueues:
Issue function within workqueue process context
Reminder:
4. Interrupt sharing
Use same request_irq()
Difference with nonshared handlers:
Registration fails if other handlers have registered without setting the SA_SHIRQ flag.
Running the handler
The /proc interface and shared interrupts
5. Interrupt-driven I/O
Guidelines:
Timely execution and time measurement
1. Measuring time lapses 2. Knowing the current time 3. Delaying execution 4. Kernel timers 5. Tasklets 6. Workqueues
1. Measuring time lapses
Background:
jiffies_64 (even on 32-bit platforms)
Access to jiffies_64 not atomic on 32-bit platforms
Drivers typically use jiffies (unsigned long):
Same as jiffies_64 or least significant bits of jiffies_64.
Using the jiffies counter:
Comparison functions:
<linux/jiffies.h>
int time_after(unsigned long a, unsigned long b);
int time_before(unsigned long a, unsigned long b);
int time_after_eq(unsigned long a, unsigned long b);
int time_before_eq(unsigned long a, unsigned long b);
Obtaining time difference:
diff = (long)t2 - (long)t1;
Converting time to milliseconds:
msec = diff * 1000 / HZ;
Helper function in case you need to read jiffies_64:
Processor-specific registers:
x86-type assembly macros:
rdtsc(low32, high32); rdtscl(low32); rdtscll(var64);
Often not synchronized on SMP systems
2. Knowing the current time
Getting absolute timestamp:
Getting current time:
3. Delaying execution
Long delays
Busy waiting
Wait in tight loop for a certain time:
while (time_before(jiffies, deadline)) cpu_relax();
Yielding the processor
Timeouts
If no event is waited for:
Short delays:
Sometimes short busy waits needed for hardware ops
Helper functions:
<linux/delay.h>
void msleep(unsigned int millisecs);
unsigned long msleep_interruptible(unsigned int millisecs);
void ssleep(unsigned int seconds);
Uninterruptible sleep
Likely wakeup much later than delay
4. Kernel timers
Background
Limitations:
Testing current context:
A timer callback can reschedule itself
On SMP, callback runs on same CPU as registered
The timer API:
Static initialization:
struct timer_list TIMER_INITIALIZER(_function, _expires, _data);
Dynamic initialization:
void init_timer(struct timer_list *timer);
May change the 3 fields in struct after initialization
Adding timer to list:
Deleting timer prior to expiry:
Modify timer expiry:
Delete timer and, on return, make sure it's not running on any CPU:
Indicate if timer is currently scheduled for execution:
The implementation of kernel timers:
Timers inserted in "cascading table" depending on expiry:
When __run_timers is executed:
If jiffies is multiple of 256, rehash next level into 256 lists, and cascade other levels as needed.
5. Tasklets
Somewhat similar to timers:
Difference from timers:
<linux/interrupt.h>
struct tasklet_struct {
    ...
    void (*func)(unsigned long);
    unsigned long data;
};
Static initialization:
DECLARE_TASKLET(name, func, data);
DECLARE_TASKLET_DISABLED(name, func, data);
Dynamic initialization:
Tasklet features:
Who runs tasklets?
Full API:
void tasklet_disable_nosync(struct tasklet_struct *t);
void tasklet_enable(struct tasklet_struct *t);
void tasklet_schedule(struct tasklet_struct *t);
void tasklet_hi_schedule(struct tasklet_struct *t);
void tasklet_kill(struct tasklet_struct *t);
6. Workqueues
Basics:
Not the same thing as previously seen wait queues
Similar to tasklets:
Different from tasklets:
Basic API:
One kernel thread per CPU
struct workqueue_struct *create_singlethread_workqueue(const char *name);
One kernel thread for entire system
Submitting a task to a workqueue:
Static declaration:
DECLARE_WORK(name, void (*function)(void *), void *data);
Dynamic declaration:
INIT_WORK(struct work_struct *work, void (*function)(void *), void *data);
PREPARE_WORK(struct work_struct *work, void (*function)(void *), void *data);
Similar to INIT_WORK() but doesn't initialize pointers to link struct work_struct to actual workqueue.
Useful if structure may have already been submitted to workqueue.
Actual submission:
int queue_work(struct workqueue_struct *queue, struct work_struct *work);
retval is nonzero if successfully added
retval is zero if already in queue (not added again)
int queue_delayed_work(struct workqueue_struct *queue, struct work_struct *work, unsigned long delay);
About workqueue callback sleep:
Will affect other callbacks queued in workqueue.
Rest of API:
int cancel_delayed_work(struct work_struct *work);
void flush_workqueue(struct workqueue_struct *queue);
void destroy_workqueue(struct workqueue_struct *queue);
The shared queue:
Can still use cancel_delayed_work().
Memory resources
1. The real story of kmalloc 2. Lookaside caches 3. get_free_page and friends 4. The alloc_pages interface 5. vmalloc and friends 6. Per-CPU variables 7. Obtaining large buffers 8. Memory management in Linux 9. The mmap device operation
1. The real story of kmalloc
The flags argument:
GFP_ATOMIC:
GFP_KERNEL:
Used by functions servicing system calls on behalf of process
Caller must be reentrant
Caller must not be holding locks
GFP_USER:
Allocate for user space
May sleep
GFP_HIGHUSER:
Allocates high memory (if available)
May sleep
GFP_NOIO:
Similar to GFP_KERNEL
Indicates to kernel not to do any I/O to satisfy request
GFP_NOFS:
Similar to GFP_KERNEL
Indicates to kernel not to do any filesystem calls
Additional flags to OR "|" with basic flags to further detail alloc:
__GFP_DMA:
__GFP_HIGHMEM:
__GFP_COLD:
__GFP_NOWARN:
__GFP_HIGH:
__GFP_REPEAT:
__GFP_NOFAIL:
__GFP_NORETRY:
Memory zones:
Minimum zones recognized on all platforms by Linux:
The size argument:
2. Lookaside caches
SLAB_NO_REAP:
SLAB_HWCACHE_ALIGN:
SLAB_CACHE_DMA:
"constructor"/"destructor"
Optional
Initialize newly allocated objects / clean up objects prior to free.
Constructor called after allocation, not necessarily immediately.
Destructors may be called at any time after free request
May or may not sleep depending on whether "flags" passed contains SLAB_CTOR_ATOMIC.
Can use same function for both
Actual constructor called with SLAB_CTOR_CONSTRUCTOR
void *kmem_cache_alloc(kmem_cache_t *cache, int flags);
void kmem_cache_free(kmem_cache_t *cache, const void *obj);
Free cache-allocated memory
int kmem_cache_destroy(kmem_cache_t *cache);
Destroy entire cache (usually on module_exit)
Fails if not all objects have been kmem_cache_free'd
Check retval for failure (memleak)
Usually set to "mempool_alloc_slab"
typedef void (mempool_free_t)(void *element, void *pool_data);
Usually set to "mempool_free_slab"
Typically:
cache = kmem_cache_create(...);
pool = mempool_create(MY_POOL_MINIMUM, mempool_alloc_slab, mempool_free_slab, cache);
void *mempool_alloc(mempool_t *pool, int gfp_mask);
Allocate from pool.
void mempool_free(void *element, mempool_t *pool);
Free from pool.
int mempool_resize(mempool_t *pool, int new_min_nr, int gfp_mask);
void mempool_destroy(mempool_t *pool);
3. get_free_page and friends
get_zeroed_page(unsigned int flags);
Get pointer to new page preinitialized with zeroes
__get_free_page(unsigned int flags);
Get pointer to new page without clearing content
__get_free_pages(unsigned int flags, unsigned int order);
"flags" same as for kmalloc
Page freeing:
void free_page(unsigned long addr);
void free_pages(unsigned long addr, unsigned long order);
Different "order" from allocation will cause memory corruption.
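The "order" argument is the base-2 logarithm of the number of pages requested, so a byte count has to be rounded up to a power-of-two number of pages. A minimal user-space sketch of that rounding, assuming a 4096-byte page; DEMO_PAGE_SIZE and demo_get_order() are hypothetical names for illustration, not the kernel's own helper:

```c
#include <stddef.h>

/* Assumed page size for this sketch; the real value is architecture-specific. */
#define DEMO_PAGE_SIZE 4096UL

/* Smallest order such that (1 << order) pages hold at least size bytes. */
static unsigned int demo_get_order(size_t size)
{
    size_t pages = (size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
    size_t span = 1;
    unsigned int order = 0;

    while (span < pages) {
        span <<= 1;
        order++;
    }
    return order;
}
```

The same order value must be kept by the driver and passed back when freeing, which is why a mismatch corrupts memory.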
4. The alloc_pages interface
struct page *alloc_pages(unsigned int flags, unsigned int order);
struct page *alloc_page(unsigned int flags);
void __free_page(struct page *page);
void __free_pages(struct page *page, unsigned int order);
Free page lot
void free_hot_page(struct page *page);
Free single page that is in cache
void free_cold_page(struct page *page);
Free single page that is not in cache
5. vmalloc and friends
Can't be used in an atomic context (relies on GFP_KERNEL).
retval is zero on failure
retval is pointer to region of at least "size" size
void vfree(void *addr);
Free vmalloc'ed memory
6. Per-CPU variables
Use wisely.
<linux/percpu.h>
Static declaration:
DEFINE_PER_CPU(type, name);
Type can be array (char[10])
Dynamic declaration:
void *alloc_percpu(type);
void *__alloc_percpu(size_t size, size_t align);
Use in case of special alignment needs
free_percpu();
Can be manipulated without locks, pending preemption protection.
For statically allocated variables:
get_cpu_var(var);
put_cpu_var(var);
per_cpu(var, cpu_id);
For dynamically allocated variables:
int get_cpu(void);
per_cpu_ptr(void *per_cpu_var, int cpu_id);
void put_cpu(void);
May export per-CPU variables:
To use from another module:
In contrast with DEFINE_PER_CPU()
Canned per-CPU counters:
<linux/percpu_counter.h>
7. Obtaining large buffers
<linux/bootmem.h>
void *alloc_bootmem(unsigned long size);
void *alloc_bootmem_low(unsigned long size);
void *alloc_bootmem_pages(unsigned long size);
void *alloc_bootmem_low_pages(unsigned long size);
void free_bootmem(unsigned long addr, unsigned long size);
8. Memory management in Linux
Address types:
See Fig. 15-1 on p. 414
User virtual address:
Physical address:
Bus address:
Kernel logical address:
Virtual addresses that map directly to physical addresses by an offset.
kmalloc() hands out this type of memory
Use __pa() to convert to physical address
Kernel virtual address:
Virtual address that doesn't necessarily map to a given range of physical address.
A contiguous range of kernel virtual addresses will typically not be physically contiguous. It will be contiguous in virtual space because of the page mappings.
vmalloc()'ed memory
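Since a kernel logical address differs from its physical address only by a constant offset, the conversion is plain arithmetic. A user-space sketch, assuming a 32-bit layout with the kernel mapped at 0xC0000000 and 4 KB pages; the DEMO_ constants and demo_ helpers are hypothetical and only mimic what __pa() and friends do:

```c
#include <stdint.h>

#define DEMO_PAGE_OFFSET 0xC0000000u /* assumed start of kernel logical addresses */
#define DEMO_PAGE_SHIFT  12          /* assumed 4 KB pages */

/* Logical address -> physical address (the role played by __pa()). */
static uint32_t demo_pa(uint32_t logical)
{
    return logical - DEMO_PAGE_OFFSET;
}

/* Physical address -> logical address (the role played by __va()). */
static uint32_t demo_va(uint32_t phys)
{
    return phys + DEMO_PAGE_OFFSET;
}

/* Physical address -> page frame number. */
static uint32_t demo_pfn(uint32_t phys)
{
    return phys >> DEMO_PAGE_SHIFT;
}
```

No such arithmetic is valid for kernel virtual addresses handed out by vmalloc(), which need a page-table lookup instead.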
Physical addresses and pages:
High and low memory:
32-bit system can only address 4 GB of physical memory
Linux-specific limitations:
Virtual address space is usually split between kernel and process:
3 GB for process / 1 GB for kernel
Kernel can't handle memory not mapped in its space (logical address):
Therefore, for a long time, 1 GB RAM was all Linux supported
Modern processors:
Linux's use of these extensions:
The memory map and struct page:
atomic_t count;
void *virtual;
unsigned long flags;
Page use flags
Address conversion and struct page:
struct page *virt_to_page(void *kaddr);
struct page *pfn_to_page(int pfn);
void *page_address(struct page *page);
Mapping struct page:
Simple mapping:
<linux/highmem.h>
void *kmap(struct page *page);
If lowmem, return logical address
If highmem, maps highmem to kernel space
May sleep to create mapping
void kunmap(struct page *page);
Freeing mapping done with kmap
Atomic mapping:
<linux/highmem.h>
<asm/kmap_types.h>
void *kmap_atomic(struct page *page, enum km_type type);
Atomic form of kmap()
void kunmap_atomic(void *addr, enum km_type type);
Freeing mapping done with kmap_atomic
Page tables:
Virtual memory areas:
"Memory object with its own properties":
Processes usually have following areas:
See /proc/<pid>/maps for example layout
Each line in maps has format:
start-end perm offset major:minor inode image
The vm_area_struct structure:
Virtual address range for VMA
struct file *vm_file;
unsigned long vm_pgoff;
unsigned long vm_flags;
struct vm_operations_struct *vm_ops;
void *vm_private_data;
The vm_operations_struct structure:
void (*open)(struct vm_area_struct *vma);
Called every time new ref to area (like fork)
void (*close)(struct vm_area_struct *vma);
Called every time ref to area is closed (like close())
struct page *(*nopage)(struct vm_area_struct *vma, unsigned long address, int *type);
Access to page but page isn't there
If NULL, empty page is allocated by kernel
Allows VMA to be initialized prior to being referenced
Not needed by drivers
The process memory map:
9. The mmap device operation
Basics:
Actual driver callback prototype:
Driver must:
Populating page tables:
remap_pfn_range
nopage
Using remap_pfn_range:
pfn is actual RAM
phys_addr is I/O memory
vma:
virt_addr:
pfn:
size:
prot:
Must test retval for success (nonzero means failure)
About caching:
A simple implementation:
static int my_driver_mmap(struct file *filp, struct vm_area_struct *vma)
{
    if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
                        vma->vm_end - vma->vm_start,
                        vma->vm_page_prot))
        return -EAGAIN;
    return 0;
}
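The simple implementation trusts whatever range user space asked for. A driver that exposes a fixed-size region would normally check the request first. The sketch below shows only that arithmetic, in user space: struct demo_vma and demo_check_mmap() are hypothetical stand-ins for the three vm_area_struct fields used above, and a 4 KB page is assumed.

```c
#define DEMO_PAGE_SHIFT 12 /* assumed 4 KB pages */

/* Hypothetical stand-in for the fields of vm_area_struct used by mmap. */
struct demo_vma {
    unsigned long vm_start; /* first virtual address of the mapping */
    unsigned long vm_end;   /* first address past the mapping */
    unsigned long vm_pgoff; /* offset into the region, in pages */
};

/*
 * Returns 0 and stores the mapping size in *size_out if the request fits
 * inside a region of region_pages pages, -1 otherwise.
 */
static int demo_check_mmap(const struct demo_vma *vma,
                           unsigned long region_pages,
                           unsigned long *size_out)
{
    unsigned long size, pages;

    if (vma->vm_end <= vma->vm_start)
        return -1;
    size = vma->vm_end - vma->vm_start;
    pages = (size + (1UL << DEMO_PAGE_SHIFT) - 1) >> DEMO_PAGE_SHIFT;
    if (vma->vm_pgoff > region_pages ||
        pages > region_pages - vma->vm_pgoff)
        return -1;
    *size_out = size;
    return 0;
}
```

In a real callback the failing case would return an error such as -EINVAL before remap_pfn_range() is ever called.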
Adding VMA operations:
Modify vm_ops prior to returning from mmap() callback:
vma->vm_ops = &my_vma_ops;
Mapping memory with nopage:
As shown before:
struct page *(*nopage)(struct vm_area_struct *vma, unsigned long address, int *type);
Remapping specific I/O regions:
Remapping RAM:
Physical ranges above RAM
RAM locked pages (reserved pages) / can't be swapped
Attempts to remap virtual addresses will result in mapping of the "zero page".
Remapping RAM with the nopage method:
See LDD3 for example of how to map physical pages allocated in logical addresses to user space.
Remapping kernel virtual addresses:
Relayfs:
10. Performing direct I/O
Allow direct reading/writing using a user-space buffer without ever copying the data to kernel space first.
Very useful for stuff like block and networking devices.
However, most drivers that need direct I/O already have subsystems that do the dirty work, such as in the case of block and networking.
Kernel provides get_user_pages() to lock user pages in RAM for transfer.
Once done, must mark pages as dirty if modified:
void SetPageDirty(struct page *page);
Remove from page cache:
void page_cache_release(struct page *page);
See LDD3 for full details
11. Asynchronous I/O
Streaming
Typically involves carrying out direct I/O
Relevant char dev callbacks:
See LDD3 for full details
12. Direct memory access
Overview of a DMA data transfer:
Basically:
1 Set up hardware for transfer
2 Respond to interrupt by signaling that transfer is done
Usually on read:
1 Process read
2 Driver sets up hardware to transfer from device to DMA buffer.
3 Driver puts process to sleep
4 Hardware writes data to buffer
5 Hardware issues interrupt
6 Interrupt handler deals with transferred data and awakens process.
Usually on write:
1 Process write
2 Data copied to DMA buffer
3 Driver sets up hardware to transfer from DMA buffer to device.
4 Driver puts process to sleep
5 Hardware copies data from buffer to device
6 Hardware issues interrupt
7 Interrupt handler wakes up process
For asynchronous data arrival (like data acquisition):
1 Interrupt signals arrival of new data
2 Interrupt handler instructs hardware to transfer data to designated buffer.
Allocating the DMA buffer:
Do-it-yourself allocation:
Use ioremap() later to map region for I/O
Doesn't work on systems with highmem
Can use scatter/gather I/O if device allows it
Bus addresses:
Conversion functions exist, but strongly discouraged (deprecated):
unsigned long virt_to_bus(volatile void *address);
void *bus_to_virt(unsigned long address);
DMA for ISA devices:
13. The generic DMA layer
Basics:
Dealing with difficult hardware:
int dma_set_mask(struct device *dev, u64 mask);
"mask": bitmask of the address bits the device can drive
retval is zero on success, nonzero if DMA is not possible with mask
Not needed for devices supporting 32-bit DMA
DMA mappings:
Combination of:
DMA buffer allocation
Device-accessible address to buffer
DMA mappings for PCI code:
Coherent DMA mappings:
Streaming DMA mappings:
Each PCI DMA mapping type dealt with differently
Setting up coherent DMA mappings:
void *dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, int flag);
void dma_free_coherent(struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle);
All parameters must be properly set
DMA pools:
void dma_pool_free(struct dma_pool *pool, void *vaddr, dma_addr_t addr);
Setting up streaming DMA mappings:
Work on buffer preallocated by driver
Must indicate transfer direction
retval is bus address to pass to device
Important notes:
Reacquire buffer for driver manipulation without unmapping it.
Return buffer back to device
Single-page streaming mappings:
dma_addr_t dma_map_page(struct device *dev, struct page *page, unsigned long offset, size_t size, enum dma_data_direction direction);
void dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size, enum dma_data_direction direction);
Avoid partial-page mappings because of potential cache coherency problems.
Scatter/gather mappings:
Transferring several buffers at the same time
Many devices accept list of buffer pointers for single DMA.
unsigned int length;
unsigned int offset;
Second, map scatter buffer:
int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction);
"nents": number of entries in list
Function stores attributed DMA bus addresses into scatterlist struct.
Third, transfer buffers:
unsigned int sg_dma_len(struct scatterlist *sg);
Finally, when done, unmap:
void dma_unmap_sg(struct device *dev, struct scatterlist *list, int nents, enum dma_data_direction);
"nents" is value passed originally to dma_map_sg(), not what that function actually did.
PCI double-address cycle mappings:
retval 0 if DAC can be used
DAC mappings should live in high memory
DAC mappings must be created one page at a time
"direction": PCI_DMA_TODEVICE, PCI_DMA_FROMDEVICE, PCI_DMA_BIDIRECTIONAL.
No need for "unmapping" DAC mappings
Must, however, restrict access:
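Back on scatter/gather mappings: the count returned by dma_map_sg() can be smaller than the "nents" passed in, because adjacent buffers may be merged, which is why the unmap call takes the original count. The user-space sketch below only imitates that merging on a toy list; struct demo_sg and demo_map_sg() are hypothetical, and a real mapping stores separate DMA address and length fields rather than rewriting the entries in place.

```c
/* Hypothetical toy scatterlist entry: a bus address and a length. */
struct demo_sg {
    unsigned long addr;
    unsigned int len;
};

/*
 * Merge entries that are contiguous in bus address space, in place.
 * Returns the number of entries left after merging (<= nents).
 */
static int demo_map_sg(struct demo_sg *sg, int nents)
{
    int mapped = 0;
    int i;

    if (nents <= 0)
        return 0;
    for (i = 1; i < nents; i++) {
        if (sg[mapped].addr + sg[mapped].len == sg[i].addr)
            sg[mapped].len += sg[i].len; /* extend current segment */
        else
            sg[++mapped] = sg[i];        /* start a new segment */
    }
    return mapped + 1;
}
```

A driver would program the device with the returned count of segments but still pass the original count when unmapping.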
Hardware access
1. I/O ports and I/O memory 2. I/O registers vs. conventional memory 3. Using I/O ports 4. Using I/O memory 5. Ports as I/O memory
1. I/O ports and I/O memory
Devices are controlled through access to their special registers.
Registers often placed in consecutive memory addresses.
Most architectures have a single address space shared by memory and devices.
Some architectures have separate I/O ports and main memory, either by design of the architecture or design of the actual board. The x86 is an example architecture where I/O ports exist.
Kernel provides separate functions for dealing with I/O ports and dealing with I/O memory.
2. I/O registers vs. conventional memory
<linux/kernel.h>
void barrier(void);
Hardware barriers:
Platform dependent
<asm/system.h>
void rmb(void);
void read_barrier_depends(void);
void wmb(void);
void mb(void);
SMP-equivalent barriers:
set_mb(var, value)
set_wmb(var, value)
set_rmb(var, value)
Barriers are used in kernel's locking mechanisms (spinlocks, atomic_t, etc.)
3. Using I/O ports
I/O port allocation:
void release_region(unsigned long start, unsigned long n);
Release previously allocated region
int check_region(unsigned long first, unsigned long n);
Manipulating I/O ports:
Some parameter types and return values depend on architecture.
Input:
Output:
I/O port access from user space:
On x86/PC, ports are accessible from user space in certain conditions:
String operations:
Output:
Pausing I/O:
Platform dependencies:
x86 and x86_64:
ARM:
PPC:
MIPS:
4. Using I/O memory
I/O memory allocation and mapping:
Allocation:
Freeing:
Checking:
int check_mem_region(unsigned long start, unsigned long len);
Deprecated: can't guarantee atomic check and request
Mapping:
void *ioremap_nocache(unsigned long phys_addr, unsigned long size);
void iounmap(void *addr);
Accessing I/O memory:
Read functions:
unsigned int ioread8(void *addr);
unsigned int ioread16(void *addr);
unsigned int ioread32(void *addr);
Write functions:
void iowrite8(u8 value, void *addr);
void iowrite16(u16 value, void *addr);
void iowrite32(u32 value, void *addr);
Repeat read functions:
void ioread8_rep(void *addr, void *buf, unsigned long count);
void ioread16_rep(void *addr, void *buf, unsigned long count);
void ioread32_rep(void *addr, void *buf, unsigned long count);
Repeat write functions:
void iowrite8_rep(void *addr, const void *buf, unsigned long count);
void iowrite16_rep(void *addr, const void *buf, unsigned long count);
void iowrite32_rep(void *addr, const void *buf, unsigned long count);
Memory block operations:
void memset_io(void *addr, u8 value, unsigned int count);
void memcpy_fromio(void *dest, void *source, unsigned int count);
Older functions:
5. Ports as I/O memory
Use request_region() to reserve I/O port region
Remap I/O port region to memory:
void *ioport_map(unsigned long port, unsigned int count);
void ioport_unmap(void *addr);
Char drivers
1. Major and minor 2. The file operations structure 3. The file structure 4. The inode structure 5. Char device registration 6. The older way for device registration 7. Open and release 8. Read and write 9. Read
1. Major and minor
Basics:
The internal representation of device numbers:
dev_t
Macros in <linux/kdev_t.h>:
Currently, dev_t is 32 bit:
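A 32-bit device number can be pictured as a major number in the upper bits and a minor number in the lower bits. The sketch below assumes the 12-bit major / 20-bit minor split used by 2.6 kernels and re-creates the packing in user space; the DEMO_ macros are hypothetical look-alikes of MAJOR(), MINOR() and MKDEV(), which drivers should use instead of relying on the layout.

```c
#include <stdint.h>

typedef uint32_t demo_dev_t;

#define DEMO_MINORBITS 20 /* assumed width of the minor field */
#define DEMO_MINORMASK ((1u << DEMO_MINORBITS) - 1)

/* Extract the major number (upper 12 bits). */
#define DEMO_MAJOR(dev) ((unsigned int)((dev) >> DEMO_MINORBITS))
/* Extract the minor number (lower 20 bits). */
#define DEMO_MINOR(dev) ((unsigned int)((dev) & DEMO_MINORMASK))
/* Build a device number from a major/minor pair. */
#define DEMO_MKDEV(ma, mi) \
    (((demo_dev_t)(ma) << DEMO_MINORBITS) | (demo_dev_t)(mi))
```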
Allocating and freeing device numbers:
register_chrdev_region(): static allocation
alloc_chrdev_region(): dynamic allocation
unregister_chrdev_region(): deallocation
Beginning device number for range
Number of devices in range
Dynamic allocation of major numbers:
2. The file operations structure
Basics:
Connection between major/minor nbrs and char driver callbacks:
struct file_operations
f_op
file_operations are the actual implementation of the main filesystem calls: open, close, read, write, etc.
One callback per implemented call
NULL for unsupported calls:
Kernel behavior for NULL depends on call
Details:
loff_t (*llseek)(struct file *, loff_t, int);
Position counter unpredictable if NULL
ssize_t (*read)(struct file *, char __user *, size_t, loff_t *);
ssize_t (*aio_read)(struct kiocb *, char __user *, size_t, loff_t);
ssize_t (*write)(struct file *, const char __user *, size_t, loff_t *);
ssize_t (*aio_write)(struct kiocb *, const char __user *, size_t, loff_t);
int (*readdir)(struct file *, void *, filldir_t);
unsigned int (*poll)(struct file *, struct poll_table_struct *);
int (*ioctl)(struct inode *, struct file *, unsigned int, unsigned long);
int (*mmap)(struct file *, struct vm_area_struct *);
int (*open)(struct inode *, struct file *);
int (*flush)(struct file *);
int (*release)(struct inode *, struct file *);
int (*fsync)(struct file *, struct dentry *, int datasync);
int (*aio_fsync)(struct kiocb *, int datasync);
int (*fasync)(int, struct file *, int);
int (*lock)(struct file *, int, struct file_lock *);
ssize_t (*readv)(struct file *, const struct iovec *, unsigned long, loff_t *);
ssize_t (*writev)(struct file *, const struct iovec *, unsigned long, loff_t *);
ssize_t (*sendfile)(struct file *, loff_t *, size_t, read_actor_t, void *);
ssize_t (*sendpage)(struct file *, struct page *, int, size_t, loff_t *, int);
int (*check_flags)(int);
int (*dir_notify)(struct file *filp, unsigned long arg);
Usually implemented by FSes only
Most important calls:
Useful calls:
mmap
poll
3.Thefilestructure
modet_tf_mode:
loff_tf_ops:
unsignedintf_flags:
structfile_operations*f_op:Thefileopsforthefile
void*private_data:
structdentry*f_dentry:
Canbeusedtoobtaininodestruct:
filp>f_dentry>d_inode
4.Theinodestructure
dev_ti_rdev:
Fordevicedriverinodes,thisisdevicenumber Kernelinternalrepresentationofchardevices
structcdev*i_cdev:
Wheninodeischardevice,pointstoactualstructcdev
Shouldnotaccessinodefieldsdirectly,use macrosinstead:
unsignedintiminor(structinode*inode) unsignedintimajor(structinode*inode)
5. Char device registration
Initializing existing cdev struct:
void cdev_init(struct cdev *cdev, struct file_operations *fops);
Must add device to system (for both static and dynamic):
About cdev_add:
To remove:
Example device registration on p. 57
6. The older way for device registration
int register_chrdev(unsigned int major, const char *name, struct file_operations *fops);
int unregister_chrdev(unsigned int major, const char *name);
Same major and name passed to register_chrdev.
7. Open and release
open should do:
Identify the device being opened, if using new registration scheme:
container_of(pointer, container_type, container_field) macro:
If using old registration scheme:
Other open responsibilities:
If necessary, update f_op
Allocate and fill any private data for filp->private_data
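The container_of() step in open can be illustrated outside the kernel. In this sketch demo_container_of() is a simplified, hypothetical re-creation of the macro (the kernel version adds type checking), and struct demo_cdev / struct demo_device stand in for struct cdev and a driver's own device structure that embeds it.

```c
#include <stddef.h>

/* Simplified re-creation: step back from a member to its enclosing struct. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct cdev. */
struct demo_cdev {
    int placeholder;
};

/* Hypothetical driver structure embedding the cdev. */
struct demo_device {
    int id;
    struct demo_cdev cdev;
};

/* What open would do with the cdev pointer found in the inode (i_cdev). */
static struct demo_device *demo_from_cdev(struct demo_cdev *c)
{
    return demo_container_of(c, struct demo_device, cdev);
}
```

The recovered pointer is what a driver would typically store in filp->private_data for later calls.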
release should do:
Deallocate anything allocated in filp->private_data
Shut down device on last close
Generalities about open/release:
flush() called every time there is a close(), though few drivers support this call.
8. Read and write
A little bit more about buff:
Must use appropriate functions for user-space references:
9. Read
Interpretation of return value by user-space app:
10. Write
Interpretation of return value by user-space app:
Example implementation p. 68-69.
11. readv/writev
Vector operations
Vector entries contain: buffer pointer + length value.
If missing, read/write are called multiple times by kernel.
If useful, always best to have real readv/writev
Easiest, have a loop in driver calling read/write
If implemented, should be more fancy, like having command reordering.
Not typically implemented by driver
12. ioctl
Background:
int ioctl(int fd, unsigned long cmd, ...);
Not actually a variable number of arguments, instead, 3rd arg is:
char *argp
The "..." avoids type checking at build time
argp can be used to pass all sorts of things
Usually implemented as a huge "switch(cmd)"
Each "cmd" is interpreted and acted on differently
Choosing the ioctl commands:
Commands must be defined in headers, and shared with user-space apps so that they know what commands to invoke for a given action.
Numbers should be unique so that a mistaken driver access does not cause damage.
Use of "magic numbers"
To help manage these numbers, the command codes have been split up into several bitfields.
include/asm/ioctl.h defines the separate bitfields
Documentation/ioctl-number.txt defines the already allocated "magic" numbers.
Bitfields:
Size of argument: usually 13 or 14 bits, but actual width depends on arch.
Not mandatory, but recommended
If need larger data structs, can ignore this field
Macros for defining command numbers:
<asm/ioctl.h>
Example:
#define MY_IOC_MAGIC #define MY_IOCFIRSTCOMMAND #define MY_IOCSECONDCOMMAND #define MY_IOCREAD #define MY_IOCWRITE #define MY_IOCREADWRITE #define MY_IOC_MAXNR
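As an illustration of how such command numbers are built and taken apart, here is a sketch that compiles in user space on Linux. The DEMO_ names, the magic letter 'k' and the command list are all hypothetical and are not the values elided from the slide; only the _IO*/_IOC_* macros come from the kernel's ioctl header.

```c
#include <linux/ioctl.h>

/* Hypothetical magic number and commands, for illustration only. */
#define DEMO_IOC_MAGIC 'k'
#define DEMO_IOCRESET  _IO(DEMO_IOC_MAGIC, 0)        /* no argument */
#define DEMO_IOCGET    _IOR(DEMO_IOC_MAGIC, 1, int)  /* driver -> user */
#define DEMO_IOCSET    _IOW(DEMO_IOC_MAGIC, 2, int)  /* user -> driver */
#define DEMO_IOCXCHG   _IOWR(DEMO_IOC_MAGIC, 3, int) /* both directions */
#define DEMO_IOC_MAXNR 3

/* First check a driver's ioctl would make before its switch(cmd). */
static int demo_cmd_is_valid(unsigned int cmd)
{
    if (_IOC_TYPE(cmd) != DEMO_IOC_MAGIC)
        return 0;
    if (_IOC_NR(cmd) > DEMO_IOC_MAXNR)
        return 0;
    return 1;
}
```

A driver would typically return -ENOTTY for a command that fails this check.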
The return value
The predefined commands
Filesystem-specific files
Commands specified for any file, including drivers:
FIOCLEX:
FIONCLEX:
FIOASYNC:
FIOQSIZE:
FIONBIO:
Using the ioctl argument:
If not pointer, use as is
If pointer, must make sure address is valid:
access_ok() checks to make sure address is not kernel space
Transfer macros:
<asm/uaccess.h>
put_user(datum, ptr) and __put_user(datum, ptr)
Writes datum to user space
Size of transfer determined using ptr type
__put_user assumes having already called access_ok()
retval is zero on success
get_user(local, ptr) and __get_user(local, ptr)
Read datum from user space into "local"
retval is zero on success
Capabilities and restricted operations
In addition to simple user permissions, Linux provides "capabilities".
Capabilities enable selectively setting privileges with finer granularity than just using root/non-root.
Capabilities == permission management
Capabilities controlled using sys_capget() and sys_capset().
Available capabilities defined in <linux/capability.h>
No additional capabilities can be defined
Capabilities relevant to drivers:
CAP_DAC_OVERRIDE:
CAP_NET_ADMIN:
CAP_SYS_MODULE:
CAP_SYS_RAWIO:
CAP_SYS_ADMIN:
CAP_SYS_TTY_CONFIG:
Drivers should check for capabilities prior to carrying out privileged operations.
Capability checking:
Device control without ioctl:
13. Blocking I/O
What's blocking I/O:
Introduction to sleeping:
Introduction to wait queues:
DECLARE_WAIT_QUEUE_HEAD(name);
Dynamic initialization:
void init_waitqueue_head(wait_queue_head_t *queue);
Simple sleeping
Basic sleeping macros take 2 or 3 arguments:
Sleep macros:
Put process in uninterruptible mode (not recommended)
wait_event_interruptible(queue, condition)
wait_event_timeout(queue, condition, timeout)
wait_event_interruptible_timeout(queue, condition, timeout)
Same as wait_event_timeout()
Wakeup functions, the basics:
void wake_up(wait_queue_head_t *queue);
Wake up all processes sleeping in queue.
void wake_up_interruptible(wait_queue_head_t *queue);
Wake up only processes in state interruptible.
Blocking and nonblocking operations
Advanced sleeping:
How a process sleeps:
3 steps:
wait_queue_head_t is defined in <linux/wait.h>
Contains linked list and spinlock
Wait queue entry is queue entry of type wait_queue_t
struct __wait_queue {
    unsigned int flags;
#define WQ_FLAG_EXCLUSIVE 0x01
    struct task_struct *task;
    wait_queue_func_t func;
    struct list_head task_list;
};
TASK_RUNNING: Task is able to run, but not necessarily running
To modify process state:
If schedule() called, process will return in TASK_RUNNING.
If not, must reset process state to TASK_RUNNING manually.
Either way, must remove task from wait queue to avoid being woken up again.
If sleep condition satisfied between if() and schedule(), then this is ok, process is returned to "TASK_RUNNING" and schedule() will return, though not necessarily right away.
Once done, must check if code needs to sleep again because the sleep condition isn't fulfilled.
Can do previously described procedure manually, or better use helper functions.
Manual sleeps
Static wait queue entry declaration:
Dynamic wait queue initialization:
Add process to wait queue and set state:
void finish_wait(wait_queue_head_t *queue, wait_queue_t *wait);
Make sure don't need to sleep again because sleep condition isn't fulfilled.
Check if wakeup due to signal:
int signal_pending(struct task_struct *p)
Return -ERESTARTSYS in such a case.
Exclusive waits
When to use:
Must use manual wait helper functions for exclusive waits:
The details of waking up:
Full wakeup functions:
void wake_up(wait_queue_head_t *queue);
void wake_up_interruptible(wait_queue_head_t *queue);
void wake_up_nr(wait_queue_head_t *queue, int nr);
void wake_up_interruptible_nr(wait_queue_head_t *queue, int nr);
Same
void wake_up_all(wait_queue_head_t *queue);
Wakes up all processes, whether exclusive or not.
void wake_up_interruptible_all(wait_queue_head_t *queue);
Same
void wake_up_interruptible_sync(wait_queue_head_t *queue);
Ensure that process woken up doesn't get to run prior to this function having returned.
Most drivers should just use wake_up_interruptible()
14. poll and select
Introduction to poll and select:
Nonblocking I/O applications use poll, select, and epoll to determine if data is ready for consumption.
These calls can be used to check entire sets of file descriptors for whether one of them is ready for read/write or wait for one of them to be ready for read/write.
poll and select essentially equivalent, implemented by 2 separate Unix teams in parallel.
epoll is Linux-specific for handling thousands of file descriptors.
In driver callback for poll, select and epoll:
Driver should do:
void poll_wait(struct file *, wait_queue_head_t *, poll_table *);
What the kernel does:
For epoll:
Flags recognized as part of bitmask:
<linux/poll.h>
POLLIN:
POLLRDNORM:
POLLRDBAND:
POLLPRI:
POLLHUP:
POLLERR:
POLLOUT:
POLLWRNORM:
POLLWRBAND:
Interaction with read and write:
reading data from the device
writing data to the device
15. fsync
16. fasync
Setting up async notification in user space:
signal(SIGIO, my_sig_handler);              /* Set sig-handler */
fcntl(my_fd, F_SETOWN, getpid());           /* Set fd "owner" */
c_flags = fcntl(my_fd, F_GETFL);            /* Get current flags */
fcntl(my_fd, F_SETFL, c_flags | FASYNC);    /* Set async notification */
Driver's involvement:
fasync() callback responsibility:
First 3 arguments taken from parameters of fasync() callback.
Last argument is driver's own pointer to a struct fasync_struct *.
Upon data availability, and if async readers:
void kill_fasync(struct fasync_struct **fa, int sig, int band);
First param same as for fasync_helper
sig is typically SIGIO
band is POLLIN|POLLRDNORM if read available and POLLOUT|POLLWRNORM if write.
On close should call internal fasync() callback to remove file from list of async listeners:
my_fasync(-1, filp, 0);
17. llseek
18. Access control on a device file
Single-open devices:
Restricting access to a single user at a time
Allows user to open device multiple times
Must maintain:
Use critical section (spinlock) in open to:
Use critical section (spinlock) in close to:
Blocking I/O as an alternative to EBUSY
Usually unavailable device should return EBUSY
Sometimes, it's better to just wait until the open can succeed, depending on what's the expected user experience.
On open, put process on wait queue until device is available.
On release, wake up waiting processes
Cloning the device on open
Each separate process doing an open gets a new device, which that process then deals with independently of other processes.
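The single-user policy described earlier in this section (one user at a time, multiple opens by that user allowed) comes down to an open count plus an owner. The sketch below models just that bookkeeping in user space under stated assumptions: struct demo_access and the demo_ functions are hypothetical, -1 stands in for -EBUSY, and the spinlock a real driver takes around these updates is left out.

```c
/* Hypothetical bookkeeping a single-user device would keep. */
struct demo_access {
    unsigned int count; /* number of current opens */
    unsigned int owner; /* uid of the user holding the device */
};

/*
 * Returns 0 if the open is allowed, -1 (standing in for -EBUSY) if the
 * device is held by another user and the caller cannot override.
 */
static int demo_open(struct demo_access *a, unsigned int uid, int can_override)
{
    if (a->count && a->owner != uid && !can_override)
        return -1;
    if (a->count == 0)
        a->owner = uid; /* first opener becomes the owner */
    a->count++;
    return 0;
}

/* Drop one reference; the device is free again when count reaches zero. */
static void demo_release(struct demo_access *a)
{
    if (a->count)
        a->count--;
}
```

The blocking variant would replace the failing branch with a sleep on a wait queue that release wakes up.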
Block drivers
1. Registration 2. The block device operations 3. Request processing 4. Some other details
1. Registration
Block driver registration:
Allocate major number if need be
Create entry in /proc/devices
int unregister_blkdev(unsigned int major, const char *name);
Disk registration:
Block device operations:
Called on device open
int (*release)(struct inode *inode, struct file *filp);
Called on device close
int (*ioctl)(struct inode *inode, struct file *filp, unsigned int cmd, unsigned long arg);
int (*media_changed)(struct gendisk *gd);
int (*revalidate_disk)(struct gendisk *gd);
struct module *owner;
The gendisk structure:
Major number, first minor number and number of "disks"
char disk_name[32];
struct block_device_operations *fops;
struct request_queue *queue;
int flags;
sector_t capacity;
void *private_data;
struct gendisk contains a kobject
Allocating disks:
struct gendisk *alloc_disk(int minors);
Freeing disks:
void del_gendisk(struct gendisk *gd);
Adding allocated disk to system:
1. Properly initialize allocated struct gendisk
2. void add_disk(struct gendisk *gd);
Should not be called until driver is fully functional as it will result in calls to your driver's callbacks.
IOW, a block I/O request callback must have been registered prior to add_disk() being invoked.
A note on sector sizes:
Kernel sees device as flat array of sectors
Kernel considers every sector to have 512 bytes
Can nevertheless override default by modifying parameter in request queue:
blk_queue_hardsect_size(dev->queue, hardsect_size);
set_capacity(dev->gd, nsectors * (hardsect_size / 512));
Must continue translating sector number nonetheless:
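The translation between the kernel's 512-byte sectors and a device's larger hardware sectors is simple arithmetic. A user-space sketch with hypothetical demo_ helpers, assuming the hardware sector size is a multiple of 512:

```c
#define DEMO_KERNEL_SECTOR_SIZE 512UL /* sector size the kernel always assumes */

/* Capacity to report to the kernel, in 512-byte sectors. */
static unsigned long demo_capacity(unsigned long nsectors,
                                   unsigned long hardsect_size)
{
    return nsectors * (hardsect_size / DEMO_KERNEL_SECTOR_SIZE);
}

/* Byte offset on the device of a kernel sector number. */
static unsigned long demo_byte_offset(unsigned long kernel_sector)
{
    return kernel_sector * DEMO_KERNEL_SECTOR_SIZE;
}

/* Hardware sector that contains a given kernel sector. */
static unsigned long demo_hw_sector(unsigned long kernel_sector,
                                    unsigned long hardsect_size)
{
    return demo_byte_offset(kernel_sector) / hardsect_size;
}
```

For a device with 2048-byte sectors, kernel sector 8 lands at byte 4096, which is hardware sector 2.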
2. The block device operations
The open and release methods:
int (*open)(struct inode *inode, struct file *filp);
When is this called:
No way for driver to know difference between user-space and kernel-space open.
int (*release)(struct inode *inode, struct file *filp);
Reverse open()
The ioctl method:
3. Request processing
Introduction to the request method:
Called by kernel for every request
Typically starts the request processing
A simple request method:
Request queue traversal is done using elv_next_request():
Returns NULL when no more requests
Requests must be explicitly removed from queue using:
blkdev_dequeue_request(struct request *req);
Once request is fully processed, must tell kernel that it is so:
void end_request(struct request *req, int succeeded);
Some requests are not for actual transfers, but for device command.
sector_t sector;
unsigned long nr_sectors;
char *buffer;
To figure out transfer direction, use macro:
rq_data_dir(struct request *req);
Request queues:
Basics:
Typical I/O scheduler:
Available schedulers:
Queue creation and deletion:
request_queue_t *blk_init_queue(request_fn_proc *request, spinlock_t *lock);
Set ->queuedata to private data
void blk_cleanup_queue(request_queue_t *req_queue);
No more requests queued after this call
Should be called during driver finalization
Queuing functions:
struct request *elv_next_request(request_queue_t *queue);
void blkdev_dequeue_request(struct request *req);
void elv_requeue_request(request_queue_t *queue, struct request *req);
Requeue request
Queue control functions:
Control how queue operates
void blk_stop_queue(request_queue_t *queue);
Suspend queue
void blk_start_queue(request_queue_t *queue);
Resume queue
void blk_queue_bounce_limit(request_queue_t *queue, u64 dma_addr);
void blk_queue_max_sectors(request_queue_t *queue, unsigned short max);
Maximum number of sectors per request
void blk_queue_max_phys_segments(request_queue_t *queue, unsigned short max);
Maximum number of noncontiguous memory ranges handled by driver per request.
void blk_queue_max_hw_segments(request_queue_t *queue, unsigned short max);
Maximum number of noncontiguous memory ranges handled by device per request.
void blk_queue_max_segment_size(request_queue_t *queue, unsigned int max);
Maximum size for segments in a request
blk_queue_segment_boundary(request_queue_t *queue, unsigned long mask);
Set maximum memory boundary for requests serviceable by device.
void blk_queue_dma_alignment(request_queue_t *queue, int mask);
DMA alignment constraints
Requests will match size and alignment
void blk_queue_hardsect_size(request_queue_t *queue, unsigned short max);
Specify sector size other than default 512 bytes
Kernel will continue to operate on 512-byte assumption however
The anatomy of a request:
Basics:
The bio structure:
First sector to be transferred
unsigned int bi_size;
Bytes to be transferred
unsigned long bi_flags;
unsigned short bi_phys_segments;
unsigned short bi_hw_segments;
struct bio *bi_next;
struct bio_vec *bi_io_vec;
struct bio_vec
struct bio_vec {
        struct page *bv_page;
        unsigned int bv_len;
        unsigned int bv_offset;
};
See fig. 16-1 in LDD3, p. 482
Looping around entries in bio entries:
Can use bio_vec struct entries to create DMA mappings
If direct page access needed:
char *__bio_kmap_atomic(struct bio *bio, int i, enum km_type type);
void __bio_kunmap_atomic(char *buffer, enum km_type type);
Helper functions:
Operate on buffer to be transferred next within bio
May not be useful if driver wants to walk bio list before deciding what to transfer.
struct page *bio_page(struct bio *bio);
Get pointer to next page to transfer
int bio_offset(struct bio *bio);
Get page offset of request
int bio_cur_sectors(struct bio *bio);
Get number of sectors to transfer from page
char *bio_data(struct bio *bio);
Get kernel logical address to buffer to be transferred
Not highmem (at least, by default, block I/O layer doesn't hand highmem buffers to drivers).
char *bio_kmap_irq(struct bio *bio, unsigned long *flags);
IRQ-safe mapping of any type of buffer, even from highmem.
void bio_kunmap_irq(char *buffer, unsigned long *flags);
Undo mapping done with bio_kmap_irq
Request structure fields:
struct request internals
sector_t hard_sector;
unsigned long hard_nr_sectors;
unsigned int hard_cur_sectors;
struct bio *bio;
char *buffer;
unsigned short nr_phys_segments;
struct list_head queuelist;
Barrier requests:
Informing block layer that driver handles barrier requests:
Figuring out whether request contains barrier:
Nonretryable requests:
int blk_noretry_request(struct request *req);
If retval nonzero, driver should abort
Request completion functions:
Basics:
int end_that_request_first(struct request *req, int success, int count);
void end_that_request_last(struct request *req);
Typically:
if (!end_that_request_first(req, 1, sectors_xferred)) {
        blkdev_dequeue_request(req);
        end_that_request_last(req);
}
Working with bios:
Request callback should:
Block requests and DMA:
Instead of going through bios and setting up DMA for each transfer:
If no request concatenation wanted:
clear_bit(QUEUE_FLAG_CLUSTER, &queue->queue_flags);
Doing without a request queue:
typedef int (make_request_fn)(request_queue_t *q, struct bio *bio);
make_request can:
Signaling bio completion directly:
Must also set up make_request callback:
4. Some other details
Command pre-preparation:
Tagged command queueing:
Network drivers
1. Basics 2. Connecting to the kernel 3. The net_device structure in detail 4. Opening and closing 5. Packet transmission 6. Packet reception 7. The interrupt handler 8. Receive interrupt mitigation 9. Changes in link state
1. Basics
2. Connecting to the kernel
Device registration:
struct net_device *alloc_netdev(int sizeof_priv, const char *name, void (*setup)(struct net_device *));
Helper:
Once device is allocated *and* initialized:
Initializing each device:
Accessing private data part of struct net_device:
struct snull_priv *priv = netdev_priv(dev);
Module unloading:
int unregister_netdevice(struct net_device *dev);
void free_netdev(struct net_device *dev);
3. The net_device structure in detail
Global information:
char name[IFNAMSIZ];
unsigned long state;
struct net_device *next;
int (*init)(struct net_device *dev);
Hardware information:
unsigned long rmem_end;
unsigned long rmem_start;
unsigned long mem_end;
unsigned long mem_start;
unsigned long base_addr;
unsigned char irq;
unsigned char if_port;
unsigned char dma;
Helper functions for setting up interface information:
void ether_setup(struct net_device *dev);
Setup for Ethernet device
void ltalk_setup(struct net_device *dev);
void fc_setup(struct net_device *dev);
void fddi_setup(struct net_device *dev);
void hippi_setup(struct net_device *dev);
void tr_setup(struct net_device *dev);
Interface information:
unsigned mtu;
unsigned long tx_queue_len;
unsigned short type;
unsigned char addr_len;
unsigned char broadcast[MAX_ADDR_LEN];
unsigned char dev_addr[MAX_ADDR_LEN];
unsigned short flags;
int features;
Interfaceflags:
IFF_UP:
IFF_BROADCAST:
IFF_DEBUG:
IFF_LOOPBACK:
IFF_POINTOPOINT:
IFF_NOARP:
IFF_PROMISC:
IFF_MULTICAST:
IFF_ALLMULTI:
IFF_MASTER,IFF_SLAVE:
IFF_PORTSEL,IFF_AUTOMEDIA:
IFF_DYNAMIC:
IFF_RUNNING:
IFF_NOTRAILERS:
Features:
Set by driver to tell kernel what device can do
NETIF_F_SG, NETIF_F_FRAGLIST:
Scatter/gather I/O
NETIF_F_IP_CSUM, NETIF_F_NO_CSUM, NETIF_F_HW_CSUM:
Control whether kernel has to do checksumming or whether device does it.
NETIF_F_HIGHDMA:
High memory DMA capable
Support for 802.1q VLANs
NETIF_F_TSO:
TCP segmentation offloading
Fundamental device methods:
int (*open)(struct net_device *dev);
Called when device is ifconfig'ed up
int (*stop)(struct net_device *dev);
Called when device is ifconfig'ed down
int (*hard_start_xmit)(struct sk_buff *skb, struct net_device *dev);
int (*rebuild_header)(struct sk_buff *skb);
void (*tx_timeout)(struct net_device *dev);
Timeout callback
struct net_device_stats *(*get_stats)(struct net_device *dev);
Statistics callback
int (*set_config)(struct net_device *dev, struct ifmap *map);
Interface config change callback
Not typically needed
Optional device methods:
int weight; int (*poll)(struct net_device *dev, int *quota);
NAPI driver poll callback
void (*poll_controller)(struct net_device *dev);
Check whether events occurred on interface
int (*do_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd);
Device-specific ioctl
void (*set_multicast_list)(struct net_device *dev);
Multicast list change callback
int (*set_mac_address)(struct net_device *dev, void *addr);
If supported, allow changing MAC address
int (*change_mtu)(struct net_device *dev, int new_mtu);
Change MTU
int (*header_cache)(struct neighbour *neigh, struct hh_cache *hh);
Fill "hh" with ARP query result
int (*header_cache_update)(struct hh_cache *hh, struct net_device *dev, unsigned char *haddr);
Update "hh" cache
int (*hard_header_parse)(struct sk_buff *skb, unsigned char *haddr);
Parse hardware headers
Utility fields:
Maintained by driver
unsigned long trans_start;
unsigned long last_rx;
int watchdog_timeo;
void *priv;
struct dev_mc_list *mc_list; int mc_count;
spinlock_t xmit_lock; int xmit_lock_owner;
4. Opening and closing
All done via ifconfig
Bringing up interface with ifconfig:
Assign address to interface:
Turn interface on:
Driver's open should do:
void netif_start_queue(struct net_device *dev);
Driver's close should do:
Stop device queue:
void netif_stop_queue(struct net_device *dev);
5. Packet transmission
Basics:
Kernel signals transmission to driver using hard_start_xmit().
Packets are handed over to driver in form of struct sk_buff (skb).
skb contains everything that is needed for packet to travel on network.
Driver should usually just transmit skb as is
When transmitted packet shorter than minimum size supported by device, zero out remainder to avoid security leaks.
hard_start_xmit should free skb after transmission
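The zero-padding rule for short packets can be illustrated with a small userspace sketch. It is not driver code: pad_frame() is a hypothetical helper, and the 60-byte minimum is an assumption matching the usual Ethernet minimum frame length without FCS (ETH_ZLEN); a real device may impose a different minimum.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_FRAME_LEN 60u /* assumed minimum frame length, as for Ethernet without FCS */

/*
 * Copy a packet into a transmit buffer, zero-filling up to the minimum
 * length so that stale buffer contents never leak onto the wire.
 * Returns the number of bytes to transmit, or 0 if the buffer is too small.
 */
static size_t pad_frame(unsigned char *out, size_t out_size,
                        const unsigned char *data, size_t len)
{
    size_t tx_len = len < MIN_FRAME_LEN ? MIN_FRAME_LEN : len;

    assert(out != NULL && data != NULL);
    if (tx_len > out_size)
        return 0;
    memset(out, 0, tx_len);  /* zero first: padding bytes stay zero */
    memcpy(out, data, len);  /* then copy the actual payload */
    return tx_len;
}
```

A 2-byte payload therefore goes out as 60 bytes, the last 58 of them zero, while bytes beyond the returned length are left untouched.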
Controlling transmission concurrency:
Notify kernel that more transfers can start occurring again:
If transmission must be suspended elsewhere than in hard_start_xmit():
Restarting queue after netif_tx_disable() using netif_wake_queue().
Transmission timeouts:
Scatter/gather I/O:
Checking whether packet is fragmented:
In a fragmented skb:
Must use DMA operations to map struct page as seen earlier.
6. Packet reception
Need to allocate skb to pass to higher layers
Reception mode:
Interrupt driven:
One interrupt per packet
Polled: (high bandwidth)
System polls interface for new packets
Allocate skb:
struct sk_buff *dev_alloc_skb(unsigned int length);
Call is atomic (we're likely in an interrupt handler)
Set information about packet:
Protocol
Checksum requirements:
CHECKSUM_HW, CHECKSUM_NONE, CHECKSUM_UNNECESSARY
Set statistics
Push skb to network stack:
Possible optimization:
7. The interrupt handler
Possible causes:
On receipt, do as described in previous section
On send, deallocate transmitted buffer:
8. Receive interrupt mitigation
9. Changes in link state
Link state may change
Changing carrier state:
Checking carrier state:
10. The socket buffers
Basic:
The important fields:
Device responsible for buffer
union {...} h; union {...} nh; union {...} mac;
Packet header pointers
unsigned char *head; unsigned char *data; unsigned char *tail; unsigned char *end;
Packet data pointers
unsigned int len;
unsigned int data_len;
unsigned char ip_summed;
unsigned char pkt_type;
Shared info struct handling
Some info stored in "shared info" struct for performance reasons.
Functions acting on socket buffers:
struct sk_buff *dev_alloc_skb(unsigned int length);
void kfree_skb(struct sk_buff *skb);
void dev_kfree_skb(struct sk_buff *skb);
void dev_kfree_skb_irq(struct sk_buff *skb);
void dev_kfree_skb_any(struct sk_buff *skb);
unsigned char *skb_put(struct sk_buff *skb, unsigned int len);
Add len to end of buffer
Check if enough space
unsigned char *__skb_put(struct sk_buff *skb, unsigned int len);
Add len to end of buffer
Don't check for space
unsigned char *skb_push(struct sk_buff *skb, unsigned int len);
unsigned char *__skb_push(struct sk_buff *skb, unsigned int len);
Add len to beginning of buffer
int skb_tailroom(const struct sk_buff *skb);
int skb_headroom(const struct sk_buff *skb);
void skb_reserve(struct sk_buff *skb, unsigned int len);
Reserve "len" bytes of headroom at beginning of (still empty) buffer
unsigned char *skb_pull(struct sk_buff *skb, unsigned int len);
Remove packet head
int skb_is_nonlinear(const struct sk_buff *skb);
unsigned int skb_headlen(const struct sk_buff *skb);
void *kmap_skb_frag(const skb_frag_t *frag);
void kunmap_skb_frag(void *vaddr);
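How the head/data/tail/end pointers move under reserve, put, push and pull can be modelled in userspace. The toy_skb type and toy_* functions below are hypothetical stand-ins written for this illustration; they are not the kernel's struct sk_buff, and unlike the real skb_put()/skb_push(), which treat overruns as fatal bugs, this sketch simply returns NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the linear part of a socket buffer. */
struct toy_skb {
    unsigned char storage[128];
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
};

static void toy_init(struct toy_skb *s)
{
    s->head = s->data = s->tail = s->storage;
    s->end = s->storage + sizeof(s->storage);
    s->len = 0;
}

static unsigned int toy_headroom(const struct toy_skb *s) { return (unsigned int)(s->data - s->head); }
static unsigned int toy_tailroom(const struct toy_skb *s) { return (unsigned int)(s->end - s->tail); }

/* Open up headroom: only meaningful while the buffer is still empty. */
static void toy_reserve(struct toy_skb *s, unsigned int n)
{
    assert(s->len == 0 && n <= toy_tailroom(s));
    s->data += n;
    s->tail += n;
}

/* Append n bytes at the tail; returns where the caller should write them. */
static unsigned char *toy_put(struct toy_skb *s, unsigned int n)
{
    unsigned char *old_tail = s->tail;

    if (n > toy_tailroom(s))
        return NULL;
    s->tail += n;
    s->len += n;
    return old_tail;
}

/* Prepend n bytes (e.g. a header) using the reserved headroom. */
static unsigned char *toy_push(struct toy_skb *s, unsigned int n)
{
    if (n > toy_headroom(s))
        return NULL;
    s->data -= n;
    s->len += n;
    return s->data;
}

/* Strip n bytes from the front (e.g. after parsing a header). */
static unsigned char *toy_pull(struct toy_skb *s, unsigned int n)
{
    if (n > s->len)
        return NULL;
    s->data += n;
    s->len -= n;
    return s->data;
}
```

A typical sequence would be: reserve 16 bytes, put a 20-byte payload, push a 14-byte header in front of it, then pull that header off again on the receive side.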
11. MAC address resolution
Using ARP with Ethernet:
Overriding ARP:
Non-Ethernet headers:
12. Custom ioctl commands
13. Statistical information
get_stats() callback
See LDD3 for details
14. Multicast
See LDD3
15. A few other details
For remote debugging
Provide poll_controller() callback
PCI drivers
1. The PCI interface 2. PCI addressing 3. Boot time 4. Configuration registers and initialization 5. MODULE_DEVICE_TABLE 6. Registering a PCI driver 7. Old-style PCI probing 8. Enabling the PCI device 9. Accessing the configuration space 10. Accessing the I/O and memory spaces 11. PCI interrupts
1. The PCI interface
Most widely used peripheral bus in mainstream computers.
PCI specification lays out complete bus functionality.
Arch-independent
Higher clock rate than the popular ISA
32-bit bus
Peripherals are autoconfigured at boot time
Linux provides abstractions to help drivers access PCI resources and configuration.
2. PCI addressing
Peripheral identified using:
Linux provides pci_dev struct to manipulate PCI devices.
Hosts can have many buses plugged together using PCI bridges. Bridges are seen as special PCI peripherals.
PCI layout is tree of buses and devices
Root bus is PCI bus 0
Example layout in fig. 12-1 in LDD3, p. 304
Viewing of PCI layout can be done using "lspci", or looking at /proc/pci, /proc/bus/pci or /sys/bus/pci/.
Ways in which devices in PCI layout are displayed (in hex):
Peripherals can answer queries about:
All devices share address space for:
Memory locations (32- or 64-bit range)
I/O ports (32-bit range)
Each PCI device function has 256 bytes for config (PCI-X has 4K)
4 bytes reserved for unique ID (used by drivers to locate device).
3. Boot time
Upon reset PCI devices:
Sysfs provides per-device configuration items through /sys/bus/pci/devices/<a device>:
4. Configuration registers and initialization
Device (function) configuration contained in 256 bytes
See LDD3 p. 308 for illustration
First 64 bytes are standard and required
The rest depends on peripheral
All PCI registers are little-endian
Use proper byte ordering functions when necessary.
See peripheral hardware documentation for meaning and use of registers.
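The little-endian layout of the standard header can be illustrated with a userspace sketch that decodes fields from a raw 256-byte configuration dump, independent of host byte order. The helper name cfg_read16() and the TOY_* offsets are hypothetical; the offsets 0x00 and 0x02 for vendor and device ID follow the standard PCI header layout, and the sample bytes in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CFG_SIZE      256u
#define TOY_CFG_VENDOR_ID 0x00u /* 16-bit vendor ID */
#define TOY_CFG_DEVICE_ID 0x02u /* 16-bit device ID */

/* Read a 16-bit little-endian register at byte offset "where". */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int where)
{
    assert(cfg != 0 && where + 1 < TOY_CFG_SIZE);
    /* Low byte first, regardless of the endianness of the CPU running this. */
    return (uint16_t)(cfg[where] | ((uint16_t)cfg[where + 1] << 8));
}
```

With a dump starting with the bytes 86 80 29 12, cfg_read16(cfg, TOY_CFG_VENDOR_ID) yields 0x8086 and cfg_read16(cfg, TOY_CFG_DEVICE_ID) yields 0x1229.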
Fields of interest for driver:
vendorID (16-bit)
deviceID (16-bit)
class (16-bit)
"network" group contains: Ethernet and token ring
"communication" group contains: serial and parallel
subsystem vendorID, subsystem deviceID:
Drivers use these fields to indicate which peripherals they support:
struct pci_device_id
__u32 vendor; __u32 device;
PCI vendor and device IDs
Use PCI_ANY_ID if support for any
__u32 subvendor; __u32 subdevice;
__u32 class; __u32 class_mask;
kernel_ulong_t driver_data;
Helper macros for creating pci_device_id structs:
PCI_DEVICE(vendor, device):
PCI_DEVICE_CLASS(device_class, device_class_mask)
Creates struct pci_device_id matching specific class
Can declare list of struct pci_device_id to give to PCI layer to indicate the list of supported devices.
5. MODULE_DEVICE_TABLE
6. Registering a PCI driver
struct pci_driver:
const char *name;
const struct pci_device_id *id_table;
int (*probe)(struct pci_dev *dev, const struct pci_device_id *id);
Driver probe callback
Called by kernel when a struct pci_dev is found for this driver
"id" is the device the kernel has found for this driver
retval should be zero if driver accepts responsibility for device and has initialized said device.
retval should be negative error code if driver doesn't recognize device or doesn't want to handle it.
void (*remove)(struct pci_dev *dev);
Called when device is being removed from the system
Called when PCI driver is being unloaded from the kernel
int (*suspend)(struct pci_dev *dev, u32 state);
Optional
Called when device is getting suspended
"state" is suspend state
int (*resume)(struct pci_dev *dev);
Optional
Called to reverse suspend()
Basic PCI driver entry:
static struct pci_driver my_pci_driver = {
        .name = "my_pci_driver",
        .id_table = my_ids,
        .probe = my_probe,
        .remove = my_remove,
};
PCI driver registration:
int pci_register_driver(struct pci_driver *pci_driver);
retval < 0 if error
PCI driver removal:
7. Old-style PCI probing
When no more devices, function returns NULL
void pci_dev_put(struct pci_dev *pci_dev);
Decrement refcount on device
Same as above, but allows passing subsystem vendor and subsystem device.
struct pci_dev *pci_get_slot(struct pci_bus *bus, unsigned int devfn);
Searches a specific bus for a given device function
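The "devfn" argument packs the device (slot) number and the function number into one byte: 5 bits of slot, 3 bits of function. The userspace sketch below mirrors what the kernel's PCI_DEVFN(), PCI_SLOT() and PCI_FUNC() macros compute; the toy_* names and toy_format_bdf() are hypothetical helpers for illustration, and the output format imitates the bus:device.function notation printed by lspci.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pack slot (0-31) and function (0-7) into a devfn value. */
static unsigned int toy_pci_devfn(unsigned int slot, unsigned int func)
{
    assert(slot < 32 && func < 8);
    return ((slot & 0x1f) << 3) | (func & 0x07);
}

static unsigned int toy_pci_slot(unsigned int devfn) { return (devfn >> 3) & 0x1f; }
static unsigned int toy_pci_func(unsigned int devfn) { return devfn & 0x07; }

/* Format "bus:slot.func" in hex; returns snprintf()'s result. */
static int toy_format_bdf(char *buf, size_t size, unsigned int bus, unsigned int devfn)
{
    return snprintf(buf, size, "%02x:%02x.%x", bus,
                    toy_pci_slot(devfn), toy_pci_func(devfn));
}
```

For example, slot 0x1f function 3 encodes to devfn 0xfb, which on bus 0 would be displayed as 00:1f.3.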
8. Enabling the PCI device
int pci_enable_device(struct pci_dev *dev);
"Wakes up" device
In some cases, assigns interrupt line and I/O regions
9. Accessing the configuration space
Having found device, driver may need to read and/or write to
Must provide distance from beginning of config space "where" to read from in bytes.
<linux/pci.h>
Reading config data (8, 16 and 32 bit):
Writing config data:
Previous functions are actually macros
Low-level operations:
int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, int where, u8 *val);
int pci_bus_read_config_word(struct pci_bus *bus, unsigned int devfn, int where, u16 *val);
int pci_bus_read_config_dword(struct pci_bus *bus, unsigned int devfn, int where, u32 *val);
int pci_bus_write_config_byte(struct pci_bus *bus, unsigned int devfn, int where, u8 val);
int pci_bus_write_config_word(struct pci_bus *bus, unsigned int devfn, int where, u16 val);
int pci_bus_write_config_dword(struct pci_bus *bus, unsigned int devfn, int where, u32 val);
Predefined locations "where" to read from in <linux/pci.h>:
10. Accessing the I/O and memory spaces
Accessed using config access functions at locations:
Use kernel helper functions instead of accessing config directly:
unsigned long pci_resource_start(struct pci_dev *dev, int bar);
unsigned long pci_resource_end(struct pci_dev *dev, int bar);
retval is last usable address of given region
unsigned long pci_resource_flags(struct pci_dev *dev, int bar);
retval is given region's flags
Region flags:
<linux/ioport.h>
IORESOURCE_IO:
IORESOURCE_MEM:
IORESOURCE_PREFETCH:
IORESOURCE_READONLY:
Use previously discussed I/O functions to read/write into PCI regions.
11. PCI interrupts
After that, use the already covered request_irq(), etc.
USB drivers
1. USB device basics 2. USB and sysfs 3. USB urbs 4. Writing a USB driver 5. USB transfers without urbs
1. USB device basics
Properties:
Linux and USB:
Linux USB drivers attach to "interface", not entire device.
Endpoints:
Basic form of USB communication
Can carry data in one direction only
OUT: From computer to device
IN: From device to computer
CONTROL:
Write commands
Read status
Each device has at least "endpoint 0"
At insertion time, USB core uses "endpoint 0" to config device
"endpoint 0" transfers guaranteed by USB protocol
INTERRUPT:
Periodic transfers: bandwidth reserved
Transfer small amounts of data at fixed rate from device to computer when host asks.
Primary transport for mice and keyboards
Can also be used to send commands to devices
Transfers guaranteed by USB protocol
BULK:
Asynchronous transfers
Transfer large amounts of data
Lossless transfers
ISOCHRONOUS:
Linux struct for endpoints:
struct usb_host_endpoint
Contains actual endpoint information placeholder:
struct usb_endpoint_descriptor
Data in this struct is as passed by device
Placeholder entries relevant to drivers:
bEndpointAddress:
Endpoint's USB address
Use USB_DIR_OUT and USB_DIR_IN bitmasks to determine direction.
bmAttributes:
Endpoint type
Use USB_ENDPOINT_XFERTYPE_MASK, USB_ENDPOINT_XFER_ISOC, USB_ENDPOINT_XFER_BULK, or USB_ENDPOINT_XFER_INT to determine endpoint type.
wMaxPacketSize:
Maximum packet size handled by endpoint at every transfer
Larger transfers will be cut into wMaxPacketSize-sized chunks
See the USB spec to use this field to specify a "high bandwidth" mode.
bInterval:
Interval in milliseconds for interrupt-type transfers
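Decoding those descriptor fields comes down to a few bit masks. The userspace sketch below uses a cut-down, hypothetical toy_endpoint_desc struct and TOY_* constants; the numeric values (0x80 for the IN direction bit, 0x03 for the transfer-type mask, and 0 to 3 for control, isochronous, bulk and interrupt) are the ones defined by the USB specification.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_USB_DIR_IN        0x80 /* bit 7 of bEndpointAddress: device to host */
#define TOY_USB_EP_NUM_MASK   0x0f /* low nibble: endpoint number */
#define TOY_USB_XFERTYPE_MASK 0x03 /* low two bits of bmAttributes */

/* Cut-down stand-in for the fields discussed above. */
struct toy_endpoint_desc {
    uint8_t  bEndpointAddress;
    uint8_t  bmAttributes;
    uint16_t wMaxPacketSize;
    uint8_t  bInterval;
};

static int toy_ep_is_in(const struct toy_endpoint_desc *d)
{
    assert(d != NULL);
    return (d->bEndpointAddress & TOY_USB_DIR_IN) != 0;
}

static unsigned int toy_ep_number(const struct toy_endpoint_desc *d)
{
    assert(d != NULL);
    return d->bEndpointAddress & TOY_USB_EP_NUM_MASK;
}

static const char *toy_ep_type(const struct toy_endpoint_desc *d)
{
    static const char *const names[] = { "control", "isochronous", "bulk", "interrupt" };

    assert(d != NULL);
    return names[d->bmAttributes & TOY_USB_XFERTYPE_MASK];
}
```

A probe routine typically walks the endpoints of the current alternate setting and uses checks like these to find, say, the first bulk IN endpoint.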
Interfaces:
struct usb_interface
Important fields:
struct usb_host_interface *altsetting:
Array of possible alternate settings for interface
Each usb_host_interface corresponds to set of endpoint configs (struct usb_host_endpoint).
No particular ordering
unsigned num_altsetting:
Number of alternate settings
struct usb_host_interface *cur_altsetting:
Currently active alternate setting
Pointer into altsetting
int minor:
All USB devices have the same major number
This is the minor number attributed to device after call to usb_register_dev().
Configurations:
Interface bundle (one or many)
Linux struct for configurations:
struct usb_host_config
Device:
Converting data from struct usb_interface to struct usb_device:
interface_to_usbdev() macro
Common operation for drivers
2. USB and sysfs
First device in USB tree is USB root hub
Each USB root hub has unique ID (here this is 2)
Device connected using external USB hub:
<root hub>-<hub port>-<hub port>:<config nb>.<interface>
3. USB urbs
Basics:
Contain internal refcount (deleted on free of last ref)
urb lifecycle:
struct urb important fields:
struct usb_device *dev:
USB device to which urb belongs
unsigned int pipe:
unsigned int transfer_flags:
void *transfer_buffer:
dma_addr_t transfer_dma:
int transfer_buffer_length:
unsigned char *setup_packet:
dma_addr_t setup_dma:
usb_complete_t complete:
void *context:
int actual_length:
int status:
int start_frame:
int interval:
int number_of_packets:
int error_count:
struct usb_iso_packet_descriptor iso_frame_desc[0]:
Setting urb endpoint type:
unsigned int usb_sndctrlpipe(struct usb_device *dev, unsigned int endpoint);
unsigned int usb_rcvctrlpipe(struct usb_device *dev, unsigned int endpoint);
unsigned int usb_sndbulkpipe(struct usb_device *dev, unsigned int endpoint);
unsigned int usb_rcvbulkpipe(struct usb_device *dev, unsigned int endpoint);
unsigned int usb_sndintpipe(struct usb_device *dev, unsigned int endpoint);
unsigned int usb_rcvintpipe(struct usb_device *dev, unsigned int endpoint);
unsigned int usb_sndisocpipe(struct usb_device *dev, unsigned int endpoint);
urb flags:
URB_ISO_ASAP:
URB_NO_TRANSFER_DMA_MAP:
URB_NO_SETUP_DMA_MAP:
URB_ASYNC_UNLINK:
URB_NO_FSBR:
URB_ZERO_PACKET:
URB_NO_INTERRUPT:
urb status:
0:
Transfer successful
-ENOENT:
usb_kill_urb() stopped urb
-ECONNRESET:
-EINPROGRESS:
-EPROTO:
-EILSEQ:
-EPIPE:
-ECOMM:
-ENOSR:
-EOVERFLOW:
-EREMOTEIO:
-ENODEV:
-EXDEV:
-EINVAL:
-ESHUTDOWN:
Device has serious problem, system shut it down
Creating and destroying urbs:
struct urb *usb_alloc_urb(int iso_packets, int mem_flags);
void usb_free_urb(struct urb *urb);
Frees urb
"pipe": endpoint type, see earlier functions:
usb_sndintpipe / usb_rcvintpipe
"transfer_buffer": kmalloc'ed input/output buffer
"buffer_length": length of transfer_buffer
"complete": completion callback
"context": private data for completion callback
"interval": urb scheduling interval
Bulk urbs:
void usb_fill_bulk_urb(struct urb *urb, struct usb_device *dev, unsigned int pipe, void *transfer_buffer, int buffer_length, usb_complete_t complete, void *context);
Parameters same as usb_fill_int_urb()
"pipe": usb_sndbulkpipe or usb_rcvbulkpipe
Control urbs:
void usb_fill_control_urb(struct urb *urb, struct usb_device *dev, unsigned int pipe, unsigned char *setup_packet, void *transfer_buffer, int buffer_length, usb_complete_t complete, void *context);
Parameters same as usb_fill_bulk_urb()
"setup_packet": sent prior to data
"pipe": usb_sndctrlpipe or usb_rcvctrlpipe
This initializer not typically used
Direct transfers are used instead (will see later)
Isochronous urbs:
No helper functions
Must be initialized by hand
See LDD3 p. 344 for example
Submitting urbs:
GFP_KERNEL: most situations
Completing urbs: the completion callback handler
Possible reasons for completion:
Canceling urbs:
int usb_kill_urb(struct urb *urb);
int usb_unlink_urb(struct urb *urb);
4. Writing a USB driver
What devices does the driver support?
__u16 idVendor:
__u16 idProduct:
__u16 bcdDevice_lo, __u16 bcdDevice_hi:
__u8 bDeviceClass, __u8 bDeviceSubClass, __u8 bDeviceProtocol:
Define class, subclass and protocol as spelled out by spec
Values specify device behavior
__u8 bInterfaceClass, __u8 bInterfaceSubClass, __u8 bInterfaceProtocol
kernel_ulong_t driver_info:
Helper macros:
USB_DEVICE_VER(vendor, product, lo, hi):
USB_DEVICE_INFO(class, subclass, protocol):
Create struct usb_device_id matching class description
USB_INTERFACE_INFO(class, subclass, protocol):
Create struct usb_device_id matching interface description
MODULE_DEVICE_TABLE(usb, <list of struct usb_device_id>);
Registering a USB driver:
struct usb_driver:
struct module *owner:
const char *name:
const struct usb_device_id *id_table:
int (*probe)(struct usb_interface *intf, const struct usb_device_id *id):
Record any info regarding device in local structs for future use.
void (*disconnect)(struct usb_interface *intf):
int (*ioctl)(struct usb_interface *intf, unsigned int code, void *buf):
int (*suspend)(struct usb_interface *intf, u32 state)
int (*resume)(struct usb_interface *intf)
Not typically implemented
Called on device resume
Basic usb_driver entry:
static struct usb_driver my_driver = {
        .owner = THIS_MODULE,
        .name = "MyDriver",
        .id_table = my_table,
        .probe = my_probe,
        .disconnect = my_disconnect
};
Actual registration:
int usb_register(struct usb_driver *);
void usb_deregister(struct usb_driver *);
Most important work should be conducted at device open from user space.
probe and disconnect in detail
Getting struct usb_device_id from struct usb_interface:
When disconnecting, do not forget to use usb_set_intfdata() to reset private data to NULL.
Connection between higher levels and USB driver:
int usb_register_dev(struct usb_interface *intf, struct usb_class_driver *class_driver);
void usb_deregister_dev(struct usb_interface *intf, struct usb_class_driver *class_driver);
Called in disconnect()
USB buffer allocation primitives:
void *usb_buffer_alloc(struct usb_device *dev, size_t size, int mem_flags, dma_addr_t *dma);
"dev": usb device
"size": amount requested
"mem_flags": same as kmalloc()
"dma": "transfer_dma" entry in urb struct
void usb_buffer_free(struct usb_device *dev, size_t size, void *addr, dma_addr_t dma);
Similar to above
"addr" is usb_buffer_alloc'ed space
5. USB transfers without urbs
usb_control_msg:
Other USB data functions:
int usb_get_string(struct usb_device *dev, unsigned short langid, unsigned char index, void *buf, int size);
TTY drivers
1. Basics 2. A small TTY driver 3. tty_driver function pointers 4. TTY line settings 5. ioctls 6. proc and sysfs handling of TTY devices 7. Core struct details
1. Basics
Virtual console:
System console:
Serial ports:
Pseudo-terminals (PTYs):
What is a "line discipline"?:
Typically, this is a protocol conversion: PPP, Bluetooth, etc.
See /proc/tty/drivers for list of tty drivers currently loaded.
2. A small TTY driver
Basics:
struct tty_driver: <linux/tty_driver.h>
Allocating a tty driver:
struct tty_driver *alloc_tty_driver(<nb tty devices supported>);
int tty_register_driver(struct tty_driver *driver);
Results in sysfs entries creation
Once registered, the driver should register the devices it controls:
Conversely:
struct tty_driver contents
"owner": THIS_MODULE
"driver_name": name shown in /proc/tty/drivers
struct termios:
Bitmask entries in struct termios (see termios manpage):
tcflag_t c_iflag;
tcflag_t c_oflag;
tcflag_t c_cflag;
tcflag_t c_lflag;
cc_t c_line;
cc_t c_cc[NCCS];
3. tty_driver function pointers
open and close:
Flow of data:
Other buffering functions:
No read function?
Feeding characters to TTY layer:
Flushing buffer when enough characters:
4. TTY line settings
Basics:
set_termios:
tiocmget and tiocmset
5. ioctls
Some 70 different ioctls to ttys
See summary in LDD3
6. proc and sysfs handling of TTY devices
7. Core struct details
See LDD3 for full details of the following structs:
Appendix A. Debugging drivers
1. Debugging support in the kernel 2. Manual techniques 3. Debugging tools 4. Performance measurement 5. Hardware tools
1. Debugging support in the kernel
CONFIG_DEBUG_SLAB:
CONFIG_DEBUG_PAGEALLOC:
CONFIG_DEBUG_SPINLOCK:
CONFIG_DEBUG_SPINLOCK_SLEEP:
CONFIG_INIT_DEBUG:
CONFIG_DEBUG_INFO:
CONFIG_MAGIC_SYSRQ:
CONFIG_DEBUG_STACKOVERFLOW / CONFIG_DEBUG_STACK_USAGE:
Track down kernel stack overflows.
CONFIG_KALLSYMS (in "General setup / Standard features"):
Include kernel symbol table into kernel image.
Otherwise oopses are in hex.
CONFIG_IKCONFIG / CONFIG_IKCONFIG_PROC (in "General setup"):
Include kernel configuration into kernel image and make it available through /proc.
CONFIG_ACPI_DEBUG (in "Power management / ACPI"):
Verbose ACPI debug info.
CONFIG_DEBUG_DRIVER:
Turn on debug info in driver core.
CONFIG_SCSI_CONSTANTS (in "Device drivers / SCSI device support"):
Verbose SCSI debug info.
CONFIG_INPUT_EVBUG:
CONFIG_PROFILING (in "Profiling support"):
2. Manual techniques
printk
printk("Detected error 0x%x on interface %d:%d\n", error_code, iface_bus, iface_id);
/proc
Main functions: include/linux/proc_fs.h
struct proc_dir_entry *create_proc_read_entry(const char *name, mode_t mode, struct proc_dir_entry *base, read_proc_t *read_proc, void *data);
void remove_proc_entry(const char *name, struct proc_dir_entry *parent);
read_proc is a callback:
typedef int (read_proc_t)(char *page, char **start, off_t off, int count, int *eof, void *data);
Write data to page
Careful: can only fill 1 page at a time (4K)
Use *start to tell OS that you have more than a page.
Your function will then be called more than once with a different offset.
ALWAYS return the size you wrote. If you don't, nothing will be displayed.
You can add your own /proc tree if you want...
ioctl:
Method implemented in most device driver models (char, block, net, fb, etc.) allowing custom functionality to be coded in driver...
Can extend your driver's ioctl to allow a user-space application to poll or change the driver's state outside of the OS' control.
oops messages:
Report printed out by kernel regarding internal error that can't be handled.
Sometimes last output before system freeze => must be copied by hand.
Contains addresses and references to function addresses which can be understood by looking at System.map
Can be automatically decoded with klogd/ksymoops
Unable to handle kernel paging request at virtual address 0007007a
printing eip:
c022a8f6
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c022a8f6>] Not tainted
EFLAGS: 00010202
eax: 0000000a ebx: 00000004 ecx: 00000001 edx: e3a74b80
esi: 0007007a edi: e62150fc ebp: 0007007a esp: dfbdbc8c
ds: 0018 es: 0018 ss: 0018
Process ip (pid: 2128, stackpage=dfbdb000)
Stack: 00000000 bfff0018 00000018 0000000a 00000000 e41e8c00 c02f5acd e3a74b80
       c022aec1 e3a74b80 0000000a 00000004 0007007a c02f5ac8 00000e94 00000246
       e62150b4 00000000 e41e8c00 00000001 00000000 e5365a00 c022b039 e3a74b80
Call Trace: [<c022aec1>] [<c022b039>] [<c022df21>] [<c022e1d0>] [<c022b6ea>]
   [<c022afb0>] [<c022b1d0>] [<c0176f7e>] [<c022b2a0>] [<c022ddba>]
   [<c022d623>] [<c022db41>] [<c021c8f5>] [<c021daf3>] [<c0118238>]
   [<c021d44d>] [<c021e459>] [<c0107800>] [<c010770f>]
Code: f3 a5 f6 c3 02 74 02 66 a5 f6 c3 01 74 01 a4 8b 5c 24 10 8b
For good measure, always save the content of /proc/kallsyms before generating an oops:
$ cat /proc/kallsyms > /tmp/kallsyms-dump
$ sync
User-mode Linux: http://usermodelinux.sf.net/
Running user-mode Linux
$ ./linux
Checking for /proc/mm...not found
tracing thread pid = 14932
Linux version 2.6.11 (karim@localhost.localdomain) ...
On node 0 totalpages: 8192
zone(0): 8192 pages.
zone(1): 0 pages.
zone(2): 0 pages.
Kernel command line: root=/dev/ubd0
Calibrating delay loop... 2617.44 BogoMIPS
Memory: 29480k available
...
Initializing software serial port version 1
mconsole (version 2) initialized on /home/karim/.uml/...
unable to open root_fs for validation
UML Audio Relay (host dsp = /dev/sound/dsp, host mixer ...
Initializing stdio console driver
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP: Hash tables configured (established 2048 bind 4096)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
VFS: Cannot open root device "ubd0" or 62:00
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 62:00
3. Debugging tools
gdb: all archs
Can use standard gdb to visualize kernel variables:
$ gdb ./vmlinux /proc/kcore
kdb: x86/ia64
IKD: Integrated Kernel Debugger
kgdb: Full serial-line-based kernel debugger
Connect to remote target through gdb on host
Use gdb as you would for any other remote program
LKCD: Linux Kernel Crash Dump
4. Performance measurement
LMbench:
kernprof:
Integrated sample-based profiler:
Measuring interrupt latency:
Self-contained
Induced
Self-contained:
Interrupt handling function:
Induced:
Linux is not an RTOS...
5. Hardware tools
JTAG tool
http://openwince.sourceforge.net/jtag/
Appendix B. Kernel data types
1. Use of standard C types 2. Assigning an explicit size to data types 3. Interface-specific types 4. Other portability issues 5. Linked lists
1. Use of standard C types
2. Assigning an explicit size to data types
<linux/types.h>
Unsigned types:
Signed types (rarely used):
s64
For headers exported to user space, use these instead (no POSIX namespace pollution):
If you want to be C99-compliant (and the compiler supports it):
uint8_t, uint16_t, uint32_t, uint64_t
3. Interface-specific types
Commonly used data types are usually typedef'ed in the kernel
Recently, typedefing has lost its appeal with kernel developers (opaque types)
Many "_t" types defined in <linux/types.h> (size_t, pid_t, etc.)
No problem when used in code: highly portable
Problem when printing values out for debugging (usually such values need not be printed.)
To print, cast to largest possible type for typedefed "_t"
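The "cast to the largest possible type" advice can be shown with a userspace sketch; the same pattern applies to printk() in a driver. The function name format_ids() is hypothetical, and the choice of long / unsigned long as the "largest" types is an assumption that holds for pid_t and size_t on the common Linux ABIs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * The width of pid_t and size_t varies between architectures, so pass
 * them to the format string through an explicit cast to a type whose
 * conversion specifier is known.
 */
static int format_ids(char *buf, size_t buf_size, pid_t pid, size_t size)
{
    assert(buf != NULL);
    return snprintf(buf, buf_size, "pid=%ld size=%lu",
                    (long)pid, (unsigned long)size);
}
```

Without the casts, a "%d" that happens to work on one architecture can print garbage or trigger compiler warnings on another.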
4. Other portability issues
Avoid explicit constants: use #defines instead
Time intervals:
Page size:
Byte order:
etc.
Variants with appended "s" for signed or "p" for pointer
Data alignment:
Pointers and error values:
retval not always NULL on failure
Returning an error as a pointer value:
void *ERR_PTR(long error);
Determining if pointer returned is error:
long IS_ERR(const void *ptr);
Retrieving error from ptr (after IS_ERR()):
long PTR_ERR(const void *ptr);
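The idea behind these helpers is that the last few thousand addresses of the address space are never valid pointers, so a small negative errno can be smuggled through a pointer return value. Below is a simplified userspace model; the toy_* names, the 4095 limit and toy_lookup() are assumptions made for illustration and are not the kernel's exact definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095 /* assumed upper bound on errno values */

/* Encode a negative errno as a pointer value. */
static void *toy_err_ptr(long error)
{
    assert(error < 0 && error >= -TOY_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Recover the errno from an error pointer (only valid after toy_is_err()). */
static long toy_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top TOY_MAX_ERRNO bytes of the address space. */
static int toy_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Hypothetical lookup that returns either a valid pointer or an encoded error. */
static int toy_table[2] = { 10, 20 };

static void *toy_lookup(int id)
{
    if (id < 0 || id >= 2)
        return toy_err_ptr(-ENOENT);
    return &toy_table[id];
}
```

Callers test the result with toy_is_err() first and only then either dereference it or extract the error code with toy_ptr_err(). Note that NULL is not an error pointer under this scheme.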
5. Linked lists
Include "struct list_head" as part of custom structs:
struct my_struct {
        struct list_head list;
        /* my stuff ... */
};
Static initialization:
Dynamic initialization:
list_add(struct list_head *new, struct list_head *head);
Add entry to list right after head
Could pass list entry instead of real head
list_add_tail(struct list_head *new, struct list_head *head);
Add to end of list
list_del(struct list_head *entry);
Remove from list
list_del_init(struct list_head *entry);
Remove from list and reinit pointers
list_move(struct list_head *entry, struct list_head *head);
For removing and inserting in other lists
Move entry to beginning
list_move_tail(struct list_head *entry, struct list_head *head);
Move entry to end
list_empty(struct list_head *head);
retval is nonzero if empty
list_splice(struct list_head *list, struct list_head *head);
Insert a list in other list
list_entry(struct list_head *ptr, type_of_struct, field_name);
Macros for traversing lists:
list_for_each(struct list_head *cursor, struct list_head *list)
for() loop executed once for each list entry
Do not modify list in loop
list_for_each_prev(struct list_head *cursor, struct list_head *list)
Same as list_for_each() but in reverse
list_for_each_safe(struct list_head *cursor, struct list_head *next, struct list_head *list)
Same as list_for_each() but saves next entry in list in case current entry is removed.
Other type of list defined "hlist" => same, but head has only got one pointer.
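The embedded-list pattern can be tried out in userspace with a stripped-down reimplementation. The sketch below is modelled on <linux/list.h> but is not the kernel header: it covers only a handful of the operations listed above, the "value" field is a hypothetical payload, and sum_values() / first_value() are illustrative helpers.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Recover the enclosing struct from a pointer to its embedded list_head. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

/* Link "entry" between two known neighbours. */
static void toy_insert(struct list_head *entry, struct list_head *prev, struct list_head *next)
{
    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
}

static void list_add(struct list_head *entry, struct list_head *head) { toy_insert(entry, head, head->next); }
static void list_add_tail(struct list_head *entry, struct list_head *head) { toy_insert(entry, head->prev, head); }

static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = NULL;
}

struct my_struct {
    struct list_head list;
    int value; /* hypothetical payload */
};

/* Walk the list and add up every payload. */
static int sum_values(struct list_head *head)
{
    struct list_head *cursor;
    int sum = 0;

    assert(head != NULL);
    list_for_each(cursor, head)
        sum += list_entry(cursor, struct my_struct, list)->value;
    return sum;
}

/* Payload of the first entry; the list must not be empty. */
static int first_value(struct list_head *head)
{
    assert(!list_empty(head));
    return list_entry(head->next, struct my_struct, list)->value;
}
```

Because the list_head lives inside the payload struct, one struct can sit on several lists at once simply by embedding several list_head members.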
Appendix C. Kernel integration
1. Kernel layout: Where are the drivers? 2. Kernel build system 3. Kernel config system 4. Adding a driver to the kernel sources 5. Creating patches 6. Distributing work and interfacing with the community
1. Kernel layout: Where are the drivers?
[Figure: kernel source layout, from Applications at the top, through the Kernel, down to the hardware (CPU, Basic Hardware, Main Memory, NIC, HD). Paths labeled in the figure: arch/ARCH/kernel/entry.S; kernel/* (capability.c, sched.c, fork.c, exit.c, sys.c, softirq.c, panic.c, ...); kernel/signal.c; mm/* and arch/ARCH/mm/; fs/pipe.c, fs/fifo.c, ipc/*; fs/* and fs/*/*; net/* and drivers/net/*; drivers/block/* and drivers/char/*; arch/ARCH/*; arch/ARCH/kernel: irq.c, traps.c]
arch/            45MB  => architecture-dependent functionality
Documentation/   8MB   => main kernel documentation
drivers/         100MB => all drivers
fs/              19MB  => virtual filesystem and all fs types
include/         32MB  => complete kernel headers
init/            108KB => kernel startup code
ipc/             172KB => System V IPC
kernel/          1.0MB => core kernel code
mm/              816KB => memory management
net/             10MB  => networking core and protocols
scripts/         1.1MB => scripts used to build kernel
drivers/
acorn       acpi        atm         base        block       bluetooth
cdrom       char        cpufreq     crypto      dio         eisa
fc4         firmware    i2c         ide         ieee1394    infiniband
input       isdn        macintosh   mca         md          media
message     misc        mmc         mtd         net         nubus
oprofile    parisc      parport     pci         pcmcia      pnp
s390        sbus        scsi        serial      tc          telephony
usb         video       w1          zorro
2. Kernel build system
drivers/Makefile
#
# Makefile for the Linux kernel device drivers.
#
# 15 Sep 2000, Christoph Hellwig <hch@infradead.org>
# Rewritten to use lists instead of if-statements.
#

obj-$(CONFIG_PCI)               += pci/
obj-$(CONFIG_PARISC)            += parisc/
obj-y                           += video/
obj-$(CONFIG_ACPI_BOOT)         += acpi/
# PnP must come after ACPI since it will eventually need to check if acpi
# was used and do nothing if so
obj-$(CONFIG_PNP)               += pnp/

# char/ comes before serial/ etc so that the VT console is the boot-time
# default.
obj-y                           += char/

# i810fb and intelfb depend on char/agp/
obj-$(CONFIG_FB_I810)           += video/i810/
obj-$(CONFIG_FB_INTEL)          += video/intelfb/

# we also need input/serio early so serio bus is initialized by the time
# serial drivers start registering their serio ports
obj-$(CONFIG_SERIO)             += input/serio/
obj-y                           += serial/
obj-$(CONFIG_PARPORT)           += parport/
obj-y                           += base/ block/ misc/ net/ media/
obj-$(CONFIG_NUBUS)             += nubus/
obj-$(CONFIG_ATM)               += atm/
obj-$(CONFIG_PPC_PMAC)          += macintosh/
obj-$(CONFIG_IDE)               += ide/
drivers/char/Makefile
#
# Makefile for the kernel character device drivers.
#

#
# This file contains the font map for the default (hardware) font
#
FONTMAPFILE = cp437.uni

obj-y                           += mem.o random.o tty_io.o n_tty.o tty_ioctl.o

obj-$(CONFIG_LEGACY_PTYS)       += pty.o
obj-$(CONFIG_UNIX98_PTYS)       += pty.o
obj-y                           += misc.o
obj-$(CONFIG_VT)                += vt_ioctl.o vc_screen.o consolemap.o \
                                   consolemap_deftbl.o selection.o keyboard.o
obj-$(CONFIG_HW_CONSOLE)        += vt.o defkeymap.o
obj-$(CONFIG_MAGIC_SYSRQ)       += sysrq.o
obj-$(CONFIG_ESPSERIAL)         += esp.o
obj-$(CONFIG_MVME147_SCC)       += generic_serial.o vme_scc.o
obj-$(CONFIG_MVME162_SCC)       += generic_serial.o vme_scc.o
obj-$(CONFIG_BVME6000_SCC)      += generic_serial.o vme_scc.o
obj-$(CONFIG_ROCKETPORT)        += rocket.o
obj-$(CONFIG_SERIAL167)         += serial167.o
obj-$(CONFIG_CYCLADES)          += cyclades.o
obj-$(CONFIG_STALLION)          += stallion.o
obj-$(CONFIG_ISTALLION)         += istallion.o
obj-$(CONFIG_DIGIEPCA)          += epca.o
obj-$(CONFIG_SPECIALIX)         += specialix.o
obj-$(CONFIG_MOXA_INTELLIO)     += moxa.o
obj-$(CONFIG_A2232)             += ser_a2232.o generic_serial.o
obj-$(CONFIG_ATARI_DSP56K)      += dsp56k.o
...
For full details:
Documentation/kbuild/makefiles.txt
3. Kernel config system
drivers/Kconfig
# drivers/Kconfig

menu "Device Drivers"

source "drivers/base/Kconfig"
source "drivers/mtd/Kconfig"
source "drivers/parport/Kconfig"
source "drivers/pnp/Kconfig"
source "drivers/block/Kconfig"
source "drivers/ide/Kconfig"
source "drivers/scsi/Kconfig"
source "drivers/cdrom/Kconfig"
source "drivers/md/Kconfig"
source "drivers/message/fusion/Kconfig"
source "drivers/ieee1394/Kconfig"
source "drivers/message/i2o/Kconfig"
source "drivers/macintosh/Kconfig"
source "net/Kconfig"
source "drivers/isdn/Kconfig"
drivers/char/Kconfig
#
# Character device configuration
#

menu "Character devices"

config VT
        bool "Virtual terminal" if EMBEDDED
        select INPUT
        default y if !VIOCONS
        ---help---
          If you say Y here, you will get support for terminal devices with
          ...

config VT_CONSOLE
        bool "Support for console on virtual terminal" if EMBEDDED
        depends on VT
        default y
        ---help---
          The system console is the device which receives all kernel messages
          ...

config HW_CONSOLE
        bool
        depends on VT && !S390 && !UML
        default y

config SERIAL_NONSTANDARD
        bool "Non-standard serial port support"
        ---help---
          Say Y here if you have any non-standard serial boards -- boards
          ...
For full details:
Documentation/kbuild/kconfig-language.txt
4. Adding a driver to the kernel sources
5. Creating patches
Patch basics
Analyzing a patch's content:
$ diffstat -p1 my_patch
Testing a patch before applying it:
$ cp my_patch ${PRJROOT}/kernel/linux-2.6.11
$ cd ${PRJROOT}/kernel/linux-2.6.11
$ patch --dry-run -p1 < my_patch
Applying patches:
$ patch -p1 < my_patch
6. Distributing work and interfacing with the community
Create project web page (possibly on SourceForge)
Post patches to LKML
Title:
[PATCH] n/x
[PATCH/RFC] n/x
Signed-off-by: Your Name <your@mail.tld>
Integrate community feedback
Continue posting updated patches
Appendix D. Porting the kernel
1. Per-arch kernel layout
2. Kernel startup
3. Key definitions
4. Interfacing between bootloader and OS
1. Per-arch kernel layout
$ ll arch/ppc
total 104
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 4xx_io
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 8260_io
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 8xx_io
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 amiga
drwxr-xr-x 10 karim karim  4096 Jun 17 15:48 boot
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 configs
-rw-r--r--  1 karim karim 36331 Jun 17 15:48 Kconfig
-rw-r--r--  1 karim karim  1747 Jun 17 15:48 Kconfig.debug
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 kernel
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 lib
-rw-r--r--  1 karim karim  4475 Jun 17 15:48 Makefile
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 math-emu
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mm
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 oprofile
drwxr-xr-x  5 karim karim  4096 Jun 17 15:48 platforms
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 syslib
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 xmon
$ ll arch/mips/
total 220
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 arc
drwxr-xr-x 12 karim karim  4096 Jun 17 15:48 au1000
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 boot
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 cobalt
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 configs
drwxr-xr-x  6 karim karim  4096 Jun 17 15:48 ddb5xxx
drwxr-xr-x  4 karim karim  4096 Jun 17 15:48 dec
-rw-r--r--  1 karim karim 18490 Jun 17 15:48 defconfig
drwxr-xr-x  3 karim karim  4096 Jun 17 15:48 galileo-boards
drwxr-xr-x  5 karim karim  4096 Jun 17 15:48 gt64120
drwxr-xr-x  5 karim karim  4096 Jun 17 15:48 ite-boards
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 jazz
drwxr-xr-x  4 karim karim  4096 Jun 17 15:48 jmr3927
-rw-r--r--  1 karim karim 42476 Jun 17 15:48 Kconfig
-rw-r--r--  1 karim karim  2564 Jun 17 15:48 Kconfig.debug
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 kernel
drwxr-xr-x  3 karim karim  4096 Jun 17 15:48 lasat
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 lib
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 lib-32
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 lib-64
-rw-r--r--  1 karim karim 21398 Jun 17 15:48 Makefile
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 math-emu
drwxr-xr-x  6 karim karim  4096 Jun 17 15:48 mips-boards
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mm
drwxr-xr-x  6 karim karim  4096 Jun 17 15:48 momentum
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 oprofile
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 pci
drwxr-xr-x  3 karim karim  4096 Jun 17 15:48 pmc-sierra
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 sgi-ip22
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 sgi-ip27
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 sgi-ip32
drwxr-xr-x  5 karim karim  4096 Jun 17 15:48 sibyte
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 sni
drwxr-xr-x  4 karim karim  4096 Jun 17 15:48 tx4927
$ ll arch/arm/
total 156
drwxr-xr-x  4 karim karim  4096 Jun 17 15:48 boot
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 common
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 configs
-rw-r--r--  1 karim karim 21624 Jun 17 15:48 Kconfig
-rw-r--r--  1 karim karim  3847 Jun 17 15:48 Kconfig.debug
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 kernel
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 lib
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-clps711x
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-clps7500
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ebsa110
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-epxa10db
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-footbridge
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-h720x
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-imx
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-integrator
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-iop3xx
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ixp2000
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ixp4xx
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-l7200
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-lh7a40x
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-omap
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-pxa
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-rpc
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-s3c2410
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-sa1100
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-shark
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-versatile
-rw-r--r--  1 karim karim  7636 Jun 17 15:48 Makefile
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mm
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 nwfpe
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 oprofile
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 tools
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 vfp
$ ll arch/i386/
total 140
drwxr-xr-x  4 karim karim  4096 Jun 17 15:48 boot
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 crypto
-rw-r--r--  1 karim karim 26742 Jun 17 15:48 defconfig
-rw-r--r--  1 karim karim 42227 Jun 17 15:48 Kconfig
-rw-r--r--  1 karim karim  2255 Jun 17 15:48 Kconfig.debug
drwxr-xr-x  5 karim karim  4096 Jun 17 15:48 kernel
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 lib
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-default
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-es7000
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-generic
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-visws
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-voyager
-rw-r--r--  1 karim karim  6341 Jun 17 15:48 Makefile
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 math-emu
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mm
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 oprofile
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 pci
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 power
2. Kernel startup
Explanation for TQM860 PPC board
0. Kernel entry point:
arch/ppc/boot/common/crt0.S:_start
1. _start calls on:
arch/ppc/boot/simple/head.S:start
2. start calls on:
arch/ppc/boot/simple/relocate.S:relocate
3. relocate calls on:
arch/ppc/boot/simple/misc-embedded.c:load_kernel()
4. load_kernel() initializes the serial line and uncompresses the kernel starting at address 0.
=>
loops_per_jiffy
11. rest_init() does:
    1. Starts init thread
    2. Unlocks the kernel
    3. Becomes the idle task
12. The init task:
    1. lock_kernel()
    2. do_basic_setup() => call various init() functions
    3. prepare_namespace() => mount rootfs
    4. free_initmem()
    5. unlock_kernel()
    6. execve() on the init program (/sbin/init)
3. Key definitions
$ ll include/
total 148
...
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-alpha
drwxr-xr-x 24 karim karim  4096 Jun 17 15:48 asm-arm
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-arm26
drwxr-xr-x  3 karim karim  4096 Jun 17 15:48 asm-cris
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-frv
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-generic
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-h8300
drwxr-xr-x 10 karim karim  4096 Jun 17 15:48 asm-i386
drwxr-xr-x  3 karim karim  4096 Jun 17 15:48 asm-ia64
drwxr-xr-x  5 karim karim  4096 Jun 17 15:48 asm-m32r
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-m68k
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-m68knommu
drwxr-xr-x 45 karim karim  4096 Jun 17 15:48 asm-mips
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-parisc
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-ppc
drwxr-xr-x  3 karim karim  4096 Jun 17 15:48 asm-ppc64
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-s390
drwxr-xr-x 31 karim karim  4096 Jun 17 15:48 asm-sh
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-sh64
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-sparc
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-sparc64
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-um
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-v850
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 asm-x86_64
drwxr-xr-x 18 karim karim 12288 Jun 17 15:48 linux
...
$ ll include/asm-mips/
...
-rw-r--r--  1 karim karim   519 Jun 17 15:48 m48t35.h
-rw-r--r--  1 karim karim   696 Jun 17 15:48 m48t37.h
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-atlas
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-au1x00
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-db1x00
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ddb5074
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-dec
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ev64120
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ev96100
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-generic
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ip22
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ip27
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ip32
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ja
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-jazz
drwxr-xr-x  3 karim karim  4096 Jun 17 15:48 mach-jmr3927
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-lasat
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-mips
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ocelot
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-ocelot3
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-pb1x00
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-rm200
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-sibyte
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-vr41xx
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mach-yosemite
-rw-r--r--  1 karim karim  1608 Jun 17 15:48 marvell.h
-rw-r--r--  1 karim karim   450 Jun 17 15:48 mc146818rtc.h
-rw-r--r--  1 karim karim  4189 Jun 17 15:48 mc146818-time.h
drwxr-xr-x  2 karim karim  4096 Jun 17 15:48 mips-boards
...
$ ll include/asm-mips/vr41xx/
total 52
-rw-r--r--  1 karim karim  1489 Jun 17 15:48 capcella.h
-rw-r--r--  1 karim karim  1856 Jun 17 15:48 cmbvr4133.h
-rw-r--r--  1 karim karim  1497 Jun 17 15:48 e55.h
-rw-r--r--  1 karim karim  1174 Jun 17 15:48 mpc30x.h
-rw-r--r--  1 karim karim  2559 Jun 17 15:48 pci.h
-rw-r--r--  1 karim karim  1417 Jun 17 15:48 siu.h
-rw-r--r--  1 karim karim  1411 Jun 17 15:48 tb0219.h
-rw-r--r--  1 karim karim  1439 Jun 17 15:48 tb0226.h
-rw-r--r--  1 karim karim  6151 Jun 17 15:48 vr41xx.h
-rw-r--r--  1 karim karim  6728 Jun 17 15:48 vrc4173.h
-rw-r--r--  1 karim karim  1492 Jun 17 15:48 workpad.h
4. Interfacing between bootloader and OS
Very bootloader dependent
Information sources: