
RE: [draft] Hotswap and Linux

To: "'Johannes Erdfelt'" <jerdfelt@xxxxxxxxxxx>, devfs@xxxxxxxxxxx
Subject: RE: [draft] Hotswap and Linux
From: "Dunlap, Randy" <randy.dunlap@xxxxxxxxx>
Date: Tue, 27 Jun 2000 09:46:48 -0700
Sender: owner-devfs@xxxxxxxxxxx

Hi Johannes,

I'm replying to your original posting.
I know that Richard G. replied with some comments &
changes, but I haven't looked at his comments in any
detail, so some of mine may be repeats (or conflicts)
with his.  Lots of these changes are just typos.


> -----Original Message-----
> From: Johannes Erdfelt [mailto:jerdfelt@xxxxxxxxxxx]
> Sent: Wednesday, June 21, 2000 10:22 AM
> To: devfs@xxxxxxxxxxx
> Subject: [draft] Hotswap and Linux
> 
> 
...
> 
> 2. Hot-Swap Interfaces
> 
> A hot-swap interface is any interface to a computer which you can plug
INSERT:                                           into^
> and unplug devices while the machine is running. Insertion and removal
> does not necessarily need to happen while the machine is
...
> However, as is often the case with the computer industry, technologies
> blur and USB is being seen on servers and Hot-Swap PCI and 
> Hot-Swap SCSI
> technologies will eventually find their way to desktops. 
> PCMCIA, or more
> specifically CardBus, is essentially a form of Hot-Swap PCI. However,
> they are different enough to mention seperately.
CHANGE:                                separately.
> 
> Most of the interfaces are busses, allowing multiple devices to be
> connected. However, things like parport could also be considered a
> hot-swap interface.
> 
> 2.1 PCMCIA
> 
...
> 
> 2.2 SCSI
> 
> I'm not an expert on SCSI, but I know that SCSI has supported hot-swap
> devices for a while in some situations.
> 
> It is a higher level bus than the system bus.

ADD (with your choice of editing):
It is a separate (or secondary) bus that attaches to the
system bus.

> 
> 2.3 Hot-Swap PCI
> 
> Hot-Swap PCI is relatively recent which is mainly seen in servers. It
EDIT:            ^a               ^technology s/which/that/
> simply adds Hot-Swap capability to PCI (correct?)
> 
> PCMCIA (CardBus) and Hot-Swap PCI are very similar technologies.
ADD: except that PCI hot-plug is defined as an orderly removal
and insertion technology, with full hardware and software
approval of the slot power-off/on and possible driver
removal and reloading, while PCMCIA cards can be removed/inserted
by user actions without prior software approval or
coordination.

> 
> It is a part of the system bus.
> 
> 2.4 IEEE 1394
> 
> This is commonly called by it's trade names Firewire (Apple) 
EDIT:                          ^remove "'"
> or iLink (Sony).
> It is very similar to USB but is not as widely adopted.
> 
> It is a higher level bus than the system bus.
ADD (with your choice of editing):
It is a separate (or secondary) bus that attaches to the
system bus.

> 
> 2.5 USB
> 
> Universal Serial Bus was originally introduced a couple of 
> years ago and
EDIT: s/a couple of/several/
> has recently seen widespread adoption (iMac, etc).
EDIT: add to iMac: Sun, Intel platforms

> 
> It is a higher level bus than the system bus.
ADD (with your choice of editing):
It is a separate (or secondary) bus that attaches to the
system bus.

...
> 
> 3.1 Device Naming
> 
> Devices are named on a first come, first served basis. For 
EDIT:                    first-come, first-served
> instance, when
> the OS probes for SCSI devices, the first hard drive is 
> assigned major/minor
> 8/0. The second harddrive is assigned major/minor 8/16. 
EDIT:             hard drive
> Traditional device
> nodes under Linux for these devices are /dev/sda and /dev/sdb 
> respectively.
> 
> When a device can be inserted or removed randomly, as in the case of a
> hot-swap interface, the probing order becomes random. Since 
> Linux still
> uses a first come first served naming scheme, the device 
EDIT:    first-come, first-served
> nodes name can be essentially random.
EDIT: node names
> 
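ILLUSTRATION (optional, for section 3.1): a minimal Python sketch of
first-come, first-served naming. The function name and the sample
device labels are hypothetical; the step of 16 minors per disk is
an assumption read off the 8/0 and 8/16 example in the draft. It is
not kernel code, only a way to show that the node a disk receives
depends on probe order.

```python
SD_MAJOR = 8           # major number for SCSI disks, as in the draft's example
MINORS_PER_DISK = 16   # assumption: inferred from the 8/0 and 8/16 example


def assign_names(probe_order):
    """Return {disk: (node, major, minor)}, assigned strictly in probe order.

    Only handles up to 26 disks (sda..sdz); enough for illustration.
    """
    names = {}
    for index, disk in enumerate(probe_order):
        node = "/dev/sd" + chr(ord("a") + index)
        names[disk] = (node, SD_MAJOR, index * MINORS_PER_DISK)
    return names
```

Calling assign_names(["zip", "cdr"]) and then
assign_names(["cdr", "zip"]) gives the same physical disk two
different nodes, which is the problem the section describes.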
...
> 
> 3.2 Device Permissions
> 
> Linux and most every other Unix, stores permissions for a 
EDIT:                            ^remove comma
> device in the
> filesystem. Those are associated with the name of the device 
> (/dev/sda).
> 
> Since we cannot guarantee the same name each time a device is 
> inserted (see
> 3.1), we cannot guarantee the device has the correct or same 
> permissions.
> 
> Tracking devices should be done by tracking characteristics 
> of the device
EDIT:          ^add comma; maybe add examples: such as a
                serial number or disk label
> not the name of the device node corresponding to the device.
> 
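ILLUSTRATION (optional, for section 3.2): a minimal Python sketch of
tracking permissions by a device characteristic instead of by node
name. The class names, the fields, and the use of a serial number as
the key are all assumptions made for this example; the draft does
not specify a storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Perms:
    uid: int
    gid: int
    mode: int   # permission bits, e.g. 0o660


class PermissionStore:
    """Remembers permissions keyed by a stable device identifier."""

    def __init__(self):
        self._by_id = {}

    def save(self, device_id, perms):
        # device_id is a characteristic of the device (serial number,
        # disk label, ...), never the /dev node name.
        self._by_id[device_id] = perms

    def lookup(self, device_id, default):
        # Unknown devices fall back to a caller-supplied default.
        return self._by_id.get(device_id, default)
```

With this shape, a disk that shows up as /dev/sda on one insertion
and /dev/sdb on the next still gets the permissions saved under its
identifier.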
> 3.3 Enumeration and Driver Binding
> 
> Many of today's interfaces offer complicated device 
> structure. Some allow
EDIT:      ^add: 's' (such as PCI configuration space, USB
descriptors, and PCMCIA/CardBus Card Information Structure)
> multiple logical functions in one physical device, along with multiple
> interfaces of varying complexity.
> 
> Selecting the correct configuration parameters can be complicated. USB
> has the notion of configurations, interfaces, alternate settings and
> endpoints. Choosing which configuration or alternate setting 
> to use, or
> what drivers to bind to each interface is complicated and involves
EDIT:  driver
> user defined policy.
EDIT: ^user-defined
> 
> 3.4 Logical Function to Physical Device Association
> 
> Given a logical function of a device (say a SCSI drive, 
QUESTION:          ^name ?
> /dev/sdb) I cannot
EDIT:      ^insert comma
> determine what physical device it is.
> 
...
> 
> 3.5 Userspace API
> 
> Hot-Swap interfaces that are not system busses can often 
> export a userspace
EDIT:               ^should be hyphenated (userspace-visible)
> visible API which can move complex code out of the kernel 
> core and into
> userspace for a variety of reasons.
> 
...
> 
> 4. Existing Solutions
> 
> The problem of hot-swap has been tackled in the past at varying levels
> of complexity and varying levels of effectiveness.
> 
> 4.1 PCMCIA/Carbus
EDIT:        CardBus
> 
> PCMCIA (and Carbus, however I will use PCMCIA to cover both 
EDIT:         CardBus; however,
> technologies)
> has been supported by Linux for a couple of releases and has been the 
> mostly widely used hot-swap interface. It thusly has ran into many of
> these problems for a while and a series of solutions have 
> been developed
> with a varying degree of effectiveness.
> 
> The core of their solution is the Card Manager.
> 
> The PCMCIA card manager is notified by the kernel when a 
EDIT:        Card Manager
> device insertion
> or removal occurs on any PCMCIA bus. It obtains from the 
...
> 
> 4.1.2 PCMCIA and Device Permissions
> 
> The same shell scripts that configure the device, can be used 
EDIT:                                             ^delete comma
> to create and apply permissions to device nodes.
> 
> This has the same problem as Device Naming in that it requires hard
EDIT: change to: hard-coded
> coded configuration in shell scripts.
> 
...
> 
> 4.1.5 PCMCIA and Userspace API
> 
> Since PCMCIA is a system bus, it uses the same architecture defined
EDIT:                                            architecture-defined
> access to devices. This includes IRQ's, DMA channels, I/O 
> ports, memory
> ranges, etc. The existing userspace API's for this are used.
> 
> The existing API is limited to I/O port and memory access.
> 
...
> 
> 5.1 Complex Structure of USB Devices
> 
> This was touched on in section 3.3 (Enumeration and Driver Binding).
> 
> Each USB device can have multiple configurations. Only one 
> configuration
> can be active at once. Each configuration has multiple 
> interfaces. Each
> interface offers a logical function of the device. They are all active
> at once. Each interface has an alternate setting. Alternate settings
CHANGE (?): I'd say:  They can all be active simultaneously.
> select how much bandwidth a device uses, programming interface, etc.
EDIT:                        s/device/interface/
> Only one alternate setting can be active at a time per interface. Each
> alternate setting has 1 or more endpoints.
> 
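ILLUSTRATION (optional, for section 5.1): a minimal Python sketch of
the hierarchy described above (device, configurations, interfaces,
alternate settings, endpoints). All class and field names are
hypothetical and chosen only for this example; they are not the
names used by the Linux USB code or by the USB specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AltSetting:
    endpoints: List[int]      # endpoint addresses used by this setting


@dataclass
class Interface:
    alt_settings: List[AltSetting]
    active_alt: int = 0       # one alternate setting active per interface

    def active_endpoints(self):
        return self.alt_settings[self.active_alt].endpoints


@dataclass
class Configuration:
    interfaces: List[Interface]


@dataclass
class Device:
    configurations: List[Configuration]
    active_config: int = 0    # one configuration active at a time

    def active_interfaces(self):
        return self.configurations[self.active_config].interfaces
```

Changing active_config or active_alt changes how many interfaces and
endpoints are visible, which is why a single fixed device node does
not describe the device well.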
...
> 
> Since each interface is a logical function of the device, separate
> permissions are required. Thusly, multiple device nodes are required
EDIT:                       Thus,                            ^possibly
> for one device.
> 
> Tracking permissions for the device then entails tracking the device
> and the permissions for each interface.
> 
> A quick summary, one device node is not sufficient to 
EDIT:     summary: One device node generally is not...
> describe the device
> and the layout of the device can wildly change from device to 
> device and configuration to configuration.
> 
> 5.2 Userspace API
> 
...
> 
> The existing userspace solution overloads one device node with ioctl's
> for each transfer type and feature for all endpoints (and thusly
> interfaces). The existing implemented solution will not be the final
> solution since it has a variety of problems.
NEEDS TO BE SUBSTANTIATED:                   ^, some of which are....
> 
...
> 
> 6.1 Required Features for the solution
> 
> Some people have had the misconception that this locks 
> everyone into devfs
> to use USB. This is not the case. It locks us into a certain 
> feature set that must be supported.
AGREED.
> 
> In the case right now, devfs is the only solution which 
EDIT: The case right now is that devfs is...
> offers all of the infrastructure needed.
> 
> I cannot see how hot-swap can be implemented in a clean way 
> without these requirements.
> 
> 6.1.1 Dynamic creation of device nodes
> 
> Once configurations and alternate settings are selected, the number of
> interfaces, endpoints can radically change.
EDIT:      s/,/ &/
> 
> 6.1.2 No fixed limit on device nodes
> 
> The structure and layout of device nodes is sparse and complicated
EDIT:                                                              ^add comma
> quickly outgrowing the entire major/minor space as currently 
> implemented.
> 
> This is required to solve the problem described in section 4.2
EDIT: end with "4.2."
> 
> 6.1.3 VFS intercepts for common syscalls (chmod, chown, etc)
> 
> To properly track device permissions, we need to know when permissions
> are changed for device nodes so the permissions can be saved in a
> database along with other identifying information about the device.
> 
> This is required to solve the problem described in section 3.2
EDIT: end with "3.2."
> 
> 6.1.4 VFS intercepts to load modules on demand
> 
> Part of the design will track devices and their logical functions. The
> VFS can intercept open() calls for devices and load modules on demand.
> 
> 6.2 Goals
> 
> There were goals I strived to meet while I create this solutions
EDIT:                                        created this solution.
> 
> 6.2.1 Minimal code duplication
> 
...
> 
> This also minimizes the impact into unswappable kernel memory 
EDIT:                           s/into/to/
> and kernel image size.
> 
> 6.2.2 Intuitive and Clean Architecture
> 
> Some other solutions had been suggested to dynamically negotiate minor
> numbers between kernel and user space for device nodes. This 
> is not clean
> nor intuitive and is just a kludge. (Nor does this solution 
> meet all of the requirements)
EDIT:             requirements.)
> 
> The solution must be designed to stand up for the future.
> 
...
> 
> 6.4 Description of solution
> 
> The core of the solution is devfs and devfsd. Using devfs allows us to
> meet goals 6.2.1 (Code Duplication) and 6.2.2 (Intuitive and Clean
> Architecture) immediately.
> 
> devfs centralizes much of the common code, such as the kernel 
> to userspace
EDIT: kernel-to-userspace
> channel to communicate device insertions, removals, chmod 
> calls, etc. It
> also avoids creating extra VFS' for each hot-swap bus like usbdevfs is
> currently implemented.
> 
...
> 
> 6.4.1 USB and Logical Function to Physical Device Association
> 
> Using a design from David Hinds (maintainer of the Linux 
> PCMCIA code) I
> propose we create a generic ioctl() interface which can be used to
> retrieve physical device data for a given logical device. 
> This data will
EDIT:      s/will/will be/
> domain specific and specific to each hot-swap interface.
> 
> An alternative solution is to use devfs to create logical device nodes
> as a bus specific node, and create symlinks 
EDIT: as bus-specific nodes,
> (/dev/scsi/host0/bus0/target0
> -> /dev/usb/device1/scsi/target0 or something similar).
> 
...
> 
> 6.4.2 Processing on REGISTER events
> 
> When a REGISTER event occurs, the module obtains the 
QUESTION:                           ^What module?
> descriptors for the
> device and parses them, determining which configurations, 
> interfaces and
> alternate settings are available on the device. It then executes an
> arbitrary algorithm (not explained here for brevity) to select a
> configuration, then select alternate settings and drivers for each
> interface. The software then programs the active configuration and the
> active alternate setting for each interface.
QUESTION: But the to-be-loaded driver can modify the interface
selection?

> 
> If the driver must be loaded at insertion time, instead of at 
> use time,
> the driver will be loaded, binds the driver to the interfaces necessary
EDIT:                        bound to the interfaces necessary
> and activates the interface.
EDIT: and the interface activated.
> 
...
> 
> 7. Future work
> 
> The work I've done is not just applicable to USB. Many other 
> systems can use similar algorithms and code.
EDIT: s|systems|I/O subsystems|
> 
...
> 
> 7.2 SCSI
> 
> Many, if not all, modern SCSI devices have a unique identifiers which
EDIT:                                        ^delete 'a'
> could be used to devices. This could be used to track permissions on
EDIT:             ^insert: identify; s/This/These/
> SCSI generic devices (CD burners, scanners, etc) as well as 
> the logical function.
> 
...
> 
> 7.4 ALSA
> 
> A similar, but different, problem exists for ALSA. Most 
> devices supported
> under ALSA export many interfaces to control the device. To 
> appropriately
> solve this problem, the ALSA team overloads /proc with the 
EDIT: change to "To solve this problem appropriately,"
> device nodes it
> needs. Permissions aren't tracked for the device nodes.
> 
> 8. Closing Remarks
> 
> I expect tweaks to be made when I get input from the kernel developer
> community at large, including those other subsystem that may be
EDIT:                                       subsystems
> affected including PCMCIA, Hot-Swap PCI, IEEE1394 and SCSI.
> 
...
> 
> Also, please help clear up some of the terminology and descriptions. I
> admit my english and explanations can suck.
EDIT:      English
> 
> JE


I think that you are off to a very good start here.
Have you sent this to Johnny ? @mot.com, who did the
PCI hot-swap implementation last year?
And the 1394 people?

What are your next steps for this?

HTH.
~Randy
___________________________________________________
|Randy Dunlap     Intel Corp., DAL    Sr. SW Engr.|
|randy.dunlap.at.intel.com            503-696-2055|
|NOTE:  Any views presented here are mine alone   |
|and may not represent the views of my employer.  |
|_________________________________________________|

