| Devres - Managed Device Resource | 
 | ================================ | 
 |  | 
 | Tejun Heo	<teheo@suse.de> | 
 |  | 
 | First draft	10 January 2007 | 
 |  | 
 |  | 
 | 1. Intro			: Huh? Devres? | 
 | 2. Devres			: Devres in a nutshell | 
 | 3. Devres Group			: Group devres'es and release them together | 
 | 4. Details			: Life time rules, calling context, ... | 
 | 5. Overhead			: How much do we have to pay for this? | 
 | 6. List of managed interfaces	: Currently implemented managed interfaces | 
 |  | 
 |  | 
 |   1. Intro | 
 |   -------- | 
 |  | 
 | devres came up while trying to convert libata to use iomap.  Each | 
 | iomapped address should be kept and unmapped on driver detach.  For | 
 | example, a plain SFF ATA controller (that is, good old PCI IDE) in | 
 | native mode makes use of 5 PCI BARs and all of them should be | 
 | maintained. | 
 |  | 
 | As with many other device drivers, libata low level drivers have | 
 | sufficient bugs in ->remove and ->probe failure path.  Well, yes, | 
 | that's probably because libata low level driver developers are lazy | 
 | bunch, but aren't all low level driver developers?  After spending a | 
 | day fiddling with braindamaged hardware with no document or | 
 | braindamaged document, if it's finally working, well, it's working. | 
 |  | 
 | For one reason or another, low level drivers don't receive as much | 
 | attention or testing as core code, and bugs on driver detach or | 
 | initialization failure don't happen often enough to be noticeable. | 
 | Init failure path is worse because it's much less travelled while | 
 | needs to handle multiple entry points. | 
 |  | 
 | So, many low level drivers end up leaking resources on driver detach | 
 | and having half broken failure path implementation in ->probe() which | 
 | would leak resources or even cause oops when failure occurs.  iomap | 
 | adds more to this mix.  So do msi and msix. | 
 |  | 
 |  | 
 |   2. Devres | 
 |   --------- | 
 |  | 
 | devres is basically linked list of arbitrarily sized memory areas | 
 | associated with a struct device.  Each devres entry is associated with | 
 | a release function.  A devres can be released in several ways.  No | 
 | matter what, all devres entries are released on driver detach.  On | 
 | release, the associated release function is invoked and then the | 
 | devres entry is freed. | 
 |  | 
 | Managed interface is created for resources commonly used by device | 
 | drivers using devres.  For example, coherent DMA memory is acquired | 
 | using dma_alloc_coherent().  The managed version is called | 
 | dmam_alloc_coherent().  It is identical to dma_alloc_coherent() except | 
 | for the DMA memory allocated using it is managed and will be | 
 | automatically released on driver detach.  Implementation looks like | 
 | the following. | 
 |  | 
 |   struct dma_devres { | 
 | 	size_t		size; | 
 | 	void		*vaddr; | 
 | 	dma_addr_t	dma_handle; | 
 |   }; | 
 |  | 
 |   static void dmam_coherent_release(struct device *dev, void *res) | 
 |   { | 
 | 	struct dma_devres *this = res; | 
 |  | 
 | 	dma_free_coherent(dev, this->size, this->vaddr, this->dma_handle); | 
 |   } | 
 |  | 
 |   dmam_alloc_coherent(dev, size, dma_handle, gfp) | 
 |   { | 
 | 	struct dma_devres *dr; | 
 | 	void *vaddr; | 
 |  | 
 | 	dr = devres_alloc(dmam_coherent_release, sizeof(*dr), gfp); | 
 | 	... | 
 |  | 
 | 	/* alloc DMA memory as usual */ | 
 | 	vaddr = dma_alloc_coherent(...); | 
 | 	... | 
 |  | 
 | 	/* record size, vaddr, dma_handle in dr */ | 
 | 	dr->vaddr = vaddr; | 
 | 	... | 
 |  | 
 | 	devres_add(dev, dr); | 
 |  | 
 | 	return vaddr; | 
 |   } | 
 |  | 
 | If a driver uses dmam_alloc_coherent(), the area is guaranteed to be | 
 | freed whether initialization fails half-way or the device gets | 
 | detached.  If most resources are acquired using managed interface, a | 
 | driver can have much simpler init and exit code.  Init path basically | 
 | looks like the following. | 
 |  | 
 |   my_init_one() | 
 |   { | 
 | 	struct mydev *d; | 
 |  | 
 | 	d = devm_kzalloc(dev, sizeof(*d), GFP_KERNEL); | 
 | 	if (!d) | 
 | 		return -ENOMEM; | 
 |  | 
 | 	d->ring = dmam_alloc_coherent(...); | 
 | 	if (!d->ring) | 
 | 		return -ENOMEM; | 
 |  | 
 | 	if (check something) | 
 | 		return -EINVAL; | 
 | 	... | 
 |  | 
 | 	return register_to_upper_layer(d); | 
 |   } | 
 |  | 
 | And exit path, | 
 |  | 
 |   my_remove_one() | 
 |   { | 
 | 	unregister_from_upper_layer(d); | 
 | 	shutdown_my_hardware(); | 
 |   } | 
 |  | 
 | As shown above, low level drivers can be simplified a lot by using | 
 | devres.  Complexity is shifted from less maintained low level drivers | 
 | to better maintained higher layer.  Also, as init failure path is | 
 | shared with exit path, both can get more testing. | 
 |  | 
 |  | 
 |   3. Devres group | 
 |   --------------- | 
 |  | 
 | Devres entries can be grouped using devres group.  When a group is | 
 | released, all contained normal devres entries and properly nested | 
 | groups are released.  One usage is to rollback series of acquired | 
 | resources on failure.  For example, | 
 |  | 
 |   if (!devres_open_group(dev, NULL, GFP_KERNEL)) | 
 | 	return -ENOMEM; | 
 |  | 
 |   acquire A; | 
 |   if (failed) | 
 | 	goto err; | 
 |  | 
 |   acquire B; | 
 |   if (failed) | 
 | 	goto err; | 
 |   ... | 
 |  | 
 |   devres_remove_group(dev, NULL); | 
 |   return 0; | 
 |  | 
 |  err: | 
 |   devres_release_group(dev, NULL); | 
 |   return err_code; | 
 |  | 
 | As resource acquisition failure usually means probe failure, constructs | 
 | like above are usually useful in midlayer driver (e.g. libata core | 
 | layer) where interface function shouldn't have side effect on failure. | 
 | For LLDs, just returning error code suffices in most cases. | 
 |  | 
 | Each group is identified by void *id.  It can either be explicitly | 
 | specified by @id argument to devres_open_group() or automatically | 
 | created by passing NULL as @id as in the above example.  In both | 
 | cases, devres_open_group() returns the group's id.  The returned id | 
 | can be passed to other devres functions to select the target group. | 
 | If NULL is given to those functions, the latest open group is | 
 | selected. | 
 |  | 
 | For example, you can do something like the following. | 
 |  | 
 |   int my_midlayer_create_something() | 
 |   { | 
 | 	if (!devres_open_group(dev, my_midlayer_create_something, GFP_KERNEL)) | 
 | 		return -ENOMEM; | 
 |  | 
 | 	... | 
 |  | 
 | 	devres_close_group(dev, my_midlayer_create_something); | 
 | 	return 0; | 
 |   } | 
 |  | 
 |   void my_midlayer_destroy_something() | 
 |   { | 
 | 	devres_release_group(dev, my_midlayer_create_something); | 
 |   } | 
 |  | 
 |  | 
 |   4. Details | 
 |   ---------- | 
 |  | 
 | Lifetime of a devres entry begins on devres allocation and finishes | 
 | when it is released or destroyed (removed and freed) - no reference | 
 | counting. | 
 |  | 
 | devres core guarantees atomicity to all basic devres operations and | 
 | has support for single-instance devres types (atomic | 
 | lookup-and-add-if-not-found).  Other than that, synchronizing | 
 | concurrent accesses to allocated devres data is caller's | 
 | responsibility.  This is usually non-issue because bus ops and | 
 | resource allocations already do the job. | 
 |  | 
 | For an example of single-instance devres type, read pcim_iomap_table() | 
 | in lib/devres.c. | 
 |  | 
 | All devres interface functions can be called without context if the | 
 | right gfp mask is given. | 
 |  | 
 |  | 
 |   5. Overhead | 
 |   ----------- | 
 |  | 
 | Each devres bookkeeping info is allocated together with requested data | 
 | area.  With debug option turned off, bookkeeping info occupies 16 | 
 | bytes on 32bit machines and 24 bytes on 64bit (three pointers rounded | 
 | up to ull alignment).  If singly linked list is used, it can be | 
 | reduced to two pointers (8 bytes on 32bit, 16 bytes on 64bit). | 
 |  | 
 | Each devres group occupies 8 pointers.  It can be reduced to 6 if | 
 | singly linked list is used. | 
 |  | 
 | Memory space overhead on ahci controller with two ports is between 300 | 
 | and 400 bytes on 32bit machine after naive conversion (we can | 
 | certainly invest a bit more effort into libata core layer). | 
 |  | 
 |  | 
 |   6. List of managed interfaces | 
 |   ----------------------------- | 
 |  | 
 | CLOCK | 
 |   devm_clk_get() | 
 |   devm_clk_put() | 
 |  | 
 | DMA | 
 |   dmam_alloc_coherent() | 
 |   dmam_alloc_noncoherent() | 
 |   dmam_declare_coherent_memory() | 
 |   dmam_free_coherent() | 
 |   dmam_free_noncoherent() | 
 |   dmam_pool_create() | 
 |   dmam_pool_destroy() | 
 |  | 
 | GPIO | 
 |   devm_gpiod_get() | 
 |   devm_gpiod_get_index() | 
 |   devm_gpiod_get_index_optional() | 
 |   devm_gpiod_get_optional() | 
 |   devm_gpiod_put() | 
 |  | 
 | IIO | 
 |   devm_iio_device_alloc() | 
 |   devm_iio_device_free() | 
 |   devm_iio_device_register() | 
 |   devm_iio_device_unregister() | 
 |   devm_iio_kfifo_allocate() | 
 |   devm_iio_kfifo_free() | 
 |   devm_iio_trigger_alloc() | 
 |   devm_iio_trigger_free() | 
 |  | 
 | IO region | 
 |   devm_release_mem_region() | 
 |   devm_release_region() | 
 |   devm_release_resource() | 
 |   devm_request_mem_region() | 
 |   devm_request_region() | 
 |   devm_request_resource() | 
 |  | 
 | IOMAP | 
 |   devm_ioport_map() | 
 |   devm_ioport_unmap() | 
 |   devm_ioremap() | 
 |   devm_ioremap_nocache() | 
 |   devm_ioremap_wc() | 
 |   devm_ioremap_resource() : checks resource, requests memory region, ioremaps | 
 |   devm_iounmap() | 
 |   pcim_iomap() | 
 |   pcim_iomap_regions()	: do request_region() and iomap() on multiple BARs | 
 |   pcim_iomap_table()	: array of mapped addresses indexed by BAR | 
 |   pcim_iounmap() | 
 |  | 
 | IRQ | 
 |   devm_free_irq() | 
 |   devm_request_any_context_irq() | 
 |   devm_request_irq() | 
 |   devm_request_threaded_irq() | 
 |  | 
 | LED | 
 |   devm_led_classdev_register() | 
 |   devm_led_classdev_unregister() | 
 |  | 
 | MDIO | 
 |   devm_mdiobus_alloc() | 
 |   devm_mdiobus_alloc_size() | 
 |   devm_mdiobus_free() | 
 |  | 
 | MEM | 
 |   devm_free_pages() | 
 |   devm_get_free_pages() | 
 |   devm_kasprintf() | 
 |   devm_kcalloc() | 
 |   devm_kfree() | 
 |   devm_kmalloc() | 
 |   devm_kmalloc_array() | 
 |   devm_kmemdup() | 
 |   devm_kstrdup() | 
 |   devm_kvasprintf() | 
 |   devm_kzalloc() | 
 |  | 
 | PCI | 
 |   pcim_enable_device()	: after success, all PCI ops become managed | 
 |   pcim_pin_device()	: keep PCI device enabled after release | 
 |  | 
 | PHY | 
 |   devm_usb_get_phy() | 
 |   devm_usb_put_phy() | 
 |  | 
 | PINCTRL | 
 |   devm_pinctrl_get() | 
 |   devm_pinctrl_put() | 
 |  | 
 | PWM | 
 |   devm_pwm_get() | 
 |   devm_pwm_put() | 
 |  | 
 | REGULATOR | 
 |   devm_regulator_bulk_get() | 
 |   devm_regulator_get() | 
 |   devm_regulator_put() | 
 |   devm_regulator_register() | 
 |  | 
 | SLAVE DMA ENGINE | 
 |   devm_acpi_dma_controller_register() | 
 |  | 
 | SPI | 
 |   devm_spi_register_master() |