locking.txt revision 1.1
                Device-mapper Locking architecture

Overview

There are two kinds of users of the device-mapper driver:
        a) Users who use disk drives
        b) Users who use the ioctl management interface

Management is done by the dm_dev_*_ioctl and dm_table_*_ioctl routines. There
are two major structures used in these routines and in device-mapper.

Table entry:

typedef struct dm_table_entry {
        struct dm_dev *dm_dev;          /* backlink */
        uint64_t start;
        uint64_t length;

        struct dm_target *target;       /* Link to table target. */
        void *target_config;            /* Target specific data. */
        SLIST_ENTRY(dm_table_entry) next;
} dm_table_entry_t;

This structure stores one target part of a dm device. Every device can have
more than one target mapping entry stored in a list. This structure describes
the mapping between logical and physical blocks in a dm device.

start   length  target  block device  offset
0       102400  linear  /dev/wd1a     384
102400  204800  linear  /dev/wd2a     384
204800  409600  linear  /dev/wd3a     384

Every device has at least two tables, ACTIVE and INACTIVE. Only the ACTIVE
table is used during IO. Every IO operation on a dm device has to walk through
the dm_table_entries list.
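
The list walk that every IO operation performs can be sketched in userspace C
as follows. This is only an illustration: entry_sketch, its backing field and
entry_lookup are hypothetical names, a plain next pointer replaces
SLIST_ENTRY, and the driver's own lookup code may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for dm_table_entry: only the extent and a name. */
struct entry_sketch {
        uint64_t start;               /* first logical block covered */
        uint64_t length;              /* number of blocks covered */
        const char *backing;          /* hypothetical backing device name */
        struct entry_sketch *next;    /* plain pointer instead of SLIST_ENTRY */
};

/*
 * Walk the list and return the entry whose extent contains blk,
 * or NULL when no entry maps that block.
 */
static const struct entry_sketch *
entry_lookup(const struct entry_sketch *head, uint64_t blk)
{
        const struct entry_sketch *e;

        for (e = head; e != NULL; e = e->next) {
                /* blk - e->start cannot wrap because blk >= e->start. */
                if (blk >= e->start && blk - e->start < e->length)
                        return e;
        }
        return NULL;
}
```

The walk is linear in the number of entries, which is why the list must not
change underneath an IO operation that is in the middle of it.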

Device entry:

typedef struct dm_dev {
        char name[DM_NAME_LEN];
        char uuid[DM_UUID_LEN];

        int minor;
        uint32_t flags;         /* store communication protocol flags */

        kmutex_t dev_mtx;       /* mutex for general device lock */
        kcondvar_t dev_cv;      /* cv for ioctl synchronisation */

        uint32_t event_nr;
        uint32_t ref_cnt;

        uint32_t dev_type;

        dm_table_head_t table_head;

        struct dm_dev_head upcalls;

        struct disklabel *dk_label;     /* Disklabel for this table. */

        TAILQ_ENTRY(dm_dev) next_upcall; /* List of mirrored, snapshotted devices. */

        TAILQ_ENTRY(dm_dev) next_devlist; /* Major device list. */
} dm_dev_t;

Every device created in device-mapper is represented by this structure.
All devices are stored in a list. Every ioctl routine has to work with this
structure.

                Locking in dm driver

Locking must be done in two ways: synchronisation between ioctl routines, and
synchronisation between IO operations and ioctl routines. Table entries are
read during IO and during some ioctl routines. Only a few routines manipulate
table lists.

Read access to table list:

dmsize
dmstrategy
dm_dev_status_ioctl
dm_table_info_ioctl
dm_table_deps_ioctl
dm_disk_ioctl -> DIOCCACHESYNC ioctl

Write access to table list:
dm_dev_remove_ioctl  -> Remove device from list; this routine has to
                        remove all tables.
dm_dev_resume_ioctl  -> Switch tables on suspended device, switch INACTIVE
                        and ACTIVE tables.
dm_table_clear_ioctl -> Remove INACTIVE table from table list.
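
The switch of INACTIVE and ACTIVE tables can be pictured as flipping an index
into a two-slot array, which is how dm_table_head keeps cur_active_table and
tables[2]. The sketch below is a simplified userspace model: head_sketch,
slot_of, switch_tables and the SK_* selector values are hypothetical, the
table payload is reduced to an int, and the wait for readers that the real
switch needs is left out.

```c
#include <stdint.h>

#define SK_TABLE_ACTIVE   0     /* hypothetical selector values */
#define SK_TABLE_INACTIVE 1

struct head_sketch {
        int cur_active;         /* index of the active slot, 0 or 1 */
        int tables[2];          /* stand-in payload, one value per slot */
};

/* Map a selector to a slot index; the inactive slot is always the other one. */
static int
slot_of(const struct head_sketch *h, uint8_t which)
{
        return which == SK_TABLE_ACTIVE ? h->cur_active : 1 - h->cur_active;
}

/* Swap roles of the two slots without copying any table data. */
static void
switch_tables(struct head_sketch *h)
{
        h->cur_active = 1 - h->cur_active;
}
```

Because only the index changes, the switch itself is cheap; the cost lies in
waiting until no reader still uses the old active table.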


Synchronisation between readers and writers in table list

I moved everything needed for table synchronisation to struct dm_table_head.

typedef struct dm_table_head {
        /* Current active table is selected with this. */
        int cur_active_table;
        struct dm_table tables[2];

        kmutex_t table_mtx;
        kcondvar_t table_cv;    /* IO waiting cv */

        uint32_t io_cnt;
} dm_table_head_t;

dm_table_head_t is used as the entry point for every dm_table synchronisation
routine.

Because every table user has to go through the table list head, I have
implemented these routines to manage access to table lists.

/*
 * Destroy all table data. This function can run only when there are no
 * readers on table lists.
 */
int dm_table_destroy(dm_table_head_t *, uint8_t);

/*
 * Return length of active table in device.
 */
uint64_t dm_table_size(dm_table_head_t *);

/*
 * Return current active table to caller, increment io_cnt reference counter.
 */
struct dm_table * dm_table_get_entry(dm_table_head_t *, uint8_t);

/*
 * Return > 0 if the table has at least one table entry (returns the number
 * of entries) and return 0 if it has none. The target count returned from
 * this function may no longer be accurate when the userspace caller receives
 * it (dm_dev_resume_ioctl can run after the return), therefore it is only
 * informative.
 */
int dm_table_get_target_count(dm_table_head_t *, uint8_t);

/*
 * Decrement io reference counter and wake up all callers, with table_head cv.
 */
void dm_table_release(dm_table_head_t *, uint8_t s);

/*
 * Switch table from inactive to active mode. Has to wait until io_cnt is 0.
 */
void dm_table_switch_tables(dm_table_head_t *);

/*
 * Initialize table_head structures. I'm trying to keep this structure as
 * opaque as possible.
 */
void dm_table_head_init(dm_table_head_t *);

/*
 * Destroy all variables in table_head.
 */
void dm_table_head_destroy(dm_table_head_t *);

Internal table synchronisation protocol

Readers:
dm_table_size
dm_table_get_target_count

Readers that hold the reference counter:
dm_table_get_entry
dm_table_release

Writers:
dm_table_destroy
dm_table_switch_tables

I use these routines to manage synchronisation on table lists. Every reader
uses the dm_table_busy routine to hold the reference counter during its work
and dm_table_unbusy to release it. Every writer has to wait until the
reference counter is 0, and only then can it work with the device. It sleeps
on head->table_cv while there are readers. dm_table_get_entry is special in
that it returns the table with the reference counter held. After
dm_table_get_entry, every caller must call dm_table_release when it no longer
needs the table.

/*
 * Function to increment table user reference counter. Return id
 * of table_id table.
 * DM_TABLE_ACTIVE will return active table id.
 * DM_TABLE_INACTIVE will return inactive table id.
 */
static int
dm_table_busy(dm_table_head_t *head, uint8_t table_id)
{
        uint8_t id;

        id = 0;

        mutex_enter(&head->table_mtx);

        if (table_id == DM_TABLE_ACTIVE)
                id = head->cur_active_table;
        else
                id = 1 - head->cur_active_table;

        head->io_cnt++;

        mutex_exit(&head->table_mtx);
        return id;
}

/*
 * Release the table reference and, if it was the last one, wake up all
 * waiters.
 */
static void
dm_table_unbusy(dm_table_head_t *head)
{
        KASSERT(head->io_cnt != 0);

        mutex_enter(&head->table_mtx);

        if (--head->io_cnt == 0)
                cv_broadcast(&head->table_cv);

        mutex_exit(&head->table_mtx);
}

Device-mapper synchronisation between ioctl routines


Every ioctl user has to find the dm device by name, uuid or minor number.
dm_dev_lookup is used for this. This routine returns the device with its
reference counter held.

void
dm_dev_busy(dm_dev_t *dmv)
{
        mutex_enter(&dmv->dev_mtx);
        dmv->ref_cnt++;
        mutex_exit(&dmv->dev_mtx);
}

void
dm_dev_unbusy(dm_dev_t *dmv)
{
        KASSERT(dmv->ref_cnt != 0);

        mutex_enter(&dmv->dev_mtx);
        if (--dmv->ref_cnt == 0)
                cv_broadcast(&dmv->dev_cv);
        mutex_exit(&dmv->dev_mtx);
}

Before returning, every ioctl routine must release the reference counter with
dm_dev_unbusy.
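
The rule that every ioctl routine releases its reference before returning is
easiest to follow with a single exit path. The following userspace sketch
shows one possible shape. All sketch_* names and the "vol0" device are
hypothetical stand-ins (sketch_lookup plays the role of dm_dev_lookup,
sketch_unbusy that of dm_dev_unbusy), and locking is omitted because the
model is single-threaded.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for dm_dev: only a name and the reference counter. */
struct dev_sketch {
        const char *name;
        unsigned int ref_cnt;
};

/* One hypothetical device standing in for the global device list. */
static struct dev_sketch sketch_dev = { "vol0", 0 };

/* Stand-in for dm_dev_lookup: returns the device with a reference held. */
static struct dev_sketch *
sketch_lookup(const char *name)
{
        if (name == NULL || strcmp(name, sketch_dev.name) != 0)
                return NULL;
        sketch_dev.ref_cnt++;   /* the kernel does this under dev_mtx */
        return &sketch_dev;
}

/* Stand-in for dm_dev_unbusy. */
static void
sketch_unbusy(struct dev_sketch *d)
{
        d->ref_cnt--;
}

/*
 * Shape of an ioctl routine: one lookup, then a single exit path that
 * drops the reference whether the work succeeded or failed.
 */
static int
sketch_ioctl(const char *name, int arg)
{
        struct dev_sketch *d;
        int error = 0;

        if ((d = sketch_lookup(name)) == NULL)
                return ENOENT;  /* nothing held, nothing to release */

        if (arg < 0) {
                error = EINVAL; /* failure still goes through out: */
                goto out;
        }

        /* ... the real work on the device would happen here ... */
out:
        sketch_unbusy(d);
        return error;
}
```

An early return after a successful lookup that skips the release would leave
ref_cnt above zero forever, and anything waiting for the counter to reach
zero would never wake up.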

The dm_dev_remove_ioctl routine has to remove the dm_dev from the global
device list and wait until all ioctl users of that dm_dev are gone.
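
The "unlink, then wait for users to leave" step can be modelled in userspace
with POSIX threads primitives in place of kmutex_t and kcondvar_t. This is a
sketch under assumptions, not the driver's code: the drain_* names are
hypothetical, the global device list is reduced to an on_list flag guarded by
the same mutex, and the remover is assumed to have dropped its own reference
before waiting.

```c
#include <pthread.h>
#include <stdbool.h>

struct drain_sketch {
        pthread_mutex_t mtx;    /* plays the role of dev_mtx */
        pthread_cond_t cv;      /* plays the role of dev_cv */
        unsigned int ref_cnt;   /* users currently holding the device */
        bool on_list;           /* still reachable through lookup? */
};

static void
drain_init(struct drain_sketch *d)
{
        pthread_mutex_init(&d->mtx, NULL);
        pthread_cond_init(&d->cv, NULL);
        d->ref_cnt = 0;
        d->on_list = true;
}

/* Take a reference; fails once the device has been unlinked. */
static bool
drain_busy(struct drain_sketch *d)
{
        bool ok;

        pthread_mutex_lock(&d->mtx);
        ok = d->on_list;
        if (ok)
                d->ref_cnt++;
        pthread_mutex_unlock(&d->mtx);
        return ok;
}

/* Drop a reference and wake the remover when the last user leaves. */
static void
drain_unbusy(struct drain_sketch *d)
{
        pthread_mutex_lock(&d->mtx);
        if (--d->ref_cnt == 0)
                pthread_cond_broadcast(&d->cv);
        pthread_mutex_unlock(&d->mtx);
}

/* Unlink first so no new user can appear, then wait for the old ones. */
static void
drain_remove(struct drain_sketch *d)
{
        pthread_mutex_lock(&d->mtx);
        d->on_list = false;
        while (d->ref_cnt != 0)
                pthread_cond_wait(&d->cv, &d->mtx);
        pthread_mutex_unlock(&d->mtx);
}
```

The order matters: if the remover waited first and unlinked afterwards, a new
user could find the device between the two steps and hold a reference to a
device that is about to be destroyed. The same wait-for-zero loop applies to
a table writer sleeping on table_cv until io_cnt reaches 0.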