#42 — Daemon crashes on concurrent activity
| State | Resolved |
|---|---|
| Version | - |
| Area | Functionality |
| Issue type | Patch |
| Severity | Important |
| Submitted by | Seth Heeren |
| Submitted on | May 15, 2010 |
| Responsible | Seth Heeren |
| Target release | 0.6.9 |
Last modified on May 22, 2010 by Seth Heeren
The daemon crashes when userland actions are triggered in rapid succession.
$ zfs list -rHoname lin_recv | xargs -trn1 zfs set snapdir=hidden
[output ok]
$ zfs list -rHoname lin_recv | xargs -P8 -trn1 zfs set snapdir=hidden
[immediate crash of daemon]
When running the daemon under gdb, 'bad address' errors and SIGPIPE are observed. I recognize these symptoms from a badly shielded cur_fd. This should not happen. Next step:
$ for a in $(zfs list -rHoname lin_recv); do sleep .05; zfs set snapdir=hidden "$a"& done; wait
Works like a charm. The tiny sleep gives each thread time to initialize in isolation, which suggests to me that thread initialization is at fault. I have found the offending line:
cmdlistener.c:118 uses a static variable. That variable ends up shared with other threads that are initializing at the same time, which is not safe.
I attach the patch that closes the race condition (isolation violation) window. It has been pushed to testing as 0dc59d7a5cc20e1e0c1658ce07b6b84dd0d51460
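For readers unfamiliar with this class of bug, here is a minimal sketch of how a static handed to freshly created threads can race, and one common way to avoid it. This is not the code from cmdlistener.c and not the attached patch. The names `conn_arg`, `handle_conn` and `spawn_workers` are hypothetical, and the "racy pattern" comment is only an assumption about what such a bug typically looks like.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Racy pattern (shown for contrast only, an assumption about the bug class):
 *
 *     static int cur_fd;                 // one slot shared by every thread
 *     cur_fd = accept(...);
 *     pthread_create(&tid, NULL, handler, &cur_fd);
 *
 * If the listener accepts the next connection before the new thread has
 * read *arg, two threads end up using the same descriptor.
 */

/* Hypothetical per-thread argument: each worker owns its own copy. */
struct conn_arg {
    int fd;
};

/* Hypothetical worker: copies the fd out, frees the argument, reports the fd. */
static void *handle_conn(void *p)
{
    struct conn_arg *arg = p;
    int fd = arg->fd;
    free(arg);
    return (void *)(intptr_t)fd;
}

/*
 * Starts one worker per descriptor, giving each a private heap-allocated
 * argument instead of a pointer to shared static storage. Stores the fd
 * each worker saw in seen[] and returns how many matched fds[].
 */
int spawn_workers(const int *fds, int *seen, int n)
{
    if (n <= 0)
        return 0;

    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    assert(tids != NULL);

    for (int i = 0; i < n; i++) {
        struct conn_arg *arg = malloc(sizeof *arg);
        assert(arg != NULL);
        arg->fd = fds[i];               /* private copy, never overwritten */
        int rc = pthread_create(&tids[i], NULL, handle_conn, arg);
        assert(rc == 0);
        (void)rc;
    }

    int matched = 0;
    for (int i = 0; i < n; i++) {
        void *ret = NULL;
        pthread_join(tids[i], &ret);
        seen[i] = (int)(intptr_t)ret;
        if (seen[i] == fds[i])
            matched++;
    }

    free(tids);
    return matched;
}
```

Because every worker receives its own allocation, the result does not depend on how quickly the threads start, so no sleep between launches is needed.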
Steps to reproduce:
- Run multiple commands at the exact same time
Added by Seth Heeren on May 22, 2010 05:19 AM
Issue state: unconfirmed → resolved
Reviewed, closing

patch.diff
