
#22 — zfs-fuse unstable under load

State Resolved
Version:
Area User interface
Issue type Patch
Severity Important
Submitted by Seth Heeren
Submitted on Jan 25, 2010
Responsible Seth Heeren
Target release:
Last modified on May 22, 2010 by Seth Heeren
I will provide more details later. For now, refer to the group posting at http://groups.google.com/[…]/8401baca9693691a
Steps to reproduce:
There are many ways to trigger it, e.g.:

bonnie++ -d /tank &
sudo zpool iostat -v 1

Observe daemon crash or deadlock within 10s (on my system). Performance and resource usage may influence the timing.
Attached:
fuse-threading.patch — differences between files, 0Kb
Added by Seth Heeren on Jan 25, 2010 06:15 PM
Target release: 0.6.0 → 0.6.1
A fix is available. It will in all likelihood fix most of the symptoms listed. Of course, some instability may be unrelated and therefore unaffected.

# git clone http://git.zfs-fuse.net/official
# cd official/
# git checkout origin/critical

Please retest
Added by Seth Heeren on Jan 26, 2010 06:54 PM
Severity: Critical → Important
Emmanuel announced he has an independent fix in 67aca02ee

I confirm that it allows bonnie++ to run to completion while running 'zpool iostat -v 1' as well.

Ricardo Correia has confirmed that 67aca02ee is safe (and pointed out changes that will probably be needed to cater for the zero-copy txg optimization present in upstream).
Ricardo also validated the notion of making cur_fd thread-local.
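
To illustrate what a thread-local cur_fd buys us, here is a minimal sketch. It is not the zfs-fuse code. The int type, the worker function, struct worker_arg and check_thread_local_fd are all hypothetical names for this example; only the name cur_fd and the __thread storage class (a GCC/Clang extension) come from this ticket. Each thread stores a value in cur_fd, yields to the other threads, and reads the value back. With __thread every thread has its own copy, so no thread sees another thread's descriptor.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define NUM_WORKERS 4

/* Hypothetical stand-in for the daemon's cur_fd: one copy per thread.
 * The int type is an assumption made for this sketch. */
static __thread int cur_fd = -1;

struct worker_arg {
    int fd;       /* value this worker stores in its own cur_fd */
    int observed; /* value read back after yielding to other threads */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    assert(arg != NULL);

    cur_fd = arg->fd;            /* writes this thread's copy only */
    for (int i = 0; i < 1000; i++)
        sched_yield();           /* give the other workers a chance to run */
    arg->observed = cur_fd;      /* still our own value */
    return NULL;
}

/* Returns 0 if every worker read back its own value and the calling
 * thread's copy of cur_fd was left untouched, -1 otherwise. */
int check_thread_local_fd(void)
{
    pthread_t tid[NUM_WORKERS];
    struct worker_arg args[NUM_WORKERS];
    int started = 0;
    int ok = 1;

    for (int i = 0; i < NUM_WORKERS; i++) {
        args[i].fd = 100 + i;
        args[i].observed = -1;
        if (pthread_create(&tid[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    if (started != NUM_WORKERS)
        ok = 0;
    for (int i = 0; i < started; i++)
        if (args[i].observed != args[i].fd)
            ok = 0;
    if (cur_fd != -1)            /* the caller's copy is separate too */
        ok = 0;

    return ok ? 0 : -1;
}
```

Build with -pthread and call check_thread_local_fd() from any thread. If cur_fd were a plain global, the workers would overwrite each other's value and the read-back could return another thread's descriptor.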

I will retest the other instability symptoms with just 67aca02ee and see if they recur. If not, we can probably toss my patch.

I'm not sure why my patch (attached) fixed the same symptoms at all. I want to review the code to see whether we might still want it for generic reasons (e.g. in case we want to re-enable those uiocopy bits in zfs-fuse style).
Added by Seth Heeren on May 22, 2010 01:27 PM
Issue state: unconfirmed → resolved
Well tested.

0.6.9 will have multithreaded ioctl dispatch anyway, combining these ideas (__thread cur_fd is back).

Closing this because the tests pass.