#24 — Assertion `ddt_object_remove(ddt, otype, oclass, dde, tx) == 0` failed
| State | Resolved |
|---|---|
| Version: | - |
| Area | Functionality |
| Issue type | Bug |
| Severity | Medium |
| Submitted by | Bjoern Kahl |
| Submitted on | Feb 02, 2010 |
| Responsible | Seth Heeren |
| Target release: | 0.6.9 |
Last modified on May 22, 2010 by Seth Heeren
The zfs_fuse daemon dies with "lib/libzpool/ddt.c:958: ddt_sync_entry: Assertion `ddt_object_remove(ddt, otype, oclass, dde, tx) == 0` failed" on "zpool clear" after read/write errors due to faulty USB connections to the hard drives.
Prior to issuing "zpool clear", the pool status was:
----------------------
zpool status:
pool: myzfspool1
state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://www.sun.com/msg/ZFS-8000-JQ
scrub: scrub completed after 0h41m with 0 errors on Mon Feb 1 19:37:12 2010
config:
NAME STATE READ WRITE CKSUM
myzfspool1 ONLINE 55 112 0
/local/bj2/zfs/disk_devs/sda2 ONLINE 1 0 0
errors: 117 data errors, use '-v' for a list
-----------------------
This happened for the second time within a day. (And yes, I know I should replace my USB controller.)
System configuration:
Debian Sarge, with custom-compiled zfs_fuse.
uname -a : Linux athlon.local 2.6.18-4-486 #1 Wed May 9 22:23:40 UTC 2007 i686 GNU/Linux
-----------------------
cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 8
model name : AMD Athlon(tm)
stepping : 1
cpu MHz : 1164.553
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow ts
bogomips : 2330.93
-----------------------
fuse: debian package 2.7.4, compiled & installed from source (fuse_2.7.4-1.1.dsc) + kernel module from fuse-source_2.7.1-2~bpo40+1_all.deb
zfs_fuse is compiled from source, as per instructions on this site. I used "git clone http://git.zfs-fuse.net/official" from 2010-01-31
The whole thing is for test purposes; no production data is involved.
If you need more info, let me know. I can try an alternative version, run gdb against zfs_fuse, etc. As this is a test install, I don't care about the data in my zpool.
- Steps to reproduce:
Hard to say; however, at least on my system this happens reproducibly when the USB subsystem has a hiccup (tons of "resetting device xxx" followed by "end_request: I/O error ...") and zfs stops with read/write errors.
zpool failmode is set to "continue", dedup is set to verify, all the rest is at its default settings
zfs_fuse is started as "zfs-fuse -n -a 2 -e 2"
Added by Seth Heeren on Feb 03, 2010 09:20 AM
Thanks for including the high-quality details.
My gut-response would be to retest with upstream.
Zfs-fuse has _no special_ handling for dedup whatsoever (this is a dedup table assert; it seems it cannot remove an entry from a dedup table, and that is apparently not allowed). You could also see what happens when failmode is set to wait. Perhaps the daemon will be smart enough to keep trying.
Also note that DEDUP is quite unstable upstream (IMHO) and it should be expected that problems like this will take a while to be found, fixed and trickle back to zfs-fuse.
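For anyone wanting to try the failmode suggestion above, here is a minimal sketch of the retest. It assumes the pool name `myzfspool1` from the report; the `run` helper and the `DRY_RUN` variable are hypothetical conveniences, not part of zfs-fuse. By default it only prints the commands, and whether the daemon survives `zpool clear` with failmode set to wait is exactly what the retest would have to show.

```shell
#!/bin/sh
# Assumption: pool name as shown in the reporter's `zpool status` output.
POOL=myzfspool1

# Hypothetical helper: print each command, and execute it only when
# DRY_RUN=0 is set explicitly in the environment.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

run zpool get failmode "$POOL"       # the report had this set to "continue"
run zpool set failmode=wait "$POOL"  # block I/O on failure instead of returning errors
run zpool clear "$POOL"              # the step that triggered the assertion
run zpool status -v "$POOL"          # list the affected files afterwards
```

Running it as `DRY_RUN=0 sh retest.sh` would actually issue the commands; on a test pool with a flaky USB link, that should only be done with data you do not care about.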
Added by Seth Heeren on May 22, 2010 11:15 AM
Issue state: unconfirmed → resolved
Target release: None → 0.6.9
Responsible manager: (UNASSIGNED) → sgheeren
Closing; old report.
Many upstream dedup-related fixes landed in 0.6.9.
Please retest if desired.

