#33 — zfs: lib/libavl/avl.c:637: avl_add: Assertion `0' failed.
| State | Resolved |
|---|---|
| Version: | 0.6.0 |
| Area | Functionality |
| Issue type | Bug |
| Severity | Medium |
| Submitted by | (anonymous) |
| Submitted on | Mar 14, 2010 |
| Responsible | Seth Heeren |
| Target release: | 0.6.0 |
Last modified on
May 22, 2010
by
Seth Heeren
I'm using the Ubuntu Lucid package, which is based on 0.6.0, but I'm not able to mount some ZFS datasets.
I upgraded from 0.5.1 to the 0.6.0 package that will be delivered with Lucid.
Some datasets get mounted, but if I try to mount two of the sets that did not get mounted automatically, I get the error:
zfs: lib/libavl/avl.c:637: avl_add: Assertion `0' failed.
I could not find anything about this on the web.
It was already failing before I started the scrub.
- Steps to reproduce:
- root@utgard-2:/# zpool status tank
pool: tank
state: ONLINE
scrub: scrub in progress for 0h38m, 15.15% done, 3h35m to go
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz1 ONLINE 0 0 0
sdb ONLINE 0 0 0
sdc ONLINE 0 0 0
sdd ONLINE 0 0 0
sde ONLINE 0 0 0
sdf ONLINE 0 0 0
errors: No known data errors
root@utgard-2:/# zfs list
NAME USED AVAIL REFER MOUNTPOINT
tank 3.72T 3.43T 2.69T /mnt/tank/
tank/GPS-DATEN 3.88G 3.43T 3.88G /mnt/tank//GPS-DATEN
tank/apps 26.4G 3.43T 26.4G /mnt/tank//apps
tank/games 152G 3.43T 152G /mnt/tank//games
tank/home/ftp 29.6K 3.43T 29.6K /mnt/tank//home/ftp
tank/home/lan 33.6K 3.43T 33.6K /mnt/tank//home/lan
tank/home/share 694K 3.43T 694K /mnt/tank//home/share
root@utgard-2:/# zfs list -r tank/backup
NAME USED AVAIL REFER MOUNTPOINT
tank/backup 766G 33.8G 52.4G /mnt/tank//backup
tank/backup/apple 714G 33.8G 714G /mnt/tank//backup/apple
root@utgard-2:/# zfs mount tank/backup
zfs: lib/libavl/avl.c:637: avl_add: Assertion `0' failed.
Aborted
root@utgard-2:/#
Added by
Seth Heeren
on
Mar 14, 2010 05:47 PM
Issue state:
unconfirmed → open
Responsible manager:
(UNASSIGNED) → sgheeren
I remember seeing a message like that mentioned a long time ago (over 12 months), I suppose on the Google group, back when we had no issue tracker. I particularly recall there being no resolution (perhaps because the problem went away?).
Anyway, it has recently been decided that running debug builds is bad practice. The assert firing and aborting the program indicates that you have a debug build (which is the common way to build from source).
If you grab a (very recent) source, you should find that debug=0 is the new default.
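To illustrate why the same code path aborts in a debug build but not in a non-debug one, here is a minimal sketch in C. It is not the libavl code: `toy_set_t`, `toy_add`, `TOY_ASSERT` and `TOY_DEBUG` are hypothetical names made up for this example, and `TOY_DEBUG` only stands in for whatever the build's debug switch actually defines. The message `Assertion `0' failed` suggests an unconditional assert on a path that should never be reached; as far as I know, in avl_add that path is taken when the node being added is already present in the tree, but treat that as an assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical debug switch; stands in for a build option such as debug=0. */
#ifndef TOY_DEBUG
#define TOY_DEBUG 0
#endif

#if TOY_DEBUG
/* Debug build: print an assert-style message and abort the process. */
#define TOY_ASSERT(cond)                                              \
    do {                                                              \
        if (!(cond)) {                                                \
            fprintf(stderr, "%s:%d: Assertion `%s' failed.\n",        \
                    __FILE__, __LINE__, #cond);                       \
            abort();                                                  \
        }                                                             \
    } while (0)
#else
/* Non-debug build: the check compiles to nothing. */
#define TOY_ASSERT(cond) ((void)0)
#endif

#define TOY_MAX 16

/* Toy container: an unsorted array of keys, not an AVL tree. */
typedef struct {
    int keys[TOY_MAX];
    size_t count;
} toy_set_t;

/* Return 1 if key is already stored, 0 otherwise. */
static int toy_find(const toy_set_t *s, int key)
{
    for (size_t i = 0; i < s->count; i++) {
        if (s->keys[i] == key)
            return 1;
    }
    return 0;
}

/*
 * Add a key. Returns 1 if stored, 0 if it was a duplicate, -1 if full.
 * A duplicate trips TOY_ASSERT(0): a debug build aborts here, while a
 * non-debug build carries on. Ignoring the duplicate is a simplification
 * of this sketch, not a claim about what libavl does.
 */
static int toy_add(toy_set_t *s, int key)
{
    if (toy_find(s, key)) {
        TOY_ASSERT(0);
        return 0;
    }
    if (s->count == TOY_MAX)
        return -1;
    s->keys[s->count++] = key;
    return 1;
}
```

Compiled with `-DTOY_DEBUG=1`, a second `toy_add` of the same key aborts with an assert-style message; compiled without it, the call simply returns 0.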
Otherwise, you could try to build from source with debug disabled.
In Lucid, you could either go the apt way (apt-get source zfs-fuse; sudo apt-get build-dep zfs-fuse, etc.) or you can
$ git clone http://git.zfs-fuse.net/official 0.6.0
$ cd 0.6.0/src
$ scons debug=0
$ sudo scons debug=0 install
This gets you the same behaviour on the 0.6.0 release.
Please
(a) have backups (if you can't, have a dd image of the pool's vdevs?)
(b) report back whether that allows ZFS to get past the issue and perhaps fix it.
(c) after importing/exporting the pool, perhaps check back with the original version from the Lucid package. If that then works, this indicates that the error condition was somehow addressed and fixed.
Recently there has been another assert trap that did not allow a pool to be imported/mounted because of the debug asserts. It turns out that the asserts should in fact have been warnings, since things are allowed to be slightly off when starting ZFS fresh. This is, of course, due to the fact that ZFS must, like any database server, anticipate 'unclean' shutdowns that leave rubble in the last uberblock. ZFS should always correctly recover to the latest 'clean' uberblock. The standard disk sync interval is 5 seconds, so this leaves the potential for 5 seconds of data loss (unless explicitly synced by e.g. the database or user application). However, due to the COW/log-structured mechanisms, the last uberblock should be fine and consistent.
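The idea of recovering to the latest 'clean' uberblock can be sketched roughly as follows. This is a simplified illustration only: `toy_uberblock_t` and `toy_pick_uberblock` are hypothetical names, the real on-disk uberblock has more fields, and checksum verification is assumed to have happened elsewhere. The sketch picks the candidate with the highest transaction group number (txg) among those that passed verification.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified uberblock record. */
typedef struct {
    uint64_t txg;     /* transaction group that wrote this uberblock */
    int checksum_ok;  /* 1 if verification passed (assumed done elsewhere) */
} toy_uberblock_t;

/*
 * Return the index of the valid uberblock with the highest txg,
 * or -1 if no candidate passed verification.
 */
static int toy_pick_uberblock(const toy_uberblock_t *ub, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!ub[i].checksum_ok)
            continue;  /* skip rubble left by an unclean shutdown */
        if (best < 0 || ub[i].txg > ub[(size_t)best].txg)
            best = (int)i;
    }
    return best;
}
```

With candidates at txg 100 and 101 that verify and a candidate at txg 102 that does not, the sketch selects the one at txg 101, so only the writes of the unfinished transaction group are lost.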
Good luck
Seth
Added by
(anonymous)
on
Mar 15, 2010 03:16 AM
Thanks for your long and detailed reply, and it seems to do the job.
I recompiled the Debian package with debug disabled in the SConstruct file
and now I'm able to mount the missing datasets.
Unfortunately they are not listed in
zfs list.
But perhaps that needs a reboot.
I will try that after my vacation. :)
Added by
Seth Heeren
on
Mar 15, 2010 04:35 AM
The fact that you are missing datasets in zpool list or zfs list indicates that you are probably hitting a bug in put_nvlist. There is a hotfix for that on the 0.6.0 version:
# git clone http://git.zfs-fuse.net/official
# cd official/
# git checkout origin/critical
No need to retest at the moment (this has been verified by at least 4 persons in the field)
This version is safe to upgrade to if you don't wish to have any other impact beyond the 0.6.0 upgrade, besides bugfixes.
PS. A reboot probably won't change this behaviour
Added by
Seth Heeren
on
May 22, 2010 05:27 AM
Issue state:
open → resolved
Target release:
None → 0.6.0
closing

