#59 — Decrease in read performance of factor 4
| State | Confirmed |
|---|---|
| Version: | 0.6.9 |
| Area | Process |
| Issue type | Bug |
| Severity | Medium |
| Submitted by | (anonymous) |
| Submitted on | Jun 11, 2010 |
| Responsible | Seth Heeren |
| Target release | |
| Last modified | Sep 19, 2010 by Seth Heeren |
I noticed that with out-of-the-box settings, sequential read performance decreased by a factor of ~4 between the 0.6.9 and the 0.60.1 release. Is this expected, or is there room for improvement by changing parameters?
NOTE: benchmark data moved to an attachment, because the very long lines messed up the display in this issue tracker.
Ubuntu 10.4 - ZFS 0.6.9-6
Ubuntu 10.4 - ZFS 0.60.1
Reference benchmark:
Nexenta v3 beta 2
FreeBSD8.0 AMD64 Standard settings
- Steps to reproduce:
- System:
Core i3 530
4Gbyte RAM
Intel H55 chipset
6 x on Intel chipset
2 x SiS3132
2 x SiS3132
Total 10 x Samsung F3 5400rpm 2Tbyte disk
RAIDZ2 pool configuration (created with FreeBSD 8.0)
Software:
Ubuntu 10.4 64bit (up to date)
0.60.1 : manually compiled from source
0.6.9 : package https://launchpad.net/[…]/zfs-fuse_0.6.9-6_amd64.deb
Added by
Seth Heeren
on
Jun 11, 2010 06:38 PM
Issue state:
unconfirmed → open
Target release:
None → 0.6.9
Responsible manager:
(UNASSIGNED) → sgheeren
With your setup it would be good to know for both benchmarks:
- the git revision used for the build (known for the deb package)
- the contents of
/etc/init.d/zfs-fuse
/etc/default/zfs-fuse
/etc/zfs/zfsrc
You might provide 'grep -i zfs /var/log/syslog' as well, as it confirms some of the above and might provide other pointers. E.g., there is a known issue with unmounting kstat on daemon exit, which will cause a relaunched zfs-fuse daemon to degrade performance due to continuously trying to mount/umount /zfs-kstat
That kstat issue was diagnosed in issue #51. You can work around it by manually checking that no kstat remnants exist (see /etc/mtab) before launching zfs-fuse, and removing any with e.g. 'umount -fl /zfs-kstat'.
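The mtab check described above could be scripted roughly as follows. This is a minimal sketch, not part of zfs-fuse: the function name `kstat_mounted` is made up for illustration, and it assumes the mount point is the second whitespace-separated field of each mount-table line, as it is in /etc/mtab.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given mount table lists /zfs-kstat
# as a mount point (field 2). Defaults to /etc/mtab when no file is given.
kstat_mounted() {
    mtab="${1:-/etc/mtab}"
    awk '$2 == "/zfs-kstat" { found = 1 } END { exit !found }' "$mtab"
}

# Usage (as root, before starting the daemon):
#   if kstat_mounted; then umount -fl /zfs-kstat; fi
```

The unmount itself is left as a comment so the snippet is safe to paste; it needs root when run for real.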
Another approach would be to use the experimental issue51 branch (http://gitweb.zfs-fuse.net/[…]/issue51, you can download a snapshot), which intends to remove the shutdown problem with kstat.
Right now I would just be interested in the details listed above, but as you go, you might recognize some of the above subjects for yourself :) FWIW I don't see a performance regression except when accidentally running debug=2 builds, which obviously is not the case in the .deb package.
Added by
Edwin van Eggelen
on
Jun 12, 2010 05:11 PM
New benchmark data, taken while the system is in the exact same state. I ran the benchmark many times and the results are reproducible. The situation improved, but I still see a decrease of a factor of >2 in read performance.
-measured after reboot
-same state of pool (previously the data on the pool was different)
-same BIOS settings / updates / running services /background processes.
I uploaded the deb file for 0.60.1 to the Wiki. Looking in the log files, it seems that my 0.60.1 enables big writes in fuse, while I do not see these entries in the log file for 0.6.9-6. What is the best way to verify whether big writes are enabled?
Please find the requested data uploaded to the Wiki.
0.6.9
Version 1.96 ------Sequential Output------ --Sequential Input- --Random-
Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
eggelen-deskt 8096M 14 17 52392 12 37088 10 4447 95 123731 9 169.9 1
Latency 593ms 1283ms 683ms 41310us 98005us 562ms
Version 1.96 ------Sequential Create------ --------Random Create--------
eggelen-desktop -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 2989 12 14810 18 3320 12 3057 12 16677 19 3519 10
Latency 15410us 1310us 1232us 27627us 167us 313us
1.96,1.96,eggelen-desktop,1,1276370420,8096M,,14,17,52392,12,37088,10,4447,95,123731,9,169.9,1,16,,,,,2989,12,14810,18,3320,12,3057,12,16677,19,3519,10,593ms,1283ms,683ms,41310us,98005us,562ms,15410us,1310us,1232us,27627us,167us,313us
eggelen@eggelen-desktop:/pool1/video$ cd /
0.9.1 <--- assuming you meant 0.6.1
Version 1.96 ------Sequential Output------ --Sequential Input- --Random-
Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
eggelen-deskt 8096M 16 13 63494 11 42689 8 5469 97 323025 16 173.5 2
Latency 531ms 617ms 829ms 4426us 163ms 712ms
Version 1.96 ------Sequential Create------ --------Random Create--------
eggelen-desktop -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 3152 9 8165 14 3852 8 3494 10 7927 17 4665 5
Latency 35114us 1471us 5287us 39577us 28172us 8785us
1.96,1.96,eggelen-desktop,1,1276370240,8096M,,16,13,63494,11,42689,8,5469,97,323025,16,173.5,2,16,,,,,3152,9,8165,14,3852,8,3494,10,7927,17,4665,5,531ms,617ms,829ms,4426us,163ms,712ms,35114us,1471us,5287us,39577us,28172us,8785us
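For reference, the read gap can be computed directly from the two bonnie++ CSV lines above. A small sketch, assuming the sequential block-read rate (K/sec) is the 16th comma-separated field, which is where 123731 and 323025 sit in these two lines; `block_read` and `ratio` are illustrative names only.

```shell
#!/bin/sh
# Leading fields of the two bonnie++ CSV result lines, copied from the output above.
first='1.96,1.96,eggelen-desktop,1,1276370420,8096M,,14,17,52392,12,37088,10,4447,95,123731,9,169.9,1'
second='1.96,1.96,eggelen-desktop,1,1276370240,8096M,,16,13,63494,11,42689,8,5469,97,323025,16,173.5,2'

# Extract field 16 (sequential input, block, K/sec) from a CSV line.
block_read() {
    printf '%s\n' "$1" | cut -d, -f16
}

# Print $2 / $1 rounded to two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

ratio "$(block_read "$first")" "$(block_read "$second")"
```

This comes to roughly 2.6 for block reads, in line with the "factor of >2" reported above.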
Added by
Seth Heeren
on
Jun 13, 2010 07:28 AM
I'm looking into this.
Logging of 'enabling big_writes' was removed in 082539e5, so that one seems out of the way. I'll still regression test that one.
I tested with a blank Lucid64 image and your 0.6.1 package and I confirm that it was never possible to see the actual big_writes option enabled through mount(1) or /etc/mtab
I fixed the label above your second benchmark output. Please consider putting the bonnie results in the attachment(s), because they tend to mess up the issue tracker display (my two monitors cannot display the wide webpage).
Added by
Seth Heeren
on
Jun 13, 2010 07:30 AM
consider fixing
DAEMON_OPTS=""
in /etc/default/zfs-fuse
Use something like " -a 1 -e 1 " as a sane default.
(man zfs-fuse explains more in the latest deb package)
-- forget that: I discovered that your zfsrc contains the relevant settings (3600 secs, actually). Note that the command line takes precedence over zfsrc.
Added by
Edwin van Eggelen
on
Jun 13, 2010 08:25 AM
I will look into the options and read more on the background later today. But looking around, I do not see a lot of people doing benchmarks with a large number of disks on zfs-fuse. In my case I have 10 in raid-z2 on relatively "new" hardware.
For the Debian package: I took the package that came by default in Ubuntu 10.4 (through apt-get install) and replaced the files with ones that I compiled myself. At that time there was a hot fix on your website with instructions. The 0.60 did not list all my zfs file systems (they disappeared after creating an additional filesystem). The 0.60.1 did find them all. I still have the source code that I compiled on a backup drive if needed.
Added by
Seth Heeren
on
Jun 13, 2010 08:44 AM
Ok, hi (thanks for registering/logging in)
(a) I already edited my response with the info on /etc/zfs/zfsrc with respect to the options
(b) the deb pretty much describes itself (dpkg --status)
I _did_ find (once) that using multiple disks was relatively slow on zfs-fuse. I had tested that striping across whole disks was marginally faster than using a single vdev backed by a logical volume striped by lvm2 (dm) or BIOS RAID. However, I wasn't getting twice the throughput (or even close). You may be able to find the original info on the user group (but it was a long and confusing thread, if I remember correctly).
I will try to mimic your situation with 8 loop block devices as raw vmdks in a VBox guest, because I don't have 8 physical disks to spare (unless you want to sponsor me). Laters
Added by
Seth Heeren
on
Jun 13, 2010 08:55 AM
Since I'm going to try your many-disk setup, what is the pool layout?
zpool status -v
Added by
Edwin van Eggelen
on
Jun 13, 2010 09:10 AM
$ sudo zpool status -v
pool: pool1
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
pool1 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdb1 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdd1 ONLINE 0 0 0
sde1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdj1 ONLINE 0 0 0
errors: No known data errors
Normally I use -d /dev/disk/by-id, but with all the swapping of versions I missed this.
Added by
Edwin van Eggelen
on
Jun 13, 2010 04:24 PM
With ZFS 0.6.9-6 installed:
/etc/init.d/zfs-fuse
DAEMON_OPTS="-a 1 -e 1"
So this is already there, and I do not understand what I need to change. Can you give me a hint?
Added by
Seth Heeren
on
Jun 14, 2010 03:23 AM
I'm still busy doing my own perf testing here
> Can you give me a hint
Sure... http://zfs-fuse.net/issues/59/4
Two clues in there:
(a) It specifically mentions /etc/default/zfs-fuse (because the initscript sources from there)
(b) -- forget that: ...
So, you really don't need to do anything, but the line you mentioned doesn't actually do anything because it gets overridden from /etc/default/zfs-fuse (removing the options), and later from /etc/zfs/zfsrc (setting the options to 3600s). Yeah, I'm starting to think we might clean this up. The DAEMON_OPTS are largely superfluous now that there is a zfsrc... I'll make (a) a FAQ soonish
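The override described here can be illustrated with a small sketch. It is not the actual zfs-fuse initscript: `effective_daemon_opts` is a made-up name, and the only assumption taken from the thread is that the script assigns DAEMON_OPTS and then sources a defaults file, so the later assignment wins.

```shell
#!/bin/sh
# Hypothetical model of an initscript: set a built-in value, then source an
# override file if it is readable. Runs in a subshell so nothing leaks out.
effective_daemon_opts() {
    defaults_file="$1"
    (
        DAEMON_OPTS="-a 1 -e 1"                 # value set in the initscript
        if [ -r "$defaults_file" ]; then
            . "$defaults_file"                  # e.g. DAEMON_OPTS="" replaces it
        fi
        printf '%s\n' "$DAEMON_OPTS"
    )
}

# Usage: effective_daemon_opts /etc/default/zfs-fuse
```

With a defaults file containing DAEMON_OPTS="", the function prints an empty line: no command-line options reach the daemon, which leaves the zfsrc settings in effect.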
-----
Meanwhile on your pool setup, please see the SUN documentation
http://www.solarisinternals.com/[…]/ZFS_Best_Practices_Guide
http://www.solarisinternals.com/[…]/ZFS_Evil_Tuning_Guide
Specifically:
http://www.solarisinternals[…]rements_and_Recommendations
I'd try breaking up the pool into two raidz[12] vdevs. I'm including this type of test in my benchmarks.
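As a rough illustration of what splitting the pool costs in space, here is a sketch that only counts data disks per layout (parity disks subtracted per vdev). It says nothing about throughput, which is what the benchmarks are for. `data_disks` is an illustrative name, and the `zpool create` line in the comment uses placeholder pool and device names; running it on real disks would destroy existing data.

```shell
#!/bin/sh
# $1 = number of vdevs, $2 = disks per vdev, $3 = parity disks per vdev
data_disks() {
    echo $(( $1 * ($2 - $3) ))
}

echo "1 x 10-disk raidz2: $(data_disks 1 10 2) data disks"
echo "2 x 5-disk raidz2:  $(data_disks 2 5 2) data disks"
echo "2 x 5-disk raidz1:  $(data_disks 2 5 1) data disks"

# A two-vdev pool is created by listing two raidz groups, for example:
#   zpool create <pool> raidz2 d1 d2 d3 d4 d5 raidz2 d6 d7 d8 d9 d10
```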
Added by
Seth Heeren
on
Jun 14, 2010 05:59 AM
My benchmark results in a virtualized environment. Details on setup are provided through the link
http://downloads.sehe.nl/zfs-fuse/issue59/results.htm
I haven't _looked_ at the numbers yet. I'm just providing the raw information here so anyone who can spare the time can have a look
Added by
Edwin van Eggelen
on
Jun 14, 2010 02:05 PM
As requested :
'zpool get all poolname'
NAME PROPERTY VALUE SOURCE
pool1 size 18.1T -
pool1 capacity 35% -
pool1 altroot - default
pool1 health ONLINE -
pool1 guid 5509138985904076621 local
pool1 version 13 local
pool1 bootfs - default
pool1 delegation on default
pool1 autoreplace off default
pool1 cachefile - default
pool1 failmode wait default
pool1 listsnapshots off default
pool1 autoexpand off default
pool1 dedupditto 0 default
pool1 dedupratio 1.00x -
pool1 free 11.7T -
pool1 allocated 6.44T -
'zfs get -r all poolname'
sent by e-mail
Added by
Seth Heeren
on
Sep 19, 2010 06:06 PM
Edwin, What is the state of affairs? Is this issue still relevant?
Added by
Seth Heeren
on
Sep 19, 2010 07:53 PM
Target release:
0.6.9 → 0.6.9-maint
Aha...? I just scrubbed the bug tracker and found #72 which _might_ explain the trouble.
Please retry with the latest Ubuntu packages from the homepage (zfs-fuse_0.6.9-7). Please allow some time for the build server to complete these.

bonnie_benchmarks.txt
