[PC-BSD Testing] Areas where FreeBSD needs work

Lars R. Noldan lars at w9zeb.org
Thu Dec 23 08:30:25 PST 2010

On Dec. 23 I shared the following with Dru in #PCBSD.  It was written by 
my co-worker on Oct. 10.  She requested that he post it here.  Since I 
follow this list and he doesn't, I am going to post it here for him.


FreeBSD needs work
last modified on 2010-10-10 at 19:33 - keywords: freebsd zfs iscsi whining

Warning: This is not a post whining about Adobe Flash or how hard it is 
to get my playstation controller to work in Quake.

This is, instead, a post whining about how hard it is to use ZFS and 
iSCSI on FreeBSD. Recently I was tasked with building some 
storage-related projects. The tools of choice involved FreeBSD sitting 
on some hardware with a large collection of high-volume drives. These 
drives were presented JBOD-style to the host OS, and a couple of them 
were SSDs. The idea was to make one massive raidz2 zpool with a mirrored 
ZIL on the SSDs. Then we could spin off zvols and pitch them via iSCSI 
to our application and database servers.
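
For concreteness, the layout described above can be sketched as the 
commands one would run to build it. This is only an illustration, not 
what was actually typed on those boxes: the pool name "tank", the device 
names (da0 through da5, ada0, ada1), and the zvol name and size are all 
made up. The script only builds and prints the command lines; it does 
not run anything.

```python
import shlex

# All names below are hypothetical placeholders, not the real hardware.
POOL = "tank"
DATA_DISKS = ["da0", "da1", "da2", "da3", "da4", "da5"]  # JBOD data drives
LOG_SSDS = ["ada0", "ada1"]                               # SSDs for the mirrored ZIL


def pool_create_cmd(pool, data_disks, log_disks):
    """Build `zpool create` for one raidz2 vdev plus a mirrored log device."""
    return ["zpool", "create", pool, "raidz2", *data_disks,
            "log", "mirror", *log_disks]


def zvol_create_cmd(pool, name, size):
    """Build `zfs create -V` for a zvol that could back an iSCSI target."""
    return ["zfs", "create", "-V", size, f"{pool}/{name}"]


def render(cmd):
    """Render an argument list as a shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    print(render(pool_create_cmd(POOL, DATA_DISKS, LOG_SSDS)))
    print(render(zvol_create_cmd(POOL, "db0", "200G")))
```

Exporting the resulting zvol over iSCSI is a separate step that depends 
on the target software in use, so it is left out here.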

Sound simple? It isn't. To get it working, we had to switch to 
OpenSolaris. Which is a bit of a damn problem.

There were two really bad problems in FreeBSD that prevented us from 
deploying our OS of choice to these systems.

Problem the first:
ZFS on FreeBSD (at zpool version 13 as of this project) is only like 80% 
there. What I mean by that is ZFS is friggin fantastic for your massive 
desktop nerdmachine, or your creepy 8TB basement "media archive", or any 
other low-traffic consumer-grade project you might want to undertake. 
But when you start putting some load on it...

The biggest problem I had seemed to be caused by multiple read 
operations during snapshot creation. If you took a snapshot while lots 
of iSCSI (or NFS) (or local) access was going on, the zfs process would 
get stuck in a wait state. What it boils down to is that you have to be 
super careful to disable access while you're creating snapshots (or, in 
bad cases, sending them off), which completely wrecks the point of the 
damn snapshot functionality.
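
The operations in question are plain "zfs snapshot" and "zfs send". A 
minimal sketch of the command lines involved, assuming a hypothetical 
zvol called tank/db0 and a made-up timestamp naming scheme for the 
snapshots. As before, it only builds the commands and runs nothing.

```python
from datetime import datetime


def snapshot_name(dataset, when):
    """Name a snapshot dataset@YYYYmmdd-HHMMSS (naming scheme is an assumption)."""
    return f"{dataset}@{when.strftime('%Y%m%d-%H%M%S')}"


def snapshot_cmd(dataset, when):
    """Build the `zfs snapshot` command for one dataset or zvol."""
    return ["zfs", "snapshot", snapshot_name(dataset, when)]


def send_cmd(dataset, when):
    """Build the `zfs send` command that streams that snapshot to stdout."""
    return ["zfs", "send", snapshot_name(dataset, when)]


if __name__ == "__main__":
    now = datetime.now()
    print(" ".join(snapshot_cmd("tank/db0", now)))  # hypothetical zvol
    print(" ".join(send_cmd("tank/db0", now)))
```

The failure scenario described above is running these while initiators 
are still hammering the zvol.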

Oh, and it would crash BSD too. Sometimes it would just eat all the RAM 
and the machine would fall down.

Problem the second:
iSCSI support in FreeBSD is abominable. I'm given to understand this is 
because the main iSCSI dev in FreeBSD-land is possessed of insufficient 
hardware to model high-performance workloads. If that's the case, we 
need to get that man some damn hardware. Or convince Coraid to update 
their AoE stuff, since AoE is nicer anyway.

If the iSCSI traffic crossed subnets, performance would tank. If the 
iSCSI targets were accessed from a non-FreeBSD initiator, performance 
would tank. If a FreeBSD initiator accessed a non-FreeBSD target, 
performance would tank. Are you seeing a pattern here? Best-practices 
objections aside, it's clear that the dev has a handful of machines on a 
dumb switch, and that's the test platform. As soon as you instantiate 
some sophistication, the whole thing falls down. Again.

How do we fix this?

I suppose I could devote some time to mastering the implementation 
details of iSCSI and ZFS, then fix the stuff myself. That's not really 
the best use of my time, however, and I'm not in a position to get paid 
to do that sort of thing. But there are a few things anyone (myself 
included) can do:

  * report bugs.
  * provide stack traces and failure scenarios.
  * whine constantly.
  * provide testing hardware.

All of these are helpful, especially the constant whining. The problem 
needs to be front-and-center, or it'll get de-prioritized in favor of 
other (and, in my opinion, less important) problems. I don't give a crap 
that your video card doesn't push hard enough to run EVE in FreeBSD. I 
want the ZFS functionality that now only FreeBSD actively provides, and 
I want the iSCSI functionality ZFS was designed to enable.
