[nylug-talk] Ghetto SAN ideas for Linux, was: SMB 10gigE (for Linux?)
Alex Pilosov
alex at pilosoft.com
Sat Sep 1 17:04:27 EDT 2007
On Sat, 1 Sep 2007, jh wrote:
> Alex Pilosov wrote:
> >
> > Sure. 10GE is the new GE. You can buy ghetto 10G switches for ~2-3k$.
> >
> Upon further reflection, I'm thinking that a SAN with a shared file
> system is a better option at this point. Prices for DIY SAN look pretty
> inexpensive, especially compared to 10G switches.
SAN and shared filesystems are hard (if you actually mean *shared*
filesystem - all clients have equal direct access to the storage). Don't
underestimate the complexity of this.
> The problem being solved, since Alex asked: we deal with data in the
> following hierarchy:
>
> - Volumes - typically DVD sized or higher, say, 2-200 gigs, contains
> documents, typically in the range of thousands to hundreds of thousands.
> - Documents, which contain pages
> - Pages, which are our most atomic data type.
>
> On the front end of things ("ingestion") the volume is our atomic unit.
> We need to bring in the entire thing into our system from a single client.
>
> On the back end of things ("production") we have a dozen nodes each
> working on individual pages and documents. They're fine on GigE, as the
> actual processing is the bottleneck there.
>
> The issue: if we cannot ingest faster than the combined number of
> workers can do their thing, our production rates stall - we get the
> equivalent of "wait states". And GigE is our bottleneck.
>
> So, it seems to me that a better/cheaper/more mature solution is
> something using 4Gb FC on a SAN, plus a shared file system. That way our
> ingestion box can just drop the stuff into a shared file system, while
> the workers can continue to access it via a gateway server sharing it
> via NFS.
so it'll be an actual shared filesystem for the ingestion/gateway boxes. ok,
makes sense. note - it makes sense not because of bandwidth, but because
multiple servers share the OS load of file creation etc.
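
for a rough sense of scale, here is a back-of-the-envelope sketch of how long one volume takes to cross a link. the link rates are nominal figures and the function name is made up for illustration; it assumes no protocol overhead and that disks and CPU keep up, so real numbers will be worse.

```python
# Nominal link rates in Gbit/s. Assumption: no protocol overhead,
# and the disks and CPUs on both ends can keep up with the wire.
NOMINAL_GBIT = {"GigE": 1.0, "4Gb FC": 4.0, "10GigE": 10.0}


def transfer_seconds(volume_gb, link_gbit):
    """Seconds to move volume_gb gigabytes over a link of link_gbit Gbit/s."""
    return volume_gb * 8 / link_gbit


# Volume sizes from the thread: roughly 2 GB up to 200 GB.
for name, rate in NOMINAL_GBIT.items():
    for size_gb in (2, 200):
        minutes = transfer_seconds(size_gb, rate) / 60
        print(f"{name}: {size_gb} GB in about {minutes:.1f} min")
```

this only covers the wire. file creation and metadata load, which is the actual reason to spread the work over several servers, is not modelled at all.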
> So, my questions:
>
> * Generally, how is compatibility between different vendors these days?
compatibility of what, exactly? SANs are hard. You will have problems
putting one together using duct tape, if you don't know exactly what you
are doing. which filesystem have you planned on using?
> * What 4Gb FC HBAs are best supported under Linux?
probably qlogic - dell uses them in their SAN solutions.
otoh, fc hbas have about the same quality of support as scsi hbas -
chipsets are generally identical (i.e. many cards are available as both
scsi and fc hbas).
> * Suggestions for a cheap 4Gb switch?
no idea, ebay.
> * Has anyone used something like this Promise Vtrak SATA-to-FC raid box?
> Opinions? http://tinyurl.com/2h3bcy
it's 2007 - fc is b-game today. rock out with SAS.
sas 3gbit is the basic standard (1x), sas 4x (12gbit) exists (no idea about
pricing).
compare the above (~4k$) with infortrend S12S-G1030 (~5k$).
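
the lane arithmetic, as a minimal sketch: a wide port is just the per-lane rate times the lane count. the 3 Gbit/s per-lane figure is the one quoted above; the GigE and 4Gb FC entries are nominal rates added for comparison, and the helper name is hypothetical.

```python
# Per-lane SAS rate as quoted in the post (nominal, not measured).
SAS_LANE_GBIT = 3.0


def wide_port_gbit(lanes, lane_gbit=SAS_LANE_GBIT):
    """Nominal aggregate rate of a SAS port built from `lanes` lanes."""
    return lanes * lane_gbit


# GigE and 4Gb FC are nominal line rates, included only for comparison.
links = {
    "GigE": 1.0,
    "4Gb FC": 4.0,
    "SAS 1x": wide_port_gbit(1),
    "SAS 4x": wide_port_gbit(4),
}

# Print slowest to fastest, with each link as a multiple of GigE.
for name, gbit in sorted(links.items(), key=lambda item: item[1]):
    print(f"{name:8s} {gbit:5.1f} Gbit/s  ({gbit / links['GigE']:.0f}x GigE)")
```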
> * Is anyone using a shared file system - either Linux only (i.e., GFS),
> or something else? Especially under Debian?
i've managed to get gfs installed and usable once. it's nontrivial (not
rocket science either, but definitely nontrivial).
> And lastly: is there a book or online resource that someone can
> recommend to get more up to speed on SANs and the like?
wikipedia, google, call up emc and say you want to buy their thing.
-alex