[nylug-talk] Paper IT certs and disk drive fabrication differences -- WAS: Slim home server
Bryan J. Smith
b.j.smith at ieee.org
Wed May 21 20:55:04 EDT 2008
On Sat, 2008-05-17 at 18:49 -0400, Chris Knadle wrote:
> I got an ASEE, then an ASCS [associates in Computer Science], tried being a
> stockbroker, then after that brief disappointing job went back to school for
> a BSEE in "Computer Engineering", which essentially just confuses everybody
> as to what it is. It definitely relates to network administration because of
> the elective classes I've taken. And after all that I'm still trying to
> figure out what I want to do "when I grow up". ;-)
Education doesn't define a person. Heck, for anyone with 12-16+ years
of technical experience, the "theory" becomes obvious -- even
differential and integral calculus and basic transforms. The 4-5 years
of "theory" is basically just a juggernaut of "commonality" and
"lessons learned," and that's why you forget 90% of it. You retain far
more of it once you apply it. ;)
That's why most State Boards of Professional Engineers (BoPEs) allow even
non-degreed engineers to become PEs, at least in the US (can't say with
regards to Canada's equivalent), after 12-16+ years experience. You
still have to sit the same exams, and the Fundamentals exam may present
some difficulty without more "academic" study. But it's still quite
doable for someone without an ABET accredited BSE to get their PE before
35.
> In the job interviews I've gone on so far, my degrees generally don't get
> discussed and occasionally I wonder if they're even considered.
They _never_ get discussed with the actual technical managers and staff.
They couldn't care less. I couldn't care less when I'm hiring, either.
But the HR and procurement departments seem to care. It's not only why
the best employees "never get an interview," but even "get lost in the
process" after an interview.
I've been dropped from consideration more than once because I didn't
have a CS degree. I've even had one case where they said they needed a
"BS in Computer Engineering" verbatim, and they marked me down as an EE
without the Computer Engineering option (which I have).
I've learned to send just the right info and, sadly enough, 'tude
towards the HR departments to get them to realize they are f'ing up,
legally. Not only for myself, but when I'm trying to hire someone.
> Uh, well, er... I don't understand what you're asking.
You said "kind of drive" (sorry, I regurgitated "type" meaning your word
"kind").
> You're comparing "enterprise" vs "commodity" drives, and everything
> I've used AFAIK has been "commodity", so I can only compare
> manufacturers or the interface type: 50-pin or 68-pin SCSI, IDE, SATA.
Neither of which has any influence over reliability. ;)
Today, vendors outsource left and right. IBM partnered with Hitachi
before selling outright. Western Digital outsourced to Hitachi and
Quantum (including after Maxtor bought them). And Seagate finally
started outsourcing heavily to Maxtor, although serious drops in QA
caused them to buy Maxtor.
[ Boeing has run into the same issue first-hand on the 787 Dreamliner.
And they ended up buying some of the same companies too, especially
since the "just in time" manufacturing cost savings can cost time. ;) ]
The only thing you can do is ...
1) Find out the model, and how it's fabbed, and
2) Read up on first-hand results with the model
> And I think all of the drives were
> 7200 RPM. There might have been one set of SCSI drives in a mail server that
> were 10k RPM -- not sure.
Anything 10,000rpm should be 292GB or less. I could be wrong though.
Those use smaller, reduced-density platters and higher-cost
materials.
> Oddly enough -- no. Most of the servers I personally helped purchase were
> from vendors that allowed specifying hardware choices. If standard servers
> from HP or others were purchased it just so happened that the hardware wasn't
> used on projects I worked on.
Okay. So you really haven't.
Understand that the whole certification and sample QA split on commodity
disks -- near-line/enterprise/RAID/etc. OEM versus consumer/desktop
OEM/retail -- is for cost reasons.
Hitachi, Seagate or Western Digital can sample a lot and determine that
the drives have reduced vibration and better operational tolerances.
That doesn't mean they are not going to fail. It just means that, over
the lot, there are fewer failures. So when you install dozens in
systems, and Tier-1 OEMs like Dell, HP, IBM, etc. sell hundreds of the
drives in dozens of servers to a single customer, fewer failures occur.
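
To put rough numbers on the lot argument, here is a minimal Python
sketch. The annualized failure rates (AFR) and the fleet size are
made-up assumptions for illustration, not vendor figures, and drives
are assumed to fail independently (a simple binomial model).

```python
from math import comb

# Hypothetical annualized failure rates -- illustrative only, not vendor data.
CONSUMER_AFR = 0.03   # assumed rate for an unsampled consumer lot
SAMPLED_AFR = 0.01    # assumed rate for a sampled/certified lot
FLEET = 300           # assumed number of drives sold to one customer


def expected_failures(drives: int, afr: float) -> float:
    """Expected number of failures per year across the fleet."""
    return drives * afr


def prob_at_least(drives: int, afr: float, k: int) -> float:
    """Probability of k or more failures in a year (independent drives)."""
    p_fewer = sum(
        comb(drives, i) * afr**i * (1 - afr) ** (drives - i) for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    for label, afr in (("consumer", CONSUMER_AFR), ("sampled", SAMPLED_AFR)):
        print(label,
              "expected/year:", expected_failures(FLEET, afr),
              "P(>=10 failures):", round(prob_at_least(FLEET, afr, 10), 4))
```

Any single drive can still die under either rate. The difference only
shows up once you count failures across hundreds of drives, which is
where the warranty cost lands.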
Hitachi, Seagate and Western Digital are not going to warranty disks
that Dell, HP, IBM, etc. sell in systems that are running 24x7 and are
not of these samples. Again, anyone who has been a product manager will
instantly see the cost differentiation en masse. They are also far less
likely to honor the warranty, and may even charge you a fee, if you are
sending _many_ consumer-rated disks back to them and the SMART data
shows 24x7 operation, or other operational data that is well outside
general consumer usage. Not a one- or two-off, but when you end up
sending dozens.
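
As a rough illustration of how power-on data gives away 24x7 use, here
is a small Python sketch. It assumes you already have the drive's
power-on hours (SMART attribute 9, which some drives report in other
units) and its in-service date. The function names and the 0.9
threshold are hypothetical; the vendors' real criteria are not public.

```python
from datetime import date


def duty_cycle(power_on_hours: int, in_service: date, today: date) -> float:
    """Fraction of elapsed wall-clock time the drive has been powered on."""
    elapsed_hours = (today - in_service).days * 24
    if elapsed_hours <= 0:
        raise ValueError("in-service date must be before today")
    # Cap at 1.0 in case the in-service date is later than first power-on.
    return min(power_on_hours / elapsed_hours, 1.0)


def looks_like_24x7(power_on_hours: int, in_service: date, today: date,
                    threshold: float = 0.9) -> bool:
    """Hypothetical check: flag drives powered on nearly all the time."""
    return duty_cycle(power_on_hours, in_service, today) >= threshold


if __name__ == "__main__":
    start, now = date(2008, 1, 1), date(2008, 1, 11)   # 10 days = 240 hours
    print(duty_cycle(80, start, now))        # desktop-style usage
    print(looks_like_24x7(230, start, now))  # server-style usage
```

One drive with a high duty cycle proves little. Dozens of returns that
all look like this are what stand out.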
That's why the sampling and QA differ. There is also some added
firmware behavior that vendors like Western Digital apply to buffer
flushing on their Caviar RE, which should only be used behind 0 Wait
State Caching SRAM (capacitor-backed) or buffering DRAM (battery-backed)
storage controllers.
> Yeah, friends have run into reliability issues on drives that were found to
> be related to driver firmware. That's definitely an interesting area.
That's more ATA issues. The ATA spec is a PITA, and vendors don't
follow it, making it worse.
ATA is nothing more than dumb traces with two end-points: Integrated
Drive Electronics (IDE) on the drive, using the Direct Memory Access
(DMA) mechanisms of the peripheral bus to reach system memory. In
between is the bus arbitrator, which not only needs its registers to be
set up properly in the firmware of the system, but handled correctly in
the OS driver. Those two things alone often conflict, even before we get
to the system firmware/OS driver v. IDE firmware issues.
The one thing ATA was never, ever designed to do is handle DMA with
more than one end-point -- i.e., master/slave. The one thing they did
right with SATA is not even offer it (finally). The whole master/slave
arrangement is a leftover from the Western Digital trademarked Enhanced
IDE (EIDE) specification, and it was never supported properly in the ATA
spec (only recognized for legacy compatibility), and definitely not well
considered when UltraDMA was offered with DDR and, later, QDR signaling,
or CRC checking for that matter.
Native Command Queuing (NCQ) has been the latest, horrendous mess. In
reality, using an intelligent ATA hardware storage controller (many do
SAS as well, since SAS is backward compatible with SATA) removes most of
the bonuses for NCQ. Many of those cards will also shut down the
point-to-point NCQ of the ATA bus when they start having issues. The OS
never sees it, because the intelligent controller is handling the
transfers. It's actually triply nice because the intelligent ATA
controller is 1) system memory, 2) bus arbitrator and 3) embedded
OS/driver in one -- so it removes a lot of factors.
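
For anyone who hasn't looked at what command queuing buys you, here is
a toy Python sketch of reordering queued requests. It is not how any
drive or controller firmware actually works: it treats LBA distance as
a stand-in for seek cost, ignores rotational position, and uses a greedy
nearest-first pick. All names are hypothetical.

```python
def total_travel(start_lba: int, order: list[int]) -> int:
    """Sum of LBA distances covered when servicing requests in this order."""
    position, travel = start_lba, 0
    for lba in order:
        travel += abs(lba - position)
        position = lba
    return travel


def reorder_nearest_first(start_lba: int, queued: list[int]) -> list[int]:
    """Greedy reordering: always service the closest pending request next."""
    pending = list(queued)
    position, order = start_lba, []
    while pending:
        nearest = min(pending, key=lambda lba: abs(lba - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


if __name__ == "__main__":
    queue = [900, 100, 850, 150]          # arrival (FIFO) order
    reordered = reorder_nearest_first(0, queue)
    print("FIFO travel:     ", total_travel(0, queue))
    print("Reordered travel:", total_travel(0, reordered))
```

An intelligent controller with its own cache and scheduler already does
this kind of sorting before the commands reach the drive, which is why
the drive-level queuing adds little on top.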
But that's another tangent. ;)
> Huh; the latest drives I've been purchasing were in the Seagate Barracuda
> 7200.10 series. I had not known they were related to the Maxtor line before
> buying them. They've been fine so far.
The Seagate Barracuda 7200.10 line (187GB/platter flagship,
250GB/platter in 250GB form) is probably their best model since the
7200.7 (100GB/platter flagship). The Maxtor drives of the same capacity
are actually fabbed as 7200.10 drives by Seagate, under their QA
control. ;)
The new 7200.11 line (250GB/platter flagship, 320GB/platter in 320GB
form) is having serious issues with firmware. It's something to be
avoided right now until the kinks are worked out. Seagate isn't the
only one having issues; Hitachi and Western Digital are as well.
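
The per-platter figures are just capacity divided by platter count. A
quick Python sketch of that arithmetic follows. The capacities and
platter counts in the table are my assumptions, chosen to match the
per-platter numbers quoted above, so check the data sheets before
relying on them.

```python
def gb_per_platter(capacity_gb: float, platters: int) -> float:
    """Average capacity per platter, in GB."""
    if platters <= 0:
        raise ValueError("platter count must be positive")
    return capacity_gb / platters


# (capacity in GB, platter count) -- assumed values, not taken from data sheets.
MODELS = {
    "7200.10 flagship": (750, 4),
    "7200.10 250GB": (250, 1),
    "7200.11 flagship": (1000, 4),
    "7200.11 320GB": (320, 1),
}

if __name__ == "__main__":
    for name, (capacity, platters) in MODELS.items():
        print(f"{name}: {gb_per_platter(capacity, platters):.1f} GB/platter")
```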
> Last I worked directly for an I.T. department at a company they considered
> all hard disks to be equal (obviously a simplistic conclusion) and had no
> interest in having anyone research drive reliability, method of manufacture,
> etc. "Just get the box a disk" was essentially the bottom line. [I would
> have much preferred more design thought to go into hardware purchases, but
> hardware purchases were usually done in a rush for several reasons. :-(]
> And since then I still use commodity hardware for stuff I build for myself,
> and if I work on client's servers I don't carefully examine what type of
> drive a box has -- I haven't been asked to examine that. Etc.
My life is heavily CYA, especially since I work for a vendor myself.
I have more recently been in the middle of an Intel debacle, and their
utter lack of full disclosure (even to us vendors under NDA) with
regard to Machine Check Exception (MCE) issues -- specifically, as
you can find in public documentation now -- the TLB. Yes, TLB. AMD
isn't alone. ;)
Made me completely appreciate AMD's decision to withhold their Processor
10h (Barcelona) Stepping B2 multi-socket (Opteron) processors until they
worked out all the TLB issues on the B3. Intel shipped G0 steppings on
not just their uni-socket Core 2s, but their multi-socket Xeons. I was
hitting the Intel microcode dat file for Linux weekly for some time
there. ;)
Can't say more than is public, being under NDA and all.
Likewise, because SAS is the mature SCSI-2 protocol using the same SATA
PHY (although externally SAS is a crapload better, mechanically and
electrically, than eSATA, but that's another story), a lot of
enterprises go the SAS route instead of SATA when they hit non-commodity
material/fabbed 10-15,000rpm drives. After all, the added SAS firmware
isn't the biggest cost; the materials/fab required for the spindle of
the drive is.
But SAS, just like SCSI or FC, has nothing to do with the reliability of
the drive mechanics itself. It's more of the non-commodity pairing of
cost in fab with cost in firmware.
--
Bryan J Smith Professional, Technical Annoyance
mailto:b.j.smith at ieee.org http://www.linkedin.com/in/bjsmith
-------------------------------------------------------------
Fission Power: An Inconvenient Solution