[nylug-talk] ques: linux training: crash analysis
Sunny Dubey
Sun Aug 13 14:17:42 EDT 2006
On Monday, 7 August 2006 11:45, Bob Kryger wrote:
> I'm looking for a training class for crash analysis and root cause
> analysis.
>
> I need to get some time out of the office where i can concentrate on
> linux internals and
> cause crashes and oops and such and analyze them.
>
> Suggestions?
>
I've never seen a course dedicated to this, but I can offer the following
advice based on my experience dealing with such problems.
1 - Learn how binaries work and how they call into other binaries and
libraries.
2 - Learn how the build environment of a particular app works (if possible).
Errors caught in config.log may help explain crashes that show up later.
3 - Learn debugging tools such as strace, ltrace and gdb. In my experience,
the cause of roughly 90% of incidents can be determined with strace alone.
4 - Learn the specific applications themselves. Apps can often print
multiple levels of debug info (e.g. ssh -vv).
5 - Learn your system's logging environment (syslog, syslog-ng, ulogd, etc).
Not all apps log to the same places or files.
6 - Learn signals and what they might mean (children dying, signal 11
(SIGSEGV), which can point to hardware issues, the OOM killer, etc).
7 - Use your package management system to your advantage. RPM can verify MD5
hashes, time stamps, etc. Sticking to a packaged distro helps greatly on its
own, because if you are experiencing crashes with pre-packaged software, the
chances that others have hit the same issue are much greater.
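
To make item 6 a bit more concrete, here is a rough sketch (my own
illustration, not a standard tool) of turning a child's exit status into a
signal name plus a hint. The HINTS table and the describe_exit name are
made up for this example, and the hints are only rules of thumb. It assumes
a POSIX system, where Python's subprocess module reports death by signal N
as a return code of -N.

```python
import signal
import subprocess
import sys

# Hypothetical rule-of-thumb hints; extend or correct these for your systems.
HINTS = {
    signal.SIGSEGV: "invalid memory access; if random and unreproducible, suspect hardware",
    signal.SIGKILL: "killed forcibly; check the kernel log for OOM killer messages",
    signal.SIGABRT: "the program called abort(), often a failed assertion",
}


def describe_exit(returncode):
    """Describe a subprocess return code in plain words."""
    if returncode >= 0:
        return f"exited normally with status {returncode}"
    try:
        sig = signal.Signals(-returncode)
    except ValueError:
        # Signal number that this Python build has no name for.
        return f"killed by unknown signal {-returncode}"
    hint = HINTS.get(sig, "no hint recorded")
    return f"killed by {sig.name} (signal {sig.value}): {hint}"


if __name__ == "__main__":
    # Demo: start a child that sends SIGSEGV to itself, then decode its status.
    child = subprocess.run(
        [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    )
    print(describe_exit(child.returncode))
```

The same idea applies to the status a shell reports: there, a process
killed by signal N usually shows up as exit status 128+N.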
I guess the above comes out of years of experience doing a little forensics
here and there. All of it is hardest with binary-only applications that you
have little control over.
cool, HTH
--
Sunny Dubey
mail: sunny at opencurve.org
tele: 212.333.3542