[nylug-talk] ques: linux training: crash analysis

Sunny Dubey
Sun Aug 13 14:17:42 EDT 2006


On Monday, 7 August 2006 11:45, Bob Kryger wrote:

> I'm looking for a training class for crash analysis and root cause
> analysis.
>
> I need to get some time out of the office where i can concentrate on
> linux internals and
> cause crashes and oops and such and analyze them.
>
> Suggestions?
>

I've never seen a course dedicated to this, but I can offer the following 
advice based on my experience dealing with such problems.

1 - learn how binaries work and how they interact with and call other 
binaries and libraries

2 - learn how the build environment of a particular app works, if possible.  
Errors caught in config.log may help explain crashes that show up later.

3 - learn debugging tools like strace, ltrace, and gdb.  In my experience, 
the cause of about 90% of incidents can be found with strace alone.

4 - learn the specific applications themselves.  Apps can often print 
multiple levels of debug info (e.g. ssh -vv)

5 - learn your system's logging environment (syslog, syslog-ng, ulogd, etc).  
Not all apps log to the same places or files.

6 - learn the signals and what they might mean (SIGCHLD when children die, 
SIGSEGV (signal 11) from bugs or sometimes failing hardware, SIGKILL from the 
kernel's OOM killer, etc)

7 - use your package management system to your advantage.  RPM can verify MD5 
hashes, time stamps, etc.  Using a packaged distro helps greatly on its own: 
if you are experiencing crashes with pre-packaged software, the chances that 
others have already hit the same issue are much higher.
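
To make point 6 concrete, here is a minimal sketch in Python 3 that turns a 
child's exit status into a signal name plus a rough hint.  It assumes a 
Unix-like system; the names HINTS and describe_exit are made up for this 
example, and the hints only paraphrase the advice above, so treat them as 
starting points rather than diagnoses.

```python
import signal
import subprocess
import sys

# Hypothetical hint table: rough first guesses only, not an exhaustive list.
HINTS = {
    "SIGSEGV": "invalid memory access; a bug, or flaky hardware if it is random",
    "SIGKILL": "killed from outside; check the kernel log for the OOM killer",
    "SIGABRT": "the program aborted itself, e.g. a failed assertion",
}


def describe_exit(returncode):
    """Turn a subprocess returncode into a human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode > 0:
        return f"exited with status {returncode}"
    # subprocess reports death-by-signal as the negated signal number.
    signum = -returncode
    try:
        name = signal.Signals(signum).name
    except ValueError:
        return f"killed by unknown signal {signum}"
    hint = HINTS.get(name)
    return f"killed by {name} ({signum})" + (f": {hint}" if hint else "")


if __name__ == "__main__":
    # Demo: the child sends itself SIGSEGV so there is something to decode.
    child = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    result = subprocess.run(child)
    print(describe_exit(result.returncode))
```

Note that a shell reports the same event differently: a process killed by 
signal N shows up as exit status 128+N in $?, so 139 from a shell script 
usually means SIGSEGV.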

I guess the above comes out of years of experience doing a little forensics 
here and there.  It is hardest to apply to binary-only applications you 
have little control over.

cool, HTH

-- 
Sunny Dubey

  mail: sunny at opencurve.org
  tele: 212.333.3542

