One of the apps running on Sun Cluster was randomly crashing, so I decided to take a look at what was happening. Yes, there is DTrace in Solaris 10, but since I am pretty comfortable with truss, I decided to give that a shot first:

root@node1 # truss -p 27462
truss: process is traced: 27462
root@node1 #

That’s it. No truss output, nothing. That was weird. truss will not work if a debugger is attached to the process being traced, but that was not the case here. So I figured it might have something to do with the fact that the process is handled by the cluster software.

Finally, the NOTES section of the pmfadm man page gave me the answer:

To avoid collisions with other controlling processes, truss(1) does not allow tracing a process that it detects as being controlled by another process by way of the /proc interface. Since rpc.pmfd(1M) uses the /proc interface to monitor processes and their descendents, those processes that are submitted to rpc.pmfd by way of pmfadm cannot be traced or debugged.

So, DTrace it was. Thankfully, Brendan Gregg had already done the hard work for me by creating a DTrace version of truss. The more you know…
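
For the curious, here is roughly what the idea looks like as a bare DTrace one-liner. This is not Brendan's script, just a minimal sketch that prints the name of every system call the process makes. The PID is the one from the truss attempt above. Filtering on pid in the predicate instead of attaching with -p is my own assumption: as far as I understand, -p grabs the process through /proc, which is exactly what we are trying to avoid here. The check for dtrace is only there so the snippet does not blow up on a box without it.

```shell
#!/bin/sh
# PID of the process to watch (the one truss refused to trace above).
pid=27462

# D program: fire on every system call entry, keep only our PID,
# and print the name of the system call.
d_prog='syscall:::entry /pid == '"$pid"'/ { printf("%s\n", probefunc); }'

# Run it only where dtrace exists (it needs root on Solaris 10);
# otherwise just show the command that would be run.
if command -v dtrace >/dev/null 2>&1; then
    dtrace -q -n "$d_prog"
else
    echo "dtrace not found; would run: dtrace -q -n '$d_prog'"
fi
```

Stop it with Ctrl-C once you have seen enough. It shows far less than truss does (no arguments, no return values), which is why a ready-made script is the nicer option.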