captain holly java blog

How does Sigar work?

Posted in native, Uncategorized by mcgyver5 on June 4, 2009

Sigar is an impressive suite of system monitoring and reporting tools made by a company called Hyperic (which was purchased by SpringSource in May). Hyperic also makes the enterprise Hyperic HQ monitoring tool suite.

Sigar’s basic function is to make underlying system information available to Java. The pieces that come with Sigar include:

  • A set of system info commands like ps, netstat, ifconfig, and df
  • The Sigar object, which maintains platform-specific internal state. Looking at the source code, we see that this object is passed around all over the place so that each part of Sigar knows which platform it is dealing with.
  • The Sigar Proxy, an interface implemented by the Sigar object, which is used to provide caching at the Java level.
  • The Sigar shell, a command shell used to issue Sigar commands and queries
  • The sigar.jar library, which can be included in Java programs
  • Hooks into JMX
  • A pager
  • Hooks into the Windows registry and other Windows-specific stuff
  • Process Table Query Language. Sigar has its own process table query language (PTQL) that is used to drill into process info and find things such as a process's children or the command-line arguments it was started with. It identifies processes by attributes other than process ID, so Sigar's identification of a process can persist over time even if the process ID changes after a restart.
  • Hooks into VMware information, which allow Java to monitor and control VMware appliances.
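The idea of caching at the Java level behind an interface can be sketched with a JDK dynamic proxy. This is only an illustration under assumptions: SystemInfo, CachingProxyDemo, and the 60-second lifetime are hypothetical names and values, not Sigar's API or its actual cache policy.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a system-info interface; not a Sigar type.
interface SystemInfo {
    long getUptimeSeconds();
}

class CachingProxyDemo {
    // Counts how often the underlying (expensive) call is really made.
    static int realCalls = 0;

    // Wraps target in a proxy that reuses each result for ttlMillis.
    // Results are keyed by method name only, which is enough for no-arg getters.
    static SystemInfo cached(final SystemInfo target, final long ttlMillis) {
        InvocationHandler handler = new InvocationHandler() {
            private final Map<String, Object> values = new HashMap<String, Object>();
            private final Map<String, Long> stamps = new HashMap<String, Long>();

            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                String key = method.getName();
                long now = System.currentTimeMillis();
                Long stamp = stamps.get(key);
                if (stamp != null && now - stamp < ttlMillis) {
                    return values.get(key);          // still fresh: skip the real call
                }
                Object result = method.invoke(target, args);
                values.put(key, result);
                stamps.put(key, now);
                return result;
            }
        };
        return (SystemInfo) Proxy.newProxyInstance(
            SystemInfo.class.getClassLoader(),
            new Class<?>[] { SystemInfo.class },
            handler);
    }

    public static void main(String[] args) {
        SystemInfo real = new SystemInfo() {
            public long getUptimeSeconds() {
                realCalls++;
                return 111000L;
            }
        };
        SystemInfo proxy = cached(real, 60000L);
        proxy.getUptimeSeconds();
        proxy.getUptimeSeconds();
        System.out.println("real calls: " + realCalls);
    }
}
```

The caller only ever sees the interface, so the cache can be added or removed without touching the code that asks for the data.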

Many Sigar classes override toString() to provide formatted output similar to what you would expect to see from your system, and they also provide getter methods so you can customize the reporting.
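A minimal sketch of that pattern, assuming a made-up MemReport class (it is not one of Sigar's classes): toString() gives the ready-made line, and the getters let you build your own.

```java
// Hypothetical value class illustrating "formatted toString() plus getters".
class MemReport {
    private final long totalKb;
    private final long usedKb;

    MemReport(long totalKb, long usedKb) {
        this.totalKb = totalKb;
        this.usedKb = usedKb;
    }

    long getTotalKb() { return totalKb; }
    long getUsedKb()  { return usedKb; }
    long getFreeKb()  { return totalKb - usedKb; }

    // Ready-made output in the style of the system's own tools.
    @Override
    public String toString() {
        return "Mem: " + totalKb + "K av, " + usedKb + "K used, " + getFreeKb() + "K free";
    }

    public static void main(String[] args) {
        MemReport mem = new MemReport(2086444L, 1408404L);
        System.out.println(mem);                                          // canned format
        System.out.println("used MB: " + (mem.getUsedKb() / 1024));       // custom report
    }
}
```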

A good example of what Sigar provides is the Top command.

The command uses PTQL.
The simplest usage would be java -jar sigar.jar Top Pid.Pid.eq=3976
which gives a result of:

C:\Downloads\java\hyperic-sigar-1.6.0\hyperic-sigar-1.6.0\sigar-bin\lib>java -jar sigar.jar Top Pid.Pid.eq=3976
←[2J  3:14 PM  up 1 day, 6:50, (load average unknown)
62 processes: 0 sleeping, 62 running, 0 zombie, 0 stopped... 728 threads
CPU states: 0.0% user, 0.0% system, 0.0% nice, 0.0% wait, 100.0% idle
Mem: 2086444K av, 1408404K used, 678040K free
Swap: 4027084K av, 2015628K used, 2011456K free

PID     USER    STIME   SIZE    RSS     SHARE   STATE   TIME    %CPU    COMMAND
3976    McGuiT1 Jun1    296M     31M    -       R       0:45    0.0%    C:\Program Files\Microsoft Office\OFFICE11\WINWORD.EX

Your Java program can use it to get the top resource hogs on the underlying system with something like:

String qs = "Mem.Size.gt=6000000";
ProcessQuery query;
try {
    query = this.qf.getQuery(qs);
} catch (MalformedQueryException e) {
    traceln("parse error: " + qs);
    throw e;
}
try {
    long[] pids = query.find(sigar);
...
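The two queries shown in this post, Pid.Pid.eq=3976 and Mem.Size.gt=6000000, both have the shape Class.Attribute.operator=value. The sketch below just splits such a string into its parts to make that shape visible. PtqlParts is a hypothetical helper, not part of Sigar, and it only handles this simple three-part form.

```java
// Hypothetical helper: splits a simple PTQL-style query into its four parts.
class PtqlParts {
    final String type;       // e.g. "Mem"
    final String attribute;  // e.g. "Size"
    final String operator;   // e.g. "gt"
    final String value;      // e.g. "6000000"

    PtqlParts(String type, String attribute, String operator, String value) {
        this.type = type;
        this.attribute = attribute;
        this.operator = operator;
        this.value = value;
    }

    static PtqlParts parse(String query) {
        int eq = query.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("missing '=': " + query);
        }
        // Left of '=' is expected to be Class.Attribute.operator
        String[] lhs = query.substring(0, eq).split("\\.");
        if (lhs.length != 3) {
            throw new IllegalArgumentException("expected Class.Attribute.operator: " + query);
        }
        return new PtqlParts(lhs[0], lhs[1], lhs[2], query.substring(eq + 1));
    }

    public static void main(String[] args) {
        PtqlParts p = parse("Mem.Size.gt=6000000");
        System.out.println(p.type + " / " + p.attribute + " / " + p.operator + " / " + p.value);
    }
}
```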

Some of the source code contains funny comments:

//XXX currently weak sauce. should end up like netstat command.

I guess “weak sauce” refers to things like

printf(HEADER) 

where HEADER is an array of strings.

I found that Sigar’s Netstat command sometimes runs significantly slower than the equivalent system command.
Output of Sigar’s netstat command (Loopelapsed is the elapsed time for each loop iteration, in milliseconds):

Loopelapsed: 4502
[tcp, localhost:5152, localhost:2857, CLOSE_WAIT, ]
Loopelapsed: 1
[tcp, localhost:2658, localhost:2659, ESTABLISHED, ]
Loopelapsed: 0
[tcp, localhost:2659, localhost:2658, ESTABLISHED, ]
Loopelapsed: 0
[tcp, CM03802-5RR31D1:3575, 10.10.84.1:https, ESTABLISHED, ]
Loopelapsed: 4503
[tcp, CM03802-5RR31D1:2523, 10.11.12.13:https, ESTABLISHED, ]
Loopelapsed: 4502
[tcp, CM03802-5RR31D1:1108, mrrr.mn.us:524, ESTABLISHED, ]

What in god’s name is it doing? The reason it takes so long is that after it uses a native method to get a list of connections, it calls the getHostName() method of Java’s InetAddress class on each address to find a host name for the IP address. According to the InetAddress API, that call performs a reverse name lookup, and on my system the reverse DNS lookup is slow. My Windows netstat program obviously has some strong sauce and is not using the same slow reverse DNS lookup. I’m not sure why. This is a mystery for another day.
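To see the difference for yourself, here is a small sketch using only the JDK. LookupTiming and fromBytes are names made up for this example, and the address is the one from the output above. getHostAddress() only formats the bytes, while getHostName() may go out to the resolver, so the timings depend entirely on your network.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class LookupTiming {
    // Builds an InetAddress from raw bytes; no DNS lookup happens here.
    static InetAddress fromBytes(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            // Only thrown for an illegal address length, which cannot happen with 4 bytes.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        InetAddress addr = fromBytes(10, 11, 12, 13);

        long t0 = System.currentTimeMillis();
        String numeric = addr.getHostAddress();   // formats the bytes, no network traffic
        long t1 = System.currentTimeMillis();
        String name = addr.getHostName();         // may trigger a reverse lookup
        long t2 = System.currentTimeMillis();

        System.out.println("getHostAddress: " + numeric + " in " + (t1 - t0) + " ms");
        System.out.println("getHostName:    " + name + " in " + (t2 - t1) + " ms");
    }
}
```

If the reverse lookup is the bottleneck, printing the numeric address instead of the host name avoids it, at the cost of less readable output.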

The grunt work in Sigar is handed off to C code with calls like this:

 public native NetConnection[] getNetConnectionList(int flags)
        throws SigarException;

In this case, the job of getting the list of connections is handed off to a native implementation in C. On Windows this uses the IP Helper DLL, iphlpapi.dll.
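On the Java side, a native method is nothing more than a declaration. The JVM binds it to C code only when it is called. The sketch below shows this with a hypothetical class and method (NativeDeclDemo and listConnections are not Sigar names). No native library is loaded, so the call fails with UnsatisfiedLinkError, which is the error you would get if the native library were missing.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeDeclDemo {
    // Hypothetical native method; deliberately has no library backing it.
    native long[] listConnections(int flags);

    // Reports whether the named method carries the 'native' modifier.
    static boolean isNative(String methodName, Class<?>... params) {
        try {
            Method m = NativeDeclDemo.class.getDeclaredMethod(methodName, params);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Calls the native method and reports whether linking failed.
    static boolean callFailsWithoutLibrary() {
        try {
            new NativeDeclDemo().listConnections(0);
            return false;
        } catch (UnsatisfiedLinkError e) {
            return true;   // no C implementation was found at call time
        }
    }

    public static void main(String[] args) {
        System.out.println("declared native: " + isNative("listConnections", int.class));
        System.out.println("fails without library: " + callFailsWithoutLibrary());
    }
}
```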

Running the Sigar tests from the command line returns a ton of system information and a few errors. Most of these are due to permissions. Running as root makes most of these errors go away.

Experiment with the Sigar shell: java -jar sigar.jar
sigar> [TAB] gets you all the choices:

sigar> ps [TAB] gets you all the options for ps:
sigar> ps
Env
Args
Fd
Cpu
Time
State
Modules
Cred
Mem
CredName
Exe

sigar> ps Mem.[TAB]
gets you more options:
sigar> ps Mem.
Resident
Share
MinorFaults
MajorFaults
PageFaults
Rss
Vsize
Size

It uses Java’s reflection package (java.lang.reflect) to look up the options available for each method.

    public static Collection getMethodOpNames(Method method) {
        if (method == null) {
            return SOPS;
        }
        Class rtype = method.getReturnType();
        if ((rtype == Character.TYPE) ||
            (rtype == Double.TYPE) ||
            (rtype == Integer.TYPE) ||
            (rtype == Long.TYPE))
        {
            return NOPS;
        }
        return SOPS;
    }
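Here is a self-contained sketch of the same idea: pick a set of operators based on a getter's return type. Everything in it is an assumption for illustration. MemInfo is a made-up class, and the two operator lists stand in for NOPS and SOPS, whose real contents are not shown in this post.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

class OperatorChoice {
    // Illustrative operator names only; stand-ins for NOPS and SOPS.
    static final List<String> NUMERIC_OPS = Arrays.asList("eq", "ne", "gt", "ge", "lt", "le");
    static final List<String> STRING_OPS  = Arrays.asList("eq", "ne", "ew", "sw", "ct", "re");

    // Hypothetical class whose getters we inspect.
    static class MemInfo {
        public long getSize()   { return 0L; }
        public String getName() { return ""; }
    }

    // Same decision as the snippet above: primitive numeric-ish types get
    // the numeric operators, everything else gets the string operators.
    static List<String> opsFor(Method method) {
        if (method == null) {
            return STRING_OPS;
        }
        Class<?> rtype = method.getReturnType();
        if (rtype == Character.TYPE || rtype == Double.TYPE
                || rtype == Integer.TYPE || rtype == Long.TYPE) {
            return NUMERIC_OPS;
        }
        return STRING_OPS;
    }

    // Convenience lookup by getter name on the hypothetical MemInfo class.
    static List<String> opsForGetter(String getterName) {
        try {
            return opsFor(MemInfo.class.getMethod(getterName));
        } catch (NoSuchMethodException e) {
            return opsFor(null);
        }
    }

    public static void main(String[] args) {
        System.out.println("getSize -> " + opsForGetter("getSize"));
        System.out.println("getName -> " + opsForGetter("getName"));
    }
}
```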

To find the actual methods, it uses a static initializer block:

  static {
        Method[] methods = SigarProxy.class.getMethods();
        for (int i=0; i<methods.length; i++) {
            String name = methods[i].getName();
            if (!name.startsWith(PROC_PREFIX)) {
                continue;
            }
            Class[] params = methods[i].getParameterTypes();
            if (!((params.length == 1) &&
                  (params[0] == Long.TYPE)))
            {
                continue;
            }
            METHODS.put(name.substring(PROC_PREFIX.length()), methods[i]);
        }
    }

With a static block, the code runs automatically, once, when the class is initialized (the first time the class is actually used).
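The same pattern in a runnable form, as a sketch: a static initializer fills a lookup table by reflecting over an interface. ProcSource and its methods are hypothetical stand-ins for SigarProxy, chosen so that two methods pass the filter and two do not.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical interface standing in for SigarProxy.
interface ProcSource {
    long getProcMem(long pid);
    String getProcExe(long pid);
    long[] getProcList();   // no single long parameter: skipped
    String getFqdn();       // wrong prefix: skipped
}

class MethodTable {
    static final String PROC_PREFIX = "getProc";
    // TreeMap keeps the keys sorted so the output order is predictable.
    static final Map<String, Method> METHODS = new TreeMap<String, Method>();

    // Runs once, when MethodTable is initialized.
    static {
        for (Method m : ProcSource.class.getMethods()) {
            String name = m.getName();
            if (!name.startsWith(PROC_PREFIX)) {
                continue;
            }
            Class<?>[] params = m.getParameterTypes();
            if (params.length != 1 || params[0] != Long.TYPE) {
                continue;
            }
            // Key is the method name with the prefix removed, e.g. "Mem".
            METHODS.put(name.substring(PROC_PREFIX.length()), m);
        }
    }

    public static void main(String[] args) {
        System.out.println(METHODS.keySet());   // prints [Exe, Mem]
    }
}
```

Nothing has to call a setup method: the first reference to MethodTable.METHODS already sees the filled table.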

We can see a drawback to using reflection here: the tab completion also returns some deprecated methods.
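One possible way around this, sketched below, is to skip methods marked with the @Deprecated annotation while reflecting. This rests on an assumption: it only works if the methods carry the annotation, because a javadoc-only @deprecated tag is not visible to reflection. Completions and DeprecatedFilter are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface with one current and one deprecated method.
interface Completions {
    long getProcMem(long pid);

    @Deprecated
    long getProcOldMem(long pid);
}

class DeprecatedFilter {
    // Returns the names of all public methods that are not annotated @Deprecated.
    static List<String> currentNames(Class<?> type) {
        List<String> names = new ArrayList<String>();
        for (Method m : type.getMethods()) {
            if (m.isAnnotationPresent(Deprecated.class)) {
                continue;   // hide deprecated entries from completion
            }
            names.add(m.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(currentNames(Completions.class));   // prints [getProcMem]
    }
}
```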

Other Problems:

  1. Depending on how you are logged in, Sigar may not have access to some system resources.
  2. A bunch of javadocs are missing. Plenty of room here for you to chip in and write some docs.