Current-Users archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Noisy ipmi0 after bootup? (really other kernel noise)



On Fri, 5 Sep 2008, Juan Romero Pardines wrote:

2008/9/5 Paul Goyette <paul%whooppee.com@localhost>:

I'm not at all familiar with ipmi, but looking at the code, I think there's
at least one problem here.  Shouldn't failure to send or receive the command
result in an SINVALID result?

Index: ipmi.c
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/x86/ipmi.c,v
retrieving revision 1.18
diff -u -p -r1.18 ipmi.c
--- ipmi.c      17 Apr 2008 05:26:11 -0000      1.18
+++ ipmi.c      5 Sep 2008 12:59:56 -0000
@@ -1351,7 +1351,7 @@ ipmi_sensor_status(struct ipmi_softc *sc
               if (ipmi_sendcmd(sc, s1->owner_id, s1->owner_lun,
                   SE_NETFN, SE_GET_SENSOR_THRESHOLD, 1, data) ||
                   ipmi_recvcmd(sc, sizeof(data), &rxlen, data))
-                       return ENVSYS_SVALID;
+                       return ENVSYS_SINVALID;


Probably that's the right thing.

Actually, I don't think it is, after having looked closer!

There's a separate check earlier that checks if ipmi returned valid sensor data. This code is specifically checking to see if the limits were successfully requested and received. It's perfectly reasonable that the limits might not be received, in which case the sensor data should be assumed to be within limits. Since the data has previously been found to be valid, return SVALID here is the correct thing to do.

However, I did find another error, just a few lines further down!

It seems that the ipmi_test_threshold() routine checks for both a high limit and a low limit, and returns zero or non-zero. But it doesn't differentiate between a high-limit-exceeded and a low-limit-exceeded. The calling code (just a few lines down from the above patch) assumes that it must be an upper-limit failing.

I'm suspecting that in this situation, we're actually detecting that the fans are running at a speed which is below the critical low-limit and we should therefore be returning CRITUNDER (or WARNUNDER) rather than the *OVER states. The attached patch (untested, not even compiled yet!) will split ipmi_test_threshold() into two separate routines, so we can tell which limit was exceeded and return the proper state.



----------------------------------------------------------------------
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul%whooppee.com@localhost   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette%juniper.net@localhost |
----------------------------------------------------------------------
Index: ipmi.c
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/x86/ipmi.c,v
retrieving revision 1.18
diff -u -p -r1.18 ipmi.c
--- ipmi.c      17 Apr 2008 05:26:11 -0000      1.18
+++ ipmi.c      5 Sep 2008 16:01:44 -0000
@@ -214,7 +214,8 @@ void        ipmi_unmap_regs(struct ipmi_softc *
 
 void   *scan_sig(long, long, int, int, const void *);
 
-int    ipmi_test_threshold(uint8_t, uint8_t, uint8_t, uint8_t);
+int    ipmi_test_threshold_lo(uint8_t, uint8_t, uint8_t);
+int    ipmi_test_threshold_hi(uint8_t, uint8_t, uint8_t);
 int    ipmi_sensor_status(struct ipmi_softc *, struct ipmi_sensor *,
                           envsys_data_t *, uint8_t *);
 
@@ -1307,11 +1308,17 @@ ipmi_convert(uint8_t v, struct sdrtype1 
 }
 
 int
-ipmi_test_threshold(uint8_t v, uint8_t valid, uint8_t hi, uint8_t lo)
+ipmi_test_threshold_hi(uint8_t v, uint8_t valid, uint8_t hi)
 {
-       dbg_printf(10, "thresh: %.2x %.2x %.2x %d\n", v, lo, hi,valid);
+       dbg_printf(10, "thresh_hi: %.2x %.2x %d\n", v, hi, valid);
+       return (valid & 8 && hi != 0xFF && v >= hi);
+}
+
+int
+ipmi_test_threshold_lo(uint8_t v, uint8_t valid, uint8_t lo)
+{
+       dbg_printf(10, "thresh_lo: %.2x %.2x %d\n", v, lo, valid);
        return ((valid & 1 && lo != 0x00 && v <= lo) ||
-           (valid & 8 && hi != 0xFF && v >= hi));
 }
 
 int
@@ -1357,18 +1364,30 @@ ipmi_sensor_status(struct ipmi_softc *sc
                    data[0], data[1], data[2], data[3], data[4], data[5],
                    data[6]);
 
-               if (ipmi_test_threshold(*reading, data[0] >> 2 ,
+               if (ipmi_test_threshold_hi(*reading, data[0] >> 2 ,
                    data[6], data[3]))
                        return ENVSYS_SCRITOVER;
 
-               if (ipmi_test_threshold(*reading, data[0] >> 1,
+               if (ipmi_test_threshold_hi(*reading, data[0] >> 1,
                    data[5], data[2]))
                        return ENVSYS_SCRITOVER;
 
-               if (ipmi_test_threshold(*reading, data[0] ,
+               if (ipmi_test_threshold_hi(*reading, data[0] ,
                    data[4], data[1]))
                        return ENVSYS_SWARNOVER;
 
+               if (ipmi_test_threshold_lo(*reading, data[0] >> 2 ,
+                   data[6], data[3]))
+                       return ENVSYS_SCRITUNDER;
+
+               if (ipmi_test_threshold_lo(*reading, data[0] >> 1,
+                   data[5], data[2]))
+                       return ENVSYS_SCRITUNDER;
+
+               if (ipmi_test_threshold_lo(*reading, data[0] ,
+                   data[4], data[1]))
+                       return ENVSYS_SWARNUNDER;
+
                break;
 
        case IPMI_SENSOR_TYPE_INTRUSION:


Home | Main Index | Thread Index | Old Index