Current-Users archive


Re: Ipmi(4) under NetBSD-5 seems to have problems



        Hello.  I don't get that exact crash, but I bet my patch fixes it.
What I see is some kind of null pointer dereference under NetBSD-3 when
talking to the BMC board fails.  It's not consistent, so I suspect that
commands and data are corrupted on their way to the BMC board or on their
way back from it.  I get different symptoms on different machines.  One
machine I have just turns off with a memory exception when the tickler
fails.  Other machines just reboot.  It could be that scribbling to random
spots in the BMC's memory just does terrible things to one's machine.
I've been trying to track this down for a couple of weeks now.

-thanks
-Brian
that's under NetBSD-5.  
On Jul 14,  1:13pm, David Young wrote:
} Subject: Re: Ipmi(4) under NetBSD-5 seems to have problems
} On Tue, Jul 14, 2009 at 10:30:24AM -0700, Brian Buhrow wrote:
} >     Hello.  I've just filed kern/41724 with patches to fix the ipmi(4)
} > driver when using the watchdog timer.  I've submitted patches for
} > NetBSD-3.x, NetBSD-4.x and NetBSD-5.x.  These patches solve the problem of
} > making the watchdog timer crash the system when the tickler is unable to
} > communicate with it.  The problem is that the watchdog timer functions do
} > not honor the locks which are used when accessing sensor data.  By ensuring
} > that only one set of functions is accessing the bmc boards at once, all
} > errors tickling the timers go away, and the rate of spurious out of range
} > conditions drops as well, although it doesn't go away entirely.  (It looks
} > like a new fix was just committed to the ipmi(4) driver in -current that
} > might address that issue once and for all.)
} 
} I activated the watchdog timer on ipmi0 for the first time, and the
} system crashed as follows.  Does your system crash similarly?
} My system is a Dell PowerEdge 1950, btw.
} 
} skyking# wdogctl -k -p 32 ipmi0
} skyking# wdogctl
} Available watchdog timers:
}         ipmi0, 32 second period [armed, kernel tickle]
}         ichlpcib0, 367 second period
} skyking# 
} skyking# panic: assert_sleepable: softint caller=0xc0452665
} cpu0: Begin traceback...
} kern_malloc(203a7325,63207325,656c6c61,70253d72,73736100,69747265,66206e6f,656c6961,6e203a64,6b636f6c) at netbsd:kern_malloc+0x175
} cpu0: End traceback...
} 
} dumping to dev 4,1 offset 8
} dump 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193
} 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173
} 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153
} 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133
} 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113
} 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91
} 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65
} 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39
} 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13
} 12 11 10 9 8 7 6 5 4 3 2 1 succeeded
} 
} Dave
} 
} -- 
} David Young             OJC Technologies
} dyoung%ojctech.com@localhost      Urbana, IL * (217) 278-3933
>-- End of excerpt from David Young
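
For readers following the thread, the idea behind the fix quoted above
(the watchdog tickle and the sensor refresh taking the same lock before
touching the BMC) can be modelled in user space.  This is only a sketch,
not the code from kern/41724: FakeBMC, Driver, cmd_lock and run_demo are
hypothetical names, the command strings are just labels, and a Python
threading.Lock stands in for whatever lock the real driver uses.

```python
import threading


class FakeBMC:
    """Toy stand-in for a BMC: a single slot that echoes the last command."""

    def __init__(self):
        self._slot = None

    def send(self, cmd):
        self._slot = cmd

    def recv(self):
        reply, self._slot = self._slot, None
        return reply


class Driver:
    """Hypothetical driver model; every BMC access goes through one lock."""

    def __init__(self):
        self.bmc = FakeBMC()
        self.cmd_lock = threading.Lock()  # shared by sensors and watchdog
        self.mismatches = 0

    def _transact(self, cmd):
        # Hold the lock across the whole send/receive pair so that another
        # caller cannot overwrite the slot between the two steps.
        with self.cmd_lock:
            self.bmc.send(cmd)
            reply = self.bmc.recv()
            if reply != cmd:
                self.mismatches += 1
        return reply

    def refresh_sensor(self):
        return self._transact("GET_SENSOR_READING")

    def tickle_watchdog(self):
        return self._transact("RESET_WATCHDOG_TIMER")


def run_demo(iterations=1000):
    """Run a sensor thread and a tickler thread; return the mismatch count."""
    drv = Driver()

    def loop(op):
        for _ in range(iterations):
            op()

    threads = [threading.Thread(target=loop, args=(op,))
               for op in (drv.refresh_sensor, drv.tickle_watchdog)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return drv.mismatches


if __name__ == "__main__":
    print("mismatched replies:", run_demo())
```

Because both paths go through _transact(), each caller always reads back
the reply to its own command.  If the tickler bypassed cmd_lock, its
send() could land between the sensor path's send() and recv(), which is
the kind of interleaving the quoted message describes.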



