Current-Users archive


re: mysql sleep timeouts



On Sep 29, 22:02, John Nemeth wrote:
} On Sep 29, 20:51, John Nemeth wrote:
} } On Sep 30, 13:22, matthew green wrote:
} } } John Nemeth writes:
} } } >      I'm seeing a problem in MySQL Cluster where sleep timeouts
} } } > are running too long.  I don't know if this problem also affects
} } } > regular MySQL, but since they share a lot of code, it is very
} } } > possible.
} } } >
} } } >      After doing a fair bit of digging, I found a function called
} } } > NdbSleep_MilliSleep().  The primary line in this function is
} } } > "select(0, nullptr, nullptr, nullptr, &t);", where t is a "struct
} } } > timeval".  I'm guessing that the select() timeout doesn't provide
} } } > millisecond-level granularity?  Can somebody confirm?  What would
} } } > be a better option (hopefully reasonably portable)?
} } } 
} } } what's "too long"?  note that you can't currently get sleeps with
} } } finer resolution than hz, ie, the default 10ms tick, so if you're
} } } seeing 10ms instead of 1ms, the only current workaround is to
} } } run with HZ=1000 kernels.
} } 
} }      A sampling of log messages (these repeat many times):
} } 
} } 2024-05-26 21:58:51 [ndbd] INFO     -- Watchdog: User time: 10705  System time: 2339
} } 2024-05-26 21:58:51 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: JobHandling in block: 0, gsn: 0 elapsed=1654
} } 2024-05-26 21:58:51 [ndbd] INFO     -- Watchdog: User time: 10705  System time: 2339
} } 2024-05-26 21:58:51 [ndbd] WARNING  -- Time moved forward with 1678 ms
} } 2024-05-26 21:58:51 [ndbd] WARNING  -- timerHandlingLab, expected 10ms sleep, not scheduled for: 1682 (ms)
} } 2024-05-26 21:58:51 [ndbd] INFO     -- Bursty environment, mean burstiness of 92 pct, some risk of congestion issues
} } 2024-05-26 21:58:53 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: JobHandling in block: 0, gsn: 0 elapsed=109
} } 2024-05-26 21:58:53 [ndbd] INFO     -- Watchdog: User time: 10782  System time: 2347
} } 2024-05-26 21:58:53 [ndbd] INFO     -- timerHandlingLab, expected 10ms sleep, not scheduled for: 249 (ms)
} } 2024-05-26 21:58:58 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: JobHandling in block: 0, gsn: 0 elapsed=200
} } 
} } } when we have better timers available, the above method should
} } } work fine -- select() passes microsecond precision; we'd only have
} } } to hook it up to the future timer system.
} } } 
} } } (alternatively, if you _need_ this level of precision now, the
} } } only real way is to hard-spin until time passes.)
} } 
} }       How would one do that from userland?
} } 
} } }-- End of excerpt from matthew green
} 
}      I guess I should mention that it is running in a Xen domU.
} The host is netbsd-9 from September 17th, 2023 with Xen 4.15.5.
} It is slated for an update.

     And, I should really check.  My build boxes are on the Xen
host.  The production box running the database is on Proxmox, so it is
KVM with QEMU for device emulation, using virtio for network and
disk.  The timecounter is hpet0.
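
     To put a number on "too long" outside of ndbd, a small test
program can time the same select() call against the monotonic clock.
This is only a rough sketch: measure_select_sleep_us() is a made-up
helper name, not anything in MySQL, and it assumes the platform
provides clock_gettime() with CLOCK_MONOTONIC.

```cpp
// Sketch: time a select()-based millisecond sleep of the kind the thread
// describes in NdbSleep_MilliSleep(). measure_select_sleep_us() is a
// hypothetical helper, not part of MySQL.
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>
#include <cassert>

// Sleep for `ms` milliseconds via select() with no file descriptors and
// return the elapsed time in microseconds, read from the monotonic clock.
static long long measure_select_sleep_us(int ms)
{
    assert(ms >= 0);

    struct timeval t;
    t.tv_sec = ms / 1000;
    t.tv_usec = (ms % 1000) * 1000;

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    select(0, nullptr, nullptr, nullptr, &t);
    clock_gettime(CLOCK_MONOTONIC, &end);

    // Work in nanoseconds first so a negative tv_nsec difference is handled.
    const long long ns = (end.tv_sec - start.tv_sec) * 1000000000LL +
                         (end.tv_nsec - start.tv_nsec);
    return ns / 1000;
}
```

     Calling it in a loop for requests of 1, 5, 10 and 20 ms and
printing the results should show whether the overshoot is on the order
of one tick or closer to the hundreds of milliseconds in the log
messages; the latter would suggest the thread is not being scheduled
rather than a timer granularity limit.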

} }-- End of excerpt from John Nemeth
}-- End of excerpt from John Nemeth
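
     For reference, the hard-spin approach mentioned earlier in the
thread could look roughly like the following from userland.  It is a
minimal sketch, assuming clock_gettime() with CLOCK_MONOTONIC is
available; monotonic_ns() and spin_sleep_ms() are illustrative names
only.

```cpp
// Sketch: busy-wait ("hard-spin") until a deadline on the monotonic clock.
// monotonic_ns() and spin_sleep_ms() are hypothetical helper names.
#include <time.h>
#include <cassert>

// Current monotonic time in nanoseconds.
static long long monotonic_ns()
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

// Spin until at least `ms` milliseconds have passed. The thread never
// yields, so this burns CPU for the whole interval.
static void spin_sleep_ms(int ms)
{
    assert(ms >= 0);
    const long long deadline = monotonic_ns() + ms * 1000000LL;
    while (monotonic_ns() < deadline) {
        // busy-wait: keep polling the clock
    }
}
```

     Spinning only guarantees that the call does not return early.
The thread can still be preempted, and in a virtual machine the whole
vCPU can be descheduled by the host, so long stalls remain possible.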

