tech-kern archive
what to do on memory or cache errors?
besides panicking, of course.
This is going to involve a lot of help from UVM.
It seems that uvm_fault is not the right place to handle this. Maybe we need a
void uvm_page_error(paddr_t pa, int etype);
where etype would indicate whether this was a memory or cache fault, whether
the cache line was dirty, etc. If uvm_page_error can't "correct" the error, it
would panic.
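To make the etype idea concrete, here is a rough userland sketch of the
decision such a routine might make. Nothing in it is existing UVM code: the
PGERR_* flag bits, the pgerr_action enum, and page_error_classify() are
hypothetical names invented for illustration, paddr_t is a stand-in typedef,
and the function returns an action instead of panicking so it can be run
outside a kernel.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t paddr_t;            /* stand-in for the kernel type */

/* Hypothetical etype bits; none of these exist in UVM. */
#define PGERR_CACHE         0x01     /* set: cache fault, clear: memory fault */
#define PGERR_DIRTY         0x02     /* the affected cache line was dirty */
#define PGERR_UNCORRECTABLE 0x04     /* ECC could not repair the data */

enum pgerr_action {
    PGERR_ACT_SCRUB,                 /* rewrite the corrected memory word */
    PGERR_ACT_INVALIDATE,            /* toss the clean cache line */
    PGERR_ACT_WRITEBACK,             /* push the corrected dirty line out */
    PGERR_ACT_PANIC                  /* nothing left to try */
};

/*
 * Decide what to do about an error at physical address pa.
 * pa is unused in this sketch; a real handler would need it to
 * find the page and its owner.
 */
static enum pgerr_action
page_error_classify(paddr_t pa, int etype)
{
    (void)pa;
    assert((etype & ~(PGERR_CACHE | PGERR_DIRTY | PGERR_UNCORRECTABLE)) == 0);

    if (etype & PGERR_CACHE) {
        /* A clean line has a good copy in memory: just discard it. */
        if (!(etype & PGERR_DIRTY))
            return PGERR_ACT_INVALIDATE;
        /* A dirty line is the only copy; it must be repairable. */
        if (!(etype & PGERR_UNCORRECTABLE))
            return PGERR_ACT_WRITEBACK;
        return PGERR_ACT_PANIC;
    }

    /* Memory fault: PGERR_DIRTY is ignored here. */
    if (!(etype & PGERR_UNCORRECTABLE))
        return PGERR_ACT_SCRUB;
    return PGERR_ACT_PANIC;
}
```

For example, page_error_classify(pa, PGERR_CACHE) yields
PGERR_ACT_INVALIDATE, while PGERR_CACHE | PGERR_DIRTY | PGERR_UNCORRECTABLE
yields PGERR_ACT_PANIC. Whether an uncorrectable memory error could instead
be handled by refetching the page from backing store is exactly the kind of
question that needs UVM knowledge, so the sketch does not attempt it.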
Interactions with copyin/copyout will also need to be addressed.
Preemptively, we could have a thread force dirty cache lines to memory if
they've been in L2 "too long" (thereby reducing the problem to an ECC error on
a clean cache line, which means you just toss the cache-line contents). We
could also have a thread that reads all of memory (slowly), thereby causing any
single-bit errors to be corrected before they become double-bit errors.
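The slow memory reader could be structured as a cursor that touches a small,
bounded number of words per wakeup and wraps around at the end. The sketch
below is a userland model only: struct scrubber, scrub_init(), and
scrub_step() are hypothetical names, the "memory" is an ordinary array, and it
assumes that a plain read is enough to trigger correction. On real hardware
the read would have to miss the cache, and whether the corrected word is
written back to DRAM depends on the memory controller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scrubber state; not an existing kernel structure. */
struct scrubber {
    const volatile uint64_t *base;   /* start of the region to scrub */
    size_t nwords;                   /* region size in 64-bit words */
    size_t cursor;                   /* next word to read */
    uint64_t passes;                 /* completed sweeps of the region */
};

static void
scrub_init(struct scrubber *s, const volatile uint64_t *base, size_t nwords)
{
    assert(nwords > 0);
    s->base = base;
    s->nwords = nwords;
    s->cursor = 0;
    s->passes = 0;
}

/*
 * Read `budget` words starting at the cursor, wrapping at the end of
 * the region. Returns the number of words read. A scrub thread would
 * call this, sleep, and repeat, so that `budget` and the sleep
 * interval together set how slowly memory is swept.
 */
static size_t
scrub_step(struct scrubber *s, size_t budget)
{
    size_t i;

    for (i = 0; i < budget; i++) {
        (void)s->base[s->cursor];    /* volatile access forces the read */
        if (++s->cursor == s->nwords) {
            s->cursor = 0;
            s->passes++;
        }
    }
    return i;
}
```

Keeping the per-step budget small is the point of "slowly": the sweep should
cost little memory bandwidth, as long as a full pass finishes well before a
second bit flip in the same word becomes likely.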
I'm not familiar enough with UVM internals to actually know what to do, but I
hope someone else reading this is.
Comments, anyone?