...On Fri 17 Mar 2023 at 10:54:46 +1100, Kalvis Duckmanton wrote:
On 15/3/23 18:01, Jan-Benedict Glaw wrote:
On Thu, 2023-03-09 12:39:19 +1100, Kalvis Duckmanton <kalvisd%gmail.com@localhost> wrote:
I have no explanation for why this might be a problem only for large memory configurations, though.
I'm so keen to see if this works. But without understanding what's actually happening, that is probably not ready to be merged upstream?
The idea of Restartable Atomic Sequences is that you do NOT need to lock out interrupts (so in that regard this patch is totally the wrong thing to do), but instead you detect when an interrupt hits inside the sequence. If so, you restart it from the beginning.
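A minimal sketch of that restart rule, for illustration only. The helper name ras_resume_pc and its parameters are hypothetical and not taken from the kernel sources; it assumes the sequence occupies the half-open address range [ras_start, ras_end).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the NetBSD sources): given the PC at
 * which an interrupt hit, return the PC execution should resume at.
 * If the interrupt landed inside the restartable sequence
 * [ras_start, ras_end), the sequence is restarted from its beginning;
 * otherwise execution resumes where it was interrupted.
 */
static uintptr_t
ras_resume_pc(uintptr_t pc, uintptr_t ras_start, uintptr_t ras_end)
{
    if (pc >= ras_start && pc < ras_end)
        return ras_start;   /* interrupted mid-sequence: start over */
    return pc;              /* outside the sequence: nothing to do */
}
```

Interrupts stay enabled throughout; the only cost is re-running the sequence when one happens to land inside it.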
I stand corrected - this patch is not the right thing to do.
4. Perhaps this is the reason why the whole thing doesn't work if you have a lot of RAM: if we take the comment "beyond the end of the system page table" at face value, then maybe such a large memory configuration causes the page tables to grow enough that the T_PTELEN trap does not occur. Instead there would be some other trap type.
This does seem to better fit the symptoms - what I see is a
T_ACCFLT not a T_PTELEN fault. I note that accessing a virtual
address with the top 2 bits being 1 (so VAddr<31:30> = 3) is
defined to cause a length-violation fault - so perhaps changing
the definition of CASMAGIC in trap.h is all that's needed?
/* Used by RAS to detect an interrupted CAS */
#define CASMAGIC 0xFEDABABE /* accessing the reserved region causes a length violation */
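As a quick check of the value above, here is a rough sketch of how VAddr<31:30> selects the region. The helper names vax_va_region and vax_va_is_reserved are made up for illustration and do not exist in the tree.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not from the NetBSD sources).
 * VAddr<31:30> selects the VAX address region:
 *   0 = P0, 1 = P1, 2 = S0 (system space), 3 = reserved.
 */
static unsigned
vax_va_region(uint32_t va)
{
    return (unsigned)(va >> 30);    /* keep only the top two bits */
}

/* Nonzero when the address falls in the reserved region. */
static int
vax_va_is_reserved(uint32_t va)
{
    return vax_va_region(va) == 3u;
}
```

The top nibble of 0xFEDABABE is 0xF (binary 1111), so vax_va_is_reserved(0xFEDABABE) is nonzero, whereas any address in 0x80000000..0xBFFFFFFF lands in region 2 (system space).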
kalvis