NetBSD-Bugs archive


port-sparc64/39700: NFS client file corruptions on sparc64 SMP



>Number:         39700
>Category:       port-sparc64
>Synopsis:       NFS client file corruptions on sparc64 SMP
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-sparc64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 05 01:50:00 +0000 2008
>Originator:     Takeshi Nakayama
>Release:        NetBSD 4.99.72 20081005
>Organization:
        Private
>Environment:
System: NetBSD eos 4.99.72 NetBSD 4.99.72 (EOS.MP) #91: Sun Oct 5 05:50:45 JST 2008 takeshi@nyx:/export/anoncvs/src/sys/arch/sparc64/compile/EOS.MP sparc64
Architecture: sparc64
Machine: sparc64
>Description:
        When using a sparc64 SMP box as an NFS client, I sometimes
        see file corruption when writing to an NFS server under
        high load.

        The corruption is visible in the file on the NFS server,
        but no differences show up on the NFS client immediately
        after the file is created.  So it seems the data in the
        client's buffer cache is not corrupted.

>How-To-Repeat:
        See above.

>Fix:
        Using an NFSv2 mount seems to avoid the file corruption,
        but I have not been able to determine the root cause.

        Also, the following patch avoids the file corruption, at
        least in my environment (an Ultra60 with 2 CPUs).

Index: pmap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/sparc64/sparc64/pmap.c,v
retrieving revision 1.221
diff -u -d -r1.221 pmap.c
--- pmap.c      23 Sep 2008 21:30:11 -0000      1.221
+++ pmap.c      5 Oct 2008 00:43:42 -0000
@@ -2500,11 +2500,13 @@
        int changed = 0;
 #ifdef DEBUG
        int modified = 0;
-
-       DPRINTF(PDB_CHANGEPROT|PDB_REF, ("pmap_clear_modify(%p)\n", pg));
+#endif
+#ifdef MULTIPROCESSOR  /* XXX -- workaround */
+       bool needflush = FALSE;
 #endif
 
-#if defined(DEBUG)
+#ifdef DEBUG
+       DPRINTF(PDB_CHANGEPROT|PDB_REF, ("pmap_clear_modify(%p)\n", pg));
        modified = pmap_is_modified(pg);
 #endif
        mutex_enter(&pmap_lock);
@@ -2549,11 +2551,19 @@
                                tsb_invalidate(va, pmap);
                                tlb_flush_pte(va, pmap);
                        }
+#ifdef MULTIPROCESSOR  /* XXX -- workaround */
+                       if (pmap->pm_refs > 0)
+                               needflush = TRUE;
+#endif
                        /* Then clear the mod bit in the pv */
                        if (pv->pv_va & PV_MOD)
                                changed |= 1;
                        pv->pv_va &= ~(PV_MOD);
                }
+#ifdef MULTIPROCESSOR  /* XXX -- workaround */
+               if (needflush)
+                       dcache_flush_page(VM_PAGE_TO_PHYS(pg));
+#endif
        }
        pv_check();
        mutex_exit(&pmap_lock);
@@ -2577,7 +2587,6 @@
 pmap_clear_reference(pg)
        struct vm_page *pg;
 {
-       paddr_t pa = VM_PAGE_TO_PHYS(pg);
        pv_entry_t pv;
        int rv;
        int changed = 0;
@@ -2637,7 +2646,7 @@
                        pv->pv_va &= ~(PV_REF);
                }
        }
-       dcache_flush_page(pa);
+       dcache_flush_page(VM_PAGE_TO_PHYS(pg));
        pv_check();
 #ifdef DEBUG
        if (pmap_is_referenced_locked(pg)) {


