Subject: Re: [HACKERS] PostgreSQL, NetBSD and NFS
To: Tom Lane <tgl@sss.pgh.pa.us>
From: D'Arcy J.M. Cain <darcy@druid.net>
List: current-users
Date: 02/05/2003 12:06:16
On Wednesday 05 February 2003 11:49, Tom Lane wrote:
> "D'Arcy J.M. Cain" <darcy@druid.net> writes:
> > Hmm.  This time it passed that point but this happened:
> >
> > COPY "certificate" FROM stdin;
> > NOTICE:  copy: line 253677, bt_insertonpg[certificate_pkey]: parent page
> > unfound - fixing branch
> > ERROR:  copy: line 253677, bt_fixlevel[certificate_pkey]: invalid item
> > order(1) (need to recreate index)
>
> Hoo boy.  I was already suspecting data corruption in the index, and
> this looks like more of the same.  My thoughts are definitely straying
> in the direction of "the NFS server is dropping bits, somehow".
>
> Both this and the (admittedly unproven) bt_moveright loop suggest
> corrupted values in the cross-page links that exist at the very end of
> each btree index page.  I wonder if it is possible that, every so often,
> you are losing just the last few bytes of an NFS transfer?

Yah, that's kind of what it looked like when I tried this before Christmas too,
although the actual errors differed.  At that time I got a PostgreSQL error
that implied that something that had just been written was not there when it
went back to read it.  Almost like a flushing issue.
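
One way to probe the "lost tail of a transfer" theory outside of PostgreSQL is a write-then-read-back check. What follows is only a minimal sketch, not anything that was run in this thread. The names make_page and check_readback are hypothetical, and the 8192-byte page size is PostgreSQL's default block size, assumed here. Each page is filled with a position-dependent pattern so that a mismatch near the end of a page stands out.

```python
import os

PAGE_SIZE = 8192  # assumption: PostgreSQL's default block size


def make_page(page_no):
    """Build a page whose 4-byte words encode their absolute file offset."""
    out = bytearray()
    for off in range(0, PAGE_SIZE, 4):
        word = (page_no * PAGE_SIZE + off) & 0xFFFFFFFF
        out += word.to_bytes(4, "big")
    return bytes(out)


def check_readback(path, pages=256):
    """Write pages, fsync, reopen, and compare.

    Returns a list of (page_no, first_bad_offset) for every page that
    does not read back exactly as written.
    """
    with open(path, "wb") as f:
        for n in range(pages):
            f.write(make_page(n))
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the server

    bad = []
    with open(path, "rb") as f:
        for n in range(pages):
            want = make_page(n)
            got = f.read(PAGE_SIZE)
            if got != want:
                # A short read counts as a mismatch at the missing offset.
                first = next(
                    i for i in range(PAGE_SIZE)
                    if i >= len(got) or got[i] != want[i]
                )
                bad.append((n, first))
    return bad
```

Point path at a file on the NFS mount and look at where the reported offsets fall; offsets clustered near the end of a page would fit the theory above. A read-back on the same client may be served from the client's cache rather than the server, so an empty result does not rule the problem out. Reading the file from a second client, or after a remount, is a stronger test.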

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.