NetBSD-Bugs archive


Re: kern/44002: 3ware 9690 (ld driver) doesn't respond after transfer big amount of data



The following reply was made to PR kern/44002; it has been noted by GNATS.

From: Jiri Novotny <novotny%ics.muni.cz@localhost>
To: David Holland <dholland-bugs%netbsd.org@localhost>
Cc: gnats-bugs%netbsd.org@localhost, salvet%ics.muni.cz@localhost, 
novotny%ics.muni.cz@localhost
Subject: Re: kern/44002: 3ware 9690 (ld driver) doesn't respond after
 transfer big amount of data
Date: Thu, 18 Nov 2010 17:13:20 +0100

        Hi David,
 
 here is the diff between twa.c (r1.28) and our hack:
 
 # diff twa.c.orig twa.c.hack
 1066c1066
 <       if (sc->sc_product_id == PCI_PRODUCT_3WARE_9650) {
 ---
 >       if (sc->sc_product_id == PCI_PRODUCT_3WARE_9690) {
 1092c1092
 <               if (sc->sc_product_id == PCI_PRODUCT_3WARE_9650) {
 ---
 >               if (sc->sc_product_id == PCI_PRODUCT_3WARE_9690) {
 
 It was just a simple replacement of the product ID.
 
 After testing with this hack, I tested your patch (with twa.c r1.33).
 
 The reported bug did not appear in either case, but unfortunately
 there is another one :-(. After testing with bonnie++ the machine
 crashes (it does not freeze). The behavior is the same with our hack
 and with your patch.
 
 Right now I am running the system from USB, so I have no dump
 from when the system panicked. I guess that if a dump is necessary
 I must run the system from an internal disk (and rebuild the
 machine a little :-( ).
 
 The following appears on the console several times:
 
 dev = 0xa800, block = 537144240, fs = /mnt
 
                                                Best regards
 
                                                        Jiri
 
 >  >  I guess we found it. The 9690 has the same bug as the 9650, but the
 >  >  NetBSD twa.c driver has the fix only for the 9650. The Linux driver
 >  >  has the fix for both. We did an ugly hack (just renamed 9650 -> 9690)
 >  >  in twa.c and now it works (the test which froze the machine
 >  >  passes many times). Of course we need to go through a bigger set of tests.
 > 
 > Good to hear.
 > 
 > Which of the 9650-specific things is it? The full queue issue (as
 > described in the comment at the beginning of twa_start), the behavior
 > in twa_drain_response_queue_large, or the queue errors during reset
 > thing in twa_check_ctlr_state?
 > 
 > I'm assuming the full queue thing as the others pertain to device
 > reset and you were seeing hangs during operation; please correct me if
 > I'm wrong.
 > 
 >  >  As I am neither a kernel hacker nor an experienced programmer, I asked
 >  >  my friend Zdenek Slavet to prepare the patch.
 > 
 > Something like the enclosed?
 > 
 >  > P.S. Please keep my and Zdenek's addresses in Cc: I am not a member
 >  > of the gnats-bugs%NetBSD.org@localhost mailing list
 > 
 > It will mail to you because you submitted the PR, but it won't mail
 > him, so I'm keeping the Cc:. In theory you can add his address to the
 > PR, but I'm not sure that actually works and it probably isn't worth
 > the trouble.
 > 
 > untested candidate patch based on the above assumptions:
 > 
 > 
 > Index: twa.c
 > ===================================================================
 > RCS file: /cvsroot/src/sys/dev/pci/twa.c,v
 > retrieving revision 1.33
 > diff -u -p -r1.33 twa.c
 > --- twa.c    18 Aug 2009 11:15:43 -0000      1.33
 > +++ twa.c    10 Nov 2010 20:15:17 -0000
 > @@ -1056,16 +1056,18 @@ twa_start(struct twa_request *tr)
 >      s = splbio();
 >  
 >      /*
 > -     * The 9650 has a bug in the detection of the full queue condition.
 > +     * The 9650 and 9690 have a bug in the detection of the full queue
 > +     * condition.
 > +     *
 >       * If a write operation has filled the queue and is directly followed
 >       * by a status read, it sometimes doesn't return the correct result.
 >       * To work around this, the upper 32bit are written first.
 >       * This effectively serialises the hardware, but does not change
 >       * the state of the queue.
 >       */
 > -    if (sc->sc_product_id == PCI_PRODUCT_3WARE_9650) {
 > +    if (sc->sc_quirks & TWA_QUIRK_QUEUEFULL_BUG) {
 >              /* Write lower 32 bits of address */
 > -            TWA_WRITE_9650_COMMAND_QUEUE_LOW(sc, tr->tr_cmd_phys +
 > +            TWA_WRITE_COMMAND_QUEUE_LOW(sc, tr->tr_cmd_phys +
 >                      sizeof(struct twa_command_header));
 >      }
 >  
 > @@ -1089,12 +1091,12 @@ twa_start(struct twa_request *tr)
 >                      sizeof(struct twa_command_packet),
 >                      BUS_DMASYNC_PREWRITE | BUS_DMASYNC_PREREAD);
 >  
 > -            if (sc->sc_product_id == PCI_PRODUCT_3WARE_9650) {
 > +            if (sc->sc_quirks & TWA_QUIRK_QUEUEFULL_BUG) {
 >                      /*
 > -                     * Cmd queue is not full.  Post the command to 9650
 > +                     * Cmd queue is not full.  Post the command
 >                       * by writing upper 32 bits of address.
 >                       */
 > -                    TWA_WRITE_9650_COMMAND_QUEUE_HIGH(sc, tr->tr_cmd_phys +
 > +                    TWA_WRITE_COMMAND_QUEUE_HIGH(sc, tr->tr_cmd_phys +
 >                              sizeof(struct twa_command_header));
 >              } else {
 >                      /* Cmd queue is not full.  Post the command. */
 > @@ -1508,6 +1510,8 @@ twa_attach(device_t parent, device_t sel
 >  
 >      aprint_naive(": RAID controller\n");
 >      aprint_normal(": 3ware Apache\n");
 > +
 > +    sc->sc_quirks = 0;
 >              
 >      if (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_3WARE_9000) {
 >              sc->sc_nunits = TWA_MAX_UNITS;
 > @@ -1535,6 +1539,7 @@ twa_attach(device_t parent, device_t sel
 >                      aprint_error_dev(&sc->twa_dv, "can't map mem space\n");
 >                      return;
 >              }
 > +            sc->sc_quirks |= TWA_QUIRK_QUEUEFULL_BUG;
 >      } else if (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_3WARE_9690) {
 >              sc->sc_nunits = TWA_9690_MAX_UNITS;
 >              use_64bit = true;
 > @@ -1544,6 +1549,7 @@ twa_attach(device_t parent, device_t sel
 >                      aprint_error_dev(&sc->twa_dv, "can't map mem space\n");
 >                      return;
 >              }
 > +            sc->sc_quirks |= TWA_QUIRK_QUEUEFULL_BUG;
 >      } else {
 >              sc->sc_nunits = 0;
 >              use_64bit = false;
 > Index: twareg.h
 > ===================================================================
 > RCS file: /cvsroot/src/sys/dev/pci/twareg.h,v
 > retrieving revision 1.10
 > diff -u -p -r1.10 twareg.h
 > --- twareg.h 8 Sep 2008 23:36:54 -0000       1.10
 > +++ twareg.h 10 Nov 2010 20:15:17 -0000
 > @@ -102,13 +102,13 @@
 >      } while (0)
 >  #endif
 >  
 > -#define TWA_WRITE_9650_COMMAND_QUEUE_HIGH(sc, val)                  \
 > +#define TWA_WRITE_COMMAND_QUEUE_HIGH(sc, val)                       \
 >      do {                                                            \
 >              TWA_WRITE_REGISTER(sc, TWA_COMMAND_QUEUE_OFFSET_HIGH,   \
 >                              (uint32_t)(((uint64_t)val)>>32));       \
 >      } while (0)
 >  
 > -#define TWA_WRITE_9650_COMMAND_QUEUE_LOW(sc, val)                   \
 > +#define TWA_WRITE_COMMAND_QUEUE_LOW(sc, val)                        \
 >      do {                                                            \
 >              TWA_WRITE_REGISTER(sc, TWA_COMMAND_QUEUE_OFFSET_LOW,    \
 >                              (uint32_t)(val));                       \
 > Index: twavar.h
 > ===================================================================
 > RCS file: /cvsroot/src/sys/dev/pci/twavar.h,v
 > retrieving revision 1.9
 > diff -u -p -r1.9 twavar.h
 > --- twavar.h 6 May 2009 10:34:33 -0000       1.9
 > +++ twavar.h 10 Nov 2010 20:15:17 -0000
 > @@ -107,6 +107,7 @@ struct twa_softc {
 >  
 >      struct twa_request      *sc_twa_request;
 >      uint32_t                sc_product_id;
 > +    unsigned                sc_quirks;
 >  };
 >  
 >  
 > @@ -145,6 +146,9 @@ struct twa_softc {
 >  #define TWA_LOCK_FREE               0x0     /* lock is free */
 >  #define TWA_LOCK_HELD               0x1     /* lock is held */
 >  
 > +/* Possible values of sc->sc_quirks. */
 > +#define TWA_QUIRK_QUEUEFULL_BUG     0x1
 > +
 >  /* Driver's request packet. */
 >  struct twa_request {
 >      struct twa_command_packet *tr_command;
 > 
 > 
 > -- 
 > David A. Holland
 > dholland%netbsd.org@localhost
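
 For illustration only, the quirk-flag idea in the quoted patch can be
 sketched outside the kernel. This is a minimal standalone model, not
 driver code: the product enum, struct fake_ctlr, and all function names
 are hypothetical, and the "registers" are plain struct fields. It only
 shows that both affected products get the same flag at attach time, and
 that a flagged controller has the low 32 bits written before the
 queue-full check and the high 32 bits written afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical product tags; the real PCI product IDs come from pcidevs. */
enum product { PRODUCT_9000 = 1, PRODUCT_9650, PRODUCT_9690 };

/* Mirrors the role of TWA_QUIRK_QUEUEFULL_BUG in the quoted patch. */
#define QUIRK_QUEUEFULL_BUG 0x1u

/* Hypothetical stand-in for the softc plus the two queue registers. */
struct fake_ctlr {
	unsigned quirks;     /* set once at attach time */
	int      queue_full; /* simulated status bit */
	uint32_t reg_low;    /* simulated command queue register, low half */
	uint32_t reg_high;   /* simulated command queue register, high half */
};

/* Decide the quirks once per product instead of comparing IDs later. */
static unsigned
quirks_for_product(enum product id)
{
	switch (id) {
	case PRODUCT_9650:
	case PRODUCT_9690:
		return QUIRK_QUEUEFULL_BUG;
	default:
		return 0;
	}
}

/* Split a 64-bit command address into its two 32-bit halves. */
static uint32_t addr_low(uint64_t addr)  { return (uint32_t)addr; }
static uint32_t addr_high(uint64_t addr) { return (uint32_t)(addr >> 32); }

/*
 * Simplified model of posting a command.
 * Returns 0 if the command was posted, -1 if the queue was full.
 */
static int
post_command(struct fake_ctlr *c, uint64_t addr)
{
	assert(c != NULL);

	/* Quirky controllers: write the low half before checking status. */
	if (c->quirks & QUIRK_QUEUEFULL_BUG)
		c->reg_low = addr_low(addr);

	if (c->queue_full)
		return -1;

	if (c->quirks & QUIRK_QUEUEFULL_BUG) {
		/* Writing the high half completes the post. */
		c->reg_high = addr_high(addr);
	} else {
		/* Simplification: other controllers get both halves here. */
		c->reg_low = addr_low(addr);
		c->reg_high = addr_high(addr);
	}
	return 0;
}
```

 Note that, unlike the ID-swap hack at the top of this message, a flag
 set for both products keeps the workaround active on the 9650 as well.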
 

