Subject: FSCK TROUBLE WITH 4.2 UNIX FILESYSTEMS
To: None <netbsd-help@NetBSD.ORG, current-users@NetBSD.ORG>
From: Brian Buhrow <buhrow@cats.ucsc.edu>
List: current-users
Date: 05/12/1995 16:57:40
	I recently encountered an interesting weakness in fsck on 4.2
filesystems.  It exists on SunOS 4.x, Ultrix, NetBSD, FreeBSD,
BSD-4.3, and any other OS that uses the Berkeley filesystem and its fsck.
I haven't yet checked whether the newer versions of fsck in NetBSD-current have
corrected the problem, but 1.0 certainly displays the trouble.  While this
bug is not fatal, it can be annoying.  Here is the description, as I sent
it to Sun.
Note: Although the output shown here is from a SunOS 4.x box, the NetBSD
1.0 fsck looks very similar, except that it asks you if you want to look
for alternate superblocks.  Also, the enclosed program will repair the
damaged filesystem when invoked as: "fsfix".  
-Brian

	When fsck performs sanity checks against the super block of a file system,
it compares the primary super block against the backup copy
located in the final cylinder group of the file system.  If the comparison
fails, fsck asks the user to use the -b flag to specify an alternate
copy of the super block from which to restore the primary super block.  If
the primary super block is corrupt, it is correctly rewritten at
the end of the fsck -b.  The fsck -b procedure, however, does not update
the last copy of the super block in the filesystem.  Thus, if it is the last
copy that is damaged, the next "fsck" fails with the same error even though
the primary super block is correct.
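
	To make the cycle described above concrete, here is a toy model in C.
It is only a sketch: the struct, the sizes and the toy_* function names are
all made up for illustration, and it compares the whole super block where
the real fsck compares selected fields.  It does show why repairing only
the primary copy leaves the check failing.

```c
#include <assert.h>
#include <string.h>

#define NCG      4   /* hypothetical number of cylinder groups */
#define SB_BYTES 16  /* hypothetical size of a toy super block */

/* Toy filesystem: one primary super block plus one backup per group. */
struct toy_fs {
  unsigned char primary[SB_BYTES];
  unsigned char backup[NCG][SB_BYTES];
};

/* Model of newfs: every copy starts out identical and non-zero. */
static void toy_newfs(struct toy_fs *fs)
{
  int cg;

  memset(fs->primary, 0x5a, SB_BYTES);
  for (cg = 0; cg < NCG; cg++)
    memcpy(fs->backup[cg], fs->primary, SB_BYTES);
}

/* Model of the sanity check: primary against the LAST backup only.
   Returns 1 if they agree, 0 if fsck would complain. */
static int toy_check(const struct toy_fs *fs)
{
  return memcmp(fs->primary, fs->backup[NCG - 1], SB_BYTES) == 0;
}

/* Model of trashfs: zero the backup in the final cylinder group. */
static void toy_trash(struct toy_fs *fs)
{
  memset(fs->backup[NCG - 1], 0, SB_BYTES);
}

/* Model of "fsck -b": rewrite the primary from the chosen backup,
   and touch nothing else. */
static void toy_fsck_b(struct toy_fs *fs, int cg)
{
  assert(cg >= 0 && cg < NCG);
  memcpy(fs->primary, fs->backup[cg], SB_BYTES);
}

/* Model of fsfix: copy the primary over the last backup. */
static void toy_fsfix(struct toy_fs *fs)
{
  memcpy(fs->backup[NCG - 1], fs->primary, SB_BYTES);
}
```

Calling toy_newfs, toy_trash, toy_fsck_b(&fs, 0) in that order leaves
toy_check returning 0; only toy_fsfix brings it back to 1.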
	While it is highly unlikely that the last copy of the super block would
get munched, it seems as though there should be a way, by using fsck, to
restore both copies of the super block so that fsck will work cleanly
again.  Is there a way to do this besides mounting manually, dumping,
newfsing, restoring and continuing?  
	I have included a sample script which demonstrates the problem.  
I've also enclosed a program which trashes the last copy of the super block
in the filesystem in order to allow you to test this more extensively.
	Any suggestions, solutions, etc. would be very helpful.
-Brian


Script started on Fri May 12 13:44:47 1995
cobweb.UCSC.EDU# newfs /dev/rsd2c
/dev/rsd2c:     1954816 sectors in 1909 cylinders of 16 tracks, 64 sectors
        1000.9MB in 120 cyl groups (16 c/g, 8.39MB/g, 3840 i/g)
super-block backups (for fsck -b #) at:
 32, 16480, 32928, 49376, 65824, 82272, 98720, 115168, 131616,
 148064, 164512, 180960, 197408, 213856, 230304, 246752, 262176, 278624,
 295072, 311520, 327968, 344416, 360864, 377312, 393760, 410208, 426656,
 443104, 459552, 476000, 492448, 508896, 524320, 540768, 557216, 573664,
 590112, 606560, 623008, 639456, 655904, 672352, 688800, 705248, 721696,
 738144, 754592, 771040, 786464, 802912, 819360, 835808, 852256, 868704,
 885152, 901600, 918048, 934496, 950944, 967392, 983840, 1000288, 1016736,
 1033184, 1048608, 1065056, 1081504, 1097952, 1114400, 1130848, 1147296, 1163744,
 1180192, 1196640, 1213088, 1229536, 1245984, 1262432, 1278880, 1295328, 1310752,
 1327200, 1343648, 1360096, 1376544, 1392992, 1409440, 1425888, 1442336, 1458784,
 1475232, 1491680, 1508128, 1524576, 1541024, 1557472, 1572896, 1589344, 1605792,
 1622240, 1638688, 1655136, 1671584, 1688032, 1704480, 1720928, 1737376, 1753824,
 1770272, 1786720, 1803168, 1819616, 1835040, 1851488, 1867936, 1884384, 1900832,
 1917280, 1933728, 1950176,
cobweb.UCSC.EDU# fsck /dev/rsd2c
** /dev/rsd2c
** Last Mounted on 
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
2 files, 9 used, 917861 free (13 frags, 114731 blocks, 0.0% fragmentation)
cobweb.UCSC.EDU# trashfs /dev/sd2c
Primary super block at location 16
Wrote 255 bytes at offset 998490112
cobweb.UCSC.EDU# fsck /dev/rsd2c
** /dev/rsd2c
BAD SUPER BLOCK: TRASHED VALUES IN SUPER BLOCK
USE -b OPTION TO FSCK TO SPECIFY LOCATION OF AN ALTERNATE
SUPER-BLOCK TO SUPPLY NEEDED INFORMATION; SEE fsck(8).
cobweb.UCSC.EDU# fsck -b 32 /dev/rsd2c
Alternate super block location: 32
** /dev/rsd2c
** Last Mounted on 
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups

CLEAN FLAG IN SUPERBLOCK IS WRONG; FIX? y

2 files, 9 used, 917861 free (13 frags, 114731 blocks, 0.0% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****
cobweb.UCSC.EDU# fsck /dev/rsd2c
** /dev/rsd2c
BAD SUPER BLOCK: TRASHED VALUES IN SUPER BLOCK
USE -b OPTION TO FSCK TO SPECIFY LOCATION OF AN ALTERNATE
SUPER-BLOCK TO SUPPLY NEEDED INFORMATION; SEE fsck(8).
cobweb.UCSC.EDU# exit
cobweb.UCSC.EDU# 
script done on Fri May 12 13:56:40 1995
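
	The byte offset that trashfs reports in the script above lines up with
the last backup that newfs listed: sector 1950176 times 512 bytes is
998490112.  The sketch below reproduces that arithmetic.  The constants are
read off this one newfs listing (they are not taken from the filesystem
headers), and backup_sector and sector_to_bytes are hypothetical helper
names, so treat it as a check on this particular disk rather than a
general formula.

```c
#include <assert.h>

/* Values read off the newfs output above; assumptions for this disk only. */
#define SECTOR_BYTES      512L    /* bytes per sector */
#define SECTORS_PER_GROUP 16384L  /* 16 cylinders * 16 tracks * 64 sectors */
#define TRACK_SECTORS     64L     /* each group is staggered by one track */
#define STAGGER_PERIOD    16L     /* the stagger wraps every 16 groups */
#define SB_IN_GROUP       32L     /* backup sits 32 sectors past that point */

/* Sector of the backup super block in cylinder group cg, as listed by
   newfs for this disk (hypothetical helper). */
static long backup_sector(long cg)
{
  assert(cg >= 0);
  return cg * SECTORS_PER_GROUP
       + (cg % STAGGER_PERIOD) * TRACK_SECTORS
       + SB_IN_GROUP;
}

/* Byte offset of a sector on the device (hypothetical helper). */
static long sector_to_bytes(long sector)
{
  return sector * SECTOR_BYTES;
}
```

With 120 cylinder groups the last one is group 119, so
sector_to_bytes(backup_sector(119)) gives the offset trashfs wrote to.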

<trashfs.c>
/*This program will trash the final alternate super block of your 4.2 Unix
filesystem.*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/param.h>
#include <string.h>
#include <ufs/fs.h>
#include <unistd.h>
#include <sys/file.h>

/*the usage is to give the name of the block device on which the filesystem
resides as the first command line argument.  The program does the rest*/
/*This version, when invoked as "fsfix", repairs its own damage.  Use at
your own risk.*/

main(argc, argv)
  int argc;
  char **argv;
{
  static struct fs superblock; /*primary super block*/
  int fdes; /*file descriptor for block device*/
  int status, fixflg;
  off_t blkno, dblkno; /*block number of last alternate superblock*/
  char tmpbuf[256], *ptr; /*scratch string buffer*/

  if (argc < 2){
    fprintf(stderr,"Usage: %s <block device>\n",argv[0]);
    exit(1);
  }
  /*decide what to do from the name we were invoked under*/
  ptr = strrchr(argv[0], '/');
  if (!ptr)
    ptr = argv[0];
  else
    ptr++;
  if (!strcmp(ptr, "trashfs")) {
    fixflg = 0;
  } else {
    fixflg = 1;
  }

  fdes = open(argv[1], O_RDWR, 0666);
  if (fdes < 0) {
    /*perror() does not expand %s, so pass it the device name itself*/
    perror(argv[1]);
    exit(1);
  }

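  dblkno = lseek(fdes, (off_t)SBOFF, SEEK_SET);
  if (dblkno < 0) {
    perror("Unable to seek to primary super block");
    exit(1);
  }
  /*off_t need not be an int, so cast before handing it to printf*/
  printf("Primary super block at block number:  %ld\n",(long)(dblkno / 512));

  blkno = read(fdes, &superblock, (size_t)SBSIZE);
  if (blkno != SBSIZE) {
    fprintf(stderr,"read %ld bytes, not %d bytes: ",(long)blkno,(int)SBSIZE);
    perror("Unable to read primary super block");
    exit(1);
  }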

  /*now, get the block number of the alternate super block we're interested in*/
  blkno = cgsblock(&superblock, superblock.fs_ncg - 1);

  /*Now, decide whether to break or repair the filesystem*/
  dblkno = lseek(fdes, (off_t)fsbtodb(&superblock, blkno) * 512, SEEK_SET);
  if (dblkno < 0) {
    perror("Unable to seek to last alternate super block");
    exit(1);
  }
  if (fixflg) {
    status = write(fdes, &superblock, SBSIZE);
  } else {
    bzero(tmpbuf, 255);
    status = write(fdes, (char *)&tmpbuf[0], (int)255);
  }
  if (status < 0) {
    fprintf(stderr,"File descriptor = %d\n",fdes);
    perror("Unable to write");
    exit(1);
  }
  printf("Wrote %d bytes at block %ld\n",status,(long)(dblkno / 512));
  fflush(stdout);

  return 0;
}