Subject: Re: TS-72xxx flash and NetBSD
To: Jesse Off <joff@embeddedarm.com>
From: Francis Dupont <Francis.Dupont@enst-bretagne.fr>
List: tech-embed
Date: 02/05/2005 11:29:53
 In your previous mail you wrote:

   I'm Cc'ing this to the tech-kern@netbsd.org mailing list in order to 
   hopefully solicit a few other expert opinions on the subject.
   
=> I've subscribed to it in order to spare its moderator extra work.

   I haven't implemented anything for the onboard TS72xx flash support
   because the best course of action isn't yet perfectly clear (to me
   anyway).  I know what I'd like to see avoided in an implementation,
   but don't yet have a good plan of action to offer that would avoid all
   the chaos that's currently in Linux in the form of "MTD" drivers and
   special filesystems unique to each type of flash chip (JFFS2, YAFFS,
   YAFFS2, FTL/EXT2, NFTL/EXT2, etc).  There are certain things unique
   about using direct mapped NAND/NOR flash (as opposed to something like
   CF, or USB flash which has an onboard controller), but a completely
   new filesystem (and one also for each type and variant of chip) seems
   a bit overkill.  Some ideas:
   
   * Come up with some sort of "unreliable/quirky block device" layer

=> I believe we need at least this in order to be able to use dd to
read and write the flash. BTW sys/arch/hpcmips/vr has a CIF driver.

   that can be used to implement the same sort of internal logic
   already present in devices like CompactFlash and hard drives such that
   a regular FFS filesystem could be placed on it.  Devices (512B NAND,
   NOR, 2KB NAND) simply register some generic block device access
   functions and their quirks with this layer, and then a regular FFS
   filesystem could be placed on the "de-quirked" block device.  The
   de-quirking driver has all the intelligence of filesystem agnostic
   wear-leveling, ECC generation/checking, and bad block management (by
   reserving a driver-defined percentage of blocks)
   
=> IMHO bad144-like bad block management should be enough, and its
main drawback (added head movement) does not exist on flash (:-).
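
A minimal sketch of what "bad144-like" could mean here, assuming the
driver reserves a few spare erase blocks at the end of the device and
keeps a small table of bad blocks. All names and sizes (FLASH_NBLOCKS,
FLASH_NSPARE, badblk_*) are hypothetical, not an existing NetBSD
interface, and a spare that itself goes bad is not handled:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_NBLOCKS 1024u /* hypothetical number of erase blocks */
#define FLASH_NSPARE  16u   /* hypothetical spares kept at the end */

struct badblk_table {
	uint32_t bad[FLASH_NSPARE]; /* bad[i] is replaced by spare i */
	size_t   nbad;              /* number of valid entries */
};

/* Number of blocks visible to the layers above the driver. */
static uint32_t
badblk_usable(void)
{
	return FLASH_NBLOCKS - FLASH_NSPARE;
}

/* Record a bad block. Returns 0 on success (or if already recorded),
 * -1 if the block is out of range or no spare is left. */
static int
badblk_mark(struct badblk_table *t, uint32_t blk)
{
	size_t i;

	if (blk >= badblk_usable())
		return -1;
	for (i = 0; i < t->nbad; i++)
		if (t->bad[i] == blk)
			return 0;
	if (t->nbad == FLASH_NSPARE)
		return -1;
	t->bad[t->nbad++] = blk;
	return 0;
}

/* Translate a logical block number to a physical one. Spares are
 * handed out from the last block of the device downwards. */
static uint32_t
badblk_map(const struct badblk_table *t, uint32_t blk)
{
	size_t i;

	assert(blk < badblk_usable());
	for (i = 0; i < t->nbad; i++)
		if (t->bad[i] == blk)
			return FLASH_NBLOCKS - 1 - (uint32_t)i;
	return blk;
}
```

The linear search is deliberate: with a handful of spares the table
stays tiny, and there is no seek penalty for jumping to the end of the
device.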

   * Perhaps 1) extend ffs's block allocation policy to be random within free 
   space, (with preference to a sector already erased or in an area already 
   marked for erase/rewrite) 2) force block rewrites to go through a block 
   reallocation instead of using the same block, 3) rate-limit/aggressively 
   buffer FS metadata (inode, block bitmap, etc) writes.

=> For 3, softupdates should help. But I believe there are other
solutions than rewriting a file-system.

   This would still have issue w/NAND for bad block management,

=> Again, something bad144-like in the lower part of the driver?

   and having to erase/rewrite an entire erase block at a time would
   expose ourselves to potential major data-lossage on crashes.

=> I have an idea here: modify disk mirroring code so that a block is
always rewritten on the other half of the mirror. It costs half the
disk but is very simple, and the metadata is just a small bit table.
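
A rough sketch of that idea, with the flash simulated by an in-memory
array. The names (struct pingpong, pp_*) and the sizes are hypothetical,
and where the bit table itself is stored persistently is left open:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NLOGICAL 8u  /* hypothetical number of logical blocks */
#define BLKSIZE  16u /* hypothetical block size in bytes */

struct pingpong {
	uint8_t half[2][NLOGICAL][BLKSIZE]; /* the two mirror halves */
	uint8_t active[(NLOGICAL + 7) / 8]; /* bit i: half holding block i */
};

/* Which half (0 or 1) currently holds the valid copy of blk. */
static unsigned
pp_active(const struct pingpong *p, unsigned blk)
{
	return (p->active[blk / 8] >> (blk % 8)) & 1u;
}

/* Write the new contents to the inactive half, then flip the bit.
 * The old copy is left untouched until the flip. */
static void
pp_write(struct pingpong *p, unsigned blk, const uint8_t *data)
{
	unsigned other;

	assert(blk < NLOGICAL);
	other = pp_active(p, blk) ^ 1u;
	memcpy(p->half[other][blk], data, BLKSIZE);
	p->active[blk / 8] ^= (uint8_t)(1u << (blk % 8));
}

/* Return a pointer to the current valid copy of blk. */
static const uint8_t *
pp_read(const struct pingpong *p, unsigned blk)
{
	assert(blk < NLOGICAL);
	return p->half[pp_active(p, blk)][blk];
}
```

If a crash happens before the bit is flipped, the previous copy is
still the active one, which is the whole point of the scheme.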

   The more I think about this one, the more it seems it won't work.

=> Unreliable media are not so new. IMHO there are proven solutions,
and some of them should be simple enough to consider.
   
   * LFS would be a good place to start...
   
   * Maybe do something by using something like fss with mfs and periodically 
   writing out/committing changes to the real FFS.

=> this seems workable with a basic block driver

   (Still have bad block management problems)

=> and bad144 like...

   By rate-limiting sector erase/rewrites you can achieve results
   similar to wear leveling for a given target flash minimum lifetime.
   
=> Or a "sync on flash" call? Something like the async mount flag.
On FreeBSD it is used during install, with the idea that it is simpler
to reinstall everything than to try to recover: the real I/O is done
as late as possible, and on a box with gigabytes of memory I believe
it can be deferred until the file-systems are unmounted.
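
On the rate-limiting point quoted above, the arithmetic is simple: a
sector rated for E erase cycles that must last T seconds may be erased
at most once every T/E seconds. A small sketch, where the function
names are hypothetical and the endurance figure has to come from the
chip's data sheet:

```c
#include <assert.h>
#include <stdint.h>

/* Minimum number of seconds between two erases of the same sector so
 * that `endurance` cycles last at least `lifetime_s` seconds.
 * Rounded up, so the target lifetime is never undershot. */
static uint64_t
min_erase_interval(uint64_t lifetime_s, uint64_t endurance)
{
	assert(endurance > 0);
	return (lifetime_s + endurance - 1) / endurance;
}

/* Nonzero if the sector may be erased now. Assumes now_s is not
 * earlier than last_erase_s. */
static int
erase_allowed(uint64_t now_s, uint64_t last_erase_s, uint64_t interval_s)
{
	return now_s - last_erase_s >= interval_s;
}
```

For example, assuming 100,000 cycles and a 10 year target (315,360,000
seconds with 365-day years), the interval comes out at 3154 seconds, so
a little under one erase per hour per sector. Writes arriving in
between have to be buffered in memory, which is where a "sync on flash"
call would fit.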

   * Maybe relegate the whole thing to userland and use some of the Linux 
   GPL'ed filesystems and an NFS loopback mount.  (prohibits use as root 
   filesystem)
   
=> As an NFS server can run in user mode, this can be a good solution.

   * Just use a ramdisk image and mfs union mount and have an easy way to say 
   "save me" and thereafter write out a new ramdisk to the original flash 
   location where the kernel+ramdisk reside.
   
=> This is related to my "sync on flash" idea.

BTW we can divide the file-system issue into three parts:
 - a way to create read-only root file-systems (root is special
   because it should be an FFS). As they should be small, IMHO
   there is no issue.
 - a way to handle read-only non-root file-systems of any size,
   where applications can be put. IMHO Linux cramfs/squashfs/...
   with a loopback NFS mount should be fine for this job.
 - a way to handle small read-write non-root file-systems
   where changing files live. This is the problem we discussed.
The whole idea is to adapt the traditional /, /usr and /var
partitioning to flash. Of course, the first step is to get
a flash driver.

Thanks

Francis.Dupont@enst-bretagne.fr
   
PS: the first use of a basic flash driver on TS-72xx will be to
read/write kernels from NetBSD.