NetBSD-Bugs archive


Re: bin/52432: /etc/dumpdates format egregious



    Date:        Thu, 3 Aug 2017 13:48:51 -0700
    From:        Greywolf <greywolf%starwolf.com@localhost>
    Message-ID:  <9e2d4133-fab5-bf31-d226-4fdc07ce5e11%starwolf.com@localhost>

  | Hi, kre, christos, others,

Originally:
	I think you only sent this to Christos and me, so there
	are no others...

Now that the message has been sent to gnats and netbsd-bugs, here is the
rest of my reply (verbatim), for both.

  | I'm hoping the concept of a 'dump level' doesn't go away,

Definitely not in anything I am planning.   It isn't levels that are the
problem.

  | Having a GUID-ish (but optionally user-selectable) DevID attached
  | to each device sounds to me like a workable idea and it would
  | render /etc/dumpdates considerably less useless.

That is (more or less) the hope, yes.   Of course, right now this is not
much more than a dream, as:

  | My question here would be: Is there a provision inside FFS allowing
  | for an identifier?

And that is the question (or part of it: FFS isn't the only filesystem,
LFS needs to be handled too, and I suspect so does anything else that
follows the inode/directory model; cd9660, UDF, and msdosfs are irrelevant...)

That part needs to be investigated, and solved, before doing anything else.

  | Tangentially, kre, you mentioned "designing a new format for dump's
  | data" -- could you clarify the intent there? Did you mean the
  | dump timestamp metadata (dumpdates),

Yes, just that.

  | or did you mean the entire TOC format (inode maps)?

No, not that, I have lots of old dumps, one day one of them might be
needed, I don't want to change that at all (certainly not because of
dumpdates issues.)

  | [this can be taken off-line from this discussion.]

You already did,  even if not intentionally.

  | You also mentioned the place *whither* you back up your filesystems via
  | dump. I'm unclear on why the output device for the dump has anything to
  | do with this -- dumpdates only references the last dump of an INput
  | device.

In a way it doesn't, though being able to find the dump would help (eg: "I
know from dumpdates or whatever replaces it that there was a level 4 dump
done two weeks ago - but where is it?").   That was more along the lines of
keeping the dumpdates file with (or somewhere near) the dumps, so that if
you have found a dumpdates file, you have also found the relevant dumps.

But the version that gets kept in /etc/dumpdates could perhaps contain
some (optional) extra "where" info.   Because that is so variable, it would
probably just be a freeform string supplied as an arg to dump.   We don't
have the technology to read the sticky label stuck on a tape, which is what
would be needed to fully automate tracking, and since I think most people
dump into a pipe to their favourite compression prog these days, dump
generally has no idea, and no way to find out, where its output eventually
ends up.

  | Restore uses restoresymtable to determine whether or not the
  | incremental restore is of the proper date.

That file is more for removing files that were deleted between one dump
and the incremental that follows.   If it also serves to check that the
correct incremental is being restored, that's news ... but I am not sure
how it really can (except perhaps to make sure that a dump from earlier
than the previous restore is not attempted).   It is built from the
contents of the level N dump when it is restored, and so can only possibly
have info that existed when that dump was done.   What the next dump level
would be, for the next incremental, cannot possibly be known then.

Nor is it really possible to tell if a level is skipped in the restore
sequence.   Only the changes get dumped, and all the changes do, so when we
come to a restore, we (ie: restore) cannot tell whether a dir/inode is
missing from the backup because it was never changed (between previous and
current), or because it was changed but backed up onto an intermediate
level dump that happened between the one that has already been restored
and the one that is now about to be.   The exceptions are some quite rare
cases (that is, nothing that we can rely upon.)

  | Thank you both for chiming in on this. I'd forgotten about the
  | vulnerability of "%s" respective to *scanf().

That actually supplies a rationale for limiting the scanf() length, which
might then provide a reason for truncating the name written to the output
file (%.511s), but it provides no reason at all for the padding, which is
just (relatively) harmless, meaningless white space (scanf simply skips it).
(and yes, I agree, it is ugly, though I rarely, if ever, manually look in
a dumpdates file to see it...)
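The difference between padding and truncation can be seen in a small round
trip.   This is only a sketch: OUTFMT and INFMT are made-up stand-ins with a
width of 31 rather than 511, and roundtrip() is a hypothetical helper, not
anything from dump itself.

```c
#include <stdio.h>

/* Illustrative formats only; the width 31 stands in for 511. */
#define OUTFMT "%-31s %c %s"       /* minimum field width: pads, never truncates */
#define INFMT  "%31s %c %63[^\n]"  /* maximum field width: bounds the read */

/*
 * Format one record and parse it back.
 * dev_out must hold at least 32 bytes, date_out at least 64.
 * Returns the number of fields sscanf() converted.
 */
int
roundtrip(const char *dev, char level, const char *date,
    char *dev_out, char *level_out, char *date_out)
{
	char line[256];

	snprintf(line, sizeof(line), OUTFMT, dev, level, date);
	/* The blanks in INFMT skip any amount of padding. */
	return sscanf(line, INFMT, dev_out, level_out, date_out);
}
```

With a short name the padding is skipped and all three fields come back
intact.   With a name longer than the width, the "-" form writes the whole
name, the bounded read takes only the first 31 bytes, and the next byte of
the name is then consumed as the level, which is the case that a "%.31s"
style truncation on output would avoid.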

That change should probably be undone, but is not important enough if the
whole dumpdates format is to be revisited anyway.

But since the filesystem additions needed to make this even worth starting
might take a while, perhaps we should just delete the "-511" from
DUMPOUTFMT in protocols/dumprestore.h, or change it to "%.511s".   The idea
of truncating the name is kind of ugly though; I'd rather rewrite the
reading function and get rid of the scanf parsing of the device name.
sscanf(string, "%s") is, after all, just index(string, ' ') followed by
a strncpy().   (I am not even sure why this stuff is in that file: it is
really private to dump, restore certainly never touches it.)   scanf is
convenient for the rest of each line, but it could start after the device
name has been manually extracted into memory that is malloc'd to be big
enough - however big that is.   NAME_MAX is a crock.
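A minimal sketch of that kind of reader, assuming each line has the shape
"device level date".   The struct dd_entry type, its field names, the 64 byte
date buffer, and dd_parse_line() are all hypothetical, chosen for
illustration; none of them come from the dump sources.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type; names and sizes are illustrative only. */
struct dd_entry {
	char *dev;       /* malloc'd device name, any length */
	char  level;     /* dump level character */
	char  date[64];  /* remainder of the line (the date string) */
};

/*
 * Parse one line of the assumed form "<device> <level> <date...>".
 * Returns 0 on success, -1 on a malformed line or allocation failure.
 * On success the caller must free(e->dev).
 */
int
dd_parse_line(const char *line, struct dd_entry *e)
{
	size_t len;

	line += strspn(line, " \t");      /* skip leading blanks */
	len = strcspn(line, " \t\n");     /* name runs to the first blank */
	if (len == 0)
		return -1;

	if ((e->dev = malloc(len + 1)) == NULL)
		return -1;
	memcpy(e->dev, line, len);
	e->dev[len] = '\0';

	/* scanf handles the rest; the leading blank skips any padding. */
	if (sscanf(line + len, " %c %63[^\n]", &e->level, e->date) != 2) {
		free(e->dev);
		e->dev = NULL;
		return -1;
	}
	return 0;
}
```

The device name is sized from the line itself, so no fixed limit applies to
it; only the date field keeps a bounded scanf conversion.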

kre




