Current-Users archive


Duplicate commits in git clone of src



I was looking at the git clone of the src repo
(https://github.com/netbsd/src) and I noticed that there are lots of
duplicate commits in there; some commits are even present 3 or 4 times.
At first I thought this occurs only with very old commits, but it is the
case for relatively recent ones as well.

Normally this isn't so easy to see, but with gitk and these settings it
is fairly obvious: choose menu View -> New View, select under
References: All refs, All (local) branches, All tags, All
remote-tracking branches. Lower down, select Strictly sort by date.

If you then scroll back just a few years of commits, you can find a bunch
of duplicates below the time "2017-04-10 23:53:37".

Taking a random pair of duplicates from 2017-03-22 23:37:41:

c75b502dcf23b51c8d2504be7a9b5dd7823e4a09 
    Author: sevan <sevan>  2017-03-22 23:37:41
    Committer: sevan <sevan>  2017-03-22 23:37:41
    Parent: 20d6933e4ccdf0811b2b11f64dd019c016cea33e (On second through, it may be possible to have a NULL kfs_v in read and write)
    Child:  fa4a1a6573dcb68fb2675cb80653b446a3231bb9 (KDTRACE_HOOKS is enabled by default in GENERIC.common, remove references in)
    Branch: remotes/origin/jdolecek_ncq

d595117d197582e247e9d5d89ea2c3327feb9e3c
    Author: sevan <sevan%NetBSD.org@localhost>  2017-03-22 23:37:41
    Committer: sevan <sevan%NetBSD.org@localhost>  2017-03-22 23:37:41
    Parent: 058026589ba723ce74452748b5e78aa0a7cd15bc (On second through, it may be possible to have a NULL kfs_v in read and write)
    Child:  b13c9c92f5f3fb3b6e010d31acd1b2a6bd1b1c22 (KDTRACE_HOOKS is enabled by default in GENERIC.common, remove references in)
    Branches: netbsd-9, remotes/origin/ad-namecache, remotes/origin/bouyer-xenpvh, remotes/origin/is-mlppp, remotes/origin/isaki-audio2, remotes/origin/jdolecek-ncq, remotes/origin/jdolecek-ncqfixes, remotes/origin/matt-nb8-mediatek, remotes/origin/netbsd-8, remotes/origin/netbsd-9, remotes/origin/perseant-stdc-iso10646, remotes/origin/pgoyette-compat, remotes/origin/phil-wifi, remotes/origin/prg-localcount2, remotes/origin/trunk, trunk

Looking at the differences between these, I notice that the
author/committer name was converted differently. Also, the first one is
only on branch "jdolecek_ncq".

The second one has an improved author/committer conversion and is on
several branches, one of which is "jdolecek-ncq", with a dash rather
than an underscore.
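
For anyone who wants to list such pairs without gitk, here is a minimal
sketch in Python. It assumes that duplicates keep the same author date
and subject line and differ only in hash, as in the pair above; that may
not hold for every duplicate. The function names are made up for
illustration. The git options used (--all, and the format placeholders
%H, %at, %s, %x09) are standard "git log" features.

```python
import subprocess
from collections import defaultdict

# One line per commit: hash, author timestamp, subject, separated by tabs.
GIT_LOG_CMD = ["git", "log", "--all", "--format=%H%x09%at%x09%s"]


def find_duplicate_commits(log_lines):
    """Group commits by (author timestamp, subject); keep groups of 2+.

    Assumption: a duplicate has the same author date and subject as the
    original and differs only in its hash.
    """
    groups = defaultdict(list)
    for line in log_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # maxsplit=2 keeps any tab inside the subject intact.
        commit_hash, author_time, subject = line.split("\t", 2)
        groups[(author_time, subject)].append(commit_hash)
    return {key: hashes for key, hashes in groups.items() if len(hashes) > 1}


def read_git_log(repo_path="."):
    """Run git log in repo_path and return its output lines."""
    result = subprocess.run(
        GIT_LOG_CMD,
        cwd=repo_path,
        capture_output=True,
        text=True,
        errors="replace",  # old commit messages may not be valid UTF-8
        check=True,
    )
    return result.stdout.splitlines()
```

Calling find_duplicate_commits(read_git_log("src")) in a directory that
contains the clone should return a dict mapping each (timestamp, subject)
pair to the list of hashes that share it.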

With some other commits I saw, the branch names are "ROY" vs "roy".
Around 1999-12-05 you can see triple commits (but there are too many
branches and gitk doesn't show them, so analyzing that is more
difficult).
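
The branch-name variants can be looked for in a similar way. The sketch
below assumes that variants differ only in letter case or in underscore
versus dash, which is all I have seen so far; other differences would be
missed. The function names are hypothetical.

```python
from collections import defaultdict


def normalize(name):
    """Fold case and treat '_' like '-' so that variants compare equal."""
    return name.lower().replace("_", "-")


def variant_groups(branch_names):
    """Return {normalized name: sorted variants} for names with 2+ spellings."""
    groups = defaultdict(set)
    for name in branch_names:
        groups[normalize(name)].add(name)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }
```

The input can be the lines printed by
"git for-each-ref --format='%(refname:short)' refs/remotes".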

My guess here is that there was an incremental conversion, with
improvements in author and branch name conversion along the way. But
commits and branches from earlier processing stayed in the result, and
hence the duplicates.

Maybe it just needs a fresh conversion from the start to get rid of
these duplicates. Or if that is not feasible, removal of the outdated
branches from the origin repo would probably help a lot.

But it is cool to be able to look back all the way to 1992 to the first
commit!

-Olaf.
-- 
Olaf 'Rhialto' Seibert -- rhialto at falu dot nl
___  Anyone who is capable of getting themselves made President should on
\X/  no account be allowed to do the job.       --Douglas Adams, "THGTTG"



