
first cut of this..

remotes/origin/main
John-Mark Gurney 2 years ago
commit aae2f2f66a
4 changed files with 69 additions and 0 deletions
  1. README.md (+22, -0)
  2. doupdate.sh (+10, -0)
  3. reponames.py (+35, -0)
  4. repos.txt (+2, -0)

README.md (+22, -0)

@@ -0,0 +1,22 @@
GITMIRROR
=========

This repo is a mirror of various repositories that I want to keep track of.
I realized that git, with its inherent deduplication and its ability to
store many trees in a single repo, makes it easy to create a repo that
regularly clones/mirrors other source repos. Not only this, but the
state of the tags and branches can be archived on a daily basis,
consuming very little space.

The main reason that I want this is from a supply chain availability
perspective. As a consumer of source, it isn't always guaranteed that
the project you depend upon will continue to exist in the future. It
could also be that older versions are removed, etc.

Process
-------

1. The repo self-updates main to get the latest list of repos/code to mirror.
2. Fetch the repos to mirror into their respective date-tagged tags/branches.
3. Push the tags/branches to the parent.
4. Repeat.
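
As an illustration of step 2, here is a minimal Python sketch of how an upstream ref could map into a date-tagged namespace. It assumes the `gm/<timestamp>/<name>` layout and the hour-granular UTC timestamp used by `doupdate.sh` in this commit; the helper names `base_ref` and `mirrored_ref` are hypothetical and not part of the repo.

```python
from datetime import datetime, timezone


def base_ref(name, now=None):
    """Return the namespace prefix, e.g. gm/2024-01-02T03Z/<name>.

    The format mirrors `TZ=UTC date +'%Y-%m-%dT%HZ'` from doupdate.sh.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    runtime = now.astimezone(timezone.utc).strftime('%Y-%m-%dT%HZ')
    return 'gm/%s/%s' % (runtime, name)


def mirrored_ref(upstream_ref, name, now=None):
    """Map refs/heads/X or refs/tags/X into the date-tagged namespace."""
    for prefix in ('refs/heads/', 'refs/tags/'):
        if upstream_ref.startswith(prefix):
            rest = upstream_ref[len(prefix):]
            return '%s%s/%s' % (prefix, base_ref(name, now), rest)
    raise ValueError('not a branch or tag ref: %r' % (upstream_ref,))
```

Under these assumptions, each run writes into a fresh namespace, so earlier snapshots of a branch or tag are never overwritten by later fetches.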

doupdate.sh (+10, -0)

@@ -0,0 +1,10 @@
#!/bin/sh

runtime=$(TZ=UTC date +'%Y-%m-%dT%HZ')

while read repourl name c; do
	baseref="gm/$runtime/$name"
	git fetch --dry-run --no-tags "$repourl" "+refs/tags/*:refs/tags/$baseref/*" "+refs/heads/*:refs/heads/$baseref/*"
done <<EOF
$(python3 reponames.py < repos.txt)
EOF
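
For comparison, the fetch invocation in the script above could be assembled in Python as an argument list, which keeps each refspec as a single argument and avoids any shell glob expansion of the `*` patterns. This is only a sketch: `fetch_argv` is a hypothetical helper, and it builds the command without running it.

```python
def fetch_argv(repourl, baseref, dry_run=True):
    """Build the `git fetch` argument list used to mirror one repo.

    Both refspecs are force-updating (leading '+') and map every upstream
    tag and branch under the given baseref namespace.
    """
    argv = ['git', 'fetch']
    if dry_run:
        # The first cut of doupdate.sh only does a dry run.
        argv.append('--dry-run')
    argv += [
        '--no-tags',
        repourl,
        '+refs/tags/*:refs/tags/%s/*' % baseref,
        '+refs/heads/*:refs/heads/%s/*' % baseref,
    ]
    return argv
```

The resulting list could then be passed to `subprocess.run(argv, check=True)` from inside the mirror repository.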

reponames.py (+35, -0)

@@ -0,0 +1,35 @@
import re
import sys
import unittest

# man git-check-ref-format

reponameregex = re.compile(r'^(https://(?P<domain>github\.com)/(?P<slashpath>.*)\.git$)')

def doconvert(i):
	mat = reponameregex.match(i)

	gd = mat.groupdict()

	p = gd['slashpath'].replace('/', '-')

	return '%s--%s' % (gd['domain'], p)

if __name__ == '__main__':
	for i in sys.stdin:
		i = i.strip()
		if not i or i.startswith('#'):
			continue

		print(i, doconvert(i))

class _TestCases(unittest.TestCase):
	def test_foo(self):
		data = [
			('https://github.com/python/cpython.git', 'github.com--python-cpython'),
		]

		for i in data:
			r = doconvert(i[0])

			self.assertEqual(r, i[1], msg='%s resulting in %s, should have been %s' % tuple(repr(x) for x in (i[0], r, i[1])))
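
Note that `doconvert` as committed calls `mat.groupdict()` without checking the match, so a URL that is not a `https://github.com/….git` address would fail with an `AttributeError` on `None`. One possible variant that reports the problem explicitly is sketched below. `doconvert_checked` is a hypothetical name, and the regex is the commit's pattern minus its redundant outer group.

```python
import re

# Same pattern as reponames.py, without the outer capturing group.
reponameregex = re.compile(
    r'^https://(?P<domain>github\.com)/(?P<slashpath>.*)\.git$')


def doconvert_checked(url):
    """Convert a repo URL to a ref-safe name, rejecting unknown URL shapes."""
    mat = reponameregex.match(url)
    if mat is None:
        raise ValueError('unsupported repo URL: %r' % (url,))

    gd = mat.groupdict()

    # Flatten the path so the name occupies a single ref component.
    return '%s--%s' % (gd['domain'], gd['slashpath'].replace('/', '-'))
```

A caller such as the `__main__` loop could catch the `ValueError` and skip or report the offending line instead of aborting the whole run.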

repos.txt (+2, -0)

@@ -0,0 +1,2 @@
#https://github.com/python/cpython.git
https://github.com/lark-parser/lark.git
