[Cryptech Core] Project infrastructure: Trac, Gitolite, cgit, Gogs, and Pelican
Rob Austein
sra at hactrn.net
Tue Aug 28 18:11:35 UTC 2018
You can skip this if you don't care about the infrastructure we use
for our git repositories, engineering wiki, et cetera.
So we've been using Trac since the Cryptech Project's inception. It
did pretty well by us for a while, but, as Trac instances always seem
to do sooner or later, it started developing problems. This appears
to be inherent in the Trac code base and the way it's (not)
maintained, and no, we're not going to launch a project to debug Trac.
At the moment, our wiki occasionally hangs, and the git repository
browser is slow even when it's working. We can get by for a while
just restarting Trac occasionally, but if past experience is any guide
this will just gradually get worse in the long run.
Part of the problem is pretty clearly the git browser. For a long
time Subversion was the only first-class VCS citizen in Trac, and
while git is supported out of the box now, it's quirky, and the
implementation is fairly inefficient.
Losing content in the Wiki is also part of the problem: it's all
there, somewhere, but most people seem unable to find half of it, and
even for those of us with sufficient Trac-fu to find stuff, it's
tedious.
Then there's gitolite, which does the job but does not really seem to
be anybody's friend.
So I've been looking at alternate tools. The ones I've been
considering for some or all of this job are Gogs, cgit, and Pelican.
* Gogs is a GitHub work-alike. If you've used GitHub, you pretty much
already know how Gogs works. It's not difficult to set up, and
several other projects I'm involved in use it. If we were to
install it, Gogs would be a replacement for both Gitolite and for
the git-browser functions of Trac.
There are, however, a few caveats:
1) Like GitHub, and like most other git hosting solutions, Gogs does
not support hierarchical trees of repository URLs. It's a two
level space: owner/repository, where "owner" is either an
individual or an organization. Just like GitHub. This means
that our current hierarchy of repositories would have to change.
Separate repositories are no problem, and the git submodule
tooling we use for release engineering would still work just
fine, but the external view of a tree of repositories, no.
If necessary, we could provide backwards-compatible HTTPS URLs
for the current set of repositories via Apache configuration
rules (redirects, mod_rewrite, whatever), so we don't need to
break existing clones of our code by outside parties. Hacking
backwards-compatible SSH URLs is probably not worth the trouble:
there aren't that many of us with SSH access, and we can write a
script that updates existing clones using "git remote".
2) Gogs supports git hooks, so we could keep our existing
all-commits-must-be-signed enforcement hook, but the antics
involved in forcing this hook site-wide would be a bit tedious
(mostly for me, since the point of the exercise would be to make
it automatic rather than leaving it as a per-repository choice).
3) Gogs, like GitHub, has wikis, but like GitHub wikis, they're per
repository, which is not a great match for our current Trac wiki
content. This is where something like Pelican comes in (below).
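As a rough illustration of the "git remote" script mentioned in caveat 1, here is a minimal sketch. The new host name, the owner name, and the rule that flattens a hierarchical path into a single repository name by joining its components with "-" are all assumptions made up for this example; nothing in this message fixes any of them.

```python
import subprocess
from urllib.parse import urlsplit

# Hypothetical values: neither the new host nor the owner name is decided.
NEW_BASE = "git@git.example.org"
OWNER = "cryptech"


def flatten_repo_path(old_path):
    """Map a hierarchical path like 'core/hash/sha256.git' to one flat name.

    Joining the components with '-' is an assumed naming scheme.
    """
    parts = [p for p in old_path.strip("/").split("/") if p]
    if not parts:
        raise ValueError("empty repository path")
    if parts[-1].endswith(".git"):
        parts[-1] = parts[-1][:-len(".git")]
    return "-".join(parts)


def new_remote_url(old_url):
    """Translate an old SSH remote URL into the owner/repository form."""
    # Accept both ssh://user@host/path and scp-style user@host:path URLs.
    if "://" in old_url:
        path = urlsplit(old_url).path
    else:
        _, _, path = old_url.partition(":")
    return f"{NEW_BASE}:{OWNER}/{flatten_repo_path(path)}.git"


def update_clone(clone_dir, remote="origin"):
    """Point an existing clone's remote at the new URL and return that URL."""
    old_url = subprocess.run(
        ["git", "-C", clone_dir, "remote", "get-url", remote],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    url = new_remote_url(old_url)
    subprocess.run(
        ["git", "-C", clone_dir, "remote", "set-url", remote, url],
        check=True,
    )
    return url
```

Calling update_clone() on each local clone would be enough for those of us with SSH access; the same mapping function could also be used to generate the Apache redirect rules for the HTTPS URLs.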
* If we just wanted to disable the Trac repository viewer and leave
Trac's wiki functions intact, we'd want a replacement repository
viewer. GitWeb, which comes with git, is a Perl CGI hack: we could
use it, but since I suspect that web crawlers continuously
retrieving every blob in every commit in every repository are part
of what is giving Trac indigestion, I'm concerned that GitWeb would have
essentially the same problem.
One possible solution if we just want a replacement git viewer would
be cgit, which is supposedly a lot faster (written in C, using git's
libraries) and has some built-in caching capabilities to defend
against drowning in spiders.
In this context, I see Gogs vs cgit as a choice: we probably
wouldn't run both of them.
* Pelican is a static content generation tool which takes input in
several formats (including Markdown) and converts it into a
(moderately) pretty website. The default behavior is blog-like: the
default content type is "Article", which is dated content, but it
also supports static "Pages" which are not dated (generally used for
infrastructure content, e.g., "About"). I've used Pelican a bit for a
personal blog and found it pretty easy to work with.
The things that make it potentially useful here are:
1) It's simple to set it up to work hand in hand with git: you just
add content by editing Markdown files, commit and push. The push
triggers the web site generation, and git's built-in ff-only
check for pushes handles serialization if two people happen to be
editing at the same time.
2) Because the input content is just Markdown in a git repository,
you can use grep to search it if necessary :)
3) It occurred to me while thinking about this that a lot of our
current content is in fact more "article-like" than "page-like".
That is: we have a bunch of content talking about the Novena or
about what we were doing three years ago, or .... That stuff is
in fact dated, and perhaps a presentation style which
automatically treated it as dated material would make more sense
to the reader than the current mess. We could of course still
have "pages" for stuff that should never age; yes, we would
probably want that to be a relatively small set, but guess what,
absent somebody doing serious content maintenance, we kind of
need that to be a small set anyway because otherwise we'll have
just replicated the current mess where everything is somewhere
but nobody can find it.
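To make the push-triggers-generation idea in point 1 concrete, here is a minimal sketch of a post-receive hook for the bare repository holding the Markdown. The branch name, the output directory, and the repository layout (a "content" directory plus a "pelicanconf.py" at the top level) are assumptions for illustration, not anything we have set up.

```python
#!/usr/bin/env python3
import io
import subprocess
import sys
import tarfile
import tempfile

# Assumed values: the branch that triggers a rebuild and where the site lands.
PUBLISH_BRANCH = "master"
PUBLISH_REF = "refs/heads/" + PUBLISH_BRANCH
OUTPUT_DIR = "/var/www/wiki"


def should_rebuild(hook_input):
    """Return True if any pushed ref update touches PUBLISH_REF.

    git feeds a post-receive hook one line per updated ref on stdin:
    '<old-sha> <new-sha> <refname>'.
    """
    for line in hook_input:
        fields = line.split()
        if len(fields) == 3 and fields[2] == PUBLISH_REF:
            return True
    return False


def pelican_command(worktree):
    """Pelican invocation for a checkout laid out as assumed above."""
    return ["pelican", f"{worktree}/content", "-o", OUTPUT_DIR,
            "-s", f"{worktree}/pelicanconf.py"]


def main():
    if not should_rebuild(sys.stdin):
        return
    with tempfile.TemporaryDirectory() as worktree:
        # Export the branch with "git archive" so that the bare repository's
        # HEAD and index are left alone.
        archive = subprocess.run(
            ["git", "archive", "--format=tar", PUBLISH_BRANCH],
            check=True, capture_output=True,
        ).stdout
        with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
            tar.extractall(worktree)
        subprocess.run(pelican_command(worktree), check=True)

# When installed as hooks/post-receive, the script would end with: main()
```

The hook only regenerates the site; rejecting a non-fast-forward push from whoever lost the editing race is left to git itself.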
There are of course other approaches to static content generation,
and I suppose there might be project members who actually *like*
editing things in a web browser instead of in a proper text editor,
but I think that something like Pelican would be a step forward from
where we are now.
Salvaging wiki content from Trac is trivial: there's a one-line
command to the admin tool which will dump the entire wiki into a
directory, one file per page. Converting from TracWiki syntax to
Markdown is a bit more work, but nothing very difficult, and a lot
of it can be automated with a few nasty regexp replacements.
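For a flavor of what those regexp replacements might look like, here is a minimal sketch that would run over the files produced by the dump (presumably "trac-admin <env> wiki dump <directory>"). It covers only a handful of TracWiki constructs: headings, inline code, bold, italic, and external links with labels. Multi-line {{{ }}} blocks, tables, macros, and CamelCase wiki links are deliberately left alone, and the rule set is an assumption about what our pages mostly contain, not a survey of them.

```python
import re

# Each rule is (compiled pattern, replacement), applied in order. Bold (three
# quotes) has to be handled before italic (two quotes).
RULES = [
    # = Heading = ... ====== Heading ======  ->  # Heading ... ###### Heading
    (re.compile(r"^(=+)[ \t]*(.*?)[ \t]*=+[ \t]*$", re.M),
     lambda m: "#" * len(m.group(1)) + " " + m.group(2)),
    # {{{inline code}}}  ->  `inline code`  (single-line only)
    (re.compile(r"\{\{\{(.+?)\}\}\}"), r"`\1`"),
    # '''bold'''  ->  **bold**
    (re.compile(r"'''(.+?)'''"), r"**\1**"),
    # ''italic''  ->  *italic*
    (re.compile(r"''(.+?)''"), r"*\1*"),
    # [http://example.org label]  ->  [label](http://example.org)
    (re.compile(r"\[(https?://\S+)\s+([^\]]+)\]"), r"[\2](\1)"),
]


def tracwiki_to_markdown(text):
    """Apply each replacement rule in turn and return the converted text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Whatever the rules miss would still need a manual pass, but that can be done page by page after the content is already in git.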
Joachim has suggested that we might want to make some time to talk
about this stuff at the f2f meeting in October. I'm fine with that,
but would recommend putting it towards the end of the agenda, as
discussions about this kind of tool choice either never get started
(yawn) or fail to terminate except by exhaustion (bikeshed syndrome).
OK, the $dayjob thing I was waiting for is done compiling now.