[Cryptech Core] Project infrastructure: Trac, Gitolite, cgit, Gogs, and Pelican

Rob Austein sra at hactrn.net
Tue Aug 28 18:11:35 UTC 2018


You can skip this if you don't care about the infrastructure we use
for our git repositories, engineering wiki, et cetera.

So we've been using Trac since the Cryptech Project's inception.  It
did pretty well by us for a while, but, as Trac instances always seem
to do sooner or later, it started developing problems.  This appears
to be inherent in the Trac code base and the way it's (not)
maintained, and no, we're not going to launch a project to debug Trac.

At the moment, our wiki occasionally hangs, and the git repository
browser is slow even when it's working.  We can get by for a while
just restarting Trac occasionally, but if past experience is any guide
this will just gradually get worse in the long run.

Part of the problem is pretty clearly the git browser.  For a long
time Subversion was the only first-class VCS citizen in Trac, and
while git is supported out of the box now, it's quirky, and the
implementation is fairly inefficient.

Losing content in the Wiki is also part of the problem: it's all
there, somewhere, but most people seem unable to find half of it, and
even for those of us with sufficient Trac-fu to find stuff, it's
tedious.

Then there's gitolite, which does the job but does not really seem to
be anybody's friend.

So I've been looking at alternate tools.  The ones I've been
considering for some or all of this job are Gogs, cgit, and Pelican.

* Gogs is a GitHub work-alike.  If you've used GitHub, you pretty much
  already know how Gogs works.  It's not difficult to set up, and
  several other projects I'm involved in use it.  If we were to
  install it, Gogs would be a replacement for both Gitolite and for
  the git-browser functions of Trac.

  There are, however, a few caveats:

  1) Like GitHub, and like most other git hosting solutions, Gogs does
     not support hierarchical trees of repository URLs.  It's a two
     level space: owner/repository, where "owner" is either an
     individual or an organization.  Just like GitHub.  This means
     that our current hierarchy of repositories would have to change.
     Separate repositories are no problem, and the git submodule
     tooling we use for release engineering would still work just
     fine, but the external view of a tree of repositories, no.

     If necessary, we could provide backwards-compatible HTTPS URLs
     for the current set of repositories via Apache configuration
     rules (redirects, mod_rewrite, whatever), so we don't need to
     break existing clones of our code by outside parties.  Hacking
     backwards-compatible SSH URLs is probably not worth the trouble:
     there aren't that many of us with SSH access, and we can write a
     script that updates existing clones using "git remote".

  2) Gogs supports git hooks, so we could keep our existing
     all-commits-must-be-signed enforcement hook, but the antics
     involved in forcing this hook site-wide would be a bit tedious
     (mostly for me, since the point of the exercise would be to make
     it automatic rather than leaving it as a per-repository choice).

  3) Gogs, like GitHub, has wikis, but like GitHub wikis, they're per
     repository, which is not a great match for our current Trac wiki
     content.  This is where something like Pelican comes in (below).
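
For the SSH case in caveat 1, the script that updates existing clones
could be as small as the Python sketch below.  The new host
(git.example.org), the owner name "cryptech", and the rule of joining
path components with hyphens are placeholders for illustration only,
not decisions; the real mapping would be whatever naming we settle on.

```python
import subprocess

# Hypothetical values: the new host, the owner name, and the flattening
# rule are illustrative assumptions, not an agreed naming scheme.
NEW_BASE = "git@git.example.org:"
OWNER = "cryptech"


def flatten_repo_path(old_path: str, owner: str = OWNER) -> str:
    """Map a hierarchical path like 'group/sub/repo.git' to 'owner/group-sub-repo.git'."""
    trimmed = old_path.strip("/")
    if trimmed.endswith(".git"):
        trimmed = trimmed[:-4]
    parts = [p for p in trimmed.split("/") if p]
    if not parts:
        raise ValueError("empty repository path")
    return f"{owner}/{'-'.join(parts)}.git"


def new_remote_url(old_url: str, old_prefix: str) -> str:
    """Translate an old remote URL into its two-level replacement."""
    if not old_url.startswith(old_prefix):
        raise ValueError(f"{old_url!r} does not start with {old_prefix!r}")
    return NEW_BASE + flatten_repo_path(old_url[len(old_prefix):])


def update_clone(clone_dir: str, old_prefix: str, remote: str = "origin") -> None:
    """Point one remote of an existing clone at the new URL using 'git remote'."""
    old_url = subprocess.run(
        ["git", "-C", clone_dir, "remote", "get-url", remote],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    subprocess.run(
        ["git", "-C", clone_dir, "remote", "set-url", remote,
         new_remote_url(old_url, old_prefix)],
        check=True,
    )
```

Running update_clone() once per working clone would be enough; clones
whose remote does not match the old prefix raise an error rather than
being rewritten silently.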

* If we just wanted to disable the Trac repository viewer and leave
  Trac's wiki functions intact, we'd want a replacement repository
  viewer.  GitWeb, which comes with git, is a Perl CGI hack: we could
  use it, but since I suspect that web crawlers continuously
  retrieving every blob in every commit in every repository are part
  of what is giving Trac indigestion, I'm concerned that GitWeb would
  have essentially the same problem.

  One possible solution if we just want a replacement git viewer would
  be cgit, which is supposedly a lot faster (written in C, using git's
  libraries) and has some built-in caching capabilities to defend
  against drowning in spiders.

  In this context, I see Gogs vs cgit as a choice: we probably
  wouldn't run both of them.

* Pelican is a static content generation tool which takes input in
  several formats (including Markdown) and converts it into a
  (moderately) pretty website.  The default behavior is blog-like: the
  default content type is "Article", which is dated content, but it
  also supports static "Pages" which are not dated (generally used for
  infrastructure content, e.g. "About").  I've used Pelican a bit for a
  personal blog and found it pretty easy to work with.

  The things that make it potentially useful here are:

  1) It's simple to set it up to work hand in hand with git: you just
     add content by editing Markdown files, commit and push.  The push
     triggers the web site generation, and git's built-in ff-only
     check for pushes handles serialization if two people happen to be
     editing at the same time.

  2) Because the input content is just Markdown in a git repository,
     you can use grep to search it if necessary :)

  3) It occurred to me while thinking about this that a lot of our
     current content is in fact more "article-like" than "page-like".
     That is: we have a bunch of content talking about the Novena or
     about what we were doing three years ago, or ....  That stuff is
     in fact dated, and perhaps a presentation style which
     automatically treated it as dated material would make more sense
     to the reader than the current mess.  We could of course still
     have "pages" for stuff that should never age; yes, we would
     probably want that to be a relatively small set, but guess what,
     absent somebody doing serious content maintenance, we kind of
     need that to be a small set anyway because otherwise we'll have
     just replicated the current mess where everything is somewhere
     but nobody can find it.
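
The push-triggers-generation setup from point 1 can be sketched as a
post-receive hook in the bare repository that holds the content.  The
branch name, the two directories, and the content/pelicanconf.py
layout are assumptions for illustration; only the general shape
(check out the pushed branch, then run pelican over it) is the point.

```python
#!/usr/bin/env python3
import subprocess
import sys

# Hypothetical settings: adjust branch and paths to the real installation.
BRANCH = "refs/heads/master"
WORK_TREE = "/var/lib/wiki/checkout"   # must already exist
OUTPUT_DIR = "/var/www/wiki"


def pushed_refs(lines):
    """Parse post-receive input lines of the form '<old> <new> <refname>'."""
    refs = []
    for line in lines:
        fields = line.split()
        if len(fields) == 3:
            refs.append(fields[2])
    return refs


def build_commands(work_tree=WORK_TREE, output_dir=OUTPUT_DIR, branch=BRANCH):
    """Return the commands that check out the branch and regenerate the site."""
    short_name = branch[len("refs/heads/"):]
    return [
        ["git", f"--work-tree={work_tree}", "checkout", "-f", short_name],
        ["pelican", f"{work_tree}/content", "-o", output_dir,
         "-s", f"{work_tree}/pelicanconf.py"],
    ]


def main(stdin=sys.stdin):
    # Rebuild only when the published branch was part of this push.
    if BRANCH in pushed_refs(stdin):
        for cmd in build_commands():
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main()
```

Because git rejects a non-fast-forward push before this hook ever
runs, two people editing at once just means the second one has to pull
and push again; the hook itself needs no locking for that case.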

  There are of course other approaches to static content generation,
  and I suppose there might be project members who actually *like*
  editing things in a web browser instead of in a proper text editor,
  but I think that something like Pelican would be a step forward from
  where we are now.

  Salvaging wiki content from Trac is trivial: there's a one-line
  command in the admin tool which will dump the entire wiki into a
  directory, one file per page.  Converting from TracWiki syntax to
  Markdown is a bit more work, but nothing very difficult, and a lot
  of it can be automated with a few nasty regexp replacements.
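
To give a flavor of the "nasty regexp replacements", here is a rough
Python sketch that handles a handful of common TracWiki constructs:
headings, bold, italics, code blocks, inline code, and external links.
It is nowhere near a complete converter (tables, macros, and wiki:
links are ignored), and the function name is made up for this example.

```python
import re

FENCE = "`" * 3  # Markdown code fence, built this way to keep the source readable


def tracwiki_to_markdown(text: str) -> str:
    """Convert a small subset of TracWiki markup to Markdown, line by line."""
    out = []
    in_code = False
    for line in text.splitlines():
        stripped = line.strip()
        # {{{ ... }}} on their own lines delimit a preformatted block.
        if stripped == "{{{" and not in_code:
            in_code = True
            out.append(FENCE)
            continue
        if stripped == "}}}" and in_code:
            in_code = False
            out.append(FENCE)
            continue
        if in_code:
            out.append(line)  # leave code untouched
            continue
        # "== Title ==" becomes "## Title"; trailing equals signs are optional.
        heading = re.match(r"^(={1,6})\s+(.*?)\s*=*\s*$", line)
        if heading:
            out.append("#" * len(heading.group(1)) + " " + heading.group(2))
            continue
        line = re.sub(r"\{\{\{(.+?)\}\}\}", r"`\1`", line)      # inline code
        line = re.sub(r"'''(.+?)'''", r"**\1**", line)          # bold first
        line = re.sub(r"''(.+?)''", r"*\1*", line)              # then italics
        line = re.sub(r"\[(https?://\S+)\s+([^\]]+)\]", r"[\2](\1)", line)
        out.append(line)
    return "\n".join(out)
```

Run over each file in the dump directory, something like this would
get most pages close enough that the remaining cleanup can be done by
hand in a text editor.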

Joachim has suggested that we might want to make some time to talk
about this stuff at the f2f meeting in October.  I'm fine with that,
but would recommend putting it towards the end of the agenda, as
discussions about this kind of tool choice either never get started
(yawn) or fail to terminate except by exhaustion (bikeshed syndrome).

OK, the $dayjob thing I was waiting for is done compiling now.

