[Cryptech Tech] Updated README - and discussion on needed documentation

Rob Austein sra at hactrn.net
Sun Nov 16 01:22:20 UTC 2014


[NB: I read all of your message, trying to avoid quoting all of it]

At Fri, 14 Nov 2014 11:09:27 +0100, Joachim Strömbergson wrote:
> 
> Ok. Let's start by discussing and deciding what documentation we need for
> design objects and systems.
> 
> To me a design object is a separate piece of design. For example a
> HW-core or a SW-library, a set of schematics, a sub system etc. A design
> object is stored as a separate repo in the repo server. The object may
> have dependencies on other design objects. But in terms of
> documentation, the design object should be self contained. That is, if I
> clone a repo for an object I should find enough info in the repo to know
> what it does, how to build and use it. This includes info about what
> other objects to clone and what environment the object assumes.

Agreed, with the proviso that //somewhere// there needs to be overview
documentation that explains how all the design objects {fit together |
can be made to fit together | were intended to fit together} to do
something useful.

Randy is no doubt laughing at this point, but keep in mind that in at
least some cases, the person doing the build may not be the person who
does the in-depth review of the security properties (etc) of all the
pieces.  So we have to write doc not just for the wizard who's going
to go over everything with a fine tooth comb but also for the code
monkey who's been ordered to get the silly thing to build while the
wizard flies off to another conference with fine lunches and dinners.

In other words, we need to document not only for the reader who wants
to know all the details but also for the reader who wants to know the
bare minimum necessary to get his or her job done.

> To me the README file I presented is a first attempt at providing that
> kind of information. But a README is rarely the complete documentation
> but rather a short introduction to the object and the repo itself.

Sure.  But it's what we have at the moment, so other than whining
about nonexistent doc it's the only thing on which I can comment. :)

> We should therefore in my opinion have something like a doc/ dir in
> each repo for the real documentation. What is available there will vary
> with the specific object (object type - HW core, SW-lib, SW tool etc).
> 
> For a HW-core the general documentation is normally called Design
> Specification or Data Sheet. Sometimes you also have a Programmers
> Reference or similar (if the object is a CPU). Sometimes one also has a
> Product Brief or Flyer that gives a fast intro with key points and a
> nice illustration. A bit like the README, but more focused on the object
> itself. And prettier.
> 
> I would like to have to write as few documents as possible and would
> therefore have a single technical document unless it becomes cumbersome.
> Refactor when needed basically.
> 
> In my view, for a HW-core the prio order is:
> 
> (1) README with a short intro.
> (2) Data Sheet - with expanded stuff from READMEs, missing stuff.
> (3) Flyer
> 
> And then we have the Wiki that at least should contain something close
> to the README and the Flyer and then a link to the doc/ in the object repo.

I would probably leave the flyer to our nonexistent marketing
department.

I have worked on software products that had three levels of doc:

- Overview
- User's Guide
- Technical Reference

In practice this tends to be fairly expensive to maintain, as there's
a fair amount of overlap and things tend to get out of sync.

"Overview" in this kind of doc probably maps reasonably well to
README as you're using it.  Whether the more detailed doc is a User's
Guide or a Technical Reference depends to some extent on the intended
use.  Technical Reference is the thing that documents the complete
API, User's Guide tends to be more oriented towards helping the user
get started with common tasks.

In our case it may be that the User's Guide spans multiple design
objects, or maybe even all of our design objects.

Not married to any of this, just describing the stuff I know.

> We also need to discuss what the Data Sheet needs to contain. My attempt
> at the README was an attempt at doing that. Your comment on the missing
> API tells me I failed pretty hard. ;-)

Or that I'm a clueless software grunt who doesn't know how to read
your kind of doc yet.   Don't worry about it too much, this is a
hybrid project where we're cross-training each other as we go.

> Finally, bikeshedding, what document formats to use? My suggestions are:
> 
> (1) README in either plain text (UTF-8) or Markdown.

I'd go with Markdown.
> 
> (2) Data Sheet and Flyer as PDF. With source in ODT or TeX stored in
> doc/ too. That is two files for each type of file.

Agree on PDF.  I'd go with LaTeX for source, not just because I prefer
it (I do, but I suspect this is one of those religious things, like
Emacs vs vi) but also because, done properly, I suspect it would give
us more options for automatically generating omnibus documentation in
various formats without a lot of extra work.  LyX might also be worth
looking into in this context: I don't use it myself (I like Emacs),
but friends who need to do TeX but want a GUI in front of it seem to
think pretty highly of LyX.

  http://www.lyx.org/

> > Nit: The description of toolruns/ is kind of circular, and assumes
> > one knows what kind of tools one wants to run there.  I think maybe
> > you mean "test tools" or something like that, since as far as I can
> > tell everything there is running stuff that in a software world one
> > might lump under "make test".
> 
> No. To me, toolruns/ is where all tools are executed.

Ah.  OK, we're dealing with different conventions here.

In the open source software world, by far the most common convention
is some kind of language-specific thing in the top-level directory.
Eg, for (non-Microsoft) C code, the "./configure && make" convention
is pretty widespread; for Python, it's "python setup.py build", etc.
Other environments (eg, IDE C environments like Microsoft's or the
proprietary compilers used in various embedded environments) have
their own conventions, some of which don't support the command line at
all, but at least in my experience it's almost always the default
assumption that one does something in the top-level directory.

Note that the Makefile in a top-level directory may just cd to a
subdirectory and run a recursive make there, that's fine, and may be
the right answer in this case, but there needs to be //something//
obvious in the top-level directory, so if we're not going to have a
Makefile/setup.py/... there we need to mention it in the README.md for
every design object.

> As you have seen from using ISE, FPGA tools have a tendency to generate
> a huge number of files and dirs. This way of using toolruns makes it
> possible to keep it under control. This is esp important since one often
> uses 3, 4, 5 or more tools. Right now I'm for example using Icarus
> Verilog, ModelSim, Altera Quartus, Xilinx ISE (and several of its
> separate tools) and Verilator.

Agreed.  I have no problem with using directory trees to hide that
awfulness, I just want something a bit more obvious at the top.

> One issue here is how to handle tool-specific input files such as design
> constraint files, but also project files (that define files to
> include). Right now I add these to the repo where the tool uses them,
> for example in the quartus/terasic_c5g dir. But this might not be the
> best place. My view is that they are related to the specific target and
> thus belong there.

I don't have a better theory.

> Another issue is generated implementations such as a bitfile for a
> Xilinx based design. I think that we want to provide ready made files
> and not force people to always build their own bitfiles. I think these
> files should be in the repo and in the specific platform subdir in the
> repo. Others are hardline on never checking in generated files.

Mostly agree.  Religion aside, the main argument I see against
checking in bitfiles is that there is no way a user could possibly
verify that there is any real relation between the bitfile and the
reviewable source.  Of course, we have enough tool chain issues at the
moment that they can't be sure of that even if they build it
themselves, so this may not matter.

> (1) Each object needs to describe what it does, how to use it and what
> it assumes in order to be usable.
>
> (2) For subsystems, it also needs info on how it relates to other cores.

> Here we have an important thing to decide - the relation between HW
> cores and SW drivers. Should drivers be a separate design object or part
> of the HW-object? I assume they can be different as long as the objects
> points to each other.

I suspect they will need to be different, as a driver is generally
trying to make a particular piece of hardware work in a particular
software environment, and is therefore specific to both the hardware
in question and the software environment in question.

But it all depends on how we're packaging this.  If we think of it all
as a set of hardware modules that plug into a single unified software
stack, putting each driver with the hardware it's designed to drive
would make sense; if it's many-to-many, either the drivers become
design objects in their own right or they end up as part of "board
support packages" (BSPs).
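
As a rough illustration of the many-to-many case, here is a minimal C
sketch in which the hardware-specific part of a driver is written against
a small table of bus-access functions supplied by the software
environment. Everything in it is an assumption made up for illustration:
the names bus_ops, core_start, fake_read32 and fake_write32, and the
CORE_ADDR_CTRL address, are hypothetical and do not come from any
existing Cryptech code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Environment-specific part: how this environment reaches a register.
 * Hypothetical interface, not an existing API. */
struct bus_ops {
    uint32_t (*read32)(void *ctx, uint32_t addr);
    void     (*write32)(void *ctx, uint32_t addr, uint32_t value);
    void     *ctx;   /* opaque state owned by the environment */
};

/* Hardware-specific part: knows the core's registers, not the bus.
 * The address is a placeholder, not taken from a real address map. */
#define CORE_ADDR_CTRL 0x08u

static void core_start(const struct bus_ops *bus)
{
    assert(bus != NULL && bus->write32 != NULL);
    bus->write32(bus->ctx, CORE_ADDR_CTRL, 1u);
}

/* One possible environment: a fake bus backed by an array, so the
 * hardware-specific code can be exercised on an ordinary host. */
#define FAKE_BUS_WORDS 16u

static uint32_t fake_read32(void *ctx, uint32_t addr)
{
    uint32_t *mem = ctx;
    assert(addr < FAKE_BUS_WORDS);
    return mem[addr];
}

static void fake_write32(void *ctx, uint32_t addr, uint32_t value)
{
    uint32_t *mem = ctx;
    assert(addr < FAKE_BUS_WORDS);
    mem[addr] = value;
}
```

In this arrangement core_start() could live with the hardware design
object, while each environment (or BSP) would supply its own bus_ops;
whether that split is the right one for this project is exactly the open
question above.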

> > If these were software libraries, I'd say that what's missing is the 
> > API documentation.
...
> My README and esp the table was a first attempt at that. What you
> normally have for a core is:
> 
> (1) A description of the logical and possibly even the electrical interface
> including timing access restrictions.
> 
> (2) A memory map with all memories and registers. The memory map
> describes what each register does, how you may access it (Read/Write),
> preconditions and postconditions including side effects, possible errors,
> interrupts that can be caused etc.
> 
> (1) was very badly described. (2) was started in the table. My hope was
> this would be good enough to get feedback on to improve, not just
> "missing". ;-)

The address map for the SHA-1 core looks pretty complete.  My problem
is that I have no clue how to get to those registers, because I don't
understand the multiple layers of other Verilog that sit between my
software and the SHA-1 core.

At the end of the day I'm just a software weenie.  Show me assembly
language (which includes C) I can execute to get the core to talk to
me and I'm happy.  Not immediately obvious how to bridge the gap from
your address map to assembly language. :)
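
To make the request concrete, the kind of thing meant here might look
roughly like the following C sketch. It is not the real interface: the
register offsets, the bit definitions and the helper names (reg_read,
reg_write, wait_ready) are all hypothetical, and the "hardware" is a
plain array (sim_regs) so that the code can run on any host. On a real
system the base pointer would have to come from whatever the platform
provides for reaching the FPGA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register offsets, in 32-bit words. Real values would
 * come from the core's address map. */
enum {
    REG_CTRL   = 0x08,  /* write: control bits */
    REG_STATUS = 0x09,  /* read: status bits */
    REG_COUNT  = 0x40   /* size of the simulated register window */
};

#define CTRL_INIT    (1u << 0)  /* hypothetical "start" bit */
#define STATUS_READY (1u << 0)  /* hypothetical "ready" bit */

/* Stand-in for the memory-mapped core: a zero-initialized array. */
static volatile uint32_t sim_regs[REG_COUNT];

static uint32_t reg_read(volatile uint32_t *base, size_t offset)
{
    assert(offset < REG_COUNT);
    return base[offset];
}

static void reg_write(volatile uint32_t *base, size_t offset, uint32_t value)
{
    assert(offset < REG_COUNT);
    base[offset] = value;
}

/* Poll the status register up to max_polls times.
 * Returns 0 once the ready bit is seen, -1 if it never appears. */
static int wait_ready(volatile uint32_t *base, unsigned max_polls)
{
    while (max_polls-- > 0) {
        if (reg_read(base, REG_STATUS) & STATUS_READY)
            return 0;
    }
    return -1;
}
```

A caller would write CTRL_INIT to REG_CTRL and then call wait_ready();
the missing piece is still how a pointer like sim_regs maps onto the
layers of Verilog between the software and the core.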

This is why I keep asking about buses and coretest and how all the
pieces fit together.  It's also why I ordered a textbook on Verilog
(allegedly waiting for me when I get home), since expecting this level
of detail from you is probably unreasonable.

Also keep in mind that the Novena environment is a bit weird: we're
trying to simulate a deeply embedded software environment (no
operating system worthy of the name) inside a Linux process talking to
an FPGA.  This is not really what we expect anybody to use in
production, we're just doing it because it's so much easier to develop
and debug software in this kind of environment.  At some point we're
going to have to do the real embedded stuff on bare silicon (or
perhaps in a soft-core CPU), but we haven't really done anything with
that yet.

> > One overall thing that confuses me: at present we're hooking just 
> > about everything up into coretest, which makes sense, but when 
> > somebody integrates this, aren't they going to need something that 
> > fills the same role as coretest does for us?  Is that thing going to 
> > look significantly different from coretest or is coretest just 
> > misnamed or ...?  See previous "how does it all fit together", and 
> > expand that to "how should this all fit together?"
> 
> Yes. No. Possibly. It depends. ;-)

Oh good, clarity :)

Full disclosure: my own direct experience with writing device drivers
is quite minimal, and most of it was a very long time ago.  I think
Paul has more relevant experience here than I do (much of it also
quite long ago, but I suspect it's like riding a bicycle...).

