Eric S. Raymond
2018-10-01 13:53:17 UTC
This is a note to the GCC list that I have not given up on the
repository translation. Having concluded that doing forensics with
9-hour test turnaround times is insupportable and that I had reached
the limit of what off-the-shelf hardware and my existing tools could
do, I am engaged in translating reposurgeon to the Go language.
I expect this to speed up my test cycles by at least an order of
magnitude. Actually, published Python-vs.-Go benchmarks for programs
with similar resource-usage distributions suggest that a speedup
factor of *40* is a realistic expectation. That would pull my test
runs down to about 15 minutes each.
I also expect a drastic reduction in working set, eliminating the
memory pressure I was having on the later runs. Python's per-object
overhead is pretty large (for example, every Python string has a fixed
base cost of 40 bytes). Go's is perhaps a bit larger than C's due to
its type-reflection support, but not overly so.
Even though the semantic gap between Python and Go is relatively
small, translating 14KLOC of algorithmically dense Python to
*anything* else is necessarily a huge undertaking. I did not wish to
announce the effort until I was well into it and reasonably confident
of success.
This morning the translation reached its 30% mark. That gives me
reasonable confidence.
There is still difficulty ahead, however. A copy-on-write store
that was central to the interpretation of Subversion dumps turned
out not to be practically implementable in a language with static
types and no iterators.
I'm going to have to rework significant parts of the Subversion dump
interpreter to get around that, and if you guessed that that is the
most opaque and dangerous part of the codebase (and the locus of the
translation bugs reported here) you got that right.
I shall persevere. The necessity was unfortunate, but one effect of
the translation is that I'm writing unit tests for each piece as I
go. There is good reason to hope that this will make fault isolation
easier than it was in the Python version.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
Gun Control: The theory that a woman found dead in an alley, raped and
strangled with her panty hose, is somehow morally superior to a
woman explaining to police how her attacker got that fatal bullet wound.
-- L. Neil Smith