Fixpoint

2020-06-27

A bevy of fixes for V in Perl

Filed under: Software, V — Jacob Welsh @ 21:49

"Fixes" may be a strong term, in that it could be argued the item in question wasn't exactly broken. It was however darn near unusable in practice, and contained a number of fragile and unstated assumptions about its environment and invocation.

I thought I'd introduce my work by putting it in context with a quick recap of what V is; but then it turned out that my understanding of the matter, and perhaps even my approach to gaining understanding, needed some fixing too.(i) In short, it's nearsighted to define V as an improved kind of version control system; rather, its proper placement is a few levels up in the tree of concepts, as a new way of thinking about, talking about, and deploying software, broadly construed. Thus I surmise it's also inadequate to consider the present "V in Perl" artifact, its label notwithstanding, a kind of reference implementation of V; it is rather an early implementation of a relatively small part of the overall vision.

Still, whether I fully manage to see in this horseless carriage prototype a "transportation revolution to permanently alter the shape of civilization" or "just a faster kind of cart" - or something in between - if I'm to be riding in it then I want the wheels on straight; and I'm plenty capable of seeing to that.

The first patch packs in quite a few relatively small and independent fixes, building off the GNAT-demanding ksum/vpatch branch. Paralleling my previous work, the second swaps in keksum and patch to avoid the secondary compiler requirement (a change that's now more straightforward).

Download

Changes

Taking the main patch, one item at a time, from the manifest:

(1) Eliminate use of external binaries (cat ls sort pwd which) and provide more hygienic directory listing;

Now I'm no Perl monk (and would likely never choose it myself for a new project) but I know you don't have to open subshells and pipelines just in order to read a file or list a directory, really now! The "ls" scraping is a particularly risky pattern due to the possibility of unexpected control characters in filenames. I was surprised to learn that stock Perl lacks an internal "getcwd" function, but the "pwd" and "which" usage turned out to be doing more harm than good anyway and was easily eliminated.
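To illustrate the general idea (in Python rather than the Perl of the patch, and not taken from v.pl), here is a minimal sketch of listing a directory and reading a file through the OS interface instead of scraping "ls" and "cat". The function names and the ".vpatch" default are assumptions made for the example.

```python
import os

def list_patch_names(patch_dir, suffix=".vpatch"):
    """Return sorted names of regular files in patch_dir ending in suffix.

    The directory is read through the OS API rather than by parsing `ls`
    output, so a name containing spaces, newlines or control characters
    comes back intact as a single entry.
    """
    names = []
    with os.scandir(patch_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(suffix):
                names.append(entry.name)
    return sorted(names)

def read_file(path):
    """Read a whole file without spawning `cat` in a subshell."""
    with open(path, "rb") as f:
        return f.read()
```

Because no shell is involved, there is nothing to quote and no output format to guess at; a filename is whatever bytes the directory entry holds.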

(2) add missing error handling in build_wot;

This is a repeated pattern in the code, and one I probably haven't entirely eliminated. Much like the underlying C, many Perl functions don't raise exceptions but expect the caller to check manually for errors. Much like the underlying C programmers, many Perl programmers don't bother with that pesky error checking stuff. Combine this with mutable variables repeatedly reinitialized by a loop on the assumption that nothing fails, and you get all sorts of interesting leakage possibilities. In this case, if one of the GPG key files placed in ".wot" was invalid, the program would not only fail to notice but label the bad key with the metadata of an unrelated victim that was seen previously.(ii)
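The leakage pattern described above can be reproduced in a few lines. This is a sketch in Python with a made-up key format ("id:owner") and hypothetical function names; it is not the build_wot code, only the shape of the bug and of the fix.

```python
def parse_key(text):
    """Hypothetical parser: return (key_id, owner), or None if text is invalid."""
    parts = text.split(":")
    if len(parts) != 2 or not parts[0] or not parts[1]:
        return None
    return parts[0], parts[1]

def build_wot_unchecked(key_files):
    """Buggy pattern: a failed parse goes unnoticed."""
    wot = {}
    key_id = owner = None
    for filename, text in key_files:
        parsed = parse_key(text)
        if parsed is not None:
            key_id, owner = parsed
        # On failure, key_id and owner still hold the PREVIOUS file's values.
        wot[filename] = (key_id, owner)
    return wot

def build_wot_checked(key_files):
    """Fixed pattern: fail loudly and keep results scoped to one iteration."""
    wot = {}
    for filename, text in key_files:
        parsed = parse_key(text)
        if parsed is None:
            raise ValueError(f"invalid key file: {filename}")
        wot[filename] = parsed
    return wot
```

Given a valid file followed by an invalid one, the unchecked version files the invalid key under the first key's metadata, while the checked version stops with an error naming the bad file.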

(3) eliminate some variable regex patterns;

These were a case of using a complex tool when a simple one suffices and, predictably, of not using it correctly (i.e. without quoting regex metacharacters).
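A short Python sketch of the hazard, with hypothetical function names: a filename interpolated into a pattern has its "." treated as "any character", whereas an escaped pattern (the counterpart of Perl's quotemeta) or a plain string test matches only the literal name.

```python
import re

def matches_as_pattern(name, line):
    """Buggy: name is used as a regex, so '.' matches any character."""
    return re.search(name, line) is not None

def matches_escaped(name, line):
    """Regex with metacharacters quoted, comparable to Perl's quotemeta."""
    return re.search(re.escape(name), line) is not None

def matches_plain(name, line):
    """The simple tool: a substring test with no regex involved."""
    return name in line
```

For example, the name "a.vpatch" matches the unrelated string "axvpatch" in the first function but not in the other two.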

(4) avoid slurping(iii) full vpatches and verbose output into memory;

Thus it should now work (albeit slowly) on vpatch files arbitrarily large with respect to the system's main memory, and in verbose mode it should flush the "patch" output to the terminal in closer to real time.
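The difference between slurping and streaming can be sketched as follows, in Python and with hypothetical helper names. The first function scans a patch-like stream while holding only one line at a time; the second relays output line by line, flushing as it goes. Counting "@@" hunk headers is just a stand-in for whatever per-line work the real tool does.

```python
import io

def count_hunks_streaming(stream):
    """Scan a text stream line by line; memory use does not grow with file size."""
    hunks = 0
    for line in stream:
        if line.startswith("@@"):
            hunks += 1
    return hunks

def relay(src, dst):
    """Copy src to dst one line at a time, flushing so output shows up promptly."""
    for line in src:
        dst.write(line)
        dst.flush()
```

Usage would be along the lines of opening the vpatch with open(path) and passing the file object in, rather than calling read() on it first.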

(5) fix exponential recursion blowup in traverse_press_path and get_all_descendant_nodes;

This was the original motivation for this work, as I'd noted:

Algorithmic inefficiency is a serious drawback of this tool. I suspect the "toposort" is something like O(n^2 log n) while "traverse_press_path" is exponential, like the textbook Fibonacci example of how not to do recursion. This becomes acutely noticeable around 32+ patches.

Exponential traversals turned out to be in two different functions; happily, the memoization needed to avoid revisiting the same subtrees over and over was already inherent in the data structures being built.

After tracking down the second instance, I used some "awk" to build a full list of recursive calls in the program for scrutiny (namely: traverse_press_path verify_ante remove_desc add_desc_edges add_desc_src_files), so I'm fairly confident there are no more heads on this particular hydra.
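For a feel of the blowup and its cure, here is a self-contained Python sketch; it is not the v.pl code, and the "ladder" graph and function names are invented for the demonstration. Each level has two nodes, each pointing to both nodes of the next level, so the number of paths doubles per level. The naive walk follows every path; the second walk uses the set of descendants it is already building as its record of what has been visited.

```python
def make_ladder(levels):
    """Build a DAG with two nodes per level, each pointing to both nodes below."""
    graph = {"root": ["a0", "b0"]}
    for i in range(levels):
        below = [f"a{i + 1}", f"b{i + 1}"] if i + 1 < levels else []
        graph[f"a{i}"] = list(below)
        graph[f"b{i}"] = list(below)
    return graph

def descendants_naive(graph, node, counter):
    """Revisit shared subtrees once per path: exponential in the depth."""
    counter[0] += 1
    found = set()
    for child in graph[node]:
        found.add(child)
        found |= descendants_naive(graph, child, counter)
    return found

def descendants_memo(graph, node, counter, seen=None):
    """Expand each node once; the result set doubles as the visited set."""
    if seen is None:
        seen = set()
    counter[0] += 1
    for child in graph[node]:
        if child not in seen:
            seen.add(child)
            descendants_memo(graph, child, counter, seen)
    return seen
```

On a ladder of 10 levels (20 nodes plus the root) both functions return the same 20 descendants, but the naive version makes 2047 calls against 21 for the memoized one, and each added level roughly doubles the former.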

(6) allow the patchdir to be a relative path;

It happened that relative paths worked already for the seals and wot directories.

(7) replace some numeric indexing with named variables;
(8) tweak hash program parsing to not require two spaces;
(9) document restricted positioning of global options handled distinctly from commands;

The option-handling code clearly wasn't doing what it was meant to be doing; I haven't exactly fixed that but have at least noted the extant restrictions on where the wotdir/patchdir/sealdir options may be given.

(10) make help and version commands work without a wotdir;

You can't demand I already know how to use a program in order to view its documentation! Well I mean, you can, but... you know what I mean?!

(11) allow patches and seals to share a directory but require standard extensions (.vpatch .sig);

I never quite saw the point of separate subdirectories here, so now you can just ln -s patches .seals and keep them all under "patches", saving a fair amount of pointless shuffling in my own usage. (The default ".seals" path is preserved for compatibility.)
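Once both kinds of file live in one directory, the extension is what tells them apart. A minimal Python sketch of that separation follows; the function names are hypothetical, and the "<patch>.<signer>.sig" naming is an assumption based on seal names such as linux-genesis.vpatch.bvt.sig.

```python
def split_patches_and_seals(names):
    """Partition directory entries by extension; anything else is ignored."""
    patches = sorted(n for n in names if n.endswith(".vpatch"))
    seals = sorted(n for n in names if n.endswith(".sig"))
    return patches, seals

def seals_for(patch, seals):
    """Seals belonging to a patch, assuming names of the form <patch>.<signer>.sig."""
    prefix = patch + "."
    return [s for s in seals if s.startswith(prefix)]
```

Requiring the standard extensions is what makes the shared directory safe: a stray README or editor backup is simply skipped rather than being mistaken for a patch or a seal.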

(12) take basenames of patch arguments to allow tab-completeable paths;

That is:

v.pl press a some_big_long_name.vpatch

can now be spelled as

v.pl press a patches/some_big_long_name.vpatch

(i.e. the actual path, although the prefix could be anything). This applies to all subcommands that take patch names.
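In Python terms the normalization amounts to the following one-liner (a sketch with a hypothetical function name; the patch itself is Perl):

```python
import os.path

def patch_name(arg):
    """Reduce a patch argument to its bare file name, dropping any directory prefix."""
    return os.path.basename(arg)
```

Both spellings of the argument shown above reduce to the same name, which is then looked up in the configured patch directory as before.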

(13) clean up the tempdir on SIGINT (^C);

Less important now that it doesn't get effectively wedged on exponential algorithms, but still.
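The same guarantee can be sketched in Python, where an interrupt arrives by default as a KeyboardInterrupt exception and a "finally" clause therefore covers it. The function name and the "vpress-" prefix are made up for the example; the Perl patch would instead install a signal handler.

```python
import shutil
import tempfile

def run_with_tempdir(work):
    """Run work(tmpdir), removing tmpdir on normal exit, on error, or on ^C."""
    tmpdir = tempfile.mkdtemp(prefix="vpress-")
    try:
        return work(tmpdir)
    finally:
        # KeyboardInterrupt (Python's default response to SIGINT) also
        # unwinds through this clause, so the directory never lingers.
        shutil.rmtree(tmpdir, ignore_errors=True)
```

The interrupt still propagates to the caller after cleanup, so the program exits as the user asked; only the leftover directory is prevented.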

(14) factor out some repetition;
(15) other minor simplifications.
Version 99989.

Stats

$ diffstat v_fix_exptimes_paths_etc.vpatch
 manifest |    1
 v.pl     |  234 +++++++++++++++++++++++++++++++++------------------------------
 2 files changed, 124 insertions(+), 111 deletions(-)

Enjoy!

  1. Full details in the logs:

    jfw: diana_coman: I'm trying for a concise intro/description of V, in the present (post-Republic) context. Does this about capture it: "versioning system that supports owner control of computing by placing primary focus on the change and explicit management of trust through strong cryptography" ?
    diana_coman: jfw - hm, what do you mean by "of computing" there?
    jfw: well, of the operation of one's own computers
    jfw: possibly a bit circular with "ownership of one's own"...
    diana_coman: it's more that the definition as you gave it doesn't do all that much - though it takes a few readings, hm.
    diana_coman: it's a bit tortured on various fences by the looks of it; for one thing, defining it as a versioning system cuts away an important part - the deployment of software that is usually not all that much the traditional concern of versioning systems
    diana_coman: jfw - what's the audience you have in mind there or is this blog/generic?
    jfw: it's the blog, yes - and partly for clarifying it for myself, heh.
    jfw: my grasp of what V does for deployment is basically to say that the other tools traditionally used for it aren't necessary
    diana_coman: ahaha, going for once fully-negative-space there (and that getting rid of all the other "tools" is not a tiny thing either at that, but it's more of a consequence than anything else)
    diana_coman: V is a complete solution in that sense, hence "the other tools [...] aren't necessary"
    jfw: (though um, it's still known to lean on 'wget' etc.)
    diana_coman: well, it also still requires an OS!
    diana_coman: anyways, I wouldn't say that "other tools are not necessary" - it's more that the change is so fundamental that previous tools don't fit /don't have a useful place anymore; other tools though *are* still necessary - only they need to be built
    diana_coman: it changes the whole landscape if you want
    diana_coman: but let's rewind and try to grab it from some more concrete end perhaps
    jfw: alright
    diana_coman: so for one thing, V is not some particular implementation but essentially a paradigm for software
    diana_coman: and software as a whole, not just development, nor even just deployment, it goes all the way to even what software *is*
    diana_coman: sure, one can use V for some narrow part that they care about and it's true that the first implementation was just that, a very narrow thing in fact, but that doesn't mean much.
    diana_coman: and I suppose that the current state of V-use and development otherwise might give the impression that there isn't anything more to it either, huh
    jfw: I suppose I've tried to understand the species based on observations of what's shared by the known instances
    diana_coman: jfw - you know, I think your attempt and question there hits actually deeper (and well done for it, too) than you intended, lol
    jfw: haha, indeed
    diana_coman: jfw - so where did you start from, anyway? from the current implementations of V, is that what you mean by the instances?
    jfw: right
    jfw: heh, you know the one about the blind men and the elephant?
    diana_coman: that kind of locks you unhelpfully into some rather sterile and narrow mindframe, myeah (and I'll leave the tracing of the root cause there to each log reader)
    diana_coman: jfw - hm? doesn't come to mind, no.
    jfw: apparently a story that exists in many versions, but basically each man feels a different part of the elephant and extrapolates a completely different (& quite incomplete) picture of what an elephant is.
    diana_coman: ah, the fable, yes
    diana_coman: I can see the similarity, indeed
    jfw: https://allpoetry.com/The-Blind-Man-And-The-Elephant - possibly the main English version.
    jfw: ponders how to "see true v-elephant with mind's eye"
    diana_coman: the thing is, V is not just a different type of versioning system - a bit like a car is not just a faster cart, hm
    diana_coman: jfw - well, better start from the beginning as it were which indeed is *not* whatever implementation, no matter what claims are made otherwise; e.g. [http://trilema.com/2015/no-such-labs-releases-v-for-victory/?b=change&e=satellites#select][the change similar to that introduced by the understanding and controlling movement in terms of mass, impulse and energy, such as it occurs in the launching of
    diana_coman: satellites]
    diana_coman: damn, it still broke the link, didn't it
    jfw: space between the words in the text, yeah
    diana_coman: jfw - my, yrc can't recall previous line??
    jfw: nope :/
    diana_coman: jfw - why, why, why whyyyyyy
    diana_coman: the change similar to that introduced by the understanding and controlling movement in terms of mass, impulse and energy, such as it occurs in the launching of satellites
    jfw: because it's young still
    diana_coman: so based on the above, you can start perhaps with a broad definition of V as a new way of understanding software - and therefore, as a consequence of this deeper and more precise understanding, the resulting more efficient way of talking about software, developing (version controlling being only one part of that developing) software, deploying software, maintaining software and so on.
    diana_coman: jfw - well, yrc may be young and have all the time ahead of it indeed but what can I say, I'm getting older day by day here so pleaaaase: can haz tab-completion and last-line recall?
    jfw: yes; and kill/yank (cut/paste) for the input is needed too.
    jfw: "manage his investment of trust at all junctures so that he is never required to implicitly trust either an unknown code author, or a code snippet of unknown provenance." - hey I pretty much got that part, right?
    diana_coman: with that broad definition at hand to help you avoid the pitfalls of stupid compartmentalizing, narrow focus, childish pick-and-choose and other numerous afflictions of the "software industry/engineering", the next step is to review the stated principles at the root of it all:
    jfw: (but yes, paradigm rather than particular set of scripts was missing.)
    diana_coman: namely software being the property of those running it and identity being constructed by others' view, upon a fixed support
    diana_coman: jfw - trust is possibly the skin of that particular elephant and at least the word itself has been repeatedly brandied about for sure
    diana_coman: it might have been bandied, but I do like brandied better.
    jfw: mmm, brandytrust!
    diana_coman: quite, it can produce... intoxication!
    jfw: especially hazardous when pregnant with concepts & definitions
    diana_coman: ahahah, indeed!
    diana_coman: looking back at your original definition, I'm afraid there isn't much of it left though.
    diana_coman: making a first attempt at tightening up that previous definition:
    sonofawitch: 2020-06-23 21:56:55 (#ossasepia) diana_coman: so based on the above, you can start perhaps with a broad definition of V as a new way of understanding software - and therefore, as a consequence of this deeper and more precise understanding, the resulting more efficient way of talking about software, developing (version controlling being only one part of that developing) software, deploying software, maintaining software and so on.
    diana_coman: V is a new conceptual framework for software, emerging from a better understanding of what software is and providing as main benefits the means for explicit, verifiable enforcement of software ownership by users as well as the correct incentives and supporting concepts for a qualitative jump in the way software is developed, deployed, maintained and evolved.
    diana_coman: jfw - does the above sound like the sort of concise definition you were looking for?
    diana_coman: it aims for a more practical intro so it necessarily leaves some stuff out/picks some to highlight.
    jfw: diana_coman: it's the sort of definition, yes - I don't know that I'll use it here directly though because if I'm to give a definition I'd want it to be one I fully understand myself (i.e. to have that new understanding of software & be able to explain why it's better)
    jfw: I'll work on getting there but the present article can make do without it.
    diana_coman: jfw - ah, no need to use it directly anywhere, lol; and anyways, if not clear, ask further tomorrow or whenever, sure.
    jfw: yep, & thanks for the pointers.
    diana_coman: yw
    diana_coman: such excellent questions are a pleasure to answer, so...keep asking them!

  2. "User error", yes; but as the Python folks say, "errors should never pass silently, unless explicitly silenced", one reason that I still grade that language a cut above Perl.
  3. It's the official Perl term, what can I say?

6 Comments »

  1. You can't demand I already know how to use a program in order to view its documentation! Well I mean, you can, but... you know what I mean?!

    In fairness, v.pl came with a very nice and quite detailed manual that was a separate text file so yes, to read before you run the code, certainly. A bit older school perhaps but not really a fault for that. If I recall correctly, when I made the v-tree for it all, I was faced with the choice of either going through all that doc and updating everything to reflect the latest version, or ditching it and keeping instead only the runtime help that seemed to me quite clear enough for the job and certainly more likely to be kept up to date as well.

    Comment by Diana Coman — 2020-06-28 @ 08:07

  2. Ah, I'd noticed and figured it was something like that; thanks for filling in. Either way I found it odd that the help/version commands would require having (part of) the setup in place.

    Comment by Jacob Welsh — 2020-06-29 @ 01:36

  3. [...] V. I used Jacob's V in Perl with keksum starter kit. [...]

    Pingback by GBW-NODE : Gales Bitcoin Wallet Node verified acquisition, build, install and run in 21ish short, simple steps. « Dorion Mode — 2020-07-01 @ 17:32

  4. Not sure if this is a problem with v.pl+keksum, but I'll report it here first.

    I attempted to press bvt's kernel using v.pl+keksum pressed to v_keksum_busybox_r2.vpatch on a Gales Linux system with 4 GB RAM and got the following error :

    time v.pl p linux-keccak-rng patches/linux-keccak-rng.vpatch

    Hunk 1 FAILED 0/1.
    +%PDF-1.4
    +%
    +5 0 obj
    +<</Length 6 0 R/Filter /FlateDecode>>
    +stream
    +[binary stream data omitted]
    +endstream
    +endobj
    +6 0 obj
    +2760
    +endobj
    +4 0 obj
    +<</Type/Page/MediaBox [0 0 566 600]
    +/Parent 3 0 R
    +/Resources<</ProcSet[/PDF /Text]
    +/ExtGState 10 0 R
    +/Font 11 0 R
    +>>
    +/Contents 5 0 R
    +>>
    +endobj
    +3 0 obj
    +<< /Type /Pages /Kids [
    +4 0 R
    +] /Count 1
    +>>
    +endobj
    +1 0 obj
    +<</Type /Catalog /Pages 3 0 R
    +/Metadata 13 0 R
    +>>
    +endobj
    +7 0 obj
    +<</Type/ExtGState
    +/OPM 1>>endobj
    +10 0 obj
    +<</R7
    +7 0 R>>
    +endobj
    +11 0 obj
    +<</R8
    +8 0 R>>
    +endobj
    +8 0 obj
    +<</BaseFont/SEXXTC+Helvetica/FontDescriptor 9 0 R/Type/Font
    +/FirstChar 32/LastChar 118/Widths[
    +278 0 0 0 0 0 0 0 333 333 0 0 278 0 278 278
    +0 0 556 0 556 0 0 0 0 0 278 0 0 0 0 0
    +0 667 667 722 722 667 611 778 0 278 0 0 556 833 722 778
    +667 778 722 667 611 0 667 0 0 0 0 0 0 0 0 556
    +0 556 556 0 556 556 278 556 0 222 0 0 222 833 556 556
    +556 0 333 500 278 556 500]
    +/Encoding/WinAnsiEncoding/Subtype/Type1>>
    +endobj
    +9 0 obj
    +<</Type/FontDescriptor/FontName/SEXXTC+Helvetica/FontBBox[-22 -218 762 741]/Flags 4
    +/Ascent 741
    +/CapHeight 741
    +/Descent -218
    +/ItalicAngle 0
    +/StemV 114
    +/MissingWidth 278
    +/CharSet(/A/B/C/D/E/F/G/I/L/M/N/O/P/Q/R/S/T/V/a/b/colon/comma/d/e/f/four/g/i/l/m/n/o/p/parenleft/parenright/period/r/s/slash/space/t/two/u/underscore/v)/FontFile3 12 0 R>>
    +endobj
    +12 0 obj
    +<</Filter/FlateDecode
    +/Subtype/Type1C/Length 2959>>stream
    +[binary stream data omitted]

    File: linux-keccak-rng/linux/sound/usb/bcd2000/Makefile
    Expected: 11ec3e836f217e10f3289a126e70e63fa2f412851e4623f7aab83900154343b4f1a0947a35f8625ee80d7418225cd51c67f5f9d6783615c2a49a027dfbb9c005
    Actual: c55f5d100907f966f8164e14f60f84df238840059fd0ea980759814631a950671f45bbadec595651c44ae66e63752a226d574d86e54ee2912b67ec86f6d0b44a
    Pressed file hash did not match expected!
    290m32.45s real 288m57.91s user 1m10.72s system

    Manually verifying the downloaded signature and vpatch, it appears the file was not corrupted prior to or during transmission.
    gpg --verify linux-genesis.vpatch.bvt.sig linux-genesis.vpatch
    gpg: Signature made Mon Sep 2 19:35:59 2019 UTC using RSA key ID 4B962B68
    Primary key fingerprint: 6CF3 EFF8 92A7 F23E 7E79 8E5E BA6B 8C05 4B96 2B68

    I'll note that on line 22400605 of linux-genesis.vpatch I'm seeing a peculiar "\ No newline at end of file". This is where it attempts to press the file it failed on, i.e. linux/sound/usb/bcd2000/Makefile.

    Please let me know if you have any insight to what went wrong here or if there is further information I can provide.

    Comment by Robinson Dorion — 2020-07-21 @ 20:26

  5. There are two good finds in here: first, that a binary .pdf ended up in the kernel genesis despite efforts to separate those; second, that busybox patch doesn't support the \ directive to suppress a final newline, which phf's pure-Ada "vpatch" deliberately supports.

    Comment by Jacob Welsh — 2020-07-21 @ 20:58

  6. [...] Because "tree" is always used as the press directory name, you'll need to delete any conflicting one first or else the script will halt before trying to press. (This avoids the risk of clobbering uncommitted local changes and follows the behavior of the old v.pl.) [...]

    Pingback by The simplest way yet to fetch Bitcoin code « Fixpoint — 2023-04-21 @ 14:14
