A bevy of fixes for V in Perl

Filed under: Software, V — Jacob Welsh @ 21:49

"Fixes" may be a strong term, in that it could be argued the item in question wasn't exactly broken. It was however darn near unusable in practice, and contained a number of fragile and unstated assumptions about its environment and invocation.

I thought I'd introduce my work by putting it in context with a quick recap of what V is; but then it turned out that my understanding of the matter, and perhaps even my approach to gaining understanding, needed some fixing too.(i) In short, it's nearsighted to define V as an improved kind of version control system; rather, its proper placement is a few levels up in the tree of concepts as a new way of thinking about, talking about, and deploying software, broadly construed. Thus I surmise it's also inadequate to consider the present "V in Perl" artifact, its label notwithstanding, to be a kind of reference implementation of V, but rather an early implementation of a relatively small part of the overall vision.

Still, whether I fully manage to see in this horseless carriage prototype a "transportation revolution to permanently alter the shape of civilization" or "just a faster kind of cart" - or something in between - if I'm to be riding in it then I want the wheels on straight; and I'm plenty capable of seeing to that.

The first patch packs in quite a few relatively small and independent fixes, building off the GNAT-demanding ksum/vpatch branch. Paralleling my previous work, the second swaps in keksum and patch to avoid the secondary compiler requirement (a change that's now more straightforward).



Taking the main patch, one item at a time, from the manifest:

(1) Eliminate use of external binaries (cat ls sort pwd which) and provide more hygienic directory listing;

Now I'm no Perl monk (and would likely never choose it myself for a new project) but I know you don't have to open subshells and pipelines just in order to read a file or list a directory, really now! The "ls" scraping is a particularly risky pattern due to the possibility of unexpected control characters in filenames. I was surprised to learn that stock Perl lacks an internal "getcwd" function, but the "pwd" and "which" usage turned out to be doing more harm than good anyway and easily eliminated.

(2) add missing error handling in build_wot;

This is a repeated pattern in the code, and one I probably haven't entirely eliminated. Much like the underlying C, many Perl functions don't raise exceptions but expect the caller to check manually for errors. Much like the underlying C programmers, many Perl programmers don't bother with that pesky error checking stuff. Combine this with mutable variables repeatedly reinitialized by a loop on the assumption that nothing fails, and you get all sorts of interesting leakage possibilities. In this case, if one of the GPG key files placed in ".wot" was invalid, the program would not only fail to notice but label the bad key with the metadata of an unrelated victim that was seen previously.(ii)

(3) eliminate some variable regex patterns;

Using a complex tool when a simple one suffices - and, predictably, not using it correctly (i.e. by quoting regex metacharacters).

(4) avoid slurping(iii) full vpatches and verbose output into memory;

Thus it should now work (albeit slowly) on arbitrarily large vpatch files with respect to the system's main memory, and for verbose mode flush the "patch" output to terminal in closer to real time.

(5) fix exponential recursion blowup in traverse_press_path and get_all_descendant_nodes;

This was the original motivation for this work, as I'd noted:

Algorithmic inefficiency is a serious drawback of this tool. I suspect the "toposort" is something like O(n^2 log n) while "traverse_press_path" is exponential, like the textbook Fibonacci example of how not to do recursion. This becomes acutely noticeable around 32+ patches.

Exponential traversals turned out to be in two different functions; happily, the memoization needed to avoid revisiting the same subtrees over and over was already inherent in the data structures being built.

After tracking down the second instance, I used some "awk" to build a full list of recursive calls in the program for scrutiny (namely: traverse_press_path verify_ante remove_desc add_desc_edges add_desc_src_files), so I'm fairly confident there are no more heads on this particular hydra.

(6) allow the patchdir to be a relative path;

It happened that relative paths worked already for the seals and wot directories.

(7) replace some numeric indexing with named variables;
(8) tweak hash program parsing to not require two spaces;
(9) document restricted positioning of global options handled distinctly from commands;

The option-handling code clearly wasn't doing what it was meant to be doing; I haven't exactly fixed that but have at least noted the extant restrictions on where the wotdir/patchdir/sealdir options may be given.

(10) make help and version commands work without a wotdir;

You can't demand I already know how to use a program in order to view its documentation! Well I mean, you can, but... you know what I mean?!

(11) allow patches and seals to share a directory but require standard extensions (.vpatch .sig);

I never quite saw the point of separate subdirectories here, so now you can just ln -s patches .seals and keep them all under "patches", saving a fair amount of pointless shuffling in my own usage. (The default ".seals" path is preserved for compatibility.)

(12) take basenames of patch arguments to allow tab-completeable paths;

That is: press a some_big_long_name.vpatch

can now be spelled as press a patches/some_big_long_name.vpatch

(i.e. the actual path, although the prefix could be anything). This applies to all subcommands that take patch names.

(13) clean up the tempdir on SIGINT (^C);

Less important now that it doesn't get effectively wedged on exponential algorithms, but still.

(14) factor out some repetition;
(15) other minor simplifications.
Version 99989.


$ diffstat v_fix_exptimes_paths_etc.vpatch
 manifest |    1     |  234 +++++++++++++++++++++++++++++++++------------------------------
 2 files changed, 124 insertions(+), 111 deletions(-)


  1. Full details in the logs:

    jfw: diana_coman: I'm trying for a concise intro/description of V, in the present (post-Republic) context. Does this about capture it: "versioning system that supports owner control of computing by placing primary focus on the change and explicit management of trust through strong cryptography" ?
    diana_coman: jfw - hm, what do you mean by "of computing" there?
    jfw: well, of the operation of one's own computers
    jfw: possibly a bit circular with "ownership of one's own"...
    diana_coman: it's more that the definition as you gave it doesn't do all that much - though it takes a few readings, hm.
    diana_coman: it's a bit tortured on various fences by the looks of it; for one thing, defining it as a versioning system cuts away an important part - the deployment of software that is usually not all that much the traditional concern of versioning systems
    diana_coman: jfw - what's the audience you have in mind there or is this blog/generic?
    jfw: it's the blog, yes - and partly for clarifying it for myself, heh.
    jfw: my grasp of what V does for deployment is basically to say that the other tools traditionally used for it aren't necessary
    diana_coman: ahaha, going for once fully-negative-space there (and that getting rid of all the other "tools" is not a tiny thing either at that, but it's more of a consequence than anything else)
    diana_coman: V is a complete solution in that sense, hence "the other tools [...] aren't necessary"
    jfw: (though um, it's still known to lean on 'wget' etc.)
    diana_coman: well, it also still requires an OS!
    diana_coman: anyways, I wouldn't say that "other tools are not necessary" - it's more that the change is so fundamental that previous tools don't fit /don't have a useful place anymore; other tools though *are* still necessary - only they need to be built
    diana_coman: it changes the whole landscape if you want
    diana_coman: but let's rewind and try to grab it from some more concrete end perhaps
    jfw: alright
    diana_coman: so for one thing, V is not some particular implementation but essentially a paradigm for software
    diana_coman: and software as a whole, not just development, nor even just deployment, it goes all the way to even what software *is*
    diana_coman: sure, one can use V for some narrow part that they care about and it's true that the first implementation was just that, a very narrow thing in fact, but that doesn't mean much.
    diana_coman: and I suppose that the current state of V-use and development otherwise might give the impression that there isn't anything more to it either, huh
    jfw: I suppose I've tried to understand the species based on observations of what's shared by the known instances
    diana_coman: jfw - you know, I think your attempt and question there hits actually deeper (and well done for it, too) than you intended, lol
    jfw: haha, indeed
    diana_coman: jfw - so where did you start from, anyway? from the current implementations of V, is that what you mean by the instances?
    jfw: right
    jfw: heh, you know the one about the blind men and the elephant?
    diana_coman: that kind of locks you unhelpfully into some rather sterile and narrow mindframe, myeah (and I'll leave the tracing of the root cause there to each log reader)
    diana_coman: jfw - hm? doesn't come to mind, no.
    jfw: apparently a story that exists in many versions, but basically each man feels a different part of the elephant and extrapolates a completely different (& quite incomplete) picture of what an elephant is.
    diana_coman: ah, the fable, yes
    diana_coman: I can see the similarity, indeed
    jfw: - possibly the main English version.
    jfw: ponders how to "see true v-elephant with mind's eye"
    diana_coman: the thing is, V is not just a different type of versioning system - a bit like a car is not just a faster cart, hm
    diana_coman: jfw - well, better start from the beginning as it were which indeed is *not* whatever implementation, no matter what claims are made otherwise; e.g. [][the change similar to that introduced by the understanding and controlling movement in terms of mass, impulse and energy, such as it occurs in the launching of
    diana_coman: satellites]
    diana_coman: damn, it still broke the link, didn't it
    jfw: space between the words in the text, yeah
    diana_coman: jfw - my, yrc can't recall previous line??
    jfw: nope :/
    diana_coman: jfw - why, why, why whyyyyyy
    diana_coman: the change similar to that introduced by the understanding and controlling movement in terms of mass, impulse and energy, such as it occurs in the launching of satellites
    jfw: because it's young still
    diana_coman: so based on the above, you can start perhaps with a broad definition of V as a new way of understanding software - and therefore, as a consequence of this deeper and more precise understanding, the resulting more efficient way of talking about software, developing (version controlling being only one part of that developing) software, deploying software, maintaining software and so on.
    diana_coman: jfw - well, yrc may be young and have all the time ahead of it indeed but what can I say, I'm getting older day by day here so pleaaaase: can haz tab-completion and last-line recall?
    jfw: yes; and kill/yank (cut/paste) for the input is needed too.
    jfw: "manage his investment of trust at all junctures so that he is never required to implicitly trust either an unknown code author, or a code snippet of unknown provenance." - hey I pretty much got that part, right?
    diana_coman: with that broad definition at hand to help you avoid the pitfalls of stupid compartmentalizing, narrow focus, childish pick-and-choose and other numerous afflictions of the "software industry/engineering", the next step is to review the stated principles at the root of it all:
    jfw: (but yes, paradigm rather than particular set of scripts was missing.)
    diana_coman: namely software being the property of those running it and identity being constructed by others' view, upon a fixed support
    diana_coman: jfw - trust is possibly the skin of that particular elephant and at least the word itself has been repeatedly brandied about for sure
    diana_coman: it might have been bandied, but I do like brandied better.
    jfw: mmm, brandytrust!
    diana_coman: quite, it can produce... intoxication!
    jfw: especially hazardous when pregnant with concepts & definitions
    diana_coman: ahahah, indeed!
    diana_coman: looking back at your original definition, I'm afraid there isn't much of it left though.
    diana_coman: making a first attempt at tightening up that previous definition:
    sonofawitch: 2020-06-23 21:56:55 (#ossasepia) diana_coman: so based on the above, you can start perhaps with a broad definition of V as a new way of understanding software - and therefore, as a consequence of this deeper and more precise understanding, the resulting more efficient way of talking about software, developing (version controlling being only one part of that developing) software, deploying software, maintaining software and so on.
    diana_coman: V is a new conceptual framework for software, emerging from a better understanding of what software is and providing as main benefits the means for explicit, verifiable enforcement of software ownership by users as well as the correct incentives and supporting concepts for a qualitative jump in the way software is developed, deployed, maintained and evolved.
    diana_coman: jfw - does the above sound like the sort of concise definition you were looking for?
    diana_coman: it aims for a more practical intro so it necessarily leaves some stuff out/picks some to highlight.
    jfw: diana_coman: it's the sort of definition, yes - I don't know that I'll use it here directly though because if I'm to give a definition I'd want it to be one I fully understand myself (i.e. to have that new understanding of software & be able to explain why it's better)
    jfw: I'll work on getting there but the present article can make do without it.
    diana_coman: jfw - ah, no need to use it directly anywhere, lol; and anyways, if not clear, ask further tomorrow or whenever, sure.
    jfw: yep, & thanks for the pointers.
    diana_coman: yw
    diana_coman: such excellent questions are a pleasure to answer, so...keep asking them!

  2. "User error", yes; but as the Python folks say, "errors should never pass silently, unless explicitly silenced", one reason that I still grade that language a cut above Perl. [^]
  3. It's the official Perl term, what can I say? [^]


yrc re-genesis and patch for smooth scrolling and other fixes

Filed under: Software, yrc — Jacob Welsh @ 06:50

I revisit now my IRC client from the other side of seven months since initial public release and a full year since my last round of development on it. Its usage in this interval has been pleasantly free of unexpected problems; indeed the Unix process containing my own "daily driver" has an uptime of that full year, handling everything Freenode's managed to throw at it without a hitch. On the other hand, the software didn't see all that much attention from others, and I became rather complacent about its known quirks and problems.

The interval also saw me coming around to Mircea Popescu's insistence that indentation is done with tabs and not spaces(i) while line breaks are for conveying some kind of meaning beyond "my terminal is too dumb to wrap text on its own". I also switched to the tree structuring convention, prevalent in the software that grew in TMSR, of having a top-level directory named to match the project. As making these changes to an existing tree results in quite a noisy patch, perhaps to the point of effectively severing the connection to the previous state for all the good it does the reader, I've taken this occasion to re-bake the genesis patch yet again. As I'm unaware of anyone else having reviewed the code, I expect this won't pose any real inconvenience.

This out of the way, we get to the more substantive change: refining the scrolling model to count by lines on screen rather than full IRC lines, while still staying anchored to the visible text when the window is resized. I also managed to preserve algorithmic independence from scrollback log length: wrapping is computed on demand for only the necessary lines, ensuring UI responsiveness.

The reason for prioritizing this fix was that a new user pointed out its importance. There's more work upcoming, including tab-completion of names.

A few other fixes and improvements are included, including Python 2.6 compatibility (as yet untested by me); see the included NEWS file for details.

To install, you will need a Keccak V implementation such as my starter, plus the patches and seals:

Or download in bulk, e.g.:

wget -m --no-parent

Once the tree is pressed, see yrc/README.txt for futher instructions.

  1. I'm still not sure what if anything this says about Lisp. [^]

Powered by MP-WP. Copyright Jacob Welsh.