Fixpoint

2020-04-02

V in Perl with parsing fix, keksum, and starter, plus the ill-fated vdiff

Filed under: Software, V — Jacob Welsh @ 17:50

Following my prior adventures, I reoriented my efforts toward some simpler changes to the v.pl tree, abandoning hopes of a robust patch creation tool built on Busybox diff.

I've split the changes into two patches. The first is "v_strict_headers", which I think would be of interest to any v.pl user. It tightens vpatch parsing to prevent false-positive header matches that could cause incorrect or nonsensical antecedent information to be extracted from valid vpatches. Following the precedent of the vtools vpatch program, this is done by requiring the string "diff " at the start of a line preceding the header, which works because all other lines of a diff "packet"(i) start with either @, +, -, or space characters. This patch also backfills the manifest file and brings it fully in line with the spec.
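As a rough illustration of the header rule described above, here is a minimal sketch in sh and awk. It is not the v.pl code, which is Perl. The names `strict_headers` and `sample_vpatch`, the placeholder hashes, and the assumption that the "diff " line comes immediately before the header are all mine.

```shell
#!/bin/sh
# Hypothetical helper, not part of v.pl: read a vpatch on standard input and
# print only the ---/+++ lines that directly follow a line starting with "diff ".
strict_headers() {
	awk '
	# A "diff " line opens a new packet, so expect its header next.
	/^diff / { expect = 2; next }
	# The header proper: a --- line, then a +++ line.
	expect == 2 && /^--- / { print; expect = 1; next }
	expect == 1 && /^\+\+\+ / { print; expect = 0; next }
	# Anything else, including hunk lines that merely look like a header,
	# cancels the expectation and is skipped.
	{ expect = 0 }
	'
}

# Made-up sample with placeholder hashes: the hunk removes the line "-- old"
# and adds the line "++ new", which diff renders as "--- old" and "+++ new".
sample_vpatch() {
	printf '%s\n' \
		'diff -uNr a/f b/f' \
		'--- a/f 1111' \
		'+++ b/f 2222' \
		'@@ -1,2 +1,2 @@' \
		' keep' \
		'--- old' \
		'+++ new'
}

sample_vpatch | strict_headers
```

Run on the sample, this prints only the `--- a/f 1111` and `+++ b/f 2222` lines; the lookalike pair inside the hunk is skipped because no "diff " line precedes it.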

The second patch, "v_keksum_busybox", swaps keksum and patch in for ksum and vpatch, making V presses possible again on systems with little more than a C toolchain, Busybox utilities and Perl.

I have also mirrored the rest of the VTree and contributed my own seals, which can be found in the same directory.

For deployment on systems with no previous V, there's a starter tarball which includes the tree pressed to v_keksum_busybox, the keksum code, and an install script. Take a look at what it does, then run as root, from the extracted directory:

# sh install.sh

Download

The ill-fated vdiff

What follows is my abandoned attempt at vdiff in awk, meant to work with any conforming diff program. It identifies headers using a three-state machine that recognizes the ---, +++, @@ sequence. Changed lines inside a hunk can themselves form a ---, +++ pair (a removed line beginning with "--" followed by an added line beginning with "++"), which would fool the machine if another hunk followed immediately. The trailing lines of context prevent this, since they separate such a pair from the next @@ line. The only case without trailing context is a change at the end of the file, and there no further hunk can come before the next file header.
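To make the ambiguity concrete, here is a small sketch that produces such a lookalike pair with a real diff. The function name `demo_ambiguity` and the file contents are made up for the example, and it assumes a `mktemp` that supports `-d`.

```shell
#!/bin/sh
# Hypothetical demo: two throwaway files that differ only in their last line,
# where the old line starts with "--" and the new one with "++".
demo_ambiguity() {
	dir=$(mktemp -d) || return 1
	printf '%s\n' 'keep' '-- old' > "$dir/a"
	printf '%s\n' 'keep' '++ new' > "$dir/b"
	# Drop the two real header lines so only the hunk remains.
	diff -u "$dir/a" "$dir/b" | sed 1,2d
	rm -rf "$dir"
}

demo_ambiguity
```

The hunk ends with the lines `--- old` and `+++ new`, which by themselves are indistinguishable from the start of a file header. Because the change sits at the end of the file, nothing beginning with @@ can follow them within the same file.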

It works as far as parsing both GNU and Busybox diff output, produces working vpatches in the GNU case, and could even be expanded to do the same for Busybox. But since fully-reproducible output seems to be desirable, I can't presently justify further work in this direction or recommend it over the vtools vdiff.

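#!/bin/sh
# Usage: vdiff OLDDIR NEWDIR
# Fix the collation order so that the recursive diff sorts file names the same way everywhere.
export LC_COLLATE=C
diff -uNr "$1" "$2" | awk -v sq=\' '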
# Wrap s in single quotes for the shell, escaping each embedded single quote.
function shell_quote(s) {
	gsub(sq, sq "\\" sq sq, s);
	return sq s sq;
}

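# Return the keksum hash of the file at path, or "false" if it does not exist.
# The parameters after the gap serve as local variables.
function vhash(path,    qpath, cmd, gotline, rec, parts) {
	if (path == "/dev/null") return "false";
	qpath = shell_quote(path);
	cmd = "test -e " qpath " && keksum -s256 -l512 -- " qpath;
	gotline = (cmd | getline rec);
	close(cmd);
	# getline yields 0 at end of input and -1 on error.
	if (gotline <= 0) return "false";
	split(rec, parts);
	return parts[1];
}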

# Rewrite a ---/+++ line as "marker path hash", dropping anything after the path.
function print_header(line,    parts) {
	split(line, parts);
	print parts[1], parts[2], vhash(parts[2]);
}

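# States: 0 = outside a header, 1 = saw ---, 2 = saw --- then +++.
# Held lines are printed unchanged if @@ does not complete the sequence.
{
	if (state == 0) {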
		if ($0 ~ /^---/) {
			from = $0;
			state = 1;
		}
		else {
			print;
		}
	}
	else if (state == 1) {
		if ($0 ~ /^\+\+\+/) {
			to = $0;
			state = 2;
		}
		else if ($0 ~ /^---/) {
			print from;
			from = $0;
		}
		else {
			print from;
			print;
			state = 0;
		}
	}
	else if (state == 2) {
		if ($0 ~ /^@@/) {
			print_header(from);
			print_header(to);
			print;
			state = 0;
		}
		else if ($0 ~ /^---/) {
			print from;
			print to;
			from = $0;
			state = 1;
		}
		else {
			print from;
			print to;
			print;
			state = 0;
		}
	}
}

END {
	if (state == 1) {
		print from;
	}
	else if (state == 2) {
		print from;
		print to;
	}
}'
  1. Or what else do you call the header and sequence of hunks associated with a single file?

9 Comments »

  1. Thanks for the fixes and the sigs. I've mirrored the sigs on all as well as your 2 new vpatches (and my sig on the v_strict_headers one).

    As a side note: it was a pain to find your jfw_unchecked key - maybe update the contact page to list that one too and/or point to it? The wot.deedbot site lists anyway obsolete ratings currently but even asking deedbot directly for it didn't work this time.

    Comment by Diana Coman — 2020-04-04 @ 16:42

  2. Cheers.

    Sorry for that pain; I tested as far as !!key but perhaps I've got some assumption to kill about things working because they once did and still say so on the label. I've added direct links on the contact page.

    Comment by Jacob Welsh — 2020-04-04 @ 20:45

  3. [...] are four patches to the V-tree for Mircea Popescu's Wordpress, which amount to an approximation of the changes I've been running [...]

    Pingback by Selection and other sundries for MP-WP « Fixpoint — 2020-04-07 @ 04:49

  4. Algorithmic inefficiency is a serious drawback of this tool. I suspect the "toposort" is something like O(n^2 log n) while "traverse_press_path" is exponential, like the textbook Fibonacci example of how not to do recursion. This becomes acutely noticeable around 32+ patches.

    Comment by Jacob Welsh — 2020-04-20 @ 17:48

  5. @Jacob Welsh:

    IIRC in my original v.py, sort was O(n log n) (worst case) while pressing was O(n).

    At the same time, IMHO a vtree where this detectably matters, is obese.

    Comment by Stanislav Datskovskiy — 2020-04-20 @ 22:11

  6. @Stanislav Datskovskiy: I'm speaking only of the Perl one. Should be an easy enough fix too, by memoization tables if nothing else, as seen in yours.

    Comment by Jacob Welsh — 2020-04-21 @ 18:04

  7. [...] install, you will need a Keccak V implementation such as my v.pl starter, plus the patches and [...]

    Pingback by yrc re-genesis and patch for smooth scrolling and other fixes « Fixpoint — 2020-06-08 @ 06:51

  8. [...] small and independent fixes, building off the GNAT-demanding ksum/vpatch branch. Paralleling my previous work, the second swaps in keksum and patch to avoid the secondary compiler requirement (a change that's [...]

    Pingback by A bevy of fixes for V in Perl « Fixpoint — 2020-06-27 @ 21:49

  9. I've corrected the awk "shell_quote" function shown here to use gsub rather than sub.

    Comment by Jacob Welsh — 2020-06-28 @ 03:38

Powered by MP-WP. Copyright Jacob Welsh.