diff -uNr a/gscm/command/.keep b/gscm/command/.keep --- a/gscm/command/.keep false +++ b/gscm/command/.keep f232109157b9d22fecc1ce426b7dd8513530d23917cc566b0925e1ef0769169c21223c323e7ab5bdafea431b09d57b17bb44eb2b0efbd3c2437a611ce00f097f @@ -0,0 +1 @@ +version control placeholder diff -uNr a/gscm/doc/CHANGES b/gscm/doc/CHANGES --- a/gscm/doc/CHANGES false +++ b/gscm/doc/CHANGES 75cc14c68063f12efc27bf2b7eb9bf7a83aa496cef7c9d4ceb04d5d5bd10aeba2a89869171903624beee7f94b37a2b5ca4e69bc21ba9afe9d8ce77731a1a95a1 @@ -0,0 +1,747 @@ +Testing & debugging after completion of rewrite: + +1. Read/write ordering bug in pop() -- returned top of stack after having +already removed it. Needed to store to a temporary. + +2. "return SC_NULL" instead of "RETURN(SC_NULL)" in reader. Considered renaming +the macro, but by any other name it would still refer to the same concept and +still be subject to confusion. Seems like a fundamental pun between C and the +embedded stack machine language. Wouldn't happen in Scheme since there return +is just something that happens to the last expresssion in a sequence, not a +command. + +3. Toplevel error handler seems to be getting run without an actual error. +Replacing toplevel with more basic evaluation tests for now. + +1 -> ERROR: apply: not a procedure: -945832504593831248 + + The error is correct (1 is not a procedure) but the display is wrong. + (Incorrect untagging?) + + Problem was in parse_integer / parse_decimal. Before, I used a C heap based + dynamic buffer mechanism, which tracked buffer pointer, allocation size and + filled size in a struct. In the rewrite I converted this to a string on the + Scheme heap, but since that doesn't distinguish fill from allocation, I + tracked fill in the lexeme_length global variable. (Ugly, but this one + internal use didn't seem to justify a whole new Scheme type.) But in + porting the integer parsers I used the Scheme string's length (allocation) + rather than the global. The chief hazard seems to be the two notions of + length, and passing around this un-truncated buffer in the reader internals + for efficiency. Reviewed other use of R_LEXEME for similar problems, + finding none. Changed the number parsers to take length as an argument to + at least reduce use of the global. + +#(1 2 3) -> ERROR: apply: not a procedure: #() + + Again, a correct error but bad read or print of the vector. + + In the reader, the length header of the vector was being set after + the "len" variable had already been used to count down to zero. + Rather than building the vector "by hand" I replaced this with the + recently added make_vector_uninit function. + + But this also revealed that the local variable "len" was being unsafely + used across a recursive call. Rather than push/popping it every time I took + it out of the loop to be computed later by list_length(). Except here we + already know the list is really a list, so I wrote a new unsafe_list_length + function to omit the type and cycle checks of list_length. + +(letrec () 1) -> ERROR: eval: empty combination + + LETREC was passing the body to EVAL_BODY through R_EXPR instead of R_BODY. + (Missed during manual register optimization.) Verified the other call to + EVAL_BODY. This fixed #3. + +4. A register conflict in the error handler, where r_error_handler was assigned +to R_PROC, then R_ERR_DETAIL was referenced, but the latter two had been +optimized to share a register. Solved by reassigning R_ERR_DETAIL to a free +register. I did go to some pains to verify the register assignments but perhaps +gave short shrift to the error handler. It would be a good exercise to do a +full audit for register conflicts, once the code is a bit more settled. The +rules for this are detailed in a comment. + +5. Two bugs relating to list_length, which is supposed to return -1 for safely +detecting non-lists. First, builtin_list_length stored its result in a "value" +instead of a "long", which being an unsigned type, prevented recognizing the +negative. Second, for odd length improper lists (e.g. (1 2 3 . 4)), the is_pair +check after the first of the two "fast = cdr(fast)" pointer advancements in the +loop returned directly, bypassing the "fast != SC_NULL" improper list check +after the loop. This irregularity resulted from being overly preoccupied with +(premature) hand-optimization (doing a single length += 2 after the two +advances in each loop instead of a length++ after each one), a tendency I've +had to fight in other places as well. (Chiefly, using the friendly type +constructor and accessor functions, which tag/untag values as needed, rather +than duplicating their functionality to save a few instructions on redundant +tag operations. In this, it helped that I convinced myself, by examining the +generated assembly, that the compiler optimizations really are smart enough to +factor out the tag operations, at least in some cases, and at least with static +functions.) + +6. In SET, checked for cdr(R_EXPR) == SC_NULL after having overwritten the +intended R_EXPR with car(R_EXPR). Swapped the order. (Discovered by running the +"bad" semantic test set, but woulda come up soon enough anyway as it completely +broke "set!".) + +(Later... somewhere around Jan 2017) + +7. yes '(' | gscm -> inf loop between gc/error handler for out of memory. +sc_error1 does a string() allocation to set R_ERR_MSG, but the stack doesn't +get dropped until after the longjmp. Fixed by dropping stack in the error +handler. Also wrote a toplevel error handler callable from Scheme implementing +the same behavior using a captured continuation. + +8. The "apply" builtin did not copy its final argument (a list). The standard +doesn't seem to specify whether it should or not, but "list" is required to +return a newly allocated list, and my no-op implementation was assuming its +argument list was already a fresh copy. Wrote a test case and fixed by making +builtin_apply copy. Also broken in tinyscheme. Affects what is probably the +simplest way to copy a list, (apply list some-list). + +Major refactoring: compiler, variable references... + +October 2017: + +9. Register effect bugs in assoc and member builtins (see tests-misc.scm). + +10. Miscompilation of "delay" (see tests-misc.scm). + +11. The compiler was not immune to redefinition of builtins. (REPL too?) +Implemented immutable environments with a separate copy for the interaction +environment. Immutable string/list/vector constants still need doing. + +Implemented piped subprocess extension + +12. Realized input_port()/output_port() could leak the file object and +descriptor if they hit an out of memory error. + +November 2017: fuzz testing + +13. Infinite loop in the lexer for EOF in a comment. + +Implemented CLI options, memory stats, floor/ceiling/truncate/round/port? + +January 2018: + +14. Heap discipline broken in arithmetic builtins on flonums by way of "fold" +helper. + +April 2018: + +Partial implementation of the long-missing number->string and string->number +builtins, enabling removal of a bunch of printfs. + +15. Discovered by inspection that the open_file helper was unsafely storing a +port in a local variable across a call to string_append_null. Made it pass the +port by register instead and refactored a bit. + +16. Realized that the changes implemented for #12 can't actually solve the problem; +file objects and descriptors still leak if a subsequent unrelated call gets the +memory error. Backed them out in order to simplify port construction. + +17. builtin_open_subprocess was a mess all around. Changed the interface, +improved comments, simplified the temporary argv construction (resolving a +strict aliasing warning), and tightened syscall error handling. As above, +realized that leaking resources due to allocation failures is not a problem +that can be solved within the scope of this function (so e.g. it's fine to +allocate the return value list after the child is spawned). + +18. REPL now accepts multi-valued returns, printing them one per line. + +19. Extensions are now available in an immutable (gales-scheme-environment), +analogous to (scheme-report-environment 5), in addition to the interaction +environment. This allows for hygienic packages making use of the extensions +without dependency injection. Probably useless flush-input-port removed. Some +toplevel error handler setup inconsistencies fixed, but the error plumbing +needs more scrutiny. + +20. Implemented double-width multiply functions, signed and unsigned, in +portable C and x86_64 assembly, plus range checking and float conversion +helpers. Used these to fix long-standing overflows in * and expt builtins. +Implemented extensions for wrapped, carried and bitwise arithmetic on fixnums. +Factored tagging macros out of machine-specific #if-blocks and tightened their +types by casting. Need to finish correctness proof for signed wide multiply. + +May 2018: + +21. Fixed self-initialization bug in macro expansion in fx-/wrap builtin (and +added -Winit-self to gcc flags). Added fx-/carry-unsigned and fxmaj builtins. +Made some internal functions static. Added startup assertion to enforce the +assumption that unsigned long matches the value / pointer size. Minor tweaks to +build in C99 pedantic mode; would like to move toward C89. + +22. Implemented if-exists extension for open-output-file. Switched to "let" +forms in toplevel. Made static symbol variable setup/usage more consistent. + +23. Core: +- Catch I/O errors in flush_port, read_char and peek_char, simplifying spec for + flush-output-port builtin. +- Light comment editing. Remove specification comments for nonstandard + builtins, which now have their own extensions document. +- Make eqv? and equal? do the simpler checks first, short-circuiting deep + inspection where possible. +- Rename quit builtin to exit. +- Add a GC thrash factor (knob not exposed yet) to hasten OOM detection when GC + reclaims less than heap_size/factor space. +- Wipe registers on OOM detection (previously just the stack was dropped in + sc_error1, with reason unclear). +- Rename fallback error handlers to fatal/fatal1 for clarity. + +Core/compiler/toplevel: +- Implement "error" as a normal builtin, parallel to sc_error1, making + set-error-handler! work for Scheme-signalled errors too and removing the + compiler's special error procedure with its null-terminator wrangling. +- Remove syn-env from the special compiler environment as it can be handled + purely from Scheme. +- Merge the compiler environment (now just one binding) into the privileged + toplevel environment. + +Toplevel: +- Extend call-with-output-file and with-output-to-file to pass through the + if-exists option. +- Make with-input-from-file and with-output-to-file restore the current ports + on escape or reentry, not required by spec but clearly the right thing. + (Using a stub dynamic-wind, so won't actually work yet.) +- Fix all four of those to work with multi-valued returns. +- Make REPL error handler support multiple detail arguments (though the other + components don't yet). + +24. Added sync and data-sync options to flush-output-port extension. +Refactoring of port argument wranglers. + +25. Big move of builtins to library. Startup time and performance of the +builtins in question is degraded, but much hazardous and hard-to-verify +microcode is eliminated. + +Removed compiler dependence on builtin forms: + and + or + +Removed compiler dependence on builtin procedures: + append + apply + assq + for-each + list + list-tail + map + memq + vector->list + zero? + +Compiler now depends on some extensions and uses eq? for fixnum comparison, in +hopes of reducing performance impact on replaced builtins. + +Merged the syntax library into the toplevel, removing some startup complexity +and allowing the toplevel to use library syntax. + +Fixed some edge cases of the "case" macro. + +The "load" builtin, by way of exec-from-port, now reads the whole file before +evaluating the forms. + +Forms moved from core to syntax library: + and + or + +Builtins moved to library: + list + append + reverse + list-tail + list-ref + memq + memv + member + assq + assv + assoc + vector->list + apply + map + for-each + +New privileged builtins: + apply/unchecked + car/unchecked + cdr/unchecked + set-cdr/unchecked! + vector-ref/unchecked + fx+/unchecked + +Wrote some tests for the new library procedures. Accelerated negative tests by +replacing bash/sed gunk with a custom error handler to run all in one process. +Moved the currently failing immutability test to the negatives and expanded. + +June 2018: + +26. Make "string-append" run in linear rather than quadratic time, fixing a +FIXME. Make toplevel "apply" and "append" list copying helpers work tail +recursively, by mutation in one pass as in the original C, reducing memory +overhead. + +27. Implemented dynamic-wind by adding a "spool" register and a new field in +continuation objects to capture it, the necessary additions to the APPLY +subroutine (now separated into APPLY_CONTINUATION), and a lowest common +ancestor helper function. + +Substantially refactored error handling in both core and toplevel to: +* work with dynamic winding while avoiding infinite loops; +* avoid some unnecessary C to Scheme string conversions; +* remove string_append helper, a poor interface (per change 26); +* support multiple detail arguments to ERROR, per SRFI-23; +* be generally clearer, I think. + +Similarly, the EXIT extension is now implemented using a continuation that +returns the given code from the toplevel (and sc_toplevel now returns the code +to main). This both makes it support unwinding and removes the need for a +builtin exposing exit(). (At some point it might be nice to eliminate remaining +uses of exit(), but so far it's still needed as an error handling fallback.) + +Builtins moved to library: + string + string->list + vector + set-error-handler! + exit + +New privileged builtins: + push-winding! + string-ref/unchecked + set-error-continuation! + +Extension added to library: + error + +builtin_error still exists, but only for reporting startup errors before it's +replaced by the toplevel. + +Fixed a few cases where pointers were being followed into the heap without +untagging. This had gone unnoticed because the three tag bits happen to +overflow in the shift for array indexing on 64-bit. + +Some minor clarity and error context improvements to library procedures. + +Renamed list_length/unsafe_list_length, to avoid giving the impression that +safety can be assumed for these internal helper functions. + +Some C89ist shuffling of declarations. + +Optimized builtin_cons: reuse pair from arg list as it's freshly allocated. + +sc_toplevel now returns the integer exit code, + +TODO: think thorough / test what happens on error or other escape in +unwinding/rewinding, explicitly unspecified by R5RS. + +28. Minor optimization: compile quoted empty lists and quoted vectors to the +values themselves, as they're not valid expressions and thus can be made +self-evaluating simply by removing the check. Deprioritize "quote" in the +builtin syntax dispatch (based on pure guesswork as to frequency). 1% speedup +observed in some existing tests, 5% in a contrived one. + +Condense some redundant register aliases (R_OPERATOR->R_EXPR; +R_BODY->R_OPERANDS), avoiding some pedantic self-assignment. + +29. Fixed a subtle semantics bug in FORCE, and wrote test: a promise's code may +be entered more than once if it forces itself recursively; if results differ, +the first must not be clobbered. Found by re-reading rationale in R5RS. + +In the same spirit of "mutability is not your friend", reverted the +builtin_cons optimization from #27. While safe, arglist mutation is premature +at this point; for example it may hinder debugging, which ought to be a +strength of an interpreter. If I do decide to allow it, there's a bunch of +other builtins that could do it too. But a better optimization might be to +avoid building arglists in the first place for fixed-arity builtins. + +30. New builtins, with tests: + immutable? + cons/immutable + string-copy/immutable + vector-copy/immutable + +Slightly simplified some related code. + +In previous immutability work, rather than expose new primitives I had the +reader construct strings and vectors as immutable. This was neither correct nor +useful. Literal constants are now properly petrified by the compiler, and with +that, all my tests are finally passing. + +31. Mega-IO-refactor, replacing stdio with Scheme-managed buffering, and +implementing sockets extension. + +Port-based write functions now all use R_PORT rather than an argument and +signal errors rather than returning status. This tidies up the printer quite a +bit. + +Checking for open port on each read/write no longer needed - the kernel's FD +validity check is enough. +- Added bonus: (close-output-port (current-output-port)) at REPL was a segfault + as the check had been missed in the fallback error handler. Now "FATAL: Bad + file descriptor". + - Should probably prevent closing stdin/out though. (close-input-port + (current-input-port)) at REPL is an infinite loop. + +Replaced posix_spawn* with vfork, simplifying builtin_open_subprocess, possibly +making it fast on more platforms, and removing another likely malloc user. + +New privileged builtins operating at the file descriptor level: + inet-stream-socket + inet-dgram-socket + unix-stream-socket + unix-dgram-socket + socket-ports + getsockname + getpeername + connect-inet + connect-unix + listen + accept + close + +New library procedures: + open-tcp-connection + open-tcp-listener + open-udp-socket + open-unix-connection + open-unix-listener + open-unix-datagram-socket + call-with-tcp-connection + call-with-unix-connection + sequential-tcp-server + sequential-unix-server + +32. Fix assertion violation (buffer overflow) introduced in #31, when writing +to a closed port, catching the error, then writing again. + +Relatedly, any other write errors on the underlying FD will now close the port. +(Rationale: if recovery were possible, the underlying system should already be +doing it, so write errors are generally permanent; even if not, they can be +asynchronous so no way to know how much was successfully written.) + +REPL error handler changed back to restarting using a captured continuation. (I +think I had removed this deliberately in #27 in the spirit of simplifying, but +the result was the REPL exiting with error status on a normal ^D termination if +an error had ever been caught, because error handlers aren't really supposed to +return.) + +Ugly (define-r5rs 'foo (lambda ...)) forms changed to internal definitions +followed by bulk registration, in the same manner as the extensions in #31. + +33. New builtin: char-ready?, made possible by direct access to input buffers. +Stream sockets are made nonblocking to work around poll/select defects, but +undetected blocking is still possible, e.g. reading from disk or if standard +input is an externally provided socket. + +Builtin moved to library and tests added: equal? + +34. Libc in-sourcing: +- Removed stdio from main.c and indirect use via assert +- Removed last direct uses of malloc by replacing GC root linked list with an + array and heap creation with mmap, also enabling the hugepages option +- Removed last indirect use of malloc via atexit +- Replaced stdlib.h/string.h with individual declarations + +Made write_err handle EINTR/EAGAIN/EWOULDBLOCK, sharing with flush_output_port. + +July 2018: + +35. Start of library-level bignum support, including addition, subtraction, +Comba multiplication, and automatic fixnum promotion and demotion. Using +non-opaque values until the proper privileged interface is settled. + +It looks like the reader and printer will have to be done in Scheme now. +(They'd have to make non-tail calls back into Scheme to do base conversion, and +such C-level recursion is verboten.) So much the better for sanity and future +rewrites of the core though. A pared-down bootstrap reader will be preserved. + +Core: +- More assertions +- Flush ports on abort (with re-entry check, as flushing itself has assertions) +- Bignum awareness in number predicates +- Start replacing car(cdr(x)) pattern with cadr macro + +Builtins moved to library: + - * + +New builtin: fxvector/unchecked + +Builtins moved to library: read string->number + +The full range of numeric syntax is now supported except for rational and +complex: alternate radix (2/8/10/16), radix and exactness prefixes, exponents +and sharped-out digits (though significant figures aren't actually tracked). +Bignums come "for free", though base conversion is the naive quadratic +algorithm. + +Simplified the (now) bootstrap number decoder by removing float syntax and +promotions. + +Internal tweaks to the lexer and (now) bootstrap reader to support read-token. +Also loosened the overly anal interpretation of allowed whitespace. + +Started downcasing labels (all-caps doesn't really serve a purpose and risks +interference from libc extensions). + +37. Bignum division, conversion to flonum and output. + +Bugfix: bignum negation failed to demote in the case of *LEAST-FIXNUM*. + +Minor optimization: avoid some unnecessary untagging in fixnum ops. + +Builtins moved to library: + zero? + positive? + negative? + abs + quotient + remainder + modulo + exact->inexact + number->string + write + display + newline + +In addition to bignum support, quotient/remainder/modulo now properly maintain +inexactness for flonum inputs, though can produce inf/nan. + +In the Scheme rewrite of the printer, I realized the special save/restore of +current ports for error recovery is unnecessary now that dynamic-wind is taking +care of it (they can't be set directly by user code). + +New library procedures: gcd lcm + +New builtins: fx= fx< fx<= fx<=/unsigned fxlength/unsigned + +Renamed builtin: fxshift-unsigned to fxshift/unsigned + +New privileged builtins: + fxdiv/unchecked + fxdiv/ext/unchecked + fixnum->dec/unchecked + fixnum->hex/unchecked + fixnum->oct/unchecked + fixnum->bin/unchecked + flonum->dec/unchecked + flonum/unsigned/unchecked + flodiv/unchecked + floquotient/unchecked + floremainder/unchecked + builtin? + builtin-name + promise? + continuation? + +38. Full bignum integration. + +Serious bugfix: (QUOTIENT *LEAST-FIXNUM* -1) => *LEAST-FIXNUM* +(On the first pass for #37 it didn't occur to me that signed fixnum quotient +had an edge case besides zero divisor. The equivalent problem existed in the +prior C-only implementation, or possibly worse as C leaves negative division +implementation-defined.) + +Builtins moved to library: + eqv? + = < > <= >= + odd? even? + max min + / + floor ceiling truncate round + exp log + sin cos tan + asin acos atan + sqrt + expt + inexact->exact + +Bignums now opaque and supported by all generic operators. Unlike the fixnum +and flonum ops, which could also exist as unprivileged extensions, the bignum +primitives are implicitly unchecked. In addition to bignum support, the +irrational functions now handle some special cases for exactness e.g. zero (as +encouraged but not required). + +New privileged builtins: + toplevel-environment + set-car/unchecked! + fxbin/unsigned/unchecked + flo=/unchecked + flofixnum/unchecked + floor/unchecked + ceiling/unchecked + truncate/unchecked + round/unchecked + exp/unchecked + log/unchecked + sin/unchecked + cos/unchecked + tan/unchecked + asin/unchecked + acos/unchecked + atan/unchecked + atan2/unchecked + sqrt/unchecked + make-bignum + bignum? + bignum-negative? + bignum-set-negative! + bignum-ref + bignum-set! + bignum-length + bignum + bignum/unsigned + bignum2 + bignum-truncate! + +Renamed privileged builtins: + fxdiv/unchecked -> fxdiv/unsigned/unchecked + fxdiv/ext/unchecked -> fxdiv/ext/unsigned/unchecked + +(They were already unsigned; I think I'd been trying to avoid the crazy long +names, but it was a hazard.) + +Core: +- Grabbed the last extended type slot to fit the bignum sign bit +- Type assertions added to fixnum_val, new counterpart unsigned_fixnum_val, and + flonum_val, expanding coverage while reducing boilerplate +- Bounds check assertion added to vector_ref and vector_set +- With flonum coercion headaches out of core, finally replaced safe_fixnum_val + and exact_fixnum_val with something simple and correct +- is_number optimized based on extended number type tag values being contiguous +- is_integer made possibly more standard conformant by using ROUND (nearbyint) + instead of FLOOR for flonums (don't see how this could make a difference but + neither am I sure it can't) +- Made shallow_print report particular environment specifiers, the internal + specials, immutability, and vector lengths +- make_str and make_vector size checks simplified based on the unsigned trick + +Compiler: +- Removed fairly useless vector support from macro language +- Fixed old bug: improper rejection of macros expanding to #f +- Removed trivial dependency on generic < +- Switched to storing error context as symbol, avoiding SYMBOL->STRING calls in + the normal case + +Toplevel: +- Pretty "evaluates to" symbol prefixing REPL results +- Tracing for library procedure calls, allowing removal of the growing sprawl + of ad-hoc error context reporting +- Filled in stubs for bignum to hex, octal and binary conversion +- Handled bignum status in EXIT by remaindering there rather than in core + +asm-generic: +- Replaced "seems right but can't prove" implementation of signed double-width + multiply with ugly but more obvious method + +August 2018: + +39. +Serious bugfix: plug leaks of uncompiled subtrees in named LET and DO, while +simplifying, by feeding the full expansions back through compilation as is done +for ordinary macros. (Less seriously, the same for lambda annotation in plain +LET, at worst a performance issue.) + +Core: +- Add is_port assertions to R_PORT-based I/O routines +- Remove unnecessary variable pre-check in SET, allowing resolve_variable_ref + to operate on unresolved refs only, saving a few branches +- Remove fairly pointless and expensive is_list assertion in EVAL +- Remove redundant is_fixnum assertion from builtin_make_bignum + +Compiler: +- More consistently unquote self-evaluating objects +- Reject non-readable objects (possible through EVAL) + +Syntax library: +- OR: optimize by avoiding unnecessary temporary binding (multiple evaluation + is safe for non-list expressions as they're all side-effect free and O(1)) +- Explain how the hardcoded temporary names are safe +- COND: simplify by handling the temporary binding case with OR +- CASE: optimize by using EQV? instead of MEMV for the single-datum case + +40. +Core/compiler: +- Prevent integer overflows by limiting parameter list length (number of + bindings per lexical frame) +- Save a word in the procedure object and a pair in the annotated lambda form + by encoding variadic status in the sign of the arity + +Core: +- Avoid possibly undefined overflow behavior in untag_signed by doing the left + shift as unsigned +- Note a hazard in add_tag/ext_add_tag and audit + +Compiler: +- Slight simplifications/optimizations: filterize CHECK-IDENT; avoid repeated + SYMBOL->STRING in duplicate search + +40.maint1. + +Starting a maintenance branch as the crucial variable lookup optimization has +not yet made it back after the compiler overhaul of revision 41. + +Core: +- Remove assertions in stack ops, redundant since adding type assertions to + car/cdr long ago +- Remove invalid assertion added in r39: input_port_ready doesn't use R_PORT +- Remove mostly redundant is_list assertions in the full environment lookups + +March 2020: + +40.maint2. + +Overall: +- Restructure tree for slashpackage +- Trim Makefile +- Write install + check scripts and README + +Main (backports): +- Make default heap size build-time tunable (isolating magic numbers) +- Report build configuration in help text +- Remove "strtol" stdlib dependency + +40.maint3. + +Bulk reindent & reflow. +- Tabs: good +- Non-meaningful newlines in human text: bad +- "if" as a special "lisp word" producing less indentation: bad +- Code width: mostly sticking to 80 columns diff -uNr a/gscm/doc/extensions.txt b/gscm/doc/extensions.txt --- a/gscm/doc/extensions.txt false +++ b/gscm/doc/extensions.txt 3b2bc5374429e68c2f3e2dd2de2e7a0201edbdefa1815dde2a44451b777427819249168a433e67a9d6d825d27874e30cae5f1b58e62fa99ee58c1409e4550c24 @@ -0,0 +1,261 @@ +Report on Gales Scheme Extensions +================================= + +J. Welsh + +19 April 2018 + +This is a supplement to the "Revised(5) Report on the Algorithmic Language Scheme" defining the language extensions in the Gales Scheme system. The intent is to be stringent enough to be useful, yet lenient enough to facilitate adoption by a variety of Scheme implementations on a variety of platforms. + +Syntax library +-------------- + +RECEIVE, per SRFI 8. + +Runtime environment +------------------- + +*args* list + +(set-error-handler! handler) procedure + +Registers an error handler procedure. When an error is signalled, either internally or by a call to ERROR, execution exits any active dynamic extents (see DYNAMIC-WIND) and HANDLER is called with a message string as first argument and possibly relevant values as additional arguments. To prevent recursive errors, the error handler is first reset to the system's default, so HANDLER may need to re-register after completion of any error-prone operations. If HANDLER returns, the Scheme process terminates. + +It may not be possible to guarantee recovery from an out-of-memory error, as the handler could trigger it again before it is able to do anything useful. + +Rationale: +This mechanism is a poor substitute for a typed exception system but enables basic recovery such as restarting a REPL. + +(error string detail ...) procedure + +Per SRFI-23. Exits any current dynamic extents and does not return. Arguments are passed to the current error handler. + +(toplevel) procedure + +(call-as-toplevel) procedure + +(save-core filename) procedure + +(exit) procedure +(exit status) procedure + +Exits any current dynamic extents and terminates the Scheme system, returning the given exact integer exit status to the operating system. What exactly is returned is implementation dependent, but a normal or successful status shall be indicated by zero, which is the default if STATUS is omitted. + +(gc) procedure + +Runs a garbage collection, returning the number of live cells. + +Eval +---- + +(gales-scheme-environment) procedure + +Input and output +---------------- + +(flush-output-port) procedure +(flush-output-port port) procedure +(flush-output-port port option) procedure + +Flushes an open output port, scheduling any buffered writes for delivery to their destination as soon as possible. Returns an unspecified value. The port argument, if omitted, defaults to the value returned by current-output-port. The option argument, if included, is a symbol specifying additional semantics: + + sync -- performs a synchronous flush, delivering any buffered data and metadata to the underlying storage device and not returning until complete. If this fails or is not possible, an error is signalled. The intent is that for a file on nonvolatile storage media, all data previously written to the port be safe with respect to power failure or system crash upon successful return. In practice, this requires cooperation from external components such as operating system and storage hardware. + data-sync -- like sync except that updates to metadata not required for correctly reading back the data, such as timestamps, are not required to be completed. + +Writes are still to be delivered in a timely manner even without explicit flushing; buffered implementations are suggested to use a timer on the order of 10 milliseconds. Buffers are always flushed on CLOSE-OUTPUT-PORT and on orderly termination of the Scheme system, which means termination for any reason within the control of the implementation. Asynchronous write errors must not be signalled until the next operation on the affected port, and thus may be lost if the port is not explicitly flushed or closed. + +Rationale: +The ideal port abstraction is to deliver individual characters without delay. Unfortunately this approach often has considerable overhead, such as context switching for system calls or per-packet headers and computation on a network. The magic number 10 is intended to provide adequate responsiveness by default for common applications, while this procedure provides the means to make stricter latency or persistence requirements explicit when needed. + +(open-subprocess path arg1 ...) procedure + +(wait-subprocess) procedure +(wait-subprocess pid) procedure + +(open-output-file path if-exists) procedure +(call-with-output-file path proc if-exists) procedure +(with-output-to-file path thunk if-exists) procedure + +These extended variants of the corresponding R5RS procedures specify what to do if the named file exists. Symbolic options include: + + truncate -- the file is truncated to length 0. + append -- writes are performed at the end of the file. + overwrite -- file contents are overwritten in place starting at offset 0. + +Rationale: +A default is not specified, as it wasn't in R5RS and existing implementations vary. An "error" option is not provided as there would generally be a race condition between the existence check and creation. POSIX for example provides the O_EXCL open flag but only requires it be atomic with respect to other calls with the same flag. + +(read-token) procedure +(read-token port) procedure + +READ-TOKEN reads a single Scheme token from PORT, or the current input port. It is a recognizer for the nonterminal from R5RS section 7.1.1. With the exception of number tokens, an error is signalled if a lexically invalid character sequence is encountered in the input, or if an end of file is encountered after the beginning of a token and the token is incomplete. If an end of file is encountered before any characters are found that can begin a token, an end of file object is returned. Otherwise, a pair is returned whose CAR is a symbol indicating the token type and whose CDR contains either a string representation of its value, the value itself, or the empty list, depending on the type, as follows: + + (LITERAL . value) -- boolean, character, or string + (IDENTIFIER . string) -- in the implementation's preferred case + (NAMED-CHAR . string) -- in lowercase; omitting the #\; not validated + (NUMBER . string) -- including prefix if given; not validated + (STRING . string) + (OPEN-PAREN) + (CLOSE-PAREN) + (OPEN-VECTOR) + (ABBREV . symbol) -- one of QUOTE, QUASIQUOTE, UNQUOTE, or UNQUOTE-SPLICING + (DOT) + +Rationale: +Constructing numeric values at this stage is undesirable as it may require super-linear algorithms; likewise interning identifiers as symbols. Syntactic validation could still be done on numbers, but would be redundant with STRING->NUMBER. + +Sockets +------- + +(define (socket-address socket) (socket 'address)) +(define (close-socket socket) (socket 'close)) + +Stream sockets: +* input-port +* output-port +* shutdown-read +* shutdown-write + +Listeners: +* accept + +Datagram sockets: +* send +* receive + +Data types used within this section are defined as follows: + + Octet -- exact integer from 0 to 255, inclusive + IP address -- vector of four octets + IP6 address -- vector of 16 octets + Socket address -- list + +(ip-address string) library procedure +(lookup-host string) library procedure +(lookup-service string) library procedure +(lookup-addresses node service) + +(open-tcp-connection address) procedure +(open-tcp-connection address bind-address) procedure +(open-unix-connection path) procedure + +ADDRESS and BIND-ADDRESS are 2-lists (HOST SERVICE), where HOST is either: + - an IPv4 address as 4-vector of octets -- exact integers in the interval [0,256) + - an IPv4 address as string in dotted decimal notation + - a host name string, to be resolved by a fresh lookup in the system hosts database or DNS + +and SERVICE is either: + - a port number -- exact integer in the interval [0,65536) + - a service name string, to be resolved by a fresh lookup in the system services database + +(listen-tcp bind-address backlog) procedure +(listen-unix path backlog) procedure + +(listening-socket? obj) procedure + +This type predicate returns #t if the object is a listening socket (whether or not it has been closed) and #f otherwise. Listening sockets form a subtype of PORT?, disjoint from INPUT-PORT? and OUTPUT-PORT?. + +(close-listening-socket socket) procedure + +Closes the listening socket SOCKET. Has no effect if it has already been closed. Returns an unspecified value. There may be an underlying socket object in the network stack that remains in a listening state if it remains open in external processes. + +(accept-connection socket) procedure + +Rationale: +Stream sockets should be distinct from input- or output-only ports (if those exist), + +One-way shutdown of stream sockets should be possible as it is essential for some application protocols. +On the other hand, it is a common pattern to combine sockets with OS-level threads or process forking, which is important in particular for utilizing multiple processors on a shared-memory system. If shutdown weren't distinguished from close, a socket being closed either explicitly or through garbage collection in one process would block its use + +Numbers +------- + +[TODO] +(quotient/remainder n1 n2) procedure + +This procedure returns two values equivalent to (QUOTIENT N1 N2) and (REMAINDER N1 N2), but may be more efficient than calling the two separately. + +Commentary: +In a survey of existing variants of this interface, I found this one from Racket to best fit the spirit of R5RS. MIT Scheme has INTEGER-DIVIDE which returns "an object with two components" requiring dedicated selector procedures. Guile has TRUNCATE/, but this also handles non-integer reals and, as the name suggests, is part of a larger family implementing several division conventions. R6RS abandons both REMAINDER and MODULO in favor of a never-negative MOD. + +Fixnum operations +----------------- + +Fixnums are a subtype of exact integers having a fixed precision defined by the execution environment. They behave as specified for integers under the generic arithmetic operators, for example being promoted to bignum or inexact on overflow, but can also be used with a dedicated set of operators for modular and bitwise arithmetic. These are intended as efficient low-level building blocks for applications such as hash functions and multiple precision arithmetic. All operators must behave as if fixnums are represented as two's complement binary integers. Unless otherwise noted, any timing and energy consumption invariance with respect to operand values provided by the underlying machine must be preserved. + +The argument naming conventions of R5RS section 1.3.3 are extended such that f, f1, ... imply a fixnum type restriction. + +*fixnum-width* fixnum + +This constant indicates the implementation's fixnum precision in bits, which must be at least 16. Note that it may be less than the underlying machine word size due to tag bits. + +*greatest-fixnum* fixnum +*least-fixnum* fixnum + +These constants are equal to (- (expt 2 (- *fixnum-bits* 1)) 1) and (- (expt 2 (- *fixnum-bits* 1))) respectively, and define the fixnum range as a closed interval. + +(fixnum? obj) procedure + +This type predicate returns #t if the object is a fixnum and #f otherwise. All exact integers within the fixnum range are fixnums. + +(fx= f1 f2) procedure +(fx< f1 f2) procedure +(fx<= f1 f2) procedure + +These are equivalent to the generic =, <, and <= procedures, except that they are only defined for two fixnum arguments, which may allow more efficient implementation. + +(fx (values d0 d1) where (- f1 f2 f3) = d = (- d0 (* d1 (expt 2 *fixnum-width*))) + d1 = (- (quotient d (expt 2 *fixnum-width*))) ;; FIXME + d0 = (modulo d (expt 2 *fixnum-width*)) + + Args and results interpreted as unsigned. + +(fx*/wrap f1 ...) procedure + +(fx*/carry f1 f2) procedure +3-arg form reserved for R6 style + +(fx*/carry-unsigned f1 f2) procedure +3-arg form reserved for R6 style + +(fxnot f) procedure + +(fxand f1 ...) procedure +(fxior f1 ...) procedure +(fxxor f1 ...) procedure + +(fxif f1 f2 f3) procedure + +If/mux/choose/merge + (fxior (fxand f1 f2) (fxand (fxnot f1) f3)) + +(fxmaj f1 f2 f3) procedure + +Majority; carry-out of full adder; borrow-out of full subber by inverting minuend input + (fxior (fxand f1 f2) (fxand f1 f3) (fxand f2 f3)) + +(fxshift f bits) procedure +(fxshift/unsigned f bits) procedure + +(fxlength/unsigned f) procedure + +This procedure returns the bit length of a fixnum; that is, the one-based index of its most significant 1-bit, or zero if all bits are zero. This is mathematically equivalent to (CEILING (LG (+ F 1))) where LG is the base-2 logarithm, but computed exactly. + +(integer->fixnum n) (unsure) procedure diff -uNr a/gscm/library/compiler.scm b/gscm/library/compiler.scm --- a/gscm/library/compiler.scm false +++ b/gscm/library/compiler.scm 516ca959d303b4f87fa395d9550c047135243a115aa5978116d3e78e624a7bc28158da8a8670deb64b7e47d29586502b83711373c5a29017e5941f28872c593d @@ -0,0 +1,529 @@ +;;;; Front-end compiler for the Gales Scheme interpreter +;;; Jacob Welsh, Jan 2017 +;;; +;;; This "compiler" performs structural checks on the various expression types so the interpreter can avoid doing them repeatedly, copies to prevent mutation by user code after checking, replaces symbols with container objects for memoizing variable references, and implements a simplified (non-hygienic) macro language and some special cases for translating fancier expression types to primitive forms. +;;; +;;; To avoid the chicken-and-egg bootstrapping problem, the code is restricted to the forms supported directly by the interpreter (LAMBDA, QUOTE, IF, SET!, BEGIN, LETREC, DELAY). It should be scrutinized carefully: any malformed expressions in its code may invoke the dreaded Undefined Behavior in the interpreter, and any incorrect output opens the interpreter to UB from user code. +;;; +;;; For ease of loading from C, the compiler is structured as a single expression evaluating to a procedure. This is then registered as the global compiler procedure and is called on all further expressions passed to EVAL. +;;; +;;; The compiler's environment must not be mutable from user code, and some extensions are required. +;;; +;;; See also test-compiler.scm. +;;; +;;; BUGS: The macro language is fairly stupid -- I've got some reading to do on the subject. + +(letrec + ((err-context #f) + (cerror + (lambda (msg . detail) + (apply/unchecked + error (cons (string-append (symbol->string err-context) ": " msg) + detail)))) + + (syn-env '()) + (builtin-syn-env #f) + + (vector-set/unchecked! vector-set!) + + (unspecified-value '()) + + (list (lambda args args)) + (map + (lambda (proc l) + (if (null? l) '() + (cons (proc (car l)) (map proc (cdr l)))))) + (for-each + (lambda (proc l) + (letrec ((loop (lambda (l) (if (null? l) l (step l)))) + (step (lambda (l) (proc (car l)) (loop (cdr l))))) + (loop l)))) + (all + (lambda (pred items) + (if (null? items) #t + (if (pred (car items)) (all pred (cdr items)) + #f)))) + (compose (lambda (f g) (lambda (x) (f (g x))))) + (vmap + (lambda (proc vec) + (letrec ((result (make-vector (vector-length vec))) + (loop (lambda (k) (if (fx=/unchecked k -1) result (step k)))) + (step (lambda (k) + (vector-set/unchecked! + result k (proc (vector-ref/unchecked vec k))) + (loop (fx-/unchecked k 1))))) + (loop (fx-/unchecked (vector-length vec) 1))))) + (deep-copy/immutable + (lambda (x) + (if (pair? x) (cons/immutable (deep-copy/immutable (car/unchecked x)) + (deep-copy/immutable (cdr/unchecked x))) + (if (vector? x) (vector-copy/immutable (vmap deep-copy/immutable x)) + (if (string? x) (string-copy/immutable x) + x))))) + (deep-copy #f) ;; alias binding + (memq + (lambda (x l) + (if (pair? l) (if (eq? x (car/unchecked l)) l + (memq x (cdr/unchecked l))) + #f))) + (not-list-of-length + (lambda (obj . lengths) + (if (list? obj) + (if (memq (length obj) lengths) #f #t) + #t))) + (assq + (lambda (x l) + (if (pair? l) (if (eq? x (caar l)) (car/unchecked l) + (assq x (cdr/unchecked l))) + #f))) + (append2 + (lambda (x y) + (if (pair? x) (cons (car/unchecked x) + (append2 (cdr/unchecked x) y)) + (if (null? x) y + (error + "COMPILER BUG: append2: non-final argument not a list"))))) + (append+/unchecked + (lambda (lists) + (if (pair? (cdr/unchecked lists)) + (append2 (car/unchecked lists) + (append+/unchecked (cdr/unchecked lists))) + (car/unchecked lists)))) + (append* + (lambda (lists) + (if (pair? lists) (append+/unchecked lists) + lists))) + (list-head + (lambda (l k) + (if (fx=/unchecked k 0) '() + (cons (car l) (list-head (cdr l) (fx-/unchecked k 1)))))) + (list-tail + (lambda (l k) + (if (fx=/unchecked k 0) l + (list-tail (cdr l) (fx-/unchecked k 1))))) + + (dict-extend (lambda (d name value) (cons (cons name value) d))) + (dict-get (lambda (d name default) + ((lambda (binding) (if binding (cdr binding) default)) + (assq name d)))) + + (add-syntax! + (lambda (name rules) + ((lambda (binding) + (if binding (set-cdr! binding rules) + (set! syn-env (dict-extend syn-env name rules)))) + (assq name syn-env)))) + + (with-split-list + (lambda (l k proc) + (proc (list-head l k) (list-tail l k)))) + + (check-ident + (lambda (obj) (if (symbol? obj) obj (cerror "not an identifier:" obj)))) + + ;; Check that a parameter list contains only symbols, no duplicates, and won't overflow internal limits (arity annotation in procedures and frame index in variable refs). Return the number of parameters, negated for an improper list. + (check-param-list + (lambda (vars) + (letrec + ((len 0) + (improper? #f) + (make-proper + ;; Unsurprisingly I hope, if you build an expression tree with a cyclic parameter list and eval it, you'll exhaust the heap. + (lambda (l) + (if (null? l) '() + (begin (if (fx=/unchecked len *max-parameters*) + (cerror "too many parameters")) + (set! len (fx+/unchecked len 1)) + (if (pair? l) + (cons (car/unchecked l) + (make-proper (cdr/unchecked l))) + (begin (set! improper? #t) + (list l))))))) + ;; Mergesort based duplicate search + (sort + (lambda (l) + (if (null? l) l + (if (null? (cdr/unchecked l)) l + (with-split-list + l (fxshift (length l) -1) + (lambda (left right) + (merge (sort left) (sort right)))))))) + (merge + (lambda (left right) + (if (null? left) right + (if (null? right) left + (if (stringstring check-ident) (make-proper vars))) + (if improper? (fxneg/unchecked len) len)))) + (max-args + (lambda (form n) + (if (fx omitted, use + (caddr spec)))) + (make-loop-proc + (lambda () + (list + 'lambda (map car specs) + (list 'if (car test-clause) + ;; If is true, return result of + (if (null? (cdr test-clause)) + (list 'quote unspecified-value) + (cons 'begin (cdr test-clause))) + ;; Otherwise execute and loop + (cons 'begin (append2 + commands + (list (cons loop-name + (map step-expr specs)))))))))) + (if (not (list? specs)) + (cerror "iteration specs not a list:" specs)) + (for-each (lambda (spec) + (if (not-list-of-length spec 2 3) + (cerror "ill-formed iteration spec:" spec))) + specs) + (if (not (list? test-clause)) + (cerror "test clause not a list:" test-clause)) + (if (null? test-clause) (cerror "missing test expression")) + (compile-expr + (list 'letrec + (list (list loop-name (make-loop-proc))) + (cons loop-name (map cadr specs))))))) + + (compile-letrec + (lambda (operands) + ;; TODO annotate arity as with lambda (watch out for the other manual letrec constructions) + (if (null? operands) (cerror "missing bindings")) + (check-bindings (car operands)) + (check-param-list (map car (car operands))) + (cons 'letrec + (cons (map (lambda (binding) + (list (car binding) (compile-expr (cadr binding)))) + (car operands)) + (compile-body (cdr operands)))))) + (compile-define + (lambda (operands) + (set! err-context 'define) + (if (null? operands) (cerror "missing name")) + (if (pair? (car operands)) + ;; (define (proc . params) . body) + ;; -> (define proc (lambda params . body)) + (list 'define + (check-ident (caar operands)) + (compile-lambda (cons (cdar operands) (cdr operands)))) + ;; (define var value) + (begin (if (null? (cdr operands)) (cerror "missing value")) + (max-args operands 2) + (list 'define + (check-ident (car operands)) + (compile-expr (cadr operands))))))) + (compile-bad-define + (lambda (operands) + (cerror "only allowed at start of body or top level:" + (cons 'define operands)))) + (compile-bad-define-syntax + (lambda (operands) + (cerror "only allowed at top level:" (cons 'define-syntax operands)))) + (compile-delay + (lambda (operands) + (if (null? operands) (cerror "missing expression")) + (max-args operands 1) + (list 'delay (compile-expr (car operands))))) + (compile-define-syntax + (lambda (operands) + (set! err-context 'define-syntax) + (if (null? operands) (cerror "missing name")) + (if (null? (cdr operands)) (cerror "missing syntax-rules")) + (max-args operands 2) + (if (not (list? (cadr operands))) + (cerror "not a combination (expected syntax-rules):" + (cadr operands))) + (if (null? (cadr operands)) + (cerror "empty combination (expected syntax-rules)")) + (if (not (eq? (caadr operands) 'syntax-rules)) + (cerror "expected syntax-rules:" (cadr operands))) + (add-syntax! (check-ident (car operands)) + (compile-syntax-rules (car operands) (cdadr operands))) + unspecified-value)) + (compile-syntax-rules + (lambda (keyword operands) + (set! err-context 'syntax-rules) + (if (null? operands) (cerror "missing literals")) + (if (not (list? (car operands))) (cerror "literals not a list")) + (check-param-list (car operands)) + (cons (deep-copy (car operands)) + (map (compile-syntax-rule keyword (car operands)) + (cdr operands))))) + (compile-syntax-rule + (lambda (keyword literals) + (lambda (rule) + (if (not-list-of-length rule 2) (cerror "ill-formed rule")) + (list (compile-pattern keyword literals (car rule)) + (deep-copy (cadr rule)))))) + (compile-pattern + (lambda (keyword literals pat) + (letrec + ((get-pat-vars + ;; Walk a pattern, extending a list of pattern variables + (lambda (vars pat) + (if (pair? pat) (get-pat-vars (get-pat-vars vars (car pat)) + (cdr pat)) + (if (symbol? pat) + (if (eq? pat '...) (cerror "ellipsis unsupported") + (if (memq pat literals) vars (cons pat vars))) + vars))))) + (if (null? pat) (cerror "empty pattern")) + (if (not (pair? pat)) (cerror "pattern not a list:" pat)) + (if (not (eq? (car pat) keyword)) + (cerror "pattern doesn't begin with keyword:" pat)) + (check-param-list (get-pat-vars '() pat)) + (deep-copy pat)))) + + (transform + (lambda (form literals rules) + (letrec + ((try-rules + (lambda (rules) + (if (null? rules) (cerror "no matching syntax rule:" form)) + ;; rule is ((keyword . pattern) template) + ((lambda (dict) + (if dict (instantiate dict (cadar rules)) + (try-rules (cdr rules)))) + (match literals '() form (cdaar rules)))))) + (try-rules rules)))) + (match + (lambda (literals dict form pattern) + (if (symbol? pattern) + (if (memq pattern literals) + (if (eq? form pattern) dict #f) + (dict-extend dict pattern form)) + (if (pair? pattern) + (if (pair? form) + ((lambda (dict) + (if dict (match literals dict (cdr form) (cdr pattern)) + #f)) + (match literals dict (car form) (car pattern))) + #f) + (if (eq? form pattern) dict + #f))))) + (instantiate + (lambda (dict template) + (if (symbol? template) (dict-get dict template template) + (if (pair? template) (cons (instantiate dict (car template)) + (instantiate dict (cdr template))) + template))))) + + (set! deep-copy deep-copy/immutable) ;; distinction not currently necessary + + (set! builtin-syn-env + (list (cons 'lambda compile-lambda) + (cons 'quote compile-quote) + (cons 'if compile-if) + (cons 'set! compile-set) + (cons 'begin compile-begin) + (cons 'let compile-let) + (cons 'do compile-do) + (cons 'letrec compile-letrec) + (cons 'delay compile-delay) + (cons 'define compile-bad-define) + (cons 'define-syntax compile-bad-define-syntax))) + compile-expr-or-def) diff -uNr a/gscm/library/test-compiler.scm b/gscm/library/test-compiler.scm --- a/gscm/library/test-compiler.scm false +++ b/gscm/library/test-compiler.scm 3ffa94def8c9e63a6a46979fb90902ac383fb22854b4aeed0baa672e01670509a99816395856b6b93ee9e800d874111dd9c550bfa17c2bfba0b26a13ce47a598 @@ -0,0 +1,46 @@ +;;; Test the GSCM compiler, e.g. to run it on its own code. Should support any +;;; standard R5RS implementation, including the bare GSCM interpreter. + +(define error #f) +(define variable-ref (lambda (sym) sym)) +;; to verify variable ref conversion +;(define variable-ref (lambda (sym) (list 'variable-ref sym))) +;(define *max-parameters* 10) ; deliberately low for testing +(define *max-parameters* (expt 2 24)) +(define car/unchecked car) +(define cdr/unchecked cdr) +(define fx=/unchecked =) +(define fx) + ((cond (else a . b)) (begin a . b)) + ((cond (t => r)) (let ((tmp5f920465446f5c2d t)) + (if tmp5f920465446f5c2d (r tmp5f920465446f5c2d)))) + ((cond (t => r) . c) (let ((tmp5f920465446f5c2d t)) + (if tmp5f920465446f5c2d (r tmp5f920465446f5c2d) + (cond . c)))) + ;; return unspecified when no tests pass + ((cond (t)) (or t '())) + ((cond (t) . c) (or t (cond . c))) + ((cond (t a . b)) (if t (begin a . b))) + ((cond (t a . b) . c) (if t (begin a . b) (cond . c))))) + + (define-syntax case + (syntax-rules (else) + ((case (a . b) c . d) (let ((tmp5f920465446f5c2d (a . b))) + (case tmp5f920465446f5c2d c . d))) + ((case k (else a . b)) (begin a . b)) + ;; return unspecified when no tests pass + ((case k (() a . b)) '()) + ((case k (() a . b) c . d) (case k c . d)) + ((case k ((s) a . b)) (if (eqv? k 's) (begin a . b))) + ((case k ((s) a . b) c . d) (if (eqv? k 's) (begin a . b) + (case k c . d))) + ((case k ((s . t) a . b)) (if (memv k '(s . t)) (begin a . b))) + ((case k ((s . t) a . b) c . d) (if (memv k '(s . t)) (begin a . b) + (case k c . d))))) + + ;; SRFI 8 + (define-syntax receive + (syntax-rules () + ((receive names expr body . a) + (call-with-values (lambda () expr) (lambda names body . a))))) + + ;;; Toplevel + + (let ((prompt "GSCM> ") + (result-prefix " => ") + (*args* (eval '*args* (interaction-environment))) + (intr-env (interaction-environment)) + (error-cont #f) + (exit-cont #f) + (repl-cont #f) + (error-handler #f) + (trace-ring-length 8)) + + ;;; Various constants + ;; Inexact number and named char literals unsupported in bootstrap reader + (define flo-1 (flonum/unchecked -1)) + (define flo0 (flonum/unchecked 0)) + (define flo1 (flonum/unchecked 1)) + (define flo2 (flonum/unchecked 2)) + (define flo10 (flonum/unchecked 10)) + (define flo1/10 (flodiv/unchecked (flonum/unchecked 1) + (flonum/unchecked 10))) + (define flo-log-10 (log/unchecked (flonum/unchecked 10))) + (define nl (integer->char 10)) + (define sp (integer->char 32)) + + (define *fixnum-width*-1 (fx-/unchecked *fixnum-width* 1)) + + ;;; Tracing + ;; When a library procedure is called externally, its name is recorded in a circular log structure. On error, this is copied to a proper list and passed to the handler. Crude, perhaps, but fairly effective. (Errors signalled by builtins already include the builtin name in the message.) + (define trace-ring + (let ((head (cons #f '()))) + (do ((p head (cons #f p)) + (k 1 (fx+/wrap k 1))) + ((fx= k trace-ring-length) (set-cdr! head p) head)))) + + (define (clear-trace-ring) + (do ((p trace-ring (cdr p)) + (k 0 (fx+/wrap k 1))) + ((fx= k trace-ring-length)) + (set-car! p #f))) + + (define (get-trace-log) + (do ((p trace-ring (cdr p)) + (acc '() (if (car p) (cons (car p) acc) acc)) + (k 0 (fx+/wrap k 1))) + ((fx= k trace-ring-length) (reverse acc)))) + + (define (trace . args) + (set-car/unchecked! trace-ring args) + (set! trace-ring (cdr/unchecked trace-ring))) + + (define (traced-procedure name proc) + (lambda args (trace name) (apply/unchecked proc args))) + + (define (define-traced injector name proc) + (injector name (traced-procedure name proc))) + + (define (write-line val) (write val) (newline)) + + (define (print-error message args trace-log) + (display "ERROR: ") + (display message) + (for-each (lambda (arg) (write-char sp) (write arg)) args) + (newline) + (display "Trace log: ") + (write-line trace-log)) + + (define (repl) + (clear-trace-ring) + (display prompt) + (flush-output-port) + (let ((expr (read))) + (if (eof-object? expr) (newline) + (receive vals (eval expr intr-env) + (for-each (lambda (val) + (display result-prefix) + (write-line val)) vals) + (flush-output-port) + (repl))))) + + (define (repl-error message args trace-log) + (print-error message args trace-log) + (repl-cont '())) + + (define (exec-from-port p) + (do ((expr (read p) (read p)) + (acc '() (cons expr acc))) + ((eof-object? expr) + (close-input-port p) + (for-each (lambda (expr) (eval expr intr-env)) + (reverse acc))))) + + (define (not-integer x) (error "not an integer:" x)) + (define (not-list) (error "not a list")) + (define (not-number x) (error "not a number:" x)) + (define (not-exact-int x) (error "not an exact integer:" x)) + (define (not-output-port x) (error "not an output port:" x)) + (define (not-procedure x) (error "not a procedure:" x)) + + (define (too-many-args) (error "too many arguments")) + (define (zero-divisor) (error "zero divisor")) + (define (bad-radix x) (error "invalid radix:" x)) + (define (uneven-lists) (error "uneven lists")) + + (define (require-integer n) (if (integer? n) n (not-integer n))) + (define (require-procedure p) (if (procedure? p) p (not-procedure p))) + + (define (check-exact-non-negative-int k) + (if (cond ((fixnum? k) (fxnegative/unchecked? k)) + ((bignum? k) (bignum-negative? k)) + (else #t)) + (error "not an exact non-negative integer:" k))) + + (define (foldl f) + ;; type-safe but doesn't detect improper or cyclic list + (define (loop accum l) + (if (pair? l) (loop (f accum (car/unchecked l)) + (cdr/unchecked l)) + accum)) + loop) + + (define (all pred) + ;; type-safe but doesn't detect improper or cyclic list + (define (loop l) + (if (pair? l) (and (pred (car/unchecked l)) + (loop (cdr/unchecked l))) + #t)) + loop) + + (define (all-pairwise pred) + ;; type-safe but doesn't detect improper or cyclic list + (define (loop a b l) + (and (pred a b) + (if (pair? l) (loop b (car/unchecked l) (cdr/unchecked l)) + #t))) + loop) + + ;; not tail recursive + (define (map1 proc) + ;; type-safe but doesn't detect improper or cyclic list + (define (loop l) + (if (pair? l) + (cons (proc (car/unchecked l)) + (loop (cdr/unchecked l))) + l)) + loop) + + (define (copy-list l) + (define r (list '())) + (define (loop tail l) + (cond ((pair? l) (set-cdr/unchecked! tail (cons (car/unchecked l) '())) + (loop (cdr/unchecked tail) (cdr/unchecked l))) + ((null? l) (cdr/unchecked r)) + (else (not-list)))) + (loop r l)) + + (define (append2 x y) + (define r (list '())) + (define (loop tail x) + (cond ((pair? x) (set-cdr/unchecked! tail (cons (car/unchecked x) '())) + (loop (cdr/unchecked tail) (cdr/unchecked x))) + ((null? x) (set-cdr/unchecked! tail y) + (cdr/unchecked r)) + (else (not-list)))) + (loop r x)) + + (define (saving-values thunk after) + (call-with-values thunk + (lambda results + (after) + (apply/unchecked values results)))) + + (define (opt-args args . defaults) + (let loop ((acc '()) (args args) (defaults defaults)) + (if (pair? defaults) + (if (pair? args) + (loop (cons (car/unchecked args) acc) + (cdr/unchecked args) (cdr/unchecked defaults)) + (loop (cons (car/unchecked defaults) acc) + '() (cdr/unchecked defaults))) + (if (pair? args) (too-many-args) + (apply/unchecked values (reverse acc)))))) + + (define (first a . args) a) + (define (second a b . args) b) + + (define (fxquotient/unsigned/unchecked a b) + (call-with-values (lambda () (fxdiv/unsigned/unchecked a b)) first)) + + (define (fxremainder/unsigned/unchecked a b) + (call-with-values (lambda () (fxdiv/unsigned/unchecked a b)) second)) + + (define (fxquotient/ext/unsigned/unchecked al ah b) + (call-with-values + (lambda () (fxdiv/ext/unsigned/unchecked al ah b)) first)) + + ;;; Bignums + ;; + ;; Bignums are represented in a tagged sign-magnitude form. (I first tried two's complement but got stuck on efficient multiplication.) The magnitude is a vector of one or more fixnum words, least significant first. Bignums are normalized to drop trailing zeros or demoted to fixnum when possible. Words are considered unsigned, thus may appear negative to ordinary (signed) fixnum operations. + ;; + ;; Privileged primitives are as follows. Unless specified, all might omit type checks, and most internal bignum procedures inherit this! + ;; + ;; (MAKE-BIGNUM K) => bignum of K words + ;; Fixnum K is checked against the largest possible size, if any. The words may be uninitialized. + ;; (BIGNUM? OBJ) => boolean + ;; (BIGNUM-NEGATIVE? BN) => boolean + ;; (BIGNUM-SET-NEGATIVE! BN) => BN + ;; (BIGNUM-REF BN K) => word K of BN + ;; (BIGNUM-SET! BN K V) => unspecified, updating word K of BN to fixnum V + ;; (BIGNUM-LENGTH BN) => fixnum + ;; (BIGNUM N) => bignum of one word (not demoting) + ;; Constructs bignum with the value of N (signed). + ;; (BIGNUM/UNSIGNED N) => bignum of one word (not demoting) + ;; Constructs bignum with single word N + ;; (BIGNUM2 LO HI) => fixnum or bignum of one or two words + ;; Constructs bignum, demoting, from double-width signed value represented in two's complement by fixnums LO and HI. + ;; (BIGNUM-TRUNCATE! BN K) => bignum (not demoting) + ;; Returns truncation of BN to K words, possibly in-place. + + (define (fx+/promote a b) + (call-with-values (lambda () (fx+/carry a b)) bignum2)) + + (define (fx*/promote a b) + (call-with-values (lambda () (fx*/carry a b)) bignum2)) + + (define (fxneg/promote n) + (if (fx=/unchecked n *least-fixnum*) (bignum/unsigned n) + (fxneg/unchecked n))) + + (define (fxquotient/promote/unchecked a b) + (if (fxnegative/unchecked? b) + (if (fxnegative/unchecked? a) + (let ((q (fxquotient/unsigned/unchecked (fxneg/unchecked a) + (fxneg/unchecked b)))) + (if (fx=/unchecked q *least-fixnum*) (bignum/unsigned q) q)) + (fxneg/unchecked (fxquotient/unsigned/unchecked + a (fxneg/unchecked b)))) + (if (fxnegative/unchecked? a) + (fxneg/unchecked (fxquotient/unsigned/unchecked + (fxneg/unchecked a) b)) + (fxquotient/unsigned/unchecked a b)))) + + (define (fxremainder/unchecked a b) + (if (fxnegative/unchecked? b) + (if (fxnegative/unchecked? a) + (fxneg/unchecked (fxremainder/unsigned/unchecked + (fxneg/unchecked a) (fxneg/unchecked b))) + (fxremainder/unsigned/unchecked a (fxneg/unchecked b))) + (if (fxnegative/unchecked? a) + (fxneg/unchecked (fxremainder/unsigned/unchecked + (fxneg/unchecked a) b)) + (fxremainder/unsigned/unchecked a b)))) + + (define (bnneg n) + (let ((len (bignum-length n))) + (define r (make-bignum len)) + (define (copy k) + (bignum-set! r k (bignum-ref n k)) + (if (fx=/unchecked k 0) r (copy (fx-/unchecked k 1)))) + (if (bignum-negative? n) (copy (fx-/unchecked len 1)) + (if (and (fx=/unchecked len 1) + (fx=/unchecked (bignum-ref n 0) *least-fixnum*)) + *least-fixnum* + (bignum-set-negative! (copy (fx-/unchecked len 1))))))) + + (define (bignum-normalize! n) + (let loop ((k (fx-/unchecked (bignum-length n) 1))) + (cond ((fx=/unchecked k 0) + (let ((lo (bignum-ref n 0))) + (if (fxnegative/unchecked? lo) + ;; high bit set: fits in signed fixnum in singular case + (if (and (bignum-negative? n) + (fx=/unchecked lo *least-fixnum*)) lo + (bignum-truncate! n 1)) + ;; high bit clear: always fits in signed fixnum + (if (bignum-negative? n) (fxneg/unchecked lo) lo)))) + ((fx=/unchecked (bignum-ref n k) 0) (loop (fx-/unchecked k 1))) + (else (bignum-truncate! n (fx+/unchecked k 1)))))) + + (define (bignum-bit-stream n) + (define nwords (bignum-length n)) + (define (loop word seen-words bits-left) + (if (fx=/unchecked bits-left 0) + (if (fx=/unchecked seen-words nwords) '() + (loop (bignum-ref n seen-words) (fx+/unchecked seen-words 1) + *fixnum-width*)) + (cons (fxand word 1) + (delay (loop (fxshift word -1) seen-words + (fx-/unchecked bits-left 1)))))) + (loop (bignum-ref n 0) 1 *fixnum-width*)) + + (define stream-head car) + (define (stream-tail s) (force (cdr s))) + + (define (bn< a b) + (if (bignum-negative? a) + (if (bignum-negative? b) + (fxnegative/unchecked? (bncompare/unsigned b a)) + #t) + (if (bignum-negative? b) + #f + (fxnegative/unchecked? (bncompare/unsigned a b))))) + + (define (bncompare/unsigned a b) + (let ((alen (bignum-length a)) + (blen (bignum-length b))) + (cond ((fx=/unchecked alen blen) + (let loop ((k (fx-/unchecked alen 1))) + (if (fx=/unchecked k -1) 0 + (let ((ak (bignum-ref a k)) + (bk (bignum-ref b k))) + (cond ((fx=/unchecked ak bk) (loop (fx-/unchecked k 1))) + ((fxflonum n) + (define len (bignum-length n)) + (define (loop k acc bits) + (cond ((fx=/unchecked k -1) acc) + ((fx<=/unchecked 53 bits) ;; significant bits in IEEE 754 double + ;; TODO use LOAD-EXPONENT/UNCHECKED, but beware of fixnum overflow in computing exponent + (flo*/unchecked + acc (exp/unchecked (flo*/unchecked + (flonum/unchecked (fx+/unchecked k 1)) + flo-log-bn-radix)))) + (else (loop (fx-/unchecked k 1) + (flo+/unchecked (flo*/unchecked acc flo-bn-radix) + (flonum/unsigned/unchecked + (bignum-ref n k))) + (fx+/unchecked bits *fixnum-width*))))) + (let ((r (loop (fx-/unchecked len 2) + (flonum/unsigned/unchecked + (bignum-ref n (fx-/unchecked len 1))) + 1))) ;; high word might only have one significant bit + (if (bignum-negative? n) (floneg/unchecked r) r))) + + ;; Convert nonzero bignum to minimal list of digits in a smaller power-of-two radix, least significant first, "reflowing" in linear time. + (define (bignum->digits n bits) + (define len (bignum-length n)) + (define mask (fx-/unchecked (fxshift 1 bits) 1)) + (define acc '()) + (define (push! digit) + (set! acc (cons (fxand mask digit) acc))) + (define (loop word bit) + (if (fxdec n) + ;; Quadratic (repeated division by fixnum constant) + (define (loop acc n) + (cond ((bignum? n) + (receive (q r) (bndiv n dec-chunk-radix) + (loop (extend-padded acc (fixnum->dec/unchecked r) + dec-chunk-digits) + q))) + ((fx=/unchecked n 0) acc) + (else (cons (fixnum->dec/unchecked n) acc)))) + (let ((neg (bignum-negative? n))) + (press-num-string neg (loop '() (if neg (bnneg n) n))))) + + (define hex-chunk-digits (delay (quotient *fixnum-width*-1 4))) + (define hex-chunk-bits (delay (* hex-chunk-digits 4))) + + (define (bignum->hex n) + (do ((digits (bignum->digits n hex-chunk-bits) (cdr/unchecked digits)) + (acc '() (extend-padded + acc (fixnum->hex/unchecked (car/unchecked digits)) + hex-chunk-digits))) + ((null? (cdr/unchecked digits)) + (press-num-string + (bignum-negative? n) + (cons (fixnum->hex/unchecked (car/unchecked digits)) acc))))) + + (define oct-chunk-digits (delay (quotient *fixnum-width*-1 3))) + (define oct-chunk-bits (delay (fx*/wrap 3 oct-chunk-digits))) + + (define (bignum->oct n) + (do ((digits (bignum->digits n oct-chunk-bits) (cdr/unchecked digits)) + (acc '() (extend-padded + acc (fixnum->oct/unchecked (car/unchecked digits)) + oct-chunk-digits))) + ((null? (cdr/unchecked digits)) + (press-num-string + (bignum-negative? n) + (cons (fixnum->oct/unchecked (car/unchecked digits)) acc))))) + + (define (bignum->bin n) + (define len-1 (fx-/unchecked (bignum-length n) 1)) + (define (loop k acc) + (if (fx=/unchecked k len-1) + (cons (fixnum->bin/unsigned/unchecked (bignum-ref n k)) acc) + (loop (fx+/unchecked k 1) + (extend-padded acc (fixnum->bin/unsigned/unchecked + (bignum-ref n k)) + *fixnum-width*)))) + (press-num-string (bignum-negative? n) (loop 0 '()))) + + ;;; Procedure library + + ;;; 6.1 Equivalence predicates + + (define (eqv? a b) + (cond ((eq? a b) #t) + ;; char and fixnum covered by EQ? (implementation dependent) + ((bignum? a) (and (bignum? b) (bn= a b))) + ;; exactness must match + ((flonum? a) (and (flonum? b) (flo=/unchecked a b))) + (else #f))) + + (define (vector=? a b) + (define len (vector-length a)) + (define (loop k) + (if (fx=/unchecked k len) #t + (and (equal? (vector-ref/unchecked a k) (vector-ref/unchecked b k)) + (loop (fx+/unchecked k 1))))) + (and (fx=/unchecked len (vector-length b)) + (loop 0))) + + (define (equal? a b) + (cond ((eqv? a b) #t) + ((pair? a) (and (pair? b) + (equal? (car/unchecked a) (car/unchecked b)) + (equal? (cdr/unchecked a) (cdr/unchecked b)))) + ((vector? a) (and (vector? b) (vector=? a b))) + ((string? a) (and (string? b) (string=? a b))) + (else #f))) + + ;;; 6.2.5 Numerical operations + + (define (exp10/flonum x) + (exp/unchecked (flo*/unchecked flo-log-10 x))) + + ;; For comparisons of mixed exactness to be transitive, arguments must not be converted inexactly (e.g. "promoting" to flonum) if this could cause the result to differ from a true comparison of the represented values. + + (define (=/mixed flonum exact-int) + (and (integer? flonum) + (not (inf/unchecked? flonum)) + (=/generic (inexact->exact flonum) exact-int))) + + (define (exact (floor/unchecked flonum)) exact-int))) + + (define (>/mixed flonum exact-int) + (if (inf/unchecked? flonum) (not (flonegative/unchecked? flonum)) + (>/generic (inexact->exact (ceiling/unchecked flonum)) exact-int))) + + (define (<=/mixed flonum exact-int) + (if (inf/unchecked? flonum) (flonegative/unchecked? flonum) + (<=/generic (inexact->exact (ceiling/unchecked flonum)) exact-int))) + + (define (>=/mixed flonum exact-int) + (if (inf/unchecked? flonum) (not (flonegative/unchecked? flonum)) + (>=/generic (inexact->exact (floor/unchecked flonum)) exact-int))) + + ;; The convention for 2-arg generics is to dispatch on the type of the second argument first, as some operations (subtraction, division) do special things based on it but not the first. + + (define (=/fixnum z fx) + (cond ((fixnum? z) (fx=/unchecked z fx)) + ((bignum? z) #f) ;; assuming demotion + ((flonum? z) (=/mixed z fx)) + (else (not-number z)))) + + (define (/mixed fl x)) ;; reversing + (else (not-number x)))) + + (define (<=/flonum x fl) + (cond ((flonum? x) (flo<=/unchecked x fl)) + ((integer? x) (>=/mixed fl x)) ;; reversing + (else (not-number x)))) + + (define (=/generic z1 z2) + (cond ((fixnum? z2) (=/fixnum z1 z2)) + ((bignum? z2) (=/bignum z1 z2)) + ((flonum? z2) (=/flonum z1 z2)) + (else (not-number z2)))) + + (define (/generic x1 x2) ;; reversing dispatch order + (cond ((fixnum? x1) (=/generic x1 x2) ;; reversing dispatch order + (cond ((fixnum? x1) (<=/fixnum x2 x1)) + ((bignum? x1) (<=/bignum x2 x1)) + ((flonum? x1) (<=/flonum x2 x1)) + (else (not-number x1)))) + + (define =* (delay (all-pairwise =/generic))) + (define <* (delay (all-pairwise * (delay (all-pairwise >/generic))) + (define <=* (delay (all-pairwise <=/generic))) + (define >=* (delay (all-pairwise >=/generic))) + + (define (= z1 z2 . zs) (=* z1 z2 zs)) + (define (< x1 x2 . xs) (<* x1 x2 xs)) + (define (> x1 x2 . xs) (>* x1 x2 xs)) + (define (<= x1 x2 . xs) (<=* x1 x2 xs)) + (define (>= x1 x2 . xs) (>=* x1 x2 xs)) + + (define (zero? z) + (cond ((fixnum? z) (fx=/unchecked z 0)) + ((bignum? z) #f) + ((flonum? z) (flo=/unchecked z flo0)) + (else (not-number z)))) + + (define (positive? x) + (cond ((fixnum? x) (fxflonum x)) + ((flonum? x) x) + (else (not-number x)))) + + (define (max2 x1 x2) + (cond ((flonum? x1) (flonum/generic (if (flonum bn))) + (else (not-number z))))) + + (define (dispatch-num2 fx-op bn-op flo-op) + (let ((dfx (dispatch-fixnum fx-op bn-op flo-op)) + (dbn (dispatch-bignum bn-op flo-op))) + (lambda (z1 z2) + (cond ((fixnum? z2) (dfx z1 z2)) + ((bignum? z2) (dbn z1 z2)) + ((flonum? z2) (flo-op (flonum/generic z1) z2)) + (else (not-number z2)))))) + + (define add2 (delay (dispatch-num2 fx+/promote bn+ flo+/unchecked))) + (define mul2 (delay (dispatch-num2 fx*/promote bn* flo*/unchecked))) + + (define sub/bignum (delay (dispatch-bignum bn- flo-/unchecked))) + + (define (sub2 z1 z2) + (cond ((fixnum? z2) (add2 z1 (fxneg/promote z2))) + ((bignum? z2) (sub/bignum z1 z2)) + ((flonum? z2) (flo-/unchecked (flonum/generic z1) z2)) + (else (not-number z2)))) + + (define (div2 z1 z2) + (flodiv/unchecked + (flonum/generic z1) + (cond ((fixnum? z2) (if (fx=/unchecked z2 0) (zero-divisor) + (flonum/unchecked z2))) + ((bignum? z2) (bignum->flonum z2)) + ((flonum? z2) (if (flo=/unchecked z2 flo0) (zero-divisor) z2)) + (else (not-number z2))))) + + (define add* (delay (foldl add2))) + (define mul* (delay (foldl mul2))) + (define sub* (delay (foldl sub2))) + (define div* (delay (foldl div2))) + + (define (+ . zs) + (if (pair? zs) (add* (car/unchecked zs) (cdr/unchecked zs)) 0)) + + (define (* . zs) + (if (pair? zs) (mul* (car/unchecked zs) (cdr/unchecked zs)) 1)) + + (define (- z . zs) + (cond ((pair? zs) (sub* (sub2 z (car/unchecked zs)) + (cdr/unchecked zs))) + ((fixnum? z) (fxneg/promote z)) + ((bignum? z) (bnneg z)) + ((flonum? z) (floneg/unchecked z)) + (else (not-number z)))) + + (define (/ z . zs) + (if (pair? zs) (div* (div2 z (car/unchecked zs)) + (cdr/unchecked zs)) + (div2 flo1 z))) + + (define (abs x) + (cond ((fixnum? x) (if (fxnegative/unchecked? x) (fxneg/promote x) x)) + ((bignum? x) (if (bignum-negative? x) (bnneg x) x)) + ((flonum? x) (if (flonegative/unchecked? x) (floneg/unchecked x) + x)) + (else (not-number x)))) + + (define (floquotient/int/unchecked n1 n2) + (floquotient/unchecked (require-integer n1) (require-integer n2))) + + (define (floremainder/int/unchecked n1 n2) + (floremainder/unchecked (require-integer n1) (require-integer n2))) + + (define quotient/fixnum + (delay (dispatch-fixnum fxquotient/promote/unchecked bnquotient + floquotient/int/unchecked))) + + (define quotient/bignum + (delay (dispatch-bignum bnquotient floquotient/int/unchecked))) + + (define remainder/fixnum + (delay (dispatch-fixnum fxremainder/unchecked bnremainder + floremainder/int/unchecked))) + + (define remainder/bignum + (delay (dispatch-bignum bnremainder floremainder/int/unchecked))) + + (define (quotient n1 n2) + (cond ((fixnum? n2) + (if (fx=/unchecked n2 0) (zero-divisor) (quotient/fixnum n1 n2))) + ((bignum? n2) (quotient/bignum n1 n2)) + ((flonum? n2) + (if (flo=/unchecked n2 flo0) (zero-divisor) + (floquotient/int/unchecked (flonum/generic n1) n2))) + (else (not-integer n2)))) + + (define (remainder n1 n2) + (cond ((fixnum? n2) + (if (fx=/unchecked n2 0) (zero-divisor) (remainder/fixnum n1 n2))) + ((bignum? n2) (remainder/bignum n1 n2)) + ((flonum? n2) + (if (flo=/unchecked n2 flo0) (zero-divisor) + (floremainder/int/unchecked (flonum/generic n1) n2))) + (else (not-integer n2)))) + + (define (modulo n1 n2) + (let ((r (remainder n1 n2))) + (if (eq? (negative? r) (negative? n2)) r + (add2 r n2)))) + + (define (gcd2 n1 n2) + (if (= n2 0) n1 + (gcd2 n2 (remainder n1 n2)))) + + (define gcd* (delay (foldl gcd2))) + + (define (gcd . ns) + (if (pair? ns) (gcd* (car/unchecked ns) (cdr/unchecked ns)) 0)) + + (define (lcm . ns) + (if (pair? ns) + (quotient (mul* (car/unchecked ns) (cdr/unchecked ns)) + (gcd* (car/unchecked ns) (cdr/unchecked ns))) + 1)) + + ; numerator + ; denominator + + (define (floor x) + (cond ((flonum? x) (floor/unchecked x)) + ((exact? x) x) + (else (not-number x)))) + + (define (ceiling x) + (cond ((flonum? x) (ceiling/unchecked x)) + ((exact? x) x) + (else (not-number x)))) + + (define (truncate x) + (cond ((flonum? x) (truncate/unchecked x)) + ((exact? x) x) + (else (not-number x)))) + + (define (round x) + (cond ((flonum? x) (round/unchecked x)) + ((exact? x) x) + (else (not-number x)))) + + ; rationalize + + (define (irrational flo-op special-fx-arg special-result) + (lambda (z) + (cond ((flonum? z) (flo-op z)) + ((fixnum? z) (if (fx=/unchecked z special-fx-arg) special-result + (flo-op (flonum/unchecked z)))) + ((bignum? z) (flo-op (bignum->flonum z))) + (else (not-number z))))) + + (define atan1 (delay (irrational atan/unchecked 0 0))) + + (define (atan2 y x) + (cond ((flonum? y) (atan2/unchecked y (flonum/generic x))) + ((fixnum? y) (if (and (fx=/unchecked y 0) (positive? x)) 0 + (atan2/unchecked (flonum/unchecked y) + (flonum/generic x)))) + ((bignum? y) (atan2/unchecked (bignum->flonum y) + (flonum/generic x))) + (else (not-number y)))) + + (define (atan z . zs) + (cond ((null? zs) (atan1 z)) + ((null? (cdr/unchecked zs)) (atan2 z (car/unchecked zs))) + (else (too-many-args)))) + + ;; For exact base and non-negative fixnum power + (define (expt/exact/fixnum base power) + (do ((base base (* base base)) + (power power (fxshift power -1)) + (acc 1 (if (fx=/unchecked (fxand power 1) 1) (* acc base) acc))) + ((fx=/unchecked power 0) acc))) + + ;; For exact base and sign-ignored bignum power. XXX this seems a bit ridiculous to compute, but who am I to stop you from trying? + (define (expt/exact/bignum base power) + (do ((base base (* base base)) + (power (bignum-bit-stream power) (stream-tail power)) + (acc 1 (if (fx=/unchecked (stream-head power) 1) (* acc base) acc))) + ((null? power) acc))) + + ;; For positive flonum base, flonum power and result + (define (expt/inexact/+base base power) + (exp/unchecked (flo*/unchecked (log/unchecked base) power))) + + ;; For nonzero flonum base, flonum power and result + (define (expt/inexact base power) + (if (and (flonegative/unchecked? base) (integer? power)) + (if (odd? power) + (floneg/unchecked + (expt/inexact/+base (floneg/unchecked base) power)) + (expt/inexact/+base (floneg/unchecked base) power)) + ;; For non-integer power of negative base, proceed anyway to get NaN (unspecified, but consistent with sqrt) + (expt/inexact/+base base power))) + + ;; For fixnum power + (define (expt/fixnum base power) + (cond ((or (fixnum? base) (bignum? base)) + (if (fxnegative/unchecked? power) + (/ (expt/exact/fixnum base (fxneg/promote power))) + (expt/exact/fixnum base power))) + ((flonum? base) + (cond ((fx=/unchecked power 0) 1) + ((flo=/unchecked base flo0) flo0) + ((flonegative/unchecked? base) + (if (odd? power) + (floneg/unchecked + (expt/inexact/+base (floneg/unchecked base) + (flonum/unchecked power))) + (expt/inexact/+base (floneg/unchecked base) + (flonum/unchecked power)))) + (else (expt/inexact/+base base (flonum/unchecked power))))) + (else (not-number base)))) + + ;; For bignum power + (define (expt/bignum base power) + (cond ((fixnum? base) + (cond ((fx=/unchecked base 0) 0) + ((fx=/unchecked base 1) 1) + ((fx=/unchecked base -1) (if (odd? power) -1 1)) + ((bignum-negative? power) + (/ (expt/exact/bignum base power))) + (else (expt/exact/bignum base power)))) + ((bignum? base) + (cond ((bignum-negative? power) + (/ (expt/exact/bignum base power))) + (else (expt/exact/bignum base power)))) + ((flonum? base) + (cond ((flo=/unchecked base flo0) flo0) + ((flo=/unchecked base flo1) flo1) + ((flo=/unchecked base flo-1) (if (odd? power) flo-1 flo1)) + ((flonegative/unchecked? base) + (if (odd? power) + (floneg/unchecked + (expt/inexact/+base (floneg/unchecked base) + (bignum->flonum power))) + (expt/inexact/+base (floneg/unchecked base) + (bignum->flonum power)))) + (else (expt/inexact/+base base (bignum->flonum power))))) + (else (not-number base)))) + + ;; For flonum power (result could still be exact) + (define (expt/flonum base power) + (cond ((fixnum? base) + (cond ((flo=/unchecked power flo0) flo1) + ((fx=/unchecked base 0) flo0) + ;; ^ inexact due to the 0^0 case (a low-valued inexact power could be a zero with error) + ((fx=/unchecked base 1) 1) + (else (expt/inexact (flonum/unchecked base) power)))) + ((bignum? base) + (cond ((flo=/unchecked power flo0) flo1) + (else (expt/inexact (bignum->flonum base) power)))) + ((flonum? base) + (cond ((flo=/unchecked power flo0) flo1) + ((flo=/unchecked base flo0) flo0) + (else (expt/inexact base power)))) + (else (not-number base)))) + + (define (expt z1 z2) + (cond ((fixnum? z2) (expt/fixnum z1 z2)) + ((bignum? z2) (expt/bignum z1 z2)) + ((flonum? z2) (expt/flonum z1 z2)) + (else (not-number z2)))) + + ; make-rectangular + ; make-polar + ; real-part + ; imag-part + ; magnitude + ; angle + + (define (flonum->exact-int fraction exponent) + (let ((n (load-exponent/unchecked fraction *fixnum-width*-1)) + (exponent (fx-/unchecked exponent *fixnum-width*-1))) + (let ((word (flonum->fixnum/unchecked n))) + (if (fxnegative/unchecked? exponent) + (fxshift word exponent) + (let ((fraction (flo-/unchecked n (flonum/unchecked word)))) + (if (flo=/unchecked fraction flo0) + (* word (expt/exact/fixnum 2 exponent)) + (+ (* word (expt/exact/fixnum 2 exponent)) + (flonum->exact-int fraction exponent)))))))) + + (define (inexact->exact z) + (cond ((flonum? z) + (cond ((flo=/unchecked z flo0) 0) + ((inf/unchecked? z) (error "infinite flonum")) + ((integer? z) (call-with-values + (lambda () (fraction/exponent/unchecked z)) + flonum->exact-int)) + (else (error "non-integer flonum:" z)))) ;; includes NaN + ((exact? z) z) + (else (not-number z)))) + + ;;; 6.2.6 Numerical I/O + + (define (number->string z . args) + (define radix 10) + (if (pair? args) + (begin (if (pair? (cdr args)) (too-many-args)) + (set! radix (car args)) + (if (not (fixnum? radix)) + (if (inexact? radix) (not-exact-int radix) + (bad-radix radix))))) + (cond ((fixnum? z) + ((cond ((fx=/unchecked radix 10) fixnum->dec/unchecked) + ((fx=/unchecked radix 16) fixnum->hex/unchecked) + ((fx=/unchecked radix 8) fixnum->oct/unchecked) + ((fx=/unchecked radix 2) fixnum->bin/unchecked) + (else (bad-radix radix))) z)) + ((bignum? z) + ((cond ((fx=/unchecked radix 10) bignum->dec) + ((fx=/unchecked radix 16) bignum->hex) + ((fx=/unchecked radix 8) bignum->oct) + ((fx=/unchecked radix 2) bignum->bin) + (else (bad-radix radix))) z)) + ((flonum? z) + (cond ((fx=/unchecked radix 10) (flonum->dec/unchecked z)) + ((memv radix '(16 8 2)) + ;; XXX the spec is confusing here: seems to be allowed, but point and exponent notation are only allowed in decimal radix. So inexact integers could pass the round-trip test e.g. "#iabcdef000000000", but ew. + (error "non-decimal radix for inexact number")) + (else (bad-radix radix)))) + (else (not-number z)))) + + (define (string->number s . args) + ;; XXX what to do if both #e prefix and point/exponent/#? + (define fail #f) + (define acc #f) + + (define radix 10) + (define exact #t) + (define got-radix #f) + (define got-exact #f) + (define (set-radix! r) + (if got-radix (fail)) (set! got-radix #t) (set! radix r)) + (define (set-exact! e) + (if got-exact (fail)) (set! got-exact #t) (set! exact e)) + + (define pos 0) + (define len #f) + (define (peek) (string-ref s pos)) + (define (next!) + (let ((c (string-ref s pos))) + (set! pos (fx+/unchecked pos 1)) c)) + (define (end?) (fx=/unchecked pos len)) + + (define (char->digit c) + ;; Simultaneous ASCII digit conversion and range checking with memoized radix specialization. + (set! char->digit + (cond ((fx=/unchecked radix 2) + (lambda (c) + (cond ((charinteger #\0) -> 48 + ((char<=? c #\1) (fx-/unchecked (char->integer c) 48)) + (else #f)))) + ((fx=/unchecked radix 8) + (lambda (c) + (cond ((charinteger c) 48)) + (else #f)))) + ((fx=/unchecked radix 10) + (lambda (c) + (cond ((charinteger c) 48)) + (else #f)))) + ((fx=/unchecked radix 16) + (lambda (c) + (set! c (char->integer (char-upcase c))) + ;; (- (char->integer #\A) 10) -> 55 + (set! c (fx-/unchecked c (if (fxdigit c)) + + (define () + (cond ((end?) (fail)) + ((char=? (peek) #\#) + (next!) + (if (end?) (fail)) + (case (char-downcase (next!)) + ((#\e) (set-exact! #t)) + ((#\i) (set-exact! #f)) + ((#\b) (set-radix! 2)) + ((#\o) (set-radix! 8)) + ((#\d) (set-radix! 10)) + ((#\x) (set-radix! 16)) + (else (fail))) + ()) + (else ()))) + + (define () + (let ((r ())) + (if (end?) r (fail)))) ;; stub + + (define () + (let ((proc (if (fx=/unchecked radix 10) ))) + (cond ((end?) (fail)) + ((char=? (peek) #\+) (next!) (proc)) + ((char=? (peek) #\-) (next!) (- (proc))) + (else (proc))))) + + (define () + (if (end?) (fail)) + (let ((d (char->digit (next!)))) + (cond (d (set! acc (if exact d (flonum/unchecked d))) + (ureal-pow2-loop)) + (else (fail))))) + (define (ureal-pow2-loop) + (if (end?) acc + (let* ((c (peek)) (d (char->digit c))) + ;; XXX naive algorithm + (cond (d (next!) + (set! acc (+ (* acc radix) d)) + (ureal-pow2-loop)) + ((char=? c #\#) (set! acc (flonum/generic acc)) + (sharp-loop)) + ((char=? c #\/) (next!) + (fraction)) + (else acc))))) + + (define () + (if (end?) (fail)) + (let* ((c (next!)) (d (char->digit c))) + (cond (d (set! acc (if exact d (flonum/unchecked d))) + (ureal-10-loop)) + ((char=? c #\.) (initial-point)) + (else (fail))))) + (define (ureal-10-loop) + (if (end?) acc + (let* ((c (peek)) (d (char->digit c))) + ;; XXX naive algorithm + (cond (d (next!) + (set! acc (+ (* acc 10) d)) + (ureal-10-loop)) + ((char=? c #\.) (next!) + (set! acc (flonum/generic acc)) + (point-loop flo1/10)) + ((char=? c #\#) (set! acc (flonum/generic acc)) + (sharp-loop)) + ((char=? c #\/) (next!) (fraction)) + (else ()))))) + + (define (sharp-loop) + (cond ((end?) acc) + ((char=? (peek) #\#) (next!) + (set! acc (* radix acc)) + (sharp-loop)) + ((char=? (peek) #\.) (next!) + (if (fx=/unchecked radix 10) + (point-sharp-loop) (fail))) + ((char=? (peek) #\/) (next!) + (fraction)) + (else acc))) + + (define (initial-point) + (if (end?) (fail)) + (let ((d (char->digit (next!)))) + (cond (d (set! acc (flo*/unchecked (flonum/unchecked d) flo1/10)) + (point-loop (flo*/unchecked flo1/10 flo1/10))) + (else (fail))))) + (define (point-loop place-val) + (if (end?) acc + (let ((d (char->digit (peek)))) + ;; XXX naive algorithm + (cond (d (next!) + (set! acc (flo+/unchecked + acc (flo*/unchecked place-val + (flonum/unchecked d)))) + (point-loop (flo*/unchecked place-val flo1/10))) + (else (point-sharp-loop)))))) + (define (point-sharp-loop) + (cond ((end?) acc) + ((char=? (peek) #\#) (next!) (point-sharp-loop)) + (else ()))) + + (define () + (cond ((end?) acc) + ((memv (char-downcase (peek)) '(#\e #\s #\f #\d #\l)) + ;; all markers equivalent as we only have one float precision + (next!) (exponent-sign)) + (else acc))) + + (define (fraction) + (fail)) ;; stub + + (define (exponent-sign) + (define (exponent negative) + (if (end?) (fail) + (let ((d (char->digit (next!)))) + (if d (let ((e (loop (flonum/unchecked d)))) + (flo*/unchecked + (flonum/generic acc) + (exp10/flonum (if negative (floneg/unchecked e) e)))) + (fail))))) + (define (loop e) + (if (end?) e + (let ((d (char->digit (peek)))) + (if d (begin (next!) (loop (flo+/unchecked + (flo*/unchecked e flo10) + (flonum/unchecked d)))) + e)))) + (cond ((end?) (fail)) + ((char=? (peek) #\+) (next!) (exponent #f)) + ((char=? (peek) #\-) (next!) (exponent #t)) + (else (exponent #f)))) + + (set! len (string-length s)) + (if (pair? args) + (if (pair? (cdr args)) (too-many-args) + (begin (set! radix (car args)) + (if (memv radix '(2 8 10 16)) + (if (not (fixnum? radix)) (not-exact-int radix)) + (bad-radix radix))))) + (call/cc (lambda (return) (set! fail (lambda () (return #f))) + ()))) + ;; end of string->number (whew!) + + ;;; 6.3.2 Pairs and lists + + (define (list . args) args) + + ;; not tail recursive + (define (append . args) + (if (pair? args) + (let loop ((lists args)) + (let ((next (cdr/unchecked lists))) + (if (pair? next) + (append2 (car/unchecked lists) (loop next)) + (car/unchecked lists)))) + '())) + + (define (reverse l) + (let loop ((accum '()) (l l)) + (cond ((pair? l) (loop (cons (car/unchecked l) accum) + (cdr/unchecked l))) + ((null? l) accum) + (else (not-list))))) + + (define (list-tail l k) + (check-exact-non-negative-int k) + (do ((k k (- k 1)) (l l (cdr l))) ((zero? k) l))) + + (define (list-ref l k) (car (list-tail l k))) + + (define (memq x l) + (cond ((pair? l) (if (eq? x (car/unchecked l)) l + (memq x (cdr/unchecked l)))) + ((null? l) #f) + (else (not-list)))) + + (define (memv x l) + (cond ((pair? l) (if (eqv? x (car/unchecked l)) l + (memv x (cdr/unchecked l)))) + ((null? l) #f) + (else (not-list)))) + + (define (member x l) + (cond ((pair? l) (if (equal? x (car/unchecked l)) l + (member x (cdr/unchecked l)))) + ((null? l) #f) + (else (not-list)))) + + (define (assq x l) + (cond ((pair? l) (if (eq? x (caar l)) (car/unchecked l) + (assq x (cdr/unchecked l)))) + ((null? l) #f) + (else (not-list)))) + + (define (assv x l) + (cond ((pair? l) (if (eqv? x (caar l)) (car/unchecked l) + (assv x (cdr/unchecked l)))) + ((null? l) #f) + (else (not-list)))) + + (define (assoc x l) + (cond ((pair? l) (if (equal? x (caar l)) (car/unchecked l) + (assoc x (cdr/unchecked l)))) + ((null? l) #f) + (else (not-list)))) + + ;;; 6.3.5 Strings + + (define (string . args) (list->string args)) + + (define (string->list s) + (do ((k (fx+/unchecked (string-length s) -1) (fx+/unchecked k -1)) + (accum '() (cons (string-ref/unchecked s k) accum))) + ((fx=/unchecked k -1) accum))) + + ;;; 6.3.6 Vectors + + (define (vector . args) (list->vector args)) + + (define (vector->list v) + (do ((k (fx-/unchecked (vector-length v) 1) (fx-/unchecked k 1)) + (accum '() (cons (vector-ref/unchecked v k) accum))) + ((fx=/unchecked k -1) accum))) + + ;;; 6.4 Control Features + + (define (apply proc . args) + (if (pair? args) + (let ((p2 (cdr/unchecked args))) + (if (pair? p2) + (do ((p1 args p2) + (p2 p2 p3) + (p3 (cdr/unchecked p2) (cdr/unchecked p3))) + ((null? p3) + (set-cdr/unchecked! p1 (copy-list (car/unchecked p2))))) + (set! args (copy-list (car/unchecked args))))) + (error "missing argument list")) + (apply/unchecked proc args)) + + (define all-null? (delay (all null?))) + (define all-pair? (delay (all pair?))) + (define car*unchecked (delay (map1 car/unchecked))) + (define cdr*unchecked (delay (map1 cdr/unchecked))) + + ;; not tail recursive + (define (map proc list1 . lists) + (define (loop lists) + (cond ((pair? (car/unchecked lists)) + (if (all-pair? (cdr/unchecked lists)) + (cons (apply/unchecked proc (car*unchecked lists)) + (loop (cdr*unchecked lists))) + (uneven-lists))) + ((null? (car/unchecked lists)) + (if (all-null? (cdr/unchecked lists)) '() (uneven-lists))) + (else (not-list)))) + (loop (cons list1 lists))) + + (define (for-each proc list1 . lists) + (define (loop lists) + (cond ((pair? (car/unchecked lists)) + (if (not (all-pair? (cdr/unchecked lists))) (uneven-lists)) + (apply/unchecked proc (car*unchecked lists)) + (loop (cdr*unchecked lists))) + ((null? (car/unchecked lists)) + (if (all-null? (cdr/unchecked lists)) '() + (uneven-lists))) + (else (not-list)))) + (loop (cons list1 lists))) + + (define (dynamic-wind before thunk after) + (call/cc (lambda (return) + (push-winding! before after) + (before) + (call-with-values thunk return)))) + + ;;; 6.6.1 Ports + + (define (call-with-input-file filename proc) + (let ((port (open-input-file filename))) + (saving-values (lambda () (proc port)) + (lambda () (close-input-port port))))) + + (define (call-with-output-file filename proc . options) + (let ((port (apply open-output-file filename options))) + (saving-values (lambda () (proc port)) + (lambda () (close-output-port port))))) + + (define (with-input-from-file filename proc) + (let ((saved (current-input-port)) + (port (open-input-file filename))) + (saving-values + (lambda () (dynamic-wind (lambda () (set-input-port! port)) + proc + (lambda () (set-input-port! saved)))) + (lambda () (close-input-port port))))) + + (define (with-output-to-file filename proc . options) + (let ((saved (current-output-port)) + (port (apply open-output-file filename options))) + (saving-values + (lambda () (dynamic-wind (lambda () (set-output-port! port)) + proc + (lambda () (set-output-port! saved)))) + (lambda () (close-output-port port))))) + + ;;; 6.6.2 Input + + ;; Reader sentinels: newly allocated so not EQ? to any readable object (not sure if this would be guaranteed with a quoted pair...) + (define rd-close-paren (cons 'special 'close-paren)) + (define rd-dot (cons 'special 'dot)) + + (define (read . port) + (define (eof-list) (error "unexpected end-of-file in list")) + (define (read-datum) + (let ((t (apply/unchecked read-token port))) + (if (eof-object? t) t + (case (car t) + ((identifier) (string->symbol (cdr t))) + ((open-paren) (build-list (read-datum))) + ((close-paren) rd-close-paren) + ((literal) (cdr t)) + ((number) (or (string->number (cdr t)) + (error "bad number token:" (cdr t)))) + ((abbrev) (abbrev (cdr t) (read-datum))) + ((dot) rd-dot) + ((open-vector) (build-vector '() (read-datum))) + ((named-char) (named-char (cdr t))) + (else (error "BUG: unmatched token type")))))) + (define (build-list d) + (cond ((eq? d rd-close-paren) '()) + ((eq? d rd-dot) (error "dotted list without first item")) + ((eof-object? d) (eof-list)) + (else (extend-list (list d))))) + (define (extend-list head) + (define tail head) + (define (loop d) + (cond ((eq? d rd-close-paren) head) + ((eq? d rd-dot) (improper (read-datum))) + ((eof-object? d) (eof-list)) + (else (set-cdr! tail (list d)) + (set! tail (cdr tail)) + (loop (read-datum))))) + (define (improper d) + (cond ((eof-object? d) (eof-list)) + ((eq? d rd-close-paren) + (error "dotted list without last item")) + ((eq? d rd-dot) (error "extra dot in dotted list"))) + (let ((t (apply/unchecked read-token port))) + (cond ((eof-object? t) (eof-list)) + ((eq? (car t) 'close-paren) (set-cdr! tail d) head) + (else (error "excess item in tail of dotted list"))))) + (loop (read-datum))) + (define (abbrev sym d) + (cond ((eof-object? d) + (error "unexpected end-of-file in abbreviation")) + ((eq? d rd-close-paren) + (error "unexpected close-paren in abbreviation")) + ((eq? d rd-dot) + (error "unexpected dot in abbreviation"))) + (list sym d)) + (define (build-vector acc d) + (cond ((eof-object? d) (error "unexpected end-of-file in vector")) + ((eq? d rd-close-paren) (reverse-list->vector/unchecked acc)) + ((eq? d rd-dot) (error "unexpected dot in vector")) + (else (build-vector (cons d acc) (read-datum))))) + (define (named-char n) + (integer->char + (cond ((string=? n "space") 32) + ((string=? n "newline") 10) + (else (error "bad character name:" n))))) + (let ((d (read-datum))) + (cond ((eq? d rd-close-paren) (error "unexpected close-paren")) + ((eq? d rd-dot) (error "unexpected dot")) + (else d)))) + + ;;; 6.6.3 Output + + (define char-printer #f) + (define string-printer #f) + + (define environment-specs + `((,(null-environment 5) "NULL") + (,(scheme-report-environment 5) "SCHEME-REPORT") + (,(gales-scheme-environment) "GALES-SCHEME") + (,(interaction-environment) "INTERACTION") + (,(toplevel-environment) "TOPLEVEL"))) + + (define (write-char/quoted c) + (write-string "#\\") + (case (char->integer c) + ((32) (write-string "space")) + ((10) (write-string "newline")) + (else (write-char c)))) + + (define (print obj) + (cond ((pair? obj) + (let ((head (car/unchecked obj)) + (tail (cdr/unchecked obj))) + (cond ((and (symbol? head) + (pair? tail) + (null? (cdr/unchecked tail)) + (assq head '((quote "'") + (quasiquote "`") + (unquote ",") + (unquote-splicing ",@")))) + => (lambda (r) + (write-string (cadr r)) + (print (car/unchecked tail)))) + (else (write-char #\() + (let loop ((head head) (tail tail)) + (print head) + (cond ((pair? tail) + (write-char sp) + (loop (car/unchecked tail) + (cdr/unchecked tail))) + ((null? tail)) + (else (write-string " . ") + (print tail)))) + (write-char #\)))))) + ((null? obj) (write-string "()")) + ((eq? obj #t) (write-string "#t")) + ((eq? obj #f) (write-string "#f")) + ((char? obj) (char-printer obj)) + ((number? obj) (write-string (number->string obj))) + ((string? obj) (string-printer obj)) + ((symbol? obj) (write-string obj)) + ((vector? obj) (write-string "#(") + (let ((len (vector-length obj))) + (if (> len 0) + (begin + (print (vector-ref obj 0)) + (let loop ((k 1)) + (if (< k len) + (begin (write-char sp) + (print (vector-ref obj k)) + (loop (+ k 1)))))))) + (write-char #\))) + ((builtin? obj) (write-string "#BUILTIN:") + (write-string (builtin-name obj))) + ((continuation? obj) (write-string "#CONTINUATION")) + ((procedure? obj) (write-string "#PROCEDURE")) + ((promise? obj) (write-string "#PROMISE")) + ((input-port? obj) (write-string "#INPUT-PORT")) + ((output-port? obj) (write-string "#OUTPUT-PORT")) + ((eof-object? obj) (write-string "#EOF")) + ((assq obj environment-specs) + => (lambda (r) + (write-string "#ENVSPEC:") (write-string (cadr r)))) + (else (error "BUG: unmatched type")))) + + (define (with-output-port-option thunk args) + (if (pair? args) + (if (null? (cdr/unchecked args)) + (let ((port (car/unchecked args)) + (saved (current-output-port))) + (if (output-port? port) + (dynamic-wind (lambda () (set-output-port! port)) thunk + (lambda () (set-output-port! saved))) + (not-output-port port))) + (too-many-args)) + (thunk))) + + (define (write obj . args) + (set! char-printer write-char/quoted) + (set! string-printer write-string/quoted) + (with-output-port-option (lambda () (print obj)) args)) + + (define (display obj . args) + (set! char-printer write-char) + (set! string-printer write-string) + (with-output-port-option (lambda () (print obj)) args)) + + (define (newline . args) + (apply/unchecked write-char (cons nl args))) + + ;;; 6.6.4 System Interface + + (define (load filename) (call-with-input-file filename exec-from-port)) + + ;;; Extensions + + (define (set-error-handler! p) + (set! error-handler (require-procedure p)) + ;; (re)register hook for internal errors + (set-error-continuation! error-cont)) + + (define (exit . args) + (let ((status (opt-args args 0))) + (if (and (exact? status) (integer? status)) + (exit-cont (modulo status 256)) + ;; ^ ensures fixnum, and seems to be what unix does anyway + (not-exact-int status)))) + + ;;; Lispy socket interface based on internal Unixy one. + ;; + ;; A socket object is a dispatch a-list of methods and constants. This object-oriented design avoids exposing bare file descriptors while allowing for more esoteric methods to be added in the future if needed. Unnecessary mutators are avoided, and stream sockets are not conflated with listeners. + ;; + ;; Explicitly closing one of the directional ports of a stream socket is interpreted as a promise that no further data will be read/written, including through other aliases of the underlying socket. Thus it is a half-shutdown visible to the peer as well as any external processes sharing the socket (e.g. if a fork extension is implemented). When closing listeners or datagram sockets shared by external processes, the usual Unix reference counting semantics apply. + ;; + ;; Host and service lookup is intended to be implemented in Scheme, at a higher layer. (That is, open-* won't do implicit lookups when the address looks like a name.) + + ;;; Internal constructors for general socket types + + (define (make-socket fd) + `((type socket) + (address ,(getsockname fd)))) ;; immutable + + (define (make-stream-socket fd) + (receive (i o) (socket-ports fd) + `((type stream-socket) + (input-port ,i) + (output-port ,o) + (peer-address ,(getpeername fd)) ;; immutable + . ,(make-socket fd)))) + + (define (make-listener fd) + `((type listener) + (accept ,(lambda () (make-stream-socket (accept fd)))) + (close ,(lambda () (close fd))) + . ,(make-socket fd))) + + (define (make-dgram-socket fd) + `((type datagram-socket) + (send ,(lambda (address msg) (sendto fd address msg))) + (receive ,(lambda () (recvfrom fd))) + (close ,(lambda () (close fd))) + . ,(make-socket fd))) + + ;;; Specific constructors + + (define (open-tcp-connection address . args) + (let* ((bind-address (opt-args args #f)) + (fd (if bind-address (inet-stream-socket bind-address) + (inet-stream-socket)))) + (connect-inet fd address) + (make-stream-socket fd))) + + (define (open-tcp-listener backlog . args) + (let* ((address (opt-args args #f)) + (fd (if address (inet-stream-socket address) + (inet-stream-socket)))) + (listen fd backlog) + (make-listener fd))) + + (define (open-udp-socket . args) + (let* ((address (opt-args args #f)) + (fd (if address (inet-dgram-socket address) + (inet-dgram-socket)))) + (make-dgram-socket fd))) + + (define (open-unix-connection address . args) + (let* ((bind-address (opt-args args #f)) + (fd (if bind-address (unix-stream-socket bind-address) + (unix-stream-socket)))) + (connect-unix fd address) + (make-stream-socket fd))) + + (define (open-unix-listener backlog address) + ;; address required as Unix listeners can't bind to an automatic path + (let ((fd (unix-stream-socket address))) + (listen fd backlog) + (make-listener fd))) + + (define (open-unix-datagram-socket . args) + (let* ((address (opt-args args #f)) + (fd (if address (unix-dgram-socket address) + (unix-dgram-socket)))) + (make-dgram-socket fd))) + + ;;; Higher-level wrappers + + (define (: object field) (cadr (assq field object))) + + (define (close-io i o) (close-input-port i) (close-output-port o)) + + (define (call-with-connection sock proc) + (let ((i (: sock 'input-port)) + (o (: sock 'output-port))) + (saving-values (lambda () (proc i o)) + (lambda () (close-io i o))))) + + (define (sequential-server listener handler) + (let ((accept (: listener 'accept))) + (let loop () + (let ((sock (accept))) + (let ((i (: sock 'input-port)) + (o (: sock 'output-port))) + (let ((r (handler i o (: sock 'peer-address)))) + (close-io i o) + (if r (loop) ((: listener 'close))))))))) + + (define (call-with-tcp-connection address proc) + ;; parse IP / resolve name here... + (call-with-connection (open-tcp-connection address) proc)) + + (define (call-with-unix-connection address proc) + (call-with-connection (open-unix-connection address) proc)) + + (define (sequential-tcp-server backlog address handler) + (sequential-server (open-tcp-listener backlog address) handler)) + + (define (sequential-unix-server backlog address handler) + (sequential-server (open-unix-listener backlog address) handler)) + + ;;; End of internal definitions; apply deferred initializations (implicit forcing or R6's letrec* semantics could avoid this) + + (set! flo-bn-radix (force flo-bn-radix)) + (set! flo-log-bn-radix (force flo-log-bn-radix)) + (set! =* (force =*)) + (set! <* (force <*)) + (set! >* (force >*)) + (set! <=* (force <=*)) + (set! >=* (force >=*)) + (set! max* (force max*)) + (set! min* (force min*)) + (set! add2 (force add2)) + (set! mul2 (force mul2)) + (set! sub/bignum (force sub/bignum)) + (set! add* (force add*)) + (set! mul* (force mul*)) + (set! sub* (force sub*)) + (set! div* (force div*)) + (set! quotient/fixnum (force quotient/fixnum)) + (set! quotient/bignum (force quotient/bignum)) + (set! remainder/fixnum (force remainder/fixnum)) + (set! remainder/bignum (force remainder/bignum)) + (set! gcd* (force gcd*)) + (set! atan1 (force atan1)) + (set! all-null? (force all-null?)) + (set! all-pair? (force all-pair?)) + (set! car*unchecked (force car*unchecked)) + (set! cdr*unchecked (force cdr*unchecked)) + (set! dec-chunk-digits (force dec-chunk-digits)) + (set! dec-chunk-radix (force dec-chunk-radix)) + (set! hex-chunk-digits (force hex-chunk-digits)) + (set! oct-chunk-digits (force oct-chunk-digits)) + (set! hex-chunk-bits (force hex-chunk-bits)) + (set! oct-chunk-bits (force oct-chunk-bits)) + + (for-each + (lambda (rec) (define-traced define-r5rs (car rec) (cadr rec))) + `((eqv? ,eqv?) + (equal? ,equal?) + (= ,=) + (< ,<) + (> ,>) + (<= ,<=) + (>= ,>=) + (zero? ,zero?) + (positive? ,positive?) + (negative? ,negative?) + (odd? ,odd?) + (even? ,even?) + (max ,max) + (min ,min) + (+ ,+) + (* ,*) + (- ,-) + (/ ,/) + (abs ,abs) + (quotient ,quotient) + (remainder ,remainder) + (modulo ,modulo) + (gcd ,gcd) + (lcm ,lcm) + (floor ,floor) + (ceiling ,ceiling) + (truncate ,truncate) + (round ,round) + (exp ,(irrational exp/unchecked 0 1)) + (log ,(irrational log/unchecked 1 0)) + (sin ,(irrational sin/unchecked 0 0)) + (cos ,(irrational cos/unchecked 0 1)) + (tan ,(irrational tan/unchecked 0 0)) + (asin ,(irrational asin/unchecked 0 0)) + (acos ,(irrational acos/unchecked 1 0)) + (atan ,atan) + (sqrt ,(irrational sqrt/unchecked 0 0)) + ;; ^ could be refined to handle exact perfect squares, or bignums with greater precision and range + (expt ,expt) + (exact->inexact ,flonum/generic) + (inexact->exact ,inexact->exact) + (number->string ,number->string) + (string->number ,string->number) + (list ,list) + (append ,append) + (reverse ,reverse) + (list-tail ,list-tail) + (list-ref ,list-ref) + (memq ,memq) + (memv ,memv) + (member ,member) + (assq ,assq) + (assv ,assv) + (assoc ,assoc) + (string ,string) + (string->list ,string->list) + (vector ,vector) + (vector->list ,vector->list) + (apply ,apply) + (map ,map) + (for-each ,for-each) + (dynamic-wind ,dynamic-wind) + (call-with-input-file ,call-with-input-file) + (call-with-output-file ,call-with-output-file) + (with-input-from-file ,with-input-from-file) + (with-output-to-file ,with-output-to-file) + (read ,read) + (write ,write) + (display ,display) + (newline ,newline) + (load ,load))) + + (for-each + (lambda (rec) (define-traced define-gscm (car rec) (cadr rec))) + `((set-error-handler! ,set-error-handler!) + (exit ,exit) + (open-tcp-connection ,open-tcp-connection) + (open-tcp-listener ,open-tcp-listener) + (open-udp-socket ,open-udp-socket) + (open-unix-connection ,open-unix-connection) + (open-unix-listener ,open-unix-listener) + (open-unix-datagram-socket ,open-unix-datagram-socket) + (call-with-tcp-connection ,call-with-tcp-connection) + (call-with-unix-connection ,call-with-unix-connection) + (sequential-tcp-server ,sequential-tcp-server) + (sequential-unix-server ,sequential-unix-server))) + + (cond ((call/cc (lambda (c) (set! error-cont c) #f)) + => (lambda (args) + (let ((h error-handler) + (msg (car args)) + (args (cdr args))) + (set! error-handler print-error) + (h msg args (get-trace-log))) + 1)) + ((call/cc (lambda (c) (set! exit-cont c) #f))) + (else (set-error-handler! print-error) + (define-traced + define-gscm 'error + (lambda (msg . args) + (if (string? msg) (error-cont (cons msg args)) + (error "not a string:" msg)))) + (cond ((member *args* '(() ("--"))) + (call/cc (lambda (c) (set! repl-cont c))) + (set-error-handler! repl-error) + (repl)) + ((string=? (car *args*) "-") + (exec-from-port (current-input-port))) + ((string=? (car *args*) "--") + (set! *args* (cdr *args*)) + (eval `(set! *args* ',*args*) intr-env) + (load (car *args*))) + (else (load (car *args*)))) + 0)))) diff -uNr a/gscm/manifest b/gscm/manifest --- a/gscm/manifest false +++ b/gscm/manifest aad7baf2787f4cf9aa83ff22cc23eb2946010bf6e60df096b1ea36982a9e9a0ee2da4e3f5cf4da5d7e719263845662a2dd639ab5cde272aa6ab05967b8a50eee @@ -0,0 +1 @@ +648626 gscm_subdir_genesis jfw The Gales Scheme interpreter. (Reground from gscm_genesis to follow the project subdir and manifest naming conventions.) diff -uNr a/gscm/package/README b/gscm/package/README --- a/gscm/package/README false +++ b/gscm/package/README a331b4a6465eb256b65d809a577b3d158e20f44d9052408322e32ea234b8ea02664c26b255ffaa51c89fcde1556d5b8cdf42054eca2595fa1ee189cacfebd1a7 @@ -0,0 +1,51 @@ +About +----- + +This is Gales Scheme: a Scheme interpreter for Unix striving for simplicity, soundness, minimal artificial restrictions, and R5RS compliance with strict error checking. + +Written by Jacob Welsh for JWRD Computing. + +Prerequisites +------------- + +gcc, targetting an x86_64 POSIX environment. Linux and OpenBSD are known to work. Other ISAs are supportable in principle but there's presently some assembly math code lacking portable equivalent. + +Installation +------------ + +This software sheds some historical Unix conventions in favor of Bernstein's /package scheme ( http://cr.yp.to/slashpackage.html ), which I find meshes fairly well with the ideas of V. If this is your first time using it, you may need to create the /package directory and add /command to your shell's PATH variable. + +1. Press or otherwise install the tree at the path: + + /package/gscm-0.40.3 + +The installation path is not configurable. This amounts to a claim on the global namespace, as command names always do. People and programs can count on finding components at known paths if they are to be found at all, as surely as with /bin/sh. This doesn't mean the files must physically exist under /package; for example, a symlink from there could ease working on the code as a non-root user. + +2. From the above directory, run (as root): + + sh package/install + +3. Run the test suite if you like: + + sh package/check + +Known failures at present include, from tests/numbers.scm: + + (fail inexact-read-precision) : float formatting and parsing algorithms are imprecise. + (fail 1 1 expt13) : (expt 1 -1) returns an inexact (float) result when it could be exact. + +From tests/semantic-bad.scm: + + (fail unary-add-type bad-type) + (fail unary-mul-type bad-type) : the single-argument passthrough base case for + and * doesn't bother checking for numbers. + +Usage notes +----------- + +See "gscm -h" for CLI options. + +The main shortfalls with respect to R5RS are the macro language and advanced numeric types (rational and complex). Arbitrary precision (bignum) integer arithmetic is supported. + +A number of extensions are provided including system interface, sockets, and fixnum arithmetic; see doc/extensions.txt for (incomplete) details. + +Error recovery, debugging, and concurrency support are known weaknesses (no help from the standard here). diff -uNr a/gscm/package/check b/gscm/package/check --- a/gscm/package/check false +++ b/gscm/package/check b09d2304d0a13f776e774c44ee2864ccf37bb67fd162403bb73705a11ac2d5991b31c43393f7084987ae90b05ca2b6bea0eff8d0ed6cdf2629392c6e2379529a @@ -0,0 +1,7 @@ +#!/bin/sh + +cd tests || exit +for f in compiler control extensions library numbers storage test-errors ; do + printf '\n; %s\n' $f.scm + ../command/gscm $f.scm +done diff -uNr a/gscm/package/commands b/gscm/package/commands --- a/gscm/package/commands false +++ b/gscm/package/commands ae63134a7fc5e8fdbb79ff861420abae073abae0f1ce8180ccfc2b740dd9dfa9c97e8f9dddd4aead0b49c2a9275106bbc041d185f40eade6082d97f0778ae1ce @@ -0,0 +1 @@ +gscm diff -uNr a/gscm/package/install b/gscm/package/install --- a/gscm/package/install false +++ b/gscm/package/install 6f74c7e4e8a0b2923a429be3a3159cf5fda2b490aab33d8d3abd498e8c5fcb7494b917b9931fb37d6ea55ca1a4850726f33054f813b708349678e6010ae61b97 @@ -0,0 +1,25 @@ +#!/bin/sh +set -e + +P=gscm +V=0.40.3 +cd /package/$P-$V + +# Versioned path duplicated in: +# package/README +# src/Makefile + +make -C src + +echo "Creating symlink $P -> $P-$V" +rm -f /package/$P'{new}' +ln -s $P-$V /package/$P'{new}' +mv -f /package/$P'{new}' /package/$P + +echo 'Making command links in /command' +mkdir -p /command +for i in `cat package/commands` ; do + rm -f /command/$i'{new}' + ln -s /package/$P/command/$i /command/$i'{new}' + mv -f /command/$i'{new}' /command/$i +done diff -uNr a/gscm/src/Makefile b/gscm/src/Makefile --- a/gscm/src/Makefile false +++ b/gscm/src/Makefile b316459fdeb9d43d3a59e816d81e7756dd91da9b975a64cdee4fade1194a852d0ae690fae0def63e8cc14b7cdeb70bf1679e55af4742a58a1c81a375c2eb18aa @@ -0,0 +1,19 @@ +PREFIX := /package/gscm-0.40.3 +ASM_ARCH := x86_64 + +CFLAGS := -std=c99 -pedantic -Wall -Wextra -Winit-self -Wstrict-aliasing=1 -g -O2 +CPPFLAGS := -DGSCMLIB=\"$(PREFIX)/library\" + +# Comment to enable internal assertions (~2x slowdown) +CPPFLAGS += -DNDEBUG + +LDLIBS := -lm + +../command/gscm: gscm.o main.o asm-$(ASM_ARCH).o + $(CC) -o $@ $(LDFLAGS) $^ $(LDLIBS) + +clean: + rm -f ../command/gscm *.o + +gscm.o: gscm.c gscm.h +main.o: main.c gscm.h diff -uNr a/gscm/src/TODO b/gscm/src/TODO --- a/gscm/src/TODO false +++ b/gscm/src/TODO 47cf7fc661b15a4f8edeed6777c53c1c6c119c5b88ef2f414865e950090b0c8fdc1062f40916325c57c6a63c053ecb4dd3ea8d83615c5c5739b2128a08de98aa @@ -0,0 +1,20 @@ +* Bootstrapping + * allow reloading library + * library compiler + * replace lazy var refs + * builtin syntax refs +* Builtin sanity: + * Table: slot, name, min args, max args or -1 +* finish bignums + * Karatsuba +* finish builtins + * numeric I/O: round-trip float conversion + * finish sc_div_extended in asm-generic.c + * rational/complex +* GC symbols and file descriptors +* R5RS hygienic macros, let-syntax +* syntax/variable shadowing +* track open output ports; auto-flush on timer +* deal with SIGPIPE +* more EINTR cases e.g. connect() +* sendto, recvfrom, tests for datagram sockets diff -uNr a/gscm/src/asm-generic.c b/gscm/src/asm-generic.c --- a/gscm/src/asm-generic.c false +++ b/gscm/src/asm-generic.c c7df888bb445203fcb722c87a858a11e10281f643f40792a57159c59c868ee2317e6c7b8cf69bff576f1c4d8387e34ca79d89fd4fde9b1e1692699f0a573365c @@ -0,0 +1,340 @@ +#include + +#if __SIZEOF_POINTER__ == 4 +typedef uint32_t full; +typedef int32_t sfull; +typedef uint16_t half; + +#elif __SIZEOF_POINTER__ == 8 +typedef uint64_t full; +typedef int64_t sfull; +typedef uint32_t half; + +#else +#error Unsupported word size +#endif + +#define HALF_BITS (8*sizeof(half)) +#define FULL_BITS (8*sizeof(full)) + +/* Low half of word, unsigned */ +#define LO(n) ((full)((half)(n))) +/* High half of word, unsigned */ +#define HI(n) ((n)>>HALF_BITS) +/* High half of word, sign extended (casting back to unsigned because overflow + * on signed types is Undefined Behavior) */ +#define S_HI(n) ((full)(((sfull)(n))>>HALF_BITS)) + +/** sc_wide_mul + * + * Computes the full (double-word) product of two unsigned words, returning the + * low word through the first argument and the high word through the second. + * + * The algorithm works by decomposition into half-word products. To verify, + * consider the variables arranged much as in schoolbook multiplication, each + * "digit" here being a half-word: + * + * ah al + * x bh bl + * --------------- + * c + * p1h p1l + * p2h p2l + * p3h p3l + * p4h p4l + * \_____/ \_____/ + * high / low outputs + */ + +void sc_wide_mul(full *a, full *b) { + full ah=HI(*a), al=LO(*a), bh=HI(*b), bl=LO(*b), + p1=al*bl, p2=ah*bl, p3=al*bh, p4=ah*bh, /* no overflow */ + t=LO(p2) + LO(p3), /* no overflow */ + c=HI(HI(p1) + t); /* no overflow */ + *a = p1 + (t<> (FULL_BITS-1), + a_sign_ext = ((sfull)(*a)) >> (FULL_BITS-1), + b_sign_bit = (*b) >> (FULL_BITS-1), + b_sign_ext = ((sfull)(*b)) >> (FULL_BITS-1), + p_sign_bit = a_sign_bit ^ b_sign_bit, + p_sign_ext = a_sign_ext ^ b_sign_ext, + tmp; + /* Conditional two's complement based on sign */ + *a = (*a ^ a_sign_ext) + a_sign_bit; + *b = (*b ^ b_sign_ext) + b_sign_bit; + sc_wide_mul(a, b); + /* Likewise for the product, but extending to two words. C being + * brain-dead, we have to re-derive the carry bit; this trick from FFA + * (http://www.loper-os.org/?p=1913). */ + *a = *a ^ p_sign_ext; + tmp = *a + p_sign_bit; + *b = (*b ^ p_sign_ext) + ((*a & ~tmp) >> (FULL_BITS-1)); + *a = tmp; +} + +/* Count leading zeros in a nonzero word, returning an integer from zero to + * BITS-1. (This implementation happens to treat zero the same as one, but this + * can't be assumed, especially for assembly versions.) + * + * 1. Let A be a 2^N-bit word rounded down to a power of two, isolating the + * most significant 1-bit. + * 2. Let D be a de Bruijn bit sequence of order N: a cyclic sequence of 2^N + * bits containing all possible N-bit subsequences. + * 3. Multiply D by A, thereby up-shifting it. + * 4. If at least the N-1 high bits of D are zero, the shift is equivalent to a + * rotation within the N high bits, so they form a perfect hash function + * over the possible values of A. + * + * Note that the table so constructed will be complete only if the hash + * function is indeed perfect; one need not understand why it works to verify + * that it does. + * + * See Leiserson, Prokop and Randall, "Using de Bruijn Sequences to Index a 1 + * in a Computer Word", MIT Laboratory for Computer Science, 1998. + * + * As a further hack, some subset of de Bruijn sequences also work when + * rounding UP to the next power of two minus one, avoiding the need to clear + * the low bits. I'm not aware of a better way to find these than brute force. + * + * A program to generate the numbers is included at the end of this file. */ +static int count_leading_zeros(full a) { +#if __SIZEOF_POINTER__ == 8 + static const uint8_t tbl[64] = { + 63, 52, 62, 51, 47, 34, 61, 50, 41, 46, 22, 38, 33, 15, 60, 2, 49, 43, + 40, 45, 29, 27, 21, 37, 25, 32, 10, 19, 14, 7, 59, 1, 53, 48, 35, 42, + 23, 39, 16, 3, 44, 30, 28, 26, 11, 20, 8, 54, 36, 24, 17, 4, 31, 12, 9, + 55, 18, 5, 13, 56, 6, 57, 58, 0 + }; + a |= a >> 1; + a |= a >> 2; + a |= a >> 4; + a |= a >> 8; + a |= a >> 16; + a |= a >> 32; + return tbl[((full)0x03f08a4c6acb9dbd * a) >> 58]; +#elif __SIZEOF_POINTER__ == 4 + static const uint8_t tbl[32] = { + 31, 22, 30, 21, 18, 10, 29, 2, 20, 17, 15, 13, 9, 6, 28, 1, 23, 19, 11, + 3, 16, 14, 7, 24, 12, 4, 8, 25, 5, 26, 27, 0 + }; + a |= a >> 1; + a |= a >> 2; + a |= a >> 4; + a |= a >> 8; + a |= a >> 16; + return tbl[((full)0x07c4acdd * a) >> 27]; +#else +#error Unsupported word size +#endif +} + +/* Bit length of unsigned integer, aka 1-based index of most significant 1-bit, + * aka ceil(log2(a+1)). This formulation is well defined for all possible + * inputs. */ +int sc_bit_length(full a) { + return a ? 8*sizeof a - count_leading_zeros(a) : 0; +} + +void sc_div_extended(full *a_lo, full *a_hi, full b) { + /* Input: two-word dividend a, one-word divisor b + * Precondition: divisor greater than high word of dividend (ensuring + * quotient fits in one word) + * Output: quotient in a_hi, remainder in a_lo */ + full q0, q1, a0 = LO(*a_lo), a1 = HI(*a_lo), b0 = LO(b), b1 = HI(b); + if (b1 == 0) { + q1 = *a_hi / b0; + a1 = ((*a_hi << HALF_BITS) | a1) - q1*b0; + *a_lo = (a1 << HALF_BITS) | a0; + q0 = *a_lo / b0; + *a_lo -= q0*b0; + } + else { + full a2 = LO(*a_hi), a3 = HI(*a_hi); + q1 = (a3 == b1) ? (half) -1 : *a_hi / b1; + /* TODO + while (q1*(b) > (a3 a2 a1)) --q1; + subtract q1*(b) from (a3 a2 a1) */ + q0 = (a2 == b1) ? (half) -1 : ((a2 << HALF_BITS) | a1) / b1; + /* + while (q0*(b) > (a2 a1 a0) --q0; + subtract q0*(b) from (a2 a1 a0) */ + *a_lo = (a1 << HALF_BITS) | a0; + } + *a_hi = q1 << HALF_BITS | q0; +} + +#ifdef SEARCH_PHF +/* Search for perfect hash functions based on de Bruijn sequences */ + +#include +#include + +static void tab32(uint32_t d) { + int8_t tbl[32], i; + for (i=0; i<32; ++i) tbl[i] = -1; + for (i=0; i<32; ++i) { + int hash = (d * (((uint32_t)-1) >> i)) >> (32-5); + tbl[hash] = i; + } + printf("{"); + for (i=0; i<32; ++i) printf("%d, ", tbl[i]); + printf("}\n"); +} + +static int test32(uint32_t d) { + uint32_t slots = 0; + int i; + /* Check for perfect hash function on powers of two minus one */ + for (i=0; i<32; ++i) { + int hash = (d * (((uint32_t)-1) >> i)) >> (32-5); + uint32_t bit = ((uint32_t)1) << hash; + if (slots & bit) return 0; /* collision */ + slots |= bit; + } + /* For good measure, make sure it also works on powers of two (is a de + * Bruijn sequence) */ + slots = 0; + for (i=0; i<32; ++i) { + int hash = (d * (((uint32_t)1) << i)) >> (32-5); + uint32_t bit = ((uint32_t)1) << hash; + if (slots & bit) return 0; + slots |= bit; + } + printf("%x ", d); + tab32(d); + return 1; +} + +static void tab64(uint64_t d) { + int8_t tbl[64], i; + for (i=0; i<64; ++i) tbl[i] = -1; + for (i=0; i<64; ++i) { + int hash = (d * (-1UL >> i)) >> (64-6); + tbl[hash] = i; + } + printf("{"); + for (i=0; i<64; ++i) printf("%d, ", tbl[i]); + printf("}\n"); +} + +static int test64(uint64_t d) { + uint64_t slots = 0; + int i; + for (i=0; i<64; ++i) { + int hash = (d * (-1UL >> i)) >> (64-6); + uint64_t bit = 1UL << hash; + if (slots & bit) return 0; + slots |= bit; + } + slots = 0; + for (i=0; i<64; ++i) { + int hash = (d * (1UL << i)) >> (64-6); + uint64_t bit = 1UL << hash; + if (slots & bit) return 0; + slots |= bit; + } + printf("%lx ", d); + tab64(d); + return 1; +} + +int main() { + uint64_t d; + for (d = 0; !test32(d); ++d) ; + /* XXX 19 magic internet bits remain in this prefix to make the brute force + * approach tractable. Smarter algorithm needed. */ + for (d = 0x03f08a4000000000; !test64(d); ++d) ; + return 0; +} + +#endif diff -uNr a/gscm/src/asm-x86_64.s b/gscm/src/asm-x86_64.s --- a/gscm/src/asm-x86_64.s false +++ b/gscm/src/asm-x86_64.s b5fe0c81d946e7fe9add7578abeeb48d9bb2caed955aa3d39772cd560717decc80d2ff6bf295aea1d7536275af3eb5fa4ae22920e32c401d086ffbf35b2eb86c @@ -0,0 +1,38 @@ +.intel_syntax noprefix +.global sc_wide_mul +.global sc_wide_mul_signed +.global sc_bit_length +.global sc_div_extended +.text + +sc_wide_mul: /* rdi=*a, rsi=*b */ + mov rax, [rdi] /* *a -> rax */ + mul qword ptr [rsi] /* unsigned rax*(*b) -> rdx:rax */ + mov [rdi], rax /* lo -> *a */ + mov [rsi], rdx /* hi -> *b */ + ret + +sc_wide_mul_signed: + mov rax, [rdi] + imul qword ptr [rsi] + mov [rdi], rax + mov [rsi], rdx + ret + +sc_bit_length: /* rdi=a */ + bsr rax, rdi /* Bit Scan Reverse */ + jz .L1 + inc rax + ret +.L1: /* a=0: rax undefined */ + xor rax, rax + ret + +sc_div_extended: /* rdi=*a_lo, rsi=*a_hi, rdx=b */ + mov rcx, rdx /* save b -> rcx */ + mov rax, [rdi] /* *a_lo -> rax */ + mov rdx, [rsi] /* *a_hi -> rdx */ + div rcx /* unsigned rdx:rax / b -> Q rax, R rdx */ + mov [rsi], rax /* quotient -> *a_hi */ + mov [rdi], rdx /* remainder -> *a_lo */ + ret diff -uNr a/gscm/src/compiler-deps b/gscm/src/compiler-deps --- a/gscm/src/compiler-deps false +++ b/gscm/src/compiler-deps f08b6aa712e00c50583977e362b32387fa8e165f512c933695e074b495946a1af98c11ff79ee5e15fa680b81f423043e2eace772d88ba51d6262c075579fea8c @@ -0,0 +1,54 @@ +apply/unchecked +begin +boolean? +caadr +caar +cadar +caddr +cadr +car +car/unchecked +cdaar +cdadr +cdar +cddr +cdr +cdr/unchecked +char? +cons +cons/immutable +define +eq? +eqv? +error +fx+/unchecked +fx-/unchecked +fxstring +symbol? +variable-ref +vector-copy/immutable +vector-length +vector-ref +vector-ref/unchecked +vector-set! +vector? diff -uNr a/gscm/src/gscm.c b/gscm/src/gscm.c --- a/gscm/src/gscm.c false +++ b/gscm/src/gscm.c b1b1de8605e906f79d7eed152dd0a2ab8522a380733859b35ffafa8bb6ca3321fad9b7b0e539070fb5663fc07a73d7d32a53258748e78284e728e940134f74bc @@ -0,0 +1,4467 @@ +/************** + * Gales Scheme + * + * A Scheme interpreter for Unix striving for simplicity, soundness, minimal + * artificial restrictions, and R5RS compliance with strict error checking. + * + * J. Welsh + * January 2017 - April 2018 + */ + +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include +#include +#include + +#ifndef MAP_ANON +#define MAP_ANON MAP_ANONYMOUS +#endif + +int snprintf(char *, size_t, const char *, ...); /* to be replaced */ +void abort(void); +size_t strlen(const char *); +char *strerror(int); +void *memcpy(void *, const void *, size_t); +void *memset(void *, int, size_t); +int memcmp(const void *, const void *, size_t); +pid_t vfork(void); + +#include "gscm.h" + + +/****************** + * Memory structure + */ + +/* The Scheme heap is an array of N-bit cells where N is the size of a machine + * address. */ + +typedef size_t value; +typedef value (*builtin_func_t)(value args); +typedef unsigned char uchar; +typedef unsigned long ulong; +typedef unsigned int uint; + +/* Principal type tag: three most significant bits of cell */ +#define TAG_BITS 3 + +#define T_SPECIAL 0 /* Special values listed below */ +#define T_MOVED 1 /* "Broken heart" pointer to GC moved object */ +#define T_IMMUT_PAIR 2 /* Pointer to car with cdr following */ +#define T_PAIR 3 +#define T_CHARACTER 4 /* Character in least significant byte */ +#define T_FIXNUM 5 /* N-3 bit two's complement signed integer */ +#define T_EXTENDED 6 /* Pointer to extended object */ +#define T_EXT_HEADER 7 /* Extended type header */ + +/* Special values indicated by T_SPECIAL. Since that's zero, these can be + * compared with values directly. */ +#define SC_NULL 0 +#define SC_TRUE 1 +#define SC_FALSE 2 +#define SC_EOF 3 +#define SC_NULL_ENV 4 +#define SC_REPORT_ENV 5 +#define SC_GSCM_ENV 6 +#define SC_INTERACT_ENV 7 +#define SC_TOPLEVEL_ENV 8 +/* Inaccessible from Scheme */ +#define UNDEFINED 9 +#define RD_CLOSEPAREN 10 /* Returned internally by reader subroutines */ +#define RD_DOT 11 + +/* T_SPECIAL is also implicitly (ab)used for return addresses (EV_*, RD_* and + * so on) and loop counters on the stack. GC doesn't have to know what they + * really are as long as it treats them as immediate values. */ + +/* Extended objects consist of a header cell (T_EXT_HEADER) containing extended + * type information followed by possibly untagged data cells, depending on + * type. The four bits following the principal tag in the header are the + * extended type tag: */ +#define T_IMMUT_STRING 0x0 +#define T_STRING 0x1 +#define T_IMMUT_VECTOR 0x2 +#define T_VECTOR 0x3 +#define T_VARIABLE_REF 0x4 +#define T_SYMBOL 0x5 +#define T_BUILTIN 0x6 +#define T_PROCEDURE 0x7 +#define T_CONTINUATION 0x8 +#define T_PROMISE 0x9 +#define T_PORT 0xA +#define T_FLONUM 0xB /* is_number assumes all numbers from here */ +#define T_BIGNUM 0xC +#define T_NEG_BIGNUM 0xD +#define T_RATIONAL 0xE +#define T_COMPLEX 0xF + +/* Tags for types with immutable variants, both principal and extended, must + * be equal to the bitwise OR of 1 with the immutable variant. That is, the + * least significant tag bit is the mutability flag, where applicable. */ + +/* Symbols, strings, vectors, and bignums store their length in the header as + * an N-7 bit unsigned integer. For vectors and bignums, that many cells + * follow. Strings and symbols are packed, so ceil(length/(N/8)) cells follow. + * Lexical variable references store the argument index in this space. + * + * Example for 32-bit systems: + * - Pointers/fixnums have 29 bits + * - Max heap size is 2^29 = 512M cells of 4 bytes = 2 GiB (4 during GC) + * - Longest string is 2^25 characters = 32 MiB + * - Longest vector is 2^25 cells = 128 MiB (not counting any pointer targets) + * - Longest bignum is 2^25 cells = 2^30 bits for a magnitute ~ 10^10^8 + * + * If the size limits are a problem, the length could be stored in an untagged + * or fixnum cell after the header. */ + +#if __SIZEOF_POINTER__ == 8 +#define VAL_BITS 61 +#define EXT_VAL_BITS 57 +#define FIXNUM_MAX 0x0FFFFFFFFFFFFFFF +#define FIXNUM_MIN -0x1000000000000000 +#define EXT_LENGTH_MAX 0x01FFFFFFFFFFFFFF +#define packed_str_len(bytes) (((bytes) + 7) >> 3) +#define FLONUM_CELLS 1 + +#elif __SIZEOF_POINTER__ == 4 +#define VAL_BITS 29 +#define EXT_VAL_BITS 25 +#define FIXNUM_MAX 0x0FFFFFFF +#define FIXNUM_MIN -0x10000000 +#define EXT_LENGTH_MAX 0x01FFFFFF +#define packed_str_len(bytes) (((bytes) + 3) >> 2) +#define FLONUM_CELLS 2 + +#else +#error Unsupported pointer size +#endif + +#define tag(v) (((value)(v)) >> VAL_BITS) +#define add_tag(v, t) ((v) | (((value)(t)) << VAL_BITS)) +#define untag(v) ((((value)(v)) << 3) >> 3) +#define untag_signed(v) (((long) (((value)(v)) << 3)) >> 3) +#define ext_tag(v) (((v) >> EXT_VAL_BITS) & 0xF) +#define ext_add_tag(v, t) ((v) | ((value)(t) << EXT_VAL_BITS) | \ + (((value)T_EXT_HEADER) << VAL_BITS)) +#define ext_untag(v) ((((value)(v)) << 7) >> 7) +#define ext_untag_signed(v) (((long) (((value)(v)) << 7)) >> 7) +/* WARNING: add_tag/ext_add_tag assume v's tag bits are zero */ + +static value car(value); +static value cdr(value); + + +/****************** + * Scheme registers + */ + +/* General purpose */ +static value r0, r1, r2, r3, r4, r5, r6; +/* Special purpose */ +static value r_stack, r_spool, r_error_cont, r_signal_handler, r_compiler, + r_compiler_expr, r_input_port, r_output_port, r_dump; +static enum { + f_none, + f_compile, + f_apply, + f_force, + f_call_with_values, + f_values, +} r_flag; + +/* Register aliases to make usage more readable. Some rules for validation: + * - A subroutine may use a single register under different aliases, but before + * it is read or used as an argument under one alias, it must have been: + * - Assigned or declared as a parameter under the same alias, and + * - Not meanwhile assigned under a different alias. + * - Parameter registers must be distinct. + */ +#define R_EXPR r0 /* expression being evaluated */ +#define R_ARGS r0 /* arguments to apply procedure to */ + +#define R_ENV r1 /* evaluation environment */ +#define R_PROC r1 /* procedure to apply */ +#define R_PORT r1 /* argument to I/O routines */ +#define R_ARG r1 + +#define R_RESULT r2 /* subroutine return value */ +#define R_LEXEME r2 +#define R_FORMALS r2 +#define R_WIND_TO r2 + +#define R_VARNAME r3 +#define R_TAIL r3 /* last pair of a list being built */ +#define R_LCA r3 + +#define R_OPERANDS r4 +#define R_SECOND_LAST r4 + +#define R_CAR r5 /* argument to cons or push */ + +#define R_CDR r6 /* argument to cons */ +#define R_ITER r6 + + +/***************** + * Syscall helpers + */ + +static int open_cloexec(const char *path, int flags) { + return open(path, flags | O_CLOEXEC, 0666); + /* Non-atomic version for systems lacking O_CLOEXEC + int fd = open(path, flags, 0666); + if (fd != -1) fcntl(fd, F_SETFD, FD_CLOEXEC); + return fd; + */ +} + +static int pipe_cloexec(int pipefd[2]) { + return pipe2(pipefd, O_CLOEXEC); + /* Non-atomic version for systems lacking pipe2 + if (pipe(pipefd) == -1) return -1; + fcntl(pipefd[0], F_SETFD, FD_CLOEXEC); + fcntl(pipefd[1], F_SETFD, FD_CLOEXEC); + return 0; + */ +} + +/* Reliably catching close errors is NOT POSSIBLE on Linux and others. The call + * may block and be interrupted by a signal handler, yet cannot be retried as + * the FD is deallocated early. HPUX at least has the atypical behavior of + * leaving the FD open, so it would leak. Should figure out where exactly close + * can block. */ +static void blind_close(int fd) { + int saved_errno = errno; + close(fd); + errno = saved_errno; +} + +static int poll1(int fd, short events, int timeout) { + int r; + struct pollfd sp; + sp.fd = fd; + sp.events = events; + while ((r = poll(&sp, 1, timeout)) == -1) + if (errno != EAGAIN && errno != EINTR) sc_perror(); + return r; +} + +static int write_all(int fd, const char *buf, ssize_t len) { + ssize_t n; + while ((n = write(fd, buf, len)) < len) { + if (n != -1) len -= n, buf += n; + else if (errno == EAGAIN || errno == EWOULDBLOCK) + poll1(fd, POLLOUT, -1); + else if (errno != EINTR) return -1; + } + return 0; +} + +void sc_write_error(const char *msg) { + size_t len = strlen(msg); + if (len) write_all(2, msg, len); +} +#define write_err sc_write_error + +static void flush_all(void); + +__attribute__((noreturn)) +void sc_exit(int status) { + flush_all(); + _exit(status); +} + + +/**************** + * Error handling + */ + +/* Failsafe error handler */ + +__attribute__((noreturn)) +static void fatal(const char *msg) { + write_err("FATAL: "); + write_err(msg); + write_err("\n"); + sc_exit(1); +} + +__attribute__((noreturn)) +static void fatal1(const char *msg, const char *detail) { + write_err("FATAL: "); + write_err(msg); + write_err(": "); + write_err(detail); + write_err("\n"); + sc_exit(1); +} + +__attribute__((noreturn)) +void sc_error(const char *msg) { sc_error1(msg, UNDEFINED); } + +__attribute__((noreturn)) +void sc_perror(void) { sc_error(strerror(errno)); } + +__attribute__((noreturn)) +void sc_perror1(value detail) { sc_error1(strerror(errno), detail); } + +static int chkp(int r) { if (r == -1) sc_perror(); return r; } + +static const char *fmt_ulong_dec(ulong); + +__attribute__((noreturn)) +void sc_assert_fail(const char *file, ulong line, const char *func, + const char *expr) { + const char *sep = ": "; + static int aborting = 0; + if (!aborting) flush_all(); + aborting = 1; + write_err("Assertion failed: "); + write_err(file); write_err(sep); + write_err(fmt_ulong_dec(line)); write_err(sep); + write_err(func); write_err(sep); + write_err(expr); write_err("\n"); + abort(); +} + +/* various common errors */ + +__attribute__((noreturn)) +static void not_a_number(value v) { sc_error1("not a number:", v); } + + +/******************************* + * Garbage collector & allocator + */ + +/* Heap discipline: + * + * This garbage collector uses the stop-and-copy (Minsky-Fenichel-Yochelson) + * method. Because it relocates values into a new heap and is triggered by + * allocation, any function that directly or indirectly calls sc_malloc cannot + * keep pointer types (T_PAIR, T_IMMUT_PAIR, T_EXTENDED) in local variables + * across such calls, as the addresses may be invalidated. The Scheme stack, + * registers, or otherwise statically stored variables registered as roots must + * be used instead. + * + * Such functions will generally be constructors and take their arguments + * through the stack or registers. Notably included are push and cons. + * Specifically not included are pop, peek, drop, car, cdr, set_car and + * set_cdr. + * + * The reward for this trouble is fast and compacting garbage collection. + */ + +static value *heap, *new_heap; +static value heap_size, free_ptr; + +#define ROOTS_ALLOC 48 +static value *roots[ROOTS_ALLOC]; +static value roots_fill; + +static void gc_root(value *handle) { + if (roots_fill >= ROOTS_ALLOC) fatal("insufficient ROOTS_ALLOC"); + roots[roots_fill] = handle; + ++roots_fill; +} + +static value ext_obj_size(value header) { + switch (ext_tag(header)) { + case T_IMMUT_STRING: + case T_STRING: return 1 + packed_str_len(ext_untag(header)); + case T_IMMUT_VECTOR: + case T_VECTOR: return 1 + ext_untag(header); + case T_VARIABLE_REF: return 2; + case T_SYMBOL: return 1 + packed_str_len(ext_untag(header)); + case T_BUILTIN: return 3; + case T_PROCEDURE: return 4; + case T_CONTINUATION: return 3; + case T_PROMISE: return 3; + case T_PORT: return 6; + case T_FLONUM: return 1 + FLONUM_CELLS; + case T_BIGNUM: + case T_NEG_BIGNUM: return 1 + ext_untag(header); + case T_RATIONAL: return 3; + case T_COMPLEX: return 3; + default: fatal("BUG: invalid extended tag"); + } +} + +/* Process one cell (in either a root or the new heap), returning number of + * cells to advance */ +static value scan_cell(value *scan_val) { + int scan_tag = tag(*scan_val); + value ptr, old_val, length; + assert(scan_tag != T_MOVED); + switch (scan_tag) { + case T_IMMUT_PAIR: + case T_PAIR: + case T_EXTENDED: + ptr = untag(*scan_val); + old_val = heap[ptr]; + if (tag(old_val) == T_MOVED) + *scan_val = add_tag(untag(old_val), scan_tag); + else { + *scan_val = add_tag(free_ptr, scan_tag); + length = (scan_tag == T_EXTENDED) ? ext_obj_size(old_val) : 2; + memcpy(&new_heap[free_ptr], &heap[ptr], length*sizeof(value)); + heap[ptr] = add_tag(free_ptr, T_MOVED); + free_ptr += length; + } + return 1; + case T_EXT_HEADER: + switch (ext_tag(*scan_val)) { + /* For compound types, skip the header and scan each element */ + case T_IMMUT_VECTOR: + case T_VECTOR: + case T_VARIABLE_REF: + case T_PROCEDURE: + case T_CONTINUATION: + case T_PROMISE: + case T_PORT: + case T_RATIONAL: + case T_COMPLEX: + return 1; + /* Otherwise skip the whole blob */ + default: + return ext_obj_size(*scan_val); + } + default: + /* All other principal types are immediate values */ + return 1; + } +} + +uint sc_gc_verbose = 0, sc_gc_thrash_factor = 16; + +void sc_gc(void) { + value root, scan_ptr, *tmp; + if (sc_gc_verbose) { + static ulong gc_count = 0; + write_err("GC: cycle "); + write_err(fmt_ulong_dec(++gc_count)); + write_err(" | "); + } + free_ptr = 0; + for (root = 0; root < roots_fill; ++root) scan_cell(roots[root]); + for (scan_ptr = 0; scan_ptr < free_ptr; + scan_ptr += scan_cell(&new_heap[scan_ptr])) + assert(free_ptr <= heap_size); + tmp = heap; + heap = new_heap; + new_heap = tmp; + if (sc_gc_verbose) { + /* using floating point to avoid overflow */ + double live_bytes = free_ptr*sizeof(value); + double live_pct = 100.*free_ptr/heap_size; + write_err(fmt_ulong_dec(free_ptr)); + write_err(" cells | "); + write_err(fmt_ulong_dec((live_bytes+1023.)/1024.)); + write_err("K | "); + write_err(fmt_ulong_dec(live_pct)); + write_err("."); + write_err(fmt_ulong_dec(((unsigned)(10.*live_pct))%10)); + write_err("% live\n"); + } +} + +static value sc_malloc(size_t cells) { + value result = free_ptr; + free_ptr += cells; + if (free_ptr > heap_size) { + sc_gc(); + result = free_ptr; + free_ptr += cells; + if (free_ptr > (heap_size - heap_size/sc_gc_thrash_factor)) { + /* Clear registers in hopes of freeing space. While not guaranteed, + * this can help simple cases like recovering the REPL after a + * runaway recursion. */ + r0 = r1 = r2 = r3 = r4 = r5 = r6 = r_stack = SC_NULL; + sc_error("out of memory"); + } + } + return result; +} + + +/************************* + * Scheme stack operations + */ + +/* Push R_CAR onto the stack (no other side effects) */ +static void push(void) { + value new_stack = sc_malloc(2); + heap[new_stack] = R_CAR; + heap[new_stack+1] = r_stack; + r_stack = add_tag(new_stack, T_PAIR); +} + +/* Shorthand to push an arbitrary value */ +#define PUSH(val) { R_CAR = (val); push(); } + +/* Remove the top of the stack */ +static void drop(void) { + r_stack = cdr(r_stack); +} + +/* Return the top of the stack */ +static value peek(void) { + return car(r_stack); +} + +/* Remove and return the top of the stack */ +static value pop(void) { + value v = car(r_stack); + r_stack = cdr(r_stack); + return v; +} + + +/*************************************************** + * Builtin type constructors, predicates & accessors + */ + +static int is_ext_type(value v, uint t) { + return tag(v) == T_EXTENDED && ext_tag(heap[untag(v)]) == t; +} + +static int is_mutable(value v) { + int t = tag(v); + if (t != T_EXTENDED) return t == T_PAIR; + t = ext_tag(heap[untag(v)]); + return t == T_STRING || t == T_VECTOR; +} + +/* Booleans */ + +static value boolean(int b) { return b ? SC_TRUE : SC_FALSE; } +static int is_boolean(value v) { return v == SC_TRUE || v == SC_FALSE; } + +/* Pairs & lists */ + +/* Return a new pair from the values of R_CAR and R_CDR */ +static value cons(void) { + value p = sc_malloc(2); + heap[p] = R_CAR; + heap[p+1] = R_CDR; + return add_tag(p, T_PAIR); +} +static value cons_immutable(void) { + value p = sc_malloc(2); + heap[p] = R_CAR; + heap[p+1] = R_CDR; + return add_tag(p, T_IMMUT_PAIR); +} +static int is_pair(value v) { return (tag(v) | 1) == T_PAIR; } +static value car(value p) { + assert(is_pair(p)); + return heap[untag(p)]; +} +static value cdr(value p) { + assert(is_pair(p)); + return heap[untag(p)+1]; +} +static void set_car(value p, value v) { + assert(is_pair(p)); + heap[untag(p)] = v; +} +static void set_cdr(value p, value v) { + assert(is_pair(p)); + heap[untag(p)+1] = v; +} +static value safe_car(value p) { + if (!is_pair(p)) sc_error1("not a pair:", p); + return car(p); +} +static value safe_cdr(value p) { + if (!is_pair(p)) sc_error1("not a pair:", p); + return cdr(p); +} +#define cadr(x) car(cdr(x)) + +/* Safely compute the length of a list, returning -1 if not a proper list */ +static long safe_list_length(value v) { + /* Floyd's cycle-finding algorithm */ + value slow = v, fast = v, length = 0; + while (is_pair(fast)) { + slow = cdr(slow); + fast = cdr(fast); + length++; + if (!is_pair(fast)) break; + fast = cdr(fast); + if (fast == slow) return -1; /* cycle */ + length++; + } + if (fast != SC_NULL) return -1; /* improper list or not a pair */ + return length; +} +static int is_list(value v) { return safe_list_length(v) >= 0; } + +/* Compute the length of a proper list */ +static value list_length(value l) { + value length = 0; + for (; l != SC_NULL; l = cdr(l)) length++; + return length; +} + +/* Find the first node shared by two proper lists; that is, the LCA of two + * nodes in the parent-pointer tree rooted at the empty list. */ +static value lowest_common_ancestor(value a, value b) { + value al = list_length(a), bl = list_length(b); + if (al != bl) { + if (al > bl) + do a = cdr(a), --al; while (al > bl); + else + do b = cdr(b), --bl; while (bl > al); + } + while (a != b) a = cdr(a), b = cdr(b); + return a; +} + +/* Numbers */ + +static value fixnum_zero, fixnum_one; + +/* Not bounds checked! */ +static value fixnum(long n) { return add_tag(untag(n), T_FIXNUM); } +static int is_fixnum(value v) { return tag(v) == T_FIXNUM; } +static long fixnum_val(value v) { + assert(is_fixnum(v)); + return untag_signed(v); +} +static ulong unsigned_fixnum_val(value v) { + assert(is_fixnum(v)); + return untag(v); +} +static long safe_fixnum_val(value v) { + if (is_fixnum(v)) return untag_signed(v); + sc_error1("not an integer or out of bounds:", v); +} + +static value flonum(double x) { + value f = sc_malloc(1 + FLONUM_CELLS); + heap[f] = ext_add_tag(0, T_FLONUM); + /* strict aliasing? + *((double *)&heap[f+1]) = x; */ + memcpy(&heap[f+1], &x, sizeof x); + return add_tag(f, T_EXTENDED); +} +static int is_flonum(value v) { return is_ext_type(v, T_FLONUM); } +static double flonum_val(value f) { + /* strict aliasing? + return *((double *)&heap[untag(f)+1]); */ + double x; + assert(is_flonum(f)); + memcpy(&x, &heap[untag(f)+1], sizeof x); + return x; +} + +static value make_bignum_uninit(value len, int neg) { + value ptr; + if (len > EXT_LENGTH_MAX) sc_error("length too large for bignum"); + ptr = sc_malloc(1 + len); + heap[ptr] = ext_add_tag(len, T_BIGNUM | neg); + return add_tag(ptr, T_EXTENDED); +} +static int is_bignum(value v) { + return tag(v) == T_EXTENDED && (ext_tag(heap[untag(v)]) | 1) == + T_NEG_BIGNUM; +} +static value bignum_len(value n) { + assert(is_bignum(n)); + return ext_untag(heap[untag(n)]); +} +static value bignum_ref(value n, value k) { + assert(k < bignum_len(n)); + return heap[untag(n)+k+1]; +} +static void bignum_set(value n, value k, value word) { + assert(k < bignum_len(n)); + assert(is_fixnum(word)); + heap[untag(n)+k+1] = word; +} +static int is_bignum_negative(value n) { + assert(is_bignum(n)); + return ext_tag(heap[untag(n)]) & 1; +} +static value bignum_set_negative(value n) { + assert(is_bignum(n)); + heap[untag(n)] |= (1UL << EXT_VAL_BITS); + return n; +} +/* Truncate bignum in place (consider carefully how GC works) */ +static value bignum_truncate(value n, value len) { + assert(len <= bignum_len(n)); + value ptr = untag(n); + heap[ptr] = ext_add_tag(len, ext_tag(heap[ptr])); + return n; +} + +static int is_rational(value v) { return is_ext_type(v, T_RATIONAL); } + +static int is_exact(value v) { + return is_fixnum(v) || is_bignum(v) || is_rational(v); +} +static int is_number(value v) { + return is_fixnum(v) || + (tag(v) == T_EXTENDED && ext_tag(heap[untag(v)]) >= T_FLONUM); +} +static int is_integer(value v) { + if (is_fixnum(v) || is_bignum(v)) return 1; + if (is_flonum(v)) { + double f = flonum_val(v); + return f == nearbyint(f); + } + return 0; +} + +/* Characters */ + +static value character(uchar c) { return add_tag(c, T_CHARACTER); } +static int is_character(value v) { return tag(v) == T_CHARACTER; } +static uchar safe_char_val(value c) { + if (!is_character(c)) sc_error1("not a character:", c); + return (uchar)c; +} +#define char_val(c) ((uchar)(c)) + +/* Convert ASCII characters to upper/lowercase */ +static uchar uc(uchar c) { + if (c >= 'a' && c <= 'z') return c - 0x20; + return c; +} +static uchar lc(uchar c) { + if (c >= 'A' && c <= 'Z') return c + 0x20; + return c; +} + +/* Strings */ + +static value alloc_string(value len) { + if (len > EXT_LENGTH_MAX) + sc_error("length negative or too large for string"); + return sc_malloc(1 + packed_str_len(len)); +} +static value make_string_uninit(value len) { + value ptr = alloc_string(len); + heap[ptr] = ext_add_tag(len, T_STRING); + return add_tag(ptr, T_EXTENDED); +} +static value make_immutable_string(value len) { + value ptr = alloc_string(len); + heap[ptr] = ext_add_tag(len, T_IMMUT_STRING); + return add_tag(ptr, T_EXTENDED); +} +static int is_string(value v) { + return tag(v) == T_EXTENDED && (ext_tag(heap[untag(v)]) | 1) == T_STRING; +} +static int is_symbol(value); +static uchar * string_buf(value s) { + assert(is_string(s) || is_symbol(s)); + return (uchar *)&heap[untag(s)+1]; +} +/* C thinks strings are made of signed chars for some reason... */ +static char * c_string_buf(value s) { + assert(is_string(s) || is_symbol(s)); + return (char *)string_buf(s); +} +static value string_len(value s) { + assert(is_string(s) || is_symbol(s)); + return ext_untag(heap[untag(s)]); +} +/* Construct string from null-terminated C string not on the Scheme heap */ +static value string(const char *c_str) { + value len = strlen(c_str); + value str = make_string_uninit(len); + memcpy(string_buf(str), c_str, len); + return str; +} +static value make_string(value len, uchar fill) { + value s = make_string_uninit(len); + memset(string_buf(s), fill, len); + return s; +} +/* Construct immutable copy of string or symbol in R_EXPR */ +static value string_copy_immutable(void) { + value len = string_len(R_EXPR), ptr = alloc_string(len); + heap[ptr] = ext_add_tag(len, T_IMMUT_STRING); + memcpy(heap+ptr+1, string_buf(R_EXPR), len); + return add_tag(ptr, T_EXTENDED); +} +/* Construct copy of string in R_EXPR */ +static value string_copy(void) { + value len = string_len(R_EXPR); + value result = make_string_uninit(len); + memcpy(string_buf(result), string_buf(R_EXPR), len); + return result; +} +/* Construct copy of string in R_EXPR with null byte appended */ +static value string_append_null(void) { + value len = string_len(R_EXPR); + value result = make_string_uninit(len + 1); + uchar *buf = string_buf(result); + memcpy(buf, string_buf(R_EXPR), len); + buf[len] = '\0'; + return result; +} +/* Truncate string in place (consider carefully how GC works) */ +static void string_truncate(value s, value len) { + assert(len <= string_len(s)); + value ptr = untag(s); + heap[ptr] = ext_add_tag(len, ext_tag(heap[ptr])); +} + +/* Symbols */ + +static value symbols; /* interning list */ + +/* Frequently used symbols */ +static value s_lambda, s_quote, s_quasiquote, s_unquote, s_unquote_splicing, + s_if, s_set, s_begin, s_letrec, s_define, s_delay, s_literal, + s_open_paren, s_close_paren, s_dot, s_open_vector, s_identifier, + s_named_char, s_abbrev, s_number, s_truncate, s_overwrite, + s_append, s_sync, s_data_sync; + +static value find_symbol(const uchar *buf, value len) { + value iter, sym; + /* some type checks skipped because interning list is not (directly) user + * modifiable */ + for (iter = symbols; iter != SC_NULL; iter = cdr(iter)) { + sym = car(iter); + if (len == ext_untag(heap[untag(sym)]) && + memcmp(buf, &heap[untag(sym)+1], len) == 0) + return sym; + } + return SC_NULL; +} +/* Get symbol from a null-terminated C string not on the Scheme heap, not + * converting case (side effects: R_CAR R_CDR) */ +static value symbol(const char *c_str) { + value len = strlen(c_str); + value sym = find_symbol((uchar *)c_str, len); + if (sym != SC_NULL) return sym; + value sym_ptr = sc_malloc(1 + packed_str_len(len)); + heap[sym_ptr] = ext_add_tag(len, T_SYMBOL); + memcpy(&heap[sym_ptr+1], c_str, len); + R_CAR = add_tag(sym_ptr, T_EXTENDED); + R_CDR = symbols; + symbols = cons(); + return R_CAR; +} +/* Get symbol from a Scheme string in R_CAR, not converting case + * (side effects: R_CAR R_CDR) */ +static value string_to_symbol(void) { + value len = string_len(R_CAR); + value sym = find_symbol(string_buf(R_CAR), len); + if (sym != SC_NULL) return sym; + value sym_ptr = sc_malloc(1 + packed_str_len(len)); + heap[sym_ptr] = ext_add_tag(len, T_SYMBOL); + memcpy(&heap[sym_ptr+1], string_buf(R_CAR), len); + R_CAR = add_tag(sym_ptr, T_EXTENDED); + R_CDR = symbols; + symbols = cons(); + return R_CAR; +} +static int is_symbol(value v) { return is_ext_type(v, T_SYMBOL); } + +/* Vectors */ + +static value alloc_vector(value len) { + if (len > EXT_LENGTH_MAX) + sc_error("length negative or too large for vector"); + return sc_malloc(1 + len); +} +/* Uninitialized constructors: caller must fill without further allocation */ +static value make_vector_uninit(value len) { + value vec = alloc_vector(len); + heap[vec] = ext_add_tag(len, T_VECTOR); + return add_tag(vec, T_EXTENDED); +} +static value make_immutable_vector(value len) { + value vec = alloc_vector(len); + heap[vec] = ext_add_tag(len, T_IMMUT_VECTOR); + return add_tag(vec, T_EXTENDED); +} +/* Build a new vector with each element initialized to R_EXPR */ +static value make_vector(value len) { + value vec = make_vector_uninit(len), *p; + for (p = heap+untag(vec)+1; len; --len, ++p) *p = R_EXPR; + return vec; +} +/* Build a new vector by reversing the elements of proper list R_EXPR */ +static value rev_list_to_vec(void) { + value len = list_length(R_EXPR), + vec = make_vector_uninit(len), + *p = heap+untag(vec)+len; + for (; R_EXPR != SC_NULL; --p, R_EXPR = cdr(R_EXPR)) *p = car(R_EXPR); + return vec; +} +static int is_vector(value v) { + return tag(v) == T_EXTENDED && (ext_tag(heap[untag(v)]) | 1) == T_VECTOR; +} +static value vector_len(value v) { + assert(is_vector(v)); + return ext_untag(heap[untag(v)]); +} +static value vector_ref(value v, value k) { + assert(k < vector_len(v)); + return heap[untag(v)+k+1]; +} +static void vector_set(value v, value k, value obj) { + assert(k < vector_len(v)); + heap[untag(v)+k+1] = obj; +} + +/* Builtin procedures */ + +static value builtin(const char *name, builtin_func_t func) { + value b = sc_malloc(3); + heap[b] = ext_add_tag(0, T_BUILTIN); + heap[b+1] = (value)name; + heap[b+2] = (value)func; + return add_tag(b, T_EXTENDED); +} +static int is_builtin(value v) { return is_ext_type(v, T_BUILTIN); } +static const char * builtin_name(value b) { + return (char *)heap[untag(b)+1]; +} +static builtin_func_t builtin_func(value b) { + return (builtin_func_t)heap[untag(b)+2]; +} + +/* Compound procedures */ + +/* Return a new procedure object from lambda expression operands in R_OPERANDS + * and environment in R_ENV. + * Side effects: R_OPERANDS R_CAR R_CDR */ +static value procedure(void) { + value p, arity; + arity = car(R_OPERANDS); + if (is_fixnum(arity)) { + /* Compiler annotated parameter list attributes to save a traversal */ + R_OPERANDS = cdr(R_OPERANDS); + } + else { + /* ...this traversal (still needed for bootstrapping), which in turn + * saves traversing each time the procedure is applied */ + p = arity; /* parameter list */ + arity = 0; + for (; is_pair(p); p = cdr(p)) arity++; + if (p == SC_NULL) arity = fixnum(arity); + else { + /* improper (variadic) */ + assert(is_symbol(p)); + arity = (value)(-1L - (long)arity); + } + } + p = sc_malloc(4); + heap[p] = ext_add_tag(ext_untag(arity), T_PROCEDURE); + heap[p+1] = car(R_OPERANDS); /* parameter list */ + heap[p+2] = cdr(R_OPERANDS); /* body */ + heap[p+3] = R_ENV; + return add_tag(p, T_EXTENDED); +} +static int is_compound_proc(value v) { return is_ext_type(v, T_PROCEDURE); } +static long proc_arity(value p) { return ext_untag_signed(heap[untag(p)]); } +static value proc_params(value p) { return heap[untag(p)+1]; } +static value proc_body(value p) { return heap[untag(p)+2]; } +static value proc_env(value p) { return heap[untag(p)+3]; } + +/* Continuations */ + +static value current_continuation(void) { + value cont = sc_malloc(3); + heap[cont] = ext_add_tag(0, T_CONTINUATION); + heap[cont+1] = r_stack; + heap[cont+2] = r_spool; + return add_tag(cont, T_EXTENDED); +} +static int is_continuation(value v) { return is_ext_type(v, T_CONTINUATION); } +static value continuation_stack(value c) { return heap[untag(c)+1]; } +static value continuation_spool(value c) { return heap[untag(c)+2]; } + +static int is_procedure(value v) { + return is_builtin(v) || is_compound_proc(v) || is_continuation(v); +} + +/* Promises */ + +/* Construct a promise from an expression in R_EXPR and environment in R_ENV */ +static value promise(void) { + value p = sc_malloc(3); + heap[p] = ext_add_tag(0, T_PROMISE); + heap[p+1] = R_EXPR; + heap[p+2] = R_ENV; + return add_tag(p, T_EXTENDED); +} +static int is_promise(value v) { return is_ext_type(v, T_PROMISE); } +static int promise_done(value p) { return heap[untag(p)] & 1; } +static value promise_value(value p) { return heap[untag(p)+1]; } +static value promise_env(value p) { return heap[untag(p)+2]; } +static void promise_memoize(value p, value v) { + value ptr = untag(p); + heap[ptr] = ext_add_tag(1, T_PROMISE); + heap[ptr+1] = v; + heap[ptr+2] = SC_NULL; /* release to GC */ +} + +/* Ports */ + +static value stdin_port, stdout_port; + +#define DEFAULT_R_BUF 4096 +#define DEFAULT_W_BUF 4096 + +/* Flags in header */ +#define PORT_OUTPUT_BIT 1 +#define PORT_SOCKET_BIT 2 +#define PORT_EOF_BIT 4 + +/* Fields */ +#define PORT_FD 1 +#define PORT_START 2 +#define PORT_FILL 3 +#define PORT_BUF 4 +#define PORT_COUNTERPART 5 + +/* Construct unidirectional port. Side effects: R_RESULT */ +static value make_port(int fd, int is_output, long buf_size) { + value port, *p; + if (buf_size < 1) sc_error("buffer size must be at least one"); + R_RESULT = make_string_uninit(buf_size); + port = sc_malloc(6); + p = heap+port; + p[0] = ext_add_tag(is_output ? PORT_OUTPUT_BIT : 0, T_PORT); + p[PORT_FD] = fixnum(fd); + p[PORT_START] = fixnum(0); + p[PORT_FILL] = fixnum(0); + p[PORT_BUF] = R_RESULT; + p[PORT_COUNTERPART] = SC_NULL; + return add_tag(port, T_EXTENDED); +} +/* Construct input port in r0 and output port in r1 from socket file + * descriptor. Side effects: R_RESULT */ +static void make_socket_ports(int fd, value rbuf_size, value wbuf_size) { + value *p; + chkp(fcntl(fd, F_SETFL, O_NONBLOCK)); + r0 = make_port(fd, 0, rbuf_size); + r1 = make_port(fd, 1, wbuf_size); + /* Cross-reference the two directions so the underlying FD can be closed + * promptly when both ports are. */ + p = heap+untag(r0); + p[0] |= PORT_SOCKET_BIT; + p[PORT_COUNTERPART] = r1; + p = heap+untag(r1); + p[0] |= PORT_SOCKET_BIT; + p[PORT_COUNTERPART] = r0; +} + +static int is_port(value v) { return is_ext_type(v, T_PORT); } +static int is_input_port(value v) { + value header; + if (tag(v) != T_EXTENDED) return 0; + header = heap[untag(v)]; + return ext_tag(header) == T_PORT && (header & PORT_OUTPUT_BIT) == 0; +} +static int is_output_port(value v) { + value header; + if (tag(v) != T_EXTENDED) return 0; + header = heap[untag(v)]; + return ext_tag(header) == T_PORT && (header & PORT_OUTPUT_BIT) != 0; +} + +static int set_port_closed(value *p) { + int fd = fixnum_val(p[PORT_FD]); + /* Set an invalid FD so writes to a closed port are caught by the kernel + * with no extra cost in the normal case. Disable buffering so they're + * caught immediately. */ + p[PORT_FD] = fixnum(-1); + p[PORT_START] = p[PORT_FILL] = fixnum(0); + string_truncate(p[PORT_BUF], 1); + if (p[PORT_COUNTERPART] == SC_NULL) return close(fd); + heap[untag(p[PORT_COUNTERPART])+PORT_COUNTERPART] = SC_NULL; + p[PORT_COUNTERPART] = SC_NULL; + return 0; +} +static ssize_t fill_input_port(value *p, int nonblock) { + int fd = fixnum_val(p[PORT_FD]); + uchar *buf = string_buf(p[PORT_BUF]); + value len = string_len(p[PORT_BUF]); + ssize_t n; + while ((n = read(fd, buf, len)) < 0) { + if (errno == EINTR) continue; + if (errno == EAGAIN || errno == EWOULDBLOCK) { + if (nonblock) return -1; + poll1(fd, POLLIN, -1); continue; + } + if (fd == -1) sc_error("input port closed"); + sc_perror(); + } + p[PORT_START] = fixnum(0); + p[PORT_FILL] = fixnum(n); + return n; +} +static void flush_output_port(value *p) { + int fd = fixnum_val(p[PORT_FD]); + long fill = fixnum_val(p[PORT_FILL]); + assert(fill > 0); /* zero-length write unspecified on non-regular files */ + assert((ulong)fill <= string_len(p[PORT_BUF])); + p[PORT_FILL] = fixnum(0); + if (write_all(fd, c_string_buf(p[PORT_BUF]), fill) == -1) { + int saved; + if (fd == -1) sc_error("output port closed"); + /* Probably no sensible way to recover from write errors, so force the + * port closed. XXX Closing standard streams is a concern (i.e. a + * subsequent open gets FD 1 or 2 and terminal output goes to the file + * unexpectedly), except: 1) the interpreter writes to stdout through + * the port object only; 2) the open-subprocess extension always pipes + * the child's stdout; 3) there's no port for stderr. But these are + * fragile assumptions. */ + saved = errno; set_port_closed(p); errno = saved; + sc_perror(); + } +} + +static void flush_if_needed(value port) { + value *p = heap+untag(port); + if (fixnum_val(p[PORT_FILL]) > 0) flush_output_port(p); +} +static void close_port(value port) { + value *p = heap+untag(port), header = p[0]; + int fd = fixnum_val(p[PORT_FD]); + if (fd == -1) return; + if (header & PORT_OUTPUT_BIT) flush_if_needed(port); + if (header & PORT_SOCKET_BIT) + shutdown(fd, header & PORT_OUTPUT_BIT ? SHUT_WR : SHUT_RD); + chkp(set_port_closed(p)); +} +static value read_char(value port) { + value *p = heap+untag(port), start = p[PORT_START]; + uchar *buf = string_buf(p[PORT_BUF]); + if (start == p[PORT_FILL]) { + if (p[0] & PORT_EOF_BIT) { p[0] ^= PORT_EOF_BIT; return SC_EOF; } + if (!fill_input_port(p, 0)) return SC_EOF; + start = 0; + } + else start = untag(start); + p[PORT_START] = fixnum(start+1); + return character(buf[start]); +} +static value peek_char(value port) { + value *p = heap+untag(port), start = p[PORT_START]; + uchar *buf = string_buf(p[PORT_BUF]); + if (start == p[PORT_FILL]) { + /* EOF is not always permanent, e.g. on a tty, so the condition must be + * saved specially for the next peek or read. */ + if (p[0] & PORT_EOF_BIT) return SC_EOF; + if (!fill_input_port(p, 0)) { p[0] |= PORT_EOF_BIT; return SC_EOF; } + start = 0; + } + else start = untag(start); + return character(buf[start]); +} +static value input_port_ready(value port) { + value *p; + int fd; + p = heap+untag(port); + fd = fixnum_val(p[PORT_FD]); + if (p[PORT_START] < p[PORT_FILL]) return SC_TRUE; + if (fd == -1) sc_error("input port closed"); + if (!poll1(fd, POLLIN, 0)) return SC_FALSE; + /* XXX Linux poll/select are broken and have false positives for + * readability, at least for sockets, so we try a nonblocking read. But + * this doesn't work for regular files! Seems marginally better to break + * "the next READ-CHAR operation on the given PORT is guaranteed not to + * hang" than have CHAR-READY? itself hang. Alternately, djb's SIGALARM + * hack could be used. */ + if (p[0] & PORT_SOCKET_BIT && fill_input_port(p, 1) == -1) return SC_FALSE; + return SC_TRUE; +} +/* Barbarous relic from writing the lexer based on stdio/ungetc */ +#define EOF (-1) +static void put_back_char(int c) { + value *p; + assert(is_port(R_PORT)); + p = heap+untag(R_PORT); + if (c == EOF) p[0] |= PORT_EOF_BIT; + else { + value start = untag(p[PORT_START]); + assert(start); + --start; + string_buf(p[PORT_BUF])[start] = c; + p[PORT_START] = fixnum(start); + } +} +static void write_char(uchar c) { + value *p, fill, len; + uchar *buf; + assert(is_port(R_PORT)); + p = heap+untag(R_PORT); + fill = untag(p[PORT_FILL]); + len = string_len(p[PORT_BUF]); + assert(fill < len); + buf = string_buf(p[PORT_BUF]); + buf[fill] = c; + ++fill; + p[PORT_FILL] = fixnum(fill); + if (fill == len) flush_output_port(p); +} + +static int stdout_ready; +static void flush_all(void) { + /* TODO */ + if (stdout_ready) flush_if_needed(stdout_port); +} + +static void write_cstr(const char *s) { for (; *s; ++s) write_char(*s); } +static void write_str(value s) { /* also for symbols */ + value len = string_len(s); + uchar *buf = string_buf(s); + assert(is_string(s) || is_symbol(s)); + for (; len; --len, ++buf) write_char(*buf); +} +static void write_str_quoted(value s) { + value i, len = string_len(s); + uchar *buf = string_buf(s); + write_char('"'); + for (i = 0; i < len; i++) { + uchar c = buf[i]; + if (c == '"' || c == '\\') write_char('\\'); + write_char(c); + } + write_char('"'); +} +static void newline(void) { write_char('\n'); } + +/* Environments + * + * An environment is a list of lexical frames followed by global frames. + * + * A lexical frame is a vector of which the first element is the list of + * symbols naming the variables (possibly improper, as in a lambda expression), + * and the remaining elements are the corresponding values. + * + * A global frame is a list of (symbol . value) binding pairs. */ + +static value r5rs_env, gscm_env, interaction_env, toplevel_env; + +static void check_mutable_env(value env, value name) { + if (env != interaction_env) { + assert(env == r5rs_env || env == gscm_env || env == toplevel_env); + sc_error1("variable in immutable environment:", name); + } +} + +/* Construct a new lexical frame for the application of the procedure in R_PROC + * to the freshly allocated argument list in R_ARGS (no other side effects) */ +static value make_lex_frame(void) { + value k, frame, args, arity, fixed_arity; + long encoded_arity = proc_arity(R_PROC); + if (encoded_arity < 0) { + arity = (value)(-encoded_arity); + fixed_arity = arity - 1; + } + else { + arity = (value)encoded_arity; + fixed_arity = arity; + } + frame = make_vector_uninit(1 + arity); + vector_set(frame, 0, proc_params(R_PROC)); + args = R_ARGS; + for (k = 1; k <= fixed_arity; k++) { + if (args == SC_NULL) sc_error("too few arguments"); + vector_set(frame, k, car(args)); + args = cdr(args); + } + if (fixed_arity < arity) vector_set(frame, k, args); + else if (args != SC_NULL) sc_error("too many arguments"); + return frame; +} + +/* Construct a new lexical frame for a LETREC binding list in r2, that is, bind + * the given names to not-yet-defined values. The name list is constructed in + * reverse order. + * Side effects: r2 R_CAR R_CDR */ +static value make_letrec_frame(void) { + /* TODO optimize: transpose the binding list? */ + value k, len, frame; + R_CDR = SC_NULL; + len = 1; + for (; r2 != SC_NULL; r2 = cdr(r2)) { + len++; + R_CAR = car(car(r2)); + R_CDR = cons(); + } + frame = make_vector_uninit(len); + vector_set(frame, 0, R_CDR); + for (k = 1; k < len; k++) + vector_set(frame, k, UNDEFINED); + return frame; +} + +/* Add a new binding for R_CAR to R_CDR to the topmost frame of global R_ENV. + * Side effects: R_CAR R_CDR */ +static void extend_global_env(void) { + R_CAR = cons(); /* new binding */ + R_CDR = car(R_ENV); /* top frame */ + assert(is_pair(R_CDR) || R_CDR == SC_NULL); + R_CDR = cons(); + set_car(R_ENV, R_CDR); +} + +/* Construct a new global frame containing copies of the bindings in the frame + * in R_EXPR. Side effects: R_CAR R_CDR R_EXPR R_TAIL R_RESULT */ +static value copy_global_frame(void) { + value temp; + R_CAR = R_CDR = SC_NULL; + R_TAIL = R_RESULT = cons(); + for (; R_EXPR != SC_NULL; R_EXPR = cdr(R_EXPR)) { + temp = car(R_EXPR); + R_CAR = car(temp); R_CDR = cdr(temp); + R_CAR = cons(); /* copied binding */ + R_CDR = SC_NULL; + temp = cons(); + set_cdr(R_TAIL, temp); + R_TAIL = temp; + } + return cdr(R_RESULT); +} + +static value global_frame_lookup(value name, value frame) { + value binding; + for (; frame != SC_NULL; frame = cdr(frame)) { + binding = car(frame); + if (car(binding) == name) return binding; + } + return SC_FALSE; +} + +static value lex_frame_lookup(value name, value frame) { + value names, index; + index = 1; + for (names = vector_ref(frame, 0); is_pair(names); names = cdr(names)) { + if (car(names) == name) goto found; + index++; + } + if (names != name) return 0; +found: + if (vector_ref(frame, 1) == UNDEFINED) /* see LETREC */ + sc_error1("undefined variable:", name); + return index; +} + +static value env_lookup(value name, value env) { + value frame, binding, index; + assert(is_symbol(name)); + for (; env != SC_NULL; env = cdr(env)) { + frame = car(env); + if (is_vector(frame)) { + index = lex_frame_lookup(name, frame); + if (index) return vector_ref(frame, index); + } + else { + binding = global_frame_lookup(name, frame); + if (binding != SC_FALSE) return cdr(binding); + } + } + sc_error1("unbound variable:", name); +} + +static void env_lookup_set(value name, value env, value new) { + value frame, binding, index; + assert(is_symbol(name)); + for (; env != SC_NULL; env = cdr(env)) { + frame = car(env); + if (is_vector(frame)) { + index = lex_frame_lookup(name, frame); + if (index) { + vector_set(frame, index, new); + return; + } + } + else { + binding = global_frame_lookup(name, frame); + if (binding != SC_FALSE) { + check_mutable_env(env, name); + set_cdr(binding, new); + return; + } + } + } + sc_error1("unbound variable:", name); +} + +/* Variable references: created by compiler to memoize environment lookups */ + +static int is_variable_ref(value v) { return is_ext_type(v, T_VARIABLE_REF); } + +/* Return an unresolved variable reference for a symbol in R_CAR */ +static value make_variable_ref() { + assert(is_symbol(R_CAR)); + value ref = sc_malloc(2); + heap[ref] = ext_add_tag(0, T_VARIABLE_REF); + heap[ref+1] = R_CAR; + return add_tag(ref, T_EXTENDED); +} + +/* Look up an unresolved variable reference and memoize */ +static void resolve_variable_ref(value ref, value env, int mutable) { + value ptr, name, frame, height, binding, index; + ptr = untag(ref); + name = heap[ptr+1]; + assert(is_symbol(name)); + height = 0; + for (; env != SC_NULL; env = cdr(env)) { + frame = car(env); + if (is_vector(frame)) { + index = lex_frame_lookup(name, frame); + if (index) { + if (height > FIXNUM_MAX) + /* maybe possible on small architectures */ + sc_error("environment too deep"); + heap[ptr] = ext_add_tag(index, T_VARIABLE_REF); + heap[ptr+1] = add_tag(height, T_FIXNUM); + return; + } + } + else { + binding = global_frame_lookup(name, frame); + if (binding != SC_FALSE) { + if (mutable) check_mutable_env(env, name); + heap[ptr+1] = binding; + return; + } + } + height++; + } + sc_error1("unbound variable:", name); +} + +static value variable_ref_get(value ref, value env) { + value ptr, contents, height; + ptr = untag(ref); +retry: + contents = heap[ptr+1]; + if (is_pair(contents)) /* global */ + return cdr(contents); + else if (is_fixnum(contents)) { /* lexical */ + for (height = fixnum_val(contents); height; height--) + env = cdr(env); + return vector_ref(car(env), ext_untag(heap[ptr])); + } + else { /* unresolved */ + resolve_variable_ref(ref, env, 0); + goto retry; + } +} + +static void variable_ref_set(value ref, value env, value new) { + value ptr, contents, height; + ptr = untag(ref); +retry: + contents = heap[ptr+1]; + if (is_pair(contents)) /* global */ + set_cdr(contents, new); + else if (is_fixnum(contents)) { /* lexical */ + for (height = fixnum_val(contents); height; height--) + env = cdr(env); + vector_set(car(env), ext_untag(heap[ptr]), new); + } + else { /* unresolved */ + resolve_variable_ref(ref, env, 1); + goto retry; + } +} + + +/*********** + * Debugging + */ + +static void shallow_print(void); + +void sc_dump(value v) { + r_dump = v; + PUSH(R_CAR); + PUSH(R_EXPR); + PUSH(R_PORT); + R_EXPR = r_dump; + R_PORT = stdout_port; + shallow_print(); + newline(); + R_PORT = pop(); + R_EXPR = pop(); + R_CAR = pop(); + r_dump = SC_NULL; +} + + +/**************** + * Core evaluator + * + * The evaluator is a set of subroutines delimited by labels, with "switch" + * cases serving as pushable return addresses. (Caution is needed in case of + * nested switches or "break".) Properly tail recursive calls are where "goto" + * is used rather than CALL, that is, a new return address is not pushed. + * Nothing else may be left on the subroutine's stack frame in these cases! + */ + +/* Shorthand for non-tail subroutine calls. Beware of the register side effects + * or confusing RETURN with C return. */ +#define CALL(subroutine_label, return_address) \ + { R_CAR = return_address; push(); goto subroutine_label; } +#define RETURN(val) { R_RESULT = (val); goto dispatch; } + +/* Return addresses */ +#define EV_DONE 0 +#define EV_COMPILE_RESULT 1 +#define EV_CALL_OPERATOR 2 +#define EV_CALL_LOOP 3 +#define EV_UNWIND_LOOP 4 +#define EV_REWIND_LOOP 5 +#define EV_SEQ_LOOP 6 +#define EV_IF_PREDICATE 7 +#define EV_SET_RESULT 8 +#define EV_LETREC_LOOP 9 +#define EV_DEFINE_RESULT 10 +#define EV_FORCE_RESULT 11 +#define EV_CALL_WITH_VALUES 12 + +static const char *err_context; +static jmp_buf err_longjmp_env; + +/* Takes expression in R_EXPR and environment in R_ENV */ +static void evaluator(void) { + value k; + if (setjmp(err_longjmp_env)) goto APPLY; + if (r_compiler) CALL(COMPILE, EV_DONE); + CALL(EVAL, EV_DONE); +dispatch: + switch (pop()) { + case EV_DONE: + assert(r_stack == SC_NULL); + r_error_cont = SC_NULL; + break; + +COMPILE: + /* Compile expression R_EXPR then evaluate in environment R_ENV */ + PUSH(R_ENV); + R_CAR = R_EXPR; + R_CDR = SC_NULL; + R_ARGS = cons(); + R_PROC = r_compiler; + CALL(APPLY, EV_COMPILE_RESULT); + case EV_COMPILE_RESULT: + R_EXPR = R_RESULT; + R_ENV = pop(); + goto EVAL; + +EVAL: + /* Evaluate expression R_EXPR in environment R_ENV */ + err_context = "eval"; + if (is_pair(R_EXPR)) { /* Combination */ + R_OPERANDS = cdr(R_EXPR); + R_EXPR = car(R_EXPR); + if (is_symbol(R_EXPR)) { + if (R_EXPR == s_lambda) RETURN(procedure()); + if (R_EXPR == s_if) goto IF; + if (R_EXPR == s_set) goto SET; + if (R_EXPR == s_begin) goto EVAL_BODY; + if (R_EXPR == s_letrec) goto LETREC; + if (R_EXPR == s_quote) RETURN(car(R_OPERANDS)); + if (R_EXPR == s_define) goto DEFINE; + if (R_EXPR == s_delay) goto DELAY; + } + goto EVAL_CALL; + } + if (is_variable_ref(R_EXPR)) /* Cacheable variable reference */ + RETURN(variable_ref_get(R_EXPR, R_ENV)); + if (is_symbol(R_EXPR)) + /* Slow and stupid variable lookup: replacing symbols in the + * expression tree with variable references is done by the + * compiler, so this is needed to bootstrap */ + RETURN(env_lookup(R_EXPR, R_ENV)); + assert(is_number(R_EXPR) || + is_boolean(R_EXPR) || + is_character(R_EXPR) || + is_string(R_EXPR) || + /* not valid Scheme, but allowed in compiler output */ + R_EXPR == SC_NULL || + is_vector(R_EXPR)); + RETURN(R_EXPR); /* Self-evaluating */ + +EVAL_CALL: + /* Procedure call (operator operand ...) + * Evaluate operator in R_EXPR and each operand in R_OPERANDS, build + * argument list and apply in R_ENV. */ + PUSH(R_OPERANDS); + PUSH(R_ENV); + CALL(EVAL, EV_CALL_OPERATOR); + case EV_CALL_OPERATOR: + R_ENV = pop(); + R_CAR = R_RESULT; + R_OPERANDS = pop(); + push(); /* evaluated operator */ + R_CAR = R_CDR = SC_NULL; + R_TAIL = cons(); /* arg list tail pointer */ + PUSH(R_TAIL); /* arg list head pointer */ + PUSH(R_ENV); + for (; R_OPERANDS != SC_NULL; R_OPERANDS = cdr(R_OPERANDS)) { + PUSH(R_OPERANDS); + PUSH(R_TAIL); + R_EXPR = car(R_OPERANDS); + CALL(EVAL, EV_CALL_LOOP); + case EV_CALL_LOOP: + R_CAR = R_RESULT; + R_TAIL = pop(); + R_OPERANDS = pop(); + R_ENV = peek(); + R_CDR = SC_NULL; + R_CDR = cons(); + set_cdr(R_TAIL, R_CDR); + R_TAIL = R_CDR; + } + drop(); /* environment */ + R_ARGS = cdr(pop()); /* arg list head pointer */ + R_PROC = pop(); /* evaluated operator */ + goto APPLY; + +APPLY: + /* Extend the lexical environment of procedure R_PROC by binding its + * formal parameters to arguments in the freshly allocated list R_ARGS, + * then evaluate its body in the new environment. */ + if (is_builtin(R_PROC)) { + err_context = builtin_name(R_PROC); + r_flag = f_none; + R_RESULT = (builtin_func(R_PROC))(R_ARGS); + /* Builtins cannot call back into the evaluator as that would break + * tail recursion and enable unlimited recursion on the C stack. + * Instead they can set a flag to signal a tail call to a given + * subroutine. */ + switch (r_flag) { + case f_none: RETURN(R_RESULT); + case f_compile: goto COMPILE; + case f_apply: goto APPLY; + case f_force: goto FORCE; + case f_call_with_values: goto CALL_WITH_VALUES; + /* optimization, see RETURN_VALUES */ + case f_values: goto VALUES; + } + } + err_context = "apply"; + if (is_compound_proc(R_PROC)) { + R_OPERANDS = proc_body(R_PROC); + R_CAR = make_lex_frame(); + R_CDR = proc_env(R_PROC); + R_ENV = cons(); + goto EVAL_BODY; + } + if (is_continuation(R_PROC)) goto APPLY_CONTINUATION; + sc_error1("not a procedure:", R_PROC); + +APPLY_CONTINUATION: + /* Return the value(s) R_ARGS to the continuation R_PROC, restoring its + * stack and applying any thunks registered to exit the current dynamic + * extent and re-enter the captured one. */ + R_WIND_TO = continuation_spool(R_PROC); + if (r_spool != R_WIND_TO) { + R_LCA = lowest_common_ancestor(r_spool, R_WIND_TO); + r_stack = SC_NULL; + PUSH(R_ARGS); + PUSH(R_PROC); + /* Unwind: apply "after" thunks from the current extent up to (but + * not including) the narrowest common extent */ + while (r_spool != R_LCA) { + assert(r_spool != SC_NULL); + /* XXX ^ possible to violate if thunk escapes? */ + R_PROC = cdr(car(r_spool)); + r_spool = cdr(r_spool); + R_ARGS = SC_NULL; + PUSH(R_LCA); + CALL(APPLY, EV_UNWIND_LOOP); + case EV_UNWIND_LOOP: + R_LCA = pop(); + } + /* Rewind: apply "before" thunks down to the captured extent + * starting below the common extent */ + R_WIND_TO = continuation_spool(peek()); + for (r_spool = R_WIND_TO; r_spool != R_LCA; r_spool = cdr(r_spool)) + PUSH(r_spool); + while (r_spool != R_WIND_TO) { + R_PROC = car(car(peek())); + R_ARGS = SC_NULL; + PUSH(R_WIND_TO); + CALL(APPLY, EV_REWIND_LOOP); + case EV_REWIND_LOOP: + R_WIND_TO = pop(); + r_spool = pop(); + } + R_PROC = pop(); + R_ARGS = pop(); + assert(r_stack == SC_NULL); + } + r_stack = continuation_stack(R_PROC); + VALUES: + if (peek() == EV_CALL_WITH_VALUES) { + drop(); + goto CALL_WITH_VALUES_CONT; + } + if (R_ARGS == SC_NULL) sc_error("no value for ordinary continuation"); + if (cdr(R_ARGS) != SC_NULL) + sc_error1("multiple values for ordinary continuation:", R_ARGS); + RETURN(car(R_ARGS)); + +EVAL_BODY: + /* Evaluate one or more commands/expressions. (No definitions; we don't + * need to distinguish sequence from body, as internal definitions are + * converted to letrec by the compiler.) + * Paramters: R_OPERANDS R_ENV */ + PUSH(R_ENV); + assert(R_OPERANDS != SC_NULL); + for (; cdr(R_OPERANDS) != SC_NULL; R_OPERANDS = cdr(R_OPERANDS)) { + R_EXPR = car(R_OPERANDS); + PUSH(R_OPERANDS); + CALL(EVAL, EV_SEQ_LOOP); + case EV_SEQ_LOOP: + R_OPERANDS = pop(); + R_ENV = peek(); + } + drop(); /* environment */ + R_EXPR = car(R_OPERANDS); + goto EVAL; + +IF: + /* (if predicate consequent [alternate]) + * Parameters: R_OPERANDS R_ENV */ + R_EXPR = car(R_OPERANDS); /* predicate */ + R_OPERANDS = cdr(R_OPERANDS); + R_CAR = car(R_OPERANDS); /* consequent */ + R_OPERANDS = cdr(R_OPERANDS); + push(); /* consequent */ + PUSH(R_OPERANDS); /* (alternate) */ + PUSH(R_ENV); + CALL(EVAL, EV_IF_PREDICATE); + case EV_IF_PREDICATE: + R_ENV = pop(); + if (R_RESULT != SC_FALSE) { + drop(); /* (alternate) */ + R_EXPR = pop(); /* consequent */ + goto EVAL; + } + R_EXPR = pop(); /* (alternate) */ + drop(); /* consequent */ + if (R_EXPR != SC_NULL) { + R_EXPR = car(R_EXPR); /* alternate */ + goto EVAL; + } + RETURN(SC_NULL); + +SET: + /* (set! variable value) + * Parameters: R_OPERANDS R_ENV */ + err_context = "set!"; + R_CAR = car(R_OPERANDS); /* variable name/ref */ + R_EXPR = cadr(R_OPERANDS); /* value expression */ + push(); + PUSH(R_ENV); + CALL(EVAL, EV_SET_RESULT); + case EV_SET_RESULT: + R_ENV = pop(); + R_CAR = pop(); /* variable name/ref */ + if (is_variable_ref(R_CAR)) + variable_ref_set(R_CAR, R_ENV, R_RESULT); + else /* Slow and stupid lookup for bootstrap, as in EVAL */ + env_lookup_set(R_CAR, R_ENV, R_RESULT); + RETURN(SC_NULL); + +LETREC: + /* (letrec ((var init) ...) body) + * Parameters: R_OPERANDS R_ENV */ + r2 = R_ARGS = car(R_OPERANDS); /* binding specifiers */ + PUSH(cdr(R_OPERANDS)); /* body */ + R_CAR = make_letrec_frame(); /* new frame */ + k = vector_len(R_CAR); + R_CDR = R_ENV; + R_ENV = cons(); /* new environment */ + /* Evaluate initializers in the new environment */ + PUSH(R_ENV); + for (; R_ARGS != SC_NULL; R_ARGS = cdr(R_ARGS)) { + k--; + PUSH(k); + PUSH(R_ARGS); + R_EXPR = car(cdr(car(R_ARGS))); + CALL(EVAL, EV_LETREC_LOOP); + case EV_LETREC_LOOP: + R_ARGS = pop(); + k = pop(); + R_ENV = peek(); + vector_set(car(R_ENV), k, R_RESULT); + /* Trick: all variables in a frame are considered UNDEFINED if the + * first one is. (Checking this is cheap due to memoized variable + * refs.) Since we're filling in the frame backwards, to match the + * reversed name list from make_letrec_frame, we catch uses of + * undefined variables in the initializers without needing to store + * their results in a temporary list here and then copy. */ + } + drop(); + assert(k == 1); + /* Evaluate body in the now populated environment */ + R_OPERANDS = pop(); /* body */ + goto EVAL_BODY; + +DEFINE: + /* (define variable value) + * Paramters: R_OPERANDS R_ENV */ + if (R_ENV != interaction_env) { + err_context = "define"; + sc_error("not allowed in this environment"); + } + PUSH(car(R_OPERANDS)); /* variable name */ + R_EXPR = car(cdr(R_OPERANDS)); /* value expression */ + CALL(EVAL, EV_DEFINE_RESULT); + case EV_DEFINE_RESULT: + /* XXX is this supposed to not handle variable refs? */ + R_ENV = interaction_env; + R_CAR = pop(); /* variable name */ + R_EXPR = global_frame_lookup(R_CAR, car(R_ENV)); + if (R_EXPR == SC_FALSE) { + R_CDR = R_RESULT; + extend_global_env(); + } + else set_cdr(R_EXPR, R_RESULT); + RETURN(SC_NULL); + +DELAY: + /* (delay expr) + * Parameters: R_OPERANDS R_ENV */ + R_EXPR = car(R_OPERANDS); + RETURN(promise()); + +FORCE: + /* Parameters: R_EXPR: promise */ + if (!is_promise(R_EXPR)) sc_error1("not a promise:", R_EXPR); + if (promise_done(R_EXPR)) RETURN(promise_value(R_EXPR)); + PUSH(R_EXPR); + R_ENV = promise_env(R_EXPR); + R_EXPR = promise_value(R_EXPR); + CALL(EVAL, EV_FORCE_RESULT); + case EV_FORCE_RESULT: + R_EXPR = pop(); + /* If promise forces itself recursively, keep the first result */ + if (promise_done(R_EXPR)) RETURN(promise_value(R_EXPR)); + promise_memoize(R_EXPR, R_RESULT); + RETURN(R_RESULT); + +CALL_WITH_VALUES: + /* Parameters: R_PROC: producer, R_ARGS: consumer */ + PUSH(R_ARGS); + R_ARGS = SC_NULL; + CALL(APPLY, EV_CALL_WITH_VALUES); + case EV_CALL_WITH_VALUES: + /* Producer returned a single value normally */ + R_CAR = R_RESULT; + R_CDR = SC_NULL; + R_ARGS = cons(); + CALL_WITH_VALUES_CONT: + /* Producer returned by calling a continuation */ + R_PROC = pop(); + goto APPLY; + + } +} + +/* Internal error signaller: similar in form to an evaluator subroutine, but + * callable from downstack C functions. */ +__attribute__((noreturn)) +void sc_error1(const char *msg, value detail) { + static int in_handler = 0; + const char *sep = ": "; + if (r_error_cont != SC_NULL) { + /* Hook installed by toplevel. As it's a captured continuation, + * unwinding from where the error occurred happens in the usual way. */ + R_PROC = r_error_cont; + /* Mirroring toplevel, fall back to the default if an error is + * recursively raised in the handler (or the allocations here). If a + * handler is restored using SET-ERROR-HANDLER!, r_error_cont is + * restored alongside. */ + r_error_cont = SC_NULL; + R_CDR = SC_NULL; + if (detail != UNDEFINED) { + R_CAR = detail; + R_CDR = cons(); + } + if (err_context) { + value cl = strlen(err_context), sl = strlen(sep), ml = strlen(msg); + uchar *buf = string_buf(R_CAR = make_string_uninit(cl + sl + ml)); + memcpy(buf, err_context, cl); buf += cl; + memcpy(buf, sep, sl); buf += sl; + memcpy(buf, msg, ml); + } + else R_CAR = string(msg); + R_CAR = cons(); + R_CDR = SC_NULL; + R_ARGS = cons(); + longjmp(err_longjmp_env, 1); + } + else if (stdout_port && !in_handler) { + /* Default handler: print and halt */ + in_handler = 1; /* fall back to fatal if this too raises an error */ + R_PORT = stdout_port; + write_cstr("ERROR [fallback]: "); + if (err_context) { + write_cstr(err_context); + write_cstr(sep); + } + write_cstr(msg); + if (detail != UNDEFINED) { + write_char(' '); + R_EXPR = detail; + shallow_print(); + } + newline(); + sc_exit(1); + } + else fatal(msg); /* Not initialized, or loop */ +} + + +/***************** + * Lexical scanner + */ + +/* Initial buffer allocation for token types that need it */ +#define DEFAULT_LEXBUF_SIZE 32 + +static value lexeme_length; +static void lexbuf_init(void) { + lexeme_length = 0; + R_LEXEME = make_string_uninit(DEFAULT_LEXBUF_SIZE); +} +static void lexbuf_append(uchar c) { + value buf_length = string_len(R_LEXEME); + if (lexeme_length == buf_length) { + value new_length = buf_length * 2; + if (new_length > EXT_LENGTH_MAX) { + new_length = EXT_LENGTH_MAX; + if (lexeme_length == new_length) sc_error("token too long"); + } + value new_buf = make_string_uninit(new_length); + memcpy(string_buf(new_buf), string_buf(R_LEXEME), buf_length); + R_LEXEME = new_buf; + } + string_buf(R_LEXEME)[lexeme_length] = c; + lexeme_length++; +} +static void lexbuf_done(void) { string_truncate(R_LEXEME, lexeme_length); } + +static int is_letter(int c) { + return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'); +} +static int is_digit(int c) { return (c >= '0' && c <= '9'); } +static int in_str(int c, const char *s) { + for (; *s; s++) if (*s == c) return 1; + return 0; +} +static int is_whitespace(int c) { return in_str(c, " \t\n\f\r"); } +static int is_delimiter(int c) { return c == EOF || in_str(c, " \n()\";"); } +static int is_special_initial(int c) { return in_str(c, "!$%&*/:<=>?^_~"); } +static int is_special_subsequent(int c) { return in_str(c, "+-.@"); } + +typedef enum { + tok_eof, + tok_literal, + tok_open_paren, + tok_close_paren, + tok_dot, + tok_open_vector, + tok_identifier, + tok_named_char, + tok_abbrev, + tok_number, +} token_type; + +typedef enum { + lex_start, + lex_comment, + lex_sharp, + lex_bool, + lex_comma, + lex_dot, + lex_dot2, + lex_dot3, + lex_ident, + lex_string, + lex_string_escape, + lex_char, + lex_char2, + lex_named_char, + lex_plus, + lex_minus, + lex_number, +} lexer_state; + +/* Finite state machine to read a token from R_PORT. Returns the token type and + * sets R_LEXEME to the value, if applicable: the expanded symbol for the + * quoting abbreviations, and a string for identifiers, named characters, and + * numbers. */ + +static token_type read_token(void) { + lexer_state state = lex_start; + uchar saved_char = 0; + R_LEXEME = SC_NULL; +#define TRANSITION(s) { state = s; continue; } +#define PUT_BACK put_back_char(c) + for (;;) { + int c; + value cv = read_char(R_PORT); + c = (cv == SC_EOF) ? EOF : char_val(cv); + + switch (state) { + case lex_start: + switch (c) { + case EOF: return tok_eof; + case '(': return tok_open_paren; + case ')': return tok_close_paren; + case '\'': R_LEXEME = s_quote; return tok_abbrev; + case '`': R_LEXEME = s_quasiquote; return tok_abbrev; + case '#': TRANSITION(lex_sharp); + case ',': TRANSITION(lex_comma); + case '.': TRANSITION(lex_dot); + case ';': TRANSITION(lex_comment); + case '"': lexbuf_init(); TRANSITION(lex_string); + case '+': TRANSITION(lex_plus); + case '-': TRANSITION(lex_minus); + default: + if (is_whitespace(c)) continue; + lexbuf_init(); + if (is_letter(c) || is_special_initial(c)) { + lexbuf_append(lc(c)); TRANSITION(lex_ident); + } + if (is_digit(c)) { + lexbuf_append(c); TRANSITION(lex_number); + } + sc_error1("bad character at start of token:", character(c)); + } + case lex_comment: + if (c == '\n') TRANSITION(lex_start); + if (c == EOF) return tok_eof; + continue; + case lex_sharp: + switch (lc(c)) { + case '(': return tok_open_vector; + case 't': R_LEXEME = SC_TRUE; TRANSITION(lex_bool); + case 'f': R_LEXEME = SC_FALSE; TRANSITION(lex_bool); + case 'e': + case 'i': + case 'b': + case 'o': + case 'd': + case 'x': lexbuf_init(); lexbuf_append('#'); lexbuf_append(c); + TRANSITION(lex_number); + case '\\': TRANSITION(lex_char); + default: sc_error("bad # sequence"); + } + case lex_bool: + PUT_BACK; + if (!is_delimiter(c)) sc_error("bad # sequence"); + return tok_literal; + case lex_comma: + if (c == '@') { R_LEXEME = s_unquote_splicing; return tok_abbrev; } + PUT_BACK; R_LEXEME = s_unquote; return tok_abbrev; + case lex_dot: + if (is_delimiter(c)) { PUT_BACK; return tok_dot; } + if (c == '.') TRANSITION(lex_dot2); + lexbuf_init(); lexbuf_append('.'); lexbuf_append(c); + TRANSITION(lex_number); + case lex_dot2: + if (c != '.') { PUT_BACK; sc_error("bad . sequence"); } + TRANSITION(lex_dot3); + case lex_dot3: + PUT_BACK; + if (is_delimiter(c)) { + R_LEXEME = string("..."); return tok_identifier; + } + sc_error("bad . sequence"); + case lex_ident: + if (is_letter(c) || is_special_initial(c) || + is_digit(c) || is_special_subsequent(c)) { + lexbuf_append(lc(c)); continue; + } + PUT_BACK; + if (is_delimiter(c)) { lexbuf_done(); return tok_identifier; } + sc_error("bad identifier"); + case lex_string: + switch (c) { + case EOF: sc_error("unexpected end-of-file in string"); + case '"': lexbuf_done(); return tok_literal; + case '\\': TRANSITION(lex_string_escape); + default: lexbuf_append(c); continue; + } + case lex_string_escape: + switch (c) { + case EOF: sc_error("unexpected end-of-file in string"); + case '"': + case '\\': lexbuf_append(c); TRANSITION(lex_string); + default: sc_error("bad escape in string"); + } + case lex_char: + if (c == EOF) sc_error("unexpected end-of-file in character"); + saved_char = c; TRANSITION(lex_char2); + case lex_char2: + if (is_delimiter(c)) { + PUT_BACK; R_LEXEME = character(saved_char); return tok_literal; + } + lexbuf_init(); lexbuf_append(lc(saved_char)); lexbuf_append(lc(c)); + TRANSITION(lex_named_char); + case lex_named_char: + if (is_delimiter(c)) { + PUT_BACK; + lexbuf_done(); + return tok_named_char; + } + lexbuf_append(lc(c)); continue; + case lex_plus: + if (is_delimiter(c)) { + PUT_BACK; R_LEXEME = string("+"); return tok_identifier; + } + lexbuf_init(); lexbuf_append('+'); lexbuf_append(c); + TRANSITION(lex_number); + case lex_minus: + if (is_delimiter(c)) { + PUT_BACK; R_LEXEME = string("-"); return tok_identifier; + } + lexbuf_init(); lexbuf_append('-'); lexbuf_append(c); + TRANSITION(lex_number); + case lex_number: + if (is_delimiter(c)) { + PUT_BACK; + lexbuf_done(); + return tok_number; + } + lexbuf_append(c); continue; + } + } +} + + +/****************** + * Bootstrap reader + */ + +/* Read a value from R_PORT, using a predictive parser for Scheme's LL(1) + * grammar (report section 7.1.2). The sole purpose is to parse the compiler + * and library code at startup (though this started out as the only reader). + * Does not handle named characters or any numeric syntax beyond plain decimal + * fixnums. + * + * O(n) runtime except for symbols, as interning is currently proportional to + * the symbol table size for each one. Implemented as subroutines calling on + * the Scheme stack, much like the evaluator, so there is no overflow hazard or + * nesting depth limit other than available heap space. */ + +/* Minimal base-10 fixnum decoder */ +static value str_to_fixnum(value s) { + uchar *p = string_buf(s); + value len = string_len(s), neg = 0, acc = 0; + if (!len) goto err; + if (*p == '-') { + neg = 1; --len; ++p; + if (!len) goto err; + } + for (; len; --len, ++p) { + if (!is_digit(*p)) goto err; + if (acc > FIXNUM_MAX/10) goto err; + acc = 10*acc + (*p - '0'); + } + if (acc > FIXNUM_MAX) goto err; + return fixnum(neg ? -acc : acc); +err: + sc_error("bad number token"); +} + +/* Return addresses */ +#define RD_DONE 0 +#define RD_LIST_FIRST 1 +#define RD_LIST_LOOP 2 +#define RD_LIST_DOT 3 +#define RD_ABBREV 4 +#define RD_VEC_LOOP 5 + +static value sc_read(void) { + token_type t; + CALL(datum, RD_DONE); + +dispatch: + switch (pop()) { + case RD_DONE: + break; + +datum: + t = read_token(); + switch (t) { + case tok_eof: RETURN(SC_EOF); + case tok_literal: RETURN(R_LEXEME); + case tok_open_paren: goto list; + case tok_close_paren: RETURN(RD_CLOSEPAREN); + case tok_dot: RETURN(RD_DOT); + case tok_open_vector: goto vector; + case tok_identifier: R_CAR = R_LEXEME; RETURN(string_to_symbol()); + case tok_named_char: sc_error("named characters unsupported"); + case tok_abbrev: goto abbrev; + case tok_number: RETURN(str_to_fixnum(R_LEXEME)); + } + +list: + CALL(datum, RD_LIST_FIRST); + case RD_LIST_FIRST: + if (R_RESULT == RD_CLOSEPAREN) RETURN(SC_NULL); + if (R_RESULT == RD_DOT) sc_error("dotted list without first item"); + if (R_RESULT == SC_EOF) sc_error("unexpected end-of-file in list"); + R_CAR = R_RESULT; + R_CDR = SC_NULL; + R_CAR = cons(); + push(); /* list head */ + for (;;) { + push(); /* list tail */ + CALL(datum, RD_LIST_LOOP); + case RD_LIST_LOOP: + if (R_RESULT == RD_CLOSEPAREN) { + drop(); /* list tail */ + RETURN(pop()); /* list head */ + } + if (R_RESULT == RD_DOT) { + CALL(datum, RD_LIST_DOT); + case RD_LIST_DOT: + if (R_RESULT == RD_CLOSEPAREN) + sc_error("dotted list without last item"); + if (R_RESULT == RD_DOT) sc_error("extra dot in dotted list"); + if (R_RESULT == SC_EOF) + sc_error("unexpected end-of-file in list"); + PUSH(R_RESULT) + t = read_token(); + R_RESULT = pop(); + R_CAR = pop(); /* list tail */ + if (t == tok_close_paren) { + set_cdr(R_CAR, R_RESULT); + RETURN(pop()); /* list head */ + } + if (t == tok_eof) sc_error("unexpected end-of-file in list"); + sc_error("excess item in tail of dotted list"); + } + if (R_RESULT == SC_EOF) sc_error("unexpected end-of-file in list"); + R_CAR = R_RESULT; + R_CDR = SC_NULL; + R_CAR = cons(); + R_CDR = pop(); /* list tail */ + set_cdr(R_CDR, R_CAR); + } + +abbrev: /* 'x -> (quote x) etc. */ + PUSH(R_LEXEME) /* expanded abbrev symbol */ + CALL(datum, RD_ABBREV); + case RD_ABBREV: + if (R_RESULT == RD_CLOSEPAREN) + sc_error("unexpected close-paren in abbreviation"); + if (R_RESULT == RD_DOT) + sc_error("unexpected dot in abbreviation"); + if (R_RESULT == SC_EOF) + sc_error("unexpected end-of-file in abbreviation"); + R_CAR = R_RESULT; + R_CDR = SC_NULL; + R_CDR = cons(); + R_CAR = pop(); /* expanded abbrev symbol */ + RETURN(cons()); + +vector: + /* First build a list */ + R_CAR = SC_NULL; + for (;;) { + push(); /* list head */ + CALL(datum, RD_VEC_LOOP); + case RD_VEC_LOOP: + if (R_RESULT == RD_CLOSEPAREN) { + /* Then copy to a new vector while un-reversing */ + R_EXPR = pop(); /* list head */ + RETURN(rev_list_to_vec()); + } + if (R_RESULT == RD_DOT) sc_error("unexpected dot in vector"); + if (R_RESULT == SC_EOF) + sc_error("unexpected end-of-file in vector"); + R_CAR = R_RESULT; + R_CDR = pop(); /* list head */ + R_CAR = cons(); + } + + } + if (R_RESULT == RD_CLOSEPAREN) sc_error("unexpected close-paren"); + if (R_RESULT == RD_DOT) sc_error("unexpected dot"); + return R_RESULT; +} + + +/***************** + * Number printers + */ + +static char fmt_buf[128]; /* TODO justify size */ +static const char *fmt_fixnum_dec(long val) { + int i = sizeof(fmt_buf) - 1, neg = 0; + /* TODO null termination is convenient here but perhaps not ideal */ + fmt_buf[i] = 0; + if (val < 0) { neg = 1; val = -val; } + do { + --i; assert(i); + fmt_buf[i] = '0' + (val % 10); + val /= 10; + } while (val); + if (neg) fmt_buf[--i] = '-'; + return fmt_buf+i; +} +static const char *fmt_ulong_dec(ulong val) { + int i = sizeof(fmt_buf) - 1; + fmt_buf[i] = 0; + do { + --i; assert(i >= 0); + fmt_buf[i] = '0' + (val % 10); + val /= 10; + } while (val); + return fmt_buf+i; +} +static const char *fmt_fixnum_hex(long val) { + int i = sizeof(fmt_buf) - 1, neg = 0; + fmt_buf[i] = 0; + if (val < 0) { neg = 1; val = -val; } + do { + --i; assert(i); + fmt_buf[i] = "0123456789abcdef"[val & 0xf]; + val >>= 4; + } while (val); + if (neg) fmt_buf[--i] = '-'; + return fmt_buf+i; +} +static const char *fmt_fixnum_oct(long val) { + int i = sizeof(fmt_buf) - 1, neg = 0; + fmt_buf[i] = 0; + if (val < 0) { neg = 1; val = -val; } + do { + --i; assert(i); + fmt_buf[i] = '0' + (val & 7); + val >>= 3; + } while (val); + if (neg) fmt_buf[--i] = '-'; + return fmt_buf+i; +} +static const char *fmt_fixnum_bin(long val) { + int i = sizeof(fmt_buf) - 1, neg = 0; + fmt_buf[i] = 0; + if (val < 0) { neg = 1; val = -val; } + do { + --i; assert(i); + fmt_buf[i] = '0' + (val & 1); + val >>= 1; + } while (val); + if (neg) fmt_buf[--i] = '-'; + return fmt_buf+i; +} +static const char *fmt_ulong_bin(ulong val) { + int i = sizeof(fmt_buf) - 1; + fmt_buf[i] = 0; + do { + --i; assert(i); + fmt_buf[i] = '0' + (val & 1); + val >>= 1; + } while (val); + return fmt_buf+i; +} +static const char *fmt_flonum_dec(double val) { + /* TODO follow up on R5RS citations 3 and 5 */ + if ((size_t)snprintf(fmt_buf, sizeof fmt_buf, "%.15g", val) >= + sizeof fmt_buf) + sc_error("BUG: flonum formatting truncated"); + return fmt_buf; +} + +/**************************** + * Fallback (shallow) printer + */ + +/* Print the value in R_EXPR to R_PORT, using "write" style (quoting strings + * and characters) but not expanding named characters or looking inside + * compound objects. (This used to be the real printer, implemented as + * recursive subroutines on the Scheme stack like the reader and evaluator, but + * is now just for low-level debug and fallback error handlers.) */ + +static void shallow_print(void) { + int t = tag(R_EXPR); + if (t == T_SPECIAL) { + const char *s; + if (R_EXPR == SC_NULL) s = "()"; + else if (R_EXPR == SC_TRUE) s = "#t"; + else if (R_EXPR == SC_FALSE) s = "#f"; + else if (R_EXPR == SC_EOF) s = "#EOF"; + else if (R_EXPR == SC_NULL_ENV) s = "#ENVSPEC:NULL"; + else if (R_EXPR == SC_REPORT_ENV) s = "#ENVSPEC:SCHEME-REPORT"; + else if (R_EXPR == SC_GSCM_ENV) s = "#ENVSPEC:GALES-SCHEME"; + else if (R_EXPR == SC_INTERACT_ENV) s = "#ENVSPEC:INTERACTION"; + else if (R_EXPR == SC_TOPLEVEL_ENV) s = "#ENVSPEC:TOPLEVEL"; + else if (R_EXPR == UNDEFINED) s = "#UNDEFINED"; + else if (R_EXPR == RD_CLOSEPAREN) s = "#RDSENTINEL:CLOSEPAREN"; + else if (R_EXPR == RD_DOT) s = "#RDSENTINEL:DOT"; + else fatal("BUG: invalid special in shallow_print"); + write_cstr(s); + } + else if (t == T_IMMUT_PAIR) write_cstr("#IMMUTABLE-PAIR"); + else if (t == T_PAIR) write_cstr("#PAIR"); + else if (t == T_CHARACTER) { write_cstr("#\\"); write_char(R_EXPR); } + else if (t == T_FIXNUM) write_cstr(fmt_fixnum_dec(fixnum_val(R_EXPR))); + else if (t == T_EXTENDED) { + t = ext_tag(heap[untag(R_EXPR)]); + if ((t | 1) == T_STRING) write_str_quoted(R_EXPR); + else if ((t | 1) == T_VECTOR) { + if (t == T_VECTOR) write_cstr("#VECTOR:"); + else write_cstr("#IMMUTABLE-VECTOR:"); + write_cstr(fmt_fixnum_dec(vector_len(R_EXPR))); + } + else if (t == T_SYMBOL) write_str(R_EXPR); + else if (t == T_BUILTIN) { + write_cstr("#BUILTIN:"); + write_cstr(builtin_name(R_EXPR)); + } + else if (t == T_PROCEDURE) write_cstr("#PROCEDURE"); + else if (t == T_CONTINUATION) write_cstr("#CONTINUATION"); + else if (t == T_PROMISE) write_cstr("#PROMISE"); + else if (t == T_PORT) write_cstr("#PORT"); + else if (t == T_FLONUM) write_cstr("#FLONUM"); + else if (t == T_BIGNUM) write_cstr("#BIGNUM"); + else if (t == T_RATIONAL) write_cstr("#RATIONAL"); + else if (t == T_COMPLEX) write_cstr("#COMPLEX"); + else if (t == T_VARIABLE_REF) write_cstr("#VARIABLE-REF"); + else fatal("BUG: invalid extended tag in shallow_print"); + } + else fatal("BUG: invalid tag in shallow_print"); +} + + +/******************** + * Builtin procedures + */ + +/* Argument wrangling helpers for builtins */ + +static void require_args(value args) { + if (args == SC_NULL) sc_error("too few arguments"); +} + +static void no_args(value args) { + if (args != SC_NULL) sc_error("too many arguments"); +} + +static value extract_arg(value *args) { + require_args(*args); + value arg = car(*args); + *args = cdr(*args); + return arg; +} + +static value final_arg(value args) { + require_args(args); + no_args(cdr(args)); + return car(args); +} + +static value require_input_port(value arg) { + if (!is_input_port(arg)) sc_error("not an input port"); return arg; +} + +static value require_output_port(value arg) { + if (!is_output_port(arg)) sc_error("not an output port"); return arg; +} + +static value opt_final_in_port_arg(value args) { + return require_input_port(args == SC_NULL ? r_input_port : + final_arg(args)); +} + +static value opt_final_out_port_arg(value args) { + return require_output_port(args == SC_NULL ? r_output_port : + final_arg(args)); +} + +static value require_symbol(value arg) { + if (!is_symbol(arg)) sc_error1("not a symbol:", arg); + return arg; +} + +static value require_string(value arg) { + if (!is_string(arg)) sc_error1("not a string:", arg); + return arg; +} + +static value require_stringlike(value arg) { + if (!(is_string(arg) || is_symbol(arg))) + sc_error1("not a string or symbol:", arg); + return arg; +} + +static value require_vector(value arg) { + if (!is_vector(arg)) sc_error1("not a vector:", arg); + return arg; +} + +static value require_fixnum(value arg) { + if (!is_fixnum(arg)) sc_error1("not a fixnum:", arg); + return arg; +} + +static value require_procedure(value arg) { + if (!is_procedure(arg)) sc_error1("not a procedure:", arg); + return arg; +} + +#define BUILTIN(name) static value name(value args) + +/* Mnemonic for multi-valued returns, i.e. passing multiple values to the + * current continuation. f_values is strictly an optimization; we could just as + * well set R_PROC to current_continuation() and r_flag to f_apply. + * The arg list must be newly allocated! */ +#define RETURN_VALUES(args) { \ + R_ARGS = args; \ + r_flag = f_values; \ + return SC_NULL; \ +} + +/* 6.1 Equivalence predicates */ + +BUILTIN(builtin_is_eq) { + value a = extract_arg(&args); + return boolean(a == final_arg(args)); +} + +/* 6.2.5 Numerical operations */ + +BUILTIN(builtin_is_number) { return boolean(is_number(final_arg(args))); } +BUILTIN(builtin_is_integer) { return boolean(is_integer(final_arg(args))); } +BUILTIN(builtin_is_exact) { return boolean(is_exact(final_arg(args))); } +BUILTIN(builtin_is_inexact) { return boolean(is_flonum(final_arg(args))); } + +/* 6.3.1 Booleans */ + +BUILTIN(builtin_not) { return boolean(final_arg(args) == SC_FALSE); } +BUILTIN(builtin_is_boolean) { return boolean(is_boolean(final_arg(args))); } + +/* 6.3.2 Pairs and lists */ + +BUILTIN(builtin_is_pair) { return boolean(is_pair(final_arg(args))); } +BUILTIN(builtin_cons) { + R_CAR = extract_arg(&args); + R_CDR = final_arg(args); + return cons(); +} + +BUILTIN(builtin_car) { return safe_car(final_arg(args)); } +BUILTIN(builtin_cdr) { return safe_cdr(final_arg(args)); } + +BUILTIN(builtin_caar) { return safe_car(builtin_car(args)); } +BUILTIN(builtin_cadr) { return safe_car(builtin_cdr(args)); } +BUILTIN(builtin_cdar) { return safe_cdr(builtin_car(args)); } +BUILTIN(builtin_cddr) { return safe_cdr(builtin_cdr(args)); } + +BUILTIN(builtin_caaar) { return safe_car(builtin_caar(args)); } +BUILTIN(builtin_caadr) { return safe_car(builtin_cadr(args)); } +BUILTIN(builtin_cadar) { return safe_car(builtin_cdar(args)); } +BUILTIN(builtin_caddr) { return safe_car(builtin_cddr(args)); } +BUILTIN(builtin_cdaar) { return safe_cdr(builtin_caar(args)); } +BUILTIN(builtin_cdadr) { return safe_cdr(builtin_cadr(args)); } +BUILTIN(builtin_cddar) { return safe_cdr(builtin_cdar(args)); } +BUILTIN(builtin_cdddr) { return safe_cdr(builtin_cddr(args)); } + +BUILTIN(builtin_caaaar) { return safe_car(builtin_caaar(args)); } +BUILTIN(builtin_caaadr) { return safe_car(builtin_caadr(args)); } +BUILTIN(builtin_caadar) { return safe_car(builtin_cadar(args)); } +BUILTIN(builtin_caaddr) { return safe_car(builtin_caddr(args)); } +BUILTIN(builtin_cadaar) { return safe_car(builtin_cdaar(args)); } +BUILTIN(builtin_cadadr) { return safe_car(builtin_cdadr(args)); } +BUILTIN(builtin_caddar) { return safe_car(builtin_cddar(args)); } +BUILTIN(builtin_cadddr) { return safe_car(builtin_cdddr(args)); } +BUILTIN(builtin_cdaaar) { return safe_cdr(builtin_caaar(args)); } +BUILTIN(builtin_cdaadr) { return safe_cdr(builtin_caadr(args)); } +BUILTIN(builtin_cdadar) { return safe_cdr(builtin_cadar(args)); } +BUILTIN(builtin_cdaddr) { return safe_cdr(builtin_caddr(args)); } +BUILTIN(builtin_cddaar) { return safe_cdr(builtin_cdaar(args)); } +BUILTIN(builtin_cddadr) { return safe_cdr(builtin_cdadr(args)); } +BUILTIN(builtin_cdddar) { return safe_cdr(builtin_cddar(args)); } +BUILTIN(builtin_cddddr) { return safe_cdr(builtin_cdddr(args)); } + +BUILTIN(builtin_set_car) { + value p = extract_arg(&args); + value val = final_arg(args); + if (tag(p) != T_PAIR) { + if (tag(p) == T_IMMUT_PAIR) sc_error("immutable pair"); + sc_error("not a pair"); + } + set_car(p, val); + return SC_NULL; +} +BUILTIN(builtin_set_cdr) { + value p = extract_arg(&args); + value val = final_arg(args); + if (tag(p) != T_PAIR) { + if (tag(p) == T_IMMUT_PAIR) sc_error("immutable pair"); + sc_error("not a pair"); + } + set_cdr(p, val); + return SC_NULL; +} + +BUILTIN(builtin_is_null) { return boolean(final_arg(args) == SC_NULL); } +BUILTIN(builtin_is_list) { return boolean(is_list(final_arg(args))); } + +BUILTIN(builtin_length) { + long len = safe_list_length(final_arg(args)); + if (len < 0) sc_error("not a list"); + return fixnum(len); +} + +/* 6.3.3 Symbols */ + +BUILTIN(builtin_is_symbol) { return boolean(is_symbol(final_arg(args))); } + +BUILTIN(builtin_sym_to_str) { + /* TODO use immutability to avoid copying */ + R_EXPR = require_symbol(final_arg(args)); + return string_copy_immutable(); +} + +BUILTIN(builtin_str_to_sym) { + R_CAR = require_string(final_arg(args)); + return string_to_symbol(); +} + +/* 6.3.4 Characters */ + +BUILTIN(builtin_is_char) { return boolean(is_character(final_arg(args))); } + +#define CHAR1 uchar a = safe_char_val(final_arg(args)); +#define CHAR2 uchar a = safe_char_val(extract_arg(&args)); \ + uchar b = safe_char_val(final_arg(args)); + +BUILTIN(builtin_char_eq) { CHAR2 return boolean(a == b); } +BUILTIN(builtin_char_lt) { CHAR2 return boolean(a < b); } +BUILTIN(builtin_char_gt) { CHAR2 return boolean(a > b); } +BUILTIN(builtin_char_le) { CHAR2 return boolean(a <= b); } +BUILTIN(builtin_char_ge) { CHAR2 return boolean(a >= b); } +BUILTIN(builtin_char_ci_eq) { CHAR2 return boolean(lc(a) == lc(b)); } +BUILTIN(builtin_char_ci_lt) { CHAR2 return boolean(lc(a) < lc(b)); } +BUILTIN(builtin_char_ci_gt) { CHAR2 return boolean(lc(a) > lc(b)); } +BUILTIN(builtin_char_ci_le) { CHAR2 return boolean(lc(a) <= lc(b)); } +BUILTIN(builtin_char_ci_ge) { CHAR2 return boolean(lc(a) >= lc(b)); } + +BUILTIN(builtin_char_is_alpha) { + CHAR1 return boolean((a >= 'A' && a <= 'Z') || (a >= 'a' && a <= 'z')); +} +BUILTIN(builtin_char_is_num) { + CHAR1 return boolean(a >= '0' && a <= '9'); +} +BUILTIN(builtin_char_is_white) { CHAR1 return boolean(is_whitespace(a)); } +BUILTIN(builtin_char_is_upper) { CHAR1 return boolean(a >= 'A' && a <= 'Z'); } +BUILTIN(builtin_char_is_lower) { CHAR1 return boolean(a >= 'a' && a <= 'z'); } + +BUILTIN(builtin_char_to_int) { CHAR1 return fixnum(a); } + +BUILTIN(builtin_int_to_char) { + long n = safe_fixnum_val(final_arg(args)); + if (n < 0 || n > 255) sc_error1("out of bounds:", fixnum(n)); + return character(n); +} + +BUILTIN(builtin_char_upcase) { CHAR1 return character(uc(a)); } +BUILTIN(builtin_char_downcase) { CHAR1 return character(lc(a)); } + +/* 6.3.5 Strings */ + +BUILTIN(builtin_is_str) { return boolean(is_string(final_arg(args))); } + +BUILTIN(builtin_make_str) { + long len = safe_fixnum_val(extract_arg(&args)); + uchar fill = (args == SC_NULL) ? ' ' : safe_char_val(final_arg(args)); + return make_string(len, fill); +} + +BUILTIN(builtin_str_length) { + return fixnum(string_len(require_string(final_arg(args)))); +} + +BUILTIN(builtin_str_ref) { + value s = require_string(extract_arg(&args)); + value k = final_arg(args); + value k_unsigned = safe_fixnum_val(k); + /* see builtin_vec_ref comments */ + if (k_unsigned >= string_len(s)) sc_error1("out of bounds:", k); + return character(string_buf(s)[k_unsigned]); +} + +BUILTIN(builtin_str_set) { + value s = extract_arg(&args); + if (tag(s) != T_EXTENDED || ext_tag(heap[untag(s)]) != T_STRING) { + if (is_string(s)) sc_error1("immutable string:", s); + sc_error1("not a string:", s); + } + value k = extract_arg(&args); + uchar new_char = safe_char_val(final_arg(args)); + value k_unsigned = safe_fixnum_val(k); + /* see builtin_vec_ref comments */ + if (k_unsigned >= string_len(s)) sc_error1("out of bounds:", k); + string_buf(s)[k_unsigned] = new_char; + return SC_NULL; +} + +#define STR2 value a = require_string(extract_arg(&args)); \ + value b = require_string(final_arg(args)); \ + size_t a_len = string_len(a), b_len = string_len(b); \ + uchar *a_buf = string_buf(a), *b_buf = string_buf(b); + +BUILTIN(builtin_str_eq) { + STR2 + if (a_len != b_len) return SC_FALSE; + return boolean(memcmp(a_buf, b_buf, a_len) == 0); +} + +#define STRCMP \ + STR2 int cmp = memcmp(a_buf, b_buf, (a_len < b_len) ? a_len : b_len); + +BUILTIN(builtin_str_lt) { + STRCMP return boolean(cmp < 0 || (cmp == 0 && a_len < b_len)); +} +BUILTIN(builtin_str_gt) { + STRCMP return boolean(cmp > 0 || (cmp == 0 && a_len > b_len)); +} +BUILTIN(builtin_str_le) { + STRCMP return boolean(cmp < 0 || (cmp == 0 && a_len <= b_len)); +} +BUILTIN(builtin_str_ge) { + STRCMP return boolean(cmp > 0 || (cmp == 0 && a_len >= b_len)); +} + +static int memcmp_ci(const void *s1, const void *s2, size_t n) { + const uchar *b1 = s1, *b2 = s2; + uchar c1, c2; + size_t i; + for (i = 0; i < n; i++) { + c1 = lc(b1[i]); + c2 = lc(b2[i]); + if (c1 < c2) return -1; + if (c1 > c2) return 1; + } + return 0; +} + +BUILTIN(builtin_str_ci_eq) { + STR2 + if (a_len != b_len) return SC_FALSE; + return boolean(memcmp_ci(a_buf, b_buf, a_len) == 0); +} + +#define STRCMP_CI STR2 \ + int cmp = memcmp_ci(a_buf, b_buf, (a_len < b_len) ? a_len : b_len); + +BUILTIN(builtin_str_ci_lt) { + STRCMP_CI return boolean(cmp < 0 || (cmp == 0 && a_len < b_len)); +} +BUILTIN(builtin_str_ci_gt) { + STRCMP_CI return boolean(cmp > 0 || (cmp == 0 && a_len > b_len)); +} +BUILTIN(builtin_str_ci_le) { + STRCMP_CI return boolean(cmp < 0 || (cmp == 0 && a_len <= b_len)); +} +BUILTIN(builtin_str_ci_ge) { + STRCMP_CI return boolean(cmp > 0 || (cmp == 0 && a_len >= b_len)); +} + +BUILTIN(builtin_substr) { + value len = string_len(R_EXPR = require_string(extract_arg(&args))), + start = extract_arg(&args), end = final_arg(args), + start_unsigned = safe_fixnum_val(start), + end_unsigned = safe_fixnum_val(end); + if (start_unsigned > len) sc_error1("start out of bounds:", start); + if (end_unsigned > len) sc_error1("end out of bounds:", end); + if (end_unsigned < start_unsigned) sc_error("end less than start"); + len = end_unsigned - start_unsigned; + R_RESULT = make_string_uninit(len); + memcpy(string_buf(R_RESULT), string_buf(R_EXPR)+start_unsigned, len); + return R_RESULT; +} + +BUILTIN(builtin_str_append) { + value p, s, len = 0; + uchar *buf; + R_ARGS = args; + for (p = R_ARGS; p != SC_NULL; p = cdr(p)) { + len += string_len(require_string(car(p))); + if (len > EXT_LENGTH_MAX) sc_error("length too large for string"); + } + R_RESULT = make_string_uninit(len); + buf = string_buf(R_RESULT); + for (p = R_ARGS; p != SC_NULL; p = cdr(p)) { + s = car(p); + len = string_len(s); + memcpy(buf, string_buf(s), len); + buf += len; + } + return R_RESULT; +} + +BUILTIN(builtin_list_to_str) { + long len, i; + value s; + uchar *buf; + R_ARGS = final_arg(args); + len = safe_list_length(R_ARGS); + if (len < 0) sc_error("not a list"); + s = make_string_uninit(len); + buf = string_buf(s); + for (i = 0; i < len; i++) { + buf[i] = safe_char_val(car(R_ARGS)); + R_ARGS = cdr(R_ARGS); + } + return s; +} + +BUILTIN(builtin_str_copy) { + R_EXPR = require_string(final_arg(args)); + return string_copy(); +} + +BUILTIN(builtin_str_fill) { + value s = require_string(extract_arg(&args)); + memset(string_buf(s), safe_char_val(final_arg(args)), string_len(s)); + return SC_NULL; +} + +/* 6.3.6 Vectors */ + +BUILTIN(builtin_is_vector) { return boolean(is_vector(final_arg(args))); } + +BUILTIN(builtin_make_vector) { + long len = safe_fixnum_val(extract_arg(&args)); + R_EXPR = (args == SC_NULL) ? SC_NULL : final_arg(args); + return make_vector(len); +} + +BUILTIN(builtin_vec_length) { + value vec = require_vector(final_arg(args)); + return fixnum(vector_len(vec)); +} + +BUILTIN(builtin_vec_ref) { + value vec = require_vector(extract_arg(&args)); + value k = final_arg(args); + value k_unsigned = safe_fixnum_val(k); + if (k_unsigned >= vector_len(vec)) sc_error1("out of bounds:", k); + /* We don't need to also check for negative k: as value is an unsigned + * type, the assignment from long causes a negative to be seen as a + * positive greater than the longest allowed vector length. + * XXX: are there weird machines where this isn't true? */ + return vector_ref(vec, k_unsigned); +} + +BUILTIN(builtin_vec_set) { + value vec = extract_arg(&args); + value k = extract_arg(&args); + value obj = final_arg(args); + value k_unsigned = safe_fixnum_val(k); + if (tag(vec) != T_EXTENDED || ext_tag(heap[untag(vec)]) != T_VECTOR) { + if (is_vector(vec)) sc_error1("immutable vector:", vec); + sc_error1("not a vector:", vec); + } + if (k_unsigned >= vector_len(vec)) sc_error1("out of bounds:", k); + vector_set(vec, k_unsigned, obj); + return SC_NULL; +} + +BUILTIN(builtin_list_to_vec) { + long len; + value vec, *p; + R_ARGS = final_arg(args); + len = safe_list_length(R_ARGS); + if (len < 0) sc_error("not a list"); + vec = make_vector_uninit(len); + p = heap + untag(vec) + 1; + for (; len; --len, ++p, R_ARGS = cdr(R_ARGS)) *p = car(R_ARGS); + return vec; +} + +BUILTIN(builtin_vec_fill) { + value vec = require_vector(extract_arg(&args)); + value fill = final_arg(args); + value len = vector_len(vec), i; + for (i = 0; i < len; i++) vector_set(vec, i, fill); + return SC_NULL; +} + +/* 6.4 Control features */ + +BUILTIN(builtin_is_procedure) { return boolean(is_procedure(final_arg(args))); } + +BUILTIN(builtin_force) { + R_EXPR = final_arg(args); + r_flag = f_force; + return SC_NULL; +} + +BUILTIN(builtin_call_cc) { + R_PROC = require_procedure(final_arg(args)); + R_CAR = current_continuation(); + R_CDR = SC_NULL; + R_ARGS = cons(); + r_flag = f_apply; + return SC_NULL; +} + +BUILTIN(builtin_values) RETURN_VALUES(args) + +BUILTIN(builtin_call_with_values) { + R_PROC = extract_arg(&args); + R_ARGS = final_arg(args); + r_flag = f_call_with_values; + return SC_NULL; +} + +/* 6.5 Eval */ + +BUILTIN(builtin_eval) { + R_EXPR = extract_arg(&args); + value e = final_arg(args); + switch (e) { + case SC_NULL_ENV: + R_ENV = SC_NULL; break; + case SC_REPORT_ENV: + R_ENV = r5rs_env; break; + case SC_GSCM_ENV: + R_ENV = gscm_env; break; + case SC_INTERACT_ENV: + R_ENV = interaction_env; break; + case SC_TOPLEVEL_ENV: + R_ENV = toplevel_env; break; + default: + sc_error1("not an environment specifier:", e); + } + r_flag = f_compile; + return SC_NULL; +} + +BUILTIN(builtin_report_env) { + if (safe_fixnum_val(final_arg(args)) != 5) + sc_error("unsupported version"); + return SC_REPORT_ENV; +} +BUILTIN(builtin_null_env) { + if (safe_fixnum_val(final_arg(args)) != 5) + sc_error("unsupported version"); + return SC_NULL_ENV; +} +BUILTIN(builtin_interaction_env) { + no_args(args); + return SC_INTERACT_ENV; +} + +/* 6.6.1 Ports */ + +BUILTIN(builtin_is_port) { + return boolean(is_port(final_arg(args))); +} +BUILTIN(builtin_is_in_port) { + return boolean(is_input_port(final_arg(args))); +} +BUILTIN(builtin_is_out_port) { + return boolean(is_output_port(final_arg(args))); +} + +BUILTIN(builtin_current_in_port) { no_args(args); return r_input_port; } +BUILTIN(builtin_current_out_port) { no_args(args); return r_output_port; } + +BUILTIN(builtin_open_in_file) { + int fd; + R_EXPR = require_string(final_arg(args)); + fd = open_cloexec(c_string_buf(string_append_null()), O_RDONLY); + if (fd == -1) sc_perror1(R_EXPR); + return make_port(fd, 0, DEFAULT_R_BUF); +} + +BUILTIN(builtin_open_out_file) { + int fd, flags = O_WRONLY | O_CREAT; + value if_exists; + R_EXPR = require_string(extract_arg(&args)); + if (args == SC_NULL) if_exists = s_truncate; + else if_exists = final_arg(args); + + if (if_exists == s_truncate) flags |= O_TRUNC; + else if (if_exists == s_overwrite) ; + else if (if_exists == s_append) flags |= O_APPEND; + else sc_error("invalid if-exists option"); + + fd = open_cloexec(c_string_buf(string_append_null()), flags); + if (fd == -1) sc_perror1(R_EXPR); + return make_port(fd, 1, DEFAULT_W_BUF); +} + +BUILTIN(builtin_close_in_port) { + close_port(require_input_port(final_arg(args))); + return SC_NULL; +} + +BUILTIN(builtin_close_out_port) { + close_port(require_output_port(final_arg(args))); + return SC_NULL; +} + +/* 6.6.2 Input */ + +BUILTIN(builtin_read_char) { return read_char(opt_final_in_port_arg(args)); } + +BUILTIN(builtin_peek_char) { return peek_char(opt_final_in_port_arg(args)); } + +BUILTIN(builtin_is_eof) { return boolean(final_arg(args) == SC_EOF); } + +BUILTIN(builtin_is_char_ready) { + return input_port_ready(opt_final_in_port_arg(args)); +} + +/* 6.6.3 Output */ + +BUILTIN(builtin_write_char) { + uchar c = safe_char_val(extract_arg(&args)); + R_PORT = opt_final_out_port_arg(args); + write_char(c); + return SC_NULL; +} + +/* Gales Scheme extensions */ + +BUILTIN(builtin_gscm_env) { no_args(args); return SC_GSCM_ENV; } + +BUILTIN(builtin_is_immutable) { return boolean(!is_mutable(final_arg(args))); } + +BUILTIN(builtin_cons_immutable) { + R_CAR = extract_arg(&args); + R_CDR = final_arg(args); + return cons_immutable(); +} + +BUILTIN(builtin_str_copy_immutable) { + R_EXPR = require_string(final_arg(args)); + return string_copy_immutable(); +} + +BUILTIN(builtin_vec_copy_immutable) { + value len; + R_EXPR = require_vector(final_arg(args)); + len = vector_len(R_EXPR); + R_RESULT = make_immutable_vector(len); + memcpy(heap+untag(R_RESULT)+1, heap+untag(R_EXPR)+1, len*sizeof(value)); + return R_RESULT; +} + +BUILTIN(builtin_flush_out_port) { + value port = require_output_port(args == SC_NULL ? r_output_port : + extract_arg(&args)), *p = heap+untag(port); + int fd = fixnum_val(p[PORT_FD]); + if (fd == -1) sc_error("output port closed"); + flush_if_needed(port); + if (args != SC_NULL) { + value opt = final_arg(args); + if (opt == s_sync) { if (fsync(fd)) goto sync_err; } + else if (opt == s_data_sync) { if (fdatasync(fd)) goto sync_err; } + else sc_error1("invalid option:", opt); + } + return SC_NULL; +sync_err: + if (errno == EINVAL) sc_error("synchronization not possible"); + else { + /* As in flush_output_port: no good way to recover from output errors, + * but the kernel won't necessarily continue returning errors, so close + * the port. In practice, the mistake of retrying a failed fsync has + * caused data loss in PostgreSQL (broken durability guarantee). */ + int saved = errno; set_port_closed(p); errno = saved; + sc_perror(); + } +} + +BUILTIN(builtin_gc) { + no_args(args); + sc_gc(); + return fixnum(free_ptr); +} + +BUILTIN(builtin_is_fixnum) { return boolean(is_fixnum(final_arg(args))); } + +BUILTIN(builtin_fx_eq) { + value a = require_fixnum(extract_arg(&args)); + return boolean(a == require_fixnum(final_arg(args))); +} + +BUILTIN(builtin_fx_lt) { + long a = safe_fixnum_val(extract_arg(&args)); + return boolean(a < safe_fixnum_val(final_arg(args))); +} + +BUILTIN(builtin_fx_le) { + long a = safe_fixnum_val(extract_arg(&args)); + return boolean(a <= safe_fixnum_val(final_arg(args))); +} + +BUILTIN(builtin_fx_lt_unsigned) { + value a = require_fixnum(extract_arg(&args)); + return boolean(a < require_fixnum(final_arg(args))); +} + +BUILTIN(builtin_fx_le_unsigned) { + value a = require_fixnum(extract_arg(&args)); + return boolean(a <= require_fixnum(final_arg(args))); +} + +/* inputs left tagged: valid for wrapping and bitwise ops */ +#define FXFOLD(op, init) { \ + ulong acc = init; \ + for (; args != SC_NULL; args = cdr(args)) \ + acc = acc op require_fixnum(car(args)); \ + return fixnum(acc); \ +} + +BUILTIN(builtin_fx_add_wrap) FXFOLD(+, 0) + +BUILTIN(builtin_fx_add_carry) { + long acc = untag_signed(require_fixnum(extract_arg(&args))); + acc += untag_signed(require_fixnum(extract_arg(&args))); + if (args != SC_NULL) acc += untag_signed(require_fixnum(final_arg(args))); + R_CDR = SC_NULL; R_CAR = fixnum(acc >> VAL_BITS); /* high word (carry) */ + R_CDR = cons(); R_CAR = fixnum(acc); /* low word */ + RETURN_VALUES(cons()); +} + +BUILTIN(builtin_fx_add_carry_unsigned) { + ulong acc = untag(require_fixnum(extract_arg(&args))); + acc += untag(require_fixnum(extract_arg(&args))); + if (args != SC_NULL) acc += untag(require_fixnum(final_arg(args))); + R_CDR = SC_NULL; R_CAR = fixnum(acc >> VAL_BITS); /* high word (carry) */ + R_CDR = cons(); R_CAR = fixnum(acc); /* low word */ + RETURN_VALUES(cons()); +} + +BUILTIN(builtin_fx_sub_wrap) { + ulong acc = require_fixnum(extract_arg(&args)); + if (args == SC_NULL) return fixnum(-acc); + do { + acc -= require_fixnum(car(args)); + args = cdr(args); + } while (args != SC_NULL); + return fixnum(acc); +} + +BUILTIN(builtin_fx_sub_borrow_unsigned) { + ulong acc = untag(require_fixnum(extract_arg(&args))); + acc -= untag(require_fixnum(extract_arg(&args))); + if (args != SC_NULL) acc -= untag(require_fixnum(final_arg(args))); + R_CDR = SC_NULL; R_CAR = fixnum(-(((long)acc) >> VAL_BITS)); + R_CDR = cons(); R_CAR = fixnum(acc); + RETURN_VALUES(cons()); +} + +BUILTIN(builtin_fx_mul_wrap) FXFOLD(*, 1) + +BUILTIN(builtin_fx_mul_carry) { + ulong a = untag_signed(require_fixnum(extract_arg(&args))); + ulong b = untag_signed(require_fixnum(final_arg(args))); + sc_wide_mul_signed(&a, &b); + R_CDR = SC_NULL; R_CAR = fixnum(b << TAG_BITS | tag(a)); /* high word */ + R_CDR = cons(); R_CAR = fixnum(a); /* low word */ + RETURN_VALUES(cons()); +} + +BUILTIN(builtin_fx_mul_carry_unsigned) { + ulong a = untag(require_fixnum(extract_arg(&args))); + ulong b = untag(require_fixnum(final_arg(args))); + sc_wide_mul(&a, &b); + R_CDR = SC_NULL; R_CAR = fixnum(b << TAG_BITS | tag(a)); /* high word */ + R_CDR = cons(); R_CAR = fixnum(a); /* low word */ + RETURN_VALUES(cons()); +} + +BUILTIN(builtin_fxnot) { + return fixnum(~require_fixnum(final_arg(args))); +} + +BUILTIN(builtin_fxand) FXFOLD(&, -1) +BUILTIN(builtin_fxior) FXFOLD(|, 0) +BUILTIN(builtin_fxxor) FXFOLD(^, 0) + +BUILTIN(builtin_fxif) { + ulong mask = require_fixnum(extract_arg(&args)); + ulong a = require_fixnum(extract_arg(&args)); + ulong b = require_fixnum(final_arg(args)); + return fixnum(b ^ (mask & (a ^ b))); + /* equivalent to (mask & a) | (~mask & b) */ +} + +BUILTIN(builtin_fxmaj) { + ulong a = require_fixnum(extract_arg(&args)); + ulong b = require_fixnum(extract_arg(&args)); + ulong c = require_fixnum(final_arg(args)); + return fixnum((a & (b | c)) | (b & c)); + /* equivalent to (a & b) | (a & c) | (b & c) */ +} + +BUILTIN(builtin_fxshift) { + long a = untag_signed(require_fixnum(extract_arg(&args))); + long bits = untag_signed(require_fixnum(final_arg(args))); + if (bits < 0) { + if (bits <= -VAL_BITS) bits = -VAL_BITS+1; + a >>= -bits; + } + else { + if (bits >= VAL_BITS) a = 0; + else a <<= bits; + } + return fixnum(a); +} + +BUILTIN(builtin_fxshift_unsigned) { + ulong a = require_fixnum(extract_arg(&args)); + long bits = untag_signed(require_fixnum(final_arg(args))); + if (bits < 0) { + if (bits <= -VAL_BITS) a = 0; + else a = untag(a) >> -bits; + } + else { + if (bits >= VAL_BITS) a = 0; + else a <<= bits; + } + return fixnum(a); +} + +BUILTIN(builtin_fxlength_unsigned) { + /* TODO check existing interface alternatives */ + return fixnum(sc_bit_length(untag(require_fixnum(final_arg(args))))); +} + +/** (open-subprocess PROGRAM . ARGS) -> (values PID IN-PORT OUT-PORT) + * + * Executes PROGRAM in a Unix subprocess with the given arguments, returning + * its process ID along with input and output ports piped to its standard + * output and input streams respectively. Does not redirect standard error. By + * convention, the first ARG should be the executable filename. + * + * This is intended to be fast and hygienic: it does not invoke the system + * shell, perform a PATH search, pass through environment variables, or leak + * file descriptors associated with ports previously opened in Scheme. + * + * Signals an error if a system-defined limit is reached, per fork(2) (or any + * argument is not a string). + * + * The type of the returned PID is not specified, but must be composed of + * standard types with unambiguous external representation. + * + * See also: wait-subprocess */ +BUILTIN(builtin_open_subprocess) { + value n_args = 0, i; + char *path, **argv, *envp[] = {NULL}; + pid_t pid; + int out_pipe[2], in_pipe[2]; + + require_args(args); + r1 = args; + /* begin allocation: null-terminated strings and argv */ + for (r2 = r1; r2 != SC_NULL; r2 = cdr(r2)) { + R_EXPR = require_string(car(r2)); + R_EXPR = string_append_null(); + set_car(r2, R_EXPR); + n_args++; + } + n_args--; /* program path not counted as argument */ + /* Caution: allocating C blob on the Scheme heap. Must not be reachable + * from the roots, which in turn excludes further allocation while it's in + * use. */ + argv = (void*)&heap[sc_malloc(n_args+1)]; + /* end allocation */ + path = c_string_buf(car(r1)); + r1 = cdr(r1); /* program args */ + for (i = 0; i < n_args; i++) { + argv[i] = c_string_buf(car(r1)); + r1 = cdr(r1); + } + argv[i] = NULL; + + if (pipe_cloexec(out_pipe)) goto err1; + if (pipe_cloexec(in_pipe)) goto err2; + /* Use vfork so child creation can be fast, and possible on non-overcommit + * systems, even when parent is large. Any signal handlers must not corrupt + * the parent if invoked in the child. See http://ewontfix.com/7/. */ + if ((pid = vfork()) == -1) goto err3; + if (!pid) { /* child */ + while (dup2(out_pipe[0], 0) == -1) if (errno != EINTR) _exit(errno); + while (dup2(in_pipe[1], 1) == -1) if (errno != EINTR) _exit(errno); + execve(path, argv, envp); + _exit(errno); + } + blind_close(out_pipe[0]); + blind_close(in_pipe[1]); + + /* resume allocation */ + R_CDR = SC_NULL; R_CAR = make_port(out_pipe[1], 1, DEFAULT_W_BUF); + R_CDR = cons(); R_CAR = make_port(in_pipe[0], 0, DEFAULT_R_BUF); + R_CDR = cons(); R_CAR = string(fmt_ulong_dec(pid)); + /* ^ pid_t can't be guaranteed to fit in a fixnum, so stringify. I can't + * quite decipher POSIX here but it seems safe to assume it fits in a long + * and is positive on success. */ + RETURN_VALUES(cons()); + +err3: + blind_close(in_pipe[0]); + blind_close(in_pipe[1]); +err2: + blind_close(out_pipe[0]); + blind_close(out_pipe[1]); +err1: + sc_perror(); +} + +/** (wait-subprocess [PID]) -> STATUS + * + * Blocks until a subprocess has terminated, releases the associated resources, + * and returns either the nonnegative integer exit status for normal exit or + * the negative signal number for termination by signal. + * + * PID identifies the process to wait for; it must compare "equal?" to a PID + * previously returned by open-subprocess for which status has not yet been + * retrieved. If omitted, any subprocess is waited for. */ +BUILTIN(builtin_wait_subprocess) { + int status; + pid_t pid; + if (args == SC_NULL) pid = -1; + else { + /* dedicated parser for stringified PIDs (see above), yuck */ + value s = require_string(final_arg(args)); + value len = string_len(s), i; + const uchar *b = string_buf(s); + ulong acc = 0; + if (!len) goto invalid; + for (i = 0; i < len; i++) { + uchar digit = b[i] - '0'; + if (digit > 9) goto invalid; + if (acc > ULONG_MAX/10) goto invalid; + acc *= 10; + if (acc + digit < acc) goto invalid; + acc += digit; + } + pid = acc; + if ((ulong)pid != acc || pid < 0) goto invalid; + goto start; +invalid: + sc_error1("invalid PID:", s); + } +start: + if (waitpid(pid, &status, 0) == -1) { + if (errno == EINTR) goto start; + sc_perror(); + } + if (WIFEXITED(status)) return fixnum(WEXITSTATUS(status)); + if (WIFSIGNALED(status)) return fixnum(-WTERMSIG(status)); + sc_error("unknown status type"); /* shouldn't happen */ +} + +BUILTIN(builtin_read_token) { + R_PORT = opt_final_in_port_arg(args); + switch (read_token()) { + case tok_eof: return SC_EOF; + case tok_literal: R_CAR = s_literal; break; + case tok_open_paren: R_CAR = s_open_paren; break; + case tok_close_paren: R_CAR = s_close_paren; break; + case tok_dot: R_CAR = s_dot; break; + case tok_open_vector: R_CAR = s_open_vector; break; + case tok_identifier: R_CAR = s_identifier; break; + case tok_named_char: R_CAR = s_named_char; break; + case tok_abbrev: R_CAR = s_abbrev; break; + case tok_number: R_CAR = s_number; break; + } + R_CDR = R_LEXEME; + return cons(); +} + +BUILTIN(builtin_write_string) { + value s = extract_arg(&args); + R_PORT = opt_final_out_port_arg(args); + write_str(require_stringlike(s)); + return SC_NULL; +} + +BUILTIN(builtin_write_string_quoted) { + value s = extract_arg(&args); + R_PORT = opt_final_out_port_arg(args); + write_str_quoted(require_stringlike(s)); + return SC_NULL; +} + +/* Private builtins exposed to the toplevel and compiler only */ + +#define assert_args(n) (assert(list_length(args) == (n))) + +/* Debug access to the privileged environment */ +BUILTIN(builtin_toplevel_env) { no_args(args); return SC_TOPLEVEL_ENV; } + +/* (define-r5rs symbol obj) + * + * Binds a variable in the otherwise immutable (scheme-report-environment 5) + * as well as the interaction environment. */ +BUILTIN(builtin_define_r5rs) { + R_CAR = R_VARNAME = require_symbol(extract_arg(&args)); + R_CDR = R_EXPR = final_arg(args); + assert(global_frame_lookup(R_CAR, car(r5rs_env)) == SC_FALSE); + assert(global_frame_lookup(R_CAR, car(interaction_env)) == SC_FALSE); + R_ENV = r5rs_env; + extend_global_env(); + + R_CAR = R_VARNAME; + R_CDR = R_EXPR; + R_ENV = interaction_env; + extend_global_env(); + return SC_NULL; +} + +/* (define-gscm symbol obj) + * + * Binds a variable in the otherwise immutable (gales-scheme-environment) as + * well as the interaction environment. */ +BUILTIN(builtin_define_gscm) { + value binding; + R_VARNAME = require_symbol(extract_arg(&args)); + R_EXPR = final_arg(args); + assert(global_frame_lookup(R_VARNAME, car(r5rs_env)) == SC_FALSE); + + /* need to be able to upgrade ERROR on startup */ + binding = global_frame_lookup(R_VARNAME, car(gscm_env)); + if (binding == SC_FALSE) { + R_CAR = R_VARNAME; + R_CDR = R_EXPR; + R_ENV = gscm_env; + extend_global_env(); + } + else set_cdr(binding, R_EXPR); + + binding = global_frame_lookup(R_VARNAME, car(interaction_env)); + if (binding == SC_FALSE) { + R_CAR = R_VARNAME; + R_CDR = R_EXPR; + R_ENV = interaction_env; + extend_global_env(); + } + else set_cdr(binding, R_EXPR); + return SC_NULL; +} + +BUILTIN(builtin_set_in_port) { + r_input_port = require_input_port(final_arg(args)); + return SC_NULL; +} + +BUILTIN(builtin_set_out_port) { + r_output_port = require_output_port(final_arg(args)); + return SC_NULL; +} + +BUILTIN(builtin_push_winding) { + err_context = "dynamic-wind"; + R_CAR = args; + require_procedure(extract_arg(&args)); + set_cdr(R_CAR, require_procedure(final_arg(args))); + R_CDR = r_spool; + r_spool = cons(); + return SC_NULL; +} + +BUILTIN(builtin_variable_ref) { + R_CAR = car(args); + assert(cdr(args) == SC_NULL); + return make_variable_ref(); +} + +BUILTIN(builtin_apply_unchecked) { + assert_args(2); + R_PROC = car(args); + R_ARGS = cadr(args); + r_flag = f_apply; + return SC_NULL; +} + +BUILTIN(builtin_car_unchecked) { assert_args(1); return car(car(args)); } +BUILTIN(builtin_cdr_unchecked) { assert_args(1); return cdr(car(args)); } +BUILTIN(builtin_set_car_unchecked) { + assert_args(2); set_car(car(args), cadr(args)); return SC_NULL; +} +BUILTIN(builtin_set_cdr_unchecked) { + assert_args(2); set_cdr(car(args), cadr(args)); return SC_NULL; +} + +BUILTIN(builtin_str_ref_unchecked) { + assert_args(2); return character( + string_buf(car(args))[fixnum_val(cadr(args))]); +} +BUILTIN(builtin_vec_ref_unchecked) { + assert_args(2); return vector_ref(car(args), fixnum_val(cadr(args))); +} + +BUILTIN(builtin_fx_add_unchecked) { + assert_args(2); return fixnum(unsigned_fixnum_val(car(args)) + + unsigned_fixnum_val(cadr(args))); +} +BUILTIN(builtin_fx_sub_unchecked) { + assert_args(2); return fixnum(unsigned_fixnum_val(car(args)) - + unsigned_fixnum_val(cadr(args))); +} +BUILTIN(builtin_fx_eq_unchecked) { + assert_args(2); assert(is_fixnum(car(args)) && is_fixnum(cadr(args))); + return boolean(car(args) == cadr(args)); +} +BUILTIN(builtin_fx_lt_unchecked) { + assert_args(2); return boolean(fixnum_val(car(args)) < + fixnum_val(cadr(args))); +} +BUILTIN(builtin_fx_le_unchecked) { + assert_args(2); return boolean(fixnum_val(car(args)) <= + fixnum_val(cadr(args))); +} +BUILTIN(builtin_fx_neg_unchecked) { + assert_args(1); return fixnum(-fixnum_val(car(args))); +} +BUILTIN(builtin_is_fx_neg_unchecked) { + assert_args(1); return boolean(fixnum_val(car(args)) < 0); +} + +BUILTIN(builtin_fx_div_unsigned_unchecked) { + /* unsigned as / and % are implementation-defined on negatives */ + ulong a, b, q; + assert_args(2); + a = unsigned_fixnum_val(car(args)); + b = unsigned_fixnum_val(cadr(args)); + assert(b != 0); + /* the compiler had better recognize this as one division... */ + q = a/b; + a = a%b; + R_CDR = SC_NULL; R_CAR = fixnum(a); + R_CDR = cons(); R_CAR = fixnum(q); + RETURN_VALUES(cons()); +} + +BUILTIN(builtin_fx_div_ext_unsigned_unchecked) { + /* unsigned as / and % are implementation-defined on negatives */ + ulong a_lo, a_hi, b; + assert_args(3); + a_lo = unsigned_fixnum_val(car(args)); args = cdr(args); + a_hi = unsigned_fixnum_val(car(args)); args = cdr(args); + b = unsigned_fixnum_val(car(args)); + assert(b > a_hi); /* so quotient fits in fixnum */ + a_lo |= a_hi << VAL_BITS; + a_hi >>= TAG_BITS; + sc_div_extended(&a_lo, &a_hi, b); + R_CDR = SC_NULL; R_CAR = fixnum(a_lo); /* remainder */ + R_CDR = cons(); R_CAR = fixnum(a_hi); /* quotient */ + RETURN_VALUES(cons()); +} + +BUILTIN(builtin_fixnum_to_dec_unchecked) { + assert_args(1); return string(fmt_fixnum_dec(fixnum_val(car(args)))); +} +BUILTIN(builtin_fixnum_to_hex_unchecked) { + assert_args(1); return string(fmt_fixnum_hex(fixnum_val(car(args)))); +} +BUILTIN(builtin_fixnum_to_oct_unchecked) { + assert_args(1); return string(fmt_fixnum_oct(fixnum_val(car(args)))); +} +BUILTIN(builtin_fixnum_to_bin_unchecked) { + assert_args(1); return string(fmt_fixnum_bin(fixnum_val(car(args)))); +} +BUILTIN(builtin_fixnum_to_bin_unsigned_unchecked) { + assert_args(1); + return string(fmt_ulong_bin(unsigned_fixnum_val(car(args)))); +} +BUILTIN(builtin_flonum_to_dec_unchecked) { + assert_args(1); return string(fmt_flonum_dec(flonum_val(car(args)))); +} + +/* Minimal error builtin to be replaced on startup, e.g. in case of compile + * errors in the toplevel */ +BUILTIN(builtin_error) { + value msg = require_string(extract_arg(&args)); + R_PORT = stdout_port; + write_cstr("ERROR [startup]: "); + write_str(msg); + if (args != SC_NULL) { + write_char(' '); + R_EXPR = car(args); + shallow_print(); + } + newline(); + sc_exit(1); +} + +BUILTIN(builtin_set_err_cont) { + value h = final_arg(args); + if (!is_continuation(h)) sc_error("not a continuation"); + r_error_cont = h; + return SC_NULL; +} + +BUILTIN(builtin_socket_ports) { + make_socket_ports(safe_fixnum_val(final_arg(args)), + DEFAULT_R_BUF, DEFAULT_W_BUF); + R_CDR = SC_NULL; R_CAR = r1; + R_CDR = cons(); R_CAR = r0; + RETURN_VALUES(cons()); +} + +static union { + struct sockaddr sa; + struct sockaddr_in sin; + struct sockaddr_un sun; +} sa; + +static socklen_t sa_len; + +/* Fill sa/sa_len from a Scheme IPv4 address structure */ +static void build_sockaddr_in(value addr) { + value ip = require_vector(safe_car(addr)), + port = safe_fixnum_val(safe_car(cdr(addr))), i, byte; + uchar *port_buf = (uchar *)&sa.sin.sin_port, + *addr_buf = (uchar *)&sa.sin.sin_addr; + if (port > 65535) sc_error1("port number out of range:", car(cdr(addr))); + memset(&sa.sin, 0, sizeof sa.sin); + sa.sin.sin_family = AF_INET; + port_buf[0] = port >> 8; + port_buf[1] = port & 0xFF; + if (vector_len(ip) != 4) sc_error("bad address length"); + for (i = 0; i < 4; ++i) { + byte = safe_fixnum_val(vector_ref(ip, i)); + if (byte > 255) + sc_error1("address byte out of range:", vector_ref(ip, i)); + addr_buf[i] = byte; + } + sa_len = sizeof sa.sin; +} + +/* Fill sa/sa_len from a Scheme Unix-domain address structure (string) */ +static void build_sockaddr_un(value addr) { + value path = require_string(addr), len = string_len(path), i; + uchar *buf = string_buf(path); + if (len > sizeof sa.sun.sun_path) sc_error("oversize pathname"); + /* initial NUL allowed for Linux abstract sockets */ + if (len && buf[0]) + for (i = 1; i < len; i++) + if (!buf[i]) sc_error("NUL byte in pathname"); + memset(&sa.sun, 0, sizeof sa.sun); + sa.sun.sun_family = AF_UNIX; + memcpy(&sa.sun.sun_path, string_buf(path), len); + sa_len = offsetof(struct sockaddr_un, sun_path) + len; +} + +/* Construct immutable Scheme address structure from a struct sockaddr_* in + * sa/sa_len. Side effects: R_CAR R_CDR */ +static value parse_sockaddr(void) { + if (sa.sa.sa_family == AF_INET) { + int i; + uchar *port_buf = (uchar *)&sa.sin.sin_port, + *addr_buf = (uchar *)&sa.sin.sin_addr; + R_CDR = SC_NULL; + R_CAR = fixnum((port_buf[0] << 8) + port_buf[1]); + R_CDR = cons_immutable(); + R_CAR = make_immutable_vector(4); + for (i = 0; i < 4; ++i) vector_set(R_CAR, i, fixnum(addr_buf[i])); + return cons_immutable(); + } + else if (sa.sa.sa_family == AF_UNIX) { + value path, path_len; + if (sa_len > sizeof sa.sun) sc_error("oversize pathname?!"); + /* XXX Linuxism; the data returned for unnamed sockets is unspecified + * in the standards */ + if (sa_len == sizeof(sa_family_t)) return SC_FALSE; + /* Possible somewhere? */ + if (sa_len <= offsetof(struct sockaddr_un, sun_path)) return SC_FALSE; + path_len = sa_len - offsetof(struct sockaddr_un, sun_path); + /* Some implementations are so rude as to append a trailing NUL and + * include it in the length. But a singular NUL is a valid abstract + * socket name on Linux. */ + if (path_len > 1 && sa.sun.sun_path[0] && !sa.sun.sun_path[path_len-1]) + --path_len; + path = make_immutable_string(path_len); + memcpy(string_buf(path), sa.sun.sun_path, path_len); + return path; + } + sc_error("unknown address family"); +} + +static value unbound_socket(int domain, int type) { + return fixnum(chkp(socket(domain, type, 0))); +} + +static value bound_socket(int domain, int type, int reuse) { + int fd = chkp(socket(domain, type, 0)); + if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof reuse) || + bind(fd, &sa.sa, sa_len)) { + blind_close(fd); + sc_perror(); + } + return fixnum(fd); +} + +BUILTIN(builtin_inet_stream_sock) { + if (args != SC_NULL) { + build_sockaddr_in(final_arg(args)); + return bound_socket(AF_INET, SOCK_STREAM, 0); + } + return unbound_socket(AF_INET, SOCK_STREAM); +} + +BUILTIN(builtin_inet_dgram_sock) { + if (args != SC_NULL) { + build_sockaddr_in(final_arg(args)); + return bound_socket(AF_INET, SOCK_DGRAM, 0); + } + return unbound_socket(AF_INET, SOCK_DGRAM); +} + +BUILTIN(builtin_unix_stream_sock) { + if (args != SC_NULL) { + build_sockaddr_un(final_arg(args)); + return bound_socket(AF_UNIX, SOCK_STREAM, 0); + } + return unbound_socket(AF_UNIX, SOCK_STREAM); +} + +BUILTIN(builtin_unix_dgram_sock) { + if (args != SC_NULL) { + build_sockaddr_un(final_arg(args)); + return bound_socket(AF_UNIX, SOCK_DGRAM, 0); + } + return unbound_socket(AF_UNIX, SOCK_DGRAM); +} + +BUILTIN(builtin_getsockname) { + uint fd = safe_fixnum_val(final_arg(args)); + sa_len = sizeof sa; + chkp(getsockname(fd, &sa.sa, &sa_len)); + return parse_sockaddr(); +} + +BUILTIN(builtin_getpeername) { + uint fd = safe_fixnum_val(final_arg(args)); + sa_len = sizeof sa; + chkp(getpeername(fd, &sa.sa, &sa_len)); + return parse_sockaddr(); +} + +BUILTIN(builtin_connect_inet) { + uint fd = safe_fixnum_val(extract_arg(&args)); + build_sockaddr_in(final_arg(args)); + chkp(connect(fd, &sa.sa, sa_len)); + return SC_NULL; +} + +BUILTIN(builtin_connect_unix) { + uint fd = safe_fixnum_val(extract_arg(&args)); + build_sockaddr_un(final_arg(args)); + chkp(connect(fd, &sa.sa, sa_len)); + return SC_NULL; +} + +BUILTIN(builtin_listen) { + uint fd = safe_fixnum_val(extract_arg(&args)); + long backlog = safe_fixnum_val(final_arg(args)); + if (backlog < 0) sc_error("negative backlog"); + if (backlog > INT_MAX) backlog = INT_MAX; + chkp(listen(fd, backlog)); + return SC_NULL; +} + +BUILTIN(builtin_accept) { + uint fd = safe_fixnum_val(final_arg(args)); + return fixnum(chkp(accept(fd, 0, 0))); +} + +BUILTIN(builtin_close) { + chkp(close(safe_fixnum_val(final_arg(args)))); + return SC_NULL; +} + +BUILTIN(builtin_is_flonum) { return boolean(is_flonum(final_arg(args))); } + +/* NB: "if the value being converted is in the range of values that can be + * represented but cannot be represented exactly, the result is either the + * nearest higher or nearest lower value, chosen in an implementation-defined + * manner." -C89 */ +BUILTIN(builtin_flonum_unchecked) { + assert_args(1); return flonum(fixnum_val(car(args))); +} +BUILTIN(builtin_flonum_unsigned_unchecked) { + assert_args(1); return flonum(unsigned_fixnum_val(car(args))); +} + +BUILTIN(builtin_flo_eq_unchecked) { + assert_args(2); + return boolean(flonum_val(car(args)) == flonum_val(cadr(args))); +} +BUILTIN(builtin_flo_lt_unchecked) { + assert_args(2); + return boolean(flonum_val(car(args)) < flonum_val(cadr(args))); +} +BUILTIN(builtin_flo_le_unchecked) { + assert_args(2); + return boolean(flonum_val(car(args)) <= flonum_val(cadr(args))); +} +BUILTIN(builtin_flo_neg_unchecked) { + assert_args(1); return flonum(-flonum_val(car(args))); +} +BUILTIN(builtin_is_flo_neg_unchecked) { + assert_args(1); return boolean(flonum_val(car(args)) < 0); +} + +#define FLONUM_OP2(op) { \ + assert_args(2); \ + return flonum(flonum_val(car(args)) op flonum_val(cadr(args))); \ +} + +BUILTIN(builtin_flo_add_unchecked) FLONUM_OP2(+) +BUILTIN(builtin_flo_sub_unchecked) FLONUM_OP2(-) +BUILTIN(builtin_flo_mul_unchecked) FLONUM_OP2(*) +BUILTIN(builtin_flo_div_unchecked) FLONUM_OP2(/) + +BUILTIN(builtin_flo_quotient_unchecked) { + assert_args(2); + return flonum(trunc(flonum_val(car(args)) / flonum_val(cadr(args)))); +} + +BUILTIN(builtin_flo_remainder_unchecked) { + double a, b; + assert_args(2); + a = flonum_val(car(args)); + b = flonum_val(cadr(args)); + return flonum(a < 0 ? -fmod(-a, fabs(b)) : fmod(a, fabs(b))); +} + +BUILTIN(builtin_frac_exp_unchecked) { + int e; + double frac; + assert_args(1); + frac = frexp(flonum_val(car(args)), &e); + R_CDR = SC_NULL; R_CAR = fixnum(e); + R_CDR = cons(); R_CAR = flonum(frac); + RETURN_VALUES(cons()); +} + +BUILTIN(builtin_load_exp_unchecked) { + assert_args(2); + return flonum(ldexp(flonum_val(car(args)), fixnum_val(cadr(args)))); +} + +BUILTIN(builtin_is_inf_unchecked) { + double d; + assert_args(1); + d = flonum_val(car(args)); + return boolean(d == HUGE_VAL || d == -HUGE_VAL); +} + +BUILTIN(builtin_flo_to_fix_unchecked) { + double d; + assert_args(1); + d = flonum_val(car(args)); + assert(fabs(d) <= (double)(1L << VAL_BITS)); + /* ^ Catches overflow of double to long conversion, which is UB, though + * not of long to fixnum (how tight the check can be made is not yet clear + * to me.) */ + return fixnum(d); +} + +#define MATH_FUNC(f) { \ + assert_args(1); return flonum(f(flonum_val(car(args)))); \ +} +BUILTIN(builtin_floor) MATH_FUNC(floor) +BUILTIN(builtin_ceiling) MATH_FUNC(ceil) +BUILTIN(builtin_truncate) MATH_FUNC(trunc) +BUILTIN(builtin_round) MATH_FUNC(nearbyint) +BUILTIN(builtin_exp) MATH_FUNC(exp) +BUILTIN(builtin_log) MATH_FUNC(log) +BUILTIN(builtin_sin) MATH_FUNC(sin) +BUILTIN(builtin_cos) MATH_FUNC(cos) +BUILTIN(builtin_tan) MATH_FUNC(tan) +BUILTIN(builtin_asin) MATH_FUNC(asin) +BUILTIN(builtin_acos) MATH_FUNC(acos) +BUILTIN(builtin_atan) MATH_FUNC(atan) +BUILTIN(builtin_atan2) { + assert_args(2); + return flonum(atan2(flonum_val(car(args)), flonum_val(cadr(args)))); +} +BUILTIN(builtin_sqrt) MATH_FUNC(sqrt) + +BUILTIN(builtin_rev_list_to_vec_unchecked) { + assert_args(1); + R_EXPR = car(args); + return rev_list_to_vec(); +} + +BUILTIN(builtin_is_builtin) { + return boolean(is_builtin(final_arg(args))); +} +BUILTIN(builtin_builtin_name) { + value b = final_arg(args); + if (!is_builtin(b)) sc_error("not a builtin"); + return string(builtin_name(b)); +} +BUILTIN(builtin_is_promise) { + return boolean(is_promise(final_arg(args))); +} +BUILTIN(builtin_is_continuation) { + return boolean(is_continuation(final_arg(args))); +} + +BUILTIN(builtin_make_bignum) { + assert_args(1); + /* Returning uninitialized is safe for the garbage collector: bignums are + * not scanned internally, though the words do keep their fixnum tags. Of + * course, used memory is still being exposed; the privileged bignum + * library is responsible for fully initializing or truncating. */ + return make_bignum_uninit(fixnum_val(car(args)), 0); +} +BUILTIN(builtin_is_bignum) { + assert_args(1); return boolean(is_bignum(car(args))); +} +BUILTIN(builtin_is_bignum_negative) { + assert_args(1); return boolean(is_bignum_negative(car(args))); +} +BUILTIN(builtin_bignum_set_negative) { + assert_args(1); return bignum_set_negative(car(args)); +} +BUILTIN(builtin_bignum_ref) { + assert_args(2); return bignum_ref(car(args), fixnum_val(cadr(args))); +} +BUILTIN(builtin_bignum_set) { + value bn; + assert_args(3); + bn = car(args); args = cdr(args); + bignum_set(bn, fixnum_val(car(args)), cadr(args)); + return SC_NULL; +} +BUILTIN(builtin_bignum_length) { + assert_args(1); return fixnum(bignum_len(car(args))); +} +BUILTIN(builtin_bignum_truncate) { + assert_args(2); return bignum_truncate(car(args), fixnum_val(cadr(args))); +} + +/* Construct bignum from signed fixnum, not demoting. */ +BUILTIN(builtin_bignum) { + value bn, word, word_sign_bit, word_sign_ext; + assert_args(1); + /* branch-free conversion from two's complement to sign-magnitude */ + word = fixnum_val(car(args)); + word_sign_bit = word >> ((8*sizeof word)-1); + word_sign_ext = ((long)word) >> ((8*sizeof word)-1); + word = (word ^ word_sign_ext) + word_sign_bit; + bn = make_bignum_uninit(1, word_sign_bit); + bignum_set(bn, 0, fixnum(word)); + return bn; +} + +/* Construct bignum from unsigned fixnum, not demoting. */ +BUILTIN(builtin_bignum_unsigned) { + value bn, word; + assert_args(1); + word = car(args); + bn = make_bignum_uninit(1, 0); + bignum_set(bn, 0, word); + return bn; +} + +/* Construct bignum from 2-word signed quantity, normalizing and demoting to + * fixnum when possible. */ +BUILTIN(builtin_bignum2) { + value bn; + long lo, hi; + int neg = 0; + assert_args(2); + lo = fixnum_val(car(args)); + hi = fixnum_val(cadr(args)); + /* in signed fixnum range if high word is sign extension of low */ + if (lo >> (VAL_BITS - 1) == hi) return fixnum(lo); + if (hi < 0) { + /* convert from two's complement to sign-magnitude */ + neg = 1; + /* capture carry bit in the tag by setting it to all ones prior to + * complement */ + lo = -(lo | (-1L << VAL_BITS)); + hi = ~(ulong)hi + (((ulong)lo) >> VAL_BITS); + } + if (hi == 0) { + /* need to drop high word to normalize */ + bn = make_bignum_uninit(1, neg); + bignum_set(bn, 0, fixnum(lo)); + } + else { + /* both words significant */ + bn = make_bignum_uninit(2, neg); + bignum_set(bn, 0, fixnum(lo)); + bignum_set(bn, 1, fixnum(hi)); + } + return bn; +} + + +/**************** + * Initialization + */ + +/* Construct a builtin and define it in the top frame of R_ENV. + * Side effects: R_CAR R_CDR */ +static void add_builtin(const char *name, builtin_func_t func) { + R_CAR = symbol(name); + R_CDR = builtin(name, func); + assert(global_frame_lookup(R_CAR, car(R_ENV)) == SC_FALSE); + extend_global_env(); +} + +/* Define a variable in the top frame of R_ENV. + * Side effects: R_EXPR R_CAR R_CDR */ +static void add_variable(const char *name, value val) { + R_EXPR = val; + R_CAR = symbol(name); + R_CDR = R_EXPR; + assert(global_frame_lookup(R_CAR, car(R_ENV)) == SC_FALSE); + extend_global_env(); +} + +/* Side effects: R_RESULT */ +static value open_lib_file(const char *filename) { + int fd = open_cloexec(filename, O_RDONLY); + if (fd == -1) fatal1(filename, strerror(errno)); + return make_port(fd, 0, DEFAULT_R_BUF); +} + +uint sc_hugepages; + +void sc_init(value heap_alloc) { + int mflags; + assert(sizeof(value) == __SIZEOF_POINTER__); + assert(sizeof(value) == sizeof(ulong)); + + mflags = MAP_PRIVATE | MAP_ANON; + if (sc_hugepages) { +#ifdef MAP_HUGETLB + mflags |= MAP_HUGETLB; +#else + fatal("huge pages not supported"); +#endif + } + heap = mmap(NULL, heap_alloc, PROT_READ | PROT_WRITE, mflags, -1, 0); + if (heap == MAP_FAILED) fatal1("failed to map heap", strerror(errno)); + heap_size = heap_alloc / sizeof(value) / 2; + new_heap = heap + heap_size; + + gc_root(&r0); + gc_root(&r1); + gc_root(&r2); + gc_root(&r3); + gc_root(&r4); + gc_root(&r5); + gc_root(&r6); + gc_root(&r_stack); + gc_root(&r_spool); + gc_root(&r_error_cont); + gc_root(&r_signal_handler); + gc_root(&r_compiler); + gc_root(&r_compiler_expr); + gc_root(&r_input_port); + gc_root(&r_output_port); + gc_root(&r_dump); + gc_root(&stdin_port); + gc_root(&stdout_port); + gc_root(&symbols); + gc_root(&s_lambda); + gc_root(&s_quote); + gc_root(&s_quasiquote); + gc_root(&s_unquote); + gc_root(&s_unquote_splicing); + gc_root(&s_if); + gc_root(&s_set); + gc_root(&s_begin); + gc_root(&s_letrec); + gc_root(&s_define); + gc_root(&s_delay); + gc_root(&s_literal); + gc_root(&s_open_paren); + gc_root(&s_close_paren); + gc_root(&s_dot); + gc_root(&s_open_vector); + gc_root(&s_identifier); + gc_root(&s_named_char); + gc_root(&s_abbrev); + gc_root(&s_number); + gc_root(&s_truncate); + gc_root(&s_overwrite); + gc_root(&s_append); + gc_root(&s_sync); + gc_root(&s_data_sync); + gc_root(&r5rs_env); + gc_root(&gscm_env); + gc_root(&interaction_env); + gc_root(&toplevel_env); + + r_input_port = stdin_port = make_port(0, 0, DEFAULT_R_BUF); + r_output_port = stdout_port = make_port(1, 1, DEFAULT_W_BUF); + stdout_ready = 1; + fixnum_zero = fixnum(0); + fixnum_one = fixnum(1); + + s_lambda = symbol("lambda"); + s_quote = symbol("quote"); + s_quasiquote = symbol("quasiquote"); + s_unquote = symbol("unquote"); + s_unquote_splicing = symbol("unquote-splicing"); + s_if = symbol("if"); + s_set = symbol("set!"); + s_begin = symbol("begin"); + s_letrec = symbol("letrec"); + s_define = symbol("define"); + s_delay = symbol("delay"); + s_literal = symbol("literal"); + s_open_paren = symbol("open-paren"); + s_close_paren = symbol("close-paren"); + s_dot = symbol("dot"); + s_open_vector = symbol("open-vector"); + s_identifier = symbol("identifier"); + s_named_char = symbol("named-char"); + s_abbrev = symbol("abbrev"); + s_number = symbol("number"); + s_truncate = symbol("truncate"); + s_overwrite = symbol("overwrite"); + s_append = symbol("append"); + s_sync = symbol("sync"); + s_data_sync = symbol("data-sync"); + + R_CAR = R_CDR = SC_NULL; + R_ENV = r5rs_env = cons(); + add_builtin("eq?", builtin_is_eq); + add_builtin("number?", builtin_is_number); + add_builtin("complex?", builtin_is_number); + add_builtin("real?", builtin_is_number); + add_builtin("rational?", builtin_is_number); + add_builtin("integer?", builtin_is_integer); + add_builtin("exact?", builtin_is_exact); + add_builtin("inexact?", builtin_is_inexact); + add_builtin("not", builtin_not); + add_builtin("boolean?", builtin_is_boolean); + add_builtin("pair?", builtin_is_pair); + add_builtin("cons", builtin_cons); + add_builtin("car", builtin_car); + add_builtin("cdr", builtin_cdr); + add_builtin("caar", builtin_caar); + add_builtin("cadr", builtin_cadr); + add_builtin("cdar", builtin_cdar); + add_builtin("cddr", builtin_cddr); + add_builtin("caaar", builtin_caaar); + add_builtin("caadr", builtin_caadr); + add_builtin("cadar", builtin_cadar); + add_builtin("caddr", builtin_caddr); + add_builtin("cdaar", builtin_cdaar); + add_builtin("cdadr", builtin_cdadr); + add_builtin("cddar", builtin_cddar); + add_builtin("cdddr", builtin_cdddr); + add_builtin("caaaar", builtin_caaaar); + add_builtin("caaadr", builtin_caaadr); + add_builtin("caadar", builtin_caadar); + add_builtin("caaddr", builtin_caaddr); + add_builtin("cadaar", builtin_cadaar); + add_builtin("cadadr", builtin_cadadr); + add_builtin("caddar", builtin_caddar); + add_builtin("cadddr", builtin_cadddr); + add_builtin("cdaaar", builtin_cdaaar); + add_builtin("cdaadr", builtin_cdaadr); + add_builtin("cdadar", builtin_cdadar); + add_builtin("cdaddr", builtin_cdaddr); + add_builtin("cddaar", builtin_cddaar); + add_builtin("cddadr", builtin_cddadr); + add_builtin("cdddar", builtin_cdddar); + add_builtin("cddddr", builtin_cddddr); + add_builtin("set-car!", builtin_set_car); + add_builtin("set-cdr!", builtin_set_cdr); + add_builtin("null?", builtin_is_null); + add_builtin("list?", builtin_is_list); + add_builtin("length", builtin_length); + add_builtin("symbol?", builtin_is_symbol); + add_builtin("symbol->string", builtin_sym_to_str); + add_builtin("string->symbol", builtin_str_to_sym); + add_builtin("char?", builtin_is_char); + add_builtin("char=?", builtin_char_eq); + add_builtin("char?", builtin_char_gt); + add_builtin("char<=?", builtin_char_le); + add_builtin("char>=?", builtin_char_ge); + add_builtin("char-ci=?", builtin_char_ci_eq); + add_builtin("char-ci?", builtin_char_ci_gt); + add_builtin("char-ci<=?", builtin_char_ci_le); + add_builtin("char-ci>=?", builtin_char_ci_ge); + add_builtin("char-alphabetic?", builtin_char_is_alpha); + add_builtin("char-numeric?", builtin_char_is_num); + add_builtin("char-whitespace?", builtin_char_is_white); + add_builtin("char-upper-case?", builtin_char_is_upper); + add_builtin("char-lower-case?", builtin_char_is_lower); + add_builtin("char->integer", builtin_char_to_int); + add_builtin("integer->char", builtin_int_to_char); + add_builtin("char-upcase", builtin_char_upcase); + add_builtin("char-downcase", builtin_char_downcase); + add_builtin("string?", builtin_is_str); + add_builtin("make-string", builtin_make_str); + add_builtin("string-length",builtin_str_length); + add_builtin("string-ref", builtin_str_ref); + add_builtin("string-set!", builtin_str_set); + add_builtin("string=?", builtin_str_eq); + add_builtin("string?", builtin_str_gt); + add_builtin("string<=?", builtin_str_le); + add_builtin("string>=?", builtin_str_ge); + add_builtin("string-ci=?", builtin_str_ci_eq); + add_builtin("string-ci?", builtin_str_ci_gt); + add_builtin("string-ci<=?", builtin_str_ci_le); + add_builtin("string-ci>=?", builtin_str_ci_ge); + add_builtin("substring", builtin_substr); + add_builtin("string-append",builtin_str_append); + add_builtin("list->string", builtin_list_to_str); + add_builtin("string-copy", builtin_str_copy); + add_builtin("string-fill!", builtin_str_fill); + add_builtin("vector?", builtin_is_vector); + add_builtin("make-vector", builtin_make_vector); + add_builtin("vector-length",builtin_vec_length); + add_builtin("vector-ref", builtin_vec_ref); + add_builtin("vector-set!", builtin_vec_set); + add_builtin("list->vector", builtin_list_to_vec); + add_builtin("vector-fill!", builtin_vec_fill); + add_builtin("procedure?", builtin_is_procedure); + add_builtin("force", builtin_force); + add_builtin("call-with-current-continuation", builtin_call_cc); + add_builtin("call/cc", builtin_call_cc); + add_builtin("values", builtin_values); + add_builtin("call-with-values", builtin_call_with_values); + add_builtin("eval", builtin_eval); + add_builtin("scheme-report-environment", builtin_report_env); + add_builtin("null-environment", builtin_null_env); + add_builtin("interaction-environment", builtin_interaction_env); + add_builtin("port?", builtin_is_port); + add_builtin("input-port?", builtin_is_in_port); + add_builtin("output-port?", builtin_is_out_port); + add_builtin("current-input-port", builtin_current_in_port); + add_builtin("current-output-port", builtin_current_out_port); + add_builtin("open-input-file", builtin_open_in_file); + add_builtin("open-output-file", builtin_open_out_file); + add_builtin("close-input-port", builtin_close_in_port); + add_builtin("close-output-port", builtin_close_out_port); + add_builtin("read-char", builtin_read_char); + add_builtin("peek-char", builtin_peek_char); + add_builtin("eof-object?", builtin_is_eof); + add_builtin("char-ready?", builtin_is_char_ready); + add_builtin("write-char", builtin_write_char); + + /* Immutable environment for extensions */ + R_CAR = SC_NULL; R_CDR = r5rs_env; + R_ENV = gscm_env = cons(); + add_builtin("gales-scheme-environment", builtin_gscm_env); + add_builtin("immutable?", builtin_is_immutable); + add_builtin("cons/immutable", builtin_cons_immutable); + add_builtin("string-copy/immutable", builtin_str_copy_immutable); + add_builtin("vector-copy/immutable", builtin_vec_copy_immutable); + add_builtin("flush-output-port", builtin_flush_out_port); + add_builtin("error", builtin_error); + add_builtin("gc", builtin_gc); + add_variable("*fixnum-width*", fixnum(VAL_BITS)); + add_variable("*greatest-fixnum*", fixnum(FIXNUM_MAX)); + add_variable("*least-fixnum*", fixnum(FIXNUM_MIN)); + add_builtin("fixnum?", builtin_is_fixnum); + add_builtin("fx=", builtin_fx_eq); + add_builtin("fx<", builtin_fx_lt); + add_builtin("fx<=", builtin_fx_le); + add_builtin("fx> 1)); + /* ^ sign-encoded arity must fit in procedure header; frame index must fit + * in variable ref header */ + add_builtin("define-r5rs", builtin_define_r5rs); + add_builtin("define-gscm", builtin_define_gscm); + add_builtin("set-input-port!", builtin_set_in_port); + add_builtin("set-output-port!", builtin_set_out_port); + add_builtin("push-winding!", builtin_push_winding); + add_builtin("variable-ref", builtin_variable_ref); + add_builtin("apply/unchecked", builtin_apply_unchecked); + add_builtin("car/unchecked", builtin_car_unchecked); + add_builtin("cdr/unchecked", builtin_cdr_unchecked); + add_builtin("set-car/unchecked!", builtin_set_car_unchecked); + add_builtin("set-cdr/unchecked!", builtin_set_cdr_unchecked); + add_builtin("string-ref/unchecked", builtin_str_ref_unchecked); + add_builtin("vector-ref/unchecked", builtin_vec_ref_unchecked); + add_builtin("fx+/unchecked", builtin_fx_add_unchecked); + add_builtin("fx-/unchecked", builtin_fx_sub_unchecked); + add_builtin("fx=/unchecked", builtin_fx_eq_unchecked); + add_builtin("fxdec/unchecked", builtin_fixnum_to_dec_unchecked); + add_builtin("fixnum->hex/unchecked", builtin_fixnum_to_hex_unchecked); + add_builtin("fixnum->oct/unchecked", builtin_fixnum_to_oct_unchecked); + add_builtin("fixnum->bin/unchecked", builtin_fixnum_to_bin_unchecked); + add_builtin("fixnum->bin/unsigned/unchecked", + builtin_fixnum_to_bin_unsigned_unchecked); + add_builtin("flonum->dec/unchecked", builtin_flonum_to_dec_unchecked); + add_builtin("set-error-continuation!", builtin_set_err_cont); + add_builtin("inet-stream-socket", builtin_inet_stream_sock); + add_builtin("inet-dgram-socket", builtin_inet_dgram_sock); + add_builtin("unix-stream-socket", builtin_unix_stream_sock); + add_builtin("unix-dgram-socket", builtin_unix_dgram_sock); + add_builtin("socket-ports", builtin_socket_ports); + add_builtin("getsockname", builtin_getsockname); + add_builtin("getpeername", builtin_getpeername); + add_builtin("connect-inet", builtin_connect_inet); + add_builtin("connect-unix", builtin_connect_unix); + add_builtin("listen", builtin_listen); + add_builtin("accept", builtin_accept); + add_builtin("close", builtin_close); + add_builtin("flonum?", builtin_is_flonum); + add_builtin("flonum/unchecked", builtin_flonum_unchecked); + add_builtin("flonum/unsigned/unchecked", builtin_flonum_unsigned_unchecked); + add_builtin("flo=/unchecked", builtin_flo_eq_unchecked); + add_builtin("flofixnum/unchecked", builtin_flo_to_fix_unchecked); + add_builtin("floor/unchecked", builtin_floor); + add_builtin("ceiling/unchecked", builtin_ceiling); + add_builtin("truncate/unchecked", builtin_truncate); + add_builtin("round/unchecked", builtin_round); + add_builtin("exp/unchecked", builtin_exp); + add_builtin("log/unchecked", builtin_log); + add_builtin("sin/unchecked", builtin_sin); + add_builtin("cos/unchecked", builtin_cos); + add_builtin("tan/unchecked", builtin_tan); + add_builtin("asin/unchecked", builtin_asin); + add_builtin("acos/unchecked", builtin_acos); + add_builtin("atan/unchecked", builtin_atan); + add_builtin("atan2/unchecked", builtin_atan2); + add_builtin("sqrt/unchecked", builtin_sqrt); + add_builtin("reverse-list->vector/unchecked", builtin_rev_list_to_vec_unchecked); + add_builtin("builtin?", builtin_is_builtin); + add_builtin("builtin-name", builtin_builtin_name); + add_builtin("promise?", builtin_is_promise); + add_builtin("continuation?", builtin_is_continuation); + add_builtin("make-bignum", builtin_make_bignum); + add_builtin("bignum?", builtin_is_bignum); + add_builtin("bignum-negative?", builtin_is_bignum_negative); + add_builtin("bignum-set-negative!", builtin_bignum_set_negative); + add_builtin("bignum-ref", builtin_bignum_ref); + add_builtin("bignum-set!", builtin_bignum_set); + add_builtin("bignum-length", builtin_bignum_length); + add_builtin("bignum", builtin_bignum); + add_builtin("bignum/unsigned", builtin_bignum_unsigned); + add_builtin("bignum2", builtin_bignum2); + add_builtin("bignum-truncate!", builtin_bignum_truncate); + + R_PORT = open_lib_file(GSCMLIB "/compiler.scm"); + err_context = "compiler"; + r_compiler_expr = sc_read(); + if (r_compiler_expr == SC_EOF) fatal("EOF reading compiler code"); + close_port(R_PORT); + R_EXPR = r_compiler_expr; + R_ENV = toplevel_env; + evaluator(); + r_compiler = R_RESULT; + /* Self-compile, for the speed benefit of variable refs */ + R_EXPR = r_compiler_expr; + R_ENV = toplevel_env; + r_compiler_expr = SC_NULL; + evaluator(); + r_compiler = R_RESULT; +} + +int sc_toplevel(int argc, char **argv) { + int i; + R_CDR = SC_NULL; + for (i=argc-1; i>=0; --i) { + R_CAR = string(argv[i]); + R_CDR = cons(); + } + R_ENV = interaction_env; + add_variable("*args*", R_CDR); + + R_PORT = open_lib_file(GSCMLIB "/toplevel.scm"); + err_context = "toplevel"; + R_EXPR = sc_read(); + if (R_EXPR == SC_EOF) fatal("EOF reading toplevel code"); + close_port(R_PORT); + R_ENV = toplevel_env; + evaluator(); + flush_all(); + return fixnum_val(R_RESULT); +} diff -uNr a/gscm/src/gscm.h b/gscm/src/gscm.h --- a/gscm/src/gscm.h false +++ b/gscm/src/gscm.h 1e1c352715558518fcb47d3a01f2a4e2c7f76c88118f8d01b429b7c8c2322d32dc3c357933405eb145e282a765c7e57cab68e4de8cb5b04ac1e77dd9e06a0b36 @@ -0,0 +1,48 @@ +/* Externally visible interface of gscm.c + * To verify: + * egrep -v '^($| |[A-Z]|}|/|#|static|typedef|__attribute__|.*:$)' gscm.c + */ + +#ifndef GSCM_H +#define GSCM_H + +#include +typedef size_t sc_value; + +#ifndef __SIZEOF_POINTER__ +#if defined(__amd64__) +#define __SIZEOF_POINTER__ 8 +#elif defined(__i386__) +#define __SIZEOF_POINTER__ 4 +#endif +#endif + +extern unsigned int sc_hugepages, sc_gc_verbose, sc_gc_thrash_factor; + +#ifdef NDEBUG +#define assert(expr) (void)0 +#else +#define assert(expr) \ + ((void)((expr) || (sc_assert_fail(__FILE__, __LINE__, __func__, #expr),0))) +#endif +void sc_assert_fail(const char *, unsigned long, const char *, const char *); + +void sc_exit(int status); +void sc_write_error(const char *msg); +void sc_error(const char *msg); +void sc_error1(const char *msg, sc_value detail); +void sc_perror(void); +void sc_perror1(sc_value detail); +void sc_gc(void); +void sc_dump(sc_value v); +void sc_init(sc_value heap_bytes); +int sc_toplevel(int argc, char **argv); + +/* Internal platform-dependent functions */ + +void sc_wide_mul(sc_value *a, sc_value *b); +void sc_wide_mul_signed(sc_value *a, sc_value *b); +void sc_div_extended(sc_value *a_lo, sc_value *a_hi, sc_value b); +int sc_bit_length(sc_value a); + +#endif diff -uNr a/gscm/src/main.c b/gscm/src/main.c --- a/gscm/src/main.c false +++ b/gscm/src/main.c fa3f7cdd3376c15f20d01c66d78bb405357932d7b82edce6e467a1dc22e4e02e10b326e33e90fca6f2b7e167e47c835cf70477a992843c0660a873026117521b @@ -0,0 +1,94 @@ +/* Gales Scheme executable entry point and option parser + * J. Welsh, 2017 + */ + +#include "gscm.h" + +#ifndef GSCM_DEF_HEAP +#define GSCM_DEF_HEAP 128 +#endif + +#define STR_(token) #token +#define STR(token_or_macro) STR_(token_or_macro) + +static const char usage[] = +"Usage: gscm [-Hcgh] [-m SIZE] [--] [FILE [ARG ...]]" +"\n" +"\nExecute Scheme source FILE (- for standard input, unless -- is used), with FILE and any ARGs stored in the global list *ARGS*. If no FILE is given, start a read-eval-print loop." +"\n" +"\n-H: Use huge pages if supported" +"\n-c: Load saved core instead of source (TODO)" +"\n-g: Print memory stats to standard error on garbage collection" +"\n-h: Print this help message" +"\n-m: Heap size in MiB. Twice this is fully allocated at startup from the OS kernel's point of view (half reserved for garbage collection). A larger heap reduces GC frequency but may increase virtual memory overhead (TLB thrashing)." +"\n" +"\nBuild configuration:" +"\n Word size (bytes): " STR(__SIZEOF_POINTER__) +"\n Default heap (-m): " STR(GSCM_DEF_HEAP) +"\n Library path: " GSCMLIB +"\n"; + +/* Strictly parse non-negative decimal integer in null-terminated string src. On success, return true and store the result in dst. On failure (empty string, non-digit or overflow), return false. */ +static int parse_ulong_dec(unsigned long *dst, const char *src) { + unsigned long acc = 0; + if (!*src) return 0; + do { + unsigned char digit = ((unsigned char)(*src)) - '0'; + if (digit > 9) return 0; + if (acc > (-1UL)/10) return 0; + acc *= 10; + if (acc + digit < acc) return 0; + acc += digit; + } while (*++src); + *dst = acc; + return 1; +} + +int main(int argc, char **argv) { + char *arg; + sc_value heap_mb = GSCM_DEF_HEAP, heap_alloc; +#define SHIFT { ++argv; --argc; } + if (argc) SHIFT; + while (argc) { + arg = *argv; + if (arg[0] != '-' || !arg[1] || (arg[1] == '-' && !arg[2])) break; + while (*++arg) switch (*arg) { + case 'H': + sc_hugepages = 1; break; + /* case 'c': break; */ + case 'g': + sc_gc_verbose = 1; break; + case 'h': + sc_write_error(usage); return 0; + case 'm': + SHIFT; + if (!argc) { + sc_write_error("gscm: missing -m argument\n"); + goto err; + } + if (!(parse_ulong_dec(&heap_mb, arg) && heap_mb)) { + sc_write_error("gscm: bad -m argument\n"); + goto err; + } + break; + default: + { + char tail[] = {*arg, '\n', 0}; + sc_write_error("gscm: bad option: "); + sc_write_error(tail); + goto err; + } + } + SHIFT; + } + heap_alloc = heap_mb<<21; + if (heap_alloc>>21 != heap_mb) { + sc_write_error("gscm: heap size too big (integer overflow)\n"); + return 1; + } + sc_init(heap_alloc); + return sc_toplevel(argc, argv); +err: + sc_write_error(usage); + return 1; +} diff -uNr a/gscm/src/wishlist b/gscm/src/wishlist --- a/gscm/src/wishlist false +++ b/gscm/src/wishlist 0a01abfa50e21c0a2e19b59ff083578c722a34782ad98456f45e0b672eb1ed97100ee8a2d4d6f4669f9e5251f9bbc89e6767c42dcb447fa46d82573eacacc411 @@ -0,0 +1,16 @@ +More extended types? (sockets) +SIGINT +Dispatch table for builtins; saving cores +Reorganize tag bits (special/char/builtin sharing etc.) +Flexible heap size + +Extensions: +Fixnum: popcount +I/O conditions +Time of day +Sockets: hosts / services file parsing, DNS + +Optimization: +Syntax dispatch +Use tries for symbol interning & variable lookup +Some list procedures could build by mutation to avoid non-tail recursion diff -uNr a/gscm/tests/common.scm b/gscm/tests/common.scm --- a/gscm/tests/common.scm false +++ b/gscm/tests/common.scm ce130771bc7c8151e82e2a46e873a804ca9edc3a71c613806e1986f8f3d44361202c97b0ef5be13072bcc696c2fca5eaaa85f1030de7fd43a6db425f16d7dd65 @@ -0,0 +1,14 @@ +(define interaction-env + (interaction-environment)) ; R5RS +; user-initial-environment) ; MIT + +(define (write-line x) + (write x) (newline)) + +(define (assert x . context) + (write-line (cons (if x 'pass 'fail) context))) + +(define (assert-equal a b . context) + (write-line (if (equal? a b) + (cons 'pass context) + (append (list 'fail a b) context)))) diff -uNr a/gscm/tests/compiler.scm b/gscm/tests/compiler.scm --- a/gscm/tests/compiler.scm false +++ b/gscm/tests/compiler.scm d85c54ab30280cf784fecb906115f0aa314a482bb5b9e76f74a8152f8c4af21ae9b757a1842577a8e6ceaa0b87cdbb223d1539ba8b9ff9132ddca36d70cd612c @@ -0,0 +1,24 @@ +(load "common.scm") + +(define good (list '+ 1 1)) +(assert-equal + (eval (list 'begin good + '(set-cdr! good 'crashme) ;; ill-formed expression (+ . crashme) + good) + interaction-env) + 2 'mutating-compiled-expression) + +(let ((formals (list 'x))) + (let ((proc (eval (list 'lambda formals 'x) interaction-env))) + (set-car! formals 'y) + (assert-equal (proc 1) 1 'mutating-compiled-formals))) + +; Example type safety violation (segfault) if builtins used by the compiler or +; library are mutable from user code. (Doesn't stress gscm properly now that +; MAP is no longer a builtin.) +(assert-equal + (eval '(begin + (set! map (lambda args 'crashme)) + (eval '(+) interaction-env)) + interaction-env) + 0 'mutating-compiler-dependency) diff -uNr a/gscm/tests/control.scm b/gscm/tests/control.scm --- a/gscm/tests/control.scm false +++ b/gscm/tests/control.scm 0206e59daa2fe210e24241dc52523e4236d12de46ac69f983e56422f6eeca3ed33db06aac60780cd1b3de8f1cceaed164c29f685d600e1c32eea1ed7e60755d4 @@ -0,0 +1,127 @@ +(load "common.scm") + +(assert-equal + (call-with-current-continuation (lambda (return) (return 1) error)) + 1 'escape) + +(assert-equal (values 1) 1 'values-normal-continuation) + +(assert-equal + (call-with-values (lambda () 1) +) + 1 'call-with-values-normal-return) + +(assert-equal + (call-with-values (lambda () (call/cc (lambda (return) (return 1)))) +) + 1 'call-with-values-continuation) + +(assert-equal + (call-with-values (lambda () (call/cc (lambda (return) (return 1 2)))) +) + 3 'multi-valued-continuation) + +(assert-equal (call-with-values (lambda () (values 1 2)) +) 3 'multi-values) + +(assert-equal (call-with-values (lambda () (values)) +) 0 'no-values) + +;; Schroedinger's variable: letrec variables are in scope yet undefined (error +;; to set or reference them) within the initializers, and fully defined within +;; the body. But what if the initializers are re-entered after they've already +;; completed? Seems like this should be an error, but difficult to detect. +(let ((enter #f) + (init #f)) + (letrec ((a 2) + (b (if (call/cc (lambda (c) (set! enter c) #f)) + a 1))) + (if init (assert-equal b 2 'letrec-reentry))) + (if (not init) + (begin (set! init #t) (enter #t)))) + +(letrec + ((done #f) + (p (delay (or done + (begin (set! done #t) + (force p) + #f))))) + (assert (force p) 'recursive-force)) + +;;; Testing dynamic-wind + +(let ((r '()) + (level 0) + (enter #f) + (done #f)) + (define (push x) (set! r (cons x r))) + (define (test vals context) + (assert-equal (reverse r) vals context) + (set! r '())) + (define (trace thunk) + (let ((l level)) + (dynamic-wind (lambda () (push `(enter ,l)) (set! level (+ l 1))) + (lambda () (let ((r (thunk))) (push r) r)) + (lambda () (push `(exit ,l)) (set! level l))))) + (define (capture) + (trace (lambda () (call/cc (lambda (c) (set! enter c) 1))))) + + (trace (lambda () 1)) + (test '((enter 0) 1 (exit 0)) 'simple-dynamic-wind) + + (trace (lambda () (call/cc (lambda (return) (return 1))))) + (test '((enter 0) 1 (exit 0)) 'escape-within-same-extent) + + (assert-equal + (call-with-values + (lambda () (call/cc (lambda (return) + (trace (lambda () + (trace (lambda () (return 1 2)))))))) +) + 3 'multi-valued-continuation-dynamic-wind) + (test '((enter 0) + (enter 1) + (exit 1) + (exit 0)) 'escape-dynamic-extents) + + (trace capture) + (if (not done) + (begin (test '((enter 0) + (enter 1) + 1 (exit 1) + 1 (exit 0)) 'captured-inner) + (set! done #t) + (enter 2))) + (test '((enter 0) + (enter 1) + 2 (exit 1) + 2 (exit 0)) 'reenter-dynamic-extents) + + (set! done #f) + (trace (lambda () + (trace capture) + (if done 3 + (begin (set! level 10) + (trace (lambda () (set! done #t) (enter 2))))))) + (test '((enter 0) + (enter 1) + (enter 2) + 1 (exit 2) + 1 (exit 1) + (enter 10) + (exit 10) + (enter 1) + (enter 2) + 2 (exit 2) + 2 (exit 1) + 3 (exit 0)) 'reenter-with-common-ancestor) + + (set! done #f) + (call/cc (lambda (continue) + (let try () + (set-error-handler! + (lambda args + (set! done #t) + (push 'handled) + (try))) + (if (not done) + (trace (lambda () barf)) + (begin (test '((enter 0) (exit 0) handled) 'error-escape) + (continue '())))))) + + ;; TODO escape from before/after shenanigans + ) diff -uNr a/gscm/tests/extensions.scm b/gscm/tests/extensions.scm --- a/gscm/tests/extensions.scm false +++ b/gscm/tests/extensions.scm f07e839012d43c8fb37336a86efc1035d293c78c3f1f30220a9134602b9e06e21bb5d4e6e1399bc345ffebde73a76b485704f4de0104fbed3075cbb40ead6160 @@ -0,0 +1,133 @@ +(load "common.scm") + +;;; Immutability + +(define (ident x) x) + +(assert + (not (or (immutable? (list 0)) + (immutable? (string #\a)) + (immutable? (vector 0)))) 'mutable-types) + +(assert + (and (immutable? (cons/immutable 0 1)) + (immutable? (string-copy/immutable (string #\a))) + (immutable? (vector-copy/immutable (vector 0))) + (immutable? (list)) + (immutable? #t) + (immutable? #f) + (immutable? 0) + (immutable? .1) + (immutable? (/ 1 2)) + (immutable? #\a) + (immutable? (integer->char 0)) + (immutable? 'a) + (immutable? (string->symbol "a")) + (immutable? (symbol->string 'a)) + (immutable? (current-output-port)) + (immutable? cons) + (immutable? ident) + (immutable? (delay 0)) + (immutable? (call-with-current-continuation ident))) + 'immutable-types) + +;; IMMUTABLE? is unspecified on empty strings and vectors although they're +;; inherently immutable. +;; +;; Rationale: allowing it to return false in these cases may be useful; for +;; example, code that asserts immutability might want to signal a bug if passed +;; a string that hasn't explicitly been made immutable. Also it may simplify +;; implementation. + +;;; Numbers + +;(receive (q r) (quotient/remainder 7 -5) +; (assert-equal q -1 'quotient/remainder) +; (assert-equal r 2)) + +;;; Fixnums + +(assert-equal (fxlength/unsigned 0) 0 'fxlength/unsigned) +(assert-equal (fxlength/unsigned 1) 1) +(assert-equal (fxlength/unsigned 2) 2) +(assert-equal (fxlength/unsigned 3) 2) +(assert-equal (fxlength/unsigned *greatest-fixnum*) (- *fixnum-width* 1)) +(assert-equal (fxlength/unsigned -1) *fixnum-width*) +(assert-equal (fxshift/unsigned -1 -1) *greatest-fixnum* 'fxshift/unsigned) +(assert-equal (fxshift -1 -1) -1 'fxshift) +(assert-equal (fxshift -1 1) -2) +(assert-equal (fxshift 1 *fixnum-width*) 0) +(assert-equal (fxshift 1 (- *fixnum-width* 1)) *least-fixnum*) +(assert-equal (fxshift/unsigned *least-fixnum* *fixnum-width*) 0) +(assert-equal (fxshift *least-fixnum* (- *fixnum-width*)) -1) +(assert-equal (fx+/wrap *greatest-fixnum* 1) *least-fixnum* 'fx+/wrap) +(assert-equal (fx-/wrap *least-fixnum*) *least-fixnum* 'fx-/wrap) +;; TODO more + +;;; Bignum conversions +(assert (positive? (- *least-fixnum*)) 'least-fx-promotion) +(assert-equal (- *least-fixnum*) (+ *greatest-fixnum* 1) 'least-fx-magnitude) +(assert-equal (- (- *least-fixnum*)) *least-fixnum* 'least-fx-demotion) + +;;; Sockets + +(define (write-all . args) (write-line args)) + +(let ((aref (lambda (alist field) (cadr (assq field alist)))) + (dispatch (lambda (alist) (lambda (field) (cadr (assq field alist))))) + (backlog 100)) + + (define (test-roundtrip i1 o1 i2 o2) + (let ((request "hello") + (response "world")) + (write request o1) + (close-output-port o1) + (assert-equal (read i2) request 'request) + (assert (eof-object? (read i2)) 'request-eof) + (close-input-port i2) + (write response o2) + (close-output-port o2) + (assert-equal (read i1) response 'response) + (assert (eof-object? (read i1)) 'response-eof) + (close-input-port i1))) + + (define (test-cli-srv listener client-opener) + (let* ((listener (dispatch listener)) + (cli (dispatch (client-opener (listener 'address)))) + (srv (dispatch ((listener 'accept))))) + ((listener 'close)) + ;; From ip(7) on Linux: "When connect(2) is called on an unbound socket, + ;; the socket is automatically bound to a random free port or to a usable + ;; shared port with the local address set to INADDR_ANY." Apparently this + ;; is not so; in this case the client's address becomes 127.0.0.1. + (assert-equal (cli 'address) (srv 'peer-address) 'srv-peer-matches-cli) + (assert-equal (cli 'peer-address) (srv 'address) 'cli-peer-matches-srv) + (test-roundtrip (cli 'input-port) (cli 'output-port) + (srv 'input-port) (srv 'output-port)))) + + (let* ((s (open-udp-socket)) + (a (aref s 'address))) + (assert-equal a '(#(0 0 0 0) 0) 'unbound-udp-address) + (assert (and (immutable? a) (immutable? (car a))) 'inet-address-immutable) + ((aref s 'close))) + + (let ((s (open-udp-socket '(#(127 0 0 1) 0)))) + (assert-equal (car (aref s 'address)) '#(127 0 0 1) 'bound-udp-address) + (assert (> (cadr (aref s 'address)) 0) 'auto-udp-port) + ((aref s 'close))) + + (let* ((l (open-unix-listener backlog "")) ;; Linux abstract socket + (a (aref l 'address))) + (assert (string? a) 'unix-listener-address) + (assert (immutable? a) 'unix-address-immutable) + (let ((s (open-unix-connection a))) + (assert-equal (aref s 'address) #f 'unbound-unix-stream-address) + ;((aref s 'close)) ;; unimplemented; would close the FD without shutdown + ((aref l 'close)))) + + (let ((s (open-unix-datagram-socket))) + (assert-equal (aref s 'address) #f 'unbound-unix-datagram-address) + ((aref s 'close))) + + (test-cli-srv (open-tcp-listener backlog) open-tcp-connection) + (test-cli-srv (open-unix-listener backlog "") open-unix-connection)) diff -uNr a/gscm/tests/library.scm b/gscm/tests/library.scm --- a/gscm/tests/library.scm false +++ b/gscm/tests/library.scm d718b0f503936373932183f5929db8c36cc9bfea4a0b6201225a1320f3bc4f45576d0ea286bd52c5850508aea9b354a8b16232d1b7bf54ab3f7a7c97f7a8f872 @@ -0,0 +1,41 @@ +(load "common.scm") + +(assert (equal? 1. 1.) 'equal-floats) +(assert (equal? (cons (cons 1 2) 3) '((1 . 2) . 3)) 'equal-pairs) +(assert (equal? (string #\a #\b #\c) "abc") 'equal-strings) +(assert (equal? (vector) '#()) 'equal-empty-vectors) +(assert (equal? (vector 1 2 (cons 3 4)) '#(1 2 (3 . 4))) 'equal-vectors) +(assert (not (equal? (cons 1 2) '())) 'unequal-types-1) +(assert (not (equal? "a" #\a)) 'unequal-types-2) +(assert (not (equal? "a" "A")) 'unequal-case) +(assert (not (equal? '#() '#(1))) 'unequal-vectors-1) +(assert (not (equal? '#(1) '#())) 'unequal-vectors-2) +(assert (not (equal? '#(1) '#(2))) 'unequal-vectors-3) + +(assert-equal (list? '(1 2 . 3)) #f 'improper-list) + +(let ((l (list 1 2 3))) + (set-cdr! (cddr l) l) + (assert-equal (list? l) #f 'cyclic-list)) + +(let* ((x (list 1)) + (y (list 2)) + (z (append x y))) + (assert-equal z '(1 2) 'append) + (set-car! x 0) + (assert-equal z '(1 2) 'append-copied-first) + (set-car! x 1) + (set-car! y 0) + (assert-equal z '(1 0) 'append-shared-last) + (assert-equal (append x 2) '(1 . 2) 'append-improper-last)) + +(let* ((x (list 1)) + (y (apply (lambda args args) 0 x))) + (assert-equal y '(0 1) 'apply) + (set-car! x 2) + (assert-equal y '(0 1) 'arg-list-copied)) + +(assert-equal (map - '(1 2 3)) '(-1 -2 -3) 'map) +(assert-equal (map + '(1 2 3) '(4 5 6)) '(5 7 9) 'multi-map) + +;; MOAR TESTS! diff -uNr a/gscm/tests/numbers.scm b/gscm/tests/numbers.scm --- a/gscm/tests/numbers.scm false +++ b/gscm/tests/numbers.scm d6a69bcc57e5e346acec37203bc640f03b0d75e5adb6a621ea75a92732ce03f2d626fb49abc9fb59490ef3d6d8175864e3ff631b94af215c027f66d4873dd816 @@ -0,0 +1,77 @@ +(load "common.scm") + +(assert (negative? -1) 'negative) +(assert (exact? #e10) 'exact) +(assert (inexact? 0.) 'inexact-zero) +(assert (inexact? .0) 'leading-point) +(assert (inexact? 1e2) 'inexact-exp) +(assert (< 0 1e-12 1) 'negative-exp) +(assert (inexact? #i0) 'inexact-prefix) +(assert (= #b11110000 #o360 #d240 240 #xf0) 'radix) +(assert (= #X#E-F #e#x-f -15) 'case-prefix-order) +(assert (inexact? (quotient 2. 1.))' inexact-quotient) +(assert (exact? 10000000000000000000000000000000000000000) 'exact-bignum) +(assert (inexact? #i10000000000000000000000000000000000000000) 'inexact-bignum) +(assert (< (/ (abs (- 10000000000000000000000000000000000000000. 1e40)) 1e40) + (expt 2 -53)) 'inexact-read-precision) +;(assert (exact? 1/2) 'rational) +;(assert (< -2 -4/3 -1) 'neg-rational) +;(assert (complex? +i) 'imaginary) +;(assert (complex? -i) 'neg-imaginary) +;(assert (inexact? 1+2.5i) 'inexact-complex) +;(assert (complex? 1@-1) 'polar-complex) + +(assert-equal (quotient 7 5) 1 'quotient) +(assert-equal (remainder 7 5) 2 'remainder) +(assert-equal (modulo 7 5) 2 'modulo) +(assert-equal (quotient 7 -5) -1 'quotient-rounds-to-zero) +(assert-equal (quotient -7 5) -1) +(assert-equal (remainder 7 -5) 2 'remainder-sign-dividend) +(assert-equal (remainder -7 5) -2) +(assert-equal (modulo 7 -5) -3 'modulo-sign-divisor) +(assert-equal (modulo -7 5) 3) +(assert (positive? (quotient *least-fixnum* -1)) 'quotient-overflow) +(assert-equal (quotient *least-fixnum* 1) *least-fixnum*) + +(let ((big (expt 10 500)) + (epsilon 1E-10)) + (define (~ x y) (< (abs (- x y)) (* (abs x) epsilon))) + (assert-equal (expt -.5 0) 1 'expt1) + (assert-equal (expt -.5 0.) 1. 'expt2) + (assert-equal (expt 0 0) 1 'expt3) + (assert-equal (expt 0. 0) 1 'expt4) + (assert-equal (expt 0 0.) 1. 'expt5) + (assert-equal (expt 0. 0.) 1. 'expt6) + (assert-equal (expt 0 1) 0 'expt7) + (assert-equal (expt 0 1.) 0. 'expt8) + (assert-equal (expt 0. 1) 0. 'expt9) + (assert-equal (expt 0. 1.) 0. 'expt10) + (assert-equal (expt 1 1.) 1 'expt12) + (assert-equal (expt 1 -1) 1 'expt13) + (assert-equal (expt 1 big) 1 'expt14) + (assert-equal (expt -1 big) 1 'expt15) + (assert-equal (expt -1 (+ big 1)) -1 'expt16) + (assert (~ (expt 2 (/ 2)) 1.4142135623731) 'expt17) + (assert (~ (expt (/ (expt 5. (/ -10))) 10.) 5) 'expt18) + (assert (~ (expt -2. 3.) -8.) 'expt19) + (assert (~ (expt 10 100) 1e100) 'expt20) + (assert (~ (atan 1e100) (atan (expt 10 100))) 'atan-bignum) + (assert (eqv? big (floor big)) 'floor-bignum) + (assert (not (= *greatest-fixnum* (exact->inexact *greatest-fixnum*))) + 'mixed-compare) + (assert (< 0 1e500) 'mixed-compare-inf) + (assert-equal 0 (log (exp 0)) 'exp-exact-roundtrip) + (assert-equal 0 (asin (sin 0)) 'sin-exact-roundtrip) + (assert-equal 0 (acos (cos 0)) 'cos-exact-roundtrip) + (assert-equal 0 (atan (tan 0)) 'tan-exact-roundtrip) + (assert (~ 1 (log (exp 1))) 'exp-roundtrip) + (assert (~ 1 (asin (sin 1))) 'sin-roundtrip) + (assert (~ 1 (acos (cos 1))) 'cos-roundtrip) + (assert (~ 1 (atan (tan 1))) 'tan-roundtrip) + (assert-equal 0 (atan 0 1.) 'atan2-exact) + (assert (~ (atan 1 0) (asin 1)) 'atan2) + (assert (~ (atan 0 -1) (acos -1)) 'atan2) + (assert (~ (atan -1 0) (- (asin 1))) 'atan2) + (assert (~ (atan 1 1) (atan 1)) 'atan2) + ;; Unspecified in r5rs; any number fits the equation + (assert-equal 0. (atan 0 0) 'atan2-singularity)) diff -uNr a/gscm/tests/semantic-bad.scm b/gscm/tests/semantic-bad.scm --- a/gscm/tests/semantic-bad.scm false +++ b/gscm/tests/semantic-bad.scm b0a591d90b7f673dedc74c28f776826e88782b91a1f764e990e9dac14b904b8d4271a4dc549f859754046a42890664f5f74598482b89d03ac3a664a9a83cf3a2 @@ -0,0 +1,77 @@ +;; Run by test-errors.scm + +empty-combination () +improper-combination (+ 1 . 1) +not-self-evaluating #(1) +missing-parameters (lambda) +missing-body (lambda a) +improper-combination (lambda a . b) +not-identifier (lambda (()) '()) +not-identifier (lambda (#t) '()) +not-identifier (lambda 0 '()) +duplicate-name (lambda (a a) a) +excess-arguments (quote a b) +missing-predicate (if) +missing-consequent (if #t) +excess-arguments (if #t 1 2 3) +missing-variable (set!) +missing-value (set! x) +not-identifier (set! () 1) +excess-arguments (set! x 1 2) +defs-mixed-with-body (let () (define x 0) (set! x 1) (define x 2) x) +missing-bindings (letrec) +missing-body (letrec ((x 1))) +bindings-not-list (letrec x x) +bindings-not-list (letrec ((x 1) . 0) x) +binding-not-list (letrec (x) x) +binding-missing-name (letrec (()) x) +binding-missing-value (letrec ((x)) x) +improper-binding (letrec ((x . 1)) x) +excess-binding-arguments (letrec ((x 1 2)) x) +duplicate-name (letrec ((x 1) (x 2)) x) +undefined-variable (letrec ((x 1) (y x)) y) +undefined-variable (letrec ((x 1) (y (set! x 2))) y) +undefined-variable (let ((x 1)) (letrec ((x 2) (y x)) y)) +missing-arguments (apply +) +arguments-not-list (apply + 1) +arguments-not-list (apply + 1 '(2 . 3)) +immutable-environment (eval '(define x 1) (scheme-report-environment 5)) +immutable-environment (eval '(define x 1) (null-environment 5)) +immutable-environment (eval '(set! + 1) (scheme-report-environment 5)) +missing-continuation-argument (+ (values)) +excess-continuation-argument (+ (values 1 2)) +uneven-map (map + '(1) '(2 3)) +improper-map (map + '(1 2 . 3) '(1 2)) +uneven-for-each (for-each + '(1) '(2 3)) + +;; Why immutability is important. +mutating-pair-constant +(let ((f (lambda () '(0)))) + (set-car! (f) 1) + (car (f))) + +mutating-string-constant +(let ((f (lambda () "a"))) + (string-set! (f) 0 #\b) + (string-ref (f) 0)) + +mutating-quoted-string-constant +(let ((f (lambda () '"a"))) + (string-set! (f) 0 #\b) + (string-ref (f) 0)) + +mutating-nested-vector-constant +(let ((f (lambda () '(#(0))))) + (vector-set! (car (f)) 0 1) + (vector-ref (car (f)) 0)) + +div-by-zero (quotient 0 0) +div-by-zero (remainder 0 0) +bignum-div-by-zero (remainder (expt 2 500) 0) +unary-add-type (+ 'bad-type) +unary-mul-type (* 'bad-type) +exact-expt-type (expt 'bad-type 0) +exact-expt-type (expt 0 'bad-type) +exact-expt-type (expt 1 'bad-type) +ident-expt-type (expt 'bad-type 1) +non-readable-value (eval (list 'quote +) (interaction-environment)) diff -uNr a/gscm/tests/storage.scm b/gscm/tests/storage.scm --- a/gscm/tests/storage.scm false +++ b/gscm/tests/storage.scm be4fce3c9c8376921229b6448f9a7e3d470b7fd01810970cf5930d74c1bfabe1dac390f45b7d077ddcea039a9014ba87dbe6d75b0913dd84f2b32a06225053d6 @@ -0,0 +1,38 @@ +;; Very basic storage model tests. + +(load "common.scm") + +(letrec ((a 0)) + (letrec ((b (cons a 1))) + (set-car! b 2) + (assert-equal a 0))) + +(letrec ((a 0)) + (define (b) a) + (assert-equal (b) 0) + (set! a 1) + (assert-equal (b) 1)) + +(letrec ((a 0)) + (define b ((lambda (a) (lambda () a)) + 1)) + (assert-equal (b) 1)) + +(assert-equal + (let* ((a 1) (b a)) + (set! a 2) + b) + 1) + +(let ((a 1)) + ((lambda () (set! a 2))) + (assert-equal a 2)) + +(let ((a (string #\a))) + (let ((b a)) + (assert (eq? a b)))) + +(let* ((a (list 1 2)) + (b (apply list a))) + (set-cdr! a '()) + (assert-equal b '(1 2))) diff -uNr a/gscm/tests/test-errors.scm b/gscm/tests/test-errors.scm --- a/gscm/tests/test-errors.scm false +++ b/gscm/tests/test-errors.scm 54bf47996eb3b76a34fa4c55081dde76b3961b1048689d5a906ad2b9037772f576d56e4d6114783ff904fab11c2a99a43bd68f66814ef06d242b317e6977603a @@ -0,0 +1,19 @@ +(define (run filename) + (define port (open-input-file filename)) + (define restart #f) + (define (loop) + (let* ((name (read port)) + (form (read port))) + (if (eof-object? form) (close-input-port port) + (begin (set-error-handler! + (lambda (msg args trace-log) + (write `(pass ,name ,msg . ,args)) + (newline) + (restart '()))) + (write `(fail ,name ,(eval form (interaction-environment)))) + (newline) + (loop))))) + (call-with-current-continuation (lambda (c) (set! restart c))) + (loop)) + +(run "semantic-bad.scm")