Report on Gales Scheme Extensions
=================================

J. Welsh

19 April 2018

This is a supplement to the "Revised(5) Report on the Algorithmic Language Scheme" defining the language extensions in the Gales Scheme system. The intent is to be stringent enough to be useful, yet lenient enough to facilitate adoption by a variety of Scheme implementations on a variety of platforms.

Syntax library
--------------

RECEIVE, per SRFI 8.

Runtime environment
-------------------

*args* list

(set-error-handler! handler) procedure

Registers an error handler procedure. When an error is signalled, either internally or by a call to ERROR, execution exits any active dynamic extents (see DYNAMIC-WIND) and HANDLER is called with a message string as first argument and possibly relevant values as additional arguments. To prevent recursive errors, the error handler is first reset to the system's default, so HANDLER may need to re-register after completion of any error-prone operations. If HANDLER returns, the Scheme process terminates.

It may not be possible to guarantee recovery from an out-of-memory error, as the handler could trigger it again before it is able to do anything useful.

Rationale:
This mechanism is a poor substitute for a typed exception system but enables basic recovery such as restarting a REPL.
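
Example:
The following is a minimal sketch of the REPL restart mentioned in the rationale, not a definitive implementation. SAFE-REPL and its prompt are hypothetical names, and the use of INTERACTION-ENVIRONMENT (optional in R5RS) and of a continuation to leave the handler are assumptions. Because the handler is reset before it is called, and the process terminates if it returns, the sketch escapes through a continuation and registers the handler again on each restart.

```scheme
;; Sketch only: SAFE-REPL is a hypothetical name, not part of this report.
(define (safe-repl)
  (let restart ()
    (call-with-current-continuation
      (lambda (k)
        ;; The handler is reset to the default whenever an error fires,
        ;; so it is registered again on every pass through RESTART.
        (set-error-handler!
          (lambda (message . details)
            (display "error: ")
            (display message)
            (for-each (lambda (d) (display " ") (write d)) details)
            (newline)
            ;; Returning from the handler would terminate the process,
            ;; so escape to the restart point instead.
            (k #f)))
        (let loop ()
          (display "> ")
          (let ((expr (read)))
            (if (eof-object? expr)
                (exit)
                (begin
                  (write (eval expr (interaction-environment)))
                  (newline)
                  (loop)))))))
    (restart)))
```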

(error string detail ...) procedure

Per SRFI 23. Exits any current dynamic extents and does not return. Arguments are passed to the current error handler.

(toplevel) procedure

(call-as-toplevel) procedure

(save-core filename) procedure

(exit) procedure
(exit status) procedure

Exits any current dynamic extents and terminates the Scheme system, returning the given exact integer exit status to the operating system. What exactly is returned is implementation dependent, but a normal or successful status shall be indicated by zero, which is the default if STATUS is omitted.

(gc) procedure

Runs a garbage collection, returning the number of live cells.

Eval
----

(gales-scheme-environment) procedure

Input and output
----------------

(flush-output-port) procedure
(flush-output-port port) procedure
(flush-output-port port option) procedure

Flushes an open output port, scheduling any buffered writes for delivery to their destination as soon as possible. Returns an unspecified value. The PORT argument, if omitted, defaults to the value returned by CURRENT-OUTPUT-PORT. The OPTION argument, if included, is a symbol specifying additional semantics:

sync -- performs a synchronous flush, delivering any buffered data and metadata to the underlying storage device and not returning until complete. If this fails or is not possible, an error is signalled. The intent is that for a file on nonvolatile storage media, all data previously written to the port be safe with respect to power failure or system crash upon successful return. In practice, this requires cooperation from external components such as operating system and storage hardware.
data-sync -- like sync except that updates to metadata not required for correctly reading back the data, such as timestamps, are not required to be completed.

Writes are still to be delivered in a timely manner even without explicit flushing; it is suggested that buffered implementations use a timer on the order of 10 milliseconds. Buffers are always flushed on CLOSE-OUTPUT-PORT and on orderly termination of the Scheme system, which means termination for any reason within the control of the implementation. Asynchronous write errors must not be signalled until the next operation on the affected port, and thus may be lost if the port is not explicitly flushed or closed.

Rationale:
The ideal port abstraction is to deliver individual characters without delay. Unfortunately this approach often has considerable overhead, such as context switching for system calls or per-packet headers and computation on a network. The magic number 10 is intended to provide adequate responsiveness by default for common applications, while this procedure provides the means to make stricter latency or persistence requirements explicit when needed.

(open-subprocess path arg1 ...) procedure

(wait-subprocess) procedure
(wait-subprocess pid) procedure

(open-output-file path if-exists) procedure
(call-with-output-file path proc if-exists) procedure
(with-output-to-file path thunk if-exists) procedure

These extended variants of the corresponding R5RS procedures specify what to do if the named file exists. Symbolic options include:

truncate -- the file is truncated to length 0.
append -- writes are performed at the end of the file.
overwrite -- file contents are overwritten in place starting at offset 0.

Rationale:
A default is not specified, as it wasn't in R5RS and existing implementations vary. An "error" option is not provided as there would generally be a race condition between the existence check and creation. POSIX for example provides the O_EXCL open flag but only requires it be atomic with respect to other calls with the same flag.
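
Example:
As a sketch of how the IF-EXISTS argument and the sync flush option might be combined, the following appends one record to a log and waits for it to reach storage. APPEND-RECORD and the file name "journal.log" are hypothetical; only the extended OPEN-OUTPUT-FILE and FLUSH-OUTPUT-PORT forms described above are assumed.

```scheme
;; Sketch only: APPEND-RECORD and "journal.log" are hypothetical.
(define (append-record record)
  (let ((port (open-output-file "journal.log" 'append)))
    (write record port)
    (newline port)
    ;; Does not return until data and metadata have been delivered to the
    ;; storage device; an error is signalled if that is not possible.
    (flush-output-port port 'sync)
    (close-output-port port)))
```

Where timestamps and similar metadata do not matter, data-sync could be passed in place of sync.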

(read-token) procedure
(read-token port) procedure

READ-TOKEN reads a single Scheme token from PORT, or the current input port. It is a recognizer for the nonterminal <token> from R5RS section 7.1.1. With the exception of number tokens, an error is signalled if a lexically invalid character sequence is encountered in the input, or if an end of file is encountered after the beginning of a token and the token is incomplete. If an end of file is encountered before any characters are found that can begin a token, an end of file object is returned. Otherwise, a pair is returned whose CAR is a symbol indicating the token type and whose CDR contains either a string representation of its value, the value itself, or the empty list, depending on the type, as follows:

(LITERAL . value) -- boolean, character, or string
(IDENTIFIER . string) -- in the implementation's preferred case
(NAMED-CHAR . string) -- in lowercase; omitting the #\; not validated
(NUMBER . string) -- including prefix if given; not validated
(STRING . string)
(OPEN-PAREN)
(CLOSE-PAREN)
(OPEN-VECTOR)
(ABBREV . symbol) -- one of QUOTE, QUASIQUOTE, UNQUOTE, or UNQUOTE-SPLICING
(DOT)

Rationale:
Constructing numeric values at this stage is undesirable as it may require super-linear algorithms; likewise interning identifiers as symbols. Syntactic validation could still be done on numbers, but would be redundant with STRING->NUMBER.
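
Example:
Since number tokens are returned unvalidated, a consumer might defer validation to STRING->NUMBER as the rationale suggests. The sketch below assumes the token pair shape listed above; NUMBER-TOKEN-VALUE is a hypothetical helper, and the type symbol is written in lowercase on the assumption that the reader folds case or prefers lowercase.

```scheme
;; Sketch only: NUMBER-TOKEN-VALUE is a hypothetical helper.
;; TOKEN is a pair shaped like those returned by READ-TOKEN,
;; for example (number . "#x1F").
(define (number-token-value token)
  (if (and (pair? token)
           (eq? (car token) 'number)
           (string? (cdr token)))
      ;; STRING->NUMBER returns #f when the text is not a valid number.
      (string->number (cdr token))
      #f))
```

The tokens here can be built by hand for experimentation, so the helper does not itself depend on READ-TOKEN.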

Sockets
-------

(define (socket-address socket) (socket 'address))
(define (close-socket socket) (socket 'close))

Stream sockets:
* input-port
* output-port
* shutdown-read
* shutdown-write

Listeners:
* accept

Datagram sockets:
* send
* receive

Data types used within this section are defined as follows:

Octet -- exact integer from 0 to 255, inclusive
IP address -- vector of four octets
IP6 address -- vector of 16 octets
Socket address -- list

(ip-address string) library procedure
(lookup-host string) library procedure
(lookup-service string) library procedure
(lookup-addresses node service)

(open-tcp-connection address) procedure
(open-tcp-connection address bind-address) procedure
(open-unix-connection path) procedure

ADDRESS and BIND-ADDRESS are 2-lists (HOST SERVICE), where HOST is one of:
- an IPv4 address as a 4-vector of octets -- exact integers in the interval [0,256)
- an IPv4 address as a string in dotted decimal notation
- a host name string, to be resolved by a fresh lookup in the system hosts database or DNS

and SERVICE is either:
- a port number -- exact integer in the interval [0,65536)
- a service name string, to be resolved by a fresh lookup in the system services database
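
Example:
The accepted shapes of ADDRESS can be expressed as a portable predicate. This is only a sketch of the structural rules above: all names are hypothetical, and strings are accepted without checking that they are well-formed dotted decimal or resolvable names.

```scheme
;; Sketch only: every name defined here is hypothetical.
(define (octet? x)
  (and (integer? x) (exact? x) (<= 0 x 255)))

(define (ipv4-vector? x)
  (and (vector? x)
       (= (vector-length x) 4)
       (let loop ((i 0))
         (or (= i 4)
             (and (octet? (vector-ref x i))
                  (loop (+ i 1)))))))

;; A host is an octet 4-vector or a string (dotted decimal or host name).
(define (host? x)
  (or (ipv4-vector? x) (string? x)))

;; A service is a port number in [0,65536) or a service name string.
(define (service? x)
  (or (string? x)
      (and (integer? x) (exact? x) (<= 0 x 65535))))

(define (tcp-address? x)
  (and (list? x)
       (= (length x) 2)
       (host? (car x))
       (service? (cadr x))))
```

Under these rules (list (vector 127 0 0 1) 80) and (list "localhost" "http") are both acceptable, while (list (vector 127 0 0 256) 80) is not.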

(listen-tcp bind-address backlog) procedure
(listen-unix path backlog) procedure

(listening-socket? obj) procedure

This type predicate returns #t if the object is a listening socket (whether or not it has been closed) and #f otherwise. Listening sockets form a subtype of PORT?, disjoint from INPUT-PORT? and OUTPUT-PORT?.

(close-listening-socket socket) procedure

Closes the listening socket SOCKET. Has no effect if it has already been closed. Returns an unspecified value. There may be an underlying socket object in the network stack that remains in a listening state if it remains open in external processes.

(accept-connection socket) procedure

Rationale:
Stream sockets should be distinct from input- or output-only ports (if those exist),

One-way shutdown of stream sockets should be possible as it is essential for some application protocols.
On the other hand, it is a common pattern to combine sockets with OS-level threads or process forking, which is important in particular for utilizing multiple processors on a shared-memory system. If shutdown weren't distinguished from close, a socket being closed either explicitly or through garbage collection in one process would block its use

Numbers
-------

[TODO]
(quotient/remainder n1 n2) procedure

This procedure returns two values equivalent to (QUOTIENT N1 N2) and (REMAINDER N1 N2), but may be more efficient than calling the two separately.

Commentary:
In a survey of existing variants of this interface, I found this one from Racket to best fit the spirit of R5RS. MIT Scheme has INTEGER-DIVIDE which returns "an object with two components" requiring dedicated selector procedures. Guile has TRUNCATE/, but this also handles non-integer reals and, as the name suggests, is part of a larger family implementing several division conventions. R6RS abandons both REMAINDER and MODULO in favor of a never-negative MOD.
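
Example:
The specified behaviour of QUOTIENT/REMAINDER can be modelled in portable R5RS as follows. This is a reference sketch only, under the hypothetical name MODEL-QUOTIENT/REMAINDER to avoid clashing with a built-in definition; a real implementation would presumably compute both results in one division.

```scheme
;; Reference model only: MODEL-QUOTIENT/REMAINDER is a hypothetical name.
(define (model-quotient/remainder n1 n2)
  (values (quotient n1 n2) (remainder n1 n2)))

;; Both results are received with CALL-WITH-VALUES.
(define (quotient-and-remainder-list n1 n2)
  (call-with-values
    (lambda () (model-quotient/remainder n1 n2))
    list))

(quotient-and-remainder-list -7 2)   ; => (-3 -1)
(quotient-and-remainder-list 7 -2)   ; => (-3 1)
```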

Fixnum operations
-----------------

Fixnums are a subtype of exact integers having a fixed precision defined by the execution environment. They behave as specified for integers under the generic arithmetic operators, for example being promoted to bignum or inexact on overflow, but can also be used with a dedicated set of operators for modular and bitwise arithmetic. These are intended as efficient low-level building blocks for applications such as hash functions and multiple precision arithmetic. All operators must behave as if fixnums are represented as two's complement binary integers. Unless otherwise noted, any timing and energy consumption invariance with respect to operand values provided by the underlying machine must be preserved.

The argument naming conventions of R5RS section 1.3.3 are extended such that f, f1, ... imply a fixnum type restriction.

*fixnum-width* fixnum

This constant indicates the implementation's fixnum precision in bits, which must be at least 16. Note that it may be less than the underlying machine word size due to tag bits.

*greatest-fixnum* fixnum
*least-fixnum* fixnum

These constants are equal to (- (expt 2 (- *fixnum-width* 1)) 1) and (- (expt 2 (- *fixnum-width* 1))) respectively, and define the fixnum range as a closed interval.
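
Example:
For the minimum permitted width of 16 bits, the range works out as below. The names prefixed EXAMPLE- are hypothetical stand-ins so that the sketch runs in any R5RS system; a Gales program would read *FIXNUM-WIDTH* and the two range constants directly.

```scheme
;; Hypothetical width; a real program would use *fixnum-width* instead.
(define example-width 16)

(define example-greatest (- (expt 2 (- example-width 1)) 1))   ; 32767
(define example-least (- (expt 2 (- example-width 1))))        ; -32768

;; Portable stand-in for a fixnum test under that width:
;; an exact integer inside the closed interval.
(define (example-fixnum? obj)
  (and (integer? obj)
       (exact? obj)
       (<= example-least obj example-greatest)))
```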

(fixnum? obj) procedure

This type predicate returns #t if the object is a fixnum and #f otherwise. All exact integers within the fixnum range are fixnums.

(fx= f1 f2) procedure
(fx< f1 f2) procedure
(fx<= f1 f2) procedure

These are equivalent to the generic =, <, and <= procedures, except that they are only defined for two fixnum arguments, which may allow more efficient implementation.

(fx</unsigned f1 f2) procedure
(fx<=/unsigned f1 f2) procedure

(fx+/wrap f1 ...) procedure

(fx+/carry f1 f2) procedure
(fx+/carry f1 f2 f3) procedure

(fx+/carry-unsigned f1 f2) procedure
(fx+/carry-unsigned f1 f2 f3) procedure

(fx-/wrap f1) procedure
(fx-/wrap f1 f2 ...) procedure

(fx-/borrow-unsigned f1 f2) procedure
(fx-/borrow-unsigned f1 f2 f3) procedure

=> (values d0 d1) where (- f1 f2 f3) = d = (- d0 (* d1 (expt 2 *fixnum-width*)))
d0 = (modulo d (expt 2 *fixnum-width*))
d1 = (quotient (- d0 d) (expt 2 *fixnum-width*))

Arguments and results are interpreted as unsigned.
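
Example:
The identity d = d0 - d1 * 2^width can be checked with a portable model over generic integers. This is a sketch under assumptions: MODEL-SUB/BORROW and SUB/BORROW-LIST are hypothetical names, the width is passed explicitly rather than read from *FIXNUM-WIDTH*, and operands are given as non-negative integers to stand for their unsigned interpretation. The borrow is derived from d0 so that the identity also holds when d is negative, where QUOTIENT alone would truncate toward zero.

```scheme
;; Reference model only: names and the explicit WIDTH are hypothetical.
(define (model-sub/borrow width f1 f2 f3)
  (let* ((m (expt 2 width))
         (d (- f1 f2 f3))
         (d0 (modulo d m))              ; low word, always in [0, m)
         (d1 (quotient (- d0 d) m)))    ; borrow; (- d0 d) is a multiple of m
    (values d0 d1)))

(define (sub/borrow-list width f1 f2 f3)
  (call-with-values
    (lambda () (model-sub/borrow width f1 f2 f3))
    list))

(sub/borrow-list 8 7 5 0)     ; => (2 0)
(sub/borrow-list 8 5 7 0)     ; => (254 1)
(sub/borrow-list 8 0 255 1)   ; => (0 1)
```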

(fx*/wrap f1 ...) procedure

(fx*/carry f1 f2) procedure
The three-argument form is reserved for R6RS-style semantics.

(fx*/carry-unsigned f1 f2) procedure
The three-argument form is reserved for R6RS-style semantics.

(fxnot f) procedure

(fxand f1 ...) procedure
(fxior f1 ...) procedure
(fxxor f1 ...) procedure

(fxif f1 f2 f3) procedure

Bitwise if, also known as mux, choose, or merge. Equivalent to:
(fxior (fxand f1 f2) (fxand (fxnot f1) f3))

(fxmaj f1 f2 f3) procedure

Bitwise majority. This is the carry-out of a full adder, or the borrow-out of a full subtractor when the minuend input is inverted. Equivalent to:
(fxior (fxand f1 f2) (fxand f1 f3) (fxand f2 f3))
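
Example:
The per-bit behaviour of FXIF and FXMAJ can be illustrated with a portable model that walks the bit positions of non-negative integers. This is a sketch only: BITWISE3, MODEL-FXIF and MODEL-FXMAJ are hypothetical names, the width is passed explicitly, and a real implementation would use machine bitwise instructions rather than a loop.

```scheme
;; Applies a function of three bits (each 0 or 1) at every bit position
;; below WIDTH. Names are hypothetical; operands must be non-negative.
(define (bitwise3 bit-fn width a b c)
  (let loop ((i 0) (a a) (b b) (c c) (weight 1) (acc 0))
    (if (= i width)
        acc
        (loop (+ i 1)
              (quotient a 2) (quotient b 2) (quotient c 2)
              (* weight 2)
              (+ acc (* weight (bit-fn (remainder a 2)
                                       (remainder b 2)
                                       (remainder c 2))))))))

;; Where a bit of the first operand is 1 take the second operand's bit,
;; otherwise take the third operand's bit.
(define (model-fxif width f1 f2 f3)
  (bitwise3 (lambda (x y z) (if (= x 1) y z)) width f1 f2 f3))

;; A result bit is 1 when at least two of the three input bits are 1.
(define (model-fxmaj width f1 f2 f3)
  (bitwise3 (lambda (x y z) (if (>= (+ x y z) 2) 1 0)) width f1 f2 f3))

(model-fxif 4 12 10 5)    ; => 9,  binary 1001
(model-fxmaj 4 12 10 5)   ; => 12, binary 1100
```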

(fxshift f bits) procedure
(fxshift/unsigned f bits) procedure

(fxlength/unsigned f) procedure

This procedure returns the bit length of a fixnum; that is, the one-based index of its most significant 1-bit, or zero if all bits are zero. This is mathematically equivalent to (CEILING (LG (+ F 1))) where LG is the base-2 logarithm, but computed exactly.
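
Example:
An exact bit length can be modelled in portable R5RS by repeated halving, with no logarithm involved. MODEL-LENGTH/UNSIGNED is a hypothetical name, and the argument is taken as a non-negative integer standing for the unsigned interpretation of the fixnum; this is a sketch, not the intended implementation technique.

```scheme
;; Reference model only: MODEL-LENGTH/UNSIGNED is a hypothetical name.
;; Counts how many halvings are needed to reach zero.
(define (model-length/unsigned n)
  (let loop ((n n) (len 0))
    (if (= n 0)
        len
        (loop (quotient n 2) (+ len 1)))))

(model-length/unsigned 0)     ; => 0
(model-length/unsigned 1)     ; => 1
(model-length/unsigned 255)   ; => 8
(model-length/unsigned 256)   ; => 9
```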

(integer->fixnum n) (unsure) procedure