gscm/doc/extensions.txt

1	Report on Gales Scheme Extensions
2	=================================
3
4	J. Welsh
5
6	19 April 2018
7
8	This is a supplement to the "Revised(5) Report on the Algorithmic Language Scheme" defining the language extensions in the Gales Scheme system. The intent is to be stringent enough to be useful, yet lenient enough to facilitate adoption by a variety of Scheme implementations on a variety of platforms.
9
10	Syntax library
11	--------------
12
13	RECEIVE, per SRFI 8.
14
15	Runtime environment
16	-------------------
17
18	args list
19
20	(set-error-handler! handler) procedure
21
22	Registers an error handler procedure. When an error is signalled, either internally or by a call to ERROR, execution exits any active dynamic extents (see DYNAMIC-WIND) and HANDLER is called with a message string as first argument and possibly relevant values as additional arguments. To prevent recursive errors, the error handler is first reset to the system's default, so HANDLER may need to re-register after completion of any error-prone operations. If HANDLER returns, the Scheme process terminates.
23
24	It may not be possible to guarantee recovery from an out-of-memory error, as the handler could trigger it again before it is able to do anything useful.
25
26	Rationale:
27	This mechanism is a poor substitute for a typed exception system but enables basic recovery such as restarting a REPL.
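
Example (non-normative): a minimal sketch of the REPL restart mentioned above. It assumes that GALES-SCHEME-ENVIRONMENT, which this report lists but does not yet describe, returns an environment acceptable to EVAL; the name SIMPLE-REPL is hypothetical. Because the process terminates if the handler returns, the handler escapes through a continuation captured at the top of the loop. Because the handler is reset before it is called, re-entering through that continuation registers it again.

```scheme
(define (simple-repl)
  (let ((restart #f))
    ;; Capture the point to resume from after an error.
    (call-with-current-continuation
      (lambda (k) (set! restart k)))
    ;; Reached on first entry and again after every error, so the
    ;; handler is registered afresh each time it has been reset.
    (set-error-handler!
      (lambda (message . details)
        (display "error: ")
        (display message)
        (for-each (lambda (d) (display " ") (write d)) details)
        (newline)
        (restart #f)))
    (let loop ()
      (display "> ")
      (flush-output-port)
      (let ((form (read)))
        (if (not (eof-object? form))
            (begin
              ;; Assumption: this environment is suitable for EVAL.
              (write (eval form (gales-scheme-environment)))
              (newline)
              (loop)))))))
```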
28
29	(error string detail ...) procedure
30
31	Per SRFI 23. Exits any current dynamic extents and does not return. Arguments are passed to the current error handler.
32
33	(toplevel) procedure
34
35	(call-as-toplevel) procedure
36
37	(save-core filename) procedure
38
39	(exit) procedure
40	(exit status) procedure
41
42	Exits any current dynamic extents and terminates the Scheme system, returning the given exact integer exit status to the operating system. What exactly is returned is implementation dependent, but a normal or successful status shall be indicated by zero, which is the default if STATUS is omitted.
43
44	(gc) procedure
45
46	Runs a garbage collection, returning the number of live cells.
47
48	Eval
49	----
50
51	(gales-scheme-environment) procedure
52
53	Input and output
54	----------------
55
56	(flush-output-port) procedure
57	(flush-output-port port) procedure
58	(flush-output-port port option) procedure
59
60	Flushes an open output port, scheduling any buffered writes for delivery to their destination as soon as possible. Returns an unspecified value. The port argument, if omitted, defaults to the value returned by current-output-port. The option argument, if included, is a symbol specifying additional semantics:
61
62	sync -- performs a synchronous flush, delivering any buffered data and metadata to the underlying storage device and not returning until complete. If this fails or is not possible, an error is signalled. The intent is that for a file on nonvolatile storage media, all data previously written to the port be safe with respect to power failure or system crash upon successful return. In practice, this requires cooperation from external components such as operating system and storage hardware.
63	data-sync -- like sync except that updates to metadata not required for correctly reading back the data, such as timestamps, are not required to be completed.
64
65	Writes are still to be delivered in a timely manner even without explicit flushing; it is suggested that buffered implementations use a flush timer on the order of 10 milliseconds. Buffers are always flushed on CLOSE-OUTPUT-PORT and on orderly termination of the Scheme system, that is, termination for any reason within the control of the implementation. Asynchronous write errors must not be signalled until the next operation on the affected port, and thus may be lost if the port is not explicitly flushed or closed.
66
67	Rationale:
68	The ideal port abstraction is to deliver individual characters without delay. Unfortunately this approach often has considerable overhead, such as context switching for system calls or per-packet headers and computation on a network. The magic number 10 is intended to provide adequate responsiveness by default for common applications, while this procedure provides the means to make stricter latency or persistence requirements explicit when needed.
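
Example (non-normative): a sketch of the two needs this procedure is meant to cover, latency and persistence. The procedure names PROMPT and WRITE-DURABLY are hypothetical, and the single-argument R5RS form of OPEN-OUTPUT-FILE is used, so what happens to an existing file is implementation dependent.

```scheme
;; Latency: make a prompt visible before blocking on input, rather
;; than waiting for the implementation's flush timer.
(define (prompt text)
  (display text)
  (flush-output-port)            ; defaults to (current-output-port)
  (read))

;; Persistence: do not report success until the datum has reached
;; the storage device. An error is signalled if the sync fails.
(define (write-durably datum path)
  (let ((port (open-output-file path)))
    (write datum port)
    (newline port)
    (flush-output-port port 'sync)
    (close-output-port port)))
```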
69
70	(open-subprocess path arg1 ...) procedure
71
72	(wait-subprocess) procedure
73	(wait-subprocess pid) procedure
74
75	(open-output-file path if-exists) procedure
76	(call-with-output-file path proc if-exists) procedure
77	(with-output-to-file path thunk if-exists) procedure
78
79	These extended variants of the corresponding R5RS procedures specify what to do if the named file exists. Symbolic options include:
80
81	truncate -- the file is truncated to length 0.
82	append -- writes are performed at the end of the file.
83	overwrite -- file contents are overwritten in place starting at offset 0.
84
85	Rationale:
86	A default is not specified, as it wasn't in R5RS and existing implementations vary. An "error" option is not provided as there would generally be a race condition between the existence check and creation. POSIX for example provides the O_EXCL open flag but only requires it be atomic with respect to other calls with the same flag.
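
Example (non-normative): a brief sketch of the extended variants in use. The file names are hypothetical.

```scheme
;; Start a fresh report, discarding any previous contents.
(call-with-output-file "report.txt"
  (lambda (port)
    (display "run started" port)
    (newline port))
  'truncate)

;; Add a line to the end of a log, preserving earlier entries.
(let ((port (open-output-file "events.log" 'append)))
  (write '(event started) port)
  (newline port)
  (close-output-port port))
```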
87
88	(read-token) procedure
89	(read-token port) procedure
90
91	READ-TOKEN reads a single Scheme token from PORT, or the current input port. It is a recognizer for the nonterminal <token> from R5RS section 7.1.1. With the exception of number tokens, an error is signalled if a lexically invalid character sequence is encountered in the input, or if an end of file is encountered after the beginning of a token and the token is incomplete. If an end of file is encountered before any characters are found that can begin a token, an end of file object is returned. Otherwise, a pair is returned whose CAR is a symbol indicating the token type and whose CDR contains either a string representation of its value, the value itself, or the empty list, depending on the type, as follows:
92
93	(LITERAL . value) -- boolean, character, or string
94	(IDENTIFIER . string) -- in the implementation's preferred case
95	(NAMED-CHAR . string) -- in lowercase; omitting the #\; not validated
96	(NUMBER . string) -- including prefix if given; not validated
97	(STRING . string)
98	(OPEN-PAREN)
99	(CLOSE-PAREN)
100	(OPEN-VECTOR)
101	(ABBREV . symbol) -- one of QUOTE, QUASIQUOTE, UNQUOTE, or UNQUOTE-SPLICING
102	(DOT)
103
104	Rationale:
105	Constructing numeric values at this stage is undesirable as it may require super-linear algorithms; likewise interning identifiers as symbols. Syntactic validation could still be done on numbers, but would be redundant with STRING->NUMBER.
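
Example (non-normative): a sketch of a consumer that needs only token types, here a check that parentheses are balanced. The name BALANCED-TOKENS? is hypothetical, and the type symbols are assumed to be written in lowercase. The token source is passed in as a procedure so that the logic can be exercised without a port.

```scheme
;; NEXT-TOKEN is a procedure of no arguments returning either a
;; token pair as described above or an end of file object.
;; Returns #t if every opener is matched by a closer, #f otherwise.
(define (balanced-tokens? next-token)
  (let loop ((depth 0))
    (let ((token (next-token)))
      (cond ((eof-object? token) (= depth 0))
            ((memq (car token) '(open-paren open-vector))
             (loop (+ depth 1)))
            ((eq? (car token) 'close-paren)
             (and (> depth 0) (loop (- depth 1))))
            (else (loop depth))))))

;; Intended use with the procedure specified above:
;;   (balanced-tokens? (lambda () (read-token port)))
```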
106
107	Sockets
108	-------
109
110	(define (socket-address socket) (socket 'address))
111	(define (close-socket socket) (socket 'close))
112
113	Stream sockets:
114	* input-port
115	* output-port
116	* shutdown-read
117	* shutdown-write
118
119	Listeners:
120	* accept
121
122	Datagram sockets:
123	* send
124	* receive
125
126	Data types used within this section are defined as follows:
127
128	Octet -- exact integer from 0 to 255, inclusive
129	IP address -- vector of four octets
130	IP6 address -- vector of 16 octets
131	Socket address -- list
132
133	(ip-address string) library procedure
134	(lookup-host string) library procedure
135	(lookup-service string) library procedure
136	(lookup-addresses node service)
137
138	(open-tcp-connection address) procedure
139	(open-tcp-connection address bind-address) procedure
140	(open-unix-connection path) procedure
141
142	ADDRESS and BIND-ADDRESS are 2-lists (HOST SERVICE), where HOST is either:
143	- an IPv4 address as 4-vector of octets -- exact integers in the interval [0,256)
144	- an IPv4 address as string in dotted decimal notation
145	- a host name string, to be resolved by a fresh lookup in the system hosts database or DNS
146
147	and SERVICE is either:
148	- a port number -- exact integer in the interval [0,65536)
149	- a service name string, to be resolved by a fresh lookup in the system services database
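
Example (non-normative): a sketch of a shape check for the address forms listed above. All procedure names here are hypothetical. Strings are accepted without inspection, since whether one is dotted decimal, a host name, or a service name is only decided at lookup time.

```scheme
(define (octet? x)
  (and (integer? x) (exact? x) (<= 0 x 255)))

(define (ipv4-vector? x)
  (and (vector? x)
       (= (vector-length x) 4)
       (let loop ((i 0))
         (or (= i 4)
             (and (octet? (vector-ref x i))
                  (loop (+ i 1)))))))

(define (service? x)
  (or (string? x)
      (and (integer? x) (exact? x) (<= 0 x 65535))))

;; #t if X is a 2-list (HOST SERVICE) of the permitted forms.
(define (address-shape-ok? x)
  (and (list? x)
       (= (length x) 2)
       (or (string? (car x)) (ipv4-vector? (car x)))
       (service? (cadr x))))

;; Addresses of the accepted shapes:
;;   (list (vector 127 0 0 1) 8080)
;;   (list "127.0.0.1" "http")
;;   (list "localhost" 80)
```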
150
151	(listen-tcp bind-address backlog) procedure
152	(listen-unix path backlog) procedure
153
154	(listening-socket? obj) procedure
155
156	This type predicate returns #t if the object is a listening socket (whether or not it has been closed) and #f otherwise. Listening sockets form a subtype of PORT?, disjoint from INPUT-PORT? and OUTPUT-PORT?.
157
158	(close-listening-socket socket) procedure
159
160	Closes the listening socket SOCKET. Has no effect if it has already been closed. Returns an unspecified value. There may be an underlying socket object in the network stack that remains in a listening state if it remains open in external processes.
161
162	(accept-connection socket) procedure
163
164	Rationale:
165	Stream sockets should be distinct from input- or output-only ports (if those exist).
166
167	One-way shutdown of stream sockets should be possible as it is essential for some application protocols.
168	On the other hand, it is a common pattern to combine sockets with OS-level threads or process forking, which is important in particular for utilizing multiple processors on a shared-memory system. If shutdown weren't distinguished from close, a socket being closed either explicitly or through garbage collection in one process would block its use in the others.
169
170	Numbers
171	-------
172
173	[TODO]
174	(quotient/remainder n1 n2) procedure
175
176	This procedure returns two values equivalent to (QUOTIENT N1 N2) and (REMAINDER N1 N2), but may be more efficient than calling the two separately.
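
Example (non-normative): a reference definition in terms of the two R5RS procedures, showing the intended results but none of the possible efficiency gain. The name QUOTIENT/REMAINDER-REFERENCE is hypothetical, chosen to avoid shadowing the real procedure.

```scheme
(define (quotient/remainder-reference n1 n2)
  (values (quotient n1 n2) (remainder n1 n2)))

;; Receiving both results with the R5RS primitive:
(call-with-values
  (lambda () (quotient/remainder-reference -7 2))
  (lambda (q r) (list q r)))     ; => (-3 -1)
```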
177
178	Commentary:
179	In a survey of existing variants of this interface, I found this one from Racket to best fit the spirit of R5RS. MIT Scheme has INTEGER-DIVIDE which returns "an object with two components" requiring dedicated selector procedures. Guile has TRUNCATE/, but this also handles non-integer reals and, as the name suggests, is part of a larger family implementing several division conventions. R6RS abandons both REMAINDER and MODULO in favor of a never-negative MOD.
180
181	Fixnum operations
182	-----------------
183
184	Fixnums are a subtype of exact integers having a fixed precision defined by the execution environment. They behave as specified for integers under the generic arithmetic operators, for example being promoted to bignum or inexact on overflow, but can also be used with a dedicated set of operators for modular and bitwise arithmetic. These are intended as efficient low-level building blocks for applications such as hash functions and multiple precision arithmetic. All operators must behave as if fixnums are represented as two's complement binary integers. Unless otherwise noted, any timing and energy consumption invariance with respect to operand values provided by the underlying machine must be preserved.
185
186	The argument naming conventions of R5RS section 1.3.3 are extended such that f, f1, ... imply a fixnum type restriction.
187
188	fixnum-width fixnum
189
190	This constant indicates the implementation's fixnum precision in bits, which must be at least 16. Note that it may be less than the underlying machine word size due to tag bits.
191
192	greatest-fixnum fixnum
193	least-fixnum fixnum
194
195	These constants are equal to (- (expt 2 (- fixnum-width 1)) 1) and (- (expt 2 (- fixnum-width 1))) respectively, and define the fixnum range as a closed interval.
196
197	(fixnum? obj) procedure
198
199	This type predicate returns #t if the object is a fixnum and #f otherwise. All exact integers within the fixnum range are fixnums.
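
Example (non-normative): a sketch of the fixnum range computed for a given width, taking the width as a parameter so that it can be evaluated on any implementation. All procedure names are hypothetical. For the minimum permitted width of 16 the range is -32768 to 32767.

```scheme
(define (greatest-for-width width)
  (- (expt 2 (- width 1)) 1))

(define (least-for-width width)
  (- (expt 2 (- width 1))))

;; #t if N is an exact integer inside the closed fixnum interval
;; for the given width.
(define (in-fixnum-range? n width)
  (and (integer? n)
       (exact? n)
       (<= (least-for-width width) n (greatest-for-width width))))

(greatest-for-width 16)          ; => 32767
(least-for-width 16)             ; => -32768
```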
200
201	(fx= f1 f2) procedure
202	(fx< f1 f2) procedure
203	(fx<= f1 f2) procedure
204
205	These are equivalent to the generic =, <, and <= procedures, except that they are only defined for two fixnum arguments, which may allow more efficient implementation.
206
207	(fx</unsigned f1 f2) procedure
208	(fx<=/unsigned f1 f2) procedure
209
210	(fx+/wrap f1 ...) procedure
211
212	(fx+/carry f1 f2) procedure
213	(fx+/carry f1 f2 f3) procedure
214
215	(fx+/carry-unsigned f1 f2) procedure
216	(fx+/carry-unsigned f1 f2 f3) procedure
217
218	(fx-/wrap f1) procedure
219	(fx-/wrap f1 f2 ...) procedure
220
221	(fx-/borrow-unsigned f1 f2) procedure
222	(fx-/borrow-unsigned f1 f2 f3) procedure
223
224	=> (values d0 d1) where (- f1 f2 f3) = d = (- d0 (* d1 (expt 2 fixnum-width)))
225	d1 = (- (floor (/ d (expt 2 fixnum-width))))
226	d0 = (modulo d (expt 2 fixnum-width))
227
228	Args and results interpreted as unsigned.
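
Example (non-normative): a reference model in generic arithmetic, with the width passed as a parameter. The name SUB/BORROW-UNSIGNED-MODEL is hypothetical. D1 is derived directly from the defining identity; QUOTIENT alone would not do, since it truncates toward zero and D is negative exactly when a borrow occurs.

```scheme
(define (sub/borrow-unsigned-model width f1 f2 f3)
  (let* ((base (expt 2 width))
         (d (- f1 f2 f3))
         (d0 (modulo d base))             ; low word, in [0, base)
         (d1 (quotient (- d0 d) base)))   ; exact, since base divides d0 - d
    (values d0 d1)))

;; With a width of 16:
;;   (sub/borrow-unsigned-model 16 5 3 0)  returns 2 and 0
;;   (sub/borrow-unsigned-model 16 0 1 0)  returns 65535 and 1
```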
229
230	(fx*/wrap f1 ...) procedure
231
232	(fx*/carry f1 f2) procedure
233	3-arg form reserved for R6RS style
234
235	(fx*/carry-unsigned f1 f2) procedure
236	3-arg form reserved for R6RS style
237
238	(fxnot f) procedure
239
240	(fxand f1 ...) procedure
241	(fxior f1 ...) procedure
242	(fxxor f1 ...) procedure
243
244	(fxif f1 f2 f3) procedure
245
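246	Bitwise if, also known as mux, choose, or merge: each result bit is taken from F2 where the corresponding bit of F1 is 1 and from F3 where it is 0. Equivalent to: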
247	(fxior (fxand f1 f2) (fxand (fxnot f1) f3))
248
249	(fxmaj f1 f2 f3) procedure
250
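251	Bitwise majority: each result bit is 1 where at least two of the three corresponding argument bits are 1. This is the carry-out of a full adder and, with the minuend input inverted, the borrow-out of a full subtractor. Equivalent to: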
252	(fxior (fxand f1 f2) (fxand f1 f3) (fxand f2 f3))
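
Example (non-normative): the two equivalences given for FXIF and FXMAJ can be checked with a small model built from generic arithmetic. All names below are hypothetical, and the model covers nonnegative arguments only, so it says nothing about the sign bit.

```scheme
;; Combine two nonnegative integers bit by bit using OP, a
;; procedure on single bits (0 or 1) for which (OP 0 0) is 0.
(define (bitwise op a b)
  (if (and (= a 0) (= b 0))
      0
      (+ (op (remainder a 2) (remainder b 2))
         (* 2 (bitwise op (quotient a 2) (quotient b 2))))))

(define (model-and a b) (bitwise * a b))
(define (model-ior a b) (bitwise max a b))

;; Bits of B at the positions where MASK has a 0 bit.
(define (model-and-not mask b)
  (bitwise (lambda (m x) (* (- 1 m) x)) mask b))

(define (model-if f1 f2 f3)
  (model-ior (model-and f1 f2) (model-and-not f1 f3)))

(define (model-maj f1 f2 f3)
  (model-ior (model-and f1 f2)
             (model-ior (model-and f1 f3) (model-and f2 f3))))

(model-if  #b1100 #b1010 #b0101)   ; => 9, that is #b1001
(model-maj #b1100 #b1010 #b0110)   ; => 14, that is #b1110
```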
253
254	(fxshift f bits) procedure
255	(fxshift/unsigned f bits) procedure
256
257	(fxlength/unsigned f) procedure
258
259	This procedure returns the bit length of a fixnum; that is, the one-based index of its most significant 1-bit, or zero if all bits are zero. This is mathematically equivalent to (CEILING (LG (+ F 1))) where LG is the base-2 logarithm, but computed exactly.
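
Example (non-normative): a reference model that counts bits by repeated halving, for nonnegative arguments. The name BIT-LENGTH-MODEL is hypothetical. An implementation would normally use a machine instruction or table lookup instead.

```scheme
(define (bit-length-model n)
  (let loop ((n n) (len 0))
    (if (= n 0)
        len
        (loop (quotient n 2) (+ len 1)))))

(bit-length-model 0)             ; => 0
(bit-length-model 1)             ; => 1
(bit-length-model 255)           ; => 8
(bit-length-model 256)           ; => 9
```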
260
261	(integer->fixnum n) (unsure) procedure