For fun this summer, I implemented part of the PostScript language, using PyCairo for rendering. I call it Stilted. Implementing a language is an interesting exercise. You always learn some things along the way.
Executable bit: All objects in PostScript have a literal/executable bit that can be changed with the cvx (convert to executable) and cvlit (convert to literal) operators. Literal arrays are delimited by square brackets, executable arrays (procedures) are in curly braces. Like in Python and JavaScript, multiple references share storage. But oddly, in PostScript, you can duplicate an object on the stack, and change its executable bit, and now you have two references to the same storage, but with different attributes.
Here’s an example using Ghostscript (a third-party conforming implementation):
GS> [1 2 3] dup % make an array and duplicate it
GS<2> cvx % make the top one executable
GS<2> pstack % print the stack
{1 2 3}
[1 2 3]
GS<2> dup 1 99 put % change the second element
GS<2> pstack % both objects share the storage
{1 99 3}
[1 99 3]
GS<2>
The executable attribute is part of the reference, not part of the object!? This doesn’t seem like a planned and desired outcome. It looks more like a side effect of a common C technique: using the low bits of a pointer to store flags.
While writing Stilted, I didn’t notice this behavior until I had already made executability part of the object itself, so Stilted produces a different (wrong) result:
|-0> [1 2 3] dup % make an array and duplicate it
|-2> cvx % make the top one executable
|-2> pstack % oops: both are changed!
{1 2 3}
{1 2 3}
|-2> dup 1 99 put
|-2> pstack
{1 99 3}
{1 99 3}
|-2>
Since I don’t think anyone actually depends on having two objects that share storage, but with different executability, I didn’t bother changing it. An advantage of pure-fun side projects: you can do whatever you want!
BTW: the numbers in the prompts are the current depth of the operand stack.
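To make the reference-versus-object distinction concrete, here is a minimal Python sketch of the two designs. The class names Array and Ref and the show helper are made up for illustration; this is not Stilted's actual code, and real interpreters store more than one attribute per reference.

```python
from dataclasses import dataclass


@dataclass
class Array:
    """Design A (hypothetical): the executable flag lives on the object."""
    items: list
    executable: bool = False


@dataclass
class Ref:
    """Design B (hypothetical): the flag lives on the reference; items are shared."""
    items: list
    executable: bool = False


def show(x):
    # Executable arrays print with braces, literal arrays with brackets.
    body = " ".join(str(i) for i in x.items)
    return "{" + body + "}" if x.executable else "[" + body + "]"


# Design A: dup pushes the same object twice, so cvx changes both entries.
a = Array([1, 2, 3])
stack_a = [a, a]                      # dup
stack_a[-1].executable = True         # cvx
stack_a[-1].items[1] = 99             # put
obj_result = [show(x) for x in stack_a]

# Design B: dup copies the reference (flag included) but not the items.
r = Ref([1, 2, 3])
stack_b = [r, Ref(r.items, r.executable)]    # dup
stack_b[-1] = Ref(stack_b[-1].items, True)   # cvx makes a new reference
stack_b[-1].items[1] = 99                    # put
ref_result = [show(x) for x in stack_b]

# Lists are bottom-of-stack first, the reverse of pstack's order.
print(obj_result)
print(ref_result)
```

In design B the two stack entries print differently but still see the same 99, which is the Ghostscript behavior shown above. Design A gives the Stilted result.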
Cutesy string syntax: PostScript strings are made with parentheses, and they nest, so this is one string:
GS> (Hello (there) 1 2 3) pstack
(Hello \(there\) 1 2 3)
GS<1>
Stilted doesn’t nest the parens in strings, because it uses regexes for lexing tokens, and nesting is hard with regexes. This is a syntax error in Stilted:
|-0> (Hello (there) 1 2 3) pstack
Error: syntaxerror in 3
Operand stack (4):
3
2
1
(Hello \(there)
|-4>
Also, who depends on nested parens in strings? Just escape the closing parens in your strings.
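For comparison, nesting is not hard once you step outside regexes: a hand-written scanner only needs a depth counter. This is a rough sketch, not what Stilted does. The read_string function is a made-up name, and escape handling is simplified: the character after a backslash is kept as-is, so sequences like \n are not translated.

```python
def read_string(text, start=0):
    """Scan a PostScript-style string literal starting at text[start] == '('.

    Returns (contents, index just past the closing paren).
    Hypothetical helper for illustration only.
    """
    if text[start] != "(":
        raise ValueError("expected '('")
    depth = 0
    out = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Simplified escape: keep the next character literally.
            out.append(text[i + 1])
            i += 2
            continue
        if ch == "(":
            depth += 1
            if depth > 1:          # inner parens are part of the string
                out.append(ch)
        elif ch == ")":
            depth -= 1
            if depth == 0:         # matched the opening paren: done
                return "".join(out), i + 1
            out.append(ch)
        else:
            out.append(ch)
        i += 1
    raise SyntaxError("unterminated string")


source = "(Hello (there) 1 2 3) pstack"
contents, end = read_string(source)
print(contents)       # Hello (there) 1 2 3
print(source[end:])   # the rest of the input, still to be lexed
```

A regex-based lexer could hand off to a function like this whenever it sees an opening paren, then resume matching tokens at the returned index.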
Flexible scope: PostScript is a stack-oriented language. There’s an operand stack that operators pop from and push to, and also a dictionary stack where names are defined and looked up. The dictionary stack is explicitly manipulated with the begin and end operators. Instead of procedures starting new scopes implicitly, the programmer decides when to begin and end scopes. This means they don’t have to correspond to procedure invocations at all.
We’re so used to scoping being tied to function calls in our programming languages, it was strange to realize that the two concepts can be completely unrelated.
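Here is a small Python sketch of how a dictionary stack decouples scope from calls. The DictStack class and its method names are invented for illustration and are not taken from Stilted. It assumes def writes into the topmost dictionary and lookup searches from the top down, which is how PostScript describes it.

```python
class DictStack:
    """Hypothetical model of PostScript's dictionary stack."""

    def __init__(self):
        # One permanent bottom dictionary, standing in for the built-in ones.
        self.dicts = [{}]

    def begin(self, d):
        # Open a scope: push a dictionary.
        self.dicts.append(d)

    def end(self):
        # Close a scope: pop the top dictionary, but never the permanent one.
        if len(self.dicts) == 1:
            raise RuntimeError("dictstackunderflow")
        return self.dicts.pop()

    def define(self, name, value):
        # Like def: store into the topmost dictionary.
        self.dicts[-1][name] = value

    def lookup(self, name):
        # Search from the top of the stack down to the bottom.
        for d in reversed(self.dicts):
            if name in d:
                return d[name]
        raise KeyError(name)


ds = DictStack()
ds.define("x", 1)
ds.begin({})              # a new scope, with no procedure call involved
ds.define("x", 2)
inner = ds.lookup("x")    # finds the inner definition
ds.end()
outer = ds.lookup("x")    # the outer definition is visible again
print(inner, outer)
```

Nothing here knows about procedures. A procedure can call begin and leave the scope open for its caller, or never touch the dictionary stack at all.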
Surprising gaps: Re-acquainting myself with PostScript, I was surprised at what it didn’t have: no way to sort arrays, no string formatting, and so on. PostScript pre-dated languages like Python, JavaScript, and even Perl. Its model is much more like C than the higher-level languages that we’re used to now. Though C has string formatting, and you’d think that would be a useful thing in a printing programming language.
More: If you aren’t familiar with PostScript, I’ve got more description of its unusual control structure approach, and also other blog posts tagged #postscript.
Stilted has been a lot of fun. Extra fun: I used the Obfuscated PostScript winners as test cases!