Talk:Carriage

Concatenativeness

The code interpretation and the data interpretation are both monoid homomorphisms of the program text, and they can be combined into a single interpretation that also maps the program to a monoid -- something like this (in Haskell):

-- code interpretation (a stack transformer) paired with data interpretation (a list of symbols)
data Thingy x = Thingy ([x] -> [x]) [x]
instance Semigroup (Thingy x) where
    Thingy fa sa <> Thingy fb sb = Thingy (fa . fb) (sa ++ sb)
instance Monoid (Thingy x) where
    mempty = Thingy id []
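
Concretely, any per-symbol assignment of Thingy values extends to whole programs via foldMap, and the extension is a monoid homomorphism by construction. A minimal sketch -- the per-symbol meanings below are invented placeholders, not Carriage's actual semantics:

-- 'sym' is a made-up per-symbol interpretation, for illustration only
sym :: Char -> Thingy Char
sym ' ' = Thingy id []        -- whitespace: identity function, no symbol recorded
sym c   = Thingy (c:) [c]     -- placeholder: push the symbol; record the symbol

interpret :: String -> Thingy Char
interpret = foldMap sym       -- interpret (a ++ b) == interpret a <> interpret b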

So Carriage really is concatenative in a pretty strict sense. But the idea just before the one that made it onto the wiki was for the data interpretation to actually be a list of all the subfunctions that all the possible substrings of the program map to. (Then stack elements wouldn't need to include instruction symbols, you'd just pick the right subfunction and apply it. The initial stack would have n(n+1)/2 function elements for a program of length n.) But I was not able to write that as an associative operation -- the order in which the subfunctions are accumulated into this list is sensitive to the order in which the subprograms are concatenated. I don't know if it's possible to have such an operation be associative. But if it was, it would be a neat(er) variation, I think. Chris Pressey (talk) 13:12, 16 November 2012 (UTC)
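
For the curious, the order-sensitivity is easy to reproduce. Here is a sketch using the substrings themselves as stand-ins for the subfunctions they would map to; the two accumulated lists below contain the same elements but in different orders, so the operation fails to be associative:

-- all nonempty substrings of a program, in a fixed scan order
subs :: String -> [String]
subs p = [ take l (drop i p) | i <- [0 .. length p - 1]
                             , l <- [1 .. length p - i] ]

-- a primitive program paired with its substring list
one :: String -> (String, [String])
one p = (p, subs p)

-- concatenate programs, appending the new boundary-spanning substrings
combine :: (String, [String]) -> (String, [String]) -> (String, [String])
combine (a, sa) (b, sb) =
    (a ++ b, sa ++ sb ++ [ s | s <- subs (a ++ b), s `notElem` (sa ++ sb) ])

-- snd ((one "a" `combine` one "b") `combine` one "c")
--   == ["a","b","ab","c","abc","bc"]
-- snd (one "a" `combine` (one "b" `combine` one "c"))
--   == ["a","b","c","bc","ab","abc"]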

A bunch of loosely-connected observations:
All concatenative languages seem to have some kind of quoting. But if you accept that, then their program texts aren't monoids (at least, not free monoids.) They're algebraic structures with a binary operation (concatenation) and a unary operation (quoting.) And these structures don't really seem to resemble any well-known structures in algebra. (Do they? I guess you could argue instead that "monoid with quotation" should just be a more well-known algebraic structure.) It would be nice if, for example, quoting distributed over concatenation (i.e. [ab] = [a][b]) but that's obviously not the case in most concatenative languages.
Carriage attempts to eliminate quoting, and thus be a free monoid -- which in some sense it does, but in another sense, all it's doing is pushing quoting from being a "syntactic thing" to being a "semantic thing". (And not really a deeply semantic thing either, because program symbols are still involved.)
I'm not entirely happy with the status of whitespace. It's like there's another monoid homomorphism that maps programs with whitespace to programs without. I'm not entirely happy with the empty string being the identity element (aren't there an infinite number of empty strings between any two non-empty strings? ok, fine!) I'm not entirely happy that to explicitly give an identity function without using the empty string you need at least two symbols. What happens if you say the empty string is not a valid Carriage program, and Carriage programs are semigroups instead of monoids? Can you say something like, ∀ A, B with A ≠ B, AB ≠ A and AB ≠ B? (Seems easy for the program text, but hard for the functions -- you'd need to make the functions pretty contrived.) What happens if you assign types to the functions? (It's obviously no longer a free monoid)
Just things to idly ponder. Chris Pressey (talk) 18:17, 29 November 2012 (UTC)
About quoting: I can't speak for "all concatenative languages", but doesn't this fact arise from trying to have languages both be concatenative and have structured control flow? For instance, imagine brainfuck with a goto instruction "jump by the content of the current cell" instead of its while loop. Well, I'm not sure it would really be concatenative - I'm actually not sure what concatenative exactly means - because with a program AB you could have a goto from B that jumps to an instruction in A. But at least the only operation would be concatenation.
About whitespace: as far as I'm concerned, a programming language is some kind of "class of representations of algorithms", stuck somewhere between "abstract" and "usable with an actual computer". When you're writing an implementation, you're closer to the latter; when you're thinking about the language itself, you're closer to the former. To me,
"Whitespace has no meaning; the code interpretation of whitespace is the identity function, and the data interpretation does not introduce any instruction symbol onto the stack. However, attempting to take the code interpretation of any other symbol which is not defined by Carriage will cause the program to explode."
borders on overspecification of the language and should be disregarded when thinking about the language itself, because you don't actually need to acknowledge the existence of any symbol besides 1~\$#+-@!.
About the empty string: I guess that depends whether you're talking about syntax or semantics. Would you agree that 4 * 5 = 4 * 1 * ... * 1 * 5? In that case, AB is equivalent to Aε...εB. I guess that depends whether you're talking about abstract syntax (4*5 and 4*1*...*1*5 are two distinct expressions) or about semantics (20 is always equal to 20, no matter how many times you performed the difficult calculation of multiplying by 1). The problem then is that the syntax is ambiguous, because it doesn't allow for a distinction between AB, AεB, AεεB, etc., if your representation of "the empty string" is actually an empty string, and not ε or something. But is that really an issue? --Koen (talk) 19:27, 29 November 2012 (UTC)
All of what you say makes sense.
You can check http://concatenative.org/ for definition(s) of a concatenative language. The basic idea is that concatenation of program texts means composition of functions. (And "means" has a well-defined meaning. Strings under concatenation and functions under composition are both monoids, and the meaning of a program text can be described as a monoid homomorphism from strings to functions. The mathematics are not terribly important; for my purposes, they mostly serve as a constraint to design a language around.)
Making the program "structured" is indeed what seems to require deviating from the letter of the rule; function composition is fine for "straight-line code", but it can't do things like express higher-order functions, or even "delimited" code (the two different code paths of an "if", for example.) "goto" would probably be seen as far too operational to be captured by functions; but if you could get around that somehow and package it up neatly, sure, machine languages would be pretty close to concatenative.
The whitespace issue is interesting. The main reason to reject some symbols or parts of syntax is to reserve them for future expansion, basically saying "don't use these yet, they might have a different meaning in a future version!" And that was what I was trying to do there. It's probably overkill; on the other hand, I've seen with e.g. Befunge-93 that if you just say "every other symbol is ignored", then people can (and do) write all kinds of polyglots and stuff in it, and then you're stuck in terms of trying to make the next version of the language because you can't define any new symbols. In some sense, saying "every other symbol is ignored" is the overspecification, because it gives all those symbols a meaning (nop, basically), instead of saying they're undefined and shouldn't be used.
But what bugs me about whitespace is that I'd like for it just not to be a symbol, but it persists in being one. To be very literal about it, are "hi there" and "hit here" the same string? No, but they are the same Carriage program (or would be if the other symbols were defined) and thus the same function.
But you're right, it would be possible to just treat these as implementation issues.
I guess I started thinking about it because of how whitespace interacts with the empty program. In a lot of esolangs, there is no such thing as an "empty program", there is just an unbounded playfield of default symbols (usually the space character). The default symbol usually has a meaning like "nop", or something close to it, so you can think of the playfield as effectively empty, whether it's actually of size 0, or if it's infinite and filled with default symbols. (Like how a Turing machine's tape is usually described, actually.)
And then it sort of occurred to me that in a concatenative setting you don't have this option, you have to deal with 0-length strings and the 0-length string actually has to map to the identity function, or chaos results. (Not that chaos resulting is necessarily a bad thing, but I want to explore other things first) It's just that it's a little easier to treat spaces and empty program texts as implementation issues when you're not overly concerned with the properties of the function that maps them to executing programs, I guess.
And then I started looking at semigroups, which are monoids without an identity element, trying to find some resolution to it, and I'm still kind of at that point.
Apologies if it seems like I'm making a big deal out of nothing. I'm not truly unhappy with any of these things I've described. I just... this is one way I try to find new esolangs to write. Chris Pressey (talk) 11:30, 30 November 2012 (UTC)
I guess I'm late, but don't you mean there are n(n-1)/2 + 1 subprograms in a program of length n? Also, I don't get why the order of accumulation of subfunctions should depend on the order of program concatenation. --Pro465 (talk) 12:34, 29 October 2023 (UTC)

Semigroups instead of monoids in concatenativeness

A pretty trivial observation, but I thought I should write it down somewhere. If you disallow the empty string from the set of valid program texts, concatenation makes it strictly a semigroup, as there's no identity element. You could still have a homomorphism from this semigroup to the semigroup of functions under composition, but the latter semigroup isn't strict -- it's still a monoid -- if there is some function in the set, not necessarily primitive, that is the identity function (like the code interpretation of 1$ in Carriage.) To make the semigroup strict, you'd need to disallow identity functions. A cheap and unsatisfying way to do this is to add an integer counter to the state, and have every primitive function increment it (reminiscent of Cabra). There may be much more interesting ways to prohibit the construction of an identity function (I can only hope). A language which did this, even in an uninteresting way, would be in some sense dual to Burro. In Burro, every program text has a unique annihilator (= another program text that when concatenated with the first results in the identity function). In most programming languages, there exists some program which has an annihilator (not expected to be unique). In some programming languages, every program has an annihilator but it's not unique -- I'm thinking, like BASIC's CLR. In this semigroup language, no program would have an annihilator. Chris Pressey (talk) 14:55, 30 November 2012 (UTC) (revised Chris Pressey (talk) 17:12, 30 November 2012 (UTC))

(Correction: in Burro, the annihilator program text isn't unique, but the function it represents is. Chris Pressey (talk) 11:26, 2 December 2012 (UTC))
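
A sketch of the counter trick, with invented primitives (not from any particular language): every primitive increments the counter, so any composite advances it by at least one and can never be the identity function on states:

-- states carry a step counter that every primitive increments, so no
-- composition of primitives can be the identity (the counter only grows)
type State = (Int, [Int])            -- (step counter, stack)
newtype Op = Op (State -> State)

instance Semigroup Op where
    Op f <> Op g = Op (g . f)        -- associative, but no identity element exists

push :: Int -> Op
push n = Op (\(c, s) -> (c + 1, n : s))

add :: Op
add = Op (\(c, s) -> case s of
    (a : b : t) -> (c + 1, a + b : t)
    _           -> (c + 1, s))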

Assuming one wants to have program texts that are strings (seems eminently reasonable...) then there's only so far afield you can go with algebraic structures for them. If you drop associativity, you're saying the string abc built up left-to-right, (ab)c, is actually a different string from abc built up right-to-left, a(bc), which is kind of bizarre. If you add commutativity, you're saying abc and cab are the same string -- also bizarre. If you add idempotency, you're saying argh and aaaaaarrrrrgh are the same string; again, bizarre. And so forth. (Though you could add other operators; for example, a unary swap-case operator isn't so bizarre, at least not over strictly alphabetic strings.) You could say program texts are syntactic structures which are not strings, and that'd be totally fine -- they could be pretty much any structure you like. But that doesn't escape the fact that, on a computer, a sequential text file, like the kind you'd keep program source in, is a string. Chris Pressey (talk) 17:22, 30 November 2012 (UTC)

Representation erasure

I've been staring this "pure concatenative tarpit" design space straight in the face for the past few days (weeks?) and it's kind of left a sour taste in my mouth. You can probably pick up an outline of my goals from what I've said above.

Just to add one more thing: one "nice" thing about the idea that the (semantic) program is a homomorphism of the (syntactic) program text is that it should permit "representation erasure" somewhat akin to "type erasure": once you've taken the image of the program text, you shouldn't need it anymore. You ought to be able to proceed from that point without using, or even caring about, symbols. Carriage fails pretty badly at this. Joy succeeds but only if quoted programs are mapped to functions statically, so that [a b] means a function that pushes a function on the stack. You could compose this pushed function with another function on the stack at runtime, but you couldn't, say, take it apart and see that it was made up of a and b. And I would like to emphasize that this interpretation is not a monoid homomorphism; it's a homomorphism, but that thing with the quote operator is not a monoid. (Of course, any language that wants homoiconicity might not want representation erasure, but that's another matter.)
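
To illustrate, a sketch with an invented value type: once a quotation has been mapped statically to a pushed function, you can still compose it with another function at runtime, but no operation can recover the symbols it was built from:

-- stack values are data or opaque functions; compiling a quotation to an
-- F value erases its source symbols
data Value = I Int | F ([Value] -> [Value])

-- the static mapping of a quotation: push the already-compiled function
quote :: ([Value] -> [Value]) -> ([Value] -> [Value])
quote f stack = F f : stack

-- composing two pushed functions at runtime is fine (f was pushed first,
-- so it runs first)...
compose :: [Value] -> [Value]
compose (F g : F f : rest) = F (g . f) : rest
compose stack              = stack

-- ...but there is no way to pattern-match inside an F and see that it
-- "was made up of a and b"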

I've come up with a few uninteresting examples. 1) express a single "pass" of a Tag system as a composition of functions to construct strings and to compare them against/truncate/extend a target string, then take the program as the iteration-to-fixed-point of that composed function; you get a tag system, yay. 2) provide functions to write on a tape/move a tape head, and functions to push transition rules on a stack, and a function that interprets the transition rules as a Turing machine; you get a (sequence of) Turing machine(s), yay.
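
Example (1) in sketch form (rule format assumed; the pass here is written as one plain function rather than literally as a composition of string operations, but the iterate-to-fixed-point shape is the same):

-- one pass of a 2-tag system: drop two symbols and append the production
-- for the first; a string with no applicable rule is a fixed point
pass :: [(Char, String)] -> String -> String
pass rules (c : _ : rest)
    | Just prod <- lookup c rules = rest ++ prod
pass _ s = s

-- the "program" is the iteration of that pass to a fixed point
run :: [(Char, String)] -> String -> String
run rules s
    | s' == s   = s
    | otherwise = run rules s'
  where s' = pass rules s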

I believe "concatenative with representation erasure" are among the design goals of ais523's unpublished Underlambda. (Sort of in the same family as Underload, which fails this because any program text can be printed.) --Ørjan (talk) 22:56, 2 December 2012 (UTC)
I'm not convinced I'm trying to do the same thing there as cpressey's doing here, though, because there's no big philosophical hole behind Underlambda. I'm not sure what cpressey is trying to accomplish, though.
One thing I should mention is that ehird and I already determined that there's no need to have literals in the syntax of a concatenative language. (Imagine Underload with (:), (*), (~), (!), (^), (a), (S) as the only literals that exist. It's still TC, because you can construct any literal you like at runtime via repeated applications of * and a. Now you can remove all the commands but ^ and it's still TC, which gives you a neat little concatenative tarpit with the same number of commands as brainfuck.) I'm not sure if that's the sort of thing that cpressey's looking for. Eliminate (S) and replace it with some other I/O (let's assume for the sake of argument that we have (1) and (0) that push functions onto the stack that output single bits), and representation isn't involved in this anywhere at all any more. --ais523 08:55, 6 December 2012 (UTC)
Don't repeated applications of * and a restrict you to "literals" of the form u, (u), ((u)), (u)((u)), etc. (basically, the Dyck language of balanced parens), where u was whatever was on the top of the stack when you started? (And doesn't Underload start with an empty stack?) Putting that aside, I think I see where you're going with this, maybe, and yes, Underload basically already accomplishes what I'm looking for here; it sort of falls into the class of tag/TM models I mention elsewhere on this page, but quining-execution saves it from being uninteresting; "stack of strings with concatenation" and "runs by concatenation" can be replaced with "stack of functions with composition" and "runs by composition". (However, Underload's also not what I'm looking for, in a trivial sense, since it's already been done.) Chris Pressey (talk) 14:11, 7 December 2012 (UTC)
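
For concreteness, a toy model of just those two operations in the style of the Haskell sketches above, with Underload quotations represented as strings (not a full interpreter):

-- Underload's a (enclose top of stack in parens) and * (concatenate the
-- top two elements), over a stack of quotation strings
step :: Char -> [String] -> [String]
step 'a' (x : s)     = ("(" ++ x ++ ")") : s
step '*' (x : y : s) = (y ++ x) : s
step _   s           = s

-- e.g. step '*' (step 'a' ["u", "u"]) == ["u(u)"]
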
I'm not sure this will help anyone understand what I'm looking for, and it's not the whole story, but I can put it this way. I'm looking for a way to represent subroutines (the bread-and-butter of old-school concatenative programming) in a concatenative language in a (mostly) static form which is somewhat more algebraically satisfying than Joy's quoting operator. I've concluded that I can't improve on that much. Chris Pressey (talk) 14:39, 7 December 2012 (UTC)
By the way, I haven't been using the word "literal" in any of my formulating about this, so there's a good chance there's a gap between my understanding and ais523's somewhere. In a program text, there are necessarily symbols, and in a functional language there are necessarily functions; I don't immediately see a good distinction between what is a "literal" and what isn't. Chris Pressey (talk) 14:56, 7 December 2012 (UTC)
OK ais523, I now see what you must mean, there. I found your use of "literal" somewhat confusing, because you first mention eliminating literals, then you introduce a bunch of literals. Also, my imagining of Underlambda, when Oerjan mentioned it, didn't look much like that -- but I didn't set about thinking far down those lines, because I knew someone else was. (Not that what you described is Underlambda.) Chris Pressey (talk) 13:50, 8 December 2012 (UTC)
Now that I follow the construction in your Underload example, I guess I can try to say why I'm not thrilled with it; if you have two symbols for every operation, one being "do this operation" and the other being "push a function on the stack which does this operation" -- then that doesn't seem like much of an improvement over quoting. You could go further and remove all but a handful of the "do this operation" symbols, but you would still need something to apply the functions that were pushed on the stack.
While I'm here I might as well mention another unorthogonality that would be nice to rectify that's been on my mind. Concatenation takes pairs of strings to strings, composition takes pairs of functions to functions, but the actual operations of the language take program states to program states. The symmetry would be much nicer if the operations took pairs of program states to program states, and it would be really beautiful if all of these functions were associative, say, or participated in homomorphisms. This is of course a hell of a lot to ask. Chris Pressey (talk) 20:49, 8 December 2012 (UTC)
As ais523 mentioned, you could remove all the "do this operation" symbols except ^, since appending that to any of the "push a function" symbols gives the corresponding "do this operation". Hm and then you don't really need the ()'s around any of the pushers except the (^) one... oh wait maybe you need () itself, to get the empty program. That gives 9 primitive operations in total. --Ørjan (talk) 03:04, 9 December 2012 (UTC)
You can create the empty quotation using (*)(a)^(!)(*)^. OK, it doesn't give an empty quotation itself (it gives ((*)!)), but the result is indistinguishable from an empty quotation. --ais523 04:47, 9 December 2012 (UTC)
What I'm curious about is whether this is the approach you take in Underlambda, or if you use some other approach there. Chris Pressey (talk) 11:14, 9 December 2012 (UTC)
Underlambda has quotation literals, just like Underload does. However, there's no way to distinguish between literals that describe the same function, and implementations are totally free to optimize, say, (~~~) into just (~) as a result. (They aren't required to, because comparing functions is famously difficult.) --ais523 21:57, 9 December 2012 (UTC)
I see. Thanks. Chris Pressey (talk) 12:57, 10 December 2012 (UTC)

Possibly the most promising avenue I've considered thus far is that, surely composition isn't the only associative operation on functions? More generally, surely the monoid of functions under composition isn't the only monoid that could capture the semantics of a program? Instead of functions, you could have something representable, but if the semantic monoid comes down to being just another syntactic monoid, you have in essence just a simple compiler. In the other extreme, a monoid of morphisms under composition for some categorical object is just waaaay too general. Maybe it's a nice statement to make after the fact, "oh look this turns out to be a category isn't that interesting", but it seems to help not at all for deciding how it should work and how to define it. Chris Pressey (talk) 11:24, 2 December 2012 (UTC)

Found an interesting observation: a great many associative operations can be expressed with function composition. Not a showstopper, but quite sobering. Chris Pressey (talk) 11:44, 2 December 2012 (UTC)

A conjecture

A conjecture: any binary operation ★ on functions f and g which "lifts" one of f or g (or both) so that it can be returned as a value from the resulting function f★g cannot be both associative and non-trivial.

This is very similar to the thing at the top of the page: the bit about how it seems impossible to accumulate intermediate functions and retain associativity, because the order they are accumulated depends on the order the composition is applied.

I don't think I'm quite equipped to write a proof of this claim just yet, though; just investigating it made my head spin (one definition of ★ that I tried was

f ★ g = λx, y → g(f ★ g, f(x, y))

which was pretty fun.) Chris Pressey (talk) 16:49, 2 December 2012 (UTC)

Assuming ★ gives functions of the same kind as its result, and that there is a (meta-)function E which extracts the first function (i.e. E(x★y) = x), a proof that ★ is trivial is pretty simple:
f★g = E((f★g)★h)    (defining property of E)
    = E(f★(g★h))    (associativity of ★)
    = f             (defining property of E again)
so f★g = f for every g,
and similarly for extracting the second function instead. --Ørjan (talk) 23:06, 2 December 2012 (UTC)
That's a nice little proof! Had to stare at it for a minute, but I got it.
I think that marks an end to this little adventure. I don't think I should be surprised; monoids are a remarkably faithful stand-in for one-dimensionality, in some sense, and I've known you can't write interesting programs in one dimension since I was a teenager. Chris Pressey (talk) 11:06, 3 December 2012 (UTC)
If you think so, I just have to point you at Barrington's theorem. (TLDR: Some monoids (indeed, groups) do allow you to encode more computation than you'd think. However they cannot be commutative ones.) --Ørjan (talk) 22:12, 3 December 2012 (UTC)
While that is a beautiful result (I like that it doesn't actually tell you very much about the properties of those groups that would work for the "pipeline"), and I should probably clarify that I was using the word "known" in a somewhat hyperbolic way (I don't really know anything, really,) I do think there is a relevant distinction here between "writing a program" and "building a circuit". Circuits are generally treated non-uniformly (at least in complexity theory); the circuit you build to solve a problem of size i needn't look anything like the circuit you build to solve the same problem of size i+1. That's an acceptable situation for complexity theorists, but it's not a great situation for programmers, who would, I daresay, prefer to write a single program of a single size to solve all instances of a problem (and when that's not possible, at least be able to extend existing programs to solve more instances in predictable ways, like using bigger bounded integers for their variables.) Chris Pressey (talk) 13:43, 4 December 2012 (UTC)

Turing-completeness

I will likely need to add a multiply or similar instruction in order for it to be Turing-complete. Hard to compute k for the conditional use of slice otherwise. Chris Pressey (talk) 14:31, 16 November 2012 (UTC)

Interpreter

An OCaml interpreter is on the way. I'm almost done and I'll finish when I have the time. Thanks for making your languages at the same time very interesting and very easy to implement! --Koen (talk) 16:53, 16 November 2012 (UTC)

You're welcome, I suppose! I think it might be that, the easier it is to implement, the faster I can release it, so you tend to see the easier ones first. I occasionally release a language that's just a spec (e.g. Oozlybub and Murphy) but I strongly prefer to have something machine-runnable. Chris Pressey (talk) 10:33, 18 November 2012 (UTC)

Truth machine

I think

111-@1\11-~!$$11+1+1+1+\1+1+1+1+1+1+@11-~!$$1-

acts as a truth machine (more compact truth machines are probably possible). May it be added to the main page?

             we start with zero or one on the stack
             this number will be called "truth argument"
111-         push one and zero
@            slice (identity)
*** start of function one ***
1\11-~!      behavior depends on stack top:
             identity = pushes one and swaps
             itself = pushes one and applies itself
             itself w/o trailing apply = pushes one and self and self
*** end of function one ***
$$           pop one and identity
11+1+1+1+    push five (start of function one)
\            swap five and the truth argument
1+1+1+1+1+1+ add six to the truth argument
@            slice function one (w/o apply if truth argument is zero)
11-~         duplicate slice
!            call function one (w or w/o apply)
$$           pop self and self
1-           return zero

--(this comment by OberoN at 17:57, 27 March 2014 UTC; please sign your comments with ~~~~)

Sure. --Ørjan (talk) 06:15, 28 March 2014 (EDT)

Function Slicing

Currently, the spec for the slice operator states that if any of the stack elements are not instruction symbols, then the program explodes. However, it returns a "function." Would its return value count as an instruction symbol for further applications of slice? LegionMammal978 (talk) 12:23, 20 February 2016 (UTC)