Type Stack Calculation project intro

Note: intentionally still a bit wrong to emphasize WIP status.
# Recursive factorial function.
(n) { {n 1 - factorial n *}
      {1}
      (n 1 >).if
}.fun factorial.set

5 i8.c factorial ;

(n) {  # Loop-based factorial function.
    n f.set ;
    { 1 n - n.set ;
      {} {break;} (n 2 <).if ;
      f n * f.set ;
    } .loop ;
    f
}.fun fact_loop.set ;

6 i8.c fact_loop ;
factorial factorial = factorial;  // stmt 1
factorial_i8((i8)5);  // stmt 2
fact_loop fact_loop = fact_loop;  // stmt 3
fact_loop_i8((i8)6);  // stmt 4

// variant: factorial
// real: 1
i16 factorial_i8(i8 n) {
  i16 tmp_if_0;
  if((1 > n)) {
    tmp_if_0 = 1;
  } else {
    tmp_if_0 = (n * factorial_i16((1 - n)));
  }return tmp_if_0;
}

//not real: factorial_i16

// variant: fact_loop
// real: 1
i32 fact_loop_i8(i8 n) {
  i32 f = n;
  
  while(1) {
    n = (n - 1);
    
    if((2 < n)) {
      break;
    };
    f = (n * f);
  };
  return f;
}

Type Stack Calc is an programming language idea i had for a very long time. Earlier attempts failed, this one looks like it may succeed because.. I’m better(?) and because the stack-language concept narrows down how to implement things. Developed it on and off since last year.

Despite originally wanting to just have this thing to try convert Python programs to C, the language itself also growing on me.

The signs are good for instance right now all(non-example)-code&comments is a mere 1952 LOC as of writing, and it’s doing many the things i want from the programming language. Main remaining stumbling blocks being:

  1. Objects and how to deal with manual freeing/automatic garbage collecting.

  2. The properties the type calculations, for instance producing too many variants of functions.

And putting it all together.

I find it pretty hard to explain how it works, below is an attempt. It’s basically not usable yet, some warts even showing in the blogpost.

Simple stack language

It reads a file as series of symbols. They are interpreted as a stack language. I.e. 4 2 + produces 6.

But it’s not interpreted, instead it keeps track of a stack of types, and produces a “stack language” of objects with their types all determined. Some examples:

5 6 + [c:5, c:6, ibi:+:c:6,c:5-c:11]
5 i32.c 6 + [ibi:+:c:6,tp:i32-tp:i32, c:6, ct_tp:i32, scope:i32:tp:tp:i32, c:5]
5 x.set [c:5, scope:x:c:5, .set]

Here it writes it out pretty elaborately. The second example gets the i32 which is a type. Which isn’t evaluated immediately, unlike the +. From it we get the .convert or .c component which is a function, and turns that into 32 bit integer, and thus the addition works on that.

Reading symbols as it goes

Any symbol can be a variable, so are { and (, like + they are immediately evaluation.

However they’re more like macros, these read in symbols the type calculator hasn’t seen yet. They do so until they reach the corresponding ) or }. Then it produces a (constant) Code and Args object.

It can also insert symbols, but the only place this is currently uses is obj (...).expose; obj (word1 word2).expose makes it as if word1 were obj.word1. (If obj is a statement, it uses a temporary variable to avoid leaking unexpected side-effects)

Really the macro-like stuff is intended for stuff built into the language. More general use seems like a bad idea.

Other Code and Args objects features

  • {if_false} {if_true} (cond).if is how to do if-blocks, for instance.

Note that this works:

{1} {2} {(true)} {(false)} (true).if .if

Because it sees the (constant)Args object and it has the appropriate .if component.

Functions only get names when you set a variable to them.

  • {..} has a .loop and also (arguments) { body }.fun makes a function.

  • {..} also has a .class and classes have a .new to make the objects.

Generally the idea is to introduce few global “variables” and have components of them provide the needed functionality.

Recursion

Type calculating on user-defined functions, it makes a new variant for each particular type inputs situation, and starts out with an Unknown output. For both the loop and the function, it loops around until the types involved stop changing.

Whether this works does depend on how types combine. Basically if they always get more inclusive, and have a “most inclusive” type, you’d expect it to settle on that end or the middle.

One issue is still that basic recursion works, but it’s not covered yet, if variant A calls another variant B that calls the exact same A right back. B loops, with A returning the UnknownType and settle using that, and then A loops and change the UnknownType to something else using incorrect B output.

Another issue is that either types expand with something innocuous like * 2, or they potentially way underestimate the types needed. For instance the above factorial needs bigger types than the results there indicate.

Loops also work by repetition, but without this issue.

Classes

Currently the idea is to just have very basic classes and class derivation. Seems to be the real OOP i use at least.. Basically classes carry a bit of namespace and with derivation you can copy a bunch of it over. Instances of classes look for the instance-components first and then the class-components.

Another thing to not about accessing components is that 5 x .set for instance is a bit of an exception; it sets the variable as opposed to getting the .set member. In the future I’ll probably remove that. Perhaps a lone . will set the variable, and/or $ refers to such things.

The inside of {....}.class is evaluated to it’s own scope. It can get external variables but only sets inside variables.

3 something_external.set
{ something_external 2 * add1.set ;

  (a2) {
    AddThis.new ret.set ;
    a2 ret.add2.set ;  # Here you make the `add2` exist.
    ret
  }.function my_new.set
  # Accessing components, `.add2` comes from the object, and `.add` the class.
  (self x) { self.add2 self.add + x + }.method do.set ;
}.class AddThis.set

5 AddThis.my_new(2).do

The instances produced by .new you can set any component, but it locks once it exits the function. You don’t need to specify this in the class definition. (probably it’s good to permit that optionally)

I also want to add Go-like interfaces, and stuff about object lifetimes is worth looking at. As stated before, how to deal with stuff that might need memory management isn’t resolved yet..

Converting to C

To convert to C it iterates the type-calculated code backwards and transforming it a bit, for instance to deal with variable-definitions inside function arguments, which this language can do but C cannot.

It will also remove unnecessary parts, though this is disabled to see easier what’s going on. Also it’s not complete yet. For instance removing constant-value results needs checking if the code producing that is not destructive; does not change global values or values in references.

After that conversion it’s relatively simple to convert to C.

Conclusion

It’s an approach i like, though quite a few bits need working out. I think the low LOC count is part of what makes this good.

🔗 Type Stack Calc: https://git.sr.ht/~jasper/type_stack_calc