tjammer

The missing piece for schmu

2025-08-22

For the last couple of years I’ve been working on a programming language called schmu [1]. This is not an introductory post about schmu (that, I have yet to write) but instead a discussion of a language design problem I faced and the solution I found.

Mutable value semantics

The shortest summary of schmu I can give is this: It uses mutable value semantics (MVS) and is heavily inspired by OCaml. If you’re unfamiliar with MVS, the paper which introduced me to it summarizes it like this:

Mutable value semantics is a programming discipline that upholds the independence of values to support local reasoning. In the discipline’s strictest form, references become second-class citizens: they are only created implicitly, at function boundaries, and cannot be stored in variables or object fields.

schmu uses this strictest form and makes references second-class. A value in schmu may consist of multiple parts, but it owns its parts (and they own theirs, and so on). A part of a value cannot reference another value which is owned by someone else, so there is no distinction between deep and shallow copies. Similarly, a function always returns a unique value which owns its parts. However, when variables are passed downwards to functions (by value, naturally), they don’t have to be copied, due to their independence. Every pass by value is essentially a read-only borrow.

The implementation of MVS in schmu does not follow the paper. Instead of pervasive reference counting, schmu uses a borrow checker to ensure independence of values [2].

The restriction to second-class references may seem like a huge drawback at first, one which results in a lot of copies of values where other languages can just use references. As an example, a find function for a hash table can return a reference to some value in C++ or Rust without copying (and possibly heap-allocating) the value. A naive implementation in an MVS language cannot do this and must copy. To get around this, we can use higher-order functions and pass values downwards without copying, with various degrees of syntactic sugar [3]. Our (simplified) hash table might look something like this [4]:

-- simplified hash table with types [key] and [value]
type item = { key : key, value : value }
type hashtbl = array[option[item]]

fun find(hashtbl, key, fn : fun (value) -> 'a) -> option['a] {
  match hashtbl.[find_index(hashtbl, key)] {
    None -> None
    Some(item) -> Some(fn(item.value))
  }
}

We search for the value with key key in our hashtbl. If there is a value associated with key, we apply fn to it and wrap the result in an optional. Otherwise, we return None, the empty optional. From a user’s perspective, it looks like this:

-- For clarity, we're not using the sugary syntax mentioned above
match find(hashtbl, key, fun value {
  -- returns the result of this function call
  calculate_something_with(value)
}) {
  None -> println("could not find value for key")
  Some(result) -> println("found some result")
}

This certainly works, but it’s not how we want to write this code (and it looks ugly). We want to pattern-match on the value directly and not on some result which was constructed with our value. So why did we write our find function in such a backward way? It’s because the optional that we want to pass into fn (Some(item.value)) is its own unique value (which owns its data). We cannot have an optional with a reference to item.value. Instead, we would have to copy item.value to construct the optional: Some(copy(item.value)). And we want to avoid these copies. This is where move semantics come in.
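For comparison, here is a minimal Rust sketch of the two styles discussed so far. It is not schmu code and not taken from any real library: Item, Table, find_item, find_with and find_ref are hypothetical names, and a linear scan stands in for find_index. find_with mirrors the callback-based find, while find_ref builds exactly the kind of optional-holding-a-reference that strict MVS rules out.

```rust
// Hypothetical simplified table, loosely mirroring the post's hashtbl type.
struct Item {
    key: u32,
    value: String,
}

type Table = Vec<Option<Item>>;

// Assumed helper: a linear scan standing in for the post's find_index.
fn find_item(table: &Table, key: u32) -> Option<&Item> {
    table.iter().flatten().find(|item| item.key == key)
}

// Callback style: the value is only passed downwards, never returned.
fn find_with<R>(table: &Table, key: u32, f: impl FnOnce(&str) -> R) -> Option<R> {
    find_item(table, key).map(|item| f(item.value.as_str()))
}

// Reference style: the optional itself holds a reference into the table.
// This relies on first-class references, which strict MVS does not have.
fn find_ref(table: &Table, key: u32) -> Option<&str> {
    find_item(table, key).map(|item| item.value.as_str())
}
```

With find_ref the caller can match on the stored value directly, which is the ergonomics the rest of this post tries to recover without first-class references.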

Move semantics

When a data structure (say, B) keeps a reference to another data structure A, then B must not outlive A, otherwise it would create a dangling reference. Since A has a longer lifetime than B, there is another way to structure the data. B can own A (as a value) for the time it needs access and give back ownership when it is done. Move semantics ensure that we don’t need to copy (heap data) when transferring ownership. In schmu’s implementation of move semantics the compiler also stops you from trying to use a moved value and, since there are no custom destructors in schmu, frees only the existing, not-moved parts of a value when it goes out of scope [5].
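Rust has a comparable rule for structs without a custom destructor, which makes for a convenient illustration of "free only the not-moved parts". The following is a sketch in Rust, not schmu; Noisy, Pair and partial_move_demo are made-up names, and the shared log exists only to make the drops observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Records its name in a shared log when dropped, so drops can be observed.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Pair has no Drop impl of its own, so single fields may be moved out of it.
struct Pair {
    left: Noisy,
    right: Noisy,
}

// Returns the order in which the two fields were dropped.
fn partial_move_demo() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let pair = Pair {
            left: Noisy { name: "left", log: Rc::clone(&log) },
            right: Noisy { name: "right", log: Rc::clone(&log) },
        };
        // Ownership of `left` moves out; the compiler rejects later uses of
        // `pair.left` or of `pair` as a whole.
        let taken = pair.left;
        drop(taken); // "left" is dropped by its new owner
        // End of scope: only the remaining field `pair.right` is dropped.
    }
    let order = log.borrow().to_vec();
    order
}
```

Each field is dropped exactly once: the moved one by its new owner, the remaining one when the partially-moved struct goes out of scope.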

In our hash table example, we can move item.value into the optional and move it back after fn is done with it.

fun find2(hashtbl, key, fn : fun (option[value]) -> 'a) -> 'a {
  let index = find_index(hashtbl, key)
  match hashtbl.[index] {
    None -> fn(None)
    Some(item) -> {
      -- move item.value into optional
      let opt = Some(item.value)
      let result = fn(opt)
      -- move value back into item
      mut item.value = option/get(opt) -- syntax for module access, module/item
      -- return result
      result
    }
  }
}

find2(hashtbl, key, fun value {
  match value {
    None -> println("could not find value for key")
    Some(value) -> {
      -- do something with value
    }
  }
})

That’s an improvement! Our find2 function is slightly more complicated but usage has improved a lot. There’s only one problem with it. It doesn’t type-check.

Maybe you were wondering earlier about the line mut item.value = option/get(opt), where we moved value back into item, and specifically about the strange mut keyword. In schmu, every mutation is explicit: we cannot mutate a value without signaling the mutation with the mut keyword. You can think of it as passing something by pointer in C (&value), only schmu tries to use keywords instead of sigils. The problem, and the reason this code doesn’t type-check, is that we passed our hash table immutably: find2(hashtbl, ...). To pass it mutably, the code would have to say find2(mut hashtbl, ...).

There definitely can be a function which takes a mutable hash table, find_mut(mut hashtbl, key, fun (mut option[value]) -> 'a) -> 'a (see Rust’s get_mut), but that’s also not the function we want to write. This problem, that we want to be able to move immutable values but cannot reset them after moving, is the language design problem I mentioned in the beginning.
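The same tension can be reproduced in Rust, which may help readers who know that language better. The sketch below is an illustration under assumptions, not schmu semantics: Entry and find_by_move are hypothetical names, a linear scan replaces find_index, and std::mem::take leaves an empty String behind while the value is moved out. The point is the signature: moving the value out and back forces the table parameter to be &mut, even though the callback only ever sees the optional immutably.

```rust
use std::mem;

// Hypothetical table entry; the table is a slice of optional entries.
struct Entry {
    key: u32,
    value: String,
}

// Move-out-and-back version of find. Note the `&mut` on the table.
fn find_by_move<R>(
    table: &mut [Option<Entry>],
    key: u32,
    f: impl FnOnce(&Option<String>) -> R,
) -> R {
    for entry in table.iter_mut().flatten() {
        if entry.key == key {
            // Move the value out, leaving an empty String in its place.
            let opt = Some(mem::take(&mut entry.value));
            let result = f(&opt);
            // Move the value back into the entry.
            entry.value = opt.expect("opt was constructed as Some");
            return result;
        }
    }
    f(&None)
}
```

After the call the table holds the same value as before, yet a caller holding only a shared reference to the table cannot use this function at all.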

Unsafe to the rescue?

This really is a frustrating state of affairs. The function fn which uses the optional only receives it as an immutable value. It cannot change it. It should be possible to move the value back if we (or the compiler) can prove it couldn’t possibly have changed. Unfortunately, schmu doesn’t have the machinery for such proofs, and the prospect of not only having mutable and immutable variables but also mutable-but-not-really ones doesn’t look too good either.

A first workaround (I refuse to call it a solution) for this problem was using an unsafe [6] function. If we had a function which lets us move item.value without the borrow (and move) checker knowing, we could get away with it. Let me introduce: unsafe/unchecked, which essentially disables borrow checking for the passed expression and returns it as a new value.

-- create a new value, but don't tell the borrow checker
let opt = unsafe/unchecked(Some(item.value))
let result = fn(opt)
-- item.value is now owned by item and opt
-- We cannot have opt be freed if it goes out of scope,
-- otherwise that's a double-free
unsafe/leak(opt)
result

This is, of course, horrible. It might be fine for the user of our hash table because they don’t have to pay for the extra copy. But this pattern would be repeated in many libraries, and it introduces a big loophole into the language. It’s still frustrating.
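For readers more familiar with Rust, roughly the same trick can be sketched there. This is an analogy under assumptions, not a description of how unsafe/unchecked or unsafe/leak are implemented: with_aliased_option is a made-up name, std::ptr::read plays the role of creating a second owner behind the checker’s back, and std::mem::ManuallyDrop plays the role of the leak.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

// Builds an Option<String> that aliases `value` without copying heap data.
fn with_aliased_option<R>(value: &String, f: impl FnOnce(&Option<String>) -> R) -> R {
    // SAFETY (assumption of this sketch): `f` only gets a shared reference,
    // so it can neither mutate nor move the aliased String, and ManuallyDrop
    // guarantees the alias is never dropped, so the buffer is freed only
    // once, by the real owner.
    let alias = ManuallyDrop::new(Some(unsafe { ptr::read(value) }));
    f(&*alias)
}
```

As in the schmu version, the unsafety stays local to the implementation, but every library that wants the pattern has to repeat it and get the reasoning right.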

As an aside, hylo has a concept called remote parts which lets structs declare that certain fields have another owner (if my understanding is correct). This solves our problem, but creates a new one: Now we need two option types, an owning one and one for references. And option is probably not the only type where this duplication comes up.

Borrowed moves

The unsafe workaround has one very nice property: All unsafe operations are local to our hash table implementation. For the caller, everything is still fine. They borrow an immutable value which, to them, looks like any other value. Since the value is owned by the find function, and they cannot move it (it would change the value from not-moved to moved), they don’t need to worry about the potential double-free that our implementation avoids.

What if we didn’t have to move item.value into the optional but it could somehow borrow the value? For such a borrow to work,

  1. the borrow checker must be made aware that data structures might reference other data
  2. the optional must not leave its scope
  3. the optional must not free the borrowed item when it goes out of scope (at the end of find).

Luckily, the implementation of the borrow checker gives us item 1 basically for free. Item 2 is a more significant change. Whenever we produce a new value (by calling a function, or constructing a variant or record) we know the value is independent. As a consequence, the recipient of this value really owns it, and can choose how they want to use that value, whether they want to mutate or even move it. For our use case we need the optional to borrow item.value immutably. Thus, the optional itself must be restricted in its use. It, too, must only be used immutably. There is one other item in the language which also behaves like this: string literals. A string literal by itself can only be borrowed. To own a string, we need to copy [7] a string literal.

-- error: Borrowed string literal has been moved in line 1.
-- let owned_string = mov "some literal"
let owned_string = mov copy("some literal")

This leaves us with item 3. The careful reader might already have noticed that there is precedent in schmu for this item, too. In our implementation of move semantics it’s perfectly legal [8] for a value to be destroyed while it’s in a partially-moved state. The compiler will simply omit the moved fields [9]. Coincidentally, in the implementation of moves there is one compiler pass responsible for tracking moves (the borrow checker) and another pass which realizes them. It turns out that we can implement “borrowed moves” by not moving in the borrow checker and still realizing a move in the second pass. Syntax-wise, these borrowed moves look like this:

fun find3(hashtbl, key, fn) {
  let index = find_index(hashtbl, key)
  match hashtbl.[index] {
    None -> fn(None)
    Some(item) -> {
      -- note the 'bor' keyword
      fn(Some(bor item.value))
    }
  }
}

The bor keyword indicates an explicit borrow where usually a move occurs. The cool thing about all of this is that code using this optional with borrowed content doesn’t need to know about the borrows. It’s still a normal value, as far as fn is concerned.

Conclusion

In the few weeks since implementing borrowed moves, I have found that they simplify many patterns which were awkward in schmu before. As one last example, imagine we want to pattern match on two immutable values. Since there is no special syntax for this case in schmu, we’d want to construct a tuple and use a tuple pattern to match it.

match (v1, v2) {
  (Some(_), None) -> -- whatever
  ...
}

But, since a tuple creates a new value, the tuple wants to own both v1 and v2 and they would need to be moved. Now, we can express this as

match (bor v1, bor v2) {
  (Some(_), None) -> -- whatever
  ...
}

How cool is that?! I’m very happy with how this experiment turned out. It makes the language more expressive, and the borrow checker can still ensure that our values are used independently, so we still get the benefits of value semantics. The “downward borrowing” idea of passing values into functions now extends to structures as well, and we don’t need explicit lifetimes for it because the lifetimes still follow lexical scoping.
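The tuple example has a close counterpart in Rust, where the tuple is built from references instead of from the values themselves. The sketch below is only an analogy (match_borrowed_pair is a made-up name): unlike schmu’s bor, the Rust tuple really does contain first-class references.

```rust
// Matches on two optionals without moving either of them into the tuple.
fn match_borrowed_pair() -> (&'static str, usize) {
    let v1 = Some(String::from("left"));
    let v2: Option<String> = None;

    // The tuple borrows both values; writing (v1, v2) would move them.
    let label = match (&v1, &v2) {
        (Some(_), None) => "only first",
        (None, Some(_)) => "only second",
        (Some(_), Some(_)) => "both",
        (None, None) => "neither",
    };

    // v1 was only borrowed above, so it can still be consumed here.
    (label, v1.map_or(0, |s| s.len()))
}
```

In both languages the temporary tuple cannot outlive the values it was built from, which is what keeps the pattern safe.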

So far, I have only implemented these borrowed moves for the case of immutable borrows. I think that in principle they could be made to work for mutable borrows as well: Some(mut item.value). But since string literals are always immutable, the implementation isn’t as straightforward as for the immutable case, and mutable bindings can always be moved and re-set anyway, so supporting them is not as urgent.

Footnotes:

1

It’s a temporary name, but all the good birds and gems seem to be taken already.

2

The authors of the paper themselves have since moved away from reference counting in their language hylo, which also uses borrow checking to ensure independence of values.

3

hylo has subscripts for this, and schmu has a form of let-expressions similar to gleam’s use-expression.

4

I’m using schmu syntax here. Hopefully it’s easy enough to follow.

5

For a custom destructor to run, all data must be available in the object.

6

schmu doesn’t have unsafe scopes like Rust. Instead, unsafe functionality is indicated by name and bundled in the unsafe module of schmu’s standard library.

7

copy is a builtin which auto-generates a copy-function for each type.

8

As long as only the remaining parts are used, and not the value as a whole or the moved-from parts.

9

In reality, a specialized destructor will be generated which only frees the remaining parts.

Last updated: 2025-08-23 21:55
