The missing piece for schmu
2025-08-22
For the last couple of years I’ve been working on a programming language called
schmu [1]. This is not an introductory post about schmu (that, I have yet
to write) but instead a discussion of a language design problem I faced and the
solution I found.
Mutable value semantics
The shortest summary of schmu I can give is this: It uses mutable value
semantics (MVS) and is heavily inspired by OCaml. If you’re unfamiliar with
MVS, the paper which introduced me to it summarizes it like this:
“Mutable value semantics is a programming discipline that upholds the independence of values to support local reasoning. In the discipline’s strictest form, references become second-class citizens: they are only created implicitly, at function boundaries, and cannot be stored in variables or object fields.”
schmu uses this strictest form and makes references second-class. A value in
schmu may consist of multiple parts, but it owns its parts (and they own
theirs etc.). A part of a value cannot reference another value which is owned by
someone else, so there is no distinction between deep and shallow copies.
Similarly, a function always returns a unique value which owns its parts.
However, when passing variables downwards to functions (by value, naturally)
they don’t have to be copied due to their independence. Every pass by value is
essentially a read-only borrow.
The implementation of MVS in schmu does not follow the paper. Instead
of pervasive reference counting, schmu uses a borrow checker to ensure
independence of values [2].
The restriction of second-class references may at first seem like a huge
drawback, one which results in many copies of values where other languages can
just use references. As an example, a find function for a hash table can return a
reference to some value in C++ or Rust without copying (and possibly
heap-allocating) the value. A naive implementation in an MVS language cannot do
this and must copy. To get around this, we can use higher-order functions and
pass values downwards without copying, with various degrees of syntactic
sugar [3]. Our (simplified) hash table might look something like this [4]:
    -- simplified hash table with types [key] and [value]
    type item = { key : key, value : value }
    type hashtbl = array[option[item]]

    fun find(hashtbl, key, fn : fun (value) -> 'a) -> option['a] {
      match hashtbl.[find_index(hashtbl, key)] {
        None -> None
        Some(item) -> Some(fn(item.value))
      }
    }
We search for the value with key key in our hashtbl. If there is a value
associated with key, we apply the function fn to it and wrap the result in an
optional. Otherwise, we return None, the empty optional. From a user’s
perspective, it looks like this:
    -- For clarity, we're not using the sugary syntax mentioned above
    match find(hashtbl, key, fun value {
      -- returns the result of this function call
      calculate_something_with(value)
    }) {
      None -> println("could not find value for key")
      Some(result) -> println("found some result")
    }
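For readers more at home in Rust, the same downward-passing shape can be sketched roughly as follows. This is an analogue, not schmu code: the Item struct with a u32 key and a String value, the sample_table helper, and the linear scan standing in for find_index are all assumptions made for illustration.

```rust
// Rust analogue of the callback-style `find`; all names are illustrative.
struct Item {
    key: u32,
    value: String,
}

// The value is only ever passed downwards to `f`; the caller gets back
// whatever `f` computed, wrapped in an Option, never the value itself.
// A linear scan stands in for `find_index` (no real hashing here).
fn find<R>(table: &[Option<Item>], key: u32, f: impl FnOnce(&str) -> R) -> Option<R> {
    table
        .iter()
        .flatten() // skip empty slots
        .find(|item| item.key == key)
        .map(|item| f(&item.value))
}

// Small helper that builds a table with one empty slot.
fn sample_table() -> Vec<Option<Item>> {
    vec![
        Some(Item { key: 1, value: String::from("one") }),
        None,
        Some(Item { key: 3, value: String::from("three") }),
    ]
}
```

Calling find(&sample_table(), 3, |v| v.len()) yields the length of the stored string wrapped in an Option, which is exactly the "result constructed from our value" shape discussed next.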
This certainly works, but it’s not how we want to write this code (and it looks
ugly). We want to pattern-match on the value directly and not on some result
which was constructed with our value. So why did we write our find function in
such a backward way? It’s because the optional that we want to pass into fn
(Some(item.value)) is its own unique value (which owns its data). We cannot
have an optional with a reference to item.value. Instead, we have to copy
item.value to construct the optional Some(copy(item.value)). And we want to
avoid these copies. This is where move semantics come in.
Move semantics
When a data structure (say, B) keeps a reference to another data structure A,
then B must not outlive A; otherwise, there would be a dangling reference. Since
A has a longer lifetime than B, there is another way to structure the data: B
can own A (as a value) for the time it needs access and give back ownership when
it is done. Move semantics ensure that we don’t need to copy (heap data) when
transferring ownership. In schmu’s implementation of move semantics, the
compiler also stops you from trying to use a moved value and, since there are no
custom destructors in schmu, frees only the existing, not-moved parts of a
value when it goes out of scope [5].
In our hash table example, we can move item.value into the optional and move
it back after fn is done with it.
    fun find2(hashtbl, key, fn : fun (option[value]) -> 'a) -> 'a {
      let index = find_index(hashtbl, key)
      match hashtbl.[index] {
        None -> fn(None)
        Some(item) -> {
          -- move item.value into optional
          let opt = Some(item.value)
          let result = fn(opt)
          -- move value back into item
          mut item.value = option/get(opt) -- syntax for module access, module/item
          -- return result
          result
        }
      }
    }

    find2(hashtbl, key, fun value {
      match value {
        None -> println("could not find value for key")
        Some(value) -> {
          -- do something with value
        }
      }
    })
That’s an improvement! Our find2 function is slightly more
complicated but usage has improved a lot. There’s only one problem with it. It
doesn’t type-check.
Maybe you were wondering earlier about the line mut item.value =
option/get(opt), where we moved value back into item, and specifically about the
strange mut keyword. In schmu, every mutation is explicit: we cannot mutate
a value without signaling the mutation by using the mut keyword. You can think
of it as passing something by pointer in C (&value), only schmu tries to use
keywords instead of sigils. The problem, and the reason this code doesn’t
type-check, is that we passed our hash table immutably: find2(hashtbl, ...).
Had we passed it mutably, the code would say find2(mut hashtbl, ...).
There definitely can be a function which takes a mutable hash table
find_mut(mut hashtbl, key, fun (mut option[value]) -> 'a) -> 'a (see Rust’s
get_mut), but that’s also not the function we want to write. This problem, that
we want to be able to move immutable values but cannot reset them after moving,
is the language design problem I mentioned in the beginning.
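The same constraint shows up in a rough Rust analogue of find2. This is a sketch under assumptions, not schmu's behavior: Item, the u32 key, the String value and the linear scan are made up for illustration, and std::mem::take leaves an empty string in the slot while the value is lent out.

```rust
// Illustrative types; not part of schmu.
struct Item {
    key: u32,
    value: String,
}

// Moving the value out and back again writes to the slot, so the table
// must be borrowed mutably even though `f` only reads the optional.
fn find2<R>(
    table: &mut [Option<Item>],
    key: u32,
    f: impl FnOnce(&Option<String>) -> R,
) -> R {
    for item in table.iter_mut().flatten() {
        if item.key == key {
            // Move the value into the optional, leaving an empty String behind.
            let opt = Some(std::mem::take(&mut item.value));
            let result = f(&opt);
            // Move the value back into the item (no heap data is copied).
            item.value = opt.unwrap();
            return result;
        }
    }
    f(&None)
}
```

The signature makes the issue visible: a lookup that is logically read-only demands &mut access to the table, purely because of the move-and-restore dance inside.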
Unsafe to the rescue?
This really is a frustrating state of affairs. The function fn which uses the
optional only receives it as an immutable value. It cannot change it. It should
be possible to move the value back if we (or the compiler) can prove it couldn’t
possibly have changed. Unfortunately, schmu doesn’t have the machinery for
such proofs, and the prospect of having not only mutable and immutable variables
but also mutable-but-not-really ones doesn’t look too good either.
A first workaround (I refuse to call it solution) for this problem was using an
unsafe [6] function. If we had a function which lets us move item.value
without the borrow (and move) checker knowing, we could get away with it. Let me
introduce: unsafe/unchecked, which essentially disables borrow checking for
the passed expression and returns it as a new value.
    -- create a new value, but don't tell the borrow checker
    let opt = unsafe/unchecked(Some(item.value))
    let result = fn(opt)
    -- item.value is now owned by item and opt
    -- We cannot have opt be freed if it goes out of scope,
    -- otherwise that's a double-free
    unsafe/leak(opt)
    result
This is, of course, horrible. It might be fine for the user of our hash table because they don’t have to pay for the extra copy. But this pattern would be repeated in many libraries, and it introduces a big loophole into the language. It’s still frustrating.
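To illustrate how fragile this trick is, here is a loose Rust counterpart. It is only an analogy: std::ptr::read plays the role of unsafe/unchecked and ManuallyDrop plays the role of unsafe/leak, while Item, find_unchecked and the linear scan are assumptions made for this sketch. The value type is fixed to String on purpose, since the same code would not be sound for types with interior mutability.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

// Illustrative types; not part of schmu.
struct Item {
    key: u32,
    value: String,
}

fn find_unchecked<R>(
    table: &[Option<Item>],
    key: u32,
    f: impl FnOnce(&Option<String>) -> R,
) -> R {
    for item in table.iter().flatten() {
        if item.key == key {
            // SAFETY: `ptr::read` makes a bitwise duplicate of the String, so
            // the heap buffer temporarily has two owners. `f` only receives a
            // shared reference and can neither mutate nor move the duplicate.
            // ManuallyDrop guarantees the duplicate is never dropped, which
            // would otherwise free the buffer a second time.
            let opt = ManuallyDrop::new(Some(unsafe { ptr::read(&item.value) }));
            return f(&*opt);
        }
    }
    f(&None)
}
```

The table is only borrowed immutably here, which is the desired signature, but the correctness argument now lives entirely in a comment.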
As an aside, hylo has a concept called remote parts which lets structs
declare that certain fields have another owner (if my understanding is correct).
This solves our problem, but creates a new one: Now we need two option types,
an owning one and one for references. And option is probably not the only type
where this duplication comes up.
Borrowed moves
The unsafe workaround has one very nice property: All unsafe operations are
local to our hash table implementation. For the caller, everything is still
fine. They borrow an immutable value which, to them, looks like any other value.
Since the value is owned by the find function, and they cannot move it (it
would change the value from not-moved to moved), they don’t need to worry about
the potential double-free that our implementation avoids.
What if we didn’t have to move item.value into the optional but it could
somehow borrow the value? For such a borrow to work,
1. the borrow checker must be made aware that data structures might reference other data
2. the optional must not leave its scope
3. the optional must not free the borrowed item when it goes out of scope (at the end of find).
Luckily, the implementation of the borrow checker gives us the first point
basically for free. The second is a more significant change. Whenever we produce a new value (by
calling a function, or constructing a variant or record) we know the value is
independent. As a consequence, the recipient of this value really owns it, and
can choose how they want to use that value, whether they want to mutate or even
move it. For our use case we need the optional to borrow item.value immutably.
Thus, the optional itself must be restricted in its use. It, too, must only be
used immutably. There is one other item in the language which also behaves like
this: string literals. A string literal by itself can only be borrowed. To own a
string, we need to copy [7] a string literal.
    -- error: Borrowed string literal has been moved in line 1.
    -- let owned_string = mov "some literal"
    let owned_string = mov copy("some literal")
This leaves us with the third point. The careful reader might already have
noticed that there is precedent in schmu for this one as well. In our implementation of move
semantics it’s perfectly legal [8] for a value to be destroyed while it’s in a
partially-moved state. The compiler will simply omit the moved fields [9].
Coincidentally, in the implementation of moves there is one compiler pass
responsible for tracking moves (the borrow checker) and another pass which
realizes them. It turns out that we can implement “borrowed moves” by not-moving
in the borrow checker and still realizing a move in the second pass.
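Rust behaves comparably for types without a custom destructor, which may help picture the partially-moved state. The sketch below is an analogy only and says nothing about how schmu's two passes work internally; Item and split_item are hypothetical names.

```rust
// Illustrative type without a custom destructor (no `Drop` impl).
struct Item {
    key: String,
    value: String,
}

fn split_item(item: Item) -> (String, usize) {
    // Move one field out; `item` is now partially moved.
    let value = item.value;
    // The remaining field can still be used on its own...
    let key_len = item.key.len();
    // ...but using `item` as a whole here would be a compile error.
    // When `item` goes out of scope, only `item.key` is freed.
    (value, key_len)
}
```

If Item implemented Drop, the partial move would be rejected by the compiler, which matches the remark about custom destructors needing all data to be present.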
Syntax-wise, these borrowed moves look like this
    fun find3(hashtbl, key, fn) {
      let index = find_index(hashtbl, key)
      match hashtbl.[index] {
        None -> fn(None)
        Some(item) -> {
          -- note the 'bor' keyword
          fn(Some(bor item.value))
        }
      }
    }
The bor keyword indicates an explicit borrow where usually a move occurs. The
cool thing about all of this is that code using this optional with borrowed
content doesn’t need to know about the borrows. It’s still a normal value, as
far as fn is concerned.
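For comparison, the closest Rust shape is an optional that holds a reference. This is a sketch with assumed names (Item, a u32 key, a String value, a linear scan instead of find_index), and it differs from schmu in one important way: in Rust the borrow is visible in the type the callback receives, whereas with bor the callback sees an ordinary optional.

```rust
// Illustrative types; not part of schmu.
struct Item {
    key: u32,
    value: String,
}

fn find3<R>(
    table: &[Option<Item>],
    key: u32,
    f: impl FnOnce(Option<&String>) -> R,
) -> R {
    let found = table.iter().flatten().find(|item| item.key == key);
    // `&item.value` borrows instead of moving, roughly what `bor` expresses.
    f(found.map(|item| &item.value))
}
```

The caller can now pattern-match on the optional directly, and the table is only borrowed immutably for the duration of the call.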
Conclusion
In the few weeks since implementing borrowed moves, I have found that they simplify many
patterns which were awkward in schmu before. As one last example, imagine we
want to pattern match on two immutable values. Since there is no special syntax
for this case in schmu, we’d want to construct a tuple and use a tuple pattern
to match it.
    match (v1, v2) {
      (Some(_), None) -> -- whatever
      ...
    }
But, since a tuple creates a new value, the tuple wants to own both v1 and
v2 and they would need to be moved. Now, we can express this as
    match (bor v1, bor v2) {
      (Some(_), None) -> -- whatever
      ...
    }
How cool is that?! I’m very happy with how this experiment turned out. It makes the language more expressive, and because the borrow checker can still ensure that our values are used independently, we keep the benefits of value semantics. The “downward borrowing” idea of passing values into functions now extends to structures as well, and we don’t need explicit lifetimes for it because the lifetimes still follow lexical scoping.
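The tuple match has a close Rust counterpart, where a tuple of references serves the same purpose. The describe function and its return strings are made up for this sketch.

```rust
// Builds a temporary tuple of two references and matches on it.
fn describe(v1: &Option<String>, v2: &Option<String>) -> &'static str {
    // The tuple only holds references, so neither optional is moved.
    match (v1, v2) {
        (Some(_), None) => "only first",
        (None, Some(_)) => "only second",
        (Some(_), Some(_)) => "both",
        (None, None) => "neither",
    }
}
```

Both optionals remain usable by the caller after the match, just as v1 and v2 do after match (bor v1, bor v2).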
So far, I have only implemented these borrowed moves for the case of immutable
borrows. I think that in principle they could be made to work for mutable
borrows as well, e.g. Some(mut item.value). But since string literals are always
immutable, the implementation isn’t as straightforward as for the immutable
case, and mutable bindings can always be moved and re-set anyway, so supporting
them is not as urgent.
Footnotes:
[1] It’s a temporary name, but all the good birds and gems seem to be taken already.
[2] The authors of the paper themselves have since moved away from reference
counting in their language hylo, which also uses borrow checking to ensure
independence of values.
[3] hylo has subscripts for this, and schmu has a form of let-expressions
similar to gleam’s use-expression.
[4] I’m using schmu syntax here. Hopefully it’s easy enough to follow.
[5] For the custom destructor to run, all data must be available in the object.
[6] schmu doesn’t have unsafe scopes like Rust. Instead, unsafe
functionality is indicated by name and bundled in the unsafe module of
schmu’s standard library.
[7] copy is a builtin which auto-generates a copy-function for each type.
[8] As long as only the remaining parts are used, and not the value as a whole or the moved-from parts.
[9] In reality, a specialized destructor will be generated which only frees the remaining parts.
Last updated: 2025-08-23 21:55