The Lake Programming Language

Lake is a process-oriented programming language built around machines, branches, and state transitions.

Programs in Lake are composed of machines — lightweight processes that define behavior through pattern-matched branches with typed parameters. Machines communicate by spawning new processes or transitioning their own state via self(). A cooperative scheduler manages concurrent execution of all spawned machines.

@rt(rt_write)

counter is {
  n i64 -> {
    when 0 == n {
      true  -> { rt_write(1 "done\n" 5) }
      false -> { self(n-1) }
    }
  }
}

main is {
  _ i64.0 -> {
    counter(5)
    counter(3)
    counter(7)
  }
}

This program spawns three independent counter processes that run concurrently. Each counter decrements until it reaches zero, then prints “done”.

Key Ideas

  • Machines are the core abstraction — each machine call spawns a new cooperatively-scheduled process
  • Branches define behavior through pattern matching on argument types
  • self(args) performs a state transition within the current process (no new process is spawned)
  • Calling another machine spawns it as a new concurrent process
  • Cooperative scheduling — each process runs a quantum of work before yielding to the scheduler
  • O(1) branch dispatch — branches are selected by hashing the argument types at compile time

Status

Lake is in active development. This book documents the features that are currently implemented and working in the native compiler.

Getting Started

Building the Compiler

The Lake native compiler is written in Rust (edition 2024) and uses Cranelift as its code generation backend:

git clone https://github.com/morphqdd/lake-native-compiler.git
cd lake-native-compiler
cargo build

Running an Example

cargo run -- examples/counter.lake -o counter
./counter

First Program

Here is a minimal Lake program:

@rt(rt_write)

main is {
  _ i64.0 -> {
    rt_write(1 "hello, lake!\n" 13)
  }
}

This program:

  1. Declares a runtime function rt_write via the @rt directive
  2. Defines a main machine with a single branch
  3. Calls rt_write with a file descriptor (1 = stdout), a string, and its length
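The third argument is a byte count, so it has to agree with the encoded length of the string. As a quick cross-check outside Lake, the Python sketch below counts the bytes of a literal. It assumes string data is stored as UTF-8 with an escape such as \n occupying a single byte, which this book does not state explicitly; byte_len is a hypothetical helper name.

```python
def byte_len(text: str) -> int:
    """Number of bytes a string occupies, assuming UTF-8 storage (an assumption)."""
    return len(text.encode("utf-8"))

# "done\n" is 5 bytes: four letters plus the newline.
done_len = byte_len("done\n")

# The same helper gives the length argument for any other literal.
hello_len = byte_len("hello, lake!\n")
```

Running a literal through a check like this before hard-coding the length avoids off-by-one mistakes in rt_write calls.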

Program Structure

Every Lake program consists of:

  1. Directives: compiler attributes that declare runtime functions (@rt)
  2. Machines: the program logic, defined with is

@rt(rt_write)                # 1. directive

main is {                    # 2. machine
  _ i64.0 -> {
    rt_write(1 "done\n" 5)
  }
}

Machines can be declared in any order — forward references are resolved automatically.

Machines

A machine is the core abstraction in Lake. Every machine call spawns a new cooperatively-scheduled process.

Defining a Machine

name is {
  # branches
}

The is keyword separates the machine name from its body. The body is enclosed in curly braces and contains one or more branches.

A Simple Machine

A counter that recursively decrements until it reaches zero:

@rt(rt_write)

counter is {
  n i64 -> {
    when 0 == n {
      true  -> { rt_write(1 "done\n" 5) }
      false -> { self(n-1) }
    }
  }
}

main is {
  _ i64.0 -> {
    counter(5)
  }
}

When main calls counter(5), a new process is spawned running the counter machine. The self(n-1) call does not spawn a new process — it transitions the current process to a new state.

Concurrent Execution

Calling a machine always spawns a new process. Multiple spawns create concurrent processes managed by the cooperative scheduler:

@rt(rt_write)

worker is {
  steps i64 acc1 i64 acc2 i64 -> {
    when 1 <= steps {
      true  -> { self(steps-1 acc2 acc1+acc2) }
      false -> { rt_write(1 ".\n" 2) }
    }
  }
}

main is {
  _ i64.0 -> {
    worker(100000 0 1)
    worker(100000 0 1)
    worker(100000 0 1)
    worker(100000 0 1)
  }
}

Here main spawns four worker processes. They execute concurrently — each process runs a quantum of work (256 blocks) before yielding to the scheduler.

Ping-Pong Example

Machines can spawn each other:

@rt(rt_write)

pong is {
  _ i64.0 -> {
    rt_write(1 "pong\n" 5)
  }
}

ping is {
  _ i64.0 -> {
    rt_write(1 "ping\n" 5)
    pong()
  }
}

main is {
  _ i64.0 -> {
    ping()
    ping()
    ping()
  }
}

main spawns three ping processes. Each ping prints “ping” and then spawns a pong process that prints “pong”.

Declaration Order

Machines can be declared in any order. Forward references work:

@rt(rt_write)

main is {
  _ i64.0 -> {
    worker(10)
  }
}

worker is {
  n i64 -> {
    when 0 == n {
      true  -> { rt_write(1 "done\n" 5) }
      false -> { self(n-1) }
    }
  }
}
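One common way to support forward references is to resolve names in two passes: collect every declaration first, then check the calls. The Python sketch below illustrates that idea on a toy representation of a program. Whether the Lake compiler works this way is an assumption, and the resolve function and its input format are hypothetical.

```python
def resolve(program: dict[str, list[str]]) -> list[str]:
    """Return names that are called but never declared.

    `program` maps each machine name to the names it calls.
    """
    # Pass 1: collect every declared machine, regardless of order.
    declared = set(program)
    # Pass 2: check each call against the complete set of declarations.
    unresolved = []
    for calls in program.values():
        for callee in calls:
            if callee != "self" and callee not in declared and callee not in unresolved:
                unresolved.append(callee)
    return unresolved

# main is declared before worker, yet the call still resolves.
program = {"main": ["worker"], "worker": ["self"]}
missing = resolve(program)
```

Because the first pass sees the whole file before any call is checked, the order of declarations has no effect on the result.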

Branches & Patterns

Branches define a machine’s behavior. Each branch describes which arguments are accepted and what happens when they are received.

Branch Syntax

pattern+ -> { body }

  • pattern+: one or more patterns (parameters)
  • { body }: branch body containing expressions

Patterns

A pattern declares a parameter with a name, a type, and an optional default value:

ident Type            # typed parameter
ident Type.default    # parameter with default value
_                     # wildcard — ignored parameter

Typed Parameters

sum is {
  n i64 acc i64 -> {
    when 0 == n {
      true  -> { rt_write(1 "done\n" 5) }
      false -> { self(n-1 acc+n) }
    }
  }
}

This branch takes two i64 parameters: n and acc.

Default Values

A default value is specified after the type, separated by a dot (.):

main is {
  _ i64.0 -> {
    counter(5)
  }
}

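Here _ i64.0 is a wildcard parameter of type i64 with default value 0. Parameters with defaults do not participate in branch signature matching, so a branch whose only parameter has a default matches a call with no arguments, as in pong().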

Wildcard

The _ symbol means the argument is ignored. It does not bind a variable:

pong is {
  _ i64.0 -> {
    rt_write(1 "pong\n" 5)
  }
}

Branch Dispatch

When a machine is called, the compiler selects the matching branch based on the types of the provided arguments. Branch dispatch is O(1) — argument types are hashed at compile time.

Only non-default, non-wildcard pattern types participate in the hash. For example:

counter is {
  n i64 -> { ... }
}

A call counter(5) matches this branch because 5 is i64 and the branch expects i64.
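The dispatch rule can be modeled in a few lines of Python. This is a sketch of the rule as described in this section, not the compiler's implementation: the hash function, the table layout, and the names signature, build_table, and dispatch are all assumptions made for illustration.

```python
def signature(patterns):
    """Types that take part in dispatch.

    Each pattern is (name, type, default); default is None when absent.
    Wildcards and parameters with defaults are left out of the signature.
    """
    return tuple(ty for name, ty, default in patterns
                 if default is None and name != "_")

def build_table(branches):
    """Map hash(signature) -> branch label, built once ahead of time."""
    return {hash(signature(patterns)): label for label, patterns in branches}

def dispatch(table, arg_types):
    """Single dictionary lookup keyed by the hash of the argument types."""
    return table.get(hash(tuple(arg_types)))

# A machine with two branches: one defaulted wildcard, one (pid, i64).
branches = [
    ("init", [("_", "i64", 0)]),
    ("loop", [("partner", "pid", None), ("remaining", "i64", None)]),
]
table = build_table(branches)
```

In this model a call with no arguments selects "init", a call with a pid and an i64 selects "loop", and any other combination finds no branch.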

Multiple Parameters

Parameters are separated by spaces:

worker is {
  steps i64 acc1 i64 acc2 i64 -> {
    when 1 <= steps {
      true  -> { self(steps-1 acc2 acc1+acc2) }
      false -> { rt_write(1 ".\n" 2) }
    }
  }
}

This branch takes three parameters, steps, acc1, and acc2, all of type i64.

Scope

Each branch has its own scope. Variables declared through patterns are only accessible within that branch’s body.

Types

Primitive Types

Lake currently supports three primitive types:

Type   Description
i64    64-bit signed integer
str    String (fat pointer: start + end addresses)
pid    Process identifier (fat pointer to process context)

All values are represented as i64 at the machine code level. Strings are stored as read-only data with fat pointers. Process identifiers are heap pointers that remain stable for the lifetime of the process.
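To make the fat-pointer idea concrete, the Python sketch below treats a string as a pair of positions into a block of read-only bytes. It is only a model: offsets stand in for real addresses, the end position is assumed to be exclusive, and FatPtr, RODATA, and read are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FatPtr:
    start: int  # position of the first byte
    end: int    # position one past the last byte (assumed exclusive)

    def length(self) -> int:
        return self.end - self.start

# Read-only data holding two string literals back to back.
RODATA = b"done\nhello\n"

def read(ptr: FatPtr) -> bytes:
    """Return the bytes a fat pointer refers to."""
    return RODATA[ptr.start:ptr.end]

done = FatPtr(0, 5)
hello = FatPtr(5, 11)
```

With both ends stored, the length of a string is a subtraction and no terminator byte is needed in the data.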

Type Annotations

Types appear in patterns:

n i64                 # 64-bit integer

And in default values:

_ i64.0               # i64 with default value 0

Type-Based Dispatch

Types are central to branch dispatch. When a machine is called, the compiler matches the argument types against branch signatures:

counter is {
  n i64 -> { ... }   # matches calls with one i64 argument
}

The hash of argument types determines which branch receives the call. This dispatch is O(1) at runtime.

Process Identifiers (pid)

The pid type represents a handle to a spawned process. Calling a machine returns a pid, which can then be used to send messages to the new process:

@rt(rt_write)

receiver is {
  _ i64.0 -> {
    wait { n i64 -> { rt_write(1 "got message\n" 12) } }
  }
}

main is {
  _ i64.0 -> {
    let p pid = receiver()
    p(42)                    # send message to receiver
  }
}

A pid is a stable heap pointer that identifies the process for its entire lifetime. It can be stored in variables, passed as arguments, and sent through mailboxes.

Expressions

Expressions make up the body of branches.

Literals

42                   # number (i64)
0                    # number (i64)
100000               # number (i64)
"hello\n"            # string (str)
true                 # boolean
false                # boolean

Numbers are i64. Strings support escape sequences: \n, \t, \r, \\, \".
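The escape sequences listed above can be decoded with a small loop. The Python sketch below is a hypothetical decoder for the body of a literal (without the surrounding quotes). Treating any other backslash sequence as an error is an assumption, since the book does not say how unknown escapes are handled.

```python
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", '"': '"'}

def unescape(body: str) -> str:
    """Decode the documented escape sequences in a literal's body."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                # Assumption: anything outside the documented set is an error.
                raise ValueError(f"unknown escape at index {i}")
            out.append(ESCAPES[body[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

decoded = unescape(r"hello\n")
```

Each two-character escape in the source decodes to one character, which is why the source text hello\n counts as 6 bytes when passed to rt_write.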

Variables

Names declared in branch patterns are available as variables throughout the branch body:

counter is {
  n i64 -> {
    self(n-1)        # "n" is available here
  }
}

Let Bindings

let x i64 = 42

Creates a local variable within the current branch.

Arithmetic

Binary operators with precedence:

Precedence     Operators   Description
10 (highest)   * /         Multiplication, division
9              + -         Addition, subtraction

All operators are left-associative and operate on i64:

n - 1                # subtraction
acc + n              # addition
acc1 + acc2          # addition
steps - 1            # subtraction

Arithmetic works in arguments:

self(n-1 acc+n)                # two computed arguments
self(steps-1 acc2 acc1+acc2)   # three arguments, two of them computed

Comparisons

Precedence   Operators       Description
8            <= >= == < >    Comparison

0 == n               # equality check
1 <= steps           # less-or-equal check

Comparisons bind less tightly than arithmetic, so a + b <= c parses as (a + b) <= c.
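The grouping implied by the precedence levels can be checked with a small precedence-climbing parser. The Python sketch below uses the levels 10, 9, and 8 quoted in this section and returns a fully parenthesized string. It is an illustration of the technique, not the Lake parser; it handles only names, numbers, and binary operators, and tokenize, parse, and group are hypothetical names.

```python
import re

# Binding powers from this section; a higher number binds tighter.
PRECEDENCE = {"*": 10, "/": 10, "+": 9, "-": 9,
              "<=": 8, ">=": 8, "==": 8, "<": 8, ">": 8}

TOKEN = re.compile(r"\s*(<=|>=|==|[-+*/<>]|\w+)")

def tokenize(src: str) -> list[str]:
    return TOKEN.findall(src)

def parse(tokens: list[str], min_prec: int = 0) -> str:
    """Precedence climbing; returns a fully parenthesized string."""
    left = tokens.pop(0)  # operand: a name or a number
    while tokens and PRECEDENCE.get(tokens[0], -1) >= min_prec:
        op = tokens.pop(0)
        # Left associativity: the right side must bind strictly tighter.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = f"({left} {op} {right})"
    return left

def group(src: str) -> str:
    return parse(tokenize(src))

grouped = group("a + b <= c")
```

The same function shows left associativity: a - b - c groups as ((a - b) - c), and a + b * c groups as (a + (b * c)).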

When Expressions

Conditional branching based on an expression:

when 0 == n {
  true  -> { rt_write(1 "done\n" 5) }
  false -> { self(n-1) }
}

when 1 <= steps {
  true  -> { self(steps-1 acc2 acc1+acc2) }
  false -> { rt_write(1 ".\n" 2) }
}

Syntax:

when condition {
  pattern -> { body }
  pattern -> { body }
}

The condition is any expression. Arms match on literal values (numbers, booleans). If no arm matches, execution falls through silently.

Numeric pattern matching is also supported:

when some_value {
  0 -> { ... }
  1 -> { ... }
  2 -> { ... }
}
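The matching rule, including the silent fall-through, can be modeled as a lookup over (literal, body) pairs. The Python sketch below is a hypothetical model. Python treats True as equal to 1, so the type check keeps boolean and numeric literals apart in this model only; how Lake relates the two is not covered here.

```python
def when(value, arms):
    """Run the first arm whose literal equals `value`.

    `arms` is a list of (literal, thunk) pairs. Returns None when nothing
    matches, mirroring the silent fall-through.
    """
    for literal, body in arms:
        # Compare the type as well, so that True does not match the number 1.
        if type(literal) is type(value) and literal == value:
            return body()
    return None

def describe(n):
    """Model of the counter branch: report which arm would run."""
    return when(n == 0, [
        (True, lambda: "done"),
        (False, lambda: f"self({n - 1})"),
    ])

result = describe(3)
```

A value with no matching arm, such as 7 against arms for 0 and 1, yields None here, which corresponds to execution continuing past the when expression.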

Wait Expression

The wait expression suspends the current process until a message arrives in its mailbox:

wait {
  n i64 -> { rt_write(1 "received\n" 9) }
}

When a message arrives, it is dequeued from the mailbox and the handler body executes with the message value bound to the pattern variables.

If the mailbox is empty, the process is suspended and moved to the scheduler’s wait array. When another process sends a message, the waiting process is awakened.
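The suspend and wake behavior can be sketched as two small operations on a scheduler. The Python model below is an assumption-laden illustration: the Process and Scheduler classes are hypothetical, and a plain list stands in for the wait array.

```python
from collections import deque

class Process:
    def __init__(self, name):
        self.name = name
        self.mailbox = deque()
        self.received = []

class Scheduler:
    def __init__(self):
        self.active = deque()   # runnable processes
        self.waiting = []       # stand-in for the wait array

    def wait(self, proc):
        """Handle a wait: take a message, or suspend if there is none."""
        if proc.mailbox:
            proc.received.append(proc.mailbox.popleft())
            return True
        if proc not in self.waiting:
            self.waiting.append(proc)
        return False

    def send(self, target, message):
        """Enqueue a message and wake the target if it was suspended."""
        target.mailbox.append(message)
        if target in self.waiting:
            self.waiting.remove(target)
            self.active.append(target)

sched = Scheduler()
receiver = Process("receiver")
suspended = not sched.wait(receiver)   # mailbox empty: receiver is parked
sched.send(receiver, 42)               # the sender wakes it up
resumed = sched.wait(receiver)         # the retried wait now succeeds
```

The receiver moves from the wait array back to the active queue as a side effect of the send, so no polling is involved in this model.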

Multiple messages can be handled by using wait in a loop:

@rt(rt_write)

receiver is {
  remaining i64 -> {
    when 1 <= remaining {
      true -> {
        wait {
          n i64 -> {
            rt_write(1 "." 1)
            self(remaining-1)
          }
        }
      }
    }
  }
}

main is {
  _ i64.0 -> {
    let r pid = receiver(3)
    r(1)
    r(2)
    r(3)
  }
}

Message Sending

Calling a pid-typed variable sends a message to that process:

let p pid = worker()
p(42)                    # send 42 to the worker process

The syntax is identical to calling a machine, but when the callee is a pid variable, it becomes a message send instead of a spawn.

Messages are enqueued in the target process’s mailbox (a ring buffer of 256 slots). If the target is waiting, it is immediately awakened and moved back to the scheduler’s active queue.
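A ring buffer is a fixed-size array used as a queue, with indices that wrap around. The Python sketch below shows the data structure with a default capacity of 256. It is not Lake's mailbox code: the RingBuffer class is hypothetical, and rejecting a message when the buffer is full is an assumption, since the book does not say what happens in that case.

```python
class RingBuffer:
    """Fixed-capacity FIFO queue backed by a preallocated list."""

    def __init__(self, capacity: int = 256):
        self.slots = [None] * capacity
        self.head = 0   # index of the oldest message
        self.count = 0  # number of stored messages

    def push(self, message) -> bool:
        if self.count == len(self.slots):
            # Assumption: a full mailbox rejects the message.
            return False
        tail = (self.head + self.count) % len(self.slots)
        self.slots[tail] = message
        self.count += 1
        return True

    def pop(self):
        if self.count == 0:
            return None
        message = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return message

mailbox = RingBuffer()
mailbox.push(1)
mailbox.push(2)
first = mailbox.pop()
```

Both operations take constant time and never allocate, which is the usual reason to choose this structure for a per-process mailbox.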

Ping-Pong Example

ponger is {
  _ i64.0 -> {
    wait { partner pid -> { self(partner 3) } }
  }
  partner pid remaining i64 -> {
    when 1 <= remaining {
      true -> {
        wait {
          n i64 -> {
            partner(1)
            self(partner remaining-1)
          }
        }
      }
    }
  }
}

pinger is {
  partner pid remaining i64 -> {
    when 1 <= remaining {
      true -> {
        partner(1)
        wait { n i64 -> { self(partner remaining-1) } }
      }
    }
  }
}

main is {
  _ i64.0 -> {
    let po pid = ponger()
    let pi pid = pinger(po 3)
    po(pi)
  }
}

The ponger waits to receive the pinger’s pid, then they exchange messages in a loop.

State Transitions

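State transitions are the primary control flow mechanism in Lake. Call syntax takes three forms, depending on the callee: self(args) transitions the current process, machine(args) spawns a new process, and a call on a pid variable sends a message. Calls to runtime functions, covered last, execute inline.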

self() — Internal State Transition

self(args) transitions the current process to a new state without spawning a new process. The current branch’s variables are replaced with the new arguments, and execution restarts from the matched branch.

This is the primary looping mechanism:

counter is {
  n i64 -> {
    when 0 == n {
      true  -> { rt_write(1 "done\n" 5) }
      false -> { self(n-1) }
    }
  }
}

self(n-1) does not recurse on the call stack — it transitions the process state and the scheduler re-enters the machine.

Arguments can include arithmetic:

sum is {
  n i64 acc i64 -> {
    when 0 == n {
      true  -> { rt_write(1 "done\n" 5) }
      false -> { self(n-1 acc+n) }
    }
  }
}
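The difference between a state transition and a recursive call can be modeled as a loop that re-enters the branch with fresh arguments. The Python sketch below mirrors the sum machine under that reading. It is a model of the described behavior, not the scheduler's code, and sum_branch and run are hypothetical names.

```python
def sum_branch(n, acc):
    """One pass through the branch: return the next state, or None to stop."""
    if n == 0:
        return None            # the true arm: print and finish
    return (n - 1, acc + n)    # the false arm: self(n-1 acc+n)

def run(state):
    """Re-enter the branch with new arguments until it stops."""
    transitions = 0
    while True:
        nxt = sum_branch(*state)
        if nxt is None:
            return state, transitions
        state = nxt            # old variables are replaced, not stacked
        transitions += 1

final_state, transitions = run((100_000, 0))
```

The loop performs 100,000 transitions with constant stack depth, whereas a directly recursive Python version would exceed the interpreter's default recursion limit long before finishing.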

machine(args) — Spawn a New Process

Calling any machine other than self spawns a new concurrent process:

main is {
  _ i64.0 -> {
    counter(5)
    counter(3)
    counter(7)
  }
}

Each counter(N) spawns an independent process. The spawning process continues immediately — it does not wait for the spawned process to finish.

Cooperative Scheduling

All spawned processes are managed by a cooperative scheduler. Each process runs a quantum of work (256 blocks) before yielding. This means concurrent processes make interleaved progress:

@rt(rt_write)

worker is {
  steps i64 acc1 i64 acc2 i64 -> {
    when 1 <= steps {
      true  -> { self(steps-1 acc2 acc1+acc2) }
      false -> { rt_write(1 ".\n" 2) }
    }
  }
}

main is {
  _ i64.0 -> {
    worker(100000 0 1)
    worker(100000 0 1)
    worker(100000 0 1)
    worker(100000 0 1)
    worker(100000 0 1)
    worker(100000 0 1)
    worker(100000 0 1)
    worker(100000 0 1)
  }
}

Eight worker processes execute concurrently, each computing 100,000 Fibonacci iterations.
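Interleaving under a fixed quantum can be illustrated with Python generators, where each yield marks one unit of work. The sketch below is a simplified model: round-robin ordering is an assumption, one yield stands in for one block, and worker and run_all are hypothetical names unrelated to the Lake runtime.

```python
from collections import deque

QUANTUM = 256  # units of work per turn, matching the figure quoted above

def worker(name, steps):
    """A process as a generator: each yield marks one unit of work."""
    for _ in range(steps):
        yield
    return name

def run_all(processes):
    """Round-robin over runnable processes; returns names in finish order."""
    queue = deque(processes)
    finished = []
    while queue:
        proc = queue.popleft()
        try:
            for _ in range(QUANTUM):
                next(proc)
        except StopIteration as stop:
            finished.append(stop.value)
            continue
        queue.append(proc)  # quantum used up: go to the back of the queue
    return finished

order = run_all([worker("long", 1000), worker("short", 300), worker("tiny", 10)])
```

In this model the shortest process finishes first even though it was started last, because no process can run for more than one quantum before the others get a turn.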

pid(args) — Send a Message

Calling a pid-typed variable sends a message to that process instead of spawning:

@rt(rt_write)

receiver is {
  _ i64.0 -> {
    wait { n i64 -> { rt_write(1 "got it\n" 7) } }
  }
}

main is {
  _ i64.0 -> {
    let p pid = receiver()
    p(42)                    # send message, not spawn
  }
}

When you call a machine, it returns a pid. You can store that pid and send messages to it later.

Messages are enqueued in a 256-slot ring buffer. If the target process is suspended (via wait), it is immediately awakened.

See Expressions for details on wait and message sending.

Runtime Functions

Calls to @rt-declared functions are inlined — they execute immediately without spawning a process:

rt_write(1 "hello\n" 6)     # direct call, no process spawned

See Directives for the available runtime functions.

Directives

Directives are compiler attributes that declare runtime functions. They are placed before machines.

@rt — Runtime Function

Binds a name to a built-in runtime function:

@rt(rt_write)

After this declaration, rt_write can be called directly from any branch. Unlike machine calls, runtime function calls do not spawn a new process — they execute inline.

Available Runtime Functions

Function      Arguments                     Description
rt_write      fd data size                  Write size bytes from data to file descriptor fd
rt_exit       code                          Exit the program with the given exit code
rt_allocate   size                          Allocate size bytes on the heap, returns fat pointer
rt_store      ctx value size offset         Write value to memory at offset
rt_load_u64   ctx offset                    Read a 64-bit value from memory at offset
rt_mmap       addr size prot flags fd off   Raw mmap syscall
rt_syscall    varies                        Raw syscall wrapper

Common Usage

Writing to stdout:

@rt(rt_write)

main is {
  _ i64.0 -> {
    rt_write(1 "hello, lake!\n" 13)
  }
}

The arguments to rt_write are: file descriptor (1 = stdout), string data, and byte length.

Placement

Directives are placed at the top of the file, before any machine definitions:

@rt(rt_write)

counter is {
  n i64 -> {
    when 0 == n {
      true  -> { rt_write(1 "done\n" 5) }
      false -> { self(n-1) }
    }
  }
}

main is {
  _ i64.0 -> {
    counter(5)
  }
}