Back to Blog

Build a Computer from NAND Gates

Part 4: Virtual Machine I - Stack Arithmetic

We've built the hardware, designed the CPU, and created an assembler. But writing assembly code directly is still tedious. Modern compilers don't generate assembly. They generate code for an intermediate virtual machine.

A virtual machine (VM) is an abstraction layer between high-level languages and machine code. It provides:

  • Portability: The same VM code runs on different hardware
  • Simplicity: Easier to target than raw assembly
  • Safety: Controlled execution environment

Let's build one.

The stack machine model

Our VM uses a stack as its primary data structure. Instead of registers, values are pushed onto and popped off a stack.

Stack Machine in Action
VM COMMANDS
push constant 7
push constant 8
add
Stack (SP=0)
(empty)
← top
CURRENT OPERATION
push constant 7

Push the constant 7 onto the stack

Why a stack?

Stack-based VMs are elegant because:

  1. No register allocation: Compilers don't need to decide which register holds which value
  2. Simple instructions: add means "pop two values, push their sum"
  3. Natural for expressions: (a + b) * c becomes push a, push b, add, push c, mul

The Java Virtual Machine (JVM), Python bytecode, and WebAssembly all use stack-based architectures.

The stack pointer

The stack lives in RAM. A special register called SP (Stack Pointer) tracks where the top of the stack is:

  • Push stores a value at RAM[SP] and increments SP
  • Pop decrements SP and reads from RAM[SP]

The stack grows upward in memory. Initially SP = 256, leaving room for segment pointers below.

VM commands

Our VM supports three types of commands:

1. Arithmetic commands

These operate on values at the top of the stack:

VM Arithmetic Operations
+=
15
push constant 7
push constant 8
add

Binary operations pop two values and push the result:

  • add: x + y
  • sub: x - y (note: second value minus first)
  • and: bitwise AND
  • or: bitwise OR
  • eq: -1 if x = y, else 0
  • gt: -1 if x > y, else 0
  • lt: -1 if x < y, else 0

Unary operations pop one value and push the result:

  • neg: -x (arithmetic negation)
  • not: ~x (bitwise NOT)

In our VM, true is represented as -1 (all bits set) and false as 0.

2. Push commands

Push commands add values to the stack. The syntax is:

push segment index

Where segment indicates where to get the value from, and index specifies which element within that segment.

3. Pop commands

Pop commands remove the top value and store it somewhere:

pop segment index

You can't pop to constant (constants are read-only).

Memory segments

The VM divides memory into segments, each serving a specific purpose:

Memory Segments
MEMORY LAYOUT
argument
100
200
0
0
local
10
20
30
40
this
1
2
3
4
that
5
6
7
8
Stack (SP=2)
42
99
← top
SegmentBasePurpose
localLCLLocal variables
argumentARGFunction arguments
thisTHISObject fields
thatTHATArray elements
constant-Virtual segment for constants
tempR5-R12Temporary storage
pointerR3-R4THIS/THAT pointers
staticR16+Static variables

The eight segments

SegmentPurposeImplementation
constantVirtual segment for constantspush only, value = index
localFunction's local variablesbase at LCL pointer
argumentFunction's argumentsbase at ARG pointer
thisCurrent object's fieldsbase at THIS pointer
thatArray elementsbase at THAT pointer
tempTemporary storagefixed at RAM[5-12]
pointerTHIS/THAT pointersRAM[3] = this, RAM[4] = that
staticStatic variablesRAM[16-255], named by file

Segment implementation

Pointer-based segments (local, argument, this, that):

  • The segment's base address is stored in a pointer (LCL, ARG, THIS, THAT)
  • push local 2 reads RAM[LCL + 2]
  • pop argument 0 writes to RAM[ARG + 0]

Fixed segments (temp, pointer):

  • temp is at RAM[5-12], so push temp 3 reads RAM[8]
  • pointer 0 is THIS (RAM[3]), pointer 1 is THAT (RAM[4])

Static segment:

  • Each file gets its own static variables
  • push static 5 in File.vm refers to a variable named File.5
  • These are allocated in RAM[16-255]

Constant segment:

  • Virtual (no actual memory location)
  • push constant 17 just pushes the value 17

Why segments matter

Segments separate concerns:

  • local and argument support function calls
  • this and that support object-oriented programming
  • static supports class-level variables
  • temp is scratch space for the compiler

A compiler targeting this VM can focus on logic, not memory layout.

Translating VM to assembly

The VM translator converts each VM command into Hack assembly:

VM to Assembly Translation
VM CODE
ASSEMBLY OUTPUT
23 lines
// push constant 7
@7
D=A
@SP
A=M
M=D
@SP
M=M+1
// push constant 8
@8
D=A
@SP
A=M
M=D
@SP
M=M+1
// add
@SP
AM=M-1
D=M
@SP
A=M-1
M=D+M
Try:

Translation patterns

push constant n:

@n
D=A
@SP
A=M
M=D
@SP
M=M+1

This loads n into D, stores it at RAM[SP], and increments SP.

push local i:

@LCL
D=M
@i
A=D+A
D=M
@SP
A=M
M=D
@SP
M=M+1

This computes LCL + i, reads from that address, and pushes the value.

pop local i:

@LCL
D=M
@i
D=D+A
@R13
M=D        // R13 = target address
@SP
AM=M-1
D=M        // D = popped value
@R13
A=M
M=D        // RAM[target] = D

Pop is trickier because we need to compute the target address before popping (the stack pointer changes during the pop).

add:

@SP
AM=M-1
D=M
A=A-1
M=D+M

This pops into D, points to the new top, and adds D to it.

Comparison commands

Comparisons like eq, gt, lt need conditional jumps:

// eq
@SP
AM=M-1
D=M
A=A-1
D=M-D
@TRUE_0
D;JEQ
@SP
A=M-1
M=0
@END_0
0;JMP
(TRUE_0)
@SP
A=M-1
M=-1
(END_0)

This compares the top two values and pushes -1 (true) or 0 (false).

Stack trace visualization

Let's trace through a complete program:

Stack Trace Visualization
EXECUTION TRACE
Click Step to begin execution
Stack (SP=0)
(empty)
← top

Watch how each command transforms the stack. The trace shows:

  1. The command being executed
  2. The stack state after execution

This step-by-step execution helps you understand exactly what the VM does.

Example: Evaluating an expression

Let's trace (3 + 4) * 2:

push constant 3   // stack: [3]
push constant 4   // stack: [3, 4]
add               // stack: [7]
push constant 2   // stack: [7, 2]
mul               // stack: [14]

The compiler translates the expression into postfix (reverse Polish notation), then each operation pops its arguments and pushes the result.

Stack-based expression evaluation

For any arithmetic expression:

  1. Convert to postfix notation
  2. For each token:
    • If it's a number, push it
    • If it's an operator, pop arguments, compute, push result

This is exactly what our VM does.

Memory layout

Here's how memory is organized:

RAM[0]      SP      Stack pointer
RAM[1]      LCL     Local segment base
RAM[2]      ARG     Argument segment base
RAM[3]      THIS    This segment base
RAM[4]      THAT    That segment base
RAM[5-12]   temp    Temp segment
RAM[13-15]  R13-R15 General purpose (used by VM translator)
RAM[16-255] static  Static variables
RAM[256+]   stack   Stack (grows upward)

The segment pointers (LCL, ARG, THIS, THAT) are set by function call/return commands (covered in Part 5).

Error handling

The VM translator should catch:

Syntax errors:

  • Unknown command: multiply (should be mul... wait, we don't have mul either!)
  • Missing arguments: push constant (missing index)
  • Invalid segment: push register 0

Semantic errors:

  • Pop to constant: pop constant 0 (constants are read-only)
  • Invalid temp index: push temp 10 (temp only has 8 slots)
  • Invalid pointer index: push pointer 2 (pointer only has 0 and 1)

A good translator provides clear error messages with line numbers.

What's next

In Part 5, we'll add:

  • Branching: label, goto, if-goto for control flow
  • Functions: function, call, return for subroutines
  • The call stack: Managing frames for recursive functions

These additions transform our simple stack calculator into a complete virtual machine capable of running real programs.

Summary

In this part, we built:

  1. Stack machine: Push/pop architecture for expression evaluation
  2. Memory segments: local, argument, this, that, temp, pointer, static, constant
  3. Arithmetic commands: add, sub, neg, eq, gt, lt, and, or, not
  4. VM translator: Converts VM commands to Hack assembly

The VM provides a clean abstraction layer. A compiler targeting this VM doesn't need to worry about:

  • Register allocation
  • Memory management
  • Hardware-specific details

It just generates push, pop, and arithmetic commands. The VM translator handles the rest.


Next: Part 5 covers branching and function calls, completing our virtual machine implementation.