Build a Computer from NAND Gates
Part 9: The Operating System
Our compiler generates calls like Math.multiply, Memory.alloc, and Output.printInt. But who implements these functions?
We do, by writing an operating system in Jack itself.
OS classes
The Jack OS provides eight classes:
Each class serves a specific purpose, from mathematics to graphics to memory management. Together, they form a minimal but complete runtime library.
Memory management
The most fundamental OS service is memory allocation:
The heap
Memory layout:
- RAM[0-4]: Segment pointers (SP, LCL, ARG, THIS, THAT)
- RAM[5-12]: Temp segment
- RAM[13-15]: General purpose
- RAM[16-255]: Static variables
- RAM[256-2047]: Stack
- RAM[2048-16383]: Heap (for objects and arrays)
- RAM[16384-24575]: Screen memory
- RAM[24576]: Keyboard memory
Free list algorithm
The allocator maintains a free list, a linked list of available memory blocks:
Memory.init():
freeList = 2048
RAM[2048] = 14335 // size of free block
RAM[2049] = null // next pointer (null = end of list)
Memory.alloc(size):
search freeList for block with size >= requested + 1
if found:
if block is larger than needed:
split block (allocate front, keep rest free)
else:
use entire block
return address + 1 (skip size header)
else:
Sys.error(6) // heap overflow
Memory.deAlloc(object):
block = object - 1 // point to size header
add block to freeList
merge with adjacent free blocks if possible
Block structure
Each allocated block has a size header:
┌─────────────────────┐
│ size (including hdr)│ ← RAM[block]
├─────────────────────┤
│ data[0] │ ← returned pointer
│ data[1] │
│ ... │
└─────────────────────┘
This lets deAlloc know how much to free.
Math operations
The VM has no multiply or divide instructions. The OS provides them:
Multiplication: Shift and add
Math.multiply(x, y):
sum = 0
shiftedX = x
for i = 0 to 15:
if (y's bit i is 1):
sum = sum + shiftedX
shiftedX = shiftedX + shiftedX // left shift
return sum
This works because binary multiplication is:
x × y = x × (y₀×2⁰ + y₁×2¹ + y₂×2² + ...)
= x×y₀ + 2x×y₁ + 4x×y₂ + ...
Division: Repeated subtraction
Math.divide(x, y):
if y > x: return 0
q = divide(x, 2*y)
if (x - 2*q*y) < y:
return 2*q
else:
return 2*q + 1
This recursive algorithm is O(log n).
Square root
Math.sqrt(x):
y = 0
for j = 7 downto 0: // 16-bit numbers
if (y + 2^j)² ≤ x:
y = y + 2^j
return y
Binary search finds the largest y where y² ≤ x.
Screen graphics
The screen is 512×256 pixels, memory-mapped to RAM[16384-24575]:
Pixel addressing
Each word (16 bits) represents 16 horizontal pixels:
- RAM[16384] = first 16 pixels of row 0
- RAM[16385] = next 16 pixels of row 0
- RAM[16384 + 32] = first 16 pixels of row 1
To draw pixel at (x, y):
address = 16384 + (y * 32) + (x / 16)
bit = x % 16
RAM[address] = RAM[address] | (1 << bit) // set bit
Line drawing: Bresenham's algorithm
Screen.drawLine(x1, y1, x2, y2):
dx = abs(x2 - x1)
dy = abs(y2 - y1)
sx = sign(x2 - x1)
sy = sign(y2 - y1)
err = dx - dy
while true:
drawPixel(x1, y1)
if x1 = x2 and y1 = y2: break
e2 = 2 * err
if e2 > -dy:
err = err - dy
x1 = x1 + sx
if e2 < dx:
err = err + dx
y1 = y1 + sy
Bresenham's algorithm draws lines using only integer arithmetic.
Rectangle and circle
Rectangles are filled with horizontal lines.
Circles use the midpoint algorithm:
Screen.drawCircle(cx, cy, r):
for dy = -r to r:
dx = sqrt(r² - dy²)
drawLine(cx - dx, cy + dy, cx + dx, cy + dy)
String operations
Strings are objects with character arrays:
String structure
┌─────────────┐
│ maxLength │ field 0
├─────────────┤
│ length │ field 1
├─────────────┤
│ chars[0] │ field 2
│ chars[1] │
│ ... │
└─────────────┘
String methods
String.new(maxLength):
allocate maxLength + 2 words
set maxLength field
set length = 0
return this
String.appendChar(c):
if length = maxLength: error
chars[length] = c
length = length + 1
return this
String.intValue():
result = 0
sign = 1
i = 0
if chars[0] = '-':
sign = -1
i = 1
while i < length:
digit = chars[i] - 48 // '0' is ASCII 48
result = result * 10 + digit
i = i + 1
return sign * result
Text output
Output uses a character bitmap to draw text:
Character map
Each character is an 11×8 bitmap. The OS stores these patterns:
'A' = [
0x00, 0x00, 0x1C, 0x22, 0x22, 0x3E,
0x22, 0x22, 0x22, 0x00, 0x00
]
Printing characters
Output.printChar(c):
bitmap = characterMap[c]
for row = 0 to 10:
for col = 0 to 7:
if bitmap[row] has bit col set:
Screen.drawPixel(cursorCol*8 + col, cursorRow*11 + row)
cursorCol = cursorCol + 1
if cursorCol >= 64: // 512/8 = 64 chars per row
println()
Text layout
The screen fits:
- 64 characters per row (512 pixels / 8 pixels per char)
- 23 rows (256 pixels / 11 pixels per char)
Keyboard input
The keyboard register at RAM[24576] contains the ASCII code of the currently pressed key (0 if none):
Reading characters
Keyboard.readChar():
// Wait for key press
while RAM[24576] = 0: { }
key = RAM[24576]
// Wait for key release
while RAM[24576] != 0: { }
// Echo to screen
Output.printChar(key)
return key
Reading lines
Keyboard.readLine(prompt):
Output.printString(prompt)
str = String.new(64)
while true:
c = readChar()
if c = 128: // newline
Output.println()
return str
if c = 129: // backspace
str.eraseLastChar()
Output.backspace()
else:
str.appendChar(c)
System initialization
The boot sequence
- Power on: CPU starts executing at ROM[0]
- Bootstrap: Sets SP = 256, calls Sys.init
- Sys.init: Initializes all OS modules
- Main.main: Your program runs
- Sys.halt: Infinite loop when done
Sys.init implementation
Sys.init():
Memory.init()
Math.init()
Screen.init()
Output.init()
Keyboard.init()
Main.main()
Sys.halt()
Sys.halt():
while true: { }
Sys.error(errorCode):
Output.printString("ERR")
Output.printInt(errorCode)
Sys.halt()
Error codes
| Code | Meaning |
|---|---|
| 1 | Sys.wait: duration must be positive |
| 2 | Array.new: size must be positive |
| 3 | Math.divide: division by zero |
| 4 | Math.sqrt: cannot compute square root of negative |
| 5 | Memory.alloc: size must be positive |
| 6 | Memory.alloc: heap overflow |
| 7 | Screen.draw*: illegal coordinates |
| 8 | String.new: max length must be non-negative |
| ... | ... |
The complete OS
The OS is about 2000 lines of Jack code:
| Class | Lines | Purpose |
|---|---|---|
| Sys | ~50 | Initialization and utilities |
| Memory | ~100 | Heap management |
| Math | ~100 | Arithmetic operations |
| String | ~150 | String manipulation |
| Array | ~20 | Array allocation |
| Output | ~400 | Text output |
| Screen | ~300 | Graphics primitives |
| Keyboard | ~100 | Input handling |
The OS is compiled to VM code and linked with user programs.
Self-hosting
Notice something remarkable: the OS is written in Jack, compiled by our compiler, run on our VM, executed on our CPU.
Every layer we built supports the layer above:
- NAND gates → logic circuits
- Logic circuits → CPU
- CPU → machine code
- Machine code → assembly
- Assembly → VM code
- VM code → Jack
- Jack → OS
- OS → applications
The entire stack is self-contained.
Testing the OS
Each OS class should be tested:
- Unit tests: Test each function with known inputs
- Integration tests: Compile and run test programs
- Stress tests: Allocate/free many blocks, draw complex graphics
Example test program:
class Test {
function void main() {
var int x;
let x = Math.multiply(7, 8);
do Output.printInt(x); // Should print 56
return;
}
}
What we've built
The complete system:
┌─────────────────────────────────────────────┐
│ User Applications │
├─────────────────────────────────────────────┤
│ Jack OS: Math, Memory, String, Screen, etc. │
├─────────────────────────────────────────────┤
│ Jack Compiler │
├─────────────────────────────────────────────┤
│ VM Translator │
├─────────────────────────────────────────────┤
│ Assembler │
├─────────────────────────────────────────────┤
│ Hack CPU + Memory │
├─────────────────────────────────────────────┤
│ Logic Gates (from NAND) │
└─────────────────────────────────────────────┘
From a single logic gate to a complete operating system.
Next: Part 10 explores extensions and what else we could build on this foundation.