
Build a Computer from NAND Gates

Part 6: The Jack Programming Language

We have hardware, assembly, and a virtual machine. But programming in VM code is still tedious. We need a high-level language.

Meet Jack, a simple, Java-like language designed for our platform. It has:

  • Classes and objects
  • Methods and functions
  • Variables and expressions
  • Control flow (if, while)
  • Arrays and strings

Jack is simpler than Java or Python, but powerful enough for real programs.

Language overview

Class Structure
class Main {
  // Class variables
  static int count;
  field int value;

  // Subroutines
  constructor Main new() { ... }
  function void helper() { ... }
  method int getValue() { ... }
}
Every Jack program is a collection of classes. Each class has variables (static/field) and subroutines (constructor/function/method).

Core concepts

Classes: Every Jack program is a collection of classes. Each class lives in its own file (ClassName.jack).

Subroutines: Three types exist:

  • Constructors create and return new objects
  • Methods operate on objects (receive implicit "this")
  • Functions are static (no object context)

Variables:

  • static: Shared across all objects of a class
  • field: Per-object instance variables
  • var: Local variables within a subroutine
  • Parameters: Passed by value

Types:

  • int: 16-bit signed integer (-32768 to 32767)
  • char: Unicode character
  • boolean: true or false
  • ClassName: Reference to object of that type
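
To make the int range concrete, here is a small Python sketch of 16-bit two's-complement wraparound. It assumes that arithmetic overflow wraps silently, as it would on 16-bit hardware; the text above only states the range, and wrap16 is a hypothetical helper name.

```python
def wrap16(value: int) -> int:
    """Wrap an arbitrary Python int into the 16-bit two's-complement range."""
    value &= 0xFFFF  # keep only the low 16 bits
    # Bit 15 set means the value is negative in two's complement.
    return value - 0x10000 if value >= 0x8000 else value


print(wrap16(32767))       # 32767, the largest int
print(wrap16(32767 + 1))   # -32768, wraps to the smallest int
print(wrap16(-32768 - 1))  # 32767, wraps to the largest int
```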

Jack vs other languages

Feature               Jack                      Java                        Python
Variable declaration  var int x;                int x;                      x: int
Assignment            let x = 5;                x = 5;                      x = 5
Equality              if (x = y)                if (x == y)                 if x == y:
Method call           do obj.print();           obj.print();                obj.print()
Constructor           let p = Point.new(1, 2);  Point p = new Point(1, 2);  p = Point(1, 2)
Array access          let arr[i] = 5;           arr[i] = 5;                 arr[i] = 5
Void return           return;                   return;                     return

Key differences: Jack uses let for assignment, do for void calls, = for comparison, and explicit ClassName.new() for object creation.

Key differences

Assignment uses let: Unlike most languages, Jack requires let x = 5; instead of just x = 5;.

Void calls use do: When calling a subroutine only for its side effect, write do Output.println();.

Equality is =: Jack uses single = for comparison (like == in other languages). There's no assignment expression.

Constructors are explicit: You write Point.new(x, y) instead of new Point(x, y).

No garbage collection: Objects created by a constructor are never freed automatically. The program must release them explicitly with Memory.deAlloc(), which the OS provides.

These differences simplify the compiler while keeping the language expressive.

Object-based programming

Define a Class
class Point {
  field int x;
  field int y;
}
A Point class with two field variables. Each Point object will have its own x and y.

How objects work

Each object is a block of memory on the heap:

  • Constructor calls Memory.alloc(n) to get n words
  • Field variables are stored at offsets from the base address
  • this pointer holds the base address

When you write let p = Point.new(3, 4):

  1. Memory is allocated (e.g., at address 3000)
  2. Fields are initialized (RAM[3000] = 3, RAM[3001] = 4)
  3. The constructor returns 3000
  4. Variable p stores 3000

Method calls pass this as the first argument. When p.getX() is called:

  1. Push 3000 (value of p) as argument 0
  2. Jump to Point.getX
  3. Point.getX treats argument 0 as this and reads x at offset 0, i.e. RAM[3000 + 0]
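
The allocation and method-call steps above can be traced in a short Python simulation. This is only a sketch: RAM is a plain list, alloc is a simplified bump allocator standing in for Memory.alloc (the real OS routine is more involved), and the starting heap address of 3000 is chosen only to match the example. The names point_new and point_get_x are hypothetical.

```python
RAM = [0] * 16384   # simulated memory; the size is arbitrary for this sketch
next_free = 3000    # assumed heap pointer, chosen to match the example address


def alloc(n: int) -> int:
    """Stand-in for Memory.alloc: reserve n words and return the base address."""
    global next_free
    base = next_free
    next_free += n
    return base


def point_new(x: int, y: int) -> int:
    """Mirrors Point.new: allocate two words, fill the fields, return the base."""
    this = alloc(2)
    RAM[this + 0] = x   # field x lives at offset 0
    RAM[this + 1] = y   # field y lives at offset 1
    return this


def point_get_x(this: int) -> int:
    """Mirrors Point.getX: the object's base address arrives as argument 0."""
    return RAM[this + 0]


p = point_new(3, 4)
print(p, RAM[3000], RAM[3001], point_get_x(p))  # 3000 3 4 3
```

The variable p holds nothing but a number. Everything "object-like" comes from the convention that methods receive that number first and use it as a base address.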

Object vs class

Jack is object-based, not fully object-oriented:

  • No inheritance (no "extends")
  • No polymorphism (no method overriding)
  • No interfaces

This keeps the compiler simple while supporting essential object concepts.

Keywords and symbols

Keywords (21)
  Program structure:      class constructor function method
  Variable declarations:  field static var
  Types:                  int char boolean void
  Constants:              true false null this
  Statements:             let do if else while return

Symbols (19)
  Grouping:     { } ( ) [ ]
  Operators:    + - * / & | < > = ~
  Punctuation:  . , ;

The 21 keywords

Jack has exactly 21 reserved words. They fall into categories:

Program structure (4): class, constructor, function, method

Variable kinds (3): field, static, var

Types (4): int, char, boolean, void

Constants (4): true, false, null, this

Statements (6): let, do, if, else, while, return

The 19 symbols

Symbols serve as operators and punctuation:

Grouping (6): { } ( ) [ ]

Arithmetic (4): + - * /

Bitwise/logical (3): & | ~ (AND, OR, and NOT)

Comparison (3): < > =

Punctuation (3): . , ;

Everything else is either a keyword, number, string, or identifier.
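
As a quick sanity check on the counts, the two vocabularies can be written down as Python sets. This is just an illustrative sketch; the names KEYWORDS and SYMBOLS are our own, and the grouping in the comments follows the categories used in this section.

```python
KEYWORDS = {
    "class", "constructor", "function", "method",   # program structure
    "field", "static", "var",                       # variable kinds
    "int", "char", "boolean", "void",               # types
    "true", "false", "null", "this",                # constants
    "let", "do", "if", "else", "while", "return",   # statements
}

# Grouping, arithmetic, bitwise/logical, comparison, punctuation.
SYMBOLS = set("{}()[]" "+-*/" "&|~" "<>=" ".,;")

print(len(KEYWORDS), len(SYMBOLS))  # 21 19
```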

Tokenization

Before parsing, the compiler breaks source code into tokens:

Tokens (30) for a minimal Main class:
class Main { function void main ( ) { var int x ; let x = 42 ; do Output . printInt ( x ) ; return ; } }

Token types: KEYWORD, IDENTIFIER, SYMBOL, INT_CONST, STRING_CONST

Token types

Type          Examples           Description
KEYWORD       class, if, let     Reserved words
SYMBOL        { + ;              Single characters
INT_CONST     0, 42, 32767       Integer literals
STRING_CONST  "hello"            String literals
IDENTIFIER    x, Main, getValue  Names

Tokenization rules

  1. Skip whitespace and comments (// and /* */)
  2. If character is a symbol, emit SYMBOL token
  3. If character is digit, read integer constant
  4. If character is ", read until closing "
  5. Otherwise, read word (alphanumeric + underscore)
  6. If word is a keyword, emit KEYWORD; else IDENTIFIER

Tokenization is the first compiler phase. It simplifies parsing.
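
The rules above fit in a few lines of Python. The following is a minimal sketch, not the tokenizer built later in the series: it uses one regular expression with a named group per rule, and the names TOKEN_RE and tokenize are hypothetical. It assumes string constants do not span lines and does not check that integers stay within 0 to 32767.

```python
import re

KEYWORDS = set(
    "class constructor function method field static var int char boolean "
    "void true false null this let do if else while return".split()
)

# One alternative per rule. Order matters: comments must be tried before
# the '/' symbol, otherwise '//' would be read as two division signs.
TOKEN_RE = re.compile(r"""
    (?P<SKIP>\s+|//[^\n]*|/\*.*?\*/)        # rule 1: whitespace and comments
  | (?P<SYMBOL>[{}()\[\].,;+\-*/&|<>=~])    # rule 2: single-character symbols
  | (?P<INT_CONST>\d+)                      # rule 3: integer constants
  | (?P<STRING_CONST>"[^"\n]*")             # rule 4: string constants
  | (?P<WORD>[A-Za-z_]\w*)                  # rule 5: keywords and identifiers
""", re.VERBOSE | re.DOTALL)


def tokenize(source: str) -> list[tuple[str, str]]:
    """Return (type, text) pairs for every token in the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "WORD":
            # rule 6: a word is a keyword if it is reserved, else an identifier
            kind = "KEYWORD" if text in KEYWORDS else "IDENTIFIER"
        elif kind == "STRING_CONST":
            text = text[1:-1]  # drop the surrounding quotes
        tokens.append((kind, text))
    return tokens


source = (
    "class Main { function void main() { var int x; "
    "let x = 42; do Output.printInt(x); return; } }"
)
tokens = tokenize(source)
print(len(tokens))  # 30
print(tokens[:3])   # [('KEYWORD', 'class'), ('IDENTIFIER', 'Main'), ('SYMBOL', '{')]
```

A single regex keeps the sketch short. A hand-written character loop that follows the six rules literally would work just as well and may be easier to extend with better error messages.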

Jack grammar

Grammar rule:
class: 'class' className '{' classVarDec* subroutineDec* '}'

Example:
class Point { field int x; method int getX() {...} }

The Jack grammar is context-free: each rule can be parsed without knowing surrounding context. This enables recursive descent parsing, with one function per grammar rule.

Context-free grammar

Jack's syntax is defined by a context-free grammar (CFG). Each grammar rule describes how constructs are formed:

class       → 'class' className '{' classVarDec* subroutineDec* '}'
classVarDec → ('static'|'field') type varName (',' varName)* ';'
type        → 'int' | 'char' | 'boolean' | className

The grammar is recursive: expressions contain terms, terms can contain expressions.

Why context-free?

Context-free means each rule applies the same way regardless of what surrounds it. When we see if, we know exactly what must follow: (condition) { statements }, optionally followed by else { statements }.

This enables recursive descent parsing, one function per grammar rule:

  • parseClass() calls parseClassVarDec() and parseSubroutineDec()
  • parseExpression() calls parseTerm()
  • parseTerm() might call parseExpression() (recursively!)

We'll build this parser in Part 7.
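
Until then, here is a rough Python sketch of the "one function per grammar rule" idea, covering only the classVarDec and type rules shown earlier. The Parser class and its method names are hypothetical and not the Part 7 design. For brevity it works on plain token strings and accepts any token as a type or variable name.

```python
from typing import Optional


class Parser:
    """Minimal recursive descent parser for the classVarDec and type rules."""

    def __init__(self, tokens: list[str]):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        """Return the current token without consuming it, or None at the end."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, *allowed: str) -> str:
        """Consume one token; if `allowed` is given, it must be one of those."""
        token = self.peek()
        if token is None or (allowed and token not in allowed):
            raise SyntaxError(f"expected {allowed or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    def parse_type(self) -> str:
        # type → 'int' | 'char' | 'boolean' | className
        return self.expect()  # sketch: any token is accepted as a className

    def parse_class_var_dec(self) -> dict:
        # classVarDec → ('static'|'field') type varName (',' varName)* ';'
        kind = self.expect("static", "field")
        var_type = self.parse_type()
        names = [self.expect()]
        while self.peek() == ",":      # the (',' varName)* repetition
            self.expect(",")
            names.append(self.expect())
        self.expect(";")
        return {"kind": kind, "type": var_type, "names": names}


parser = Parser(["field", "int", "x", ",", "y", ";"])
print(parser.parse_class_var_dec())
# {'kind': 'field', 'type': 'int', 'names': ['x', 'y']}
```

Each grammar construct maps onto ordinary control flow: alternatives become a choice inside expect, the * repetition becomes a while loop, and a reference to another rule becomes a function call.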

Expressions

Jack expressions follow a small, flat grammar:

expression   → term (op term)*
term         → intConst | stringConst | keywordConst | varName
             | varName '[' expression ']'
             | subroutineCall
             | '(' expression ')'
             | unaryOp term
keywordConst → 'true' | 'false' | 'null' | 'this'
op           → '+' | '-' | '*' | '/' | '&' | '|' | '<' | '>' | '='
unaryOp      → '-' | '~'

No operator precedence

Jack doesn't define precedence between binary operators. The expression 1 + 2 * 3 is parsed left-to-right as (1 + 2) * 3 = 9, not 1 + (2 * 3) = 7.

To get conventional precedence, add parentheses: 1 + (2 * 3).

This simplifies the parser but leaves it to the programmer to parenthesize any expression that mixes operators.
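
The left-to-right behavior falls directly out of the rule expression → term (op term)*. The Python sketch below evaluates integer expressions using exactly that shape. It is an illustration under simplifying assumptions: tokens arrive pre-split, only the four arithmetic operators are handled, and '/' uses Python's floor division, which may differ from Jack's division for negative operands. The name evaluate is hypothetical.

```python
import operator

# Arithmetic operators only; '/' is floor division in this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.floordiv}


def evaluate(tokens: list[str]) -> int:
    """Evaluate expression → term (op term)* strictly left to right."""
    pos = 0

    def term() -> int:
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":       # '(' expression ')'
            value = expression()
            pos += 1           # skip the closing ')'
            return value
        if token == "-":       # unaryOp term
            return -term()
        return int(token)      # intConst

    def expression() -> int:
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in OPS:
            op = OPS[tokens[pos]]
            pos += 1
            value = op(value, term())  # fold immediately: no precedence levels
        return value

    return expression()


print(evaluate("1 + 2 * 3".split()))      # 9
print(evaluate("1 + ( 2 * 3 )".split()))  # 7
```

A grammar with precedence would need one rule per level (for example, separate rules for sums and products). Jack's single loop over (op term)* is what keeps the parser this small.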

Subroutine calls

Two forms exist:

Method/function on class or object:

target.subroutineName(arguments)
// Examples: Math.sqrt(x), point.getX()

Method on current object:

subroutineName(arguments)
// Implicitly calls this.subroutineName

Statements

Jack has five statement types:

letStatement    → 'let' varName ('[' expression ']')? '=' expression ';'
ifStatement     → 'if' '(' expression ')' '{' statements '}' ('else' '{' statements '}')?
whileStatement  → 'while' '(' expression ')' '{' statements '}'
doStatement     → 'do' subroutineCall ';'
returnStatement → 'return' expression? ';'

Statement semantics

let: Assigns a value to a variable or array element.

let x = 5;
let arr[i] = x + 1;

if/else: Conditional execution.

if (x > 0) {
  do Output.printString("positive");
} else {
  do Output.printString("non-positive");
}

while: Loop while condition is true.

while (i < 10) {
  let sum = sum + i;
  let i = i + 1;
}

do: Call a subroutine for its side effect (discard return value).

do Output.printInt(x);
do Screen.drawPixel(100, 50);

return: Exit subroutine with optional value.

return x + 1;
return;  // void subroutines

Complete programs

Example Jack Programs
// Simple class
class Main {
  function void main() {
    var int x, y;
    let x = 1;
    let y = 2;
    do Output.printInt(x + y);
    return;
  }
}

Program entry point

Every Jack program needs:

  1. A class named Main
  2. A function Main.main() with no arguments

The OS calls Main.main() to start your program.

Multi-file programs

Large programs span multiple files:

  • Each class in its own ClassName.jack file
  • All files in the same directory
  • Compiler processes each file to ClassName.vm
  • VM translator combines all .vm files
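
The file-naming convention can be sketched in Python as a mapping from each source file to its output file. This is only an illustration of the one-class-per-file rule; vm_targets is a hypothetical helper, and the real compiler driver may be organized differently.

```python
import tempfile
from pathlib import Path


def vm_targets(directory: str) -> dict[str, str]:
    """Map each ClassName.jack in a directory to the ClassName.vm it compiles to."""
    return {
        source.name: source.with_suffix(".vm").name
        for source in sorted(Path(directory).glob("*.jack"))
    }


# Demo with a throwaway directory; non-.jack files are ignored.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Main.jack", "Point.jack", "notes.txt"):
        (Path(tmp) / name).touch()
    print(vm_targets(tmp))  # {'Main.jack': 'Main.vm', 'Point.jack': 'Point.vm'}
```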

Standard library

Jack comes with built-in classes:

Class     Purpose
Math      Mathematical operations (multiply, divide, sqrt)
String    String manipulation
Array     Array allocation
Output    Text output to screen
Screen    Graphics (drawPixel, drawLine, etc.)
Keyboard  Input from keyboard
Memory    Memory allocation (alloc, deAlloc)
Sys       System utilities (halt, error, wait)

These are implemented in Part 9 (Operating System).

Design philosophy

Jack was designed with compiler simplicity in mind:

Explicit typing: Every variable has a declared type.

No overloading: Each subroutine name is unique within a class.

Simple memory model: No garbage collection, explicit alloc/deAlloc.

Limited operators: No operator precedence, no compound assignment.

Statement-oriented: Expressions can't be statements (need do).

These choices make the compiler straightforward while preserving expressiveness.

What's next

We now understand the Jack language. Next:

  • Part 7: Build the compiler frontend (tokenizer and parser that produce an AST)
  • Part 8: Build the compiler backend (code generator that translates AST to VM code)

By the end, we'll have a complete pipeline:

Jack source (.jack)
     ↓ Tokenizer
Tokens
     ↓ Parser
Abstract Syntax Tree
     ↓ Code Generator
VM code (.vm)
     ↓ VM Translator
Assembly (.asm)
     ↓ Assembler
Machine code (.hack)

Each layer hides complexity from the layer above.


Next: Part 7 builds the compiler frontend, covering tokenization and parsing.