root.system / 0x06 / memory

A wall of switches.
Numbered.

The CPU on the previous page only works because it has somewhere to put things. That somewhere is memory: a long array of bit-cells with a number stamped on each one. Programs are bytes in memory. Variables are bytes in memory. The page you're reading is bytes in memory. This page is about how that array is built, how it's organized, and how programs actually use it.

Beginner // level 01

What memory is

Your CPU is fast. Blindingly, impossibly fast. Four billion operations per second. But every one of those operations needs somewhere to put the result, somewhere to read the next instruction, somewhere to hold the number it just computed. Without memory the CPU is just heat.

Memory is the part of the computer that gives computation a place to exist. Not conceptually. Physically. Every variable you have ever declared, every function call you have ever made, every string you have ever typed: all of it lives as charged capacitors, flipping between two voltage states, organised into an addressed array, read and written billions of times per second.

You already know what those two states are. 0 and 1. Memory is just binary, with an address on each byte, and a wire connecting it to the CPU.

You already saw, on the logic-gates page, that two NOR gates wired in a loop form an SR latch: a circuit with two stable states. That's a 1-bit memory cell. Tile millions of those side by side, give each one a unique number, and you've built memory. Each number is an address. Each cell holds a bit. Bits are grouped into bytes of 8.
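
The latch described above can be simulated in a few lines. This is a minimal sketch, not a circuit-accurate model: the names `nor`, `SrLatch`, and `step` are made up for illustration, and the forbidden input (set and reset both high) is deliberately not handled.

```rust
/// NOR gate: the output is true only when both inputs are false.
fn nor(a: bool, b: bool) -> bool {
    !(a || b)
}

/// A 1-bit memory cell: two NOR gates, each feeding the other.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy)]
struct SrLatch {
    q: bool,     // the stored bit
    q_bar: bool, // its complement
}

impl SrLatch {
    fn new() -> Self {
        SrLatch { q: false, q_bar: true }
    }

    /// Apply the set/reset inputs and let the feedback loop settle.
    /// Setting both inputs at once is the forbidden case and is not modelled.
    fn step(&mut self, set: bool, reset: bool) -> bool {
        // A few passes are enough for the two gates to reach a stable state.
        for _ in 0..4 {
            let q = nor(reset, self.q_bar);
            let q_bar = nor(set, q);
            self.q = q;
            self.q_bar = q_bar;
        }
        self.q
    }
}
```

Pulse `set` and the cell reads 1; drop both inputs and it keeps reading 1. That "keeps" is the whole point: the loop holds the bit with no input driving it.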

The CPU talks to memory through two buses (literally, bundles of wires):

  • The address bus: "give me the byte at address 0x4000."
  • The data bus: "here it is, 0x48" (which, by the ASCII page, is the letter H).

That's essentially the whole interface: read or write, one byte (or word) at a time, addressed by a number. (A few control lines alongside the two buses say which of the two it is.)
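
Seen from software, that interface is small enough to model directly. A minimal sketch, assuming a flat byte array stands in for the chip; `Memory`, `read`, and `write` are illustrative names, not a real API.

```rust
/// A toy memory: a flat array of bytes, indexed by address.
/// (Hypothetical type, for illustration only.)
struct Memory {
    cells: Vec<u8>,
}

impl Memory {
    /// Create `size` bytes of zeroed memory.
    fn new(size: usize) -> Self {
        Memory { cells: vec![0; size] }
    }

    /// Address in, byte out: the "address bus" and "data bus" in one call.
    fn read(&self, addr: usize) -> u8 {
        self.cells[addr]
    }

    /// Store one byte at the given address.
    fn write(&mut self, addr: usize, value: u8) {
        self.cells[addr] = value;
    }
}
```

Write 0x48 to address 0x4000, read it back, and you get the H from the example above.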

The SR latch behind each of those cells is two NOR gates in a feedback loop. NOR gates are built from transistors, the same transistors from page 4. Memory is logic gates configured to remember instead of compute. ← see: logic gates

RAM vs ROM: volatile and non-volatile

There are two big families of memory, and they differ on one question: does the data survive when the power goes off?

family | what it's made of | survives power off? | where you find it
RAM (Random Access Memory) | transistors + capacitors (DRAM) or flip-flops (SRAM) | no, contents lost | main memory, CPU caches, registers
ROM (Read-Only Memory) | fused / mask-programmed cells | yes | boot firmware, embedded device microcode
Flash (a form of EEPROM) | floating-gate transistors | yes | SSDs, USB sticks, phone storage

"RAM" really means two things in everyday speech: the kind of memory that loses its contents, and the main memory chip in your laptop (which happens to be that kind). Inside the CPU, the registers and caches are also RAM, just smaller, faster, and made of SRAM cells (flip-flops, like the ones we built on the logic-gates page). Main memory uses DRAM, which trades speed for density: each cell is just one transistor and one capacitor.

// the memory hierarchy
Smaller and faster as you climb: registers (a few hundred bytes, 1 cycle) → L1/L2/L3 cache (KB to MB, single-digit to tens of cycles) → main RAM (GB, ~200 cycles) → SSD/disk (TB, millions of cycles). The CPU page covers cache in detail.
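
Cycle counts are easier to feel as time. A small worked conversion, assuming a 4 GHz clock (the "four billion operations per second" figure from earlier on this page); `cycles_to_ns` is an illustrative helper.

```rust
/// Convert a latency in clock cycles to nanoseconds.
/// At f GHz, one cycle lasts 1/f nanoseconds.
fn cycles_to_ns(cycles: f64, clock_ghz: f64) -> f64 {
    cycles / clock_ghz
}
```

Under that assumption a 1-cycle register access is 0.25 ns, a ~200-cycle trip to main RAM is about 50 ns, and a million-cycle disk access is about 250,000 ns, a quarter of a millisecond.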

Reading and writing bytes by address

From inside a program, "memory" is just a numeric address you can read from or write to. In Rust most accesses go through safe references; in C they're literally pointers (numbers). Both compile down to the same load and store instructions the CPU runs.

Rust• • •
// Memory is just a giant array of bytes, addressed by number.
// In Rust, you usually access it through references, but
// raw addresses are right there if you ask for them.
fn main() {
    let x: u32 = 0xDEADBEEF;
    let addr: *const u32 = &x;

    println!("value at the address: 0x{:08X}", x);
    println!("the address itself:   {:p}", addr);

    // Read the four individual bytes of x from memory
    // (little-endian on x86/ARM, see the binary page).
    unsafe {
        let bytes = std::slice::from_raw_parts(addr as *const u8, 4);
        for (i, b) in bytes.iter().enumerate() {
            println!("byte {i} @ {:p} = 0x{:02X}", bytes.as_ptr().add(i), b);
        }
    }
}
C• • •
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t x = 0xDEADBEEF;
    uint32_t *addr = &x;

    printf("value at the address: 0x%08X\n", x);
    printf("the address itself:   %p\n", (void*)addr);

    // Read the four individual bytes of x from memory.
    uint8_t *bytes = (uint8_t*)addr;
    for (int i = 0; i < 4; i++)
        printf("byte %d @ %p = 0x%02X\n",
               i, (void*)(bytes + i), bytes[i]);
    return 0;
}

Explore memory like the CPU does

Type a value, choose its type, and watch the exact bytes land in an addressed grid starting at 0x1000. Click any cell to decode it as hex, decimal, binary, and ASCII, then flip the byte order to see endianness happen in front of you.


// remember from page 1?
The four bytes of 0xDEADBEEF show up in memory as EF BE AD DE on x86 / ARM. That's little-endian, the same byte order from the binary page. Memory layout and number representation are the same story.
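
The two byte orders can be checked without poking at raw memory. A minimal sketch using the standard library's `to_le_bytes` and `to_be_bytes`; the wrapper `byte_orders` is an illustrative name.

```rust
/// Return the bytes of `value` in little-endian order, then big-endian order.
fn byte_orders(value: u32) -> ([u8; 4], [u8; 4]) {
    (value.to_le_bytes(), value.to_be_bytes())
}
```

For 0xDEADBEEF the little-endian array is EF BE AD DE and the big-endian array is DE AD BE EF. `u32::from_le_bytes` reverses the trip.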

Intermediate // level 02

Stack & heap: how programs use memory

When the OS launches your program, it hands it a private chunk of address space and divides it into named regions. Two of those regions handle almost all of the runtime data your program touches: the stack and the heap.

high addresses
  STACK    grows down · function locals, return addresses
  (free)   gap between stack and heap
  HEAP     grows up · malloc / Box / Vec
  BSS      uninitialised globals
  DATA     initialised globals
  TEXT     program instructions (read-only)
low addresses

The stack: automatic, LIFO, free

Every time a function is called, the CPU bumps a register called the stack pointer down by however many bytes that function's locals need. When the function returns, the pointer goes back up. That's it. There's no allocator running, no bookkeeping, just one register move. That's why stack allocation is essentially free.

The price: stack memory has a fixed lifetime tied to the function call. You can't return a pointer to a stack local and expect it to still be valid. The moment the function returns, that memory is up for grabs by the next call.
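
The mechanics fit in a toy model. This is a sketch under stated assumptions: a `Vec<u8>` stands in for the stack region, an index stands in for the stack pointer register, and `ToyStack`, `push_frame`, and `pop_frame` are made-up names.

```rust
/// A toy call stack: a fixed block of bytes plus a stack pointer.
/// (Hypothetical type, for illustration only.)
struct ToyStack {
    bytes: Vec<u8>,
    sp: usize, // index of the lowest byte in use; starts at the top
}

impl ToyStack {
    fn new(size: usize) -> Self {
        ToyStack { bytes: vec![0; size], sp: size }
    }

    /// "Call": reserve `frame_size` bytes by moving sp down.
    /// Returns the frame's base, or None if the stack would overflow.
    fn push_frame(&mut self, frame_size: usize) -> Option<usize> {
        if frame_size > self.sp {
            return None;
        }
        self.sp -= frame_size;
        Some(self.sp)
    }

    /// "Return": release the frame by moving sp back up.
    /// The bytes are not erased; the next call simply reuses them.
    fn pop_frame(&mut self, frame_size: usize) {
        self.sp += frame_size;
    }
}
```

Push a frame, write into it, pop it, push again: the second frame lands on the same bytes and still sees the old data. That is exactly why a pointer into a returned frame is worthless.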

The stack pointer is a CPU register. Every function call decrements it, every return increments it. When a recursive function calls itself without a base case, this register decrements until the stack crosses into OS-protected memory: segmentation fault, process terminated. The recursion page explains exactly how. ← see: recursion

The heap: explicit, flexible, slow

When you don't know the size at compile time, or you need the data to outlive the function that creates it, you go to the heap. The heap is managed by an allocator: a chunk of code (in libc, in the Rust standard library, etc.) that hands out free regions on request and tracks which ones are in use. malloc, Box::new, Vec, String: all of them ultimately call into the allocator.

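That bookkeeping has a real cost. A heap allocation typically costs tens to hundreds of cycles, and far more when the allocator has to ask the OS for fresh pages, where a stack allocation is a single register adjustment. So a rule of thumb in performance-sensitive code: prefer the stack when you can.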
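
To make "bookkeeping" concrete, here is a toy first-fit allocator. It is a sketch, not how malloc or any production allocator works: it tracks offsets only, never merges neighbouring free blocks, and `ToyHeap`, `Block`, `alloc`, and `free` are names invented for this example.

```rust
/// One region of the toy heap: where it starts, how long it is,
/// and whether it is currently handed out.
#[derive(Debug, Clone, Copy)]
struct Block {
    start: usize,
    len: usize,
    in_use: bool,
}

/// A toy first-fit allocator. It only does the bookkeeping;
/// there is no real memory behind the offsets it returns.
struct ToyHeap {
    blocks: Vec<Block>,
}

impl ToyHeap {
    fn new(size: usize) -> Self {
        ToyHeap { blocks: vec![Block { start: 0, len: size, in_use: false }] }
    }

    /// Find the first free block that is big enough, split it,
    /// and return the offset of the allocated part.
    fn alloc(&mut self, len: usize) -> Option<usize> {
        if len == 0 {
            return None; // zero-sized requests are not modelled
        }
        let i = self.blocks.iter().position(|b| !b.in_use && b.len >= len)?;
        let found = self.blocks[i];
        self.blocks[i] = Block { start: found.start, len, in_use: true };
        if found.len > len {
            let rest = Block { start: found.start + len, len: found.len - len, in_use: false };
            self.blocks.insert(i + 1, rest);
        }
        Some(found.start)
    }

    /// Mark the block starting at `start` as free again.
    /// Returns false if no live allocation starts there (e.g. a double free).
    fn free(&mut self, start: usize) -> bool {
        match self.blocks.iter_mut().find(|b| b.start == start && b.in_use) {
            Some(b) => {
                b.in_use = false;
                true
            }
            None => false,
        }
    }
}
```

Every `alloc` is a search plus a list edit, and every `free` is another search. Compare that with the stack, where both operations are one subtraction or one addition.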

Rust• • •
fn main() {
    // STACK: known size, lifetime tied to the function.
    let arr: [u8; 4] = [10, 20, 30, 40];
    println!("stack arr   @ {:p}", arr.as_ptr());

    // HEAP: Box puts a value on the heap, frees it on drop.
    let boxed: Box<u32> = Box::new(0xCAFEBABE);
    println!("heap u32    @ {:p}", &*boxed);

    // HEAP: Vec is a (ptr, len, cap) header on the stack
    // pointing at a buffer on the heap.
    let v: Vec<u8> = vec![1, 2, 3, 4];
    println!("vec header  @ {:p}  buffer @ {:p}",
             &v, v.as_ptr());

    // Both heap allocations are freed automatically when
    // `boxed` and `v` go out of scope. No malloc, no free.
}
C• • •
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    // STACK
    uint8_t arr[4] = {10, 20, 30, 40};
    printf("stack arr   @ %p\n", (void*)arr);

    // HEAP: malloc returns a pointer (NULL on failure); you must free it.
    uint32_t *boxed = malloc(sizeof *boxed);
    if (boxed == NULL) return 1;
    *boxed = 0xCAFEBABE;
    printf("heap u32    @ %p\n", (void*)boxed);

    // HEAP: explicit array
    uint8_t *buf = malloc(4 * sizeof *buf);
    if (buf == NULL) { free(boxed); return 1; }
    memcpy(buf, (uint8_t[]){1, 2, 3, 4}, 4);
    printf("heap buf    @ %p\n", (void*)buf);

    free(boxed);
    free(buf);   // forget either of these and you have a leak.
    return 0;
}

Pointers are just numbers

A pointer is an address. An address is a number. On a 64-bit system, that number is 8 bytes wide, which is why sizeof(void*) is 8 there. The CPU has no special "pointer" type; load and store instructions take addresses, full stop. The type attached to a pointer is a fiction the compiler enforces to make sure you don't read 8 bytes from a place where only 4 live.
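
You can watch a pointer turn into a number and back. A minimal sketch; `pointer_width_bytes` and `round_trip` are illustrative names, and the integer-to-pointer cast is only sound here because the address came from a live reference a moment earlier.

```rust
use std::mem::size_of;

/// Width of a pointer on the machine this is compiled for.
fn pointer_width_bytes() -> usize {
    size_of::<*const u8>()
}

/// Turn a reference into a plain integer address, then read through it again.
fn round_trip(value: &u32) -> (usize, u32) {
    let addr = value as *const u32 as usize; // the pointer, as a number
    // SAFETY: `addr` was just derived from a live, aligned `&u32`
    // that outlives this call.
    let read_back = unsafe { *(addr as *const u32) };
    (addr, read_back)
}
```

On a 64-bit target `pointer_width_bytes()` returns 8, the same as `size_of::<usize>()`, and the value read back through the integer address is the value that went in.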

A memory address is a hex number, like 0x7fff5fbff8a4. You know hex from the number systems page, and you know why addresses are written in hex: because hex is binary in human-readable form. Every address is just binary pointing somewhere in this wall of switches. ← see: number systems

stack (automatic): Allocation = move SP. Deallocation = move it back. Lifetime tied to the function.
heap (explicit): Allocation = ask the allocator. Deallocation = give it back (or be tracked by the runtime).
static (forever): Globals live in DATA / BSS and string literals in read-only memory alongside TEXT, for the entire process lifetime.

// the classic stack mistake
Returning a pointer or reference to a local variable. The function returns, the stack frame is reclaimed, your pointer now points to whatever junk the next call writes there. C lets you do this; Rust catches it at compile time as a lifetime error.
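
In Rust the mistake and its two usual fixes look like this. A sketch: the rejected function is shown as a comment because it does not compile, and `by_value` and `on_heap` are illustrative names.

```rust
// The classic mistake. rustc rejects this before it ever runs:
//
//     fn dangling() -> &u32 {
//         let local = 42;
//         &local            // `local` dies when the function returns
//     }
//
// Two ways to get the value out safely:

/// Return the value itself: it is copied into the caller's frame.
fn by_value() -> u32 {
    let local = 42;
    local
}

/// Put the value on the heap: the Box owns it, and ownership moves to the caller.
fn on_heap() -> Box<u32> {
    let local = 42;
    Box::new(local)
}
```

Both fixes answer the same question, "who owns this after the function returns?", with something other than a dead stack frame.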

Advanced // level 03

Virtual memory, safety & how languages model it

Virtual memory: every process gets its own universe

If two programs both write to address 0x4000, do they trample each other? They don't, because the address you see in your program isn't a real physical address at all. It's a virtual address. Between your program and the RAM chip sits a piece of CPU hardware called the MMU (memory management unit), which translates virtual addresses to physical ones using a per-process lookup table maintained by the OS.

This buys three properties at once:

  • Isolation. Process A literally can't address process B's pages.
  • Lazy allocation. A 1 GB malloc doesn't actually consume 1 GB of RAM; pages are mapped on first touch.
  • Swapping. Pages that haven't been touched recently can be written to disk and reloaded transparently.
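
The translation step can be sketched as a lookup table. Assumptions, loudly: 4 KiB pages, a single-level table held in a `HashMap`, and made-up names (`PageTable`, `map_page`, `translate`). Real MMUs walk multi-level tables in hardware.

```rust
use std::collections::HashMap;

const PAGE_SIZE: u64 = 4096; // assumed 4 KiB pages

/// A toy per-process page table: virtual page number -> physical frame number.
struct PageTable {
    map: HashMap<u64, u64>,
}

impl PageTable {
    fn new() -> Self {
        PageTable { map: HashMap::new() }
    }

    /// Record that `virt_page` is backed by `phys_frame`.
    fn map_page(&mut self, virt_page: u64, phys_frame: u64) {
        self.map.insert(virt_page, phys_frame);
    }

    /// Translate a virtual address. None models a page fault:
    /// the page is not mapped (yet), so the OS would have to step in.
    fn translate(&self, vaddr: u64) -> Option<u64> {
        let page = vaddr / PAGE_SIZE;
        let offset = vaddr % PAGE_SIZE;
        let frame = *self.map.get(&page)?;
        Some(frame * PAGE_SIZE + offset)
    }
}
```

Give two processes their own `PageTable`, map virtual page 4 to a different frame in each, and both can use address 0x4000 without ever touching the same physical byte.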

Cache locality, restated

The CPU page covered the cache hierarchy in detail. The takeaway, restated as a memory-layout principle: data laid out contiguously is dramatically faster to read. A Vec<Foo> beats a Vec<Box<Foo>>. An array of structs beats an array of pointers to structs. Same algorithmic complexity, yet often a 5 to 50× wall-time difference, because the contiguous layout streams cleanly through L1 while the pointer-based one chases pointers through main memory.
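
The layout difference is visible in the addresses themselves. A minimal sketch with a made-up `Foo`; it shows the layouts, not the timing, since wall-time numbers depend on the machine.

```rust
use std::mem::size_of;

#[derive(Clone, Copy)]
struct Foo {
    id: u64,
    weight: u64,
}

/// Contiguous: the elements sit back to back in one heap buffer.
fn sum_inline(items: &[Foo]) -> u64 {
    items.iter().map(|f| f.weight).sum()
}

/// Indirect: the buffer holds pointers; each Foo is its own allocation.
fn sum_boxed(items: &[Box<Foo>]) -> u64 {
    items.iter().map(|f| f.weight).sum()
}

/// Where element `i` of a contiguous array lives:
/// base address + i * element size.
fn element_address(base: usize, i: usize) -> usize {
    base + i * size_of::<Foo>()
}
```

Both sums return the same answer. The difference is that for the contiguous slice `element_address` predicts exactly where every element lives, which is what lets the hardware prefetch ahead; for the boxed version no such formula exists.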

The four classic memory bugs

bug | what happens | what causes it
Use-after-free | Read or write through a pointer to memory that's already been released | Freeing memory while another pointer to it still exists
Double free | Allocator's internal bookkeeping corrupts; later allocations crash or alias | Calling free() twice on the same pointer
Buffer overflow | Overwrite adjacent variables, return addresses, control flow. Classic exploit vector. | Writing past the end of an array
Memory leak | Process slowly grows until OOM; long-running services restart on a schedule to mitigate | Allocating without ever freeing
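
For the buffer-overflow row, the usual defence is a bounds check on every access. A minimal sketch using the standard slice methods `get` and `get_mut`; `read_at` and `write_at` are illustrative wrappers.

```rust
/// Read index `i` if it exists. `get` checks the bounds instead of
/// reading whatever happens to sit past the end of the buffer.
fn read_at(buf: &[u8], i: usize) -> Option<u8> {
    buf.get(i).copied()
}

/// Write index `i` if it exists; report failure instead of overflowing.
fn write_at(buf: &mut [u8], i: usize, value: u8) -> bool {
    match buf.get_mut(i) {
        Some(slot) => {
            *slot = value;
            true
        }
        None => false,
    }
}
```

An out-of-range index comes back as `None` or `false`. Nothing adjacent gets overwritten, so there is no return address to smash.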

Three strategies for memory safety

01 / manual
C, C++
You allocate, you free. Maximum control, maximum performance, every memory bug above is on the table. Decades of CVEs trace back to buffer overflows in C and C++ code.
02 / garbage collection
Java, Go, Python, JS
A runtime traces which heap objects are still reachable from live references and reclaims the rest. Safe, at the cost of a runtime, GC pauses, and less control over layout.
03 / ownership
Rust
Each value has a single owner. The compiler tracks lifetimes statically and refuses to compile code that could free memory that's still referenced. No runtime cost, no GC pauses.

Memory in Bitcoin

A Bitcoin full node depends on two critical data structures that it wants in RAM. Both behave like hash maps. Both are enormous.

The UTXO set is every unspent transaction output on the entire network: every coin that exists, every satoshi not yet spent. That is roughly 85 million entries at about 100 bytes each, so 8 to 10 gigabytes. When your wallet sends Bitcoin, the node looks the coin up in this map: is this output unspent, and does the transaction satisfy its locking script? One lookup, O(1) on average, because it is a hash map, and the hashing page explained why hash maps are fast. Bitcoin Core persists the set in a LevelDB database on disk and caches as much of it as it can in RAM, because a lookup that has to go to disk is about 1000 times slower.

Rust• • •
use std::collections::HashMap;

// Minimal stand-in types so the sketch compiles on its own.
type TxId = [u8; 32]; // 32-byte transaction id

struct TxOut {
    satoshis: u64,   // amount
    script: Vec<u8>, // locking script
}

// Every unspent coin on the Bitcoin network lives in a
// structure like this. Bitcoin Core is C++, but the concept
// is identical: a hash map, cached in RAM, hit on every tx.
struct UtxoSet {
    // key:   (txid, output index)
    // value: (amount in satoshis, locking script)
    entries: HashMap<(TxId, u32), TxOut>,
    // roughly 8GB for the full set,
    // touched on every single transaction validation
}

impl UtxoSet {
    // O(1) lookup. This is why a Bitcoin node wants RAM:
    // disk would be ~1000x slower per check.
    fn is_unspent(&self, txid: &TxId, vout: u32) -> bool {
        self.entries.contains_key(&(*txid, vout))
    }

    // Spending a coin removes it from the set.
    fn spend(&mut self, txid: &TxId, vout: u32) -> Option<TxOut> {
        self.entries.remove(&(*txid, vout))
    }
}
C• • •
#include <stdint.h>
#include <stddef.h>

/* A simplified UTXO entry. */
typedef struct {
    uint8_t  txid[32];   /* 32-byte transaction id */
    uint32_t vout;       /* output index */
    int64_t  satoshis;   /* amount */
    uint8_t  script[35]; /* locking script */
} Utxo;

/* A real node keeps these in a hash map (LevelDB on disk,
 * cached in RAM). This struct is the concept: */
typedef struct UtxoSet {
    Utxo   *entries;     /* hash map entries */
    size_t  count;       /* current size */
    size_t  capacity;    /* allocated capacity */
} UtxoSet;

/* O(1) average: the hash map contract from the hashing page. */
const Utxo *utxo_find(const UtxoSet *set,
                      const uint8_t txid[32],
                      uint32_t vout);

The mempool is where your transaction waits after you broadcast it, until a miner includes it in a block. It is also a keyed lookup structure, also in RAM, on every node. It typically holds 5,000 to 100,000 transactions at 250 to 1,000 bytes each, so anywhere from about 1 megabyte on a quiet day to about 100 megabytes when the network is busy.
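
The size estimates in this section are plain multiplication, so they are easy to check. A small worked sketch using the page's own rough figures; `estimate_bytes` and `to_mb` are illustrative helpers.

```rust
/// Estimated bytes for `count` entries of `bytes_each` bytes.
fn estimate_bytes(count: u64, bytes_each: u64) -> u64 {
    count * bytes_each
}

/// Convert bytes to megabytes (10^6 bytes), for readability.
fn to_mb(bytes: u64) -> f64 {
    bytes as f64 / 1_000_000.0
}
```

85 million UTXO entries at 100 bytes is 8.5 billion bytes, about 8.5 GB. For the mempool, the low end (5,000 transactions at 250 bytes) is 1.25 MB and the high end (100,000 at 1,000 bytes) is 100 MB.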

When the mempool fills up, low-fee transactions get evicted. Yours might not confirm for hours, or ever. This is not a bug in Bitcoin; it is the memory hierarchy in production. Every node has a finite amount of RAM, so the mempool is capped at a configured size (300 MB by default in Bitcoin Core), and transactions that would push it over the cap do not queue. They are dropped.
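
The eviction rule can be sketched as "over the limit, drop the cheapest per byte". This is a toy, not Bitcoin Core's algorithm (which also weighs a transaction's ancestors and descendants); `PendingTx`, `ToyMempool`, and `add` are names invented for this example.

```rust
/// A pending transaction, reduced to what the eviction rule needs.
#[derive(Debug)]
struct PendingTx {
    id: u32,   // stand-in for a real 32-byte txid
    size: u64, // bytes
    fee: u64,  // satoshis
}

/// A toy mempool with a hard byte limit.
struct ToyMempool {
    limit_bytes: u64,
    used_bytes: u64,
    txs: Vec<PendingTx>,
}

impl ToyMempool {
    fn new(limit_bytes: u64) -> Self {
        ToyMempool { limit_bytes, used_bytes: 0, txs: Vec::new() }
    }

    /// Add a transaction, then evict the lowest fee-per-byte entries
    /// until the pool fits under its limit again. Returns the evicted ids.
    fn add(&mut self, tx: PendingTx) -> Vec<u32> {
        self.used_bytes += tx.size;
        self.txs.push(tx);
        let mut evicted = Vec::new();
        while self.used_bytes > self.limit_bytes {
            // Compare fee rates (fee / size) by cross-multiplying,
            // which avoids floating point.
            let worst = self
                .txs
                .iter()
                .enumerate()
                .min_by(|(_, a), (_, b)| (a.fee * b.size).cmp(&(b.fee * a.size)))
                .map(|(i, _)| i)
                .expect("pool cannot be empty while over its limit");
            let gone = self.txs.remove(worst);
            self.used_bytes -= gone.size;
            evicted.push(gone.id);
        }
        evicted
    }
}
```

Note what happens to a transaction that arrives with the lowest fee rate in a full pool: it is the one evicted, immediately. That is the "might not confirm, ever" case from the paragraph above.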

From a transistor in a flip-flop to a global ledger of 85 million coins: all of it is memory. Addressed, read, written, at nanosecond speed, on machines made of the same gates you learned on page 4.

Rust• • •
// The compiler tracks who owns each piece of memory.
// Use-after-free becomes a compile-time error, not a runtime crash.

fn main() {
    let s = String::from("hello"); // s owns the heap buffer

    let r = &s;                    // r borrows s, immutably
    println!("{r}");

    drop(s);                       // s is moved into drop, buffer freed

    // println!("{r}");            // ← uncomment and it won't compile:
    //                             //   cannot move out of `s` because
    //                             //   it is borrowed (error E0505)
}

// In contrast, the equivalent C code (next block) compiles
// happily and prints garbage, or crashes, depending on the day.
C• • •
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *s = malloc(6);
    strcpy(s, "hello");
    char *r = s;          // both pointers alias the same buffer

    printf("%s\n", r);    // fine, prints "hello"

    free(s);              // buffer is now invalid memory
    printf("%s\n", r);    // USE-AFTER-FREE: undefined behaviour.
                          // The compiler need not warn. No runtime check.
                          // Might print "hello", might segfault,
                          // might leak whatever the allocator
                          // wrote into those bytes next.
    return 0;
}

// undefined behaviour is a real thing
Use-after-free in C isn't "the program will crash." It's undefined behaviour, which means the compiler is allowed to assume it never happens. Optimisers exploit that assumption, so the resulting program can do anything: print the right answer in debug, segfault in release, leak data over the network, run an attacker's code. Memory-safety bugs like this one are behind a large share of the vulnerabilities catalogued at cve.org.

Connecting the whole stack

// from electron to executable, with state
  1. Transistors form flip-flops and DRAM cells.
  2. Tiled and addressed, those become memory chips: registers, caches, RAM.
  3. The CPU reads instructions and data from memory using load/store ops.
  4. Bits in memory encode numbers (page 1) and characters (page 2).
  5. The OS gives each process its own virtual address space, divided into stack, heap, data, text.
  6. Your language picks a strategy for managing it: manual, GC, or ownership.
  7. Your program is, ultimately, a sequence of reads and writes to specific addresses.

The UTXO set and mempool are hash maps in RAM on every Bitcoin node. The hashing page explained how hash maps work; the networking page explained how nodes share this data; the blockchain page showed the full picture. Memory is where Bitcoin lives while it runs. ← see: hashing and blockchain

Where to dig in next

You now have the whole vertical, including state. Next stops:

  • What Every Programmer Should Know About Memory, Ulrich Drepper's canonical paper on caches, NUMA, and access patterns.
  • The Linux process memory map: read /proc/<pid>/maps on a running process and watch the regions above appear in the wild.
  • Rustonomicon, the dark-arts companion to the Rust book, on lifetimes, aliasing, and unsafe.
  • The Garbage Collection Handbook by Jones, Hosking & Moss, the reference text on GC algorithms.

And with that, the loop closes. You started at the bit. You've now seen everything between the bit and the program: the encodings on top of it, the gates beneath it, the CPU that orchestrates it, and the memory that holds all of it together.

Where memory appears in BitRoot

Memory is where every other topic comes to rest. The shortest path from each, back into this wall of switches:

0x02 / binary
Every cell is a bit
Every bit in every memory cell is a binary value: 0 or 1, charged or uncharged. Memory is just binary given an address and a wire to the CPU.
0x01 / number systems
Addresses are hex
Every memory address is a number written in hex, like 0x7fff5fbff8a4, because binary addresses are unreadable to humans. The number systems page explains why 16 is the perfect shorthand.
0x04 / logic gates
Cells are gates
A DRAM cell is one transistor and one capacitor; an SRAM cell is six transistors in a flip-flop. The logic gates page built both. Memory is gates configured to remember.
0x05 / cpu
Inseparable
The CPU reads instructions from memory on every clock cycle. The fetch in fetch-decode-execute is a memory read. Every register write, every stack push, is memory.
0x07 / operating system
Owns the address space
The OS creates the virtual address space (stack, heap, data, text), enforces isolation between processes, and handles page faults when virtual pages are not yet in RAM.
0x08 / variables
A name for an address
A variable is a name the compiler gives to a memory address. int x = 42 means: at address 0x7fff..., store 0x0000002A. The name vanishes at compile time; only the address remains.
0x09 / pointers
An address as a value
A pointer is a variable that holds a memory address as its value. Dereferencing follows the address: the CPU follows the number and reads what lives there. Pointers make memory navigable.
0x14 / recursion
The stack has a limit
Every recursive call pushes a stack frame. The stack lives in memory, ~8MB on Linux. Without a base case it grows until it hits the OS limit: segmentation fault. Memory kills the process.
0x0B / arrays
Contiguous blocks
An array is a contiguous block of identically-typed values. arr[2] is base_address + 2 × element_size: the CPU adds, follows the address, reads the value. Cache-friendly because contiguous.
0x0C / linked lists
Scattered, pointer-linked
A linked list is scattered memory connected by pointers. Each node lives at a different address; traversal follows addresses across the heap. Cache-unfriendly, flexible, dynamic.
0x0D / hashing
The UTXO set in RAM
Bitcoin's UTXO set is a hash map in RAM, roughly 8 to 10 gigabytes on a full node. Every transaction validation is a RAM lookup, O(1), because of hashing.
0x0F / networking
Packets buffered in RAM
Packets are buffered in RAM while the OS processes them; the network stack lives in kernel memory. When packets arrive faster than processing they queue. Network performance is often a memory-bandwidth problem.
0x10 / distributed systems
No shared memory
Every distributed system stores its state in memory. CAP is partly a memory problem: two machines, each with their own RAM, can hold different values. There is no shared memory across machines, only messages.
0x13 / blockchain
Where Bitcoin lives at runtime
A full node holds the UTXO set (~8GB, cached from disk), the mempool (up to ~100MB), and the block index in RAM. Memory is where Bitcoin's state lives while it validates. The blockchain page shows every memory concept in production.