A Parallel File Scanner

This section combines the ideas from the previous sections into one program structure.

A Parallel File Scanner

This section combines the ideas from the previous sections into one program structure.

The program scans files in parallel.

Several worker threads receive file paths from a queue. Each worker opens a file, reads it, and updates shared statistics.

The design has four parts:

Component Purpose
job queue stores file paths
worker threads process files
shared state stores totals
synchronization coordinates access

The shared state is small:

const State = struct {
    mutex: std.Thread.Mutex = .{},

    files_scanned: u64 = 0,
    bytes_read: u64 = 0,
};

The mutex protects the counters.

Workers update them through methods:

fn addFile(self: *State, bytes: u64) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.files_scanned += 1;
    self.bytes_read += bytes;
}

The queue stores paths:

const JobQueue = struct {
    mutex: std.Thread.Mutex = .{},
    condition: std.Thread.Condition = .{},

    jobs: [128][]const u8 = undefined,

    head: usize = 0,
    tail: usize = 0,
    count: usize = 0,

    shutdown: bool = false,
};

The queue uses a ring buffer.

Workers remove paths from the queue:

fn pop(self: *JobQueue) ?[]const u8

The producer inserts paths:

fn push(self: *JobQueue, path: []const u8) void

Workers repeatedly:

  1. take a path
  2. open the file
  3. read the file
  4. update statistics

The worker loop looks like this:

fn worker(
    queue: *JobQueue,
    state: *State,
) void {
    while (true) {
        const path = queue.pop() orelse break;

        processFile(path, state);
    }
}

The worker exits when pop returns null.

The file processing function does not hold the mutex while reading the file:

fn processFile(
    path: []const u8,
    state: *State,
) void {
    // open file
    // read file
    // compute size

    state.addFile(size);
}

Only the counter update is synchronized.

The expensive I/O happens outside the lock.

This is important.

If the mutex stayed locked during file reading, only one worker could read files at a time. Parallelism would disappear.

The main thread usually acts as the producer.

It walks the filesystem and pushes paths into the queue:

while (iterator.next()) |entry| {
    queue.push(entry.path);
}

After all jobs are added:

queue.mutex.lock();
queue.shutdown = true;
queue.condition.broadcast();
queue.mutex.unlock();

This wakes all workers and tells them no more jobs will arrive.

Each worker eventually exits:

const path = queue.pop() orelse break;

The main thread joins all workers:

thread.join();

Finally the totals are printed:

std.debug.print(
    "files = {d}, bytes = {d}\n",
    .{
        state.files_scanned,
        state.bytes_read,
    },
);

This design scales reasonably well because:

  • workers mostly operate independently
  • shared state is small
  • lock contention is low
  • expensive work happens outside locks

The queue is the synchronization point.

The statistics object is the aggregation point.

Everything else is local worker state.

This is a common concurrent structure:

producer -> queue -> workers -> shared results

The same model appears in:

  • search engine crawlers
  • compilers
  • build systems
  • backup tools
  • static analyzers
  • image processing pipelines

The details change, but the structure remains similar.

A good concurrent design usually has:

  • few shared objects
  • clear ownership
  • short lock durations
  • explicit shutdown rules
  • bounded queues
  • workers with independent local state

The threads themselves are not the important part.

The important part is how data moves through the system.

Exercise 18-35. Replace the fixed-size queue with a dynamically growing queue.

Exercise 18-36. Add a counter for failed file opens.

Exercise 18-37. Add recursive directory traversal.

Exercise 18-38. Add a second queue for completed results.

Exercise 18-39. Measure performance with one worker, two workers, and four workers.