Pi Compute¶
paglets examples pi demonstrates using mesh-info as lightweight placement
input for a distributed compute job. It computes decimal digits of Pi by
distributing Chudnovsky term batches and combining the integer partial sums on
the coordinator.
uv run paglets examples pi --digits 16 --batch-size 1
uv run paglets examples pi --digits 32 --max-load-per-cpu 0.75 --max-workers-per-host 2 --json
The coordinator stays on the dynamically discovered entry host, partitions the
requested digit range into the required Chudnovsky terms, asks local
mesh-info for eligible targets across the mesh, treats approximate free load
slots as additional launch capacity, creates short-lived
PiBatchWorkerAgent instances remotely, and receives batch_result messages.
Worker creation requests are issued in parallel so process-spawn overhead does
not keep free slots empty. The free-slot estimate is based on cpu_count *
--max-load-per-cpu - load_1m; existing in-flight workers are added back before
new launches are capped by --max-workers-per-host. --max-in-flight caps the
whole job. A dedicated PiPostProcessAgent runs on the entry host for each
active job; it incrementally merges finalized term fragments and performs
drain/format work so the coordinator can focus on scheduling and state
tracking.
In text mode the CLI starts the coordinator asynchronously, long-polls with
drain_stream, and appends each returned decimal fragment to the terminal.
drain_stream first refills worker slots, then returns compact progress counters
and new text only; it does not return the full raw term history, keeping messages
small while preserving 3.1415... output. Increase
--stream-chunk-size when larger terminal bursts are useful. Use --json for a
final summary object instead of live output. The default job timeout is disabled
so long calculations can run to completion; add --timeout SECONDS when a run
should be bounded, and increase --request-timeout if an exceptionally large
coordinator response needs longer than the default HTTP request window. Workers
re-check local load before computing; if a host has become busy, the worker
reports skipped, the coordinator requeues that batch, and the worker disposes
itself. If all hosts are above the load/CPU thresholds and no batch is running,
the coordinator sends one fallback worker anyway so a long job still makes
minimum progress.
The worker result payloads encode large Chudnovsky partial integers in
hexadecimal internally. The coordinator forwards only finalized ok term fragments
to the post-processor for incremental merge; the post-processor formats digits
on demand. This keeps scheduling messages compact and avoids expensive terminal-side
recombination while still producing normal 3.1415... output and avoiding Python
integer string conversion limits for very large jobs.
Programmatic use:
from paglets.examples.compute import (
PI_START_ASYNC,
PiComputeCoordinatorAgent,
PiComputeRequest,
PiStartRequest,
)
from paglets.patterns.operations import OperationClient
from paglets.serialization.codec import dataclass_to_wire
coordinator = self.context.create_paglet(PiComputeCoordinatorAgent)
client = OperationClient(coordinator)
reply = client.call(
PI_START_ASYNC,
PiStartRequest(
request=dataclass_to_wire(PiComputeRequest(start=0, digits=16, batch_size=1))
)
)
summary = reply.summary