Compute Slots
compute-slots is a built-in resident scheduler service for coarse compute
jobs. It is designed for paglets that know their own resource needs and can wait
until a host has enough capacity to run them.
The service does not run the job code. It admits, queues, redirects, and grants
compute slots. ComputeJobPaglet owns the scheduling protocol and lease release
for normal compute jobs. The job paglet remains responsible for downloading
input, performing work, storing intermediate results, and deciding where to
return results.
The Analysis Jobs example shows one complete consumer implementation.
Writing A Compute Job
New compute job types should not copy the scheduler protocol by hand. Use the
client-side base layer from paglets.system.compute_slots. The base class owns
placement, slot requests, sleep/wakeup, redirects, failure defaults, automatic
home-host capture, and automatic lease release.
Minimal skeleton:
from dataclasses import dataclass

from paglets.system.compute_slots import ComputeJobPaglet, ComputeJobState


@dataclass
class MyJobState(ComputeJobState):
    cpu_cores: int = 1
    memory_bytes: int = 512 * 1024 * 1024
    temp_storage_bytes: int = 0
    estimated_runtime_seconds: float = 120.0
    dataset_name: str = ""
    result_path: str = ""


class MyJobPaglet(ComputeJobPaglet[MyJobState]):
    State = MyJobState

    def run_compute_job(self) -> None:
        # Do CPU/RAM/temp-storage work here. The base class releases the slot.
        with self.locked_state() as state:
            state.result_path = "..."

    def continue_after_compute_success(self) -> None:
        # Return, upload, or commit results here. This can run again after wakeup.
        pass

    def after_compute_failure(self, message: str) -> None:
        self.notify_user("error", "Compute job failed", message)
Resource estimates are validated before the first scheduler call:
- cpu_cores must be at least 1.
- memory_bytes must be greater than 0 unless allow_zero_memory_bytes = True.
- estimated_runtime_seconds must be greater than 0 unless allow_zero_runtime_seconds = True.
- temp_storage_bytes may be 0, but should be realistic when scratch space is required.
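The validation rules above can be sketched as a standalone function. This is only an illustration, not the base class's implementation: ResourceEstimate and validate_estimate are hypothetical names, placing the allow_zero_* flags on the same object is an assumption, and the negative-value checks are an added guard that the rules do not spell out.

```python
from dataclasses import dataclass


@dataclass
class ResourceEstimate:
    """Hypothetical stand-in for the resource fields of a compute job state."""

    cpu_cores: int = 1
    memory_bytes: int = 512 * 1024 * 1024
    temp_storage_bytes: int = 0
    estimated_runtime_seconds: float = 120.0
    allow_zero_memory_bytes: bool = False
    allow_zero_runtime_seconds: bool = False


def validate_estimate(est):
    """Return a list of problems; an empty list means the estimate is valid."""
    problems = []
    if est.cpu_cores < 1:
        problems.append("cpu_cores must be at least 1")
    # Zero is accepted only with the explicit opt-in flag; negatives never are.
    if est.memory_bytes < 0 or (est.memory_bytes == 0 and not est.allow_zero_memory_bytes):
        problems.append("memory_bytes must be greater than 0")
    if est.estimated_runtime_seconds < 0 or (
        est.estimated_runtime_seconds == 0 and not est.allow_zero_runtime_seconds
    ):
        problems.append("estimated_runtime_seconds must be greater than 0")
    # Assumption: zero scratch space is fine, a negative value is a bug.
    if est.temp_storage_bytes < 0:
        problems.append("temp_storage_bytes must not be negative")
    return problems
```

Collecting all problems instead of raising on the first one makes it easier to report every bad field to the submitter in a single message.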
Use application fields, such as dataset_name, sample_id, job_id, or
result_key, for domain-specific result labels. The scheduler derives its own
internal diagnostic label from the runtime agent_id; normal job code does not
set or use that label.
| Field or hook | Owner | Purpose |
|---|---|---|
| compute_status, compute_error | Base class | Scheduler and slot lifecycle. |
| slot_request_id, slot_lease_id, redirects, affinity metadata | Base class | Internal scheduler protocol state. |
| status, result_path, domain IDs | Application | Result lifecycle and business state. |
| run_compute_job() | Application | The actual work after a slot grant. |
| continue_after_compute_success() | Application | Result return/commit logic, including later wakeups. |
| after_compute_failure(message) | Application | Optional notification or cleanup. |
after_compute_success() is a one-shot hook called immediately after
run_compute_job() returns and the lease is released. Its default implementation
calls continue_after_compute_success(). If a completed job deactivates and
wakes later, only continue_after_compute_success() is called again.
notify_user() is non-fatal. It suppresses notification transport failures and
returns False when delivery failed.
Job Groups And Collectors
For a submitter-plus-collector workflow, use the group helpers instead of hand-rolling the same registration and status bookkeeping in every application:
from paglets.system.compute_slots import (
    CollectingComputeJobPaglet,
    CollectingComputeJobState,
    ResultCollectorPaglet,
    submit_compute_job_group,
)
submit_compute_job_group(...) creates a collector, registers expected job
keys before workers start, creates the compute jobs, and records creation
failures back into the collector. ResultCollectorPaglet accepts job_result
and job_failure reports, exposes summary and drain, tracks duplicates,
and can optionally return to the submitter/home host when the group completes.
The group layer is intentionally small. It tracks job completion and result
reports; it does not add a durable global queue. Use
report_compute_artifact(...) from run_compute_job() when a result is a
file. That helper reports success itself, so do not also call
report_compute_success(...) for the same job. See
Artifact Transport for natural file mobility and low-level
artifact APIs.
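To make the collector's bookkeeping concrete, here is a small in-memory model of the behavior described above: expected keys are registered first, result and failure reports are recorded, repeated reports are counted as duplicates, and collected results can be drained. It is a sketch under assumptions, not ResultCollectorPaglet itself; the class name, method signatures, and the shape of the summary dictionary are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CollectorModel:
    """Hypothetical model of group bookkeeping; not the real collector paglet."""

    expected: set = field(default_factory=set)
    seen: set = field(default_factory=set)
    results: dict = field(default_factory=dict)
    failures: dict = field(default_factory=dict)
    duplicates: int = 0

    def register(self, job_keys):
        # Expected keys are registered before any worker starts.
        self.expected.update(job_keys)

    def job_result(self, key, payload):
        self._record(self.results, key, payload)

    def job_failure(self, key, message):
        self._record(self.failures, key, message)

    def _record(self, bucket, key, value):
        if key in self.seen:
            self.duplicates += 1  # a repeated report is counted, not stored
            return
        self.seen.add(key)
        bucket[key] = value

    def summary(self):
        pending = self.expected - self.seen
        return {
            "expected": len(self.expected),
            "reported": len(self.seen),
            "pending": len(pending),
            "duplicates": self.duplicates,
            "complete": not pending,
        }

    def drain(self):
        # Hand over the stored results and clear them from the collector.
        drained, self.results = self.results, {}
        return drained
```

Keeping a separate seen set means duplicate detection and completion tracking still work after results have been drained.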
Service Role
Each host runs one eager ComputeSlotsAgent:
[[resident_services]]
class = "paglets.system.compute_slots.agent:ComputeSlotsAgent"
enabled = true
agent_id = "service.compute-slots"
singleton = true
lifecycle = "eager"
scope = "mesh"
idle_timeout = 0.0
state = { service_scope = "mesh" }
Each scheduler owns only local admission decisions. A job paglet first asks for
candidate hosts while it is still at home. After it chooses a suitable host, the
paglet dispatches there. Only after arrival does it call that host's local
request_slot operation and receive run_now, sleep, redirect, or
rejected.
Schedulers also exchange peer status with other schedulers. That status exchange is between scheduler services; it is not the same thing as the job paglet's local slot request after movement.
flowchart LR
Job["compute job paglet"]
Slots["compute-slots\nresident scheduler"]
Server["server-info\nlocal resources"]
Info["user-info\noperator messages"]
Work["job code\napplication-specific"]
Results["result collection\napplication-specific"]
Job -->|"candidate_hosts"| Slots
Job -->|"request_slot / automatic release_slot"| Slots
Slots -->|"sample CPU/RAM/storage"| Server
Slots -->|"run_now / sleep / redirect"| Job
Slots -->|"compute_slot_granted\ncompute_slot_redirect"| Job
Job -->|"notify failures/status"| Info
Job -->|"run only after grant"| Work
Job -->|"save/return results"| Results
The scheduler does not run analysis code, store result frames, dispatch sleeping paglets directly, or own application retry policy. It admits work, queues work, grants leases, and recommends redirects. The paglet owns movement, computation, result storage, and final result collection.
Compute Paglet Lifecycle
This is the expected lifecycle for any paglet that uses the service:
stateDiagram-v2
[*] --> Preflight
Preflight --> NoHost: no host can ever satisfy request
Preflight --> MoveToCandidate: selected candidate
MoveToCandidate --> RequestLocalSlot: arrives on compute host
RequestLocalSlot --> Running: run_now
RequestLocalSlot --> Waiting: sleep
RequestLocalSlot --> MoveToCandidate: redirect
RequestLocalSlot --> Failed: rejected
Waiting --> Running: compute_slot_granted
Waiting --> MoveToCandidate: compute_slot_redirect
Running --> ReturningOrCommitting: automatic release_slot
NoHost --> [*]
Failed --> [*]
ReturningOrCommitting --> [*]
request_slot is always local to the host where the paglet currently lives. A
redirect is only a recommendation plus an activation message. The paglet must
dispatch itself to the target host and request a slot again after arrival.
Request Model
A paglet submits a ComputeSlotRequest with its own resource estimate:
- agent_id and agent_host_url, so the scheduler can wake or redirect it.
- job_id, an internal scheduler diagnostic label derived by the ComputeJobPaglet base class.
- cpu_cores, the requested number of logical CPU cores. This is the scheduler's explicit CPU reservation count.
- memory_bytes, expected peak RAM pressure.
- temp_storage_bytes, expected scratch storage.
- estimated_runtime_seconds, used for lease TTLs.
- requires_gpu and gpu_memory_mb, reserved for GPU scheduling.
- required_host_tags, excluded_host_tags, and preferred_host_tags, for role-based placement such as Linux-only or GPU-preferred workers.
- excluded_host_names and excluded_host_urls, for avoiding the submitter, collector, laptop, or any other known host.
- redirect_count, last_redirect_at, and last_redirect_from_host_url, used for redirect cooldown and ping-pong avoidance.
GPU fields are part of the schema, but v1 rejects GPU-required jobs because GPU admission is not implemented yet.
Scheduling Details
The scheduler uses common coarse-grained scheduling concepts:
- Work-conserving local admission: if a queued job can run locally now, the local scheduler grants it before considering redirects.
- Oldest-fit queue scanning: queued requests are inspected in submission order, and the first request that currently fits receives the next lease.
- Bounded work stealing: a scheduler with blocked waiting jobs may wake some of them and redirect them to a peer with free capacity.
- Local anchor: if more than one job is waiting locally, at least one waiting job stays behind so the host has work ready when a local lease finishes.
- Redirect cooldown: a recently redirected request is not redirected again immediately, which adds hysteresis and prevents ping-pong movement.
- Shadow reservations: when several jobs are redirected in one pass, the scheduler subtracts their estimated resources from projected peer capacity before choosing the next redirect.
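The shadow-reservation idea can be illustrated with a small planning function. This is a sketch, not the scheduler's code: Capacity, Waiting, and plan_redirects are hypothetical names, only CPU cores and memory are modeled, and picking the first fitting peer is an assumption about peer selection.

```python
from dataclasses import dataclass


@dataclass
class Capacity:
    """Hypothetical projected free capacity of one peer."""

    cpu_cores: int
    memory_bytes: int


@dataclass
class Waiting:
    """Hypothetical waiting request with its resource estimate."""

    name: str
    cpu_cores: int
    memory_bytes: int


def plan_redirects(waiting, peers):
    """Assign waiting jobs to peers, shrinking projected capacity per assignment."""
    # Work on a copy so the last reported peer status stays untouched.
    projected = {host: Capacity(c.cpu_cores, c.memory_bytes) for host, c in peers.items()}
    plan = {}
    for job in waiting:
        for host, free in projected.items():
            if job.cpu_cores <= free.cpu_cores and job.memory_bytes <= free.memory_bytes:
                # Shadow reservation: later jobs see the reduced projection.
                free.cpu_cores -= job.cpu_cores
                free.memory_bytes -= job.memory_bytes
                plan[job.name] = host
                break
    return plan
```

Without the subtraction step, every waiting job in the same pass would see the peer's full free capacity and several of them could be sent to a host that only has room for one.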
The defaults are conservative:
| Setting | Default | Meaning |
|---|---|---|
| grant_interval | 15.0 seconds | Normal wait between local grants. |
| max_grants_per_tick | 4 | Hard cap for grants sent during one scheduler pass. |
| usage_sample_interval | 60.0 seconds | Minimum interval between active job usage samples. |
| usage_history_limit | 100 | Number of finished job usage summaries retained in scheduler state. |
| burst_load_per_cpu | 0.5 | Maximum load per CPU for burst grants after the first grant. |
| burst_resource_headroom_factor | 2.0 | Free CPU/RAM/temp-storage multiplier required for burst grants. |
| max_redirects_per_tick | 4 | Hard cap for peer spillover in one scheduler pass. |
| max_redirect_fraction | 0.5 | Redirect at most half of the local queue. |
| redirect_cooldown_seconds | 60.0 | Minimum time before a request can be redirected again. |
| placement_sample_size | 3 | Initial placement chooses deterministically among the best ranked candidates. |
CPU Cores, Affinity, And Memory
CPU admission is a discrete logical-core reservation model. A request with
cpu_cores = 2 reserves two logical CPU cores before the job starts. Free CPU
capacity is the host's eligible logical CPU count minus the cpu_cores reserved
by all active leases.
The sampled load average is still useful, but only as a throttling signal for burst grants. It is not the primary CPU admission counter. This makes the maximum number of concurrently admitted jobs explicit and avoids depending on load average, which rises only after processes have already started.
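A minimal sketch of the reservation counter, assuming free capacity is the eligible logical CPU count minus the cores reserved by active leases, which is consistent with the eight-CPU example in the Startup Throttling section. The function names are hypothetical.

```python
def free_cpu_cores(eligible_cpu_count, active_lease_cores):
    """Free cores = eligible logical CPUs minus cores reserved by active leases."""
    return max(0, eligible_cpu_count - sum(active_lease_cores))


def cpu_fits(requested_cores, eligible_cpu_count, active_lease_cores):
    """A request is admissible on CPU only if its whole reservation is free."""
    return requested_cores <= free_cpu_cores(eligible_cpu_count, active_lease_cores)
```

Because the counter depends only on lease records, it changes the moment a lease is granted or released, independent of how quickly the load average reacts.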
When the host platform supports process affinity, a granted lease also carries
the current cpu_core_ids mask allocated to the job:
- Linux uses os.sched_setaffinity(0, cpu_core_ids).
- Windows uses psutil.Process().cpu_affinity(cpu_core_ids).
- macOS reports affinity as unsupported in v1.
Unsupported affinity does not reject the job. The scheduler still reserves
logical CPU cores, but the reply marks cpu_affinity_supported = false and the
job runs without OS-level pinning.
Affinity is host-managed. The compute job paglet does not need to set its own
process mask. compute-slots decides the mask and asks the local host process
to apply it to the target paglet child process. The low-level helper remains
available for tests and internal use, but normal compute paglets should treat
affinity as scheduler metadata.
cpu_cores is the guaranteed logical-core floor. When fewer jobs are active
than the host can run, the scheduler uses elastic affinity: it keeps each
lease's reserved CPU IDs and spreads idle eligible CPUs across active leases.
For example, on a host with ten eligible CPUs, two active jobs that each
requested two CPU cores keep their two-core reservations but are expanded to
five CPU IDs each. When more jobs start, masks shrink back toward the reserved
floors so the new job can get its guaranteed cores.
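The ten-CPU example can be reproduced with a short sketch. The real mask computation is not documented here, so the round-robin distribution of idle CPUs below is an assumption, and elastic_masks is a hypothetical name.

```python
def elastic_masks(eligible_cpu_ids, reserved):
    """Expand each lease's reserved CPU IDs with a share of the idle CPUs.

    reserved maps a lease name to its reserved CPU IDs (the guaranteed floor).
    """
    taken = {cpu for ids in reserved.values() for cpu in ids}
    idle = [cpu for cpu in eligible_cpu_ids if cpu not in taken]
    masks = {lease: list(ids) for lease, ids in reserved.items()}
    leases = list(masks)
    if not leases:
        return masks
    # Assumption: idle CPUs are dealt out round-robin in lease order.
    for index, cpu in enumerate(idle):
        masks[leases[index % len(leases)]].append(cpu)
    return masks
```

With ten eligible CPUs and two leases that reserve two cores each, both masks grow to five CPU IDs. When a third lease reserves two of the previously idle CPUs, recomputing the masks shrinks the first two while every reserved floor stays intact.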
Memory admission uses available RAM, not merely free RAM:
This matters because operating systems often use otherwise idle memory for file
cache. Reclaimable cache should not by itself prevent a job from starting. The
memory_bytes field remains an expected peak estimate; v1 does not create
cgroups, Windows Job Objects, or other hard memory limits.
Active job usage is sampled separately from the scheduler loop. The scheduler keeps maxima for CPU percent, process RSS, process-tree RSS, Paglets work-dir bytes, and application-provided extra work bytes. Sampling defaults to once per minute so expensive filesystem walks are not run every scheduler tick.
Compute jobs can expose application-specific scratch files or directories by
overriding compute_usage_paths():
class MyJobPaglet(ComputeJobPaglet[MyJobState]):
    def compute_usage_paths(self):
        return [self.state.scratch_dir]
The default hook returns no extra paths. paglets jobs top
recursively sums returned directories and regular files in addition to the
Paglets-owned work directory.
Initial Placement
Initial placement is a reusable scheduler decision plus a paglet-owned movement:
sequenceDiagram
participant Job as "job paglet on home"
participant Entry as "entry compute-slots"
participant A as "compute-slots A"
participant B as "compute-slots B"
participant C as "compute-slots C"
Job->>Entry: candidate_hosts(resource wishes)
Entry->>A: scheduler_status
Entry->>B: scheduler_status
Entry->>C: scheduler_status
A-->>Entry: limits, queue, free resources
B-->>Entry: limits, queue, free resources
C-->>Entry: limits, queue, free resources
Entry-->>Job: ranked candidates + selected candidate
Job->>Job: persist selected target
Job->>A: dispatch self to selected host
candidate_hosts first filters durable capability mismatches, such as missing
GPU support or too little total RAM. It also skips hosts with current status
errors, but those are treated as transient health problems rather than proof
that the host could never run the job. The remaining hosts are ranked, and the
scheduler provides a deterministic selected candidate from the best few
candidates. The paglet still performs the dispatch itself.
Local Queue Pass
Each scheduler pass is local-first:
flowchart TD
Tick["scheduler tick"] --> Expire["expire stale leases"]
Expire --> Sync["peer status sync"]
Sync --> Grant["grant local oldest-fit jobs"]
Grant --> MoreLocal{"another local job fits?"}
MoreLocal -->|"yes, within burst limits"| Grant
MoreLocal -->|"no"| Spill["bounded peer spillover"]
Spill --> Done["wait for next tick"]
Local grants reserve CPU cores, RAM, and temp storage immediately. That reservation happens before the operating system load average has time to rise.
Bounded Spillover Across Three Hosts
When Host A has a blocked queue and Host B or Host C has free capacity, Host A may redirect more than one waiting paglet. It does not empty its whole queue.
sequenceDiagram
participant AQueue as "Host A waiting queue"
participant ASlots as "compute-slots A"
participant BSlots as "compute-slots B"
participant CSlots as "compute-slots C"
participant Job1 as "waiting job 1"
participant Job2 as "waiting job 2"
BSlots-->>ASlots: peer status: empty queue, 2 slots free
CSlots-->>ASlots: peer status: empty queue, 1 slot free
ASlots->>ASlots: local-first grant pass
ASlots->>ASlots: compute redirect budget and keep local anchor
ASlots->>ASlots: project B/C capacity with shadow reservations
ASlots-->>Job1: compute_slot_redirect(target B)
ASlots->>AQueue: remove job 1 after delivery
Job1->>BSlots: dispatch self, then request_slot
ASlots-->>Job2: compute_slot_redirect(target C)
ASlots->>AQueue: remove job 2 after delivery
Job2->>CSlots: dispatch self, then request_slot
Scenario behavior:
- If Host A has five waiting jobs and Host B has room for three, Host A can redirect up to three jobs, subject to the global redirect cap.
- If Host A has five waiting jobs and Host B plus Host C have room for six, Host A still keeps at least one local waiting job and obeys max_redirects_per_tick.
- If Host A has exactly one waiting job and a peer has a fitting free slot, that one job may move; otherwise idle peer capacity would remain unused.
- If a waiting job can run on Host A now, Host A grants it locally instead of redirecting it.
- If a job was just redirected, the receiving host will not immediately redirect it again while the cooldown is active.
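One way to combine the local anchor, max_redirect_fraction, and max_redirects_per_tick into a per-pass budget is sketched below. The exact formula is not given in this document: rounding the fraction up is an assumption, chosen so that a single waiting job may still move and so that five waiting jobs yield a budget of three, matching the scenarios above. The function name is hypothetical.

```python
import math


def redirect_budget(waiting_count, max_redirects_per_tick=4, max_redirect_fraction=0.5):
    """Upper bound on redirects for one pass, before peer capacity is considered."""
    if waiting_count <= 0:
        return 0
    # Local anchor: with more than one waiting job, at least one stays behind.
    anchor_limit = waiting_count - 1 if waiting_count > 1 else 1
    # Assumption: the fraction is rounded up, so a single waiting job may still move.
    fraction_limit = math.ceil(waiting_count * max_redirect_fraction)
    return min(anchor_limit, fraction_limit, max_redirects_per_tick)
```

The budget is only an upper bound. The number of jobs actually redirected is further limited by projected peer capacity and by the redirect cooldown of each request.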
Placement Preflight
Before a raw protocol user moves to a compute host, it should call
candidate_hosts. ComputeJobPaglet performs this preflight automatically.
This initial round trip avoids moving to a host that can never satisfy the
request.
candidate_hosts actively asks peer schedulers for current status, then filters
hosts by “can ever satisfy” constraints:
- CPU jobs supported.
- GPU support if required.
- requested CPU cores no larger than the host's eligible logical CPU count.
- requested RAM no larger than host RAM.
- requested temp storage no larger than host work storage.
This check is deliberately different from “can run right now”. A host may be a valid candidate even when all slots are currently busy. A host with current status sampling errors is skipped for placement until it reports healthy status again, but the error is treated as transient health, not as durable capability.
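The difference between durable capability and current health can be sketched as two separate checks. HostLimits and both function names are hypothetical, and the "CPU jobs supported" flag is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class HostLimits:
    """Hypothetical durable limits reported by one scheduler."""

    name: str
    eligible_cpu_count: int
    total_memory_bytes: int
    work_storage_bytes: int
    has_gpu: bool = False
    status_error: str = ""


def can_ever_satisfy(host, cpu_cores, memory_bytes, temp_storage_bytes, requires_gpu=False):
    """Durable capability check; ignores how busy the host currently is."""
    return (
        cpu_cores <= host.eligible_cpu_count
        and memory_bytes <= host.total_memory_bytes
        and temp_storage_bytes <= host.work_storage_bytes
        and (host.has_gpu or not requires_gpu)
    )


def placement_candidates(hosts, **request):
    """Keep capable hosts, skipping those with a current status error."""
    return [h.name for h in hosts if not h.status_error and can_ever_satisfy(h, **request)]
```

A host that fails can_ever_satisfy will never become a candidate for this request, while a host skipped for a status error becomes eligible again as soon as it reports healthy status.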
Slot Request Decisions
After arriving on a candidate host, the paglet calls request_slot on the local
scheduler. The reply has one of four decisions:
| Decision | Meaning | Expected paglet behavior |
|---|---|---|
| run_now | A lease was created immediately. | Start compute. ComputeJobPaglet releases automatically; raw protocol users must call release_slot. |
| sleep | The request is queued locally. | Deactivate with activate_on_message=True. |
| redirect | Another scheduler is a better bounded-spillover target. | Dispatch to the target host and request again. |
| rejected | No valid local/peer path exists. | Fail or return home and notify the user. |
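For raw protocol users, the four decisions map onto the lifecycle states from the state diagram earlier in this page. The sketch below shows that mapping only; the Decision enum and next_step function are hypothetical, and the real reply object carries more fields than the decision string.

```python
from enum import Enum


class Decision(str, Enum):
    """The four request_slot decisions named in this document."""

    RUN_NOW = "run_now"
    SLEEP = "sleep"
    REDIRECT = "redirect"
    REJECTED = "rejected"


def next_step(decision):
    """Map a request_slot decision to the lifecycle state the paglet enters next."""
    steps = {
        Decision.RUN_NOW: "Running",
        Decision.SLEEP: "Waiting",
        Decision.REDIRECT: "MoveToCandidate",
        Decision.REJECTED: "Failed",
    }
    # Decision(...) raises ValueError for an unknown decision string.
    return steps[Decision(decision)]
```

Failing loudly on an unknown decision is a deliberate choice in this sketch: silently treating it as sleep could leave a paglet waiting for a wakeup that never arrives.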
Queued paglets are woken by normal messages. The scheduler does not directly dispatch inactive paglets because the runtime requires paglets to be active for dispatch. Instead, it sends:
- compute_slot_granted when local resources become available.
- compute_slot_redirect when bounded work stealing selects a peer target.
Delivery is part of the queue transition. A queued request is removed from the local backlog only after the scheduler successfully submits the grant or redirect message to the paglet. If grant delivery fails, the lease is rolled back and the request remains queued for a later pass.
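The delivery rule can be sketched as a reserve, deliver, then commit-or-roll-back sequence. The data structures here are simplified stand-ins (a list of request IDs and a set of leased IDs), and grant_with_delivery and the deliver callable are hypothetical.

```python
def grant_with_delivery(queue, leases, request_id, deliver):
    """Create a lease, try to deliver the grant, and roll back on failure.

    queue is a list of waiting request IDs, leases a set of leased request IDs,
    and deliver a callable that returns True when the message was submitted.
    """
    leases.add(request_id)  # reserve first so capacity counts immediately
    try:
        delivered = bool(deliver(request_id))
    except Exception:
        delivered = False  # a transport error counts as failed delivery
    if delivered:
        queue.remove(request_id)  # dequeue only after successful submission
    else:
        leases.discard(request_id)  # roll back; the request stays queued
    return delivered
```

Ordering the steps this way means a request is never lost between the queue and the lease table: at any moment it is either still queued or holds a lease whose grant message was accepted for delivery.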
Queue And Lease Semantics
The scheduler queue is in-memory. It is intentionally not durable in v1. If a
host restarts, queued requests and leases are lost, and job-specific recovery is
handled by ComputeJobPaglet: jobs that were deactivated in WAITING_FOR_SLOT
activate on host startup and submit their slot request again. Raw request_slot
users must provide their own restart recovery if they choose to deactivate while
waiting.
By default, ComputeJobPaglet also treats a host shutdown during RUNNING as a
restartable compute attempt. The job stores its submitted compute state when it
is created. During graceful host shutdown, a running job is deactivated with
activate_on_startup=True and its persisted state is reset to that submitted
state, so after startup it requests a fresh slot before rerunning compute. Set
restart_running_on_host_startup=False on the job state to opt out and keep the
normal inactive lifecycle chosen by the deactivation request.
Queue processing is local-first and then oldest-fit:
- Inspect queued requests in submission order.
- Skip requests that cannot currently fit locally.
- Grant the first request that fits.
- Reserve its CPU cores, allocated CPU IDs when supported, RAM, and temp storage immediately.
- Send compute_slot_granted.
- Remove the request from the queue only after successful message delivery.
- Repeat only while burst headroom remains and max_grants_per_tick is not reached.
- Consider bounded peer spillover only after the local grant pass.
This keeps older requests favored while still allowing smaller jobs to use idle capacity when a large older job cannot currently fit.
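The oldest-fit scan can be sketched in a few lines. This is an illustration with simplified inputs (tuples instead of request objects, two resources instead of three), and it leaves out burst throttling and message delivery; oldest_fit_pass is a hypothetical name.

```python
def oldest_fit_pass(queue, free_cores, free_memory, max_grants):
    """Scan requests in submission order and grant those that currently fit.

    queue is a list of (name, cpu_cores, memory_bytes) tuples, oldest first.
    Returns the granted names in grant order.
    """
    granted = []
    for name, cores, memory in queue:
        if len(granted) >= max_grants:
            break
        if cores <= free_cores and memory <= free_memory:
            # Reserve immediately so the next candidate sees reduced capacity.
            free_cores -= cores
            free_memory -= memory
            granted.append(name)
    return granted
```

In the usage below, the oldest request needs eight cores and is skipped because only four are free, so the two younger, smaller requests are granted instead of leaving the capacity idle.

```python
queue = [("big", 8, 32), ("small", 1, 2), ("medium", 2, 4), ("tiny", 1, 1)]
print(oldest_fit_pass(queue, free_cores=4, free_memory=16, max_grants=2))
# prints ['small', 'medium']
```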
Leases are local reservations. ComputeJobPaglet calls release_slot
automatically after run_compute_job() finishes or fails. Raw request_slot
users must call release_slot themselves. The scheduler also checks whether the
leased paglet still has an active local host record; if its child process has
exited or the paglet is otherwise definitely no longer active, the lease is
removed and the reserved CPU/RAM/storage capacity becomes available again.
Lease TTLs are a stale-record safety net, not permission to release resources
for an active job. If a lease expires while the local job record is still active,
the scheduler keeps the reservation and extends the lease instead of admitting
new jobs into the same resource budget.
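The lease sweep described above can be sketched as follows. The data shapes are simplified (a mapping from agent ID to expiry timestamp), sweep_leases is a hypothetical name, and extension_seconds is a placeholder because the real extension policy is not specified in this document.

```python
def sweep_leases(leases, active_agent_ids, now, extension_seconds=60.0):
    """Extend expired leases of active jobs; drop leases whose job is gone.

    leases maps agent_id to an expiry timestamp. extension_seconds is a
    hypothetical value chosen for illustration.
    """
    kept = {}
    for agent_id, expires_at in leases.items():
        if agent_id not in active_agent_ids:
            continue  # job is definitely gone: free the reservation
        if expires_at <= now:
            expires_at = now + extension_seconds  # still running: keep and extend
        kept[agent_id] = expires_at
    return kept
```

The key property is that expiry alone never frees capacity. Only the absence of an active local job record does.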
Startup Throttling
Actual CPU and memory use often rises gradually after a job starts. To avoid launch storms, the scheduler throttles grants:
- It evaluates the queue every few seconds.
- After a normal grant, it waits before granting the next normal job.
- It can grant several jobs in one tick only when max_grants_per_tick, burst_resource_headroom_factor, and burst_load_per_cpu still allow it.
Reserved resources count immediately, even before the operating system load average catches up.
With the explicit core reservation model, CPU cores are the hard admission counter and load is only a guard for burst behavior. For example, a host with eight eligible CPUs and four active one-core leases reports four free CPU cores, even if the operating system load average has not yet reached four.
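A burst guard built from the two burst settings might look like the sketch below. How the scheduler actually combines load and headroom is not spelled out in this document, so the comparison rules are assumptions, and burst_grant_allowed is a hypothetical name. The max_grants_per_tick cap is enforced separately and is not modeled here.

```python
def burst_grant_allowed(
    load_average,
    cpu_count,
    free,
    requested,
    burst_load_per_cpu=0.5,
    burst_resource_headroom_factor=2.0,
):
    """Decide whether a further grant may follow in the same scheduler tick.

    free and requested are dicts with the same resource keys, for example
    cpu_cores, memory_bytes, and temp_storage_bytes.
    """
    # Load is only a guard: too much load per CPU blocks burst grants.
    if cpu_count <= 0 or load_average / cpu_count > burst_load_per_cpu:
        return False
    # Assumption: every free resource must cover the request times the factor.
    return all(
        free[key] >= requested[key] * burst_resource_headroom_factor for key in requested
    )
```

The guard applies only to grants after the first one in a tick. The first grant is governed by the hard reservation counters alone.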
Elastic affinity does not change the admission counter. A job may temporarily
receive more CPU IDs than it requested, but only its requested cpu_cores are
reserved against future admission. If another job arrives, the scheduler can
recompute masks and reclaim elastic CPUs without violating any lease's
guaranteed floor.
Compared With Cluster Job Submission
compute-slots is intentionally different from classic cluster schedulers such
as PBS, Slurm, LSF, or HTCondor. Those systems are mature production systems
that have been used for decades. They normally submit a job description to a
central or federated scheduler, stage inputs, launch a process on an allocated
worker node, and collect logs or output files after the process exits.
Paglets are not presented here as a replacement for those systems. In this project they are a research prototype and design experiment for investigating a different way to structure distributed work: make the work unit a mobile object with state, behavior, lifecycle, and messaging, then let resident services help that object find safe compute capacity.
Paglet scheduling keeps the unit of work as a mobile object:
- The paglet carries state, resource wishes, home-host information, result handling policy, and retry/notification logic.
- The resident scheduler grants or redirects compute slots, but the paglet dispatches itself, deactivates, wakes, runs, stores intermediate results, and decides where final results go.
- A laptop or intermittent home host can start work, go offline, and later receive returning result paglets without owning the compute process lifetime.
This makes paglets a better fit when the application benefits from explicit mobile state and application-specific result handling:
| Aspect | compute-slots paglets | Classic cluster submission |
|---|---|---|
| Work unit | Mobile paglet with dataclass state and methods. | Submitted job script or job spec. |
| Scheduler role | Local resident admission, leases, queueing, peer spillover. | Usually central queue, allocation, launch, accounting. |
| Movement | Paglet dispatches itself between hosts. | Scheduler starts a process on an allocated node. |
| Result handling | Application-specific; paglet can return home, wait, notify, or store locally. | Usually stdout/stderr, files, artifacts, or workflow-managed outputs. |
| Home host offline | Natural fit if paglet stores results remotely and returns later. | Usually handled by shared storage, submit host spool, or scheduler accounting. |
| Adding servers live | New paglet hosts can join the mesh and advertise compute-slots; existing schedulers learn them through peer status sync. | New worker nodes can be added live when the cluster manager, node daemon, queues, and node state are configured. |
| Operational scope | Lightweight mesh of paglet hosts. | Dedicated cluster services, worker daemons, queues, policies. |
| Fairness/accounting | Minimal in v1; cooperative estimates and local policy. | Mature priority, quotas, reservations, preemption, accounting. |
| Failure recovery | Mostly application-specific in v1. | Mature retry/requeue and node-failure handling, depending on system. |
The difference is not that PBS-style jobs cannot communicate or create more processes. A batch job can open network connections, launch child processes, use MPI, interact with shared storage, or participate in a workflow engine. The conceptual difference is what the system makes first-class.
Cluster schedulers put these topics at the foreground:
- durable queues and central or federated resource allocation;
- multi-user fairness, quotas, priorities, accounting, and administrative policy;
- node states, partitions, reservations, preemption, and operational control;
- reproducible batch execution where a submitted script or job spec becomes a process on an allocated node.
Paglets put different topics at the foreground:
- mobile state: the compute object carries its own serializable progress, target, home-host, and result metadata;
- autonomy: the job object can decide to move, wait, return, notify, retry, or split work into helper paglets;
- object messaging: jobs can communicate with resident services and other paglets using the same runtime model;
- intermittent topology: a starter laptop can go offline while job paglets wait remotely and return later;
- application-owned result handling: the paglet can choose whether to store intermediate payloads, return home, append to a database, or notify a user.
Pros of the paglet approach:
- Good for small meshes, intermittently connected starter hosts, and jobs whose result handling is part of the application logic.
- The job can carry rich Python state instead of relying only on command-line arguments, files, and environment variables.
- The same object can wait, move, resume, notify, and commit results.
- A job can spawn or coordinate additional paglets when the application wants a tree of cooperating mobile work units.
- Data locality can be handled by application logic: a paglet can move toward cached inputs, keep intermediate payloads near the compute host, or return only summaries.
- Resident services can be reused by many paglet types without turning result collection into a global scheduler feature.
- New compute hosts can be added while the mesh is running; once they advertise
compute-slots, pending paglets can discover them through candidate preflight or peer status sync.
Cons compared with cluster schedulers:
- It is not a replacement for a production HPC batch scheduler.
- v1 has in-memory queues and leases, not durable global scheduling state.
- Fair-share, user quotas, backfill scheduling, preemption, reservations, multi-tenant accounting, and node failure recovery are out of scope.
- Resource estimates are cooperative; a misbehaving job can still exceed its estimate unless the host environment enforces limits separately.
- Live host addition is opportunistic: existing queued paglets learn about new capacity through status sync and redirects, not through a central global reschedule of all queued work.
- The programming model is less established than script-based batch jobs and requires applications to be designed around serializable mobile state.
Classic cluster schedulers can also add worker nodes live, and they are usually stronger when this must be governed by partitions, reservations, node states, fair-share policy, and accounting. The paglet approach is lighter: a new host joins the mesh, starts the resident services, and becomes eligible for future placement or bounded spillover once other schedulers see its status.
The intended boundary is therefore: use compute-slots to explore mobile
paglets as an alternative structure for coarse distributed workflows, especially
when mobile state, messaging, intermittent hosts, and application-specific
result handling are central to the problem. Use PBS/Slurm-style systems when
you need mature production cluster operations, strict multi-user policy, and
durable centralized scheduling.
CLI
Inspect scheduler state:
uv run paglets jobs status
uv run paglets jobs queue
uv run paglets jobs ps --active
uv run paglets jobs why
uv run paglets jobs top
status prints aggregate host capacity including waiting job count and active
lease count. queue adds queued requests with declared CPU core, memory, and
temp-storage reservations. why explains which resource is preventing queued
requests from running. ps --active lists active compute job paglets. top
adds process-tree RSS, Paglets work-dir usage, application-provided extra work
usage, sampled maxima, and sample counts.
Check candidate hosts for a resource request:
uv run paglets jobs hosts --cores 2 --mem 4G --disk 1G
uv run paglets jobs hosts --tag linux --prefer-tag gpu --exclude-host laptop
Remove queued scheduler requests and inactive waiting jobs:
uv run paglets jobs rm --job JOB_ID --dry-run
uv run paglets jobs rm --agent agent-abc --force
uv run paglets jobs rm --all --force
rm removes matching queued slot requests from the scheduler and disposes
matching inactive waiting compute job paglets. Match by agent ID, job ID, or use
--all. Destructive removal requires --force; use --dry-run to preview
matches. Active leases and running job processes are intentionally not removed
by this command.
List or clear compute job paglets:
uv run paglets jobs ps --inactive
uv run paglets jobs ps --inactive --state WAITING_FOR_SLOT
uv run paglets jobs rm --state WAITING_FOR_SLOT --dry-run
uv run paglets jobs rm --state WAITING_FOR_SLOT --force
uv run paglets jobs history --limit 20
ps uses the host admin API to inspect active or inactive paglets whose
state includes compute_status. If neither --active nor --inactive is
provided, both active and inactive jobs are listed. rm disposes
inactive compute job paglets only; by default it targets WAITING_FOR_SLOT jobs
and requires --force unless --dry-run is used. history prints the
recent finished-job usage summaries retained by the local scheduler, including
runtime, finish reason, job class, max CPU, max process-tree RSS, and max disk
usage.
Inspect compute job groups:
For shared or relayed deployments, set PAGLETS_API_KEY for the host and
compute-slot commands. Use --api-key-env NAME only when the key lives in a
different environment variable.