Project Manifest

computalot.project.json is the runtime contract for projects. Put it at the tarball root next to your Dockerfile and code.

The smallest valid manifest is:

{
  "version": 1,
  "runtime": {
    "kind": "oci",
    "sandbox": "gvisor",
    "workdir": "/workspace"
  },
  "entrypoint": {
    "command": ["python", "job.py"]
  }
}

When the manifest is present, Computalot builds a container image from your tarball and runs tasks in a sandboxed environment.

File Location

  • Tarball root: computalot.project.json

Required Fields

version

  • Positive integer, currently 1

runtime

  • kind: oci
  • workdir: absolute in-container working directory (e.g. /workspace)
  • sandbox: gvisor (required for public execution)

entrypoint

  • command: non-empty array of strings
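
The required fields above can be checked locally before a push. The following is a minimal sketch, not Computalot's own validator: the function name check_manifest is hypothetical, the messages mirror the push errors listed on this page, and the sandbox check assumes you are targeting public execution.

```python
import json
from pathlib import PurePosixPath


def check_manifest(manifest: dict) -> list[str]:
    """Return problems found in the required fields (empty list if none)."""
    problems = []

    version = manifest.get("version")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(version, int) or isinstance(version, bool) or version < 1:
        problems.append("version must be a positive integer")

    runtime = manifest.get("runtime")
    if not isinstance(runtime, dict):
        runtime = {}
    # The push error text on this page names both "tarball" and "oci".
    if runtime.get("kind") not in ("tarball", "oci"):
        problems.append("runtime.kind must be tarball or oci")
    workdir = runtime.get("workdir")
    if not isinstance(workdir, str) or not PurePosixPath(workdir).is_absolute():
        problems.append("runtime.workdir must be an absolute path")
    # Assumption: the manifest targets public execution, which requires gvisor.
    if runtime.get("sandbox") != "gvisor":
        problems.append("runtime.sandbox must be gvisor for public execution")

    entrypoint = manifest.get("entrypoint")
    command = entrypoint.get("command") if isinstance(entrypoint, dict) else None
    is_string_array = (
        isinstance(command, list)
        and len(command) > 0
        and all(isinstance(part, str) for part in command)
    )
    if not is_string_array:
        problems.append("entrypoint.command must be a non-empty string array")

    return problems


if __name__ == "__main__":
    with open("computalot.project.json", encoding="utf-8") as handle:
        for problem in check_manifest(json.load(handle)):
            print(problem)
```

Run it from the tarball root; no output means the required fields look well-formed. It does not replace push-time validation.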

Common Optional Fields

build

Configure how Computalot builds your container image.

{
  "build": {
    "dockerfile": "Dockerfile",
    "context": ".",
    "target": "runtime",
    "args": {
      "PYTHON_VERSION": "3.11"
    }
  }
}

If build.dockerfile is omitted, Computalot defaults to a root Dockerfile. If that file is missing, the push fails.

validation

Declarative preflight checks that run before tasks.

{
  "validation": {
    "executables": ["python"],
    "files": ["job.py"],
    "commands": [
      {
        "label": "numpy import",
        "command": ["python", "-c", "import numpy"]
      }
    ]
  }
}
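
To get a feel for what these checks do, you can approximate them locally. This is a rough sketch under assumptions: run_preflight is a hypothetical helper, it runs on your machine rather than inside the built image, and the platform's actual semantics (search path, working directory, timeouts) may differ.

```python
import shutil
import subprocess
from pathlib import Path


def run_preflight(validation: dict, root: str = ".") -> list[str]:
    """Approximate the declarative checks; return a list of failures."""
    failures = []

    # "executables": each name should resolve on PATH.
    for exe in validation.get("executables", []):
        if shutil.which(exe) is None:
            failures.append(f"executable not found: {exe}")

    # "files": each path should exist relative to the project root (assumed).
    for rel in validation.get("files", []):
        if not (Path(root) / rel).is_file():
            failures.append(f"file not found: {rel}")

    # "commands": each command should exit with status 0.
    for check in validation.get("commands", []):
        label = check.get("label", " ".join(check["command"]))
        try:
            result = subprocess.run(check["command"], cwd=root, capture_output=True)
        except OSError as exc:
            failures.append(f"command could not start: {label} ({exc})")
            continue
        if result.returncode != 0:
            failures.append(f"command failed: {label}")

    return failures
```

An empty list means every check passed in the local environment; a container build can still behave differently.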

runtime.init

Extra commands that run during project init after the base setup.

{
  "runtime": {
    "kind": "oci",
    "sandbox": "gvisor",
    "workdir": "/workspace",
    "init": {
      "commands": [
        {
          "label": "sync deps",
          "command": ["uv", "sync"],
          "cwd": ".",
          "env": {
            "UV_LINK_MODE": "copy"
          }
        }
      ]
    }
  }
}

runtime.services

Helper processes such as local model servers or sidecars.

{
  "runtime": {
    "kind": "oci",
    "sandbox": "gvisor",
    "workdir": "/workspace",
    "services": [
      {
        "name": "embedder",
        "command": ["python", "-m", "embedder"],
        "scope": "task",
        "restart": "on_failure",
        "healthcheck": {
          "command": ["python", "-c", "print(\"ok\")"]
        }
      }
    ]
  }
}

Service fields: name, command, scope (project or task), restart (never, on_failure, always), cwd, env, ready_timeout_s, healthcheck.

Tasks receive service metadata through COMPUTALOT_RUNTIME_SERVICES_JSON and COMPUTALOT_SERVICE_<NAME>_* env vars.
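
A task can read that metadata with nothing but the standard library. The sketch below makes two assumptions that this page does not confirm: the JSON payload's schema is treated as opaque, and <NAME> is taken to be the service name upper-cased with non-alphanumeric characters replaced by underscores. Both helper names are hypothetical.

```python
import json
import os
import re


def load_service_metadata(environ=os.environ):
    """Parse COMPUTALOT_RUNTIME_SERVICES_JSON; return None when it is unset."""
    raw = environ.get("COMPUTALOT_RUNTIME_SERVICES_JSON")
    return json.loads(raw) if raw else None


def service_env(name: str, environ=os.environ) -> dict:
    """Collect the COMPUTALOT_SERVICE_<NAME>_* variables for one service.

    Assumption: <NAME> is the manifest name upper-cased, with every
    non-alphanumeric character replaced by an underscore.
    """
    key = re.sub(r"[^A-Za-z0-9]", "_", name).upper()
    prefix = f"COMPUTALOT_SERVICE_{key}_"
    # Strip the prefix so callers see only the per-service suffixes.
    return {k[len(prefix):]: v for k, v in environ.items() if k.startswith(prefix)}


if __name__ == "__main__":
    print(load_service_metadata())
    print(service_env("embedder"))
```

Printing both values once inside a real task is the quickest way to see which keys your services actually expose.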

requirements

Project-level placement defaults merged into job routing.

{
  "requirements": {
    "profile": "gpu",
    "gpu_count": 1,
    "gpu_memory_mb": 24576,
    "cpu": 8,
    "memory_mb": 16384,
    "storage_gb": 40
  }
}

storage_gb should reflect the disk headroom the worker actually needs, not just the size of the input data. For sandboxed OCI workloads that usually means budgeting for the runtime/image footprint, per-task sandbox copy overhead, writable caches, temp files, checkpoints, and any runtime downloads.
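
One way to arrive at a number is to add up those components and leave a margin. The figures below are made-up placeholders for illustration, and the 20% headroom is an arbitrary assumption rather than a platform rule.

```python
import math


def estimate_storage_gb(components_gb: dict, headroom_pct: int = 20) -> int:
    """Sum the per-component estimates and round up with extra headroom."""
    total = sum(components_gb.values())
    return math.ceil(total * (100 + headroom_pct) / 100)


# Hypothetical sizes in GB; measure your own image and caches.
example = {
    "image_footprint": 12,
    "sandbox_copy_overhead": 12,
    "writable_caches": 8,
    "temp_files": 2,
    "checkpoints": 4,
    "runtime_downloads": 0,
}

if __name__ == "__main__":
    print(estimate_storage_gb(example))
```

With these placeholder sizes the components sum to 38 GB, which rounds up to 46 GB after headroom.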

cache_mounts

Managed writable caches mounted into the runtime.

{
  "cache_mounts": [
    {
      "name": "hf-cache",
      "scope": "project_digest",
      "path": "/cache/huggingface",
      "class": "model"
    },
    {
      "name": "pip-cache",
      "scope": "project_digest",
      "path": "/cache/pip",
      "class": "pip"
    }
  ]
}

Fields: name, scope (currently project_digest), path (absolute), max_bytes, class (pip, cargo, model, data), seed_from_image (boolean).

Cache mounts persist per worker and per project version. Use COMPUTALOT_CACHE_<NAME>_DIR env vars when accessing caches outside runtime.workdir.

Use cache mounts for writable runtime state that your code populates at startup or during the task:

  • package caches such as pip
  • Hugging Face runtime caches such as HF_HOME or TRANSFORMERS_CACHE
  • model/data caches created by your own code at runtime

A cache mount replaces the image contents at that path. Use seed_from_image: true if you need the image’s baked files to survive the first mount.
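
In task code, a cache location can be resolved from the environment with a fallback to the declared path. This is a sketch under an assumption the page does not spell out: that <NAME> in COMPUTALOT_CACHE_<NAME>_DIR is the mount name upper-cased with non-alphanumeric characters turned into underscores (so hf-cache would become HF_CACHE). The helper name cache_dir is hypothetical.

```python
import os
import re


def cache_dir(name: str, declared_path: str, environ=os.environ) -> str:
    """Return the cache directory for a mount, preferring the env var.

    Assumption: the variable is COMPUTALOT_CACHE_<NAME>_DIR where <NAME>
    is the mount name upper-cased with non-alphanumerics as underscores.
    """
    key = re.sub(r"[^A-Za-z0-9]", "_", name).upper()
    return environ.get(f"COMPUTALOT_CACHE_{key}_DIR", declared_path)


if __name__ == "__main__":
    # Point Hugging Face libraries at the managed cache before importing them.
    os.environ.setdefault("HF_HOME", cache_dir("hf-cache", "/cache/huggingface"))
    print(os.environ["HF_HOME"])
```

Falling back to the declared path keeps the same script usable in a local container where the variable is not set.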

data_sources

Declarative external inputs fetched before task launch.

{
  "data_sources": [
    {
      "name": "weights",
      "source": "huggingface",
      "uri": "hf://org/model-name",
      "delivery": "mount",
      "path": "/workspace/models/model-name",
      "cache": "hf-cache",
      "required": true
    }
  ]
}

Use data sources for immutable inputs that Fleet should prepare before your code starts, such as model weights or reference datasets.

For Hugging Face, delivery: "mount" uses the worker-managed hf-mount path. This applies only to Hugging Face sources declared here in the manifest. If your runner script downloads from Hugging Face on its own, hf-mount is not used automatically; either declare the source here or add a writable Hugging Face cache mount for those runtime downloads.

Long ML Jobs

For long-running ML and evaluation workloads:

  • use data_sources for immutable large inputs such as model weights and reference datasets
  • use cache_mounts for writable runtime caches such as HF_HOME, TRANSFORMERS_CACHE, or package caches
  • declare realistic requirements.storage_gb; PyTorch/CUDA/Hugging Face stacks often need tens of GB of free worker disk before checkpoints or datasets
  • enable checkpointing.resume_from_latest on jobs and emit durable checkpoints throughout execution
  • write checkpoints and outputs to $COMPUTALOT_ARTIFACT_DIR, not repo-relative folders
  • do not assume runtime-downloaded models are reused unless you declared a matching cache mount or manifest data source
  • keep runtime and dev environments separate; avoid installing notebook/lint/test extras onto production workers unless the task actually needs them
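
The checkpointing advice above might look like this in a job script. It is a minimal sketch with assumptions labeled: it stores state as JSON for simplicity (a real training job would save model tensors), the file naming scheme is hypothetical, and it assumes earlier checkpoints are visible under $COMPUTALOT_ARTIFACT_DIR when a job resumes, which this page does not confirm.

```python
import json
import os
from pathlib import Path


def artifact_dir(environ=os.environ) -> Path:
    """Directory for durable outputs; ./artifacts is a local-only fallback."""
    return Path(environ.get("COMPUTALOT_ARTIFACT_DIR", "artifacts"))


def save_checkpoint(directory: Path, step: int, state: dict) -> Path:
    """Write a checkpoint via a temp file so readers never see a partial file."""
    directory.mkdir(parents=True, exist_ok=True)
    final = directory / f"step-{step:08d}.json"
    tmp = directory / (final.name + ".tmp")
    tmp.write_text(json.dumps({"step": step, "state": state}))
    os.replace(tmp, final)  # atomic rename within one filesystem
    return final


def load_latest_checkpoint(directory: Path):
    """Return the newest checkpoint payload, or None when starting fresh."""
    # Zero-padded step numbers make lexicographic order match numeric order.
    candidates = sorted(directory.glob("step-*.json"))
    if not candidates:
        return None
    return json.loads(candidates[-1].read_text())


if __name__ == "__main__":
    ckpt_dir = artifact_dir() / "ckpt"
    latest = load_latest_checkpoint(ckpt_dir)
    start = latest["step"] + 1 if latest else 0
    for step in range(start, start + 3):
        save_checkpoint(ckpt_dir, step, {"loss": 1.0 / (step + 1)})
```

Writing to a temporary name and renaming means an interrupted task leaves either the previous checkpoint or a complete new one, never a truncated file.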

artifacts

Named outputs and upload declarations.

{
  "artifacts": {
    "upload": [
      {"name": "report", "path": "report.json", "required": true}
    ],
    "outputs": [
      {"name": "checkpoint", "path": "ckpt/latest.pt", "type": "checkpoint"}
    ]
  }
}

Relative paths are resolved under $COMPUTALOT_ARTIFACT_DIR.

Filesystem Rules

  • The container filesystem is read-only during task execution
  • Use $COMPUTALOT_TASK_SCRATCH_DIR or $TMPDIR for temporary files
  • Use $COMPUTALOT_ARTIFACT_DIR for checkpoints and outputs
  • Managed cache mounts are writable at their declared paths
  • Do not assume repo-relative paths like checkpoints/ are writable
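
Following these rules, temporary files can be routed to the scratch directory explicitly instead of relying on the current working directory. A small sketch; the helper names are hypothetical, and the fallback to the system temp directory is only there for local runs.

```python
import os
import tempfile


def scratch_dir(environ=os.environ) -> str:
    """Pick a writable temp location, preferring the task scratch directory."""
    for var in ("COMPUTALOT_TASK_SCRATCH_DIR", "TMPDIR"):
        value = environ.get(var)
        if value:
            return value
    # Local-only fallback when neither variable is set.
    return tempfile.gettempdir()


def write_scratch_file(data: bytes, suffix: str = ".tmp", environ=os.environ) -> str:
    """Write bytes to a new uniquely named file in the scratch directory."""
    fd, path = tempfile.mkstemp(suffix=suffix, dir=scratch_dir(environ))
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

Anything that must outlive the task belongs in $COMPUTALOT_ARTIFACT_DIR instead.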

Path Rules

  • runtime.workdir and cache mount path values must be absolute
  • build.dockerfile, build.context, and command cwd values must be relative
  • Push-time validation rejects missing files, directories, cache names, or invalid paths
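
The path rules above can be checked before a push with a few lines of Python. This is a sketch, not the platform's validator: check_paths is a hypothetical name, only init commands are inspected for cwd, and the cross-check that a data source's cache refers to a declared cache mount name is an inference from the examples on this page.

```python
from pathlib import PurePosixPath


def is_abs(path) -> bool:
    return isinstance(path, str) and PurePosixPath(path).is_absolute()


def is_rel(path) -> bool:
    return isinstance(path, str) and path != "" and not PurePosixPath(path).is_absolute()


def check_paths(manifest: dict) -> list[str]:
    """Return path-rule violations found in a parsed manifest."""
    errors = []
    runtime = manifest.get("runtime", {})
    mounts = manifest.get("cache_mounts", [])

    # Absolute: runtime.workdir and every cache mount path.
    if not is_abs(runtime.get("workdir")):
        errors.append("runtime.workdir must be an absolute path")
    for mount in mounts:
        if not is_abs(mount.get("path")):
            errors.append(f"cache mount {mount.get('name')!r}: path must be absolute")

    # Relative: build.dockerfile, build.context, and command cwd values.
    build = manifest.get("build", {})
    for key in ("dockerfile", "context"):
        if key in build and not is_rel(build[key]):
            errors.append(f"build.{key} must be a relative path")
    for command in runtime.get("init", {}).get("commands", []):
        if "cwd" in command and not is_rel(command["cwd"]):
            errors.append(f"init command {command.get('label')!r}: cwd must be relative")

    # Assumption: a data source's "cache" names an entry in cache_mounts.
    names = {mount.get("name") for mount in mounts}
    for source in manifest.get("data_sources", []):
        if "cache" in source and source["cache"] not in names:
            errors.append(f"data source {source.get('name')!r}: unknown cache {source['cache']!r}")

    return errors
```

It does not check that the referenced files and directories exist in the tarball; push-time validation still covers that.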

Common Push Errors

  • version must be a positive integer
  • runtime.kind must be tarball or oci
  • runtime.workdir must be an absolute path
  • entrypoint.command must be a non-empty string array
  • build.dockerfile does not exist in the tarball
  • build.context does not resolve to a directory