Compute cluster — GPU boxes for PDF → Markdown conversion
Which box to target for a marker run, and what each can handle. Probed over SSH on
2026-06-23. Conversion how-to + install: pipeline/docs_ws.md;
bring up a new box with pipeline/install_marker.sh.
All three are the user’s own NVIDIA machines, reached by SSH alias (ssh workstation / pc1 / pc2).
marker is memory-bound on the GPU, so VRAM is the binding constraint — prefer the box with the most
free VRAM for the biggest job, and convert long documents one chapter at a time to cap peak usage.
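
One way to apply the "most free VRAM" rule is to ask each box directly before dispatching. The following is a minimal sketch, assuming the SSH aliases above and that nvidia-smi is on each box's PATH; the helper names probe_free_vram and pick_box are hypothetical and not part of the pipeline.

```shell
# Print "<alias> <free MiB>" for every box whose GPU answers.
# Boxes that are unreachable or whose driver is down are skipped.
probe_free_vram() {
  for box in "$@"; do
    # memory.free is a standard nvidia-smi query field; nounits yields a bare MiB number.
    free=$(ssh -n -o ConnectTimeout=5 "$box" \
      nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -n 1)
    case $free in
      ''|*[!0-9]*) ;;              # empty or non-numeric reply: leave this box out
      *) echo "$box $free" ;;
    esac
  done
}

# Read "<alias> <free MiB>" lines on stdin and print the alias with the most free VRAM.
pick_box() {
  sort -t' ' -k2,2nr | head -n 1 | cut -d' ' -f1
}
```

Running `probe_free_vram workstation pc1 pc2 | pick_box` prints a single alias. While pc2's driver is not loaded it simply drops out of the ranking.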
The boxes (VRAM-ranked: target top-down)
| Alias | Hostname | CPU | Threads | RAM | GPU | VRAM | Driver | system py | marker |
|---|---|---|---|---|---|---|---|---|---|
| workstation | (WSL2) | i7-13620H | 16 | 15 GiB | RTX 5060 Laptop | 8 GB | 581.57 | 3.10.12 | ✅ installed, verified working |
| pc1 | toni | Ryzen 7 1700 | 16 | 31 GiB | GTX 1660 SUPER | 6 GB | 580.159.03 | 3.12.3 | ⬜ run install_marker.sh |
| pc2 | toni-toni-laptop | i7-9750H | 12 | 62 GiB | GTX 1650 Mobile (TU117M) | ~4 GB | ⚠️ not loaded | 3.10.12 | ⛔ fix driver first |
Notes & gotchas
- workstation is the default for convert_remote.sh (and the only box currently set up). 8 GB VRAM
  handles a single textbook chapter (≤ ~70 pp) comfortably.
- pc1 is ready to enlist; it just needs install_marker.sh. Its 1660 SUPER (6 GB) is fine for
  per-chapter conversion. Note system Python is 3.12, not 3.10: marker-pdf supports it, but it’s a
  different interpreter than the workstation’s, so keep the install user-site (the script handles this).
- pc2 is GPU-blocked: nvidia-smi can’t reach the driver (NVIDIA-SMI has failed because it couldn't
  communicate with the NVIDIA driver). The hardware is there (lspci sees the TU117M /
  GTX 1650 Mobile) and the box is otherwise the strongest on RAM (62 GiB), but it can’t run GPU-marker
  until the driver is reinstalled/loaded. Until then, use it only as a CPU fallback (slow) or leave it out.
- VRAM, not RAM, gates marker. pc2’s 62 GiB RAM doesn’t help a 4 GB GPU; pc1’s 6 GB GPU outranks it for
  this workload despite less system RAM.
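
The pc2 diagnosis above (hardware visible to lspci, driver unreachable from nvidia-smi) can be reproduced for any box with two probes. This is a rough sketch, assuming lspci and nvidia-smi exist on the remote side; gpu_state is a hypothetical helper name.

```shell
# Classify a box's GPU state from two probes run over SSH.
# Prints one of: ready | driver-missing | no-gpu
gpu_state() {
  box=$1
  if ssh -n "$box" nvidia-smi >/dev/null 2>&1; then
    echo ready                      # driver loaded and talking to the GPU
  elif ssh -n "$box" lspci 2>/dev/null | grep -qi nvidia; then
    echo driver-missing             # card on the PCI bus, but no working driver
  else
    echo no-gpu                     # no NVIDIA device seen, or box unreachable
  fi
}
```

With the state recorded in the table, `gpu_state pc2` would be expected to print driver-missing until the driver is reinstalled or loaded.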
Distributing a job
Slice a textbook into per-chapter PDFs locally, then point convert_remote.sh at the slice folder with
a different WORKSTATION= per box (full recipe in docs_ws.md). With pc2
down, that’s a 2-box pool today (workstation + pc1); a 3-box pool once pc2’s driver is fixed.
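
The fan-out described above could look roughly like the sketch below. It assumes convert_remote.sh takes the slice folder as its first argument and reads the target box from the WORKSTATION environment variable; the exact invocation is in docs_ws.md, so treat the call shape, the CONVERT variable, and the helper names split_round_robin and dispatch as assumptions.

```shell
# Path to the conversion script; override if it lives elsewhere (assumed location).
CONVERT=${CONVERT:-./convert_remote.sh}

# Move <src>/*.pdf round-robin into one sub-folder per box: <src>/<box>/.
split_round_robin() {
  src=$1; shift
  for pdf in "$src"/*.pdf; do
    [ -e "$pdf" ] || continue            # no PDFs: the glob stayed literal
    box=$1; shift; set -- "$@" "$box"    # rotate the box list by one
    mkdir -p "$src/$box"
    mv "$pdf" "$src/$box/"
  done
}

# Start one conversion per box in parallel, then wait for all of them.
dispatch() {
  src=$1; shift
  for box in "$@"; do
    [ -d "$src/$box" ] || continue       # box received no chapters
    WORKSTATION=$box "$CONVERT" "$src/$box" &
  done
  wait
}
```

For today's 2-box pool this would be `split_round_robin slices workstation pc1` followed by `dispatch slices workstation pc1`. Round-robin ignores chapter size, so an unusually long chapter is better moved by hand into the workstation folder, which has the most VRAM.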