Compute cluster — GPU boxes for PDF → Markdown conversion

Which box to target for a marker run, and what each can handle. Probed over SSH on
2026-06-23. For the conversion how-to and install steps, see pipeline/docs_ws.md;
to bring up a new box, run pipeline/install_marker.sh.

All three are the user’s own NVIDIA machines, reached by SSH alias (ssh workstation / pc1 / pc2).
marker is memory-bound on the GPU, so VRAM is the binding constraint — prefer the box with the most
free VRAM for the biggest job, and convert long documents one chapter at a time to cap peak usage.
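
A minimal sketch of the "most free VRAM first" rule, assuming the three SSH aliases above resolve from the machine you run it on and that nvidia-smi is on the remote PATH. The helper names (parse_free_mib, probe_free_mib, rank_boxes) are hypothetical and not part of the pipeline; only the nvidia-smi query flags are standard.

```python
import subprocess

# Standard nvidia-smi query: prints one integer (free MiB) per GPU, no header.
QUERY = "nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits"


def parse_free_mib(output: str) -> int:
    """Return the free VRAM in MiB of the first GPU listed in the query output."""
    first_line = output.strip().splitlines()[0]
    return int(first_line.strip())


def probe_free_mib(alias: str, timeout: float = 15.0):
    """Run the query over SSH; return free MiB, or None if the box is unusable."""
    try:
        result = subprocess.run(
            ["ssh", alias, QUERY],
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:  # e.g. the driver is not loaded
        return None
    try:
        return parse_free_mib(result.stdout)
    except (IndexError, ValueError):
        return None


def rank_boxes(free_by_alias: dict) -> list:
    """Order aliases by free VRAM, most first, dropping boxes with no reading."""
    usable = {a: v for a, v in free_by_alias.items() if v is not None}
    return sorted(usable, key=usable.get, reverse=True)


if __name__ == "__main__":
    aliases = ["workstation", "pc1", "pc2"]  # aliases from this document
    print(rank_boxes({a: probe_free_mib(a) for a in aliases}))
```

A box whose query fails (as pc2 would with its driver not loaded) simply drops out of the ranking. On WSL2, nvidia-smi may not be on the PATH of a non-interactive SSH session, in which case the query string would need the full path.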

The boxes (VRAM-ranked: target top-down)

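| Alias       | Hostname         | CPU          | Threads | RAM    | GPU                      | VRAM  | Driver        | System py | marker                          |
|-------------|------------------|--------------|---------|--------|--------------------------|-------|---------------|-----------|---------------------------------|
| workstation | (WSL2)           | i7-13620H    | 16      | 15 GiB | RTX 5060 Laptop          | 8 GB  | 581.57        | 3.10.12   | ✅ installed, verified working  |
| pc1         | toni             | Ryzen 7 1700 | 16      | 31 GiB | GTX 1660 SUPER           | 6 GB  | 580.159.03    | 3.12.3    | ⬜ run install_marker.sh        |
| pc2         | toni-toni-laptop | i7-9750H     | 12      | 62 GiB | GTX 1650 Mobile (TU117M) | ~4 GB | ⚠️ not loaded | 3.10.12   | ⛔ fix driver first             |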

Notes & gotchas

  • workstation is the default for convert_remote.sh (and the only box currently set up). 8 GB VRAM
    handles a single textbook chapter (≤ ~70 pp) comfortably.
  • pc1 is ready to enlist — just needs install_marker.sh. Its 1660 SUPER (6 GB) is fine for
    per-chapter conversion. Note system Python is 3.12, not 3.10 — marker-pdf supports it, but it’s a
    different interpreter than the workstation’s, so keep the install user-site (the script handles this).
  • pc2 is GPU-blocked: nvidia-smi can’t reach the driver and fails with “NVIDIA-SMI has failed because it
    couldn't communicate with the NVIDIA driver”. The hardware is there (lspci sees the TU117M / GTX 1650
    Mobile) and the box is otherwise the strongest on RAM (62 GiB), but it can’t run marker on the GPU
    until the driver is reinstalled or loaded. Until then, use it only as a CPU fallback (slow) or leave it out.
  • VRAM, not RAM, gates marker. pc2’s 62 GiB RAM doesn’t help a 4 GB GPU; pc1’s 6 GB GPU outranks it for
    this workload despite less system RAM.
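
The ≤ ~70 pp guideline for the workstation can be turned into a quick pre-check before slicing. The sketch below is illustrative only: it assumes you already have the 1-based start page of each chapter (for example, read off the table of contents), and the function names are hypothetical. The 70-page default reflects the workstation's 8 GB figure above; the smaller GPUs may need a lower cap.

```python
def chapter_ranges(starts, total_pages):
    """Turn 1-based chapter start pages into inclusive (first, last) page ranges."""
    bounds = list(starts) + [total_pages + 1]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(len(starts))]


def oversized(ranges, max_pages=70):
    """Return the ranges longer than max_pages, i.e. candidates for a further split."""
    return [(first, last) for first, last in ranges if last - first + 1 > max_pages]


if __name__ == "__main__":
    # Hypothetical 200-page book with chapters starting on pages 1, 41 and 131.
    ranges = chapter_ranges([1, 41, 131], total_pages=200)
    print(ranges)
    print(oversized(ranges))
```

Any range flagged by oversized would be split again before conversion so that no single job exceeds the page cap.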

Distributing a job

Slice a textbook into per-chapter PDFs locally, then point convert_remote.sh at the slice folder with
a different WORKSTATION= per box (full recipe in docs_ws.md). With pc2
down, that’s a 2-box pool today (workstation + pc1); a 3-box pool once pc2’s driver is fixed.
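
One way to decide which slices go to which box is a greedy balance on page count: hand each slice, largest first, to whichever box currently has the fewest pages queued. This is a sketch under assumptions, not the pipeline's own scheduler; the file names and page counts are made up, and assign_slices is a hypothetical helper.

```python
def assign_slices(page_counts: dict, boxes: list) -> dict:
    """Greedy balance: give each slice, largest first, to the least-loaded box."""
    plan = {box: [] for box in boxes}
    load = {box: 0 for box in boxes}
    for name, pages in sorted(page_counts.items(), key=lambda kv: -kv[1]):
        target = min(boxes, key=load.get)  # ties go to the earlier box in the list
        plan[target].append(name)
        load[target] += pages
    return plan


if __name__ == "__main__":
    # Hypothetical per-chapter slices and their page counts.
    slices = {"ch01.pdf": 40, "ch02.pdf": 60, "ch03.pdf": 30, "ch04.pdf": 50}
    # Today's 2-box pool; append "pc2" once its driver is fixed.
    for box, names in assign_slices(slices, ["workstation", "pc1"]).items():
        print(box, names)
```

Each box's list would then go into its own slice folder, with convert_remote.sh run once per folder and WORKSTATION= set to that box's alias; the exact invocation is in docs_ws.md. Listing the boxes in VRAM order means ties fall to the larger GPU.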