Compute cluster — GPU boxes for PDF → Markdown conversion

Which box to target for a marker run, and what each can handle. Probed over SSH on
2026-06-23. Conversion how-to + install: pipeline/docs_ws.md;
bring up a new box with pipeline/install_marker.sh.

All three are the user’s own NVIDIA machines, reached by SSH alias (ssh workstation / pc1 / pc2).
marker is memory-bound on the GPU, so VRAM is the binding constraint — prefer the box with the most
free VRAM for the biggest job, and convert long documents one chapter at a time to cap peak usage.

The boxes (VRAM-ranked: target top-down)

Alias	Hostname	CPU	Threads	RAM	GPU	VRAM	Driver	system py	marker
`workstation`	(WSL2)	i7-13620H	16	15 GiB	RTX 5060 Laptop	8 GB	581.57	3.10.12	✅ installed, verified working
`pc1`	`toni`	Ryzen 7 1700	16	31 GiB	GTX 1660 SUPER	6 GB	580.159.03	3.12.3	⬜ run `install_marker.sh`
`pc2`	`toni-toni-laptop`	i7-9750H	12	62 GiB	GTX 1650 Mobile (TU117M)	~4 GB	⚠️ not loaded	3.10.12	⛔ fix driver first

Notes & gotchas

workstation is the default for convert_remote.sh (and the only box currently set up). 8 GB VRAM
handles a single textbook chapter (≤ ~70 pp) comfortably.
pc1 is ready to enlist — just needs install_marker.sh. Its 1660 SUPER (6 GB) is fine for
per-chapter conversion. Note system Python is 3.12, not 3.10 — marker-pdf supports it, but it’s a
different interpreter than the workstation’s, so keep the install user-site (the script handles this).
pc2 is GPU-blocked: nvidia-smi can’t reach the driver (NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver). The hardware is there (lspci sees the TU117M /
GTX 1650 Mobile) and the box is otherwise the strongest on RAM (62 GiB) — but it can’t run GPU-marker
until the driver is reinstalled/loaded. Until then, use it only as a CPU fallback (slow) or leave it out.
VRAM, not RAM, gates marker. pc2’s 62 GiB RAM doesn’t help a 4 GB GPU; pc1’s 6 GB GPU outranks it for
this workload despite less system RAM.

Distributing a job

Slice a textbook into per-chapter PDFs locally, then point convert_remote.sh at the slice folder with
a different WORKSTATION= per box (full recipe in docs_ws.md). With pc2
down, that’s a 2-box pool today (workstation + pc1); a 3-box pool once pc2’s driver is fixed.

Quartz 5

Explorer

compute_cluster

Compute cluster — GPU boxes for PDF → Markdown conversion

The boxes (VRAM-ranked: target top-down)

Notes & gotchas

Distributing a job

Graph View

Table of Contents