SAM-3D Serverless Deployment

VDECOMA

2026

Serverless GPU deployment of the SAM-3D segmentation model on Runpod, replacing an always-on GPU VM with auto-scaling, on-demand inference workers.

About This Project

SAM-3D is a segment-anything model fine-tuned on architectural photography and 3D point-cloud data, used internally at VDECOMA to automatically identify and classify surfaces (walls, floors, ceilings, furniture) from raw room scans. The deployment challenge was running a multi-gigabyte PyTorch model on demand without paying for an always-on GPU instance.

The solution is a fully serverless architecture on Runpod. When a request arrives and no worker is warm, Runpod spins up a Docker container with the model weights loaded from a network volume, runs inference, and streams results back over a REST endpoint. The worker stays available for a short window before terminating. Cold-start time is under eight seconds for the first request in a session; subsequent calls within the same worker window are near-instant because the model is already loaded. The entire stack is defined as a Runpod Serverless template and deployed via a Python script, which keeps the staging and production environments identical.

Compared to the previous always-on GPU VM, this architecture reduced monthly compute costs by approximately 70% while improving throughput for burst workloads. The system can handle up to 50 concurrent inference jobs by auto-scaling workers horizontally, which was impossible with a single fixed instance. Monitoring is handled through Runpod's built-in metrics dashboard, supplemented by a lightweight Python logging layer that ships structured logs to a centralized store.
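A structured logging layer of the kind mentioned above could look roughly like the following, using only the standard library. The logger name, the `fields` convention, and the example field names are assumptions made for illustration; the shipping step to the centralized store is not shown, since the source does not name it.

```python
import json
import logging
import sys
from typing import IO


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Extra structured fields are passed via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(stream: IO[str] = sys.stdout) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("sam3d.inference")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

A worker might then call `build_logger().info("inference_complete", extra={"fields": {"job_id": "job-123", "duration_ms": 412}})`, producing one JSON line per event that a log collector can parse without regular expressions.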

Stack

#Python #Runpod #Serverless #Docker