[Remote] Computer Vision & AI Lead

Remote · USA · Full-time

Note: This is a remote role open to candidates in the USA. Strattmont is a construction-tech company headquartered in Riyadh, building a connected-worker platform for large-scale construction sites. The company is seeking a Computer Vision & AI Lead to own the SiteGuard stack, develop detection models, and lead a team of senior engineers in improving safety compliance through advanced AI technologies.

Responsibilities

  • Own hiring end-to-end for SiteGuard: define the bar across ML, backend, and frontend disciplines, run the process, and make decisions. You have hired senior engineers before and know what good looks like
  • Manage performance directly - set clear expectations, give continuous feedback, run meaningful reviews, and act decisively when someone is not meeting the bar
  • Build a high-ownership engineering culture where engineers take initiative, write their own issues, and feel accountable for product outcomes, not just task completion
  • Mentor engineers at every level - from onboarding new contributors to developing senior engineers. Your track record includes engineers who grew significantly under your leadership
  • Own the squad's delivery: run agile ceremonies, plan sprints, and partner with Product to translate the SiteGuard roadmap into shippable increments
  • Own the full lifecycle of SiteGuard's detection capabilities - PPE compliance, hazard detection, unsafe-behavior recognition - from problem framing through training, evaluation, and production deployment
  • Lead the strategic use of modern architectures and foundation models: transformer-based detectors (RT-DETR, YOLOv10/11, Co-DETR) alongside classical YOLO-family models; zero/few-shot approaches via CLIP, DINOv2, GroundingDINO, and SAM 2; and VLMs (Gemini Vision, GPT-4V, LLaVA, Qwen-VL) for scene understanding and incident reasoning where they outperform purpose-built detectors
  • Lead the use of generative AI to augment training data - diffusion-based synthetic image generation (ControlNet, Stable Diffusion) for rare PPE violations, lighting conditions, and site environments that real-world collection cannot cover economically
  • Define and drive model-quality targets (precision/recall, false-alarm rates) and the retraining loops that sustain them as site conditions change
  • Oversee dataset strategy: collection, AI-assisted annotation (using SAM, CLIP, or VLMs as labeling tools), curation, and governance of site imagery and video
  • Lead development of the on-site edge agent that runs inference close to the camera - optimizing transformer-based and classical models for constrained hardware (quantization, INT8/FP16, TensorRT, ONNX, batching, CUDA/NPU accelerators)
  • Lead development of cloud camera agents for cloud-based deployments, ensuring detection output stays consistent with the on-site agents regardless of deployment type
  • Engineer for the realities of the field: on-prem gateway deployment, intermittent connectivity, store-and-forward, and graceful degradation
  • Ensure low-latency, reliable detection-to-alert pipelines from camera to platform, including on-device pre-filtering before cloud VLM calls where cost and latency demand it
  • Own the golden labeled dataset as the single source of truth for evaluation, fine-tuning, and production monitoring. Feed production reviewer signals (labels, corrections) back into the dataset continuously
  • Design evaluation metrics for both classical detector output and VLM-generated detections (precision/recall, false-positive control, human-agreement). Run every model, prompt, or schema change as a regression test against the golden set before release; no change ships without meeting a measured quality bar
  • Architect data-extraction, AI-assisted annotation, and training pipelines that ensure reproducibility and versioning of datasets and models (experiment tracking, model registries, dataset versioning)
  • Implement CI for models and code: automated retraining/evaluation, LLM-as-judge patterns for open-ended detection outputs, and production monitoring for model drift, cost, and accuracy degradation
  • Build deterministic post-processing guardrails over model output - domain/OSHA rule filters, confidence calibration, audit trails - so the product behaves predictably even when model output varies
  • Raise engineering maturity across SiteGuard repositories: test coverage, CI gating, and coverage reporting
  • Own the SiteGuard product surfaces (web dashboards and frontends) and the APIs that deliver detected violations and safety events into the Company's core platform
  • Ensure alerting, reporting, and analytics turn raw detections into clear, prioritized actions - including LLM-generated incident summaries and natural-language search over safety event history
  • Enforce code review, testing, and quality standards across model and application code
  • Champion privacy-by-design for video and personal data - anonymization, access controls, retention limits, and responsible use of footage in compliance with Company and customer requirements. Implement responsible-AI safeguards on all AI-generated outputs: confidence thresholds, human-in-the-loop review for high-severity alerts, and audit trails
  • Collaborate with Hardware/camera, DevOps, Field Engineering, and customer teams to validate SiteGuard in real site conditions and incorporate feedback
  • Lead R&D into new detection capabilities - multi-modal models combining video and sensor data, behavioral analysis, crowd analytics - and evaluate emerging CV/AI approaches with honest, grounded judgment
  • Stay current with the fast-moving foundation-model and VLM ecosystem and translate new research into concrete roadmap decisions for the squad

Skills

  • 6+ years of software/ML engineering experience with a clear progression from senior IC to engineering leadership
  • 3+ years directly managing engineers - leading teams, owning hiring, running performance cycles. Mentoring is not the same as managing; this role requires the latter
  • A track record of hiring: you have built or significantly grown an engineering team and made independent hiring decisions at the senior engineer level and above
  • Proven experience shipping production systems built on vision/multimodal foundation models (VLMs/LLMs via cloud APIs) - owning quality, latency, and cost from prototype to scale
  • Hands-on experience operating high-throughput, asynchronous video/media processing pipelines in production
  • Strong Python (async-first: asyncio, FastAPI or equivalent); production service design with a focus on reliability and observability
  • Hands-on with multimodal/VLM APIs - Gemini/Vertex AI, OpenAI, or Anthropic equivalents: prompt engineering, structured/JSON-schema-constrained output, context/caching, and per-model parameter tuning
  • Computer-vision foundations: object detection and segmentation across both classical architectures (YOLO-family) and modern transformer-based detectors (RT-DETR, Co-DETR, GroundingDINO); video frame extraction and handling (OpenCV/FFmpeg); spatial reasoning over model output
  • Foundation models and zero/few-shot approaches: CLIP, DINOv2, SAM 2 for annotation assistance and detection; VLMs for scene understanding and incident reasoning
  • Edge inference optimization: ONNX, TensorRT, quantization (INT8/FP16), deployment to constrained hardware (Jetson, Hailo, or equivalent)
  • Distributed pipeline design: message brokers, relational databases with async ORM and migration tooling, object storage - comfortable across cloud providers (GCP, AWS, or Azure)
  • MLOps stack: experiment tracking, model registries, dataset versioning, and CI pipelines for model evaluation
  • Primary daily experience with Claude Code and Codex - used in real engineering work, with formed opinions about when to trust their output and when not to
  • Current awareness of the AI model landscape: practical differences between frontier models (Gemini, GPT-4o, Claude, DeepSeek) for vision tasks, code generation, and structured output
  • Tracks AI trends actively - model releases, VLM capabilities, agentic framework developments - and translates this into concrete, grounded team guidance
  • Exceptional written and spoken English. This is a hard requirement. You write clearly and precisely - design documents, evaluation reports, and stakeholder updates are well-structured and unambiguous. You can explain a model's failure mode to an HSE manager and an architecture decision to an engineer with equal clarity
  • Experience with safety, surveillance, or video analytics in industrial or construction environments. OSHA/EHS domain knowledge is a strong plus
  • Experience with synthetic data generation pipelines (ControlNet, Stable Diffusion) for computer-vision training data augmentation
  • LLMOps / observability for model-backed services: tracing model calls, output monitoring, A/B testing of prompts and schemas
  • Agentic frameworks (LangChain, LlamaIndex, AutoGen, or equivalent) applied to safety workflows or multi-step incident management
  • Comfort working config-over-code for multi-tenant rollouts (per-project model and prompt configuration)
  • Familiarity with on-prem/edge deployment, gateways, and operating under intermittent connectivity
  • Experienced people manager: you have had difficult performance conversations, managed out engineers who were not meeting the bar, and done so with fairness and directness
  • Research-oriented curiosity balanced with production pragmatism - you read papers, run experiments, and ship to production. You distinguish AI approaches that deliver real site-safety value from impressive benchmarks that do not survive the field
  • Excellent written and spoken English - able to translate model behavior, failure modes, and confidence levels to non-technical stakeholders, including explaining why an AI made a specific safety call
  • Strategic, outcome-driven thinking: makes technology decisions based on product value and field reliability, not novelty
  • Comfort operating in a fast-paced, evolving environment where both site priorities and the AI landscape shift quickly

Benefits

  • Competitive salary, performance bonus, and equity participation.
  • High-autonomy role with direct product and company impact - you are building a safety AI product from the ground up, not maintaining an inherited codebase.
  • A small, senior engineering team where your decisions matter and your name is on the architecture.
  • Relocation support for candidates joining from outside KSA.
  • Health insurance, annual flights, and standard Company benefits package.

Company Overview

  • Strattmont offers IT services, consulting, and talent acquisition, focusing on hiring and managed IT solutions. It was founded in 2020 and is headquartered in Brvenica, Brvenica, MKD, with a workforce of 11-50 employees. Its website is https://strattmont.com.