Deployment fails with Out of Memory (OOM) errors during dependency installation #271

@kubbot

Problem Description

The nexus deployment consistently fails with Out of Memory (OOM) errors during container startup. The failures occur while Python dependencies are being downloaded and installed.

Error Details

  • Error Type: Out of Memory (OOM)
  • Frequency: Multiple consecutive failures (20+ attempts in ~5 minutes)
  • Stage: During dependency installation in container startup
  • Total Dependencies: 237 packages being resolved

Large Dependencies Identified

From the build logs, several large packages are being downloaded:

  • speechrecognition (31.3MB)
  • pymupdf (22.9MB)
  • pythonmonkey-fork (20.6MB)
  • numpy (17.1MB)
  • onnxruntime (15.6MB)
  • magika (14.4MB)
  • botocore (13.0MB)
  • pandas (12.1MB)
  • ruff (11.0MB)
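
To put the list above in perspective, the following sketch totals the reported download sizes and shows each package's share. The figures are copied from the list; note that download size is only a rough proxy, since peak memory during installation is not determined by it.

```python
# Download sizes in MB, copied from the build-log list above.
sizes_mb = {
    "speechrecognition": 31.3,
    "pymupdf": 22.9,
    "pythonmonkey-fork": 20.6,
    "numpy": 17.1,
    "onnxruntime": 15.6,
    "magika": 14.4,
    "botocore": 13.0,
    "pandas": 12.1,
    "ruff": 11.0,
}

total_mb = sum(sizes_mb.values())
print(f"{len(sizes_mb)} packages, {total_mb:.1f} MB total download")

# Largest first, with each package's share of the listed total.
for name, size in sorted(sizes_mb.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {size:5.1f} MB  {size / total_mb:6.1%}")
```

These nine packages account for about 158 MB of downloads out of the 237 packages being resolved.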

Proposed Solution

Phase 1: Verify CI Feasibility

  • Test current dependency installation in CI environment
  • Measure actual memory consumption during build process
  • Identify which dependencies are causing the highest memory usage
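
One possible way to carry out the measurement step is sketched below. It runs an install command as a child process and reads the peak resident set size from the operating system. This assumes a Unix-like CI runner (the `resource` module is not available on Windows); the helper name `peak_child_rss_mib` and the `requirements.txt` file in the usage note are illustrative assumptions, not details taken from this deployment.

```python
import resource
import subprocess
import sys


def peak_child_rss_mib(cmd):
    """Run cmd and return (exit_code, peak RSS in MiB of waited-for children).

    RUSAGE_CHILDREN reports the largest peak among all child processes this
    interpreter has waited for so far, so call this once per fresh Python
    process to get a figure for a single command.
    """
    proc = subprocess.run(cmd, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss / divisor
```

For example, `peak_child_rss_mib([sys.executable, "-m", "pip", "install", "--no-cache-dir", "-r", "requirements.txt"])` would report the peak memory of a full install. Running it once per package in a clean environment is one way to see which dependencies are the most expensive.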

Phase 2: Implement Containerized Approach

  • Migrate to a pre-built container image so that dependencies are no longer installed at container startup
  • Use multi-stage Docker builds to optimize memory usage
  • Pre-build dependencies in separate build stage
  • Implement dependency caching strategies
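
As one example of a dependency caching strategy, a build can key its cache on a hash of the dependency manifest, so the expensive install step is repeated only when the manifest changes. The sketch below is a minimal illustration; the function name, the key format, and the use of a `requirements.txt` manifest are assumptions rather than part of the current setup.

```python
import hashlib


def dependency_cache_key(manifest_bytes: bytes, python_version: str) -> str:
    """Derive a cache key that changes only when the inputs change.

    manifest_bytes: raw contents of the dependency manifest (hypothetical
    example: requirements.txt or a lock file).
    python_version: interpreter version the dependencies are built for.
    """
    digest = hashlib.sha256(manifest_bytes).hexdigest()
    # A truncated digest keeps the key short while remaining unlikely to collide.
    return f"deps-py{python_version}-{digest[:16]}"
```

A build script could call `dependency_cache_key(Path("requirements.txt").read_bytes(), "3.12")` and reuse the previously built dependency layer or wheel directory whenever the key is unchanged.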

Expected Benefits

  1. Predictable Resource Usage: Installing dependencies at image build time moves the memory-heavy step out of container startup
  2. Faster Deployments: Pre-built containers reduce startup time
  3. Better Debugging: Container logs provide clearer error tracking
  4. Scalability: Easier to adjust resource limits as needed

Current Impact

  • Deployment success rate: 0% (all recent attempts failing)
  • Service availability: Severely impacted
  • Development workflow: Blocked

Environment

  • Platform: Current hosting environment
  • Python Version: 3.12.7
  • Deployment Region: asia-southeast1-eqsg3a
  • Container Status: Crashed due to OOM

Metadata

  • Assignees: No one assigned
  • Type: No type
  • Project status: Backlog
  • Milestone: No milestone
  • Relationships: None yet
  • Development: No branches or pull requests