# Edge Deployment

Your infrastructure, your rules. Deploy Pauhu anywhere: on-premises data centers, your own cloud account, edge locations, or even a Raspberry Pi. Same API, same quality, complete control.
## Deployment Options

```mermaid
graph TB
    subgraph "Pauhu Cloud"
        A[Global Edge Network]
    end
    subgraph "Your Cloud"
        B[AWS / Azure / GCP]
    end
    subgraph "On-Premises"
        C[Data Center]
        D[Branch Office]
    end
    subgraph "Edge"
        E[Factory Floor]
        F[Field Devices]
    end
    G[Same Pauhu API] --> A
    G --> B
    G --> C
    G --> D
    G --> E
    G --> F
```

## Cloud Deployment
### AWS

```bash
# Deploy to AWS using CloudFormation
aws cloudformation deploy \
  --template-file pauhu-aws.yaml \
  --stack-name pauhu \
  --parameter-overrides \
    InstanceType=g4dn.xlarge \
    ModelsS3Bucket=your-models-bucket
```
### Azure

```bash
# Deploy to Azure using an ARM template
az deployment group create \
  --resource-group pauhu-rg \
  --template-file pauhu-azure.json \
  --parameters vmSize=Standard_NC4as_T4_v3
```
### Google Cloud

```bash
# Deploy to GCP using Terraform
terraform init
terraform apply \
  -var="project=your-project" \
  -var="region=europe-north1" \
  -var="machine_type=n1-standard-4-t4"
```
## On-Premises Deployment

### Docker Compose

```yaml
# docker-compose.yml
version: '3.8'
services:
  pauhu:
    image: pauhu/enterprise:latest
    ports:
      - "8080:8080"
    volumes:
      - ./models:/opt/pauhu/models
      - ./config:/opt/pauhu/config
    environment:
      - PAUHU_MODE=offline
      - PAUHU_LICENSE_KEY=${LICENSE_KEY}
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
### Kubernetes

```yaml
# pauhu-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pauhu
  namespace: ai-services
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pauhu
  template:
    metadata:
      labels:
        app: pauhu
    spec:
      containers:
        - name: pauhu
          image: pauhu/enterprise:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "16Gi"
              cpu: "4"
              nvidia.com/gpu: 1
            limits:
              memory: "32Gi"
              cpu: "8"
              nvidia.com/gpu: 1
          volumeMounts:
            - name: models
              mountPath: /opt/pauhu/models
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: pauhu-models
```
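A load balancer or Ingress needs a Service to target the Deployment's pods. A minimal sketch — the Service name and port mapping here are assumptions that simply mirror the manifest above:

```yaml
# pauhu-service.yaml (illustrative)
apiVersion: v1
kind: Service
metadata:
  name: pauhu
  namespace: ai-services
spec:
  selector:
    app: pauhu
  ports:
    - port: 8080
      targetPort: 8080
```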
### Helm Chart

```bash
# Add the Pauhu Helm repository
helm repo add pauhu https://charts.pauhu.ai

# Install
helm install pauhu pauhu/pauhu \
  --namespace ai-services \
  --set license.key=$LICENSE_KEY \
  --set models.size=full \
  --set gpu.enabled=true
```
## Edge Devices

### NVIDIA Jetson

```bash
# Jetson Orin deployment
docker run -d \
  --runtime nvidia \
  -p 8080:8080 \
  -v /mnt/models:/opt/pauhu/models \
  pauhu/jetson:latest
```

### Raspberry Pi

```bash
# Raspberry Pi 5 (CPU-only, limited models)
docker run -d \
  -p 8080:8080 \
  -v /mnt/models:/opt/pauhu/models \
  pauhu/arm64:latest
```

### Intel NUC

```bash
# Intel NUC with OpenVINO acceleration
docker run -d \
  --device /dev/dri \
  -p 8080:8080 \
  -v /mnt/models:/opt/pauhu/models \
  pauhu/openvino:latest
```
## Hardware Compatibility
| Platform | CPU | GPU | RAM | Notes |
|---|---|---|---|---|
| Server | x86_64 | NVIDIA A100/H100 | 128GB+ | Full performance |
| Workstation | x86_64 | NVIDIA RTX 4090 | 64GB+ | High performance |
| Cloud VM | x86_64 | T4/A10G | 32GB+ | Cost-effective |
| Jetson Orin | ARM64 | Ampere | 32GB | Edge AI |
| Mac | Apple Silicon | M3 Max | 128GB | Metal acceleration |
| Raspberry Pi | ARM64 | None | 8GB | Basic translation |
## Network Configuration

### Firewall Rules

At minimum, allow inbound traffic on the ports Pauhu uses (exact ports depend on your configuration):

| Port | Protocol | Purpose |
|---|---|---|
| 8080 | TCP | Pauhu API |
| 9090 | TCP | Prometheus metrics |
| 443 | TCP | TLS termination at the load balancer |
### Load Balancer

```
# HAProxy configuration
frontend pauhu_front
    bind *:443 ssl crt /etc/ssl/pauhu.pem
    default_backend pauhu_back

backend pauhu_back
    balance roundrobin
    option httpchk GET /health
    server pauhu1 10.0.0.1:8080 check
    server pauhu2 10.0.0.2:8080 check
    server pauhu3 10.0.0.3:8080 check
```
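The same `/health` endpoint that HAProxy probes can be checked from your own scripts, for example before rotating a node into service. A minimal stdlib-only sketch — the node URLs are illustrative, and `healthy_nodes` takes an injectable probe so the filtering logic can be tested without a live cluster:

```python
import urllib.request
import urllib.error

def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a node answers GET /health with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def healthy_nodes(nodes, probe=is_healthy):
    """Filter a list of node base URLs down to those passing the probe."""
    return [n for n in nodes if probe(n)]
```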
## High Availability

```mermaid
graph TB
    subgraph "Region A"
        A1[Pauhu Node 1]
        A2[Pauhu Node 2]
        A3[Shared Models]
    end
    subgraph "Region B"
        B1[Pauhu Node 3]
        B2[Pauhu Node 4]
        B3[Shared Models]
    end
    LB[Load Balancer] --> A1
    LB --> A2
    LB --> B1
    LB --> B2
    A3 --> A1
    A3 --> A2
    B3 --> B1
    B3 --> B2
```

## Monitoring
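With a shared load balancer in front, regional failover is transparent to clients. For clients that address regions directly, the retry logic is simple; a sketch under the assumption that the preferred region is listed first (`call` is any function that raises on failure, e.g. one posting to `/v1/translate`):

```python
def call_with_failover(endpoints, call):
    """Try each regional endpoint in order; return the first success.

    `endpoints` is an ordered list of base URLs (preferred region first).
    `call` is a callable taking an endpoint and raising on failure.
    """
    last_error = None
    for endpoint in endpoints:
        try:
            return call(endpoint)
        except Exception as exc:  # in practice, catch your HTTP client's error types
            last_error = exc
    raise RuntimeError(f"all endpoints failed: {last_error}")
```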
### Prometheus Metrics

```yaml
# Scrape configuration
scrape_configs:
  - job_name: 'pauhu'
    static_configs:
      - targets: ['pauhu:9090']
```
### Grafana Dashboard

```
# Available metrics
pauhu_requests_total
pauhu_request_duration_seconds
pauhu_translation_characters_total
pauhu_model_load_time_seconds
pauhu_gpu_memory_used_bytes
pauhu_queue_depth
```
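These metrics use the standard Prometheus text exposition format, so they are easy to read ad hoc without Grafana. A minimal parser sketch for un-labeled samples — the sample values in the test are made up:

```python
def parse_metrics(text: str) -> dict:
    """Parse simple Prometheus exposition lines into {name: value}.

    Handles un-labeled samples only; comments (# HELP / # TYPE) and
    blank lines are skipped, unparsable lines are ignored.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            continue
    return metrics
```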
## License Management

### Offline License

```python
from pauhu import Pauhu

# Activate with an offline license
client = Pauhu(
    license_file="/opt/pauhu/license.key",
    mode="offline"
)

# Check license status
print(client.license.valid_until)
print(client.license.features)
```
### License Server

```yaml
# For multi-node deployments
pauhu:
  license:
    server: https://license.internal.corp
    key: ${LICENSE_KEY}
    offline_grace_period: 7d
```
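The `offline_grace_period` setting means a node keeps serving for a window after it loses contact with the license server. The check amounts to the following — a sketch of the behavior, not Pauhu's actual implementation:

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(days=7)  # mirrors offline_grace_period: 7d

def license_ok(last_successful_checkin: datetime, now: datetime) -> bool:
    """A node stays licensed while its last successful license-server
    check-in falls within the grace period."""
    return now - last_successful_checkin <= GRACE_PERIOD
```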
## Getting Started

```bash
# 1. Pull the image
docker pull pauhu/enterprise:latest

# 2. Download models (mount the host path so downloads persist)
docker run -v /mnt/models:/mnt/models pauhu/enterprise:latest \
  pauhu models download --all --path /mnt/models

# 3. Start the server
docker run -d \
  -p 8080:8080 \
  -v /mnt/models:/opt/pauhu/models \
  -e PAUHU_LICENSE_KEY=$LICENSE_KEY \
  pauhu/enterprise:latest

# 4. Test
curl http://localhost:8080/v1/translate \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello", "target": "fi"}'
```
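The same smoke test from Python, using only the standard library. `translate` here is an illustrative helper built on the REST endpoint above, not part of the Pauhu SDK, and the response shape is whatever your Pauhu version returns:

```python
import json
import urllib.request

def build_payload(text: str, target: str) -> bytes:
    """Serialize the /v1/translate request body."""
    return json.dumps({"text": text, "target": target}).encode("utf-8")

def translate(text: str, target: str, base_url: str = "http://localhost:8080") -> dict:
    """POST to a self-hosted Pauhu endpoint and return the parsed JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/translate",
        data=build_payload(text, target),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```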