# Edge Deployment

Your infrastructure, your rules. Deploy Pauhu anywhere: on-premises data centers, your own cloud account, edge locations, or even a Raspberry Pi. Same API, same quality, complete control.
## Deployment Options

```mermaid
graph TB
    subgraph "Pauhu Cloud"
        A[Global Edge Network]
    end
    subgraph "Your Cloud"
        B[AWS / Azure / GCP]
    end
    subgraph "On-Premises"
        C[Data Center]
        D[Branch Office]
    end
    subgraph "Edge"
        E[Factory Floor]
        F[Field Devices]
    end
    G[Same Pauhu API] --> A
    G --> B
    G --> C
    G --> D
    G --> E
    G --> F
```

## Cloud Deployment
### AWS

```bash
# Deploy to AWS using CloudFormation
aws cloudformation deploy \
  --template-file pauhu-aws.yaml \
  --stack-name pauhu \
  --parameter-overrides \
    InstanceType=g4dn.xlarge \
    ModelsS3Bucket=your-models-bucket
```
### Azure

```bash
# Deploy to Azure using an ARM template
az deployment group create \
  --resource-group pauhu-rg \
  --template-file pauhu-azure.json \
  --parameters vmSize=Standard_NC4as_T4_v3
```
### Google Cloud

```bash
# Deploy to GCP using Terraform
terraform init
terraform apply \
  -var="project=your-project" \
  -var="region=europe-north1" \
  -var="machine_type=n1-standard-4-t4"
```
## On-Premises Deployment

### Docker Compose

```yaml
# docker-compose.yml
version: '3.8'
services:
  pauhu:
    image: pauhu/enterprise:latest
    ports:
      - "8080:8080"
    volumes:
      - ./models:/opt/pauhu/models
      - ./config:/opt/pauhu/config
    environment:
      - PAUHU_MODE=offline
      - PAUHU_LICENSE_KEY=${LICENSE_KEY}
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
### Kubernetes

```yaml
# pauhu-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pauhu
  namespace: ai-services
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pauhu
  template:
    metadata:
      labels:
        app: pauhu
    spec:
      containers:
        - name: pauhu
          image: pauhu/enterprise:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "16Gi"
              cpu: "4"
              nvidia.com/gpu: 1
            limits:
              memory: "32Gi"
              cpu: "8"
              nvidia.com/gpu: 1
          volumeMounts:
            - name: models
              mountPath: /opt/pauhu/models
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: pauhu-models
```
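A load balancer or Ingress needs a Service to target the Deployment's pods. A minimal sketch — the Service name and port mapping here are assumptions that simply mirror the manifest above:

```yaml
# pauhu-service.yaml (illustrative)
apiVersion: v1
kind: Service
metadata:
  name: pauhu
  namespace: ai-services
spec:
  selector:
    app: pauhu
  ports:
    - port: 8080
      targetPort: 8080
```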
### Helm Chart

```bash
# Add the Pauhu Helm repository
helm repo add pauhu https://charts.pauhu.ai

# Install
helm install pauhu pauhu/pauhu \
  --namespace ai-services \
  --set license.key=$LICENSE_KEY \
  --set models.size=full \
  --set gpu.enabled=true
```
## Edge Devices

### NVIDIA Jetson

```bash
# Jetson Orin deployment
docker run -d \
  --runtime nvidia \
  -p 8080:8080 \
  -v /mnt/models:/opt/pauhu/models \
  pauhu/jetson:latest
```

### Raspberry Pi

```bash
# Raspberry Pi 5 (CPU-only, limited models)
docker run -d \
  -p 8080:8080 \
  -v /mnt/models:/opt/pauhu/models \
  pauhu/arm64:latest
```

### Intel NUC

```bash
# Intel NUC with OpenVINO acceleration
docker run -d \
  --device /dev/dri \
  -p 8080:8080 \
  -v /mnt/models:/opt/pauhu/models \
  pauhu/openvino:latest
```
## Hardware Compatibility
| Platform | CPU | GPU | RAM | Notes |
|---|---|---|---|---|
| Server | x86_64 | NVIDIA A100/H100 | 128GB+ | Full performance |
| Workstation | x86_64 | NVIDIA RTX 4090 | 64GB+ | High performance |
| Cloud VM | x86_64 | T4/A10G | 32GB+ | Cost-effective |
| Jetson Orin | ARM64 | Ampere | 32GB | Edge AI |
| Mac | Apple Silicon | M3 Max | 128GB | Metal acceleration |
| Raspberry Pi | ARM64 | None | 8GB | Basic translation |
## Network Configuration

### Firewall Rules

At minimum, allow inbound traffic on the ports Pauhu uses (exact ports depend on your configuration):

| Port | Protocol | Purpose |
|---|---|---|
| 8080 | TCP | Pauhu API |
| 9090 | TCP | Prometheus metrics |
| 443 | TCP | TLS termination at the load balancer |
### Load Balancer

```
# HAProxy configuration
frontend pauhu_front
    bind *:443 ssl crt /etc/ssl/pauhu.pem
    default_backend pauhu_back

backend pauhu_back
    balance roundrobin
    option httpchk GET /health
    server pauhu1 10.0.0.1:8080 check
    server pauhu2 10.0.0.2:8080 check
    server pauhu3 10.0.0.3:8080 check
```
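The same `/health` endpoint that HAProxy probes can be checked from your own scripts, for example before rotating a node into service. A minimal stdlib-only sketch — the node URLs are illustrative, and `healthy_nodes` takes an injectable probe so the filtering logic can be tested without a live cluster:

```python
import urllib.request
import urllib.error

def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a node answers GET /health with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def healthy_nodes(nodes, probe=is_healthy):
    """Filter a list of node base URLs down to those passing the probe."""
    return [n for n in nodes if probe(n)]
```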
## High Availability

```mermaid
graph TB
    subgraph "Region A"
        A1[Pauhu Node 1]
        A2[Pauhu Node 2]
        A3[Shared Models]
    end
    subgraph "Region B"
        B1[Pauhu Node 3]
        B2[Pauhu Node 4]
        B3[Shared Models]
    end
    LB[Load Balancer] --> A1
    LB --> A2
    LB --> B1
    LB --> B2
    A3 --> A1
    A3 --> A2
    B3 --> B1
    B3 --> B2
```

## Monitoring
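With a shared load balancer in front, regional failover is transparent to clients. For clients that address regions directly, the retry logic is simple; a sketch under the assumption that the preferred region is listed first (`call` is any function that raises on failure, e.g. one posting to `/v1/translate`):

```python
def call_with_failover(endpoints, call):
    """Try each regional endpoint in order; return the first success.

    `endpoints` is an ordered list of base URLs (preferred region first).
    `call` is a callable taking an endpoint and raising on failure.
    """
    last_error = None
    for endpoint in endpoints:
        try:
            return call(endpoint)
        except Exception as exc:  # in practice, catch your HTTP client's error types
            last_error = exc
    raise RuntimeError(f"all endpoints failed: {last_error}")
```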
### Prometheus Metrics

```yaml
# Scrape configuration
scrape_configs:
  - job_name: 'pauhu'
    static_configs:
      - targets: ['pauhu:9090']
```
### Grafana Dashboard

```
# Available metrics
pauhu_requests_total
pauhu_request_duration_seconds
pauhu_translation_characters_total
pauhu_model_load_time_seconds
pauhu_gpu_memory_used_bytes
pauhu_queue_depth
```
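These metrics use the standard Prometheus text exposition format, so they are easy to read ad hoc without Grafana. A minimal parser sketch for un-labeled samples — the sample values in the test are made up:

```python
def parse_metrics(text: str) -> dict:
    """Parse simple Prometheus exposition lines into {name: value}.

    Handles un-labeled samples only; comments (# HELP / # TYPE) and
    blank lines are skipped, unparsable lines are ignored.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            continue
    return metrics
```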
## License Management

### Offline License

```python
from pauhu import Pauhu

# Activate with an offline license
client = Pauhu(
    license_file="/opt/pauhu/license.key",
    mode="offline"
)

# Check license status
print(client.license.valid_until)
print(client.license.features)
```
### License Server

```yaml
# For multi-node deployments
pauhu:
  license:
    server: https://license.internal.corp
    key: ${LICENSE_KEY}
    offline_grace_period: 7d
```
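The `offline_grace_period` setting means a node keeps serving for a window after it loses contact with the license server. The check amounts to the following — a sketch of the behavior, not Pauhu's actual implementation:

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(days=7)  # mirrors offline_grace_period: 7d

def license_ok(last_successful_checkin: datetime, now: datetime) -> bool:
    """A node stays licensed while its last successful license-server
    check-in falls within the grace period."""
    return now - last_successful_checkin <= GRACE_PERIOD
```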
## Getting Started

```bash
# 1. Pull the image
docker pull pauhu/enterprise:latest

# 2. Download models (mount the host path so downloads persist)
docker run -v /mnt/models:/mnt/models pauhu/enterprise:latest \
  pauhu models download --all --path /mnt/models

# 3. Start the server
docker run -d \
  -p 8080:8080 \
  -v /mnt/models:/opt/pauhu/models \
  -e PAUHU_LICENSE_KEY=$LICENSE_KEY \
  pauhu/enterprise:latest

# 4. Test
curl http://localhost:8080/v1/translate \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello", "target": "fi"}'
```
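The same smoke test from Python, using only the standard library. `translate` here is an illustrative helper built on the REST endpoint above, not part of the Pauhu SDK, and the response shape is whatever your Pauhu version returns:

```python
import json
import urllib.request

def build_payload(text: str, target: str) -> bytes:
    """Serialize the /v1/translate request body."""
    return json.dumps({"text": text, "target": target}).encode("utf-8")

def translate(text: str, target: str, base_url: str = "http://localhost:8080") -> dict:
    """POST to a self-hosted Pauhu endpoint and return the parsed JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/translate",
        data=build_payload(text, target),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```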