Azure Kubernetes Service (AKS) Guide: Comprehensive Container Orchestration for the Enterprise
TL;DR: Azure Kubernetes Service (AKS) is managed Kubernetes: Azure runs the control plane (free on the Free tier) and you manage only the worker nodes. Deploy containerized apps with auto-scaling (HPA + Cluster Autoscaler + KEDA), self-healing, and rolling updates. Defense-in-depth security: Microsoft Entra ID RBAC + network policies (Calico) + Microsoft Defender for Containers + Azure Policy. A production cluster costs roughly $500–1,000/month. CI/CD integrates through Azure Pipelines or GitHub Actions with Helm charts.
Does your business need to run microservices on Kubernetes without managing the control plane yourself? Contact PUPAM to design and deploy a production AKS cluster with auto-scaling, network policies, monitoring, and a CI/CD pipeline.
AKS vs Other Container Options
| Option | Managed | Complexity | Scaling | Best For |
|---|---|---|---|---|
| AKS | Control plane managed | Medium | Auto-scale (HPA + cluster) | Production containerized apps |
| Azure Container Instances | Fully managed | Low | Manual | Simple containers, batch jobs |
| Azure Container Apps | Fully managed | Low-Medium | Auto (KEDA) | Microservices, event-driven |
| App Service (containers) | Fully managed | Low | Auto-scale | Single container web apps |
| Azure Red Hat OpenShift | Co-managed | Medium-High | Auto-scale | Enterprise with OpenShift |
| Self-managed K8s | You manage all | High | Manual/custom | Full control needed |
When should you choose AKS?
- 5+ microservices that need orchestration
- You need custom pod networking (Azure CNI, network policies)
- You need an Ingress controller (NGINX, Traefik) or a service mesh (Istio, Linkerd)
- The team has Kubernetes expertise
- You need GPU workloads for ML/AI
When do you NOT need AKS?
- Single-container web app → Azure App Service or Container Instances
- ≤10 simple microservices → Azure Container Apps (faster to set up, less configuration)
- Batch/scheduled containers → Container Instances, or Kubernetes Jobs if you already run an AKS cluster
Cluster Creation
Create an AKS Cluster with the Azure CLI
# Prerequisites
az login
az provider register --namespace Microsoft.ContainerService
# Create resource group
az group create \
  --name rg-aks-prod \
  --location southeastasia
# Create AKS cluster
az aks create \
  --resource-group rg-aks-prod \
  --name aks-pupam-prod \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --kubernetes-version 1.29 \
  --network-plugin azure \
  --network-policy calico \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids <entra-group-id> \
  --attach-acr <acr-name> \
  --enable-defender \
  --enable-addons monitoring \
  --zones 1 2 3 \
  --generate-ssh-keys
# Get credentials
az aks get-credentials \
  --resource-group rg-aks-prod \
  --name aks-pupam-prod
# Verify
kubectl get nodes
Create an AKS Cluster in the Azure Portal
- Azure portal → Kubernetes services → + Create
- Basics: resource group, cluster name, region (Southeast Asia), availability zones (1, 2, 3), Kubernetes version 1.29, pricing tier Standard (production SLA)
- Node pools: system pool (Standard_D2s_v5, 3 nodes), user pool (Standard_D4s_v5, 3–10 nodes with auto-scale)
- Networking: Azure CNI, network policy Calico
- Integrations: attach ACR, enable Container Insights, enable Microsoft Defender
- Authentication: enable Microsoft Entra ID, admin group K8s-Admins
- Review + create
Node Pools
| Pool Type | Purpose | VM Size | Count | Scaling |
|---|---|---|---|---|
| System | K8s system pods (CoreDNS, kube-proxy) | D2s_v5 (2 vCPU, 8 GB) | 2–3 (fixed) | Manual |
| User (general) | Application workloads | D4s_v5 (4 vCPU, 16 GB) | 3–10 | Auto-scale |
| User (memory) | Databases, caching | E4s_v5 (4 vCPU, 32 GB) | 2–5 | Auto-scale |
| User (GPU) | ML/AI workloads | NC6s_v3 (6 vCPU, GPU) | 1–4 | Manual |
| Spot | Batch jobs, dev/test | D4s_v5 (Spot price) | 0–10 | Auto-scale |
Add a Node Pool
# Add user pool (general workloads)
az aks nodepool add \
  --resource-group rg-aks-prod \
  --cluster-name aks-pupam-prod \
  --name userpool \
  --node-count 3 \
  --min-count 2 \
  --max-count 10 \
  --enable-cluster-autoscaler \
  --node-vm-size Standard_D4s_v5 \
  --zones 1 2 3
# Add Spot pool (saves 60-80% on compute; nodes can be evicted at any time)
az aks nodepool add \
  --resource-group rg-aks-prod \
  --cluster-name aks-pupam-prod \
  --name spotpool \
  --priority Spot \
  --spot-max-price -1 \
  --eviction-policy Delete \
  --node-count 3 \
  --min-count 0 \
  --max-count 10 \
  --enable-cluster-autoscaler
Node Selectors & Taints
- nodeSelector: agentpool: userpool (schedules pods onto a specific pool)
- System pool taint: CriticalAddonsOnly (system pods only)
- GPU pool taint: nvidia.com/gpu (GPU workloads only)
- Spot pool taint: kubernetes.azure.com/scalesetpriority=spot (batch jobs that tolerate eviction)
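As a minimal sketch of how a workload opts in to the Spot pool, the manifest below combines the agentpool node selector with a toleration for the Spot taint. The Job name, image, and resource requests are hypothetical placeholders, and the pool name spotpool assumes the pool created earlier.

```yaml
# Hypothetical batch Job pinned to the Spot pool; name and image are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report
spec:
  backoffLimit: 4                 # allow retries, e.g. after a pod is lost to a Spot eviction
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        agentpool: spotpool       # AKS labels each node with its node pool name
      tolerations:
        - key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
          effect: NoSchedule      # tolerates the taint on Spot nodes
      containers:
        - name: report
          image: myregistry.azurecr.io/report:1.0.0   # placeholder image
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
```

Without the toleration the pod would never land on a Spot node; without the node selector it could also be scheduled on regular nodes, so the two are usually used together.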
Need advice on a cost-optimized node pool strategy for your workload? PUPAM supports capacity planning and cost optimization for AKS clusters.
Networking
| Network Model | Pod IP | Scalability | Best For |
|---|---|---|---|
| Azure CNI | VNet IP per pod | Large (plan IP range) | Production |
| Azure CNI Overlay | Overlay IPs | Very large | Large clusters |
| kubenet | NAT-based | Limited | Dev/test |
Recommended VNet Layout
| Subnet | CIDR | Purpose |
|---|---|---|
| aks-nodes | 10.240.0.0/16 | Worker nodes |
| aks-pods | 10.241.0.0/16 | Pod IPs (Azure CNI) |
| aks-services | 10.0.0.0/16 | K8s services |
| appgw | 10.242.0.0/24 | Application Gateway |
Ingress Controller Setup
# NGINX Ingress (most popular)
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz
Ingress Options
- NGINX Ingress Controller: community-maintained, the most widely used
- Azure Application Gateway Ingress Controller (AGIC): Azure-native, with integrated WAF
- Traefik: lightweight, config-based
- Istio Gateway: service mesh, advanced traffic management
DNS & TLS
- Azure DNS zone: *.apps.company.com → Ingress Load Balancer IP
- ExternalDNS controller: auto-creates DNS records from Ingress resources
- cert-manager: auto-provisions Let's Encrypt TLS certificates
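A minimal sketch of how these pieces fit together, assuming cert-manager and the NGINX Ingress controller are already installed. The issuer name, email address, hostname, and the myapp Service are hypothetical; the ACME URL is the public Let's Encrypt production endpoint.

```yaml
# ClusterIssuer: tells cert-manager how to obtain certificates from Let's Encrypt.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@company.com                 # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key   # Secret that stores the ACME account key
    solvers:
      - http01:
          ingress:
            class: nginx                   # solve HTTP-01 challenges through NGINX
---
# Ingress: the annotation asks cert-manager to issue a certificate into myapp-tls.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - myapp.apps.company.com
      secretName: myapp-tls
  rules:
    - host: myapp.apps.company.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp                # hypothetical Service
                port:
                  number: 80
```

If ExternalDNS is running, it can read the host from the Ingress rule and create the matching DNS record, so the hostname only has to be declared once.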
Network Policies (Calico)
- Default deny all ingress/egress
- Allow specific pod-to-pod communication
- Allow specific pod-to-internet traffic
- Example: only API pods may talk to DB pods
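The rules above could be sketched with standard Kubernetes NetworkPolicy objects, which Calico enforces. The production namespace, the app: api and app: db labels, and port 5432 are assumptions chosen for illustration.

```yaml
# 1. Default deny: selects every pod in the namespace and allows nothing.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# 2. DB pods accept traffic only from API pods on the database port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: db
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api
      ports:
        - protocol: TCP
          port: 5432
---
# 3. API pods may open connections to DB pods and perform DNS lookups.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: db
      ports:
        - protocol: TCP
          port: 5432
    - ports:                  # no "to" clause: DNS is allowed to any destination
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

The third policy is easy to forget: once egress is denied by default, the calling side needs its own allow rule, and DNS has to be permitted or service names stop resolving.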
Scaling
| Type | What Scales | Trigger | Config |
|---|---|---|---|
| HPA | Pod replicas | CPU/Memory/Custom metric | kubectl autoscale |
| VPA | Pod resources | Actual usage | VerticalPodAutoscaler CRD |
| Cluster Autoscaler | Nodes | Pending pods | --enable-cluster-autoscaler |
| KEDA | Pod replicas | Event-driven (queue, HTTP) | ScaledObject CRD |
Horizontal Pod Autoscaler (HPA)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
Cluster Autoscaler (Node Scaling)
- Pending (unschedulable) pods → a node is added automatically
- Underutilized nodes (below 50% utilization for 10 minutes) → drained and removed
- Scale up: ~3–5 minutes (VM provisioning)
- Scale down: 10-minute cool-down period
KEDA (Event-Driven Scaling)
- Scale theo Azure Service Bus queue depth
- Scale theo Azure Storage Queue messages
- Scale theo HTTP requests per second
- Scale to 0 replicas to save cost overnight
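As an illustration of queue-based scaling with scale-to-zero, here is a sketch of a KEDA ScaledObject for an Azure Service Bus queue. The Deployment name, queue name, Service Bus namespace, and the servicebus-auth TriggerAuthentication are all hypothetical and assumed to exist already.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: orders-worker            # hypothetical Deployment that consumes the queue
  minReplicaCount: 0               # scale to zero when the queue is empty
  maxReplicaCount: 20
  cooldownPeriod: 300              # seconds to wait before scaling back to zero
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: orders
        namespace: sb-pupam-prod   # Service Bus namespace (placeholder)
        messageCount: "50"         # target number of messages per replica
      authenticationRef:
        name: servicebus-auth      # TriggerAuthentication assumed to be defined separately
```

KEDA creates and manages an HPA for the target behind the scenes, so a Deployment scaled this way should not also have a hand-written HPA on the same metric.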
Helm Charts
Install and Use Helm
# Install Helm
brew install helm  # macOS
# Add chart repositories
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
# Install application
# (example only: avoid passing real passwords on the command line)
helm install my-redis bitnami/redis \
  --namespace redis \
  --create-namespace \
  --set auth.password=mypassword \
  --set replica.replicaCount=3
Custom Helm Chart for Your Application
# Create chart scaffold
helm create myapp
# Deploy with environment-specific values
helm install myapp ./myapp \
  --namespace production \
  -f values-production.yaml
# Upgrade (same namespace as the install)
helm upgrade myapp ./myapp --namespace production -f values-production.yaml
# Roll back to revision 1
helm rollback myapp 1 --namespace production
helm history myapp --namespace production  # view all revisions
Chart Structure
| File | Purpose |
|---|---|
| Chart.yaml | Metadata (name, version, description) |
| values.yaml | Default configuration values |
| templates/deployment.yaml | Deployment manifest |
| templates/service.yaml | Service manifest |
| templates/ingress.yaml | Ingress rules |
| templates/hpa.yaml | Auto-scaling config |
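For context, an environment-specific values file for such a chart might look like the sketch below. The key names assume the default scaffold generated by helm create in recent Helm 3 releases; check them against the generated values.yaml, since a customized chart may use different keys. The registry, tag, and hostname are placeholders.

```yaml
# values-production.yaml: overrides applied on top of the chart's values.yaml
replicaCount: 3
image:
  repository: myregistry.azurecr.io/myapp   # placeholder ACR repository
  tag: "1.4.2"                              # placeholder image tag
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: "1"
    memory: 512Mi
autoscaling:
  enabled: true                             # renders templates/hpa.yaml
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
ingress:
  enabled: true                             # renders templates/ingress.yaml
  className: nginx
  hosts:
    - host: myapp.apps.company.com
      paths:
        - path: /
          pathType: Prefix
nodeSelector:
  agentpool: userpool
```

Keeping one such file per environment leaves the templates identical everywhere and confines the differences to a small, reviewable set of values.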
Helm in the CI/CD Pipeline
- Build Docker image → push to ACR
- Update values.yaml (new image tag)
- Run helm upgrade --install in Azure Pipelines
- Automated deployment on every push
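The steps above could be sketched as an Azure Pipelines definition like the one below. The two service connection names are hypothetical and must be created in the Azure DevOps project first; the chart path, release name, and repository are placeholders. Here the image tag is passed as an override rather than written back into values.yaml.

```yaml
# azure-pipelines.yml: build, push to ACR, then deploy with Helm on every push to main
trigger:
  branches:
    include:
      - main

pool:
  vmImage: ubuntu-latest

variables:
  imageTag: $(Build.BuildId)

steps:
  - task: Docker@2
    displayName: Build and push image to ACR
    inputs:
      command: buildAndPush
      containerRegistry: acr-service-connection     # hypothetical Docker registry service connection
      repository: myapp
      Dockerfile: Dockerfile
      tags: $(imageTag)

  - task: HelmDeploy@0
    displayName: helm upgrade --install
    inputs:
      connectionType: Azure Resource Manager
      azureSubscription: azure-service-connection   # hypothetical ARM service connection
      azureResourceGroup: rg-aks-prod
      kubernetesCluster: aks-pupam-prod
      namespace: production
      command: upgrade
      install: true                                 # adds --install for the first deployment
      chartType: FilePath
      chartPath: ./myapp
      releaseName: myapp
      valueFile: values-production.yaml
      overrideValues: image.tag=$(imageTag)
```

On clusters with Microsoft Entra ID integration and local accounts disabled, the agent may need extra authentication setup, so treat this as a starting point rather than a complete pipeline.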
Monitoring & Security
| Tool | Purpose | Setup |
|---|---|---|
| Container Insights | Logs, metrics, node/pod health | AKS addon (built-in) |
| Prometheus + Grafana | Custom metrics dashboards | Azure Managed Prometheus |
| Defender for Containers | Vulnerability scanning, runtime protection | Enable Microsoft Defender |
| Azure Policy | Enforce K8s policies (no privileged pods) | AKS addon |
| Entra Workload Identity | Pod-level identity for Azure resources | Workload identity addon |
Container Insights
- Nodes tab: CPU, memory, disk per node
- Controllers tab: deployment health, restart count
- Containers tab: logs, resource usage per container
- Alerts: node CPU > 80%, pod restart > 3, OOM kills
Defender for Containers
- Image vulnerability scanning (ACR + runtime)
- Runtime threat detection (suspicious exec, crypto mining)
- Network anomaly detection (unexpected egress)
- Compliance: CIS Kubernetes Benchmark
Azure Policy for AKS
- No privileged containers
- Require resource limits (CPU/memory)
- Only allow images from approved ACR
- Require pod labels (app, version, team)
- No hostNetwork or hostPID
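To show what these policies expect from a workload, here is a sketch of a Deployment written to satisfy each rule listed above. It illustrates the workload side only, not the policy definitions themselves; the names, labels, and registry are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
    version: "1.4.2"
    team: platform
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:                        # required pod labels: app, version, team
        app: myapp
        version: "1.4.2"
        team: platform
    spec:
      hostNetwork: false             # no host network namespace
      hostPID: false                 # no host PID namespace
      containers:
        - name: myapp
          image: myregistry.azurecr.io/myapp:1.4.2   # image from the approved ACR (placeholder)
          securityContext:
            privileged: false
            allowPrivilegeEscalation: false
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:                  # CPU and memory limits are set explicitly
              cpu: "1"
              memory: 512Mi
```

A manifest missing any of these fields would be denied or flagged, depending on whether the policy assignment uses the deny or audit effect.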
Workload Identity (No Secrets Needed!)
- Pods use a Microsoft Entra ID identity directly → Azure Key Vault, SQL, Storage
- Setup: federated credentials linking a K8s ServiceAccount → managed identity
- Best practice: do not store secrets in K8s; use the Key Vault CSI driver
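On the Kubernetes side, the setup could be sketched as follows, assuming the workload identity add-on is enabled and a federated credential already links the managed identity to this ServiceAccount. The names are hypothetical and the client ID is left as a placeholder.

```yaml
# ServiceAccount annotated with the client ID of the user-assigned managed identity.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp-sa
  namespace: production
  annotations:
    azure.workload.identity/client-id: "<managed-identity-client-id>"
---
# The label opts the pod in to token injection by the workload identity webhook.
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  namespace: production
  labels:
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: myapp-sa
  containers:
    - name: myapp
      image: myregistry.azurecr.io/myapp:1.4.2   # placeholder image
```

With this in place, Azure SDK clients inside the container can pick up the injected federated token and authenticate without any client secret stored in the cluster.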
AKS Production Checklist
- Create the AKS cluster across availability zones (2+ zones) with the Standard tier SLA
- Separate the system pool from user node pools (general, memory, GPU, spot)
- Enable the cluster autoscaler with min/max nodes, plus HPA for critical deployments
- Configure Azure CNI networking + Calico network policies
- Install an Ingress controller (NGINX/AGIC) + cert-manager for TLS
- Enable Microsoft Entra ID authentication + RBAC (ClusterRole/RoleBinding)
- Attach ACR + enable Defender for Containers + Azure Policy
- Set up Helm charts + a CI/CD pipeline (Azure Pipelines/GitHub Actions)
FAQ
Is AKS free? What does it actually cost?
The AKS control plane is free on the Free tier; you pay only for worker nodes (VMs), storage, and networking. AKS Free tier: free control plane, no SLA. AKS Standard tier: about $73/month per cluster, with a 99.95% uptime SLA when the cluster uses availability zones (production). Worker nodes: Standard_D4s_v5 (~$140/month/node) × 3 nodes = ~$420/month. Typical production cluster: 3–5 nodes ($420–700) + load balancer ($18) + disks ($50–100) + ACR ($5–170) = ~$500–1,000/month. Cost optimization: Spot instances (save 60–80%), Reserved VMs (1-year term, save 30–40%), cluster autoscaler (scale down off-hours), KEDA scale-to-zero.
AKS vs Azure Container Apps: which should you choose?
Container Apps for simple microservices (fast, little configuration); AKS for complex orchestration (full control). Container Apps: fully managed, KEDA built in, Dapr integration, scale to zero; deploy in about 5 minutes with no Kubernetes expertise required. AKS: full Kubernetes API, custom networking (CNI, network policies), service mesh (Istio), Helm charts, pod security controls (Pod Security Admission; PodSecurityPolicy was removed in Kubernetes 1.25). Choose Container Apps for ≤10 microservices and event-driven workloads. Choose AKS for complex networking, custom Ingress/service mesh, or GPU workloads.
How do you secure an AKS cluster for production?
Defense-in-depth in 5 layers. Layer 1 (Identity): Microsoft Entra ID + Kubernetes RBAC, disable local accounts. Layer 2 (Network): Azure CNI, Calico policies, private cluster (API server not publicly exposed). Layer 3 (Image): pull only from an approved ACR, Defender scanning, image signing (Notary). Layer 4 (Runtime): Azure Policy (no privileged containers, require resource limits). Layer 5 (Secrets): Azure Key Vault via the CSI driver + Workload Identity. Review your security posture in Microsoft Defender for Cloud → Secure Score.
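For the identity layer, a minimal sketch of granting a Microsoft Entra ID group access to a single namespace is shown below. The binding name and namespace are hypothetical, and the group object ID is left as a placeholder; edit is one of the built-in Kubernetes ClusterRoles.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-edit
  namespace: production
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: "<entra-group-object-id>"   # object ID of the Entra ID group, not its display name
roleRef:
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
  name: edit                          # built-in role: read/write most objects in the namespace
```

Binding a ClusterRole through a namespaced RoleBinding keeps the permission set reusable while limiting its reach to one namespace.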
Does upgrading the Kubernetes version cause downtime?
No downtime if configured correctly: nodes are upgraded one at a time in a rolling fashion. Azure upgrades the control plane first (transparent to workloads). Node upgrade with a surge setting of 1 extra node: a surge node is added → an old node is drained → it is replaced → repeat. Pod Disruption Budget (PDB): set maxUnavailable=1. Tips: test on a dev cluster first, set surge to 33%, run ≥2 replicas per deployment, and define readiness probes. Upgrade cycle: N-2 support (with 1.29 current, 1.27 is still supported). Schedule a maintenance window under AKS → Maintenance (off-peak hours).
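The PDB recommendation could be sketched as the manifest below, assuming a Deployment whose pods carry the label app: myapp in a production namespace (both hypothetical).

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: production
spec:
  maxUnavailable: 1        # a node drain may evict at most one matching pod at a time
  selector:
    matchLabels:
      app: myapp
```

The budget only helps when the Deployment runs at least two replicas; with a single replica, maxUnavailable: 1 still allows the only pod to be evicted.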
Does AKS integrate with Azure DevOps CI/CD?
Yes. Azure Pipelines deploys to AKS natively, or you can use GitHub Actions + Helm. Pipeline: build Docker image → push to ACR → update Helm values (image tag) → helm upgrade --install. Azure Pipelines tasks: KubernetesManifest@1 or HelmDeploy@0. Service connection: authenticates through Microsoft Entra ID (managed identity). GitOps alternative: Flux v2 (AKS cluster extension), where you declare the desired state in a Git repo and Flux syncs it to the cluster automatically.
Can AKS run Windows containers?
Yes. AKS supports Windows node pools alongside Linux. Add a Windows node pool: az aks nodepool add --os-type Windows --name winnp (plus the usual --resource-group and --cluster-name). Use cases: legacy .NET Framework apps, Windows-specific workloads. Note: the system pool must be Linux; Windows nodes are for user pools only. Windows nodes cost roughly 20% more because of Windows Server licensing. Recommendation: where possible, migrate .NET Framework apps to .NET 8 on Linux containers to reduce cost.
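In a mixed cluster, Windows workloads need an explicit node selector so the scheduler does not place them on Linux nodes. A minimal sketch follows, using Microsoft's public ASP.NET sample image as a stand-in; the Deployment name is hypothetical, and the image's Windows version must be compatible with the node OS version.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-dotnet
spec:
  replicas: 2
  selector:
    matchLabels:
      app: legacy-dotnet
  template:
    metadata:
      labels:
        app: legacy-dotnet
    spec:
      nodeSelector:
        kubernetes.io/os: windows      # schedule only onto Windows nodes
      containers:
        - name: web
          image: mcr.microsoft.com/dotnet/framework/samples:aspnetapp   # stand-in sample image
```

Linux workloads in the same cluster benefit from the mirror-image selector, kubernetes.io/os: linux, for the same reason.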
Next Steps
- Create a development AKS cluster: one Standard_D2s_v5 node pool, deploy a sample workload with kubectl, and get familiar with cluster management
- Enable the security baseline: Microsoft Defender for Containers + Azure Policy add-on + Microsoft Entra ID RBAC
- Design the production architecture: decide on the node pool strategy (system/user/spot), networking model (Azure CNI), Ingress controller, and a CI/CD pipeline with Helm charts
Conclusion
| Aspect | Details |
|---|---|
| AKS Model | Control plane free on the Free tier; pay for worker nodes (~$500–1,000/month in production) |
| Scaling | HPA (pods) + Cluster Autoscaler (nodes) + KEDA (event-driven, scale to zero) |
| Security | 5-layer defense: Entra ID RBAC, Calico, Defender, Azure Policy, Key Vault |
AKS is production-grade managed Kubernetes on Azure: the control plane is free on the Free tier, and Azure handles upgrades, patching, and high availability. Node pools provide workload isolation (system, general, GPU, spot). Auto-scaling works at three levels: HPA (pods), Cluster Autoscaler (nodes), and KEDA (event-driven, scale to zero). Defense-in-depth security: Microsoft Entra ID RBAC, Calico network policies, Microsoft Defender for Containers, and Azure Policy enforcement. CI/CD: Azure Pipelines or GitHub Actions + Helm charts. For Vietnamese businesses that want to run a microservices architecture without managing the Kubernetes control plane themselves, AKS is a leading choice.
Need to deploy a production AKS cluster for your business? Contact PUPAM for cluster architecture design, node pool strategy, networking, security baseline, monitoring, and CI/CD pipelines.