Install on EKS¶
TIP: Make sure you have enough GPU quota in your AWS account.
The default quotas for GPU instances are often 0. You will need to request a quota increase for the GPU instances you want to use.
The following quotas may require an increase if you wish to use GPUs in your EKS cluster: - All G and VT Spot Instance Requests - All P5 Spot Instance Requests - All P4, P3 and P2 Spot Instance Requests - Running Dedicated p4d Hosts
1. Create EKS cluster with Karpenter¶
Set the environment variables used throughout this guide:
export CLUSTER_NAME="cluster-with-karpenter"
export AWS_DEFAULT_REGION="us-west-2"
export K8S_VERSION="1.30"
export GPU_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-gpu/recommended/image_id --query Parameter.Value --output text)"
Create the EKS cluster using eksctl:
eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
name: "${CLUSTER_NAME}"
region: "${AWS_DEFAULT_REGION}"
version: "${K8S_VERSION}"
tags:
karpenter.sh/discovery: "${CLUSTER_NAME}" # here, it is set to the cluster name
iam:
withOIDC: true # required
karpenter:
version: '1.0.6' # Exact version must be specified
managedNodeGroups:
- instanceType: m5.large
amiFamily: AmazonLinux2
name: "${CLUSTER_NAME}-m5-ng"
desiredCapacity: 2
minSize: 1
maxSize: 10
EOF
2. Configure a Karpenter GPU NodePool¶
Create the NodePool and EC2NodeClass objects:
kubectl apply -f - <<EOF
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
name: gpu
spec:
template:
spec:
requirements:
- key: karpenter.sh/capacity-type
operator: In
values: ["spot", "on-demand"]
- key: karpenter.k8s.aws/instance-category
operator: In
values: ["g", "p"]
nodeClassRef:
group: karpenter.k8s.aws
kind: EC2NodeClass
name: gpu
expireAfter: 720h # 30 * 24h = 720h
taints:
- key: nvidia.com/gpu
value: "true"
effect: NoSchedule
limits:
cpu: 1000
disruption:
consolidationPolicy: WhenEmptyOrUnderutilized
consolidateAfter: 1m
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
name: gpu
spec:
amiFamily: AL2 # Amazon Linux 2
role: "eksctl-KarpenterNodeRole-${CLUSTER_NAME}"
subnetSelectorTerms:
- tags:
karpenter.sh/discovery: "${CLUSTER_NAME}" # replace with your cluster name
securityGroupSelectorTerms:
- tags:
karpenter.sh/discovery: "${CLUSTER_NAME}" # replace with your cluster name
amiSelectorTerms:
- id: "${GPU_AMI_ID}" # <- GPU Optimized AMD AMI
blockDeviceMappings:
- deviceName: /dev/xvda
ebs:
volumeSize: 300Gi
volumeType: gp3
encrypted: true
EOF
Install the NVIDIA device plugin (needed for Karpenter nodes):
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.16.1/deployments/static/nvidia-device-plugin.yml
3. Install KubeAI¶
Add KubeAI Helm repository.
helm repo add kubeai https://www.kubeai.org
helm repo update
Make sure you have a HuggingFace Hub token set in your environment (HUGGING_FACE_HUB_TOKEN
).
export HF_TOKEN="replace-with-your-huggingface-token"
Install KubeAI with Helm.
curl -L -O https://raw.githubusercontent.com/substratusai/kubeai/refs/heads/main/charts/kubeai/values-eks.yaml
# Please review the values-eks.yaml file and edit the nodeSelectors if needed.
cat values-eks.yaml
helm upgrade --install kubeai kubeai/kubeai \
-f values-eks.yaml \
--set secrets.huggingface.token=$HF_TOKEN \
--wait
3. Optionally configure models¶
Optionally install preconfigured models.
cat <<EOF > kubeai-models.yaml
catalog:
llama-3.1-8b-instruct-fp8-l4:
enabled: true
EOF
helm install kubeai-models kubeai/models \
-f ./kubeai-models.yaml