
Private Deep Chat

In this tutorial you are going to deploy a custom, multitenant, private chat application. The chat UI is powered by Deep Chat - an open-source web component that is easy to embed into any frontend web framework or plain HTML page. KubeAI, which serves LLMs on Kubernetes, will be used to ensure that chat interactions never leave the cluster.


In this example, we will deploy a custom Go server that authenticates users with Basic Authentication. When a page is requested, the server responds with a simple HTML page containing the <deep-chat> web component. We will configure Deep Chat and KubeAI to communicate using the OpenAI API format:

<deep-chat
    connect='{"url":"/openai/v1/chat/completions", ... }'
    directConnection='{"openAI": ... }'>
</deep-chat>

When the HTML page loads, we use JavaScript to make an initial request that fetches the available models. The Go server proxies this request to KubeAI:

proxyHandler := httputil.NewSingleHostReverseProxy(kubeAIURL)

http.Handle("/openai/", authUserToKubeAI(proxyHandler))

The server translates the username and password provided in the Basic Auth header into a label selector that tells KubeAI to filter the models it returns. The same approach is also used to enforce access at inference time.

r.Header.Set("X-Label-Selector", fmt.Sprintf("tenancy in (%s)",
    strings.Join(tenancy, ","),
))

While this is a simple example application, this overall architecture can be used when incorporating chat into a production application.

Architecture

Guide

Create a local cluster with kind and install KubeAI.

kind create cluster

helm repo add kubeai https://www.kubeai.org && helm repo update
helm install kubeai kubeai/kubeai --set openwebui.enabled=false --wait --timeout 5m

Clone the KubeAI repo and navigate to the example directory.

git clone https://github.com/substratusai/kubeai
cd ./kubeai/examples/private-deep-chat

Build the private chat application and load the image into the local kind cluster.

docker build -t private-deep-chat:latest .
kind load docker-image private-deep-chat:latest

Deploy the private chat application along with some KubeAI Models.

kubectl apply -f ./manifests
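Each Model in `./manifests/models.yaml` carries a tenancy label that the server's label selector matches against. Stripped down, a labeled Model looks roughly like the sketch below (the name and tenancy value are placeholders, and the `apiVersion` is assumed from the KubeAI Model CRD; see `./manifests/models.yaml` for the actual definitions):

```yaml
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: some-model    # placeholder name
  labels:
    tenancy: group-a  # placeholder tenancy group
spec:
  # ...model spec omitted; see ./manifests/models.yaml
```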

Start a port-forward.

kubectl port-forward svc/private-deep-chat 8000:80

In your browser, navigate to http://localhost:8000.

NOTE: Your first interaction with a given model may take a minute while the model scales up from zero. Use kubectl get pods to check the scale-up progress.

Login as any of the following users:

User    Password
nick    nickspass
sam     samspass
joe     joespass

These users each have access to a different set of KubeAI Models. You can see these assignments in the user mapping in ./main.go and the corresponding tenancy labels on the Models in ./manifests/models.yaml.
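You can also exercise the proxied endpoint without the browser. The sketch below builds the same model-list request that the page's JavaScript issues on load, authenticated as the `nick` user against the port-forward from the guide (the helper name is ours; error handling is minimal):

```go
package main

import (
	"fmt"
	"net/http"
)

// newModelsRequest builds a Basic Auth request for the model-list endpoint.
// The Go server proxies it to KubeAI, injecting the X-Label-Selector header
// derived from the authenticated user.
func newModelsRequest(base, user, pass string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, base+"/openai/v1/models", nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, pass)
	return req, nil
}

func main() {
	req, err := newModelsRequest("http://localhost:8000", "nick", "nickspass")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL) // → GET http://localhost:8000/openai/v1/models
}
```

While the port-forward is running, pass the request to `http.DefaultClient.Do(req)`; the response lists only the Models whose tenancy label matches nick's groups.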