Kubernetes Production Installation

Deploy Solace Agent Mesh Enterprise on Kubernetes for production with full configuration, high availability, and security.

info

For quick evaluation, see the Kubernetes Quick Start. For air-gapped environments, see the Air-Gapped Kubernetes Installation.

Production vs Quick Start

Quick Start is designed for evaluation only and uses embedded components unsuitable for production:

| Component | Quick Start | Production |
|---|---|---|
| Solace Broker | Embedded in cluster | External Solace Cloud or on-premises broker |
| Storage | Embedded PostgreSQL/SeaweedFS | External S3-compatible storage, managed databases |
| Authentication | Disabled | RBAC + SSO (OIDC) |
| High Availability | Single instance | Multi-replica, auto-recovery |
| TLS/Certificates | None | Full TLS with trusted certificates |
| Monitoring | Basic logs | Full observability (Prometheus, Grafana, and so on) |
| Suitable For | Testing, evaluation, demos | Production workloads |

Production Prerequisites

Kubernetes Platform Requirements

Supported Kubernetes Versions:

  • The three most recent minor versions of upstream Kubernetes
  • For managed services (EKS, AKS, GKE): validated against provider's default release channels
  • For on-premises (OpenShift, Rancher, and so on): compatibility based on underlying K8s API version

Platform Support Matrix:

| Category | Distributions | Support Level |
|---|---|---|
| Validated | AWS EKS, Azure AKS, Google GKE | Tier 1 Support - Explicitly validated by Solace QA |
| Compatible | Red Hat OpenShift, VMware Tanzu (TKG), SUSE Rancher (RKE2), Oracle OKE, Canonical Charmed K8s, Upstream K8s (kubeadm) | Tier 2 Support - Compatible with standard K8s APIs. Proprietary security features (SCCs, PSPs) are customer responsibility |

Constraints & Limitations:

  • Node Architecture: Standard worker nodes (VMs or bare metal) required
    • Not Supported: Serverless nodes (AWS Fargate, GKE Autopilot, Azure Virtual Nodes)
  • Security Context: Containers run as non-root (UID 999), no privileged capabilities required
    • For OpenShift: May need to add service account to nonroot SCC if cluster enforces restricted-v2
  • Monitoring: Solace Agent Mesh does NOT deploy DaemonSets - observability is customer responsibility

Compute Resources

Processor Architecture:

  • ARM64 (recommended) - Better price/performance (AWS Graviton, Azure Cobalt, Google Axion)
  • x86_64 (Intel/AMD) - Fully supported

Recommended Production Node Sizing:

Use latest-generation general-purpose nodes with 1:4 CPU:Memory ratio:

| Specification | vCPU | RAM | ARM Examples | x86 Examples |
|---|---|---|---|---|
| Recommended | 4 | 16 GiB | AWS m8g.xlarge, Azure Standard_D4ps_v6, GCP c4a-standard-4 | AWS m8i.xlarge, Azure Standard_D4s_v6, GCP n2-standard-4 |
| Minimum | 2 | 8 GiB | AWS m8g.large, Azure Standard_D2ps_v6, GCP c4a-standard-2 | AWS m8i.large, Azure Standard_D2s_v6, GCP n2-standard-2 |
Instance Availability

If the listed instance types are unavailable in your region, choose the next closest equivalent (for example, m7g.large instead of m8g.large).

Component Resource Specifications:

Default resource requests and limits for core components (excluding sidecar overhead):

| Component | Description | CPU Request | CPU Limit | RAM Request | RAM Limit | QoS Class |
|---|---|---|---|---|---|---|
| Agent Mesh | Core services, Orchestrator, WebUI | 1000m | 2000m | 1024 MiB | 2048 MiB | Burstable |
| Deployer | Manages agent/gateway deployments | 100m | 200m | 256 MiB | 512 MiB | Burstable |
| Agent (per instance) | Runtime for each deployed agent | 1000m | 2000m | 1024 MiB | 2048 MiB | Burstable |

Capacity Planning:

Budget the following per concurrent agent you plan to deploy:

  • Memory Request: 1024 MiB (1 GiB)
  • Memory Limit: 2048 MiB (2 GiB)
  • CPU Request: 1000m (1 vCPU)
  • CPU Limit: 2000m (2 vCPU)
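
The per-agent budget above can be turned into a rough sizing calculation. A minimal sketch using the request figures from the component table in this section; the 75% allocatable-capacity factor and the helper function names are illustrative assumptions, not chart values:

```python
import math

# Requests from the component table above (millicores / MiB).
CORE_CPU_M, CORE_MEM_MIB = 1000, 1024        # Agent Mesh core
DEPLOYER_CPU_M, DEPLOYER_MEM_MIB = 100, 256  # Deployer
AGENT_CPU_M, AGENT_MEM_MIB = 1000, 1024      # per agent instance

def required_capacity(num_agents: int) -> tuple[int, int]:
    """Total CPU (millicores) and memory (MiB) requests for the mesh."""
    cpu = CORE_CPU_M + DEPLOYER_CPU_M + num_agents * AGENT_CPU_M
    mem = CORE_MEM_MIB + DEPLOYER_MEM_MIB + num_agents * AGENT_MEM_MIB
    return cpu, mem

def nodes_needed(num_agents: int, node_vcpu: int = 4, node_mem_gib: int = 16,
                 allocatable: float = 0.75) -> int:
    """Nodes required, assuming ~75% of each node is allocatable (assumption)."""
    cpu, mem = required_capacity(num_agents)
    by_cpu = math.ceil(cpu / (node_vcpu * 1000 * allocatable))
    by_mem = math.ceil(mem / (node_mem_gib * 1024 * allocatable))
    return max(by_cpu, by_mem, 1)

print(required_capacity(5))  # (6100, 6400)
print(nodes_needed(5))       # 3
```

With five concurrent agents and the recommended 4 vCPU / 16 GiB nodes, CPU (not memory) is the binding constraint.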

External Services (Required)

Production deployments must use managed external services. Embedded components are not supported for production.

Database:

  • PostgreSQL 17+ (AWS RDS, Azure Database for PostgreSQL, Cloud SQL, and so on)
  • Admin credentials with SUPERUSER privileges (recommended), or at minimum CREATEROLE and CREATEDB
  • The Agent Mesh init container uses admin credentials to automatically create users and databases
  • See Session Storage for configuration

Object Storage:

  • S3-compatible API (AWS S3, Azure Blob Storage, Google Cloud Storage)
  • See Artifact Storage for configuration

Solace Event Broker:

  • Solace Cloud-managed broker (recommended) or self-hosted PubSub+
  • SMF over TLS (port 55443) or WebSocket Secure connectivity

LLM Service:

  • OpenAI, Azure OpenAI, AWS Bedrock, or compatible endpoint
  • Can be configured post-install via Model Config UI or pre-configured in values.yaml

Identity Provider (IdP):

  • OIDC-compliant provider (Microsoft Entra ID, Okta, AWS Cognito, and so on)
  • Required for SSO and RBAC
  • See Single Sign-On for configuration

S3 Bucket for OpenAPI Connector Specs (Optional):

If using OpenAPI Connector features, a separate S3 bucket is required:

  • Public read access - Agents download specs without authentication
  • Authenticated write - Agent Mesh platform uploads/manages specs
  • Security: Only API schemas stored, never credentials
  • Configure via dataStores.s3.connectorSpecBucketName in values.yaml
Why Separate Bucket?

OpenAPI connector specs must be publicly readable for agents to download at startup, while artifact storage requires authentication. Separation maintains security boundaries.

For detailed setup instructions, see S3 Buckets for OpenAPI Connector Specs in the following section.

Network Connectivity

Inbound Traffic:

  • Ingress controller required (NGINX, ALB, and so on)
  • TLS certificate for production (via cert-manager or manual)

Outbound Platform Access:

The following outbound connectivity is required:

| Destination | Purpose | Notes |
|---|---|---|
| gcr.io/gcp-maas-prod | Container registry | Requires Pull Secret from Solace Cloud Console, or mirror to a private registry |
| *.messaging.solace.cloud | Solace Cloud broker | Or self-hosted broker at SMF/SMF+TLS ports |
| LLM provider endpoints | Model inference | For example, api.openai.com, Azure OpenAI endpoints |
| Identity Provider (IdP) | Authentication/authorization | Your OIDC provider endpoints |

Corporate Proxy Support:

  • HTTP/HTTPS proxy configuration supported for egress filtering
  • See Proxy Configuration for setup

Application & Mesh Components:

  • Custom agents may require access to external systems (Salesforce, Jira, databases, and so on)
  • Ensure worker nodes have network reachability to target services

Command-Line Tools

  • Helm v3.0 or later (installation guide)
  • kubectl configured with appropriate RBAC permissions
  • Optional: helm diff plugin for upgrade previews

Additional Requirements

TLS Certificates:

  • Required for production Ingress or LoadBalancer/NodePort with TLS
  • Managed via Ingress annotations (cert-manager, ACM) or manual Secret creation
  • See values.yaml inline documentation for configuration options

RBAC Permissions:

  • Namespace creation and management
  • Deployment, Service, ConfigMap, Secret creation
  • PVC creation (if using bundled persistence for dev/staging)

Queue Template Configuration (Recommended):

For production Kubernetes deployments, configure Solace broker queue templates to prevent message buildup and startup issues. See Queue Template Configuration for Kubernetes in Step 2 for detailed setup instructions.

Step 1: Infrastructure Preparation

Prepare your Kubernetes cluster infrastructure before deploying Agent Mesh.

Cluster Sizing

Production Cluster Requirements:

For production deployments using external components (no embedded broker/persistence), plan for the following baseline resources:

  • Minimum per node: 2 vCPU / 8 GiB RAM
  • Recommended per node: 4 vCPU / 16 GiB RAM

Per-Agent Capacity Planning:

Budget the following per concurrent agent:

  • CPU Request: 175m
  • CPU Limit: 200m
  • Memory Request: 625 MiB
  • Memory Limit: 768 MiB

Node Instance Examples:

| Specification | ARM64 (Recommended) | x86_64 |
|---|---|---|
| Recommended (4 vCPU / 16 GiB) | AWS m8g.xlarge, Azure Standard_D4ps_v6, GCP c4a-standard-4 | AWS m8i.xlarge, Azure Standard_D4s_v6, GCP n2-standard-4 |
| Minimum (2 vCPU / 8 GiB) | AWS m8g.large, Azure Standard_D2ps_v6, GCP c4a-standard-2 | AWS m8i.large, Azure Standard_D2s_v6, GCP n2-standard-2 |

ARM64 Recommended

ARM64 instances (AWS Graviton, Azure Cobalt, Google Axion) offer better price/performance. If listed instances are unavailable in your region, choose the next closest equivalent (for example, m7g.large instead of m8g.large).

Node Pool Topology (Multi-AZ Clusters)

Stateless Workloads

When using external persistence (recommended for production), all Agent Mesh workloads are stateless and do not have multi-AZ topology constraints. This section is only relevant if you're using embedded persistence for dev/staging environments on production-grade clusters.

In multi-AZ clusters (EKS, AKS, GKE), when using embedded persistence (global.persistence.enabled: true), one node pool must be provisioned per availability zone due to volume affinity constraints.

Simplest Approach:

Provision Agent Mesh in one availability zone only to avoid multi-AZ complexity.

Why This Matters:

StatefulSets with persistent volumes (PostgreSQL, SeaweedFS) are bound to specific zones. When nodes span multiple AZs without proper node pool configuration, pod scheduling can fail if the PVC and node are in different zones.

Step 2: External Dependencies

Configure external services required for production Agent Mesh deployments.

Solace Broker Configuration

Set up your external Solace event broker before installing Agent Mesh.

Solace Cloud (Recommended):

  1. Create a service in Solace Cloud
  2. Navigate to Cluster Manager → Your Service → Connect
  3. Switch dropdown to View by Language
  4. Select Solace Python with SMF protocol
  5. Note the following credentials:
    • Secured SMF URI (for broker.url)
    • Message VPN (for broker.vpn)
    • Username (for broker.clientUsername)
    • Password (for broker.password)

Self-Hosted PubSub+:

  • Ensure SMF over TLS (port 55443) or WebSocket Secure connectivity
  • Provide connection details in values.yaml (see the configuration in Step 3)

Queue Template Configuration for Kubernetes

For production Kubernetes deployments, configure your Solace broker to use durable queues with message TTL to prevent queue buildup and startup issues.

Why Durable Queues for Kubernetes?

When USE_TEMPORARY_QUEUES=true (default), Agent Mesh uses temporary endpoints for agent-to-agent communication. Temporary queues are automatically created and deleted by the broker, but they do not support multiple client connections to the same queue.

In container-managed environments like Kubernetes, this causes problems:

  • A new pod may start while the previous instance is still terminating
  • The new pod cannot connect because the old pod still holds the temporary queue
  • Pod startup fails until the old instance fully terminates

Solution: Use Durable Queues

Durable queues persist beyond client disconnections and allow multiple instances to connect to the same queue.

Step 1: Configure Agent Mesh to Use Durable Queues

Set the following in your Helm values:

# In your production-overrides.yaml
environmentVariables:
  USE_TEMPORARY_QUEUES: "false"

Step 2: Create Queue Template in Solace Cloud Console

To prevent messages from piling up when agents are not running, configure message TTL (time-to-live) via a Queue Template:

  1. Navigate to Message VPNs and select your VPN
  2. Go to the Queues page
  3. Open the Templates tab
  4. Click + Queue Template

Template Settings:

| Setting | Value | Location |
|---|---|---|
| Queue Name Filter | {NAMESPACE}/> | Replace {NAMESPACE} with your Agent Mesh namespace (for example, sam/>) |
| Respect TTL | true | Advanced Settings → Message Expiry |
| Maximum TTL (sec) | 18000 | Advanced Settings → Message Expiry |

Template Application

Queue templates only apply to new queues created by messaging clients. If you already have durable queues from previous deployments, either:

  • Manually enable TTL and Respect TTL on each existing queue in Solace console, OR
  • Delete existing queues and restart Agent Mesh to recreate them with the template settings

Step 3: Verify Configuration

After deploying Agent Mesh, verify queues are created with correct settings:

  1. In Solace Cloud Console, navigate to Queues
  2. Find queues matching your namespace pattern (for example, sam/...)
  3. Check that Respect TTL is enabled
  4. Verify Maximum TTL is set to 18000 seconds

For more details on queue configuration, see Queue Template Configuration.

S3-Compatible Storage

Configure external object storage for artifacts and session data.

S3 Buckets for OpenAPI Connector Specs

If you plan to use the OpenAPI Connector feature for REST API integrations, you must configure a dedicated S3 bucket for OpenAPI specification files. This is separate from artifact storage and required for agents to download OpenAPI specs at startup.

When is a Connector Specs Bucket Required?

  • When using the OpenAPI Connector feature for REST API integrations
  • When agents must access OpenAPI spec files at startup
  • For all Kubernetes deployments using OpenAPI connectors
  • Not required if you're not using OpenAPI Connector features

Why a Separate Bucket?

  • Public read access - Agents must download OpenAPI specs without authentication
  • Security isolation - Keeps infrastructure files separate from user artifacts
  • No secrets - Only API schemas, endpoints, and models are stored (never credentials)

Setup Instructions

Step 1: Create the Connector Specs S3 Bucket

aws s3 mb s3://my-connector-specs-bucket --region us-west-2

Replace my-connector-specs-bucket with your bucket name and us-west-2 with your region.

Step 2: Apply Public Read Policy

Create a policy file connector-specs-policy.json:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicReadGetObject",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-connector-specs-bucket/*"
  }]
}

Apply the policy:

aws s3api put-bucket-policy \
  --bucket my-connector-specs-bucket \
  --policy file://connector-specs-policy.json

Step 3: Configure IAM Permissions for Agent Mesh Platform

Grant the Agent Mesh platform's IAM user/role the following permissions for write access:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:PutObject",
      "s3:DeleteObject",
      "s3:ListBucket"
    ],
    "Resource": [
      "arn:aws:s3:::my-connector-specs-bucket",
      "arn:aws:s3:::my-connector-specs-bucket/*"
    ]
  }]
}

Step 4: Configure in Helm Values

Add the bucket name to your production-overrides.yaml:

dataStores:
  s3:
    bucketName: "sam-artifacts" # Main artifact storage
    connectorSpecBucketName: "my-connector-specs-bucket" # OpenAPI connector specs
    region: "us-west-2"
    # ... other S3 config

Other Cloud Providers

For Azure Blob Storage, Google Cloud Storage, or other S3-compatible storage, configure the connectorSpecBucketName (or equivalent container name) in your Helm values under the appropriate dataStores section. Ensure the bucket/container has public read access and that the Agent Mesh platform has write permissions.

Security Best Practices

Security Guidelines
  • Never store API keys, passwords, or secrets in OpenAPI spec files
  • Public read is safe - only API schemas are stored
  • Write access should be restricted to the Agent Mesh platform

Verification

After setup, verify the bucket is accessible:

# Test public read access (should work without credentials)
curl https://my-connector-specs-bucket.s3.amazonaws.com/test-spec.yaml

If you get a 403 Forbidden error, check your bucket policy. If you get 404 Not Found, the bucket is correctly configured (just no files uploaded yet).

Step 3: Helm Chart Configuration

Create a production overrides file (production-overrides.yaml) based on the inline documentation in the chart's values.yaml.

Embedded vs External Components

Production deployments must use external components. Embedded PostgreSQL, SeaweedFS, and Solace broker lack high availability, backup/restore, and proper resource limits.

Disable Embedded Components

global:
  broker:
    embedded: false # Use external Solace broker
  persistence:
    enabled: false # Use external datastores

External Broker Configuration

broker:
  url: "tcps://your-broker.messaging.solace.cloud:55443"
  clientUsername: "your-username"
  password: "your-password"
  vpn: "your-vpn"

External Datastores

Configure PostgreSQL database and object storage. The applicationPassword is required. This single password is used for all database users created by Agent Mesh (webui, orchestrator, platform, and all agents).

Password Rotation Limitation

After database users are created for a given namespaceId, the applicationPassword cannot be changed. To change passwords, you must either use a new namespaceId (creates new databases/users) or manually update passwords directly in the database.

AWS RDS + S3

dataStores:
  database:
    protocol: "postgresql+psycopg2"
    host: "mydb.abc123.us-east-1.rds.amazonaws.com"
    port: "5432"
    adminUsername: "postgres"
    adminPassword: "your-rds-password"
    applicationPassword: "your-secure-app-password" # REQUIRED

  objectStorage:
    type: "s3"

  s3:
    endpointUrl: "https://s3.us-east-1.amazonaws.com"
    bucketName: "my-sam-artifacts"
    connectorSpecBucketName: "my-sam-connector-specs"
    accessKey: "AKIAIOSFODNN7EXAMPLE"
    secretKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    region: "us-east-1"

Supabase

If using Supabase with the connection pooler (required for IPv4 networks):

dataStores:
  database:
    protocol: "postgresql+psycopg2"
    host: "aws-1-us-east-1.pooler.supabase.com"
    port: "5432"
    adminUsername: "postgres"
    adminPassword: "your-supabase-postgres-password"
    applicationPassword: "your-secure-app-password"
    supabaseTenantId: "your-project-id" # Extract from connection string

  s3:
    endpointUrl: "https://your-project-id.storage.supabase.co/storage/v1/s3"
    bucketName: "your-bucket-name"
    connectorSpecBucketName: "your-connector-specs-bucket-name"
    accessKey: "your-supabase-s3-access-key"
    secretKey: "your-supabase-s3-secret-key"

Supabase Direct Connection

If using Supabase's Direct Connection with IPv4 addon, omit the supabaseTenantId field.

NeonDB

dataStores:
  database:
    protocol: "postgresql+psycopg2"
    host: "ep-cool-name-123456.us-east-2.aws.neon.tech"
    port: "5432"
    adminUsername: "neondb_owner"
    adminPassword: "your-neon-password"
    applicationPassword: "your-secure-app-password"

  s3:
    endpointUrl: "https://s3.amazonaws.com"
    bucketName: "my-sam-artifacts"
    connectorSpecBucketName: "my-sam-connector-specs"
    accessKey: "your-access-key"
    secretKey: "your-secret-key"

Azure Blob Storage

Option 1 — account name and key:

dataStores:
  objectStorage:
    type: "azure"

  database:
    protocol: "postgresql+psycopg2"
    host: "your-postgres-host"
    port: "5432"
    adminUsername: "postgres"
    adminPassword: "your-db-password"
    applicationPassword: "your-secure-app-password"

  azure:
    accountName: "mystorageaccount"
    accountKey: "your-azure-storage-account-key"
    containerName: "my-sam-artifacts"
    connectorSpecContainerName: "my-sam-connector-specs"

Option 2 — connection string:

azure:
  connectionString: "DefaultEndpointsProtocol=https;AccountName=mystorageaccount;AccountKey=...;EndpointSuffix=core.windows.net"
  containerName: "my-sam-artifacts"
  connectorSpecContainerName: "my-sam-connector-specs"

Google Cloud Storage

dataStores:
  objectStorage:
    type: "gcs"

  database:
    protocol: "postgresql+psycopg2"
    host: "your-postgres-host"
    port: "5432"
    adminUsername: "postgres"
    adminPassword: "your-db-password"
    applicationPassword: "your-secure-app-password"

  gcs:
    project: "my-gcp-project"
    credentialsJson: '{"type":"service_account","project_id":"my-gcp-project",...}'
    bucketName: "my-sam-artifacts"
    connectorSpecBucketName: "my-sam-connector-specs"

Workload Identity

Workload identity allows Agent Mesh pods to authenticate with cloud storage using the pod's Kubernetes service account, eliminating static credentials (access keys, account keys, JSON key files).

dataStores:
  objectStorage:
    type: "s3" # or "azure" or "gcs"
    workloadIdentity:
      enabled: true

samDeployment:
  serviceAccount:
    annotations:
      # AWS IRSA:
      eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/my-sam-role"
      # OR Azure Workload Identity:
      azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000000"
      # OR GCP Workload Identity:
      iam.gke.io/gcp-service-account: "my-sa@my-project.iam.gserviceaccount.com"

Per-provider setup (high-level):

  • AWS IRSA: Create IAM role with S3 permissions, associate with K8s service account, omit accessKey/secretKey
  • Azure Workload Identity: Create managed identity with Storage Blob Data Contributor role, establish federated credential, omit accountKey/connectionString
  • GCP Workload Identity: Create GCP service account with Storage Object Admin, bind to K8s service account, omit credentialsJson

Authorization and OIDC

sam:
  authorization:
    enabled: true

  oauthProvider:
    oidc:
      issuer: "https://login.microsoftonline.com/YOUR-TENANT-ID/v2.0"
      clientId: "your-client-id"
      clientSecret: "your-client-secret"

Ingress

ingress:
  enabled: true
  className: "nginx"
  host: "sam.example.com"
  autoConfigurePaths: true
  tls:
    - secretName: sam-tls-cert
      hosts:
        - sam.example.com

Secret Management

  • SESSION_SECRET_KEY is auto-generated if not provided
  • After generation, it is preserved across upgrades to prevent session invalidation
  • For production, explicitly set it for consistency: sam.sessionSecretKey: "your-secret-key"
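
One way to produce a value for sam.sessionSecretKey is to generate a long random string and store it in your overrides or a secret manager. A minimal sketch using Python's standard library; the exact format Agent Mesh expects is not specified here, so a long URL-safe random string is an assumption:

```python
import secrets

# 48 random bytes -> 64 URL-safe characters; comfortably long for a session key.
session_secret_key = secrets.token_urlsafe(48)
print(session_secret_key)
```

Set the printed value as sam.sessionSecretKey in production-overrides.yaml.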

LLM Configuration

  • LLM can be configured post-install via the Model Config UI
  • Alternatively, pre-configure in values.yaml under llmService.*

Custom CA Certificates

If your internal infrastructure (Solace broker, OIDC provider, LLM service) uses self-signed or private CA certificates, configure Agent Mesh to trust them. This applies to:

  • Solace broker with custom CA
  • OIDC provider (Keycloak, and so on) with custom CA
  • LLM service with custom CA
Certificate Requirements
  • Use the CA certificate (issuer), not the server certificate
  • Certificate must include SAN (Subject Alternative Name) extension
  • File must have .crt extension (REQUIRED)
  • PEM format with -----BEGIN CERTIFICATE----- headers

Before creating the ConfigMap, verify your CA certificate includes SAN:

openssl x509 -in ca-cert.pem -noout -text | grep -A1 "Subject Alternative Name"

If your certificate is not in PEM format, convert first:

# DER → PEM
openssl x509 -inform der -in ca.der -out ca.crt

# PKCS#7 → PEM
openssl pkcs7 -print_certs -in ca.p7b -out ca.crt

# PKCS#12 → PEM (CA certs only)
openssl pkcs12 -in ca.pfx -out ca.crt -nokeys -cacerts

Step 1: Prepare the CA bundle. Ensure the file has a .crt extension:

cp ca-cert.pem ca-cert.crt

Step 2: Create Kubernetes ConfigMap:

# Single CA
kubectl create configmap truststore \
  --from-file=ca.crt=/path/to/your-ca.crt \
  -n <namespace>

# Multiple CAs (each key must end in .crt)
kubectl create configmap truststore \
  --from-file=ca1.crt=/path/to/ca1.crt \
  --from-file=ca2.crt=/path/to/ca2.crt \
  -n <namespace>

Step 3: Enable in production-overrides.yaml:

samDeployment:
  customCA:
    enabled: true
    configMapName: "truststore" # Optional: default is "truststore"

Step 4: Install or upgrade. The chart injects a ca-merge init container that merges your CA bundle with the system trust store:

helm upgrade sam /path/to/charts/solace-agent-mesh-<version>.tgz \
  -n <namespace> \
  -f production-overrides.yaml

To rotate or update certificates, delete the ConfigMap, create a new one, then restart the deployment with kubectl rollout restart deployment/sam-solace-agent-mesh-core -n <namespace>.

  • If the ConfigMap does not exist at pod start, Agent Mesh falls back to the system CA bundle silently
  • Pod restart is always required for CA changes (no hot reload)
  • ConfigMap name can be customized via customCA.configMapName if truststore conflicts

Step 4: Pre-Installation Validation

Validate your production configuration before deploying:

Dry-run installation:

helm install sam /path/to/charts/solace-agent-mesh-<version>.tgz \
  --namespace sam \
  --dry-run \
  -f production-overrides.yaml

Validate Kubernetes manifests (optional):

helm template sam /path/to/charts/solace-agent-mesh-<version>.tgz \
  -f production-overrides.yaml | \
  kubeconform -strict -summary -kubernetes-version 1.28.0

Verify external service connectivity:

  • Confirm PostgreSQL is accessible from the cluster
  • Confirm S3 endpoint is accessible
  • Confirm Solace broker is reachable
  • Confirm LLM endpoint is accessible (if pre-configured)
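
These checks can be scripted with a simple TCP probe. The sketch below is illustrative: the hostnames and ports are placeholders taken from examples in this guide, and it verifies only TCP reachability (not TLS handshakes or credentials). Run it from a debug pod inside the cluster so it exercises the same network path the Agent Mesh pods will use.

```python
import socket

def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder endpoints; substitute your real hosts.
CHECKS = {
    "PostgreSQL":    ("mydb.abc123.us-east-1.rds.amazonaws.com", 5432),
    "S3 endpoint":   ("s3.us-east-1.amazonaws.com", 443),
    "Solace broker": ("your-broker.messaging.solace.cloud", 55443),
    "LLM endpoint":  ("api.openai.com", 443),
}

if __name__ == "__main__":
    for name, (host, port) in CHECKS.items():
        status = "OK" if tcp_reachable(host, port) else "UNREACHABLE"
        print(f"{name:14s} {host}:{port} -> {status}")
```

A failed check points at DNS, firewall, or security-group configuration rather than Agent Mesh itself.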

Step 5: Installation

Install Agent Mesh with your production overrides:

helm install sam /path/to/charts/solace-agent-mesh-<version>.tgz \
  --namespace sam \
  --create-namespace \
  -f production-overrides.yaml

The chart's default values.yaml contains inline documentation for all configuration options. Consult it when creating your production overrides.

Step 6: Post-Installation Configuration

Verify Production Readiness:

  • Persistence: External PostgreSQL and S3 (not embedded)
  • Authorization: Enabled
  • OIDC: Issuer configured
  • TLS: Certificates configured
  • Ingress/LoadBalancer: External access enabled

First Login

With OIDC configured:

On first login, you are redirected to your identity provider. Before logging in, ensure your OIDC callback URI is registered with your provider:

https://<your-sam-domain>/callback

Without OIDC:

On first login, you are prompted to configure your LLM API key via the Model Configuration UI.

Configure Authentication

See Single Sign-On for detailed OAuth/OIDC provider setup.

Configure Authorization

See RBAC Setup Guide for detailed access control configuration.

Step 7: Production Validation

Perform comprehensive validation before going live.

Health Checks

Agent Mesh provides HTTP health check endpoints that integrate with Kubernetes probes for automated lifecycle management. Configure startup, readiness, and liveness probes in your deployment manifests to enable graceful deployments and automatic recovery from failures.

Verify health endpoints (replace with your actual domain):

curl -s https://sam.example.com/health
curl -s https://sam.example.com/api/v1/platform/health

For detailed probe configuration options and examples, see Health Checks.
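
As an illustration, probes wired to the endpoint above might look like the following sketch. Treat it as a starting point only: the /health path comes from the check above, but the container port (8000 here, matching the default frontend URL) and all timing values are assumptions; confirm the supported settings against the chart's values.yaml and the Health Checks page.

```yaml
# Hypothetical probe configuration; verify port, path, and timings for your deployment.
startupProbe:
  httpGet:
    path: /health
    port: 8000
  failureThreshold: 30   # tolerate up to ~5 minutes of startup
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 8000
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  periodSeconds: 20
  failureThreshold: 3
```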

Upgrading from Quick Start

If you started with the Quick Start installation (chart defaults), upgrade to production using helm upgrade.

helm upgrade sam /path/to/charts/solace-agent-mesh-<version>.tgz \
  --namespace sam \
  -f production-overrides.yaml

Progressive Upgrade Strategy:

You can enable external components one at a time to reduce risk:

  1. First upgrade: External datastores only

     global:
       persistence:
         enabled: false
     dataStores:
       database: { ... }
       s3: { ... }

  2. Second upgrade: Add external broker

     global:
       broker:
         embedded: false
     broker: { ... }

  3. Third upgrade: Enable auth and TLS

     sam:
       authorization:
         enabled: true
     ingress:
       enabled: true
       tls: [ ... ]

Data Migration Required

When migrating from embedded PostgreSQL to external, you must export and import your data. The embedded database is not automatically migrated.

Reference

Complete reference for all configuration options in the Agent Mesh Helm chart.

Global Configuration

| Key | Type | Default | Description |
|---|---|---|---|
| global.broker.embedded | bool | true | Deploy embedded single-node Solace event broker alongside Agent Mesh. For production, set to false and configure an external broker. |
| global.persistence.enabled | bool | true | Deploy bundled persistence with in-cluster PostgreSQL and SeaweedFS. For production, set to false and configure external datastores. |
| global.persistence.namespaceId | string | "solace-agent-mesh" | Unique identifier for Agent Mesh database/user scoping. Must be unique per Agent Mesh installation to avoid topic collisions. |
| global.imageRegistry | string | "gcr.io/gcp-maas-prod" | Container registry for all images. For air-gapped environments, set to your internal registry. |
| global.imagePullSecrets | list | [] | Image pull secrets applied to ALL pods (core, agent-deployer, postgresql, seaweedfs, broker). Required when using a private registry. |
| global.imagePullKey | string | "" | Docker config JSON for private registry authentication. Mutually exclusive with imagePullSecrets. Use with --set-file global.imagePullKey=credentials.json |
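
For example, pointing all images at a private mirror combines the keys above. The registry hostname and Secret name are placeholders, and the list-item shape follows the usual Kubernetes imagePullSecrets convention; confirm the expected structure against values.yaml:

```yaml
global:
  imageRegistry: "registry.example.com/solace-mirror"  # your internal mirror (placeholder)
  imagePullSecrets:
    - name: my-registry-pull-secret  # existing kubernetes.io/dockerconfigjson Secret
```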

Validations

| Key | Type | Default | Description |
|---|---|---|---|
| validations.clusterResourceChecks | bool | true | Template-time cluster resource existence checks. Looks up referenced Secrets, ConfigMaps, StorageClass, IngressClass before install. Set to false if the service account lacks get RBAC on cluster resources. |

Agent Mesh Core Configuration

| Key | Type | Default | Description |
|---|---|---|---|
| sam.communityMode | bool | false | Disable Agent Mesh enterprise features (internal use only) |
| sam.frontendServerUrl | string | "http://localhost:8000" | Frontend URL for accessing Agent Mesh. For port-forward, use http://localhost:8000. For production with Ingress, set to "" (enables auto-detection). For LoadBalancer, set to the external URL. |
| sam.platformServiceUrl | string | "http://localhost:8080" | Platform service URL. For port-forward, use http://localhost:8080. For production with Ingress, set to "" (enables auto-detection). |
| sam.cors.allowedOriginRegex | string | "https?://(localhost\|127\.0\.0\.1)(:\d+)?" | CORS regex pattern for allowed origins. Default allows any localhost:port. For production, set to "". |
| sam.authorization.enabled | bool | false | Enforce RBAC authorization via OIDC. Default: false (all users have admin access). For production, set to true and configure oauthProvider/authenticationRbac. |
| sam.dnsName | string | "" | External DNS name for Agent Mesh. Not required for port-forward or Ingress. For LoadBalancer/NodePort, set to your external DNS name. |
| sam.sessionSecretKey | string | "" | Secure session key. Auto-generates on first install if empty; stable across upgrades. For production, explicitly set for reproducibility. |
| sam.oauthProvider.oidc.issuer | string | "" | OIDC issuer URL for authentication |
| sam.oauthProvider.oidc.clientId | string | "" | OIDC client ID |
| sam.oauthProvider.oidc.clientSecret | string | "" | OIDC client secret |
| sam.authenticationRbac.customRoles | object | {} | Custom role definitions with fine-grained scopes |
| sam.authenticationRbac.users | list | See default users in values.yaml | Static user role assignments |
| sam.authenticationRbac.idpClaims.enabled | bool | false | Enable dynamic role assignment from IDP claims |
| sam.authenticationRbac.idpClaims.oidcProvider | string | "oidc" | OIDC provider name for IDP claims |
| sam.authenticationRbac.idpClaims.claimKey | string | "groups" | Claim key containing group/role information |
| sam.authenticationRbac.idpClaims.mappings | object | {} | Map IDP claim values to Agent Mesh roles |
| sam.authenticationRbac.defaultRoles | list | ["sam_user"] | Default roles assigned when no explicit role match is found |
| sam.taskLogging.enabled | bool | true | Enable Agent Mesh logging during task execution |
| sam.taskLogging.logStatusUpdates | bool | true | Log status updates during tasks |
| sam.taskLogging.logArtifactEvents | bool | false | Log artifact events |
| sam.taskLogging.logFileParts | bool | true | Log file parts |
| sam.taskLogging.maxFilePartSizeBytes | int | 10240 | Maximum file part size for logging |
| sam.taskLogging.hybridBuffer.enabled | bool | true | Enable hybrid buffer for logging |
| sam.taskLogging.hybridBuffer.flushThreshold | int | 10 | Flush threshold for hybrid buffer |
| sam.featureEnablement.awsBedrockEnabled | bool | true | Enable AWS Bedrock integration |
| sam.featureEnablement.backgroundTasks | bool | true | Enable background tasks feature |
| sam.featureEnablement.binaryArtifactPreview | bool | true | Enable binary artifact preview |
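
Tying the RBAC keys together, an overrides fragment that maps IdP groups to roles might look like the sketch below. The group names, the sam_admin role, and the shape of the mapping values are assumptions for illustration; confirm the exact structure against the chart's values.yaml and the RBAC Setup Guide:

```yaml
sam:
  authorization:
    enabled: true
  authenticationRbac:
    idpClaims:
      enabled: true
      oidcProvider: "oidc"
      claimKey: "groups"
      mappings:
        platform-admins: ["sam_admin"]  # hypothetical IdP group -> role
        platform-users: ["sam_user"]
    defaultRoles: ["sam_user"]
```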

Broker Configuration (External)

Configure an external Solace Event Broker. For production, set global.broker.embedded: false and configure these values.

| Key | Type | Default | Description |
|---|---|---|---|
| broker.url | string | "" | Solace broker connection URL (for example, tcps://broker.messaging.solace.cloud:55443 or wss://...:443) |
| broker.clientUsername | string | "" | Broker username for authentication |
| broker.password | string | "" | Broker password |
| broker.vpn | string | "" | Broker VPN name |

LLM Service Configuration

Configure the LLM service here or via the Agent Mesh UI after installation. All fields are optional.

| Key | Type | Default | Description |
|---|---|---|---|
| llmService.llmServiceEndpoint | string | N/A (optional) | LLM API endpoint (for example, https://api.openai.com/v1) |
| llmService.llmServiceApiKey | string | N/A (optional) | API key for the LLM service |
| llmService.planningModel | string | N/A (optional) | Model name for planning tasks (for example, gpt-4o) |
| llmService.generalModel | string | N/A (optional) | Model name for general tasks (for example, gpt-4o) |
| llmService.reportModel | string | N/A (optional) | Model name for reports (optional) |
| llmService.imageModel | string | N/A (optional) | Model name for image generation (for example, dall-e-3, optional) |
| llmService.transcriptionModel | string | N/A (optional) | Model name for audio transcription (for example, whisper-1, optional) |
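A sketch of an OpenAI-compatible LLM configuration, using the example endpoint and model names from the table; the API key is a placeholder (in production, prefer loading it via extraSecretEnvironmentVars rather than plain values):

```yaml
# Sketch: OpenAI-compatible LLM service. Endpoint and model names
# follow the table's examples; the API key is a placeholder.
llmService:
  llmServiceEndpoint: "https://api.openai.com/v1"
  llmServiceApiKey: "<api-key>"
  planningModel: "gpt-4o"
  generalModel: "gpt-4o"
  imageModel: "dall-e-3"
  transcriptionModel: "whisper-1"
```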

Environment Variables

| Key | Type | Default | Description |
|---|---|---|---|
| extraSecretEnvironmentVars | list | [] | Load credentials from existing Kubernetes Secrets. List of objects with envName, secretName, secretKey fields. |
| environmentVariables | object | Feature flags enabled | Inject custom environment variables into Agent Mesh core containers. Use for feature flags and custom configuration. |
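For example, to load a credential from a pre-created Secret instead of placing it in values, a fragment might look like this. The Secret name, key, and environment variable name are all hypothetical; substitute whatever your deployment expects:

```yaml
# Sketch: inject a credential from an existing Kubernetes Secret.
# "llm-credentials", "api-key", and the env var name are placeholders.
extraSecretEnvironmentVars:
  - envName: LLM_SERVICE_API_KEY
    secretName: llm-credentials
    secretKey: api-key
```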

Network Configuration - Service

| Key | Type | Default | Description |
|---|---|---|---|
| service.type | string | "ClusterIP" | Kubernetes service type. For production, use ClusterIP with Ingress, or LoadBalancer/NodePort for direct access. |
| service.annotations | object | {} | Service annotations for cloud-specific load balancer configuration |
| service.nodePorts.http | string | "" | NodePort for HTTP WebUI (30000-32767). Only used if service.type: NodePort |
| service.nodePorts.https | string | "" | NodePort for HTTPS WebUI (30000-32767) |
| service.nodePorts.auth | string | "" | NodePort for Auth Service (30000-32767) |
| service.nodePorts.platformHttp | string | "" | NodePort for Platform API HTTP (30000-32767) |
| service.nodePorts.platformHttps | string | "" | NodePort for Platform API HTTPS (30000-32767) |
| service.tls.enabled | bool | false | Enable TLS/SSL for LoadBalancer/NodePort (pod-level TLS termination). Not needed for Ingress. |
| service.tls.existingSecret | string | "" | Reference an existing kubernetes.io/tls secret for TLS |
| service.tls.cert | string | "" | TLS certificate (inline). Use --set-file service.tls.cert=/path/to/tls.crt |
| service.tls.key | string | "" | TLS key (inline). Use --set-file service.tls.key=/path/to/tls.key |
| service.tls.passphrase | string | "" | TLS key passphrase |
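As a sketch, direct LoadBalancer exposure with pod-level TLS termination from an existing kubernetes.io/tls Secret could be configured as follows (the Secret name is a placeholder; skip this entirely when terminating TLS at an Ingress):

```yaml
# Sketch: LoadBalancer with pod-level TLS termination.
# "sam-tls" is a placeholder for an existing kubernetes.io/tls Secret.
service:
  type: LoadBalancer
  tls:
    enabled: true
    existingSecret: "sam-tls"
```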

Network Configuration - Ingress

| Key | Type | Default | Description |
|---|---|---|---|
| ingress.enabled | bool | false | Enable Ingress for HTTP/HTTPS routing. For production, set to true. |
| ingress.className | string | "" | Ingress controller class name (for example, nginx, alb, traefik, gce) |
| ingress.annotations | object | {} | Ingress annotations (vary by controller). Examples in values.yaml for NGINX, ALB. |
| ingress.autoConfigurePaths | bool | true | Automatically configure all required ingress paths (platform API, auth, webui). Recommended. |
| ingress.host | string | "" | Hostname for ingress. Leave empty for ALB (accepts all hostnames). Required for NGINX/other name-based controllers. |
| ingress.hosts | list | [] | Manual hosts/paths configuration. Only used when autoConfigurePaths: false. |
| ingress.tls | list | [] | TLS configuration for ingress. Entries trigger HTTPS URL generation (required for OIDC redirects). |
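A sketch of a production-style NGINX ingress with TLS; the hostname and Secret name are placeholders. Note that the tls entry is what triggers HTTPS URL generation for OIDC redirects:

```yaml
# Sketch: NGINX ingress with TLS. Hostname and Secret name are
# placeholders; controller-specific annotations are shown in values.yaml.
ingress:
  enabled: true
  className: "nginx"
  autoConfigurePaths: true
  host: "sam.example.com"
  tls:
    - secretName: sam-tls
      hosts:
        - sam.example.com
```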

Persistence - External Datastores

Configure external PostgreSQL database and object storage. For production, set global.persistence.enabled: false and configure these values.

| Key | Type | Default | Description |
|---|---|---|---|
| dataStores.database.protocol | string | "postgresql+psycopg2" | Database protocol |
| dataStores.database.host | string | "" | PostgreSQL hostname (for example, mydb.us-east-1.rds.amazonaws.com) |
| dataStores.database.port | string | "5432" | PostgreSQL port |
| dataStores.database.adminUsername | string | "" | PostgreSQL admin user (used to create Agent Mesh application users) |
| dataStores.database.adminPassword | string | "" | PostgreSQL admin password |
| dataStores.database.applicationPassword | string | "" | Shared password for all Agent Mesh database users (webui, orchestrator, platform, agents). Required for external persistence. |
| dataStores.database.supabaseTenantId | string | "" | Supabase project ID. Required when using the Supabase connection pooler. |
| dataStores.objectStorage.type | string | "s3" | Object storage type: s3, azure, or gcs |
| dataStores.objectStorage.workloadIdentity.enabled | bool | false | Enable cloud-native auth (AWS IRSA, Azure WI, GCP WI) instead of access keys |
| dataStores.s3.endpointUrl | string | "" | S3 endpoint URL. Leave empty for AWS S3. Set for MinIO or other S3-compatible stores. |
| dataStores.s3.bucketName | string | "" | S3 bucket for artifact storage |
| dataStores.s3.connectorSpecBucketName | string | "" | S3 bucket for connector specs (can be the same as bucketName) |
| dataStores.s3.accessKey | string | "" | S3 access key ID. Omit when using workload identity. |
| dataStores.s3.secretKey | string | "" | S3 secret access key. Omit when using workload identity. |
| dataStores.s3.region | string | "us-east-1" | AWS S3 region |
| dataStores.azure.accountName | string | "" | Azure storage account name |
| dataStores.azure.accountKey | string | "" | Azure storage account key. Omit when using workload identity. |
| dataStores.azure.connectionString | string | "" | Azure storage connection string. Alternative to accountName/accountKey. |
| dataStores.azure.containerName | string | "" | Azure Blob container for artifacts |
| dataStores.azure.connectorSpecContainerName | string | "" | Azure Blob container for connector specs |
| dataStores.gcs.project | string | "" | GCP project ID |
| dataStores.gcs.credentialsJson | string | "" | GCS service account JSON credentials. Omit when using workload identity. |
| dataStores.gcs.bucketName | string | "" | GCS bucket for artifacts |
| dataStores.gcs.connectorSpecBucketName | string | "" | GCS bucket for connector specs |
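Putting the pieces together, a sketch of external persistence against RDS PostgreSQL and AWS S3 with static keys might look like the following. Hostnames, bucket names, and passwords are placeholders, and in production the credentials would typically come from Secrets or workload identity rather than inline values:

```yaml
# Sketch: external RDS PostgreSQL + AWS S3 with static access keys.
# All hostnames, bucket names, and credentials are placeholders.
global:
  persistence:
    enabled: false
dataStores:
  database:
    host: "mydb.us-east-1.rds.amazonaws.com"
    port: "5432"
    adminUsername: "postgres"
    adminPassword: "<admin-password>"
    applicationPassword: "<app-password>"   # shared by all Agent Mesh DB users
  objectStorage:
    type: s3
  s3:
    bucketName: "sam-artifacts"
    connectorSpecBucketName: "sam-artifacts"   # may reuse bucketName
    region: "us-east-1"
    accessKey: "<access-key-id>"
    secretKey: "<secret-access-key>"
```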

Agent Mesh Deployment

| Key | Type | Default | Description |
|---|---|---|---|
| samDeployment.serviceAccount.name | string | "" | Service account name. Auto-generates {release}-solace-agent-mesh-core-sa when empty. Set explicitly for workload identity. |
| samDeployment.serviceAccount.annotations | object | {} | Service account annotations for workload identity (AWS IRSA, Azure WI, GCP WI) |
| samDeployment.imagePullSecret | string | "" | Image pull secret attached to Agent Mesh service accounts. Using global.imagePullSecrets is preferred. |
| samDeployment.image.registry | string | "" | Overrides global.imageRegistry for the Agent Mesh image only |
| samDeployment.image.repository | string | "solace-agent-mesh-enterprise" | Agent Mesh application image repository |
| samDeployment.image.tag | string | "1.143.0" | Agent Mesh application image tag |
| samDeployment.image.digest | string | "" | Agent Mesh image digest. Takes precedence over tag when set. |
| samDeployment.image.pullPolicy | string | "IfNotPresent" | Image pull policy |
| samDeployment.agentDeployer.image.registry | string | "" | Overrides global.imageRegistry for the agent deployer image only |
| samDeployment.agentDeployer.image.repository | string | "sam-agent-deployer" | Agent deployer image repository |
| samDeployment.agentDeployer.image.tag | string | "1.8.2" | Agent deployer image tag |
| samDeployment.agentDeployer.image.digest | string | "" | Agent deployer image digest |
| samDeployment.agentDeployer.image.pullPolicy | string | "IfNotPresent" | Agent deployer pull policy |
| samDeployment.agentDeployer.version | string | "k8s-1.500.0" | Agent deployer version identifier |
| samDeployment.agentDeployer.chartVersion | string | "1.500.0" | Agent chart version |
| samDeployment.agentDeployer.kubeApiHost | string | "kubernetes.default.svc" | Kubernetes API host the agent-deployer's helm client dials. Applied only when HTTP_PROXY/HTTPS_PROXY is set, so the chart's .svc NO_PROXY rule bypasses the proxy. Set to "" to use the kubelet-injected cluster IP (required if your API server cert lacks a kubernetes.default.svc SAN). |
| samDeployment.dbInit.image.registry | string | "" | Database init container image registry override |
| samDeployment.dbInit.image.repository | string | "postgres" | Database init container image |
| samDeployment.dbInit.image.tag | string | "18.0-trixie" | Database init container tag |
| samDeployment.dbInit.image.digest | string | "" | Database init image digest |
| samDeployment.dbInit.image.pullPolicy | string | "IfNotPresent" | Database init pull policy |
| samDeployment.customCA.enabled | bool | false | Enable custom CA certificate injection via ConfigMap |
| samDeployment.customCA.configMapName | string | "truststore" | ConfigMap name containing custom CA certificates (.crt files) |
| samDeployment.rollout.strategy | string | "RollingUpdate" | Deployment rollout strategy |
| samDeployment.podSecurityContext.runAsUser | int | 10001 | Pod security context user ID |
| samDeployment.podSecurityContext.fsGroup | int | 10002 | Pod security context filesystem group ID |
| samDeployment.securityContext.allowPrivilegeEscalation | bool | false | Allow privilege escalation |
| samDeployment.securityContext.runAsUser | int | 999 | Container runs as user ID |
| samDeployment.securityContext.runAsGroup | int | 999 | Container runs as group ID |
| samDeployment.securityContext.runAsNonRoot | bool | true | Enforce non-root container |
| samDeployment.nodeSelector | object | {} | Node selector for pod placement |
| samDeployment.tolerations | list | [] | Tolerations for pod scheduling |
| samDeployment.annotations | object | {} | Deployment annotations |
| samDeployment.podAnnotations | object | {} | Pod annotations |
| samDeployment.podLabels | object | {} | Pod labels |
| samDeployment.resources.sam.requests.cpu | string | "1000m" | CPU request for Agent Mesh core container |
| samDeployment.resources.sam.requests.memory | string | "1024Mi" | Memory request for Agent Mesh core container |
| samDeployment.resources.sam.limits.cpu | string | "2000m" | CPU limit for Agent Mesh core container |
| samDeployment.resources.sam.limits.memory | string | "2048Mi" | Memory limit for Agent Mesh core container |
| samDeployment.resources.agentDeployer.requests.cpu | string | "100m" | CPU request for agent deployer container |
| samDeployment.resources.agentDeployer.requests.memory | string | "256Mi" | Memory request for agent deployer container |
| samDeployment.resources.agentDeployer.limits.cpu | string | "200m" | CPU limit for agent deployer container |
| samDeployment.resources.agentDeployer.limits.memory | string | "512Mi" | Memory limit for agent deployer container |
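Tying the service account settings to object storage, a sketch of AWS IRSA-based workload identity (so no static S3 keys are needed) might look like this. The service account name and role ARN are placeholders; the annotation key is the standard EKS IRSA annotation:

```yaml
# Sketch: AWS IRSA workload identity for S3 access.
# "sam-core-sa" and the role ARN are placeholders.
dataStores:
  objectStorage:
    workloadIdentity:
      enabled: true
samDeployment:
  serviceAccount:
    name: "sam-core-sa"
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/sam-s3-access"
```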

Agent Mesh Pre-flight Validation (sam-doctor)

Agent Mesh runs a set of pre-flight validation checks (collectively called sam-doctor) before any workload pods are created. These checks catch environment misconfigurations up front. By default, sam-doctor is enabled (samDoctor.enabled: true), and a failing check blocks helm install and helm upgrade with an error like:

```
Error: INSTALLATION FAILED: failed pre-install: 1 error occurred:
	* job <release>-<chart-name>-sam-doctor failed: ...
```

To see the diagnostic report, view the job's logs (the job name appears in the helm error above):

```shell
kubectl logs job/<release>-<chart-name>-sam-doctor -n <namespace>
```

For example, for the solace-agent-mesh chart with helm install sam --namespace sam:

```shell
kubectl logs job/sam-solace-agent-mesh-sam-doctor -n sam
```

The report lists each check with a PASS, WARN, FAIL, or SKIP status and a reason for any failure.

Bypassing the check

To demote failures to warnings and always proceed, set the following in your Helm values:

```yaml
samDoctor:
  failOnError: false
```

To skip the hook entirely, set the following in your Helm values:

```yaml
samDoctor:
  enabled: false
```

| Key | Type | Default | Description |
|---|---|---|---|
| samDoctor.enabled | bool | true | Enable sam-doctor pre-flight validation |
| samDoctor.failOnError | bool | true | Block install/upgrade on validation failure |
| samDoctor.timeoutSeconds | int | 180 | Hook job timeout in seconds |
| samDoctor.tlsDnsName | string | "" | DNS name for TLS certificate validation. Defaults to sam.dnsName if not set. |

Bundled Components - Persistence Layer

Configure embedded PostgreSQL and SeaweedFS. Only used when global.persistence.enabled: true. Not recommended for production.

| Key | Type | Default | Description |
|---|---|---|---|
| persistence-layer.postgresql.serviceAccountName | string | "" | Service account for the PostgreSQL pod |
| persistence-layer.postgresql.commonLabels | object | {"app.kubernetes.io/service": "database"} | Common labels applied to PostgreSQL resources |
| persistence-layer.postgresql.imagePullSecrets | list | [] | Image pull secrets for the PostgreSQL image (merged with global.imagePullSecrets) |
| persistence-layer.postgresql.image.registry | string | "" | Overrides global.imageRegistry for the PostgreSQL image |
| persistence-layer.postgresql.image.repository | string | "postgres" | PostgreSQL image repository |
| persistence-layer.postgresql.image.tag | string | "18.0-trixie" | PostgreSQL image tag |
| persistence-layer.postgresql.image.digest | string | "" | PostgreSQL image digest |
| persistence-layer.seaweedfs.serviceAccountName | string | "" | Service account for the SeaweedFS pod |
| persistence-layer.seaweedfs.commonLabels | object | {"app.kubernetes.io/service": "s3"} | Common labels applied to SeaweedFS resources |
| persistence-layer.seaweedfs.imagePullSecrets | list | [] | Image pull secrets for the SeaweedFS image (merged with global.imagePullSecrets) |
| persistence-layer.seaweedfs.image.registry | string | "" | Overrides global.imageRegistry for the SeaweedFS image |
| persistence-layer.seaweedfs.image.repository | string | "chrislusf/seaweedfs" | SeaweedFS image repository |
| persistence-layer.seaweedfs.image.tag | string | "3.97-compliant" | SeaweedFS image tag |
| persistence-layer.seaweedfs.image.digest | string | "" | SeaweedFS image digest |

Bundled Components - Embedded Broker

Configure embedded Solace PubSub+ broker. Only used when global.broker.embedded: true. Not recommended for production.

| Key | Type | Default | Description |
|---|---|---|---|
| embeddedBroker.imagePullSecrets | list | [] | Image pull secrets for the broker image (merged with global.imagePullSecrets) |
| embeddedBroker.image.registry | string | "" | Overrides global.imageRegistry for the broker image |
| embeddedBroker.image.repository | string | "solace-pubsub-enterprise" | Solace broker image repository |
| embeddedBroker.image.tag | string | "10.25.0.193-multi-arch" | Solace broker image tag |
| embeddedBroker.image.digest | string | "" | Solace broker image digest |

For inline documentation and examples, see the chart's values.yaml file.