Model Sources¶
AIMClusterModelSource automatically discovers and syncs AI model images from container registries, creating AIMClusterModel resources for matched images.
Overview¶
Model sources eliminate the need to manually create model resources for every image version. They continuously sync with container registries, automatically creating models when new images are published.
Key features:
- Automatic discovery: Continuously monitors registries for images matching your filters
- Flexible filtering: Use wildcards, version constraints, and exclusions
- Multi-registry support: Works with Docker Hub, GitHub Container Registry (ghcr.io), and more
- Periodic sync: Configurable sync intervals to keep models up to date
- Private registries: Supports authentication via imagePullSecrets
Basic Example¶
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-models
spec:
  filters:
    - image: amdenterpriseai/aim-*
  syncInterval: 1h
This source discovers all images matching amdenterpriseai/aim-* from Docker Hub and creates an AIMClusterModel for each.
Configuration¶
Registry¶
The registry field specifies which container registry to query. Defaults to docker.io if not specified.
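As a minimal sketch, a source that queries GitHub Container Registry instead of the default might set the field as follows. The filter pattern is reused from the basic example and is only illustrative.

```yaml
spec:
  registry: ghcr.io  # omit this field to fall back to docker.io
  filters:
    - image: amdenterpriseai/aim-*
```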
Filters¶
Filters define which images to discover. Each filter specifies a pattern with optional version constraints and exclusions. Multiple filters are combined with OR logic.
Repository Patterns¶
Match repositories using wildcards:
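A sketch of wildcard patterns, reusing repository names that appear elsewhere on this page. The comments describe the intended match and assume the wildcard matches any suffix.

```yaml
spec:
  filters:
    - image: amdenterpriseai/aim-*       # every repository whose name starts with aim-
    - image: amdenterpriseai/aim-qwen-*  # only repositories starting with aim-qwen-
```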
Repository with Specific Tag¶
Match a specific tag:
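For illustration, a filter pinned to a single tag could look like the following. The image name and tag are taken from the "Specific Versions Only" example on this page.

```yaml
spec:
  filters:
    - image: amdenterpriseai/aim-qwen-qwen3-32b:0.8.5
```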
Full URI¶
Override the registry for specific filters:
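A sketch of a per-filter override, assuming a registry hostname prefix can be combined with an explicit tag. This page only shows the prefix combined with a wildcard, so treat the tagged form as an assumption to verify.

```yaml
spec:
  registry: docker.io  # default for filters without a hostname
  filters:
    # Assumption: hostname prefix plus explicit tag in one filter
    - image: ghcr.io/amdenterpriseai/aim-qwen-qwen3-32b:0.8.5
```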
Full URI with Wildcard¶
Override registry and use wildcards:
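As a sketch, a wildcard filter that carries its own registry hostname might be written like this. The pattern mirrors the one used in the "Multiple Registries" example.

```yaml
spec:
  registry: docker.io
  filters:
    - image: ghcr.io/amdenterpriseai/aim-*  # queried on ghcr.io, not docker.io
```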
Version Constraints¶
Use semantic version constraints to filter tags. Supports both global and per-filter version constraints.
Global Version Constraints¶
Apply to all filters:
spec:
  registry: ghcr.io
  filters:
    - image: amdenterpriseai/aim-qwen-*
    - image: amdenterpriseai/aim-deepseek-*
  versions:
    - ">=0.8.0"
    - "<1.0.0"
Per-Filter Version Constraints¶
Override global constraints for specific filters:
spec:
  registry: ghcr.io
  versions:
    - ">=0.8.0"  # global default
  filters:
    - image: amdenterpriseai/aim-qwen-*
      versions:
        - ">=0.8.5"  # overrides global for this filter
    - image: amdenterpriseai/aim-deepseek-*
      # uses global constraint
Version Syntax¶
Constraints use standard semver syntax:
- >=1.0.0 - Version 1.0.0 or higher
- <2.0.0 - Below version 2.0.0
- ~1.2.0 - Patch updates only (1.2.x)
- ^1.0.0 - Minor updates allowed (1.x.x)
Prerelease versions (e.g., 0.8.1-rc1) are supported:
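A hedged sketch of a constraint that names a prerelease as its lower bound. The exact matching rules for prerelease tags depend on the semver implementation the operator uses, which this page does not specify, so the comment states an assumption rather than guaranteed behavior.

```yaml
spec:
  filters:
    - image: amdenterpriseai/aim-qwen-*
      versions:
        # Assumption: tags such as 0.8.1-rc1 and 0.8.1 satisfy this constraint
        - ">=0.8.1-rc1"
```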
Non-semver tags (e.g., latest, dev) are silently skipped when version constraints are specified.
Exclusions¶
Exclude specific repositories from matching:
spec:
  filters:
    - image: amdenterpriseai/aim-*
      exclude:
        - amdenterpriseai/aim-base
        - amdenterpriseai/aim-experimental
Exclusions match repository names exactly (not including the registry).
Sync Interval¶
Control how often the source syncs with the registry:
Default is 1h. Minimum recommended interval is 15m to avoid rate limiting.
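For example, a source that syncs twice as often as the default might be sketched as below. The 30m value is illustrative and follows the duration format used in the other examples on this page.

```yaml
spec:
  filters:
    - image: amdenterpriseai/aim-*
  syncInterval: 30m  # stays above the recommended 15m minimum
```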
Private Registries¶
Authenticate to private registries using imagePullSecrets:
apiVersion: v1
kind: Secret
metadata:
  name: ghcr-secret
  namespace: aim-system  # operator namespace
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: BASE64_CONFIG
---
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: private-models
spec:
  registry: ghcr.io
  imagePullSecrets:
    - name: ghcr-secret
  filters:
    - image: myorg/private-model-*
Secrets must exist in the operator namespace (typically aim-system).
GitHub Container Registry (GHCR) Authentication¶
For GitHub Container Registry, use a GitHub Personal Access Token (PAT) with the minimal required scope:
Required Scope:
- read:packages - Read access to container packages
Recommended: Use Fine-Grained Personal Access Tokens
- Create a fine-grained PAT at: https://github.com/settings/tokens
- Set repository access or organization permissions
- Grant only read:packages permission
- Set expiration date
- Create the secret:
kubectl create secret docker-registry ghcr-secret \
  --docker-server=ghcr.io \
  --docker-username=YOUR_GITHUB_USERNAME \
  --docker-password=YOUR_GITHUB_PAT \
  --namespace=aim-system
Security Best Practices:
- Use fine-grained PATs instead of classic PATs when possible
- Grant minimal permissions (read:packages only)
- Set expiration dates on tokens
- Rotate tokens regularly
- Use separate tokens for different environments (dev/staging/prod)
- Enable encryption at rest for Kubernetes Secrets in production
- Limit Secret access via RBAC to only the operator namespace
Token Scopes to Avoid:
- ❌ repo - Grants read/write access to repositories (too broad)
- ❌ write:packages - Write access not needed for discovery
- ❌ admin:org - Organization admin access (unnecessary)
- ❌ delete:packages - Delete permission (unnecessary risk)
Max Models Limit¶
Control the maximum number of models created to prevent runaway resource creation:
When using the Helm chart's optional clusterModelSource, the chart default is maxModels: 500 unless overridden.
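A minimal sketch of capping model creation. The field name maxModels comes from the Helm chart note above; its placement directly under spec is an assumption, and the value 100 is chosen only for illustration.

```yaml
spec:
  filters:
    - image: amdenterpriseai/aim-*
  maxModels: 100  # assumption: set at the spec level of the source
```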
When the limit is reached:
- No new models are created, even if more matching images exist
- Existing models are never deleted
- Status shows modelsLimitReached: true
- availableModels shows total images found vs discoveredModels created
Use Cases:
- Prevent accidental model explosion from overly broad filters
- Enforce resource quotas in multi-tenant environments
- Limit cluster resource consumption during initial sync
Example Status:
status:
  status: Ready
  discoveredModels: 100
  availableModels: 250
  modelsLimitReached: true
  conditions:
    - type: MaxModelsLimitReached
      status: "True"
      message: "Model creation limit reached (100 models created). 150 available images not created as models."
Status¶
The status field tracks sync progress and discovered models:
Status Values¶
- Pending: Waiting for initial sync
- Progressing: Sync in progress
- Ready: All filters succeeded
- Degraded: Some filters failed, but others succeeded
- Failed: All filters failed
Detailed Status¶
Key status fields:
- status: Overall state (Ready, Degraded, Failed, etc.)
- discoveredModels: Count of AIMClusterModel resources created
- availableModels: Total count of images matching filters in registry
- modelsLimitReached: Boolean indicating if maxModels limit was reached
- lastSyncTime: Timestamp of last successful sync
- conditions: Detailed conditions including Ready, Degraded, and MaxModelsLimitReached
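Put together, the status of a healthy source might look roughly like the sketch below. All values are illustrative, the timestamp format is an assumption, and real conditions may carry additional fields not shown here.

```yaml
status:
  status: Ready
  discoveredModels: 12
  availableModels: 12
  modelsLimitReached: false
  lastSyncTime: "2025-01-01T12:00:00Z"  # illustrative value and format
  conditions:
    - type: Ready
      status: "True"
```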
Examples¶
Docker Hub with Wildcards¶
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: dockerhub-models
spec:
  registry: docker.io
  filters:
    - image: amdenterpriseai/aim-*
      exclude:
        - amdenterpriseai/aim-base
  syncInterval: 2h
GitHub Container Registry with Version Constraints¶
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: ghcr-stable-models
spec:
  registry: ghcr.io
  filters:
    - image: amdenterpriseai/aim-qwen-*
    - image: amdenterpriseai/aim-deepseek-*
  versions:
    - ">=0.8.0"
    - "<1.0.0"
  syncInterval: 1h
Multiple Registries¶
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: multi-registry-models
spec:
  registry: docker.io  # default
  filters:
    - image: amdenterpriseai/aim-*  # uses docker.io
    - image: ghcr.io/amdenterpriseai/aim-*  # overrides to ghcr.io
  syncInterval: 1h
Private Registry with Authentication¶
apiVersion: v1
kind: Secret
metadata:
  name: private-registry-creds
  namespace: aim-system
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: BASE64_ENCODED_CONFIG
---
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: private-models
spec:
  registry: private.registry.io
  imagePullSecrets:
    - name: private-registry-creds
  filters:
    - image: myorg/model-*
  versions:
    - ">=1.0.0"
  syncInterval: 1h
Specific Versions Only¶
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: specific-versions
spec:
  registry: ghcr.io
  filters:
    - image: amdenterpriseai/aim-qwen-qwen3-32b:0.8.5
    - image: amdenterpriseai/aim-qwen-qwen3-32b:0.8.4
    - image: amdenterpriseai/aim-deepseek-deepseek-r1:0.8.5
  syncInterval: 6h
Lifecycle¶
Created Models¶
Model sources create AIMClusterModel resources with auto-generated names based on the image URI. These models are owned by the source via an owner reference.
Created models have discovery enabled by default and will automatically create service templates if the image includes recommended deployment metadata.
Append-Only¶
Model sources follow an append-only lifecycle during normal operation. Once created, models are never deleted by the source, even if:
- The image is removed from the registry
- The filter is changed or removed
This ensures running services aren't disrupted when registry contents change.
Ownership and Deletion¶
Created models have an owner reference to the source. When you delete the source, Kubernetes will automatically delete all models that were created by it.
This cascading deletion happens via Kubernetes garbage collection. To prevent accidentally disrupting running services, consider the impact before deleting a model source.
If you need to stop tracking specific models:
- Update the source filters to exclude those models
- Delete the unwanted models manually, for example with kubectl delete aimclustermodel <model-name>
Note: You cannot selectively clean up models while keeping the source unchanged - any models matching the active filters will be recreated on the next sync.
Troubleshooting¶
No Models Discovered¶
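Check the source status, for example with kubectl get aimclustermodelsource <source-name> -o yaml, and inspect the status block.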
Common causes:
- No images match the filters
- Registry is unreachable
- Authentication failed (check imagePullSecrets)
- Version constraints too restrictive
Degraded Status¶
Some filters failed while others succeeded. Check the status conditions, for example with kubectl describe aimclustermodelsource <source-name>.
Look for error messages indicating which filters failed and why.
Failed Status¶
All filters failed. Common causes:
- Invalid registry hostname
- Missing or invalid imagePullSecrets
- Network connectivity issues
- Registry catalog API not supported (for wildcard filters)
Wildcard Filters Not Working¶
Wildcard filters require registry catalog API support. GitHub Container Registry (ghcr.io) wildcard discovery is supported via GHCR's REST API.
Related Documentation¶
- Models - Understanding AIMClusterModel and AIMModel resources
- Templates - Auto-generated service templates
- Runtime Config - Authentication and discovery configuration