Julian Haseleu 4805faf9db
chore(): added docs
2025-10-07 09:38:38 +00:00

Canada Kaktus Documentation

Overview

Canada Kaktus is a Kubernetes controller that keeps a Cilium LoadBalancer IP pool in sync with Hetzner Cloud server instances. Every 15 minutes it queries Hetzner Cloud for servers matching a label selector and updates a CiliumLoadBalancerIPPool custom resource, so that the IP pool used by load-balanced services stays up to date.

Purpose

The application bridges Hetzner Cloud infrastructure and Kubernetes/Cilium networking, ensuring that the load balancer IP pool always reflects the current set of available server instances. This automation removes the need to edit the IP pool by hand when servers are added to or removed from the cluster.

Architecture

Components

  1. Configuration Management (config.go)

    • Environment-based configuration with default values
    • Supports JSON configuration files with auto-reload capability
    • Manages Hetzner Cloud API tokens and label selectors
  2. Health Monitoring (health.go)

    • HTTP health endpoint on port 8080
    • Thread-safe health state management
    • RESTful health checks for Kubernetes probes
  3. Hetzner Cloud Integration (hetzner.go)

    • Interacts with Hetzner Cloud API
    • Discovers servers based on label selectors
    • Extracts public IPv4 addresses from server instances
  4. Kubernetes Integration (k8s.go)

    • Manages Cilium LoadBalancer IP Pool CRDs
    • In-cluster Kubernetes client configuration
    • Template-based CRD generation and updates
  5. Logging (utils/logging.go)

    • Structured JSON logging with configurable levels
    • Contextual logging with caller information

Processing Flow

graph TD
    A[Application Start] --> B[Load Configuration]
    B --> C[Configure Logger]
    C --> D[Start Health Server]
    D --> E[Enter Main Loop]
    
    E --> F[Query Hetzner Cloud API]
    F --> G{Servers Found?}
    G -->|No| H[Log Error & Set Unhealthy]
    G -->|Yes| I[Extract IP Addresses]
    
    I --> J{IPs Valid?}
    J -->|No| K[Log Error & Set Unhealthy]
    J -->|Yes| L[Get Current CRD Resource Version]
    
    L --> M[Generate IP Pool Template]
    M --> N[Update Kubernetes CRD]
    N --> O{Update Successful?}
    
    O -->|No| P[Log Error & Set Unhealthy]
    O -->|Yes| Q[Log Success & Set Healthy]
    
    H --> R[Wait 15 minutes]
    K --> R
    P --> R
    Q --> R
    R --> E
    
    subgraph "Health Endpoint"
        S[HTTP GET /health] --> T[Return Health Status]
    end
    
    subgraph "Hetzner Cloud"
        U[Server Instances] --> V[Label Selector Filter]
        V --> W[Public IPv4 Addresses]
    end
    
    subgraph "Kubernetes"
        X[CiliumLoadBalancerIPPool CRD] --> Y[IP Pool Configuration]
        Y --> Z[Load Balancer Services]
    end
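
The flow above can be sketched as a single function that runs one pass and reports failure through its return value. This is a minimal sketch, not the project's code: syncOnce, fetch, and apply are hypothetical names, where fetch stands in for the Hetzner Cloud query and apply stands in for the resource-version lookup, template generation, and Kubernetes update.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// syncOnce runs one pass of the documented flow. fetch and apply are
// hypothetical stand-ins for the Hetzner Cloud and Kubernetes steps.
func syncOnce(fetch func() ([]string, error), apply func(ips []string) error) error {
	ips, err := fetch()
	if err != nil {
		return fmt.Errorf("query servers: %w", err)
	}
	if len(ips) == 0 {
		return errors.New("no servers found")
	}
	// Reject anything that is not a plain IPv4 address before touching the cluster.
	for _, ip := range ips {
		addr, err := netip.ParseAddr(ip)
		if err != nil || !addr.Is4() {
			return fmt.Errorf("invalid IPv4 address %q", ip)
		}
	}
	if err := apply(ips); err != nil {
		return fmt.Errorf("update IP pool: %w", err)
	}
	return nil
}

func main() {
	err := syncOnce(
		func() ([]string, error) { return []string{"203.0.113.10", "203.0.113.11"}, nil },
		func(ips []string) error { fmt.Println("would update pool with", ips); return nil },
	)
	// Every failure branch in the diagram maps to a non-nil error here.
	healthy := err == nil
	fmt.Println("healthy:", healthy)
}
```

Returning an error from each branch keeps the health decision in one place: the caller marks the service healthy only when the whole pass returns nil.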

Configuration

Environment Variables

Variable                     Default                         Description
CANADA_KAKTUS_LOGLEVEL       Info                            Logging level (Debug, Info, Warn, Error)
CANADA_KAKTUS_LABELSELECTOR  kops.k8s.io/instance-role=Node  Label selector for Hetzner Cloud servers
CANADA_KAKTUS_HCLOUD_TOKEN   (required)                      Hetzner Cloud API token
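
To illustrate how these variables and defaults fit together, here is a minimal sketch using only the Go standard library. The project itself lists github.com/jinzhu/configor as its configuration dependency, so the Config struct, loadConfig, and the lookup parameter below are assumptions made for this example rather than the actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the three documented settings. The type and field names
// are assumptions for this sketch.
type Config struct {
	LogLevel      string
	LabelSelector string
	HcloudToken   string
}

// loadConfig reads the documented variables through lookup (normally
// os.Getenv) and falls back to the documented defaults.
func loadConfig(lookup func(string) string) (Config, error) {
	orDefault := func(key, fallback string) string {
		if v := lookup(key); v != "" {
			return v
		}
		return fallback
	}
	cfg := Config{
		LogLevel:      orDefault("CANADA_KAKTUS_LOGLEVEL", "Info"),
		LabelSelector: orDefault("CANADA_KAKTUS_LABELSELECTOR", "kops.k8s.io/instance-role=Node"),
		HcloudToken:   lookup("CANADA_KAKTUS_HCLOUD_TOKEN"),
	}
	if cfg.HcloudToken == "" {
		return Config{}, errors.New("CANADA_KAKTUS_HCLOUD_TOKEN is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	// The token is deliberately not printed.
	fmt.Printf("log level=%s selector=%s\n", cfg.LogLevel, cfg.LabelSelector)
}
```

Passing the lookup function in, instead of calling os.Getenv directly, makes the defaulting logic easy to exercise without changing the process environment.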

Configuration File

Optionally, a config.json file can be used with auto-reload capability:

{
  "LogLevel": "Info",
  "LabelSelector": "kops.k8s.io/instance-role=Node",
  "HcloudToken": "your-hetzner-token-here"
}
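
As a rough illustration of how such a file maps onto settings, the sketch below decodes the JSON into a struct that is pre-filled with the documented defaults, so keys missing from the file keep their default values. The Config type and parseConfig are hypothetical names; the auto-reload behavior and the precedence between file and environment are not shown because the document does not specify them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config uses the same key names as the config.json example.
// The type itself is an assumption for this sketch.
type Config struct {
	LogLevel      string
	LabelSelector string
	HcloudToken   string
}

// parseConfig decodes config.json content on top of the documented defaults.
func parseConfig(data []byte) (Config, error) {
	cfg := Config{
		LogLevel:      "Info",
		LabelSelector: "kops.k8s.io/instance-role=Node",
	}
	// json.Unmarshal only overwrites fields that are present in the input.
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(`{"LogLevel": "Debug", "HcloudToken": "your-hetzner-token-here"}`))
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Printf("log level=%s selector=%s\n", cfg.LogLevel, cfg.LabelSelector)
}
```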

Deployment

Prerequisites

  • Kubernetes cluster with Cilium CNI
  • Hetzner Cloud API token with read access to servers
  • Proper RBAC permissions for CRD management

Required Kubernetes Permissions

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: canada-kaktus
rules:
- apiGroups: ["cilium.io"]
  resources: ["ciliumloadbalancerippools"]
  verbs: ["get", "create", "update", "patch"]

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: canada-kaktus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: canada-kaktus
  template:
    metadata:
      labels:
        app: canada-kaktus
    spec:
      containers:
      - name: canada-kaktus
        image: your-registry/canada-kaktus:latest
        env:
        - name: CANADA_KAKTUS_HCLOUD_TOKEN
          valueFrom:
            secretKeyRef:
              name: hetzner-credentials
              key: token
        ports:
        - containerPort: 8080
          name: health
        livenessProbe:
          httpGet:
            path: /health
            port: health
          initialDelaySeconds: 30
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /health
            port: health
          initialDelaySeconds: 5
          periodSeconds: 10

API Endpoints

Health Check

  • URL: GET /health
  • Port: 8080
  • Response Codes:
    • 200 OK: The most recent processing cycle completed successfully
    • 503 Service Unavailable: The most recent processing cycle failed
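
The endpoint's behavior can be sketched with a mutex-protected flag that the processing loop writes and the HTTP handler reads. This is an illustrative sketch under stated assumptions: the Health type and its methods are hypothetical, the standard library's http.ServeMux replaces the gorilla/mux router the project lists as a dependency, and starting in the unhealthy state is a choice made for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Health holds the outcome of the most recent processing cycle.
// The type and method names are assumptions for this sketch.
type Health struct {
	mu      sync.RWMutex
	healthy bool
}

// Set records whether the last cycle succeeded.
func (h *Health) Set(ok bool) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.healthy = ok
}

// StatusCode maps the stored state to the documented response codes.
func (h *Health) StatusCode() int {
	h.mu.RLock()
	defer h.mu.RUnlock()
	if h.healthy {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

// ServeHTTP answers GET requests with 200 or 503 and rejects other methods.
func (h *Health) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		w.WriteHeader(http.StatusMethodNotAllowed)
		return
	}
	w.WriteHeader(h.StatusCode())
}

func main() {
	h := &Health{}
	mux := http.NewServeMux()
	mux.Handle("/health", h)

	// Exercise the handler in-process instead of binding port 8080.
	for _, ok := range []bool{true, false} {
		h.Set(ok)
		rec := httptest.NewRecorder()
		mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
		fmt.Println("healthy:", ok, "status:", rec.Code)
	}
}
```

In a real deployment the mux would be passed to http.ListenAndServe(":8080", mux) so the liveness and readiness probes can reach it.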

Generated Resources

Cilium LoadBalancer IP Pool CRD

The application generates and maintains a CiliumLoadBalancerIPPool resource:

apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: covidnetes-pool
  annotations:
    argocd.argoproj.io/tracking-id: "cilium-lb:cilium.io/CiliumLoadBalancerIPPool:kube-system/covidnetes-pool"
    managed-by: "canada-kaktus"
spec:
  blocks:
  - cidr: "192.168.1.100/32"
  - cidr: "192.168.1.101/32"
  disabled: false
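
One way to produce a manifest like this from a list of server addresses is to turn each IPv4 address into a /32 block and render it with text/template. The sketch below is an assumption about the approach, not the project's template: poolTemplate, toCIDRs, and renderPool are hypothetical names, and the annotations are left out to keep the example short.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
	"text/template"
)

// poolTemplate is a trimmed-down version of the resource shown above.
const poolTemplate = `apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: {{ .Name }}
spec:
  blocks:
{{- range .CIDRs }}
  - cidr: "{{ . }}"
{{- end }}
  disabled: false
`

var tmpl = template.Must(template.New("pool").Parse(poolTemplate))

// toCIDRs converts IPv4 addresses into single-host /32 prefixes.
func toCIDRs(ips []string) ([]string, error) {
	cidrs := make([]string, 0, len(ips))
	for _, ip := range ips {
		addr, err := netip.ParseAddr(ip)
		if err != nil || !addr.Is4() {
			return nil, fmt.Errorf("invalid IPv4 address %q", ip)
		}
		cidrs = append(cidrs, netip.PrefixFrom(addr, 32).String())
	}
	return cidrs, nil
}

// renderPool returns the YAML manifest for the given pool name and addresses.
func renderPool(name string, ips []string) (string, error) {
	cidrs, err := toCIDRs(ips)
	if err != nil {
		return "", err
	}
	data := struct {
		Name  string
		CIDRs []string
	}{Name: name, CIDRs: cidrs}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderPool("covidnetes-pool", []string{"192.168.1.100", "192.168.1.101"})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
}
```

Validating addresses before rendering means a malformed value from the cloud API results in an error for that cycle instead of an invalid manifest being sent to the cluster.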

Operation Details

Main Loop Behavior

  1. Interval: Runs every 15 minutes
  2. Error Handling: Non-fatal errors are logged and health status is updated
  3. Resilience: Continues operation despite temporary failures
  4. State Management: Maintains health status for monitoring systems
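
The loop behavior listed above can be sketched as follows: run a cycle, record its outcome, wait for the interval, and repeat, without ever exiting on a failed cycle. runLoop, demo, and the cycle callback are hypothetical names for this example, and the demo uses a few milliseconds in place of the 15-minute interval so that it finishes quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// runLoop runs cycle, stores its outcome in healthy, then waits for interval
// before the next pass. It returns only when ctx is cancelled.
func runLoop(ctx context.Context, interval time.Duration, healthy *atomic.Bool, cycle func() error) {
	for {
		err := cycle()
		healthy.Store(err == nil)
		if err != nil {
			// Non-fatal: log and try again after the interval.
			fmt.Println("cycle failed:", err)
		}
		timer := time.NewTimer(interval)
		select {
		case <-ctx.Done():
			timer.Stop()
			return
		case <-timer.C:
		}
	}
}

// demo simulates one failing cycle followed by successful ones and returns
// the number of cycles run and the final health state.
func demo() (int, bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	var healthy atomic.Bool
	calls := 0
	runLoop(ctx, 5*time.Millisecond, &healthy, func() error {
		calls++
		if calls == 1 {
			return errors.New("simulated Hetzner API outage")
		}
		if calls >= 3 {
			cancel() // stop the demo after a few passes
		}
		return nil
	})
	return calls, healthy.Load()
}

func main() {
	calls, healthy := demo()
	fmt.Println("cycles:", calls, "healthy:", healthy)
}
```

Because the health flag is overwritten on every pass, a temporary failure is reported only until the next successful cycle.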

Error Scenarios

  • Hetzner API Failures: Network issues, authentication problems, rate limiting
  • Kubernetes API Failures: RBAC issues, CRD not found, API server unavailable
  • Configuration Issues: Invalid tokens, missing permissions, malformed templates

Logging

All operations are logged in structured JSON format, including:

  • Timestamp
  • Log level
  • Caller information
  • Contextual details
  • Error messages

Example log entry:

{
  "Caller": "Main",
  "level": "info",
  "msg": "Successfully recreated IP Pool CRD",
  "time": "2025-10-07T10:30:00Z"
}

Dependencies

Go Modules

  • Hetzner Cloud SDK: github.com/hetznercloud/hcloud-go - Hetzner Cloud API client
  • Kubernetes Client: k8s.io/client-go - Kubernetes API interactions
  • Configuration: github.com/jinzhu/configor - Environment and file-based config
  • Logging: github.com/sirupsen/logrus - Structured logging
  • HTTP Router: github.com/gorilla/mux - Health endpoint routing

External Services

  • Hetzner Cloud API: Server discovery and metadata retrieval
  • Kubernetes API: CRD management and cluster integration
  • Cilium: LoadBalancer IP pool consumption

Monitoring and Observability

Health Monitoring

  • HTTP health endpoint for liveness/readiness probes
  • Health status reflects the success of the last operation cycle
  • Automatic health status updates on errors

Logging

  • Configurable log levels (Debug, Info, Warn, Error)
  • Structured JSON output for log aggregation
  • Contextual information for debugging

Metrics

Currently, the application exposes only its health status over the HTTP endpoint. For production deployments, consider adding:

  • Prometheus metrics for operation success/failure rates
  • Timing metrics for API calls
  • Counter metrics for IP pool updates

Troubleshooting

Common Issues

  1. Authentication Failures

    • Verify that the Hetzner Cloud token is valid and has the necessary permissions
    • Check that the token is correctly set in the CANADA_KAKTUS_HCLOUD_TOKEN environment variable
  2. No Servers Found

    • Verify label selector matches your server configuration
    • Check servers exist in the configured Hetzner project
  3. Kubernetes Permission Errors

    • Ensure proper RBAC permissions for CRD access
    • Verify service account has necessary cluster roles
  4. Health Endpoint Unavailable

    • Check that port 8080 is accessible
    • Verify that nothing else in the pod is already bound to port 8080

Debug Mode

Enable debug logging by setting:

export CANADA_KAKTUS_LOGLEVEL=Debug

This provides detailed information about:

  • Server discovery process
  • IP address extraction
  • CRD template generation
  • Kubernetes API interactions

Development

Building

go mod download
go build -o canada-kaktus ./cmd/main.go

Testing

go test ./internal/...

Local Development

For local testing, ensure you have:

  • Valid Hetzner Cloud token
  • Kubernetes cluster access (can use kind/minikube)
  • Cilium installed in the cluster

Set environment variables and run:

export CANADA_KAKTUS_HCLOUD_TOKEN="your-token"
go run ./cmd/main.go