FastMCP Server

Helm chart for deploying FastMCP Server on Kubernetes. FastMCP Server is an MCP (Model Context Protocol) server that dynamically loads tools, resources, prompts, and knowledge bases from multiple sources.

Key Features

  • Multi-source loading — tools, resources, prompts, and knowledge from inline ConfigMaps, S3-compatible storage (AWS S3, MinIO, R2), Git repositories, or OCI artifacts
  • Merge precedence — Inline (highest) > S3 > Git > OCI (lowest) — override remote tools locally without touching the upstream source
  • Bearer, JWT, and multi-auth — configure one auth mode or combine multiple providers through FastMCP
  • Knowledge base support — serve Markdown files as MCP resources for RAG and context injection
  • Extra pip packages — install additional Python packages at startup before loading tools
  • Tool metadata — optional __tags__, __timeout__, __annotations_mcp__ module variables for tool categorization and behavior hints
  • Resource templates — parameterized URIs like users://{user_id}/profile for dynamic resources
  • Multiple resources per file: a RESOURCES dict maps multiple URIs to handler functions
  • Error masking — hide internal error details from clients via MCP_MASK_ERROR_DETAILS
  • Duplicate handling — control behavior when tools share names via MCP_ON_DUPLICATE_TOOLS
  • Built-in Web UI — dashboard at /ui with tools/resources/prompts explorer (Alpine.js + Tailwind CDN)
  • Prometheus metrics — tool call counts, durations, errors, source sync status at /metrics
  • Structured JSON logging: LOG_FORMAT=json for Loki, ELK, CloudWatch, Datadog
  • Dedicated health endpoints: /healthz (liveness), /readyz (readiness), /startupz (startup)
  • Diagnostic endpoint: GET /debug/info with full server introspection
  • Init container pattern — pre-sync sources before server starts via initSync.enabled
  • Strict loading: MCP_STRICT_LOADING=true fails on boot if any tool/resource has errors
  • Hot reload — automatic tool/resource reload on filesystem changes via MCP_HOT_RELOAD=true
  • Periodic sync — poll S3/Git sources for changes at configurable intervals
  • Webhook reload: POST /reload endpoint for CI/CD-triggered reloads
  • OCI artifact source — pull tool bundles from OCI registries via ORAS with optional registry credentials
  • Selective sync — include/exclude glob patterns for source filtering
  • Gateway mode — compose multiple MCP servers via MCP_MODE=gateway and MCP_MOUNT_SERVERS
  • Tag visibility — enable/disable tools by tags with MCP_ENABLE_TAGS and MCP_DISABLE_TAGS
  • Multi-auth — combine bearer + JWT providers via MCP_AUTH_PROVIDERS
  • Tool-level scopes: __required_scopes__ module variable for authorization
  • Context integration — tools can use ctx: Context for progress, logging, sampling, elicitation, and session state
  • Rate limiting: __rate_limit__ module variable or MCP_RATE_LIMIT_DEFAULT env var (sliding window)
  • Caching: __cache_ttl__ module variable for idempotent tool result caching
  • Tool sandboxing: __max_memory_mb__ and __max_output_size_kb__ resource limits per tool
  • PodDisruptionBudget: pdb.enabled for zero-downtime rolling updates
  • HorizontalPodAutoscaler: autoscaling.enabled for auto-scaling based on CPU/memory
  • Security — Trivy vulnerability scan, CycloneDX SBOM, Cosign keyless signing, SLSA provenance
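
Resource templates such as users://{user_id}/profile imply some form of URI pattern matching. The sketch below illustrates that idea in plain Python. The helper names template_to_regex and match_uri are hypothetical and not part of FastMCP or this chart, and the server's real matching rules may differ.

```python
import re
from typing import Optional


def template_to_regex(template: str) -> re.Pattern:
    """Turn a URI template like 'users://{user_id}/profile' into a regex."""
    # re.split with one capture group alternates literal text and parameter names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)           # literal segment
        else:
            pattern += f"(?P<{part}>[^/]+)"      # one non-empty path segment
    return re.compile(pattern + r"\Z")


def match_uri(template: str, uri: str) -> Optional[dict]:
    """Return the extracted parameters, or None when the URI does not match."""
    match = template_to_regex(template).match(uri)
    return match.groupdict() if match else None


params = match_uri("users://{user_id}/profile", "users://42/profile")
```

Here params holds the extracted user_id, which a handler function could then receive as an argument.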

Installation

HTTPS Repository

helm repo add helmforge https://repo.helmforge.dev
helm repo update
helm install fastmcp-server helmforge/fastmcp-server

OCI Registry

helm install fastmcp-server oci://ghcr.io/helmforgedev/helm/fastmcp-server

Basic Example — Inline Tools

# values.yaml
sources:
  inline:
    tools:
      greet.py: |
        def greet(name: str) -> str:
            """Greet someone by name."""
            return f"Hello, {name}!"
      math_ops.py: |
        def add(a: float, b: float) -> float:
            """Add two numbers."""
            return a + b
        def multiply(a: float, b: float) -> float:
            """Multiply two numbers."""
            return a * b
    knowledge:
      overview.md: |
        # Product Overview
        This document provides context for the AI assistant.
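
To make the "top-level functions become tools" idea concrete, here is a minimal sketch of how a loader could discover functions in a tool file. It is an illustration only: discover_tools is a hypothetical helper, skipping names that start with an underscore is an assumption, and the actual FastMCP Server loader may work differently.

```python
import inspect
import types

# Same shape as the greet.py entry above, plus a private helper.
TOOL_SOURCE = '''
def greet(name: str) -> str:
    """Greet someone by name."""
    return f"Hello, {name}!"

def _helper():
    return None
'''


def discover_tools(source: str, module_name: str = "greet") -> dict:
    """Execute a tool file in a fresh module and collect its public functions."""
    module = types.ModuleType(module_name)
    exec(compile(source, f"{module_name}.py", "exec"), module.__dict__)
    return {
        name: obj
        for name, obj in vars(module).items()
        # Assumption: underscore-prefixed functions are treated as private.
        if inspect.isfunction(obj) and not name.startswith("_")
    }


tools = discover_tools(TOOL_SOURCE)
greeting = tools["greet"]("Ada")
description = inspect.getdoc(tools["greet"])
```

The docstring and type hints are what a loader would most plausibly use for the tool description and input schema, which is why the examples above include both.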

S3 Source (MinIO, AWS S3, Cloudflare R2)

sources:
  s3:
    enabled: true
    endpoint: 'https://minio.example.com'
    bucket: mcp-tools
    region: us-east-1
    prefix: production
    accessKey: '<access-key>'
    secretKey: '<secret-key>'
    include:
      - 'tools/**/*.py'
      - 'knowledge/**/*.md'
    exclude:
      - '**/*.tmp'
    syncInterval: 300
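
The include and exclude lists act as a filter over object keys. The following sketch shows one way such filtering could behave, assuming that `**/` matches zero or more directories, that `*` does not cross a `/`, that an empty include list selects everything, and that exclude always wins. These semantics and the helper names are assumptions, not the chart's documented behavior.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a small glob dialect ('**/', '**', '*', '?') into a regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")         # anything, including '/'
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")      # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def is_selected(path: str, include: list, exclude: list) -> bool:
    """A path is synced when it matches an include (if any) and no exclude."""
    included = not include or any(glob_to_regex(p).match(path) for p in include)
    excluded = any(glob_to_regex(p).match(path) for p in exclude)
    return included and not excluded


INCLUDE = ["tools/**/*.py", "knowledge/**/*.md"]
EXCLUDE = ["**/*.tmp"]
selected = [
    p for p in ["tools/greet.py", "tools/net/http.py", "tools/cache.tmp", "README.md"]
    if is_selected(p, INCLUDE, EXCLUDE)
]
```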

Git Source

sources:
  git:
    enabled: true
    repository: 'https://github.com/your-org/mcp-tools.git'
    branch: main
    path: '' # optional subdirectory
    token: '<github-token>' # for private repos
    allowedRepositories:
      - 'https://github.com/your-org/mcp-tools.git'
    allowedBranches:
      - main
    include:
      - 'tools/**/*.py'
      - 'resources/**/*.py'
    exclude:
      - '**/private/**'
    syncInterval: 300

OCI Source

sources:
  oci:
    enabled: true
    registry: ghcr.io/your-org/mcp-bundle
    tag: '1.0.0'
    username: '<registry-user>'
    password: '<registry-token>'
    include:
      - 'tools/**/*.py'
      - 'knowledge/**/*.md'
    exclude:
      - '**/*.tmp'

Authentication

Bearer Token

auth:
  type: bearer
  bearer:
    token: my-secret-token
    # or use an existing Kubernetes secret:
    # existingSecret: my-auth-secret
    # existingSecretKey: token
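
For orientation, bearer authentication generally means the client sends an Authorization: Bearer <token> header and the server compares it with the configured secret. The sketch below shows that comparison using a constant-time check. check_bearer is a hypothetical helper for illustration, not the code FastMCP runs.

```python
import hmac
from typing import Optional


def check_bearer(authorization_header: Optional[str], expected_token: str) -> bool:
    """Validate an 'Authorization: Bearer <token>' header value."""
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())


ok = check_bearer("Bearer my-secret-token", "my-secret-token")
```

In practice the token would come from the Kubernetes secret referenced by existingSecret rather than from a literal in values.yaml.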

JWT

auth:
  type: jwt
  jwt:
    issuer: 'https://auth.example.com'
    audience: 'mcp-server'
    jwksUri: 'https://auth.example.com/.well-known/jwks.json'

Multi-Auth and Scopes

auth:
  type: multi
  providers:
    - bearer
    - jwt
  bearer:
    existingSecret: fastmcp-auth
    existingSecretKey: token
  jwt:
    issuer: 'https://auth.example.com'
    audience: 'mcp-server'
    jwksUri: 'https://auth.example.com/.well-known/jwks.json'
    algorithm: RS256
  scopes:
    - tools:read
    - tools:execute
  requiredScopes:
    - tools:execute
  clientId: fastmcp-server
  requireHumanApprovalForDestructive: true
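
The requiredScopes setting suggests a subset check between the scopes a caller was granted and the scopes the server demands. A minimal sketch, assuming the token carries scopes as a space-delimited string (the usual OAuth 2.0 convention); the helper names are hypothetical:

```python
def parse_scope_claim(claim: str) -> set:
    """Split a space-delimited OAuth scope string into a set."""
    return set(claim.split())


def has_required_scopes(granted: set, required: list) -> bool:
    """True when every required scope is present among the granted scopes."""
    return set(required).issubset(granted)


granted = parse_scope_claim("tools:read tools:execute")
allowed = has_required_scopes(granted, ["tools:execute"])
```

With the values above, a token that only carries tools:read would be rejected, while one carrying tools:execute would pass.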

Gateway Mode

gateway:
  enabled: true
  mountServers:
    github:
      transport: streamable-http
      url: 'https://github-mcp.example.com/mcp'
      auth:
        type: bearer
        tokenEnv: GITHUB_MCP_TOKEN
    internal:
      transport: streamable-http
      url: 'http://internal-mcp.default.svc.cluster.local:8000/mcp'

extraEnv:
  - name: GITHUB_MCP_TOKEN
    valueFrom:
      secretKeyRef:
        name: github-mcp-auth
        key: token

Visibility and Reload

hotReload:
  enabled: true

visibility:
  mode: allowlist
  enableTags:
    - public
    - approved
  disableTags:
    - destructive
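
How the two tag lists interact is not spelled out here, so the following sketch states its assumptions explicitly: a disabled tag always hides a tool, allowlist mode additionally requires at least one enabled tag, and blocklist mode shows everything that is not disabled. is_visible is a hypothetical helper, not the server's implementation.

```python
def is_visible(tool_tags: set, mode: str, enable_tags: list, disable_tags: list) -> bool:
    """Decide tool visibility from its tags (assumed semantics, see lead-in)."""
    tags = set(tool_tags)
    if tags & set(disable_tags):
        return False                      # assumption: disable always wins
    if mode == "allowlist":
        return bool(tags & set(enable_tags))
    return True                           # blocklist: visible unless disabled


ENABLE = ["public", "approved"]
DISABLE = ["destructive"]
visible = is_visible({"public"}, "allowlist", ENABLE, DISABLE)
```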

Observability

Prometheus Metrics

metrics:
  enabled: true
  serviceMonitor:
    enabled: true # requires Prometheus Operator
    interval: 30s

Metrics exposed at /metrics: tool call counts, durations, errors, source sync status, auth attempts.

Structured Logging

server:
  logFormat: json # JSON output for log aggregation

Health Endpoints

| Endpoint | Type | Returns 200 when |
|---|---|---|
| /healthz | Liveness | Always (process running) |
| /readyz | Readiness | Sources synced + components loaded |
| /startupz | Startup | Full initialization complete |
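
The three probes can be read as a small decision function over server state. The sketch below models that logic. The ServerState fields and the 503 status for "not yet" are assumptions made for illustration, since only the 200 conditions are documented.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    initialized: bool = False      # full initialization complete
    sources_synced: bool = False   # all enabled sources have synced
    component_count: int = 0       # loaded tools + resources + prompts + knowledge


def probe_status(endpoint: str, state: ServerState) -> int:
    """Return the HTTP status a probe endpoint would answer with."""
    if endpoint == "/healthz":
        return 200                                  # process is running
    if endpoint == "/startupz":
        return 200 if state.initialized else 503    # 503 is an assumption
    if endpoint == "/readyz":
        ready = state.sources_synced and state.component_count > 0
        return 200 if ready else 503
    return 404


booting = ServerState()
serving = ServerState(initialized=True, sources_synced=True, component_count=3)
```

Keeping liveness independent of source state means a slow S3 or Git sync delays traffic without causing container restarts.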

Diagnostics

GET /debug/info returns server version, FastMCP version, uptime, registered components, source status, auth type, and configuration.

Rate Limiting

rateLimiting:
  default: '100/min' # global default
  perTool:
    DEPLOY: '5/min' # per-tool override
    DELETE_DATA: '2/min'

Or via module-level variable:

__rate_limit__ = "5/min"

def deploy(service: str, version: str) -> str:
    """Deploy a service (rate limited)."""
    return f"Deployed {service}@{version}"
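
A limit like "5/min" with a sliding window can be pictured as a queue of recent call timestamps. The sketch below is a generic sliding-window limiter, not the server's implementation. The class name, the injected clock, and the "sec" and "hour" units are assumptions (only "min" appears in the examples above).

```python
import time
from collections import deque

# Assumption: only "min" is shown in the docs; "sec" and "hour" are illustrative.
UNIT_SECONDS = {"sec": 1, "min": 60, "hour": 3600}


def parse_limit(spec: str) -> tuple:
    """Parse '5/min' into (5, 60)."""
    count, _, unit = spec.partition("/")
    return int(count), UNIT_SECONDS[unit]


class SlidingWindowLimiter:
    def __init__(self, spec: str, clock=time.monotonic):
        self.max_calls, self.window = parse_limit(spec)
        self.clock = clock
        self.calls = deque()   # timestamps of accepted calls, oldest first

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True


fake_time = [0.0]
limiter = SlidingWindowLimiter("2/min", clock=lambda: fake_time[0])
first_three = [limiter.allow(), limiter.allow(), limiter.allow()]
```

Injecting the clock keeps the limiter testable without sleeping; a real deployment would use the default monotonic clock.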

Caching

__cache_ttl__ = 300  # cache results for 5 minutes

def get_exchange_rate(currency: str) -> float:
    """Get current exchange rate (cached 5min)."""
    ...

Chart-level cache settings:

caching:
  enabled: true # default
  maxSize: 1000 # max entries per tool
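
The combination of a TTL and a maximum entry count can be sketched as a small bounded cache keyed by call arguments. This is an illustration under stated assumptions: TTLCache is a hypothetical class, arguments must be hashable, and evicting the oldest stored entry when the cache is full is a guess at the policy.

```python
import time
from collections import OrderedDict


class TTLCache:
    def __init__(self, ttl: float, max_size: int = 1000, clock=time.monotonic):
        self.ttl = ttl
        self.max_size = max_size
        self.clock = clock
        self.entries = OrderedDict()   # args -> (stored_at, value)

    def get_or_call(self, func, *args):
        now = self.clock()
        entry = self.entries.get(args)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]                       # fresh hit
        value = func(*args)                       # miss or expired
        self.entries[args] = (now, value)
        self.entries.move_to_end(args)
        while len(self.entries) > self.max_size:
            self.entries.popitem(last=False)      # assumption: evict oldest
        return value


calls = []


def get_exchange_rate(currency: str) -> float:
    calls.append(currency)   # record real invocations
    return 1.5               # placeholder value, not real data


fake_time = [0.0]
cache = TTLCache(ttl=300, clock=lambda: fake_time[0])
cache.get_or_call(get_exchange_rate, "EUR")
cache.get_or_call(get_exchange_rate, "EUR")   # served from cache
```

Caching only makes sense for idempotent tools, which is why it is opt-in per module through __cache_ttl__.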

Tool Sandboxing

__max_memory_mb__ = 256
__max_output_size_kb__ = 100

def process_data(data: str) -> str:
    """Process data with resource limits."""
    ...
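
Module-level variables like these are ordinary Python attributes, so a loader can read them with getattr and fall back to defaults. The sketch below shows that pattern together with a simple output-size check. The helper names, the default values, and the reading of 0 as "no limit" are assumptions, not documented behavior.

```python
import types

# Assumed defaults; 0 is taken to mean "no limit" for the sandbox values.
DEFAULTS = {
    "__tags__": frozenset(),
    "__timeout__": None,
    "__rate_limit__": None,
    "__cache_ttl__": None,
    "__max_memory_mb__": 0,
    "__max_output_size_kb__": 0,
}


def read_metadata(module: types.ModuleType) -> dict:
    """Collect optional tool metadata from a module, applying defaults."""
    return {name: getattr(module, name, default) for name, default in DEFAULTS.items()}


def within_output_limit(output: str, max_kb: int) -> bool:
    """Check the UTF-8 size of a tool result against a KB limit."""
    if max_kb <= 0:
        return True
    return len(output.encode("utf-8")) <= max_kb * 1024


# Stand-in for the process_data module shown above.
module = types.ModuleType("process_data")
module.__max_memory_mb__ = 256
module.__max_output_size_kb__ = 100

metadata = read_metadata(module)
```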

Autoscaling

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

pdb:
  enabled: true
  minAvailable: 1

Key Values

| Key | Default | Description |
|---|---|---|
| image.repository | docker.io/helmforge/fastmcp-server | Container image |
| image.tag | 0.11.0 | Image tag |
| server.name | fastmcp-server | Server name in MCP responses |
| server.version | "" | Server version env; empty uses chart appVersion |
| server.environment | dev | Runtime environment (MCP_ENV) |
| server.host | 0.0.0.0 | Bind host |
| server.port | 8000 | HTTP port |
| server.path | /mcp | MCP endpoint path |
| server.workspace | /app/workspace | Runtime source workspace |
| server.logFormat | text | Log format: text or json |
| server.strictLoading | false | Fail on boot if component errors |
| server.maskErrorDetails | "" | Override runtime error masking behavior |
| server.onDuplicateTools | error | Duplicate tool policy: error, warn, replace |
| server.corsAllowedOrigins | [] | Allowed CORS origins |
| server.maxSourceFileSizeBytes | 1048576 | Max source file size |
| server.maxKnowledgeBytes | 10485760 | Max knowledge file size |
| server.allowedKnowledgeExtensions | [".md", ".txt"] | Knowledge file extensions |
| server.strategy.type | Recreate | Deployment strategy |
| server.revisionHistoryLimit | 10 | Deployment revision history |
| ui.enabled | true | Enable Web UI at /ui |
| metrics.enabled | false | Enable Prometheus metrics at /metrics |
| metrics.serviceMonitor.enabled | false | Create ServiceMonitor when metrics are enabled |
| auth.type | none | Authentication: none, bearer, jwt, multi |
| auth.allowNoAuth | false | Explicitly allow no-auth production deployments |
| auth.bearer.token | "" | Bearer token secret value |
| auth.bearer.existingSecret | "" | Existing bearer token secret |
| auth.jwt.issuer | "" | JWT issuer |
| auth.jwt.audience | "" | JWT audience |
| auth.jwt.jwksUri | "" | JWT JWKS URI |
| auth.jwt.algorithm | RS256 | JWT verification algorithm |
| auth.jwt.publicKeyExistingSecret | "" | Existing JWT public key secret |
| auth.scopes | [] | Advertised auth scopes |
| auth.requiredScopes | [] | Required request scopes |
| auth.clientId | "" | OAuth client identifier |
| auth.providers | [] | Providers used by multi auth |
| auth.reloadRequiredScopes | [] | Scopes required for reload operations |
| auth.requireHumanApprovalForDestructive | true | Require approval metadata for destructive tools |
| rateLimiting.default | "" | Default rate limit (e.g., 100/min) |
| caching.enabled | true | Enable result caching |
| sandboxing.maxMemoryMb | 0 | Default max memory per tool (MB) |
| sandboxing.maxOutputSizeKb | 0 | Default max output per tool (KB) |
| sources.blockedFileAllowlist | [] | Explicit allowlist for blocked file names |
| sources.inline.dir | /workspace/inline | Inline ConfigMap mount directory |
| sources.inline.tools | {} | Inline Python tool files |
| sources.inline.resources | {} | Inline resource files |
| sources.inline.prompts | {} | Inline prompt files |
| sources.inline.knowledge | {} | Inline knowledge base files |
| sources.s3.enabled | false | Enable S3 source |
| sources.s3.bucket | "" | S3 bucket name |
| sources.s3.include | [] | S3 include glob patterns |
| sources.s3.exclude | [] | S3 exclude glob patterns |
| sources.s3.syncInterval | 0 | S3 polling interval in seconds |
| sources.git.enabled | false | Enable Git source |
| sources.git.repository | "" | Git repository HTTPS URL |
| sources.git.username | "" | Optional Git username |
| sources.git.allowedRepositories | [] | Allowed Git repository URLs |
| sources.git.allowedBranches | [] | Allowed Git branches |
| sources.git.include | [] | Git include glob patterns |
| sources.git.exclude | [] | Git exclude glob patterns |
| sources.git.syncInterval | 0 | Git polling interval in seconds |
| sources.oci.enabled | false | Enable OCI artifact source |
| sources.oci.registry | "" | OCI artifact reference |
| sources.oci.tag | "" | OCI artifact tag; empty lets runtime decide |
| sources.oci.existingSecret | "" | Existing OCI registry credential secret |
| sources.oci.include | [] | OCI include glob patterns |
| sources.oci.exclude | [] | OCI exclude glob patterns |
| hotReload.enabled | false | Enable filesystem hot reload |
| gateway.enabled | false | Run server in gateway mode |
| gateway.mountServers | {} | Gateway mount server map |
| gateway.rawMountServersJson | "" | Raw MCP_MOUNT_SERVERS JSON override |
| visibility.mode | blocklist | Tool visibility mode |
| visibility.enableTags | [] | Allowlisted tags |
| visibility.disableTags | [] | Hidden tags |
| extraPipPackages | [] | Extra pip packages to install at startup |
| initSync.enabled | false | Run source sync as init container |
| persistence.enabled | false | Enable persistent workspace volume |
| serviceAccount.create | true | Create a dedicated Kubernetes ServiceAccount |
| serviceAccount.automountServiceAccountToken | false | Automount Kubernetes API token |
| networkPolicy.enabled | false | Create NetworkPolicy |
| networkPolicy.ingress | [] | Custom ingress rules; empty uses service port |
| networkPolicy.egress | [] | Custom egress rules; empty allows outbound sync |
| autoscaling.enabled | false | Enable HPA |
| pdb.enabled | false | Enable PodDisruptionBudget |
| ingress.enabled | false | Enable ingress |

Operational Notes

  • Merge precedence is Inline > S3 > Git > OCI — if a tool with the same filename exists in multiple sources, the highest-precedence version wins
  • Production-like environments require an auth mode unless auth.allowNoAuth=true is set explicitly
  • ServiceMonitor requires metrics.enabled=true
  • Gateway mode requires either gateway.mountServers or gateway.rawMountServersJson
  • Knowledge base files are served as MCP resources at knowledge://{filename} URIs
  • Tools are Python files with top-level functions; resources need a RESOURCE_URI or RESOURCES module-level variable
  • Tools support optional metadata: __tags__ (set), __timeout__ (float), __annotations_mcp__ (dict)
  • Resource URIs can use {param} placeholders for dynamic templates
  • The extraPipPackages list installs before tools load — use it when tools import external libraries
  • The Web UI auto-refreshes every 15 seconds and requires no external dependencies
  • Init container pattern (initSync.enabled) separates source syncing from server startup for better Kubernetes readiness semantics
  • Default readiness requires at least one loaded tool, resource, prompt, or knowledge file
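
The merge precedence rule in the first note can be sketched as layering dictionaries from lowest to highest priority, so that later layers overwrite earlier ones by filename. This is a minimal illustration. merge_sources and the sample file contents are hypothetical, and only the ordering Inline > S3 > Git > OCI comes from the notes above.

```python
# Lowest precedence first, so later sources overwrite earlier ones.
PRECEDENCE = ["oci", "git", "s3", "inline"]


def merge_sources(files_by_source: dict) -> dict:
    """Map each filename to the (source, content) pair that wins."""
    merged = {}
    for source in PRECEDENCE:
        for filename, content in files_by_source.get(source, {}).items():
            merged[filename] = (source, content)
    return merged


# Hypothetical contents for illustration only.
merged = merge_sources({
    "oci": {"greet.py": "oci version"},
    "git": {"greet.py": "git version", "deploy.py": "git deploy"},
    "inline": {"greet.py": "inline version"},
})
```

In this example greet.py resolves to the inline copy while deploy.py, which exists only in Git, is taken from Git. That is the mechanism behind overriding a remote tool locally without touching the upstream source.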

More Information