Module 13: Production Patterns & Security¶

This module covers patterns for running Solana DApps in production: smart contract security, CI/CD pipelines, deployment strategies, monitoring, cost optimization, incident response, and program upgrades.

Learning Objectives¶

By the end of this module, you will be able to:

Implement security best practices for smart contracts
Set up comprehensive CI/CD pipelines with GitHub Actions
Deploy using blue-green and canary strategies
Monitor and respond to production incidents
Optimize compute units and reduce costs

Prerequisites¶

Completed all previous modules
Understanding of Kubernetes basics
Familiarity with GitHub Actions
Basic security awareness

Part A: Smart Contract Security¶

Security Checklist¶

Before deploying any Solana program to mainnet, verify these security requirements:

Category	Check	Priority
Authorization	All instructions verify required signers	Critical
Ownership	Account ownership verified before mutation	Critical
Math	All arithmetic uses checked_* operations	Critical
Seeds	PDA seeds include unique identifiers	High
Reentrancy	State updated before external CPIs	High
Validation	All inputs validated and bounded	High
Accounts	Account discriminators checked	Medium
Events	Critical actions emit events	Medium

Signer Verification¶

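Every instruction that modifies state must verify that the expected authority signed the transaction, and that this signer is the authority recorded on the account being modified: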

// INCORRECT - No signer verification
pub fn unsafe_transfer(ctx: Context<Transfer>, amount: u64) -> Result<()> {
    // Anyone can call this!
    ctx.accounts.vault.amount -= amount;
    Ok(())
}

// CORRECT - Proper signer verification
pub fn safe_transfer(ctx: Context<SafeTransfer>, amount: u64) -> Result<()> {
    // Authority must sign
    require!(
        ctx.accounts.authority.key() == ctx.accounts.vault.authority,
        ErrorCode::Unauthorized
    );
    ctx.accounts.vault.amount = ctx.accounts.vault.amount
        .checked_sub(amount)
        .ok_or(ErrorCode::InsufficientFunds)?;
    Ok(())
}

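#[derive(Accounts)]
pub struct SafeTransfer<'info> {
    // has_one enforces vault.authority == authority.key() during account validation
    #[account(mut, has_one = authority @ ErrorCode::Unauthorized)]
    pub vault: Account<'info, Vault>,
    // The Signer type already enforces a signature; no extra constraint is needed
    pub authority: Signer<'info>,
}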

Account Ownership Verification¶

Always verify that accounts are owned by expected programs:

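#[derive(Accounts)]
pub struct ProcessPayment<'info> {
    // Account<'info, TokenAccount> already verifies that the Token Program owns
    // this account. Note that `token_account.owner` is the token authority field
    // (a wallet), not the owning program, so an explicit program-ownership check
    // must compare the AccountInfo owner instead.
    #[account(
        mut,
        constraint = *token_account.to_account_info().owner == token::ID @ ErrorCode::InvalidOwner
    )]
    pub token_account: Account<'info, TokenAccount>,

    // Anchor automatically verifies Account<'info, T> ownership, so this
    // constraint is redundant here but documents intent. It is required for
    // unchecked types such as UncheckedAccount or AccountInfo.
    #[account(
        owner = crate::ID @ ErrorCode::InvalidProgramOwner
    )]
    pub escrow: Account<'info, Escrow>,
}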

Checked Arithmetic¶

Always use checked arithmetic to prevent overflow/underflow:

// INCORRECT - Can overflow
pub fn unsafe_math(amount1: u64, amount2: u64) -> u64 {
    amount1 + amount2  // Panics if overflow checks are enabled, silently wraps otherwise
}

// CORRECT - Checked operations
pub fn safe_math(amount1: u64, amount2: u64) -> Result<u64> {
    amount1
        .checked_add(amount2)
        .ok_or(ErrorCode::MathOverflow.into())
}

// For complex calculations, use a dedicated module
pub mod math {
    use super::*;

    pub fn calculate_swap_output(
        input_amount: u64,
        input_reserve: u64,
        output_reserve: u64,
        fee_rate: u16,
    ) -> Result<u64> {
        // Calculate fee
        let fee = input_amount
            .checked_mul(fee_rate as u64)
            .ok_or(ErrorCode::MathOverflow)?
            .checked_div(10000)
            .ok_or(ErrorCode::MathOverflow)?;

        // Input after fee
        let input_after_fee = input_amount
            .checked_sub(fee)
            .ok_or(ErrorCode::MathOverflow)?;

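        // Constant product formula.
        // With u64 operands this product returns MathOverflow for large reserves;
        // widen the intermediates to u128 if such values must be supported.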
        let numerator = output_reserve
            .checked_mul(input_after_fee)
            .ok_or(ErrorCode::MathOverflow)?;

        let denominator = input_reserve
            .checked_add(input_after_fee)
            .ok_or(ErrorCode::MathOverflow)?;

        numerator
            .checked_div(denominator)
            .ok_or(ErrorCode::MathOverflow.into())
    }
}
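The u64 version above rejects any swap whose `output_reserve * input_after_fee` product exceeds 64 bits, even though the final quotient would fit. The following is a minimal sketch of the same formula with u128 intermediates, written in plain Rust without Anchor. The names `MathError` and `swap_output_u128` are hypothetical and exist only for this illustration.

```rust
/// Errors for this sketch (hypothetical; the module's own code uses ErrorCode).
#[derive(Debug, PartialEq)]
pub enum MathError {
    InvalidFee,
    ZeroReserve,
    Overflow,
}

/// Constant-product swap output with the fee given in basis points.
/// Intermediates are widened to u128 so reserve * amount cannot overflow.
pub fn swap_output_u128(
    input_amount: u64,
    input_reserve: u64,
    output_reserve: u64,
    fee_bps: u16,
) -> Result<u64, MathError> {
    if fee_bps > 10_000 {
        return Err(MathError::InvalidFee);
    }
    if input_reserve == 0 || output_reserve == 0 {
        return Err(MathError::ZeroReserve);
    }

    let input = u128::from(input_amount);
    // fee <= input because fee_bps <= 10_000, so the subtraction cannot underflow
    let fee = input * u128::from(fee_bps) / 10_000;
    let input_after_fee = input - fee;

    // Both factors fit in 64 bits, so the product fits in 128 bits
    let numerator = u128::from(output_reserve) * input_after_fee;
    let denominator = u128::from(input_reserve) + input_after_fee;

    // The quotient never exceeds output_reserve, but convert defensively
    u64::try_from(numerator / denominator).map_err(|_| MathError::Overflow)
}
```

For example, swapping 1,000 units into a pool with reserves of 1,000,000 on each side at a 30 bps fee yields 996 units: the fee is 3, and 1,000,000 × 997 / 1,000,997 rounds down to 996.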

PDA Seed Security¶

Use unique, collision-resistant seeds for PDAs:

// INCORRECT - Seeds not unique enough
#[account(
    seeds = [b"escrow"],  // Only one escrow possible!
    bump
)]
pub escrow: Account<'info, Escrow>,

// CORRECT - Unique seeds with user and identifier
// (escrow_id is an instruction argument, declared with #[instruction(escrow_id: u64)])
#[account(
    seeds = [
        b"escrow",
        maker.key().as_ref(),
        &escrow_id.to_le_bytes()
    ],
    bump = escrow.bump
)]
pub escrow: Account<'info, Escrow>,
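Fixed-width seed components such as a 32-byte public key and an 8-byte little-endian ID also avoid a subtler collision. The sketch below assumes that PDA derivation hashes the seed bytes back to back without recording where one seed ends, in which case two different splits of variable-length seeds produce the same input. It uses plain Rust, and the helpers `concat_seeds` and `escrow_seed_bytes` are hypothetical stand-ins for the real derivation.

```rust
/// Joins seeds the way a hash over back-to-back seed bytes would see them.
/// Assumption: seed boundaries are not part of the hashed input.
pub fn concat_seeds(seeds: &[&[u8]]) -> Vec<u8> {
    seeds.iter().flat_map(|seed| seed.iter().copied()).collect()
}

/// Escrow seed bytes built only from fixed-width fields:
/// a constant prefix, a 32-byte maker key, and an 8-byte little-endian ID.
pub fn escrow_seed_bytes(maker: &[u8; 32], escrow_id: u64) -> Vec<u8> {
    let id_bytes = escrow_id.to_le_bytes();
    concat_seeds(&[&b"escrow"[..], &maker[..], &id_bytes[..]])
}
```

Under that assumption, the seed lists `["ab", "c"]` and `["a", "bc"]` hash identically, while the fixed-width escrow seeds differ whenever the maker or the ID differs.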

Reentrancy Protection¶

Follow the checks-effects-interactions order: validate the request, update program state, and only then make cross-program invocations:

pub fn withdraw(ctx: Context<Withdraw>, amount: u64) -> Result<()> {
    let escrow = &mut ctx.accounts.escrow;

    // FIRST: Check conditions
    require!(escrow.balance >= amount, ErrorCode::InsufficientFunds);
    require!(escrow.authority == ctx.accounts.authority.key(), ErrorCode::Unauthorized);

    // SECOND: Update state BEFORE external call (return an error instead of panicking)
    escrow.balance = escrow
        .balance
        .checked_sub(amount)
        .ok_or(ErrorCode::InsufficientFunds)?;
    let bump = escrow.bump;

    // THIRD: Make external call (CPI to Token Program)
    let cpi_accounts = Transfer {
        from: ctx.accounts.vault.to_account_info(),
        to: ctx.accounts.recipient.to_account_info(),
        authority: ctx.accounts.escrow.to_account_info(),
    };
    let cpi_program = ctx.accounts.token_program.to_account_info();
    // Signer seeds must match the seeds the escrow PDA was derived with
    let seeds: &[&[u8]] = &[b"escrow", &[bump]];
    let signer = &[seeds];
    let cpi_ctx = CpiContext::new_with_signer(cpi_program, cpi_accounts, signer);
    token::transfer(cpi_ctx, amount)?;

    Ok(())
}
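The same ordering can be exercised off-chain. The following is a minimal in-memory model in plain Rust, not Anchor code: the `Escrow` struct, the `WithdrawError` enum, and the `transfer` callback standing in for the CPI are all hypothetical. One difference from on-chain behavior is labeled in the code: a failed CPI aborts the whole transaction and reverts state automatically, whereas this model has to restore the balance by hand.

```rust
#[derive(Debug, PartialEq)]
pub enum WithdrawError {
    Unauthorized,
    InsufficientFunds,
    TransferFailed,
}

pub struct Escrow {
    pub authority: [u8; 32],
    pub balance: u64,
}

/// Checks, then effects, then the external interaction.
pub fn withdraw<F>(
    escrow: &mut Escrow,
    signer: &[u8; 32],
    amount: u64,
    transfer: F,
) -> Result<(), WithdrawError>
where
    F: FnOnce(u64) -> Result<(), WithdrawError>,
{
    // Checks: caller identity and available balance
    if &escrow.authority != signer {
        return Err(WithdrawError::Unauthorized);
    }
    let new_balance = escrow
        .balance
        .checked_sub(amount)
        .ok_or(WithdrawError::InsufficientFunds)?;

    // Effects: record the new balance before calling out
    let previous = escrow.balance;
    escrow.balance = new_balance;

    // Interaction: the callback stands in for the token transfer CPI
    if let Err(err) = transfer(amount) {
        // Manual rollback; on-chain the runtime reverts the transaction instead
        escrow.balance = previous;
        return Err(err);
    }
    Ok(())
}
```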

Part B: CI/CD with GitHub Actions¶

Comprehensive Test Workflow¶

# .github/workflows/test.yml
name: Test

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  SOLANA_VERSION: 1.18.0
  ANCHOR_VERSION: 0.30.0
  NODE_VERSION: 20
  RUST_VERSION: 1.75.0

jobs:
  # ==========================================
  # Anchor Program Tests
  # ==========================================
  anchor-test:
    name: Anchor Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: ${{ env.RUST_VERSION }}
          components: clippy, rustfmt

      - name: Cache Cargo
        uses: actions/cache@v4
        with:
          path: |
            ~/.cargo/bin/
            ~/.cargo/registry/index/
            ~/.cargo/registry/cache/
            ~/.cargo/git/db/
            target/
          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}

      - name: Install Solana
        run: |
          sh -c "$(curl -sSfL https://release.solana.com/v${{ env.SOLANA_VERSION }}/install)"
          echo "$HOME/.local/share/solana/install/active_release/bin" >> $GITHUB_PATH

      - name: Install Anchor
        run: |
          cargo install --git https://github.com/coral-xyz/anchor --tag v${{ env.ANCHOR_VERSION }} anchor-cli --locked

      # pnpm must be installed before setup-node can resolve the pnpm cache
      - name: Install pnpm
        run: npm install -g pnpm

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'pnpm'

      - name: Install Dependencies
        run: pnpm install

      - name: Build Programs
        run: anchor build

      - name: Run Tests
        run: anchor test

      - name: Upload Coverage
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage/lcov.info
          fail_ci_if_error: false

  # ==========================================
  # Rust Linting and Security
  # ==========================================
  rust-quality:
    name: Rust Quality
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: ${{ env.RUST_VERSION }}
          components: clippy, rustfmt

      - name: Check Formatting
        run: cargo fmt --all -- --check

      - name: Clippy Lints
        run: cargo clippy --all-targets --all-features -- -D warnings

      - name: Security Audit
        run: |
          # Checks dependencies against the RustSec advisory database
          cargo install cargo-audit --locked
          cargo audit

  # ==========================================
  # Frontend Tests
  # ==========================================
  frontend-test:
    name: Frontend Tests
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./app
    steps:
      - uses: actions/checkout@v4

      # pnpm must be installed before setup-node can resolve the pnpm cache
      - name: Install pnpm
        run: npm install -g pnpm

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'pnpm'
          cache-dependency-path: app/pnpm-lock.yaml

      - name: Install Dependencies
        run: pnpm install

      - name: Type Check
        run: pnpm type-check

      - name: Lint
        run: pnpm lint

      - name: Test
        run: pnpm test:ci

      - name: Build
        run: pnpm build

  # ==========================================
  # API Tests
  # ==========================================
  api-test:
    name: API Tests
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./api
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
          POSTGRES_DB: test_db
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install Poetry
        run: pip install poetry

      - name: Install Dependencies
        run: poetry install

      - name: Run Tests
        env:
          DATABASE_URL: postgresql://postgres:test@localhost:5432/test_db
        run: poetry run pytest --cov=app --cov-report=xml

      - name: Upload Coverage
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage.xml

  # ==========================================
  # Rust Services Tests
  # ==========================================
  services-test:
    name: Services Tests
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./services
    steps:
      - uses: actions/checkout@v4

      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: ${{ env.RUST_VERSION }}

      - name: Build Services
        run: cargo build --workspace

      - name: Test Services
        run: cargo test --workspace

Deploy to Devnet Workflow¶

# .github/workflows/deploy-devnet.yml
name: Deploy to Devnet

on:
  push:
    branches: [develop]
  workflow_dispatch:

env:
  SOLANA_VERSION: 1.18.0
  ANCHOR_VERSION: 0.30.0

jobs:
  deploy-programs:
    name: Deploy Programs
    runs-on: ubuntu-latest
    environment: devnet
    steps:
      - uses: actions/checkout@v4

      - name: Install Solana
        run: |
          sh -c "$(curl -sSfL https://release.solana.com/v${{ env.SOLANA_VERSION }}/install)"
          echo "$HOME/.local/share/solana/install/active_release/bin" >> $GITHUB_PATH

      - name: Install Anchor
        run: cargo install --git https://github.com/coral-xyz/anchor --tag v${{ env.ANCHOR_VERSION }} anchor-cli --locked

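      - name: Setup Keypair
        env:
          DEPLOYER_KEYPAIR: ${{ secrets.DEVNET_DEPLOYER_KEYPAIR }}
        run: |
          # The config directory does not exist on a fresh runner
          mkdir -p ~/.config/solana
          echo "$DEPLOYER_KEYPAIR" > ~/.config/solana/id.json
          chmod 600 ~/.config/solana/id.json
          solana config set --url devnet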

      - name: Build Programs
        run: anchor build

      - name: Deploy Programs
        run: |
          anchor deploy --provider.cluster devnet

      - name: Verify Deployment
        run: |
          solana program show $(solana address -k target/deploy/token_escrow-keypair.json)
          solana program show $(solana address -k target/deploy/nft_marketplace-keypair.json)
          solana program show $(solana address -k target/deploy/defi_amm-keypair.json)
          solana program show $(solana address -k target/deploy/dao_governance-keypair.json)

  deploy-services:
    name: Deploy Services
    runs-on: ubuntu-latest
    needs: deploy-programs
    environment: devnet
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Login to ECR
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build and Push Images
        run: |
          # Build API
          docker build -t ${{ secrets.ECR_REGISTRY }}/solana-dapps-api:${{ github.sha }} ./api
          docker push ${{ secrets.ECR_REGISTRY }}/solana-dapps-api:${{ github.sha }}

          # Build App
          docker build -t ${{ secrets.ECR_REGISTRY }}/solana-dapps-app:${{ github.sha }} ./app
          docker push ${{ secrets.ECR_REGISTRY }}/solana-dapps-app:${{ github.sha }}

          # Build Indexer
          docker build -t ${{ secrets.ECR_REGISTRY }}/solana-dapps-indexer:${{ github.sha }} ./services/indexer
          docker push ${{ secrets.ECR_REGISTRY }}/solana-dapps-indexer:${{ github.sha }}

          # Build Relay
          docker build -t ${{ secrets.ECR_REGISTRY }}/solana-dapps-relay:${{ github.sha }} ./services/relay
          docker push ${{ secrets.ECR_REGISTRY }}/solana-dapps-relay:${{ github.sha }}

      - name: Deploy to Kubernetes
        run: |
          aws eks update-kubeconfig --name solana-dapps-devnet
          helm upgrade --install solana-dapps ./k8s/helm/solana-dapps \
            --namespace solana-dapps-devnet \
            --values ./k8s/helm/solana-dapps/values-devnet.yaml \
            --set api.image.tag=${{ github.sha }} \
            --set app.image.tag=${{ github.sha }} \
            --set indexer.image.tag=${{ github.sha }} \
            --set relay.image.tag=${{ github.sha }} \
            --wait --timeout 10m

Mainnet Release Workflow¶

# .github/workflows/release.yml
name: Release to Mainnet

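on:
  release:
    types: [published]

# Tool versions referenced by the install steps below
env:
  SOLANA_VERSION: 1.18.0
  ANCHOR_VERSION: 0.30.0

jobs:
  verify:
    name: Verify Release
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Verify Tests Passed
        run: |
          # Fail unless a successful "Test" workflow run exists for this commit
          gh run list --commit ${{ github.sha }} --status success --json name | jq -e '.[] | select(.name == "Test")'
        env:
          GH_TOKEN: ${{ github.token }}

      - name: Verify Audit
        run: |
          # Check dependencies for known advisories (this does not replace a security audit)
          cargo install cargo-audit --locked
          cargo audit

      - name: Install Solana
        run: |
          sh -c "$(curl -sSfL https://release.solana.com/v${{ env.SOLANA_VERSION }}/install)"
          echo "$HOME/.local/share/solana/install/active_release/bin" >> $GITHUB_PATH

      - name: Install Anchor
        run: cargo install --git https://github.com/coral-xyz/anchor --tag v${{ env.ANCHOR_VERSION }} anchor-cli --locked

      - name: Verify Build
        run: anchor build --verifiable

  deploy-mainnet:
    name: Deploy to Mainnet
    runs-on: ubuntu-latest
    needs: verify
    environment: mainnet
    steps:
      - uses: actions/checkout@v4

      - name: Install Solana
        run: |
          sh -c "$(curl -sSfL https://release.solana.com/v${{ env.SOLANA_VERSION }}/install)"
          echo "$HOME/.local/share/solana/install/active_release/bin" >> $GITHUB_PATH

      - name: Install Anchor
        run: cargo install --git https://github.com/coral-xyz/anchor --tag v${{ env.ANCHOR_VERSION }} anchor-cli --locked

      - name: Setup Keypair
        env:
          DEPLOYER_KEYPAIR: ${{ secrets.MAINNET_DEPLOYER_KEYPAIR }}
        run: |
          # The config directory does not exist on a fresh runner
          mkdir -p ~/.config/solana
          echo "$DEPLOYER_KEYPAIR" > ~/.config/solana/id.json
          chmod 600 ~/.config/solana/id.json
          solana config set --url mainnet-beta

      - name: Build Verifiable
        run: anchor build --verifiable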

      - name: Deploy with Confirmation
        run: |
          # Deploy each program with confirmation
          anchor deploy --provider.cluster mainnet-beta

      - name: Verify On-Chain
        run: |
          # Verify deployed bytecode matches
          anchor verify $(solana address -k target/deploy/token_escrow-keypair.json)

  notify:
    name: Notify
    runs-on: ubuntu-latest
    needs: deploy-mainnet
    steps:
      - name: Notify Slack
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {
              "text": "Mainnet deployment complete for ${{ github.event.release.tag_name }}",
              "blocks": [
                {
                  "type": "section",
                  "text": {
                    "type": "mrkdwn",
                    "text": "*Solana DApps Mainnet Release*\n*Version:* ${{ github.event.release.tag_name }}\n*Commit:* ${{ github.sha }}"
                  }
                }
              ]
            }
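        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
          # Required by v1 of the action when the URL is an incoming webhook
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK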

Part C: Deployment Strategies¶

Blue-Green Deployment¶

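Blue-green deployments run two identical production environments side by side: one serves live traffic while the other hosts the new version, and a change to the Service selector moves traffic between them: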

# k8s/helm/solana-dapps/templates/deployment-blue-green.yaml
{{- if .Values.blueGreen.enabled }}
# Blue Deployment (Current Production)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "api.fullname" . }}-blue
  labels:
    {{- include "api.labels" . | nindent 4 }}
    version: blue
spec:
  replicas: {{ .Values.blueGreen.blue.replicas }}
  selector:
    matchLabels:
      {{- include "api.selectorLabels" . | nindent 6 }}
      version: blue
  template:
    metadata:
      labels:
        {{- include "api.selectorLabels" . | nindent 8 }}
        version: blue
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.blueGreen.blue.tag }}"
          # ... rest of container spec
---
# Green Deployment (New Version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "api.fullname" . }}-green
  labels:
    {{- include "api.labels" . | nindent 4 }}
    version: green
spec:
  replicas: {{ .Values.blueGreen.green.replicas }}
  selector:
    matchLabels:
      {{- include "api.selectorLabels" . | nindent 6 }}
      version: green
  template:
    metadata:
      labels:
        {{- include "api.selectorLabels" . | nindent 8 }}
        version: green
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.blueGreen.green.tag }}"
---
# Service points to active color
apiVersion: v1
kind: Service
metadata:
  name: {{ include "api.fullname" . }}
spec:
  selector:
    {{- include "api.selectorLabels" . | nindent 4 }}
    version: {{ .Values.blueGreen.active }}  # "blue" or "green"
  ports:
    - port: 8000
      targetPort: http
{{- end }}

Blue-green switch script:

#!/bin/bash
# scripts/switch-blue-green.sh
# Exit on any error so traffic is never switched after a failed health check
set -euo pipefail

NAMESPACE="solana-dapps"
CURRENT=$(kubectl get svc solana-dapps-api -n "$NAMESPACE" -o jsonpath='{.spec.selector.version}')

if [ "$CURRENT" == "blue" ]; then
    NEW="green"
else
    NEW="blue"
fi

echo "Switching from $CURRENT to $NEW..."

# Verify new deployment is healthy (aborts the script if the rollout does not complete)
kubectl rollout status "deployment/solana-dapps-api-$NEW" -n "$NAMESPACE" --timeout=120s

# Switch traffic
kubectl patch svc solana-dapps-api -n "$NAMESPACE" -p "{\"spec\":{\"selector\":{\"version\":\"$NEW\"}}}"

echo "Traffic now pointing to $NEW"

Canary Deployment¶

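Gradually shift traffic to the new version while checking its metrics. The Flagger configuration below raises the canary's share of traffic in 10% steps every minute up to 50%, requires a 99% request success rate and request durations under 500 ms, and rolls back after 5 failed checks: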

# k8s/helm/solana-dapps/templates/canary.yaml
{{- if .Values.canary.enabled }}
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: {{ include "api.fullname" . }}
  namespace: {{ .Release.Namespace }}
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "api.fullname" . }}
  progressDeadlineSeconds: 60
  service:
    port: 8000
    targetPort: http
    gateways:
      - public-gateway
    hosts:
      - {{ .Values.global.ingress.host }}
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      - name: load-test
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://{{ include "api.fullname" . }}-canary:8000/"
{{- end }}
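To see how `stepWeight`, `maxWeight`, and `threshold` interact, the analysis loop can be modeled in a few lines. This is a simplified sketch in plain Rust, not Flagger's actual implementation: it assumes one check per interval, that failed checks accumulate over the whole analysis, and that the canary is promoted as soon as it reaches `maxWeight`. The names `canary_weights`, `run_analysis`, and `Outcome` are hypothetical.

```rust
/// Traffic weights the canary passes through if every check succeeds.
pub fn canary_weights(step_weight: u32, max_weight: u32) -> Vec<u32> {
    assert!(step_weight > 0, "step_weight must be positive");
    (1u32..)
        .map(|step| step * step_weight)
        .take_while(|weight| *weight <= max_weight)
        .collect()
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    Promoted,
    RolledBack { at_weight: u32 },
}

/// Replays a sequence of check results (true = metrics within thresholds).
/// Returns None if the checks run out before a decision is reached.
pub fn run_analysis(
    checks: &[bool],
    step_weight: u32,
    max_weight: u32,
    threshold: u32,
) -> Option<Outcome> {
    let mut weight = 0u32;
    let mut failures = 0u32;
    for &passed in checks {
        if passed {
            // A passing check moves more traffic to the canary
            weight += step_weight;
            if weight >= max_weight {
                return Some(Outcome::Promoted);
            }
        } else {
            // A failing check counts toward the rollback threshold
            failures += 1;
            if failures >= threshold {
                return Some(Outcome::RolledBack { at_weight: weight });
            }
        }
    }
    None
}
```

With the values from the manifest (`stepWeight: 10`, `maxWeight: 50`, `interval: 1m`), the model reaches promotion after five consecutive passing checks, which is roughly five minutes of analysis.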

Part D: Monitoring and Alerting¶

Prometheus Alerting Rules¶

# k8s/monitoring/alerting-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: solana-dapps-alerts
  namespace: monitoring
spec:
  groups:
    - name: solana-dapps.rules
      rules:
        # High Error Rate
        - alert: HighErrorRate
          expr: |
            sum(rate(http_requests_total{status=~"5..", service="api"}[5m])) /
            sum(rate(http_requests_total{service="api"}[5m])) > 0.05
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "High error rate detected"
            description: "Error rate is above 5% for the last 5 minutes"

        # High Latency
        - alert: HighLatency
          expr: |
            histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket{service="api"}[5m]))) > 1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High latency detected"
            description: "P95 latency is above 1 second"

        # Pod Restarts
        - alert: PodCrashLooping
          expr: |
            rate(kube_pod_container_status_restarts_total{namespace="solana-dapps"}[15m]) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod is crash looping"
            description: "Pod {{ $labels.pod }} is restarting frequently"

        # Transaction Failures
        - alert: HighTransactionFailureRate
          expr: |
            sum(rate(solana_tx_failed_total[5m])) /
            sum(rate(solana_tx_total[5m])) > 0.1
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "High Solana transaction failure rate"
            description: "More than 10% of transactions are failing"

        # RPC Endpoint Issues
        - alert: RPCEndpointDown
          expr: solana_rpc_up == 0
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "Solana RPC endpoint is unreachable"
            description: "Cannot connect to Solana RPC endpoint"

        # Database Connection Issues
        - alert: DatabaseConnectionPoolExhausted
          expr: db_pool_available_connections < 2
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Database connection pool nearly exhausted"
            description: "Only {{ $value }} connections available"
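Two details of the ratio-based rules above are easy to miss: a ratio of two zero rates is NaN, which never satisfies a `>` comparison, and the `for:` clause requires the condition to hold on every evaluation in the window. The sketch below reproduces both behaviors with ordinary floating-point arithmetic in plain Rust. It is a model of the semantics, not Prometheus code, and the function names are hypothetical.

```rust
/// Ratio of failed to total request rates, as in
/// `sum(rate(errors)) / sum(rate(total))`. Returns NaN when both are zero.
pub fn error_ratio(errors_per_sec: f64, total_per_sec: f64) -> f64 {
    errors_per_sec / total_per_sec
}

/// Threshold comparison; any comparison against NaN is false.
pub fn breaches(ratio: f64, threshold: f64) -> bool {
    ratio > threshold
}

/// Models `for:` by requiring the last `required` samples to all breach.
pub fn should_fire(samples: &[f64], threshold: f64, required: usize) -> bool {
    required > 0
        && samples.len() >= required
        && samples[samples.len() - required..]
            .iter()
            .all(|ratio| breaches(*ratio, threshold))
}
```

In this model an idle service (zero errors out of zero requests) does not trigger `HighErrorRate`, and a single good evaluation inside the window resets the alert.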

PagerDuty Integration¶

# k8s/monitoring/alertmanager-config.yaml
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-config
  namespace: monitoring
type: Opaque
stringData:
  alertmanager.yaml: |
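    global:
      resolve_timeout: 5m
      pagerduty_url: 'https://events.pagerduty.com/v2/enqueue'
      # slack_configs need a webhook URL; this file path is an assumed secret mount
      slack_api_url_file: '/etc/alertmanager/secrets/slack-webhook-url'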

    route:
      group_by: ['alertname', 'severity']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      receiver: 'default'
      routes:
        - match:
            severity: critical
          receiver: 'pagerduty-critical'
        - match:
            severity: warning
          receiver: 'slack-warnings'

    receivers:
      - name: 'default'
        slack_configs:
          - channel: '#solana-dapps-alerts'
            send_resolved: true

      - name: 'pagerduty-critical'
        pagerduty_configs:
          - service_key_file: '/etc/alertmanager/secrets/pagerduty-key'
            severity: critical

      - name: 'slack-warnings'
        slack_configs:
          - channel: '#solana-dapps-warnings'
            send_resolved: true

Part E: Cost Optimization¶

Compute Unit Optimization¶

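Every Solana transaction runs under a compute budget: 200,000 compute units per instruction by default, up to a maximum of 1.4 million per transaction. Priority fees are charged per requested compute unit, and larger accounts lock up more rent, so both are worth trimming: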

// Optimize account data size
#[account]
pub struct OptimizedEscrow {
    pub maker: Pubkey,           // 32 bytes
    pub taker: Pubkey,           // 32 bytes
    pub mint_a: Pubkey,          // 32 bytes
    pub mint_b: Pubkey,          // 32 bytes
    pub amount_a: u64,           // 8 bytes
    pub amount_b: u64,           // 8 bytes
    pub vault_bump: u8,          // 1 byte
    pub state: u8,               // 1 byte
    // Total: 146 bytes of data, plus Anchor's 8-byte account discriminator
}

// Use smaller types where possible
pub fn calculate_fee(amount: u64, fee_bps: u16) -> u64 {
    // u16 is sufficient for basis points (0-10000)
    (amount as u128 * fee_bps as u128 / 10000) as u64
}

// Handle several accounts in one instruction instead of one transaction each
pub fn process_batch(ctx: Context<ProcessBatch>) -> Result<()> {
    // remaining_accounts are raw AccountInfos: validate owner and data before use
    for account_info in ctx.remaining_accounts.iter() {
        // Process each account
    }
    Ok(())
}
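Account size translates directly into the rent-exempt deposit. The following worked example is a plain Rust sketch that assumes the default rent parameters (3,480 lamports per byte-year, a 2-year exemption threshold, and 128 bytes of per-account storage overhead); live values can differ, so check them with `solana rent <bytes>`. The constant and function names are hypothetical.

```rust
/// Assumed default rent parameters; verify against the target cluster.
pub const ACCOUNT_STORAGE_OVERHEAD: u64 = 128;
pub const LAMPORTS_PER_BYTE_YEAR: u64 = 3_480;
pub const EXEMPTION_THRESHOLD_YEARS: u64 = 2;
/// Anchor prefixes every #[account] with an 8-byte discriminator.
pub const ANCHOR_DISCRIMINATOR_LEN: u64 = 8;

/// Serialized size of OptimizedEscrow's fields: 4 pubkeys, 2 u64s, 2 u8s.
pub fn optimized_escrow_data_len() -> u64 {
    4 * 32 + 2 * 8 + 2
}

/// Minimum balance for an account holding `allocated_len` bytes.
pub fn rent_exempt_minimum(allocated_len: u64) -> u64 {
    (ACCOUNT_STORAGE_OVERHEAD + allocated_len)
        * LAMPORTS_PER_BYTE_YEAR
        * EXEMPTION_THRESHOLD_YEARS
}
```

Under these assumptions, the 146-byte struct occupies 154 bytes with its discriminator and needs a deposit of 1,962,720 lamports, about 0.00196 SOL. Each additional byte adds 6,960 lamports.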

Request compute budget only when needed:

// Frontend: Set compute budget for complex transactions
// (yourInstruction below is a placeholder for your program's instruction)
import { ComputeBudgetProgram, Transaction } from '@solana/web3.js';

const modifyComputeUnits = ComputeBudgetProgram.setComputeUnitLimit({
  units: 300000  // Increase for complex operations
});

const addPriorityFee = ComputeBudgetProgram.setComputeUnitPrice({
  microLamports: 1000  // Priority fee for faster inclusion
});

const transaction = new Transaction()
  .add(modifyComputeUnits)
  .add(addPriorityFee)
  .add(yourInstruction);
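The cost of those two settings can be estimated before sending. This plain Rust sketch assumes the priority fee is the requested unit limit multiplied by the unit price in micro-lamports, rounded up to a whole lamport, on top of a base fee of 5,000 lamports per signature. The function names are hypothetical.

```rust
/// Assumed base fee charged per transaction signature.
pub const BASE_FEE_LAMPORTS_PER_SIGNATURE: u64 = 5_000;
const MICRO_LAMPORTS_PER_LAMPORT: u128 = 1_000_000;

/// Priority fee in lamports for a requested unit limit and unit price.
pub fn priority_fee_lamports(unit_limit: u32, micro_lamports_per_unit: u64) -> u64 {
    let micro = u128::from(unit_limit) * u128::from(micro_lamports_per_unit);
    // Round up to the next whole lamport
    let lamports = (micro + MICRO_LAMPORTS_PER_LAMPORT - 1) / MICRO_LAMPORTS_PER_LAMPORT;
    // Saturate rather than wrap for absurdly large inputs
    u64::try_from(lamports).unwrap_or(u64::MAX)
}

/// Base fee plus priority fee for a transaction.
pub fn total_fee_lamports(signatures: u64, unit_limit: u32, micro_lamports_per_unit: u64) -> u64 {
    signatures
        .saturating_mul(BASE_FEE_LAMPORTS_PER_SIGNATURE)
        .saturating_add(priority_fee_lamports(unit_limit, micro_lamports_per_unit))
}
```

With the values above (300,000 units at 1,000 micro-lamports each), the priority fee is 300 lamports, for a total of 5,300 lamports on a single-signature transaction. Because the fee scales with the requested limit rather than the units actually consumed, requesting a limit close to real usage keeps it low.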

RPC Cost Management¶

# api/app/services/rpc_manager.py
import asyncio
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List

import httpx


@dataclass
class RPCEndpoint:
    url: str
    priority: int
    rate_limit: int  # requests per second
    current_count: int = 0
    # default_factory gives each endpoint its own timestamp at creation time
    last_reset: datetime = field(default_factory=datetime.now)


class RPCManager:
    def __init__(self, endpoints: List[RPCEndpoint]):
        # Lower priority value = preferred endpoint
        self.endpoints = sorted(endpoints, key=lambda e: e.priority)

    async def get_endpoint(self) -> str:
        """Get next available endpoint with rate limiting."""
        while True:
            if not self.endpoints:
                raise RuntimeError("No RPC endpoints available")

            for endpoint in self.endpoints:
                # Start a new one-second window if the previous one has elapsed
                if datetime.now() - endpoint.last_reset > timedelta(seconds=1):
                    endpoint.current_count = 0
                    endpoint.last_reset = datetime.now()

                # Check rate limit
                if endpoint.current_count < endpoint.rate_limit:
                    endpoint.current_count += 1
                    return endpoint.url

            # All endpoints rate limited, wait and retry
            await asyncio.sleep(0.1)

    async def make_request(self, method: str, params: list) -> dict:
        """Make RPC request with automatic failover."""
        endpoint_url = await self.get_endpoint()

        async with httpx.AsyncClient() as client:
            try:
                response = await client.post(
                    endpoint_url,
                    json={"jsonrpc": "2.0", "id": 1, "method": method, "params": params},
                    timeout=30.0,
                )
                # Treat HTTP 4xx/5xx responses as failures too
                response.raise_for_status()
                return response.json()
            except (httpx.HTTPError, ValueError):
                # Drop the failed endpoint and retry on the next one, if any
                self.endpoints = [ep for ep in self.endpoints if ep.url != endpoint_url]
                if self.endpoints:
                    return await self.make_request(method, params)
                raise
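The selection logic is easier to test when the clock is passed in rather than read from the system. The sketch below restates the fixed-window rate limit and failover in plain Rust with an explicit `now_ms` argument. It is an illustration of the technique, not a port of the service: `RpcPool`, `Endpoint`, and the example URLs in the usage note are hypothetical, and failed endpoints are marked unhealthy rather than removed.

```rust
pub struct Endpoint {
    pub url: String,
    pub priority: u32,
    pub rate_limit: u32, // requests allowed per one-second window
    count: u32,
    window_start_ms: u64,
    healthy: bool,
}

pub struct RpcPool {
    endpoints: Vec<Endpoint>,
}

impl RpcPool {
    /// Builds a pool from (url, priority, rate_limit) tuples; lower priority wins.
    pub fn new(specs: Vec<(&str, u32, u32)>) -> Self {
        let mut endpoints: Vec<Endpoint> = specs
            .into_iter()
            .map(|(url, priority, rate_limit)| Endpoint {
                url: url.to_string(),
                priority,
                rate_limit,
                count: 0,
                window_start_ms: 0,
                healthy: true,
            })
            .collect();
        endpoints.sort_by_key(|endpoint| endpoint.priority);
        RpcPool { endpoints }
    }

    /// Returns the preferred healthy endpoint with capacity left, if any.
    pub fn acquire(&mut self, now_ms: u64) -> Option<String> {
        for endpoint in self.endpoints.iter_mut().filter(|e| e.healthy) {
            // Start a new window once a full second has elapsed
            if now_ms.saturating_sub(endpoint.window_start_ms) >= 1_000 {
                endpoint.count = 0;
                endpoint.window_start_ms = now_ms;
            }
            if endpoint.count < endpoint.rate_limit {
                endpoint.count += 1;
                return Some(endpoint.url.clone());
            }
        }
        None
    }

    /// Excludes an endpoint from selection after a failed request.
    pub fn mark_failed(&mut self, url: &str) {
        for endpoint in self.endpoints.iter_mut().filter(|e| e.url == url) {
            endpoint.healthy = false;
        }
    }
}
```

A caller that receives `None` would wait briefly and call `acquire` again with a later timestamp, which mirrors the sleep-and-retry loop in `get_endpoint`.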

Part F: Incident Response¶

Runbook Template¶

# Incident Runbook: High Error Rate

## Detection
- Alert: HighErrorRate
- Threshold: Error rate > 5% for 5 minutes
- Dashboard: [Grafana Link]

## Initial Assessment (5 minutes)
1. Check error rate trend in Grafana
2. Identify affected endpoints via logs
3. Check recent deployments
4. Verify RPC endpoint health

## Common Causes & Remediation

### RPC Endpoint Issues
**Symptoms:** All transactions failing, RPC timeout errors
**Check:** `solana cluster-version --url <rpc-url>`
**Fix:**
1. Switch to backup RPC: `kubectl set env deployment/api SOLANA_RPC_URL=<backup>`
2. Notify RPC provider

### Database Connection Issues
**Symptoms:** API timeouts, connection pool exhausted
**Check:** `kubectl logs -l app=api | grep "connection"`
**Fix:**
1. Restart API pods: `kubectl rollout restart deployment/api`
2. Scale up if persistent: `kubectl scale deployment/api --replicas=5`

### Program Bug
**Symptoms:** Specific instruction failing, error logs show program error
**Check:** Review recent program deployments
**Fix:**
1. Rollback program if critical: `solana program deploy --program-id <program-id> <previous-binary>` (must be signed by the upgrade authority)
2. Disable affected feature via feature flag

## Escalation
- L1 (On-Call): Initial triage, known issues
- L2 (Senior Engineer): Complex debugging, code changes
- L3 (Architect): System-wide issues, critical decisions

## Post-Incident
1. Document timeline in incident report
2. Identify root cause
3. Create follow-up tickets for prevention
4. Update runbook if needed

Feature Flags¶

# api/app/services/feature_flags.py
import time
from enum import Enum

import redis.asyncio as redis


class FeatureFlag(Enum):
    ENABLE_ESCROW = "enable_escrow"
    ENABLE_MARKETPLACE = "enable_marketplace"
    ENABLE_AMM = "enable_amm"
    ENABLE_GOVERNANCE = "enable_governance"
    MAINTENANCE_MODE = "maintenance_mode"


class FeatureFlagService:
    def __init__(self, redis_url: str):
        self.redis = redis.from_url(redis_url)
        # flag name -> (enabled, monotonic time when the value was cached)
        self.cache: dict[str, tuple[bool, float]] = {}
        self.cache_ttl = 60  # seconds

    async def is_enabled(self, flag: FeatureFlag) -> bool:
        """Check if feature flag is enabled."""
        # Serve from the local cache only while the entry is fresh
        cached = self.cache.get(flag.value)
        if cached is not None and time.monotonic() - cached[1] < self.cache_ttl:
            return cached[0]

        # Fetch from Redis
        value = await self.redis.get(f"feature:{flag.value}")
        if value is None:
            # Unset feature flags default to enabled; maintenance mode defaults to off
            enabled = flag is not FeatureFlag.MAINTENANCE_MODE
        else:
            enabled = value == b"true"

        self.cache[flag.value] = (enabled, time.monotonic())
        return enabled

    async def set_flag(self, flag: FeatureFlag, enabled: bool) -> None:
        """Set feature flag value."""
        await self.redis.set(f"feature:{flag.value}", str(enabled).lower())
        self.cache[flag.value] = (enabled, time.monotonic())

    async def emergency_disable(self, flag: FeatureFlag) -> None:
        """Emergency disable a feature."""
        await self.set_flag(flag, False)
        # Clears this process's cache only; other replicas pick up the
        # change once their cached entry is older than cache_ttl
        self.cache.clear()

# Usage in FastAPI (assumes `router` and a `get_feature_flags` dependency are defined elsewhere)
from fastapi import HTTPException, Depends

async def check_escrow_enabled(
    flags: FeatureFlagService = Depends(get_feature_flags)
):
    if not await flags.is_enabled(FeatureFlag.ENABLE_ESCROW):
        raise HTTPException(503, "Escrow feature is currently disabled")
    return True

@router.post("/escrows", dependencies=[Depends(check_escrow_enabled)])
async def create_escrow(...):
    ...
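The time-to-live behavior that `cache_ttl` implies can be isolated from Redis and tested on its own. The sketch below is a plain Rust model with an injected clock: an entry is served only while it is younger than the TTL, and a miss tells the caller to go back to the backing store. `FlagCache` and its methods are hypothetical names for this illustration.

```rust
use std::collections::HashMap;

/// Local flag cache whose entries expire after `ttl_secs`.
pub struct FlagCache {
    ttl_secs: u64,
    entries: HashMap<String, (bool, u64)>, // flag -> (enabled, stored_at_secs)
}

impl FlagCache {
    pub fn new(ttl_secs: u64) -> Self {
        FlagCache { ttl_secs, entries: HashMap::new() }
    }

    /// Returns the cached value, or None if it is missing or expired.
    pub fn get(&self, flag: &str, now_secs: u64) -> Option<bool> {
        self.entries.get(flag).and_then(|&(enabled, stored_at)| {
            if now_secs.saturating_sub(stored_at) < self.ttl_secs {
                Some(enabled)
            } else {
                None
            }
        })
    }

    /// Stores a value fetched from the backing store.
    pub fn put(&mut self, flag: &str, enabled: bool, now_secs: u64) {
        self.entries.insert(flag.to_string(), (enabled, now_secs));
    }

    /// Drops one entry so the next read goes to the backing store.
    pub fn invalidate(&mut self, flag: &str) {
        self.entries.remove(flag);
    }
}
```

The TTL bounds how long a replica can keep serving a stale value after an emergency disable, so it should be chosen with the acceptable reaction time for an incident in mind.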

Part G: Program Upgrades¶

Upgrade Authority Management¶

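// Always use a multisig for the mainnet upgrade authority.
// Illustrative sketch only: the account types and the approval check are
// simplified. In practice the upgrade authority is transferred to the multisig
// and upgrades are proposed, approved, and executed through the multisig program.
use squads_multisig::state::Ms;

pub fn upgrade_program_with_multisig(
    ctx: Context<UpgradeWithMultisig>,
    buffer: Pubkey,
) -> Result<()> {
    // Placeholder approval check: a real integration must verify that this
    // specific upgrade was approved by the required number of members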
    require!(
        ctx.accounts.multisig.transaction_index > ctx.accounts.multisig.ms_change_index,
        ErrorCode::MultisigNotApproved
    );

    // Proceed with upgrade via BPF Upgradeable Loader
    let upgrade_ix = bpf_loader_upgradeable::upgrade(
        &ctx.accounts.program.key(),
        &buffer,
        &ctx.accounts.multisig.key(),
        &ctx.accounts.spill.key(),
    );

    invoke_signed(
        &upgrade_ix,
        &[
            ctx.accounts.program.to_account_info(),
            ctx.accounts.program_data.to_account_info(),
            ctx.accounts.buffer.to_account_info(),
            ctx.accounts.multisig.to_account_info(),
            ctx.accounts.spill.to_account_info(),
            ctx.accounts.rent.to_account_info(),
            ctx.accounts.clock.to_account_info(),
            ctx.accounts.bpf_loader.to_account_info(),
        ],
        &[/* multisig seeds */],
    )?;

    Ok(())
}

Version Migration¶

// Support multiple account versions for migration
#[account]
pub struct EscrowV1 {
    pub version: u8,  // 1
    pub maker: Pubkey,
    pub amount: u64,
}

#[account]
pub struct EscrowV2 {
    pub version: u8,  // 2
    pub maker: Pubkey,
    pub taker: Option<Pubkey>,  // New field
    pub amount: u64,
    pub created_at: i64,  // New field
}

pub fn migrate_escrow_v1_to_v2(ctx: Context<MigrateEscrow>) -> Result<()> {
    let old_escrow = &ctx.accounts.old_escrow;
    let new_escrow = &mut ctx.accounts.new_escrow;

    // Copy existing fields
    new_escrow.version = 2;
    new_escrow.maker = old_escrow.maker;
    new_escrow.amount = old_escrow.amount;

    // Initialize new fields
    new_escrow.taker = None;
    new_escrow.created_at = Clock::get()?.unix_timestamp;

    // Close old account, return rent to payer
    ctx.accounts.old_escrow.close(ctx.accounts.payer.to_account_info())?;

    emit!(EscrowMigrated {
        old_pubkey: ctx.accounts.old_escrow.key(),
        new_pubkey: ctx.accounts.new_escrow.key(),
        version: 2,
    });

    Ok(())
}
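The field mapping at the heart of the migration can be unit-tested without a validator. The following plain Rust sketch mirrors the two layouts with byte arrays in place of `Pubkey` and takes the timestamp as an argument in place of `Clock::get()`. The `MigrationError` type and the version guard are assumptions added for this illustration and do not appear in the instruction above.

```rust
pub struct EscrowV1 {
    pub version: u8,
    pub maker: [u8; 32],
    pub amount: u64,
}

#[derive(Debug, PartialEq)]
pub struct EscrowV2 {
    pub version: u8,
    pub maker: [u8; 32],
    pub taker: Option<[u8; 32]>,
    pub amount: u64,
    pub created_at: i64,
}

#[derive(Debug, PartialEq)]
pub enum MigrationError {
    UnexpectedVersion(u8),
}

/// Copies V1 fields into a V2 layout and fills the new fields with defaults.
pub fn migrate_v1_to_v2(old: &EscrowV1, now_unix: i64) -> Result<EscrowV2, MigrationError> {
    // Refuse to migrate data that is not actually version 1
    if old.version != 1 {
        return Err(MigrationError::UnexpectedVersion(old.version));
    }
    Ok(EscrowV2 {
        version: 2,
        maker: old.maker,
        taker: None,
        amount: old.amount,
        created_at: now_unix,
    })
}
```

Keeping the mapping in a pure function like this lets the instruction handler stay a thin wrapper that loads accounts, calls the function, and closes the old account.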

Summary¶

In this module, you learned:

Security: Implementing signer verification, checked math, and reentrancy protection
CI/CD: Setting up comprehensive GitHub Actions workflows for testing and deployment
Deployment Strategies: Blue-green and canary deployments for safe releases
Monitoring: Prometheus alerting rules and PagerDuty integration
Cost Optimization: Compute unit optimization and RPC cost management
Incident Response: Runbooks, feature flags, and emergency procedures
Upgrades: Safe program upgrade patterns with multisig and version migration

Key Takeaways¶

Never deploy without comprehensive testing and security review
Implement feature flags for quick incident response
Monitor everything and alert on meaningful thresholds
Have runbooks ready for common incident types
Use multisig for all mainnet upgrade authorities
Plan for version migration before deploying

Course Completion¶

Congratulations! You have completed the Solana DApps course. You now have the knowledge to:

Build production-ready Solana programs with Anchor
Create modern frontends with React and wallet integration
Deploy scalable backends with FastAPI and Poem
Operate applications in Kubernetes with observability
Implement security best practices and incident response

Next Steps¶

Build: Create your own DApp using these patterns
Test: Deploy to devnet and iterate
Audit: Get a security review before mainnet
Launch: Deploy with monitoring and alerting
Iterate: Continuously improve based on user feedback
