Abuse of Public Endpoints — When Your Free Tier Becomes Someone Else's Compute

Introduction

Public endpoints that provide real value attract abuse. An AI generation endpoint, a file conversion service, an email sending API, an SMS OTP — anything that does useful computational work and is accessible without payment will be discovered and used for free. The abuse ranges from innocuous overuse to active malice: spam relay, cryptomining via compute, credential stuffing via auth endpoints, and DDoS amplification via reflection endpoints. The controls are usage limits, intent validation, and cost attribution that makes abuse expensive.

Common Endpoint Abuse Patterns
Fix 1: Per-Account Quotas With Hard Limits
Fix 2: SMS Pumping Fraud Prevention
Fix 3: Compute Abuse Prevention for AI/ML Endpoints
Fix 4: Detect and Kill Runaway Usage in Real Time
Endpoint Protection Checklist
Conclusion

Common Endpoint Abuse Patterns

Endpoint types and how they get abused:

1. AI/ML inference endpoints
   → Your "free tier" used to generate competitor's product photos
   → One account: 50,000 images at $0.002 each = $100/day for free
   → Fix: strict quota, credit system, content moderation

2. Email sending endpoints
   → Your transactional email endpoint used as spam relay
   → Attacker sends 100,000 marketing emails through your domain
   → Fix: rate limit, recipient allowlist validation, DKIM compliance

3. SMS OTP endpoints
   → OTP endpoint used for SMS pumping fraud
   → Attacker bills your Twilio account by requesting OTPs to premium numbers
   → Fix: phone number validation, SMS rate limits, suspicious number detection

4. File processing (PDF, image)
   → Free conversion API used at scale
   → CPU and memory costs accumulate
   → Fix: file size limits, processing time limits, usage quotas

5. Search endpoints
   → Full-text search used to extract your entire content catalog
   → Fix: rate limits, pagination cursor enforcement
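
"Pagination cursor enforcement" can be read more than one way. One possible reading, sketched here under assumptions, is to hand out opaque cursors signed with an HMAC, so a scraper cannot forge a cursor to jump to an arbitrary offset, reuse another account's cursor, or request oversized pages. The names `CURSOR_SECRET`, `MAX_PAGE_SIZE`, `encodeCursor`, `decodeCursor`, and `clampPageSize` are hypothetical and not part of any existing API.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Hypothetical values: load the secret from configuration in a real service
const CURSOR_SECRET = 'replace-me'
const MAX_PAGE_SIZE = 50

interface CursorPayload {
  accountId: string // the cursor is only valid for the account it was issued to
  offset: number
}

function sign(data: string): string {
  return createHmac('sha256', CURSOR_SECRET).update(data).digest('base64url')
}

// Produces "<base64url payload>.<base64url signature>"
function encodeCursor(payload: CursorPayload): string {
  const data = Buffer.from(JSON.stringify(payload)).toString('base64url')
  return `${data}.${sign(data)}`
}

// Returns the payload, or null if the cursor was altered or issued to another account
function decodeCursor(cursor: string, accountId: string): CursorPayload | null {
  const [data, signature] = cursor.split('.')
  if (!data || !signature) return null
  const expected = Buffer.from(sign(data))
  const actual = Buffer.from(signature)
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return null
  const payload = JSON.parse(Buffer.from(data, 'base64url').toString('utf8')) as CursorPayload
  return payload.accountId === accountId ? payload : null
}

// Never trust a client-supplied page size
function clampPageSize(requested: number): number {
  return Math.min(Math.max(1, Math.floor(requested) || 1), MAX_PAGE_SIZE)
}
```

A search handler would then reject any request whose cursor decodes to `null` and count each page against the account's rate limit, so walking the full catalog costs as many requests as there are pages.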

Fix 1: Per-Account Quotas With Hard Limits

// quota-manager.ts — enforce usage limits per account
interface Quota {
  resource: string
  limit: number
  window: 'hourly' | 'daily' | 'monthly'
  resetAt?: Date
}

const DEFAULT_FREE_TIER_QUOTAS: Quota[] = [
  { resource: 'ai_generations', limit: 50, window: 'daily' },
  { resource: 'email_sends', limit: 100, window: 'daily' },
  { resource: 'sms_sends', limit: 20, window: 'daily' },
  { resource: 'pdf_conversions', limit: 10, window: 'daily' },
  { resource: 'api_calls', limit: 1000, window: 'hourly' },
]

async function checkQuota(
  accountId: string,
  resource: string,
  cost = 1
): Promise<{ allowed: boolean; remaining: number; resetAt: Date }> {
  const quota = await getAccountQuota(accountId, resource)

  const windowKey = getWindowKey(resource, quota.window)
  const usageKey = `quota:${accountId}:${resource}:${windowKey}`

  const current = await redis.incrby(usageKey, cost)

  // Set expiry on first use in window
  if (current === cost) {
    const ttl = getWindowTTL(quota.window)
    await redis.expire(usageKey, ttl)
  }

  const resetAt = getWindowResetTime(quota.window)

  if (current > quota.limit) {
    // Exceeded — decrement (don't record usage over limit)
    await redis.decrby(usageKey, cost)

    return {
      allowed: false,
      remaining: 0,
      resetAt,
    }
  }

  return {
    allowed: true,
    remaining: quota.limit - current,
    resetAt,
  }
}

// Middleware
function quotaMiddleware(resource: string, cost = 1) {
  return async (req: Request, res: Response, next: NextFunction) => {
    const accountId = req.account!.id

    const result = await checkQuota(accountId, resource, cost)

    res.set({
      'X-RateLimit-Resource': resource,
      'X-RateLimit-Remaining': result.remaining,
      'X-RateLimit-Reset': result.resetAt.toISOString(),
    })

    if (!result.allowed) {
      return res.status(429).json({
        error: `${resource} quota exceeded`,  // windows vary (api_calls is hourly), so don't say "Daily"
        limit: await getAccountQuota(accountId, resource).then(q => q.limit),
        resetAt: result.resetAt.toISOString(),
        upgradeUrl: 'https://myapp.com/pricing',
      })
    }

    next()
  }
}
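
The quota code above calls `getWindowKey`, `getWindowTTL`, and `getWindowResetTime` without showing them. A minimal sketch of what they might look like follows, assuming fixed windows aligned to UTC boundaries and a TTL equal to the time left in the current window. The optional `now` parameter and the `windowStart` helper are additions for testability, not something the original code defines.

```typescript
type QuotaWindow = 'hourly' | 'daily' | 'monthly'

// Start of the window containing `now`, in UTC
function windowStart(window: QuotaWindow, now: Date): Date {
  const y = now.getUTCFullYear()
  const m = now.getUTCMonth()
  const d = now.getUTCDate()
  switch (window) {
    case 'hourly': return new Date(Date.UTC(y, m, d, now.getUTCHours()))
    case 'daily': return new Date(Date.UTC(y, m, d))
    case 'monthly': return new Date(Date.UTC(y, m, 1))
  }
}

// Moment at which the current window ends and usage resets
function getWindowResetTime(window: QuotaWindow, now: Date = new Date()): Date {
  const start = windowStart(window, now)
  switch (window) {
    case 'hourly': return new Date(start.getTime() + 3_600_000)
    case 'daily': return new Date(start.getTime() + 86_400_000)
    case 'monthly': return new Date(Date.UTC(start.getUTCFullYear(), start.getUTCMonth() + 1, 1))
  }
}

// Key fragment identifying the window, e.g. "daily:2024-03-05".
// `_resource` is unused here and kept only to match the call site above.
function getWindowKey(_resource: string, window: QuotaWindow, now: Date = new Date()): string {
  const iso = windowStart(window, now).toISOString()
  const length = window === 'hourly' ? 13 : window === 'daily' ? 10 : 7
  return `${window}:${iso.slice(0, length)}`
}

// Seconds until the window resets, used as the Redis key expiry
function getWindowTTL(window: QuotaWindow, now: Date = new Date()): number {
  const msLeft = getWindowResetTime(window, now).getTime() - now.getTime()
  return Math.max(1, Math.ceil(msLeft / 1000))
}
```

Fixed windows are simple, but they allow a burst of up to twice the limit across a window boundary. A sliding window or token bucket avoids that at the cost of more state.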

Fix 2: SMS Pumping Fraud Prevention

// SMS pumping: attacker triggers OTPs to premium-rate numbers
// Your Twilio bill spikes; attacker gets revenue share from carrier

async function validatePhoneNumberForSMS(
  phoneNumber: string,
  accountId: string  // passed in explicitly: there is no `req` in this function's scope
): Promise<void> {
  // 1. Parse and validate format
  const parsed = parsePhoneNumber(phoneNumber)
  if (!parsed || !parsed.isValid()) {
    throw new ValidationError('Invalid phone number format')
  }

  // 2. Block premium-rate numbers
  // Premium numbers start with specific prefixes by country.
  // This list is illustrative, not exhaustive, and assumes phoneNumber is in E.164 form.
  const premiumPrefixes: Record<string, string[]> = {
    US: ['+1900', '+1976'],
    GB: ['+4490', '+4491'],
  }
  const country = parsed.country ?? ''
  const premiums = premiumPrefixes[country] ?? []
  if (premiums.some(prefix => phoneNumber.startsWith(prefix))) {
    throw new ValidationError('Phone number not eligible for SMS')
  }

  // 3. Limit OTP requests per phone number (even across accounts)
  const smsKey = `sms_attempts:${phoneNumber}`
  const attempts = await redis.incr(smsKey)
  if (attempts === 1) {
    // Start the one-hour window on the first attempt only. Refreshing the TTL on
    // every request would keep the counter alive for as long as requests continue.
    await redis.expire(smsKey, 3600)
  }

  if (attempts > 5) {  // Max 5 OTPs per phone number per hour
    throw new RateLimitError('Too many SMS requests to this number')
  }

  // 4. For high-risk geos, require additional verification
  const highRiskCountries = ['KE', 'NG', 'PK', 'BD', 'IN']  // Common SMS pumping origins
  if (highRiskCountries.includes(country)) {
    const accountSmsSent = await redis.incr(`sms_highRisk:${accountId}`)
    if (accountSmsSent > 10) {
      // Flag account for review before sending more
      await flagAccountForReview(accountId, 'high_risk_sms_volume')
      throw new Error('Account requires verification for international SMS')
    }
  }
}
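
The per-number limit above does not help when an attacker spreads requests over many different numbers. Pumping traffic is often reported to arrive as runs of adjacent numbers, so one additional signal (a sketch of the "suspicious number detection" mentioned earlier, not a complete defense) is to count distinct recipients that share a prefix. The thresholds, the in-memory `Map`, and the name `isSuspiciousNumberBlock` are all assumptions for illustration. A multi-instance deployment would need a shared store such as Redis.

```typescript
// block prefix -> (full number -> last seen, in ms since epoch)
const recentByBlock = new Map<string, Map<string, number>>()

const BLOCK_SUFFIX_DIGITS = 2      // numbers sharing all but the last 2 digits form one block
const BLOCK_WINDOW_MS = 3_600_000  // look back one hour
const MAX_DISTINCT_PER_BLOCK = 5   // more distinct numbers than this in one block is suspicious

// Records the number and reports whether its block now exceeds the threshold.
// Expects an E.164 string such as "+254700100001".
function isSuspiciousNumberBlock(e164: string, nowMs: number = Date.now()): boolean {
  const block = e164.slice(0, -BLOCK_SUFFIX_DIGITS)
  const seen = recentByBlock.get(block) ?? new Map<string, number>()

  // Drop numbers last seen outside the window
  seen.forEach((lastSeen, num) => {
    if (nowMs - lastSeen > BLOCK_WINDOW_MS) seen.delete(num)
  })

  seen.set(e164, nowMs)
  recentByBlock.set(block, seen)
  return seen.size > MAX_DISTINCT_PER_BLOCK
}
```

Repeated requests to the same number do not raise the count, so a legitimate user retrying an OTP is not penalized by this check. The per-number limit handles that case.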

Fix 3: Compute Abuse Prevention for AI/ML Endpoints

// Prevent one account from consuming all your AI compute budget
router.post('/api/generate-image',
  requireAuth,
  quotaMiddleware('ai_generations', 1),
  async (req, res) => {
    const { prompt, style } = req.body

    // 1. Validate input: cap prompt length to bound per-request cost
    if (typeof prompt !== 'string' || prompt.length === 0 || prompt.length > 500) {
      return res.status(400).json({ error: 'Prompt must be 1-500 characters' })
    }

    // 2. Check account's generation history — flag suspicious patterns
    const recentGens = await db.query(`
      SELECT COUNT(*) as count
      FROM ai_generations
      WHERE account_id = $1
      AND created_at > NOW() - INTERVAL '1 hour'
    `, [req.account.id])

    // COUNT(*) is a bigint, which drivers such as node-postgres return as a string
    if (Number(recentGens.rows[0].count) > 100) {
      // More than 100 generations per hour is almost certainly automated abuse
      await flagAccountForReview(req.account.id, 'ai_generation_abuse')
      return res.status(429).json({
        error: 'Unusual activity detected. Account temporarily limited.',
      })
    }

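    // 3. Assign queue priority based on account tier.
    // Priority semantics depend on the queue library: in Bull, for example, lower
    // numbers run first, so these two values would need to be swapped there.
    const priority = req.account.plan === 'pro' ? 10 : 5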
    const jobId = await imageGenerationQueue.add(
      { prompt, style, accountId: req.account.id },
      { priority, timeout: 60_000 }
    )

    res.json({ jobId, estimatedWaitSeconds: await getQueueWaitTime() })
  }
)
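
The route above charges a flat cost of 1 per generation. The credit system mentioned earlier suggests weighting requests by how much compute they consume, which is what the `cost` parameter of `checkQuota` allows. The following is a sketch under assumed pricing: the `GenerationRequest` shape, the 512×512 baseline, and the `MAX_VARIATIONS` cap are hypothetical.

```typescript
interface GenerationRequest {
  width: number
  height: number
  variations: number // number of images requested in one call
}

const BASE_PIXELS = 512 * 512 // one credit buys one image of up to this size
const MAX_VARIATIONS = 4

// Returns the number of credits a request should consume.
// Throws RangeError on out-of-range input rather than silently clamping it.
function generationCost(req: GenerationRequest): number {
  const { width, height, variations } = req
  if (!Number.isInteger(variations) || variations < 1 || variations > MAX_VARIATIONS) {
    throw new RangeError(`variations must be an integer between 1 and ${MAX_VARIATIONS}`)
  }
  if (!Number.isInteger(width) || !Number.isInteger(height) || width < 1 || height < 1) {
    throw new RangeError('width and height must be positive integers')
  }
  // Cost scales with pixel count, rounded up, with a floor of one credit per image
  const sizeFactor = Math.max(1, Math.ceil((width * height) / BASE_PIXELS))
  return variations * sizeFactor
}
```

Because `quotaMiddleware` takes a fixed cost, a variable cost like this would be charged inside the handler with `checkQuota(req.account.id, 'ai_generations', generationCost(...))` after the request body has been validated.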

Fix 4: Detect and Kill Runaway Usage in Real Time

// abuse-detector.ts — catch abuse before it becomes a bill
cron.schedule('*/5 * * * *', async () => {  // Every 5 minutes
  await detectAbusePatterns()
})

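async function detectAbusePatterns(): Promise<void> {
  // Find accounts whose usage in the last hour exceeds 10x their typical hourly usage.
  // The baseline is average hourly usage over the preceding 7 days (an assumed lookback;
  // tune it to your traffic). It must come from data outside the last hour, otherwise
  // the spike is compared against itself. Accounts with no prior usage have no baseline
  // and are left to the hard quotas.
  const topConsumers = await db.query(`
    WITH last_hour AS (
      SELECT account_id, SUM(cost_units) AS usage_last_hour
      FROM api_usage_events
      WHERE created_at > NOW() - INTERVAL '1 hour'
      GROUP BY account_id
    ),
    baseline AS (
      SELECT account_id, SUM(cost_units) / (7 * 24.0) AS typical_hourly_avg
      FROM api_usage_events
      WHERE created_at > NOW() - INTERVAL '7 days 1 hour'
        AND created_at <= NOW() - INTERVAL '1 hour'
      GROUP BY account_id
    )
    SELECT l.account_id, l.usage_last_hour, b.typical_hourly_avg
    FROM last_hour l
    JOIN baseline b USING (account_id)
    WHERE l.usage_last_hour > b.typical_hourly_avg * 10
    ORDER BY l.usage_last_hour DESC
    LIMIT 10
  `)

  for (const account of topConsumers.rows) {
    logger.warn({
      accountId: account.account_id,
      usageLastHour: account.usage_last_hour,
      typicalHourlyAvg: account.typical_hourly_avg,
    }, 'Potential abuse detected')

    await alerting.warn(
      `Abuse alert: Account ${account.account_id} consumed ${account.usage_last_hour} units in last hour (typical: ${account.typical_hourly_avg} per hour)`
    )
  }
}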
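
A pure ratio test tends to flag tiny accounts (going from 1 unit to 11 is an 11x jump) and has nothing to compare against for brand-new accounts. One way to handle both, sketched here with assumed thresholds, is to apply the decision in application code where it can be unit tested. The `UsageSample` shape, `findAnomalies`, and the default `minUnits` of 50 are hypothetical.

```typescript
interface UsageSample {
  accountId: string
  usageLastHour: number
  typicalHourlyAvg: number | null // null when the account has no usage history yet
}

interface AnomalyOptions {
  multiplier: number // how many times the baseline counts as a spike
  minUnits: number   // ignore accounts below this absolute usage
}

// Returns the samples that look anomalous, highest usage first
function findAnomalies(
  samples: UsageSample[],
  opts: AnomalyOptions = { multiplier: 10, minUnits: 50 }
): UsageSample[] {
  return samples
    .filter(s => s.usageLastHour >= opts.minUnits)
    .filter(s =>
      // No history: any usage above the floor is worth a look
      s.typicalHourlyAvg === null ||
      s.usageLastHour > s.typicalHourlyAvg * opts.multiplier
    )
    .sort((a, b) => b.usageLastHour - a.usageLastHour)
}
```

Flagging accounts with no history is a deliberate choice in this sketch, since a freshly created account burning through quota is a common abuse shape. It can be removed if it produces too many false positives.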

Endpoint Protection Checklist

✅ Every "free tier" resource has a hard daily quota enforced at the middleware level
✅ SMS endpoints validate phone numbers and limit OTPs per number per hour
✅ AI/ML compute endpoints track per-account hourly usage and flag anomalies
✅ Automated abuse detection runs every 5 minutes — doesn't wait for the bill
✅ Accounts flagged for abuse are suspended until manual review (not just rate limited)
✅ Usage costs attributed to each account — easy to identify who's costing money
✅ "Suspicious pattern" alerting separate from rate limit alerting

Conclusion

Public endpoint abuse is a product design problem as much as a security problem. If a feature has real value, someone will use it for free at scale if there's no cost barrier. The defenses are: hard quotas with clear limits, per-resource rate limiting across IP and account, SMS-specific fraud detection (pumping is expensive and detectable), and real-time anomaly detection that catches abuse in minutes rather than at the end of the billing cycle. The goal isn't to block all automation — legitimate users automate too — it's to make abuse economically unattractive and technically detectable.