Monitor all AI integrations: Gemini Flash (text), Gemini Pro Image (design), GPT Image 1.5 (avatars). Cost tracking, rate limits, error monitoring, and optimization.
Backend schemas, queries, hooks, and components powering the AI Usage Dashboard.
_id: Id<"aiUsageLogs">
timestamp: number // Date.now()
model: "gemini-3-flash" | "gemini-3-pro-image" | "gpt-image-1.5"
feature: "ai_assistant" | "job_detection" | "sentiment" | "design_studio" | "avatar"
status: "success" | "error" | "cached" | "retried"
inputTokens: v.optional(v.number())
outputTokens: v.optional(v.number())
cost: number // USD cents
latencyMs: number
userId: v.optional(v.string())
errorType: v.optional(v.union(v.literal("timeout"), v.literal("rate_limit"), v.literal("content_filter"), v.literal("invalid_input")))
errorMessage: v.optional(v.string())
retryCount: v.optional(v.number())
cached: boolean // default false
resolution: v.optional(v.string()) // "256x256" | "512x512" | "1024x1024"
metadata: v.optional(v.any())
_id: Id<"aiQuotas">
model: "gemini-3-flash" | "gemini-3-pro-image" | "gpt-image-1.5"
rpmLimit: number // requests per minute
rpdLimit: number // requests per day
currentRpm: number
currentRpd: number
burstCapacity: number // percentage remaining
autoScale: boolean
fallbackEnabled: boolean
lastUpdated: number
_id: Id<"aiBudgets">
month: string // ISO year-month ("YYYY-MM"), e.g. "2026-02"
budgetCents: number // $440 = 44000
spentCents: number
alertThreshold: number // 0.75 = 75%
alertSent: boolean
projectedCents: number
// Per-model breakdown
flashSpent: number
proImageSpent: number
gptImageSpent: number
_id: Id<"aiOptimizations">
type: "prompt_trim" | "resolution_downgrade" | "batch" | "cache_ttl"
title: string
description: string
estimatedSavingsCents: number
impact: "high" | "medium" | "low"
effort: "low" | "medium" | "high"
status: "pending" | "applied" | "dismissed"
appliedAt: v.optional(v.number())
appliedBy: v.optional(v.string())
// api.admin.aiUsage
overview({ period: "today" | "7d" | "30d" | "custom" })
modelDetail({ model: string, period })
costBreakdown({ period })
errorLog({ model?, errorType?, limit })
quotaStatus() // real-time quota snapshot
optimizations() // computed recommendations
// api.admin.aiUsage (mutations)
applyOptimization({ optimizationId })
dismissOptimization({ optimizationId })
updateBudget({ month, budgetCents, alertThreshold })
toggleAutoScale({ model, enabled })
toggleFallback({ model, enabled })
// Hooks
useAIUsageStats(period) // overview KPIs + model breakdown
useModelMetrics(model) // per-model detail
useAICostMetrics(period) // cost breakdown + budget
useQuotaMetrics() // real-time quota gauges
useAIErrorMetrics() // error feed + timeline
useAIOptimizations() // computed savings recs
// Screen Components
AIUsageOverview // KPIs + donut + model cards
ModelDetailScreen // per-model deep dive
CostBreakdownScreen // stacked bars + budget + pie
RateLimitsScreen // quota bars + throttle log
ErrorMonitoringScreen // timeline + error feed
OptimizationScreen // recs + cache stats
// Shared Components
ModelBreakdownChart // donut/conic-gradient
TokenUsageBar // input/output split bar
RequestsBarChart // horizontal bar chart
CostStackedBar // stacked segments by model
BudgetProgressBar // fill + threshold marker
QuotaGaugeRow // usage vs limit bar
ThrottleEventFeed // throttle log list
ErrorRateTimeline // color-coded bar timeline
ErrorLogFeed // error list + stack traces
RecommendationCard // optimization rec with actions
CachePerformanceCard // hit rate + saved $
RecentGenerationsGrid // lazy image grid
AvatarQueueStatus // live queue depth