Batch Caption Processing Guide
Overview
PhotoSwipe Pro with AI SEO now supports batch caption generation, allowing you to process 1-50 images in a single API request. For galleries with many images, this replaces dozens of individual round trips with a handful of batch requests.
API Key Ownership - IMPORTANT
Who needs API keys?
- ✅ Server owner (you, the PhotoSwipe Pro license holder) - Provides ONE API key via environment variables
- ❌ End users (website visitors) - Do NOT need their own API keys
- ❌ Client applications - Only need PhotoSwipe Pro license key
How it works:
- You (server owner) set up ONE Gemini or OpenRouter API key in your .env file
- Your server proxies all AI requests
- Clients authenticate with their PhotoSwipe Pro license key
- Your server validates the license and forwards requests to the AI provider
- You pay for the AI API costs as part of your Pro service
This architecture protects your API keys and provides a seamless experience for customers.
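The request flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual PhotoSwipe Pro server code: `handleCaptionRequest`, `validateLicense`, and `callProvider` are hypothetical names, and the status codes are taken from the Common Errors table later in this guide.

```javascript
// Hypothetical sketch of the proxy flow; not the actual PhotoSwipe Pro server code.
// validateLicense and callProvider are injected so the flow can be exercised in isolation.
async function handleCaptionRequest(body, { validateLicense, callProvider, apiKey }) {
  // The client sends only its license key, never an AI provider key.
  if (!body || typeof body.url !== 'string') {
    return { status: 400, json: { error: 'invalid_input' } };
  }
  if (!(await validateLicense(body.licenseKey))) {
    return { status: 402, json: { error: 'license_invalid' } };
  }
  try {
    // The server attaches its own provider key before forwarding the request.
    const { alt, caption } = await callProvider({
      url: body.url,
      context: body.context,
      apiKey,
    });
    return { status: 200, json: { alt, caption } };
  } catch (err) {
    return { status: 502, json: { error: 'provider_error' } };
  }
}
```

Because the provider key only ever appears on the server side of this function, a leaked client bundle exposes nothing beyond the license key.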
Supported Providers
1. OpenRouter (GPT-4o Vision)
- Pros: Multiple model options, reliable, excellent vision capabilities
- Cons: Higher cost per image
- Setup: Get API key from https://openrouter.ai/
- Model: openai/gpt-4o (default)
2. Gemini (Google AI)
- Pros: Lower cost, fast, good vision capabilities
- Cons: Requires base64 image encoding (higher bandwidth)
- Setup: Get API key from https://aistudio.google.com/app/apikey
- Model: gemini-1.5-flash (default)
3. Mock (Testing)
- Pros: Free, instant, no API key needed
- Cons: Returns fake data
- Usage: Development and testing only
Environment Setup
Option 1: OpenRouter (Recommended)
# .env file
AI_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key-here
AI_MODEL=openai/gpt-4o
AI_TIMEOUT_MS=15000
Option 2: Gemini
# .env file
AI_PROVIDER=gemini
GEMINI_API_KEY=your-gemini-api-key-here
GEMINI_MODEL=gemini-1.5-flash
AI_TIMEOUT_MS=15000
Batch Configuration
# Maximum images per batch request
BATCH_MAX_SIZE=50
# Timeout for entire batch (60 seconds)
BATCH_TIMEOUT_MS=60000
# Rate limiting (20 requests per minute per IP)
AI_RATE_LIMIT_WINDOW_MS=60000
AI_RATE_LIMIT_MAX=20
# License validation
REQUIRE_LICENSE=true # Set to false for demo mode
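As an illustration of how a server might read these settings, here is a small sketch. `loadBatchConfig` is a hypothetical helper, and the fallback values simply mirror the example values shown above; the real server may parse or default them differently.

```javascript
// Hypothetical helper: reads the batch settings above from an env-like object.
// Fallbacks mirror the example values in this guide (an assumption of this sketch).
function loadBatchConfig(env) {
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  };
  return {
    batchMaxSize: toInt(env.BATCH_MAX_SIZE, 50),
    batchTimeoutMs: toInt(env.BATCH_TIMEOUT_MS, 60000),
    rateLimitWindowMs: toInt(env.AI_RATE_LIMIT_WINDOW_MS, 60000),
    rateLimitMax: toInt(env.AI_RATE_LIMIT_MAX, 20),
    // Assumption: only the literal string "false" switches license checks off.
    requireLicense: env.REQUIRE_LICENSE !== 'false',
  };
}
```

In a Node.js process this would typically be called as `loadBatchConfig(process.env)`.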
API Endpoints
Single Image
POST /api/ai/caption
{
"url": "https://example.com/photo.jpg",
"context": { "title": "Product Name" },
"licenseKey": "your-photoswipe-pro-license-key"
}
Response:
{
"alt": "A red bicycle leaning against a brick wall",
"caption": "Vintage red bicycle with basket against urban brick wall"
}
Batch Processing
POST /api/ai/caption/batch
{
"images": [
{ "url": "https://example.com/photo1.jpg", "context": { "title": "Product 1" } },
{ "url": "https://example.com/photo2.jpg", "context": { "title": "Product 2" } },
{ "url": "https://example.com/photo3.jpg", "context": { "title": "Product 3" } }
],
"licenseKey": "your-photoswipe-pro-license-key"
}
Response:
{
"results": [
{
"url": "https://example.com/photo1.jpg",
"alt": "Description of photo 1",
"caption": "Engaging caption for photo 1"
},
{
"url": "https://example.com/photo2.jpg",
"alt": "Description of photo 2",
"caption": "Engaging caption for photo 2"
},
{
"url": "https://example.com/photo3.jpg",
"error": "processing_failed"
}
],
"summary": {
"total": 3,
"success": 2,
"failed": 1
}
}
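The summary object can be derived from the results array alone, which is handy for checking a response or for re-computing totals after a retry. A minimal sketch, assuming a failed item is any entry carrying an `error` field (`summarizeResults` is a hypothetical name, not part of the library):

```javascript
// Hypothetical helper: rebuilds the summary block from a results array.
function summarizeResults(results) {
  const failed = results.filter((item) => Boolean(item.error)).length;
  return {
    total: results.length,
    success: results.length - failed,
    failed,
  };
}
```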
Client-Side Usage
Basic Batch Processing
import { CaptionProvider } from 'photoswipe-pro/ai';
const provider = new CaptionProvider({ baseUrl: '/api/ai' });
// Process 10 images at once
const images = [
{ url: 'https://example.com/photo1.jpg', context: { title: 'Product 1' } },
{ url: 'https://example.com/photo2.jpg', context: { title: 'Product 2' } },
// ... up to 50 images
];
const result = await provider.generateBatch({
images,
licenseKey: 'your-license-key'
});
// Handle results
result.results.forEach(item => {
if (item.error) {
console.error(`Failed to process ${item.url}: ${item.error}`);
} else {
console.log(`${item.url}: ${item.alt}`);
// Update your UI with item.alt and item.caption
}
});
console.log(`Processed ${result.summary.success}/${result.summary.total} images`);
Auto-Batching Helper
For large galleries, use the generateForUrls helper, which automatically splits the URL list into chunks of 50 and sends one batch request per chunk:
import { CaptionProvider } from 'photoswipe-pro/ai';
const provider = new CaptionProvider({ baseUrl: '/api/ai' });
// Process 200 images (automatically batched into 4 requests of 50)
const urls = [
'https://example.com/photo1.jpg',
'https://example.com/photo2.jpg',
// ... 200 photos
];
const captionsMap = await provider.generateForUrls({
urls,
context: { category: 'products' },
licenseKey: 'your-license-key'
});
// Results stored in a Map
urls.forEach(url => {
const result = captionsMap.get(url);
if (result) {
console.log(`${url}: ${result.alt}`);
// Update your image with result.alt and result.caption
}
});
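For readers curious what the chunking step might look like, here is a rough sketch. It is not the library's implementation: `chunk`, `captionAll`, and the injected `sendBatch` function are hypothetical, and the sketch assumes each batch resolves to the `{ results }` shape shown under API Endpoints.

```javascript
// Hypothetical sketch of auto-batching; generateForUrls itself may work differently.
function chunk(items, size = 50) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// sendBatch is any function that takes [{ url }] and resolves to { results }.
async function captionAll(urls, sendBatch, size = 50) {
  const captions = new Map();
  for (const group of chunk(urls, size)) {
    const { results } = await sendBatch(group.map((url) => ({ url })));
    for (const item of results) {
      // Failed items are skipped, so lookups for them return undefined.
      if (!item.error) {
        captions.set(item.url, { alt: item.alt, caption: item.caption });
      }
    }
  }
  return captions;
}
```

Sending the chunks one after another, as this sketch does, keeps the request count predictable under the per-IP rate limit.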
PhotoSwipe Gallery Integration
import PhotoSwipeLightbox from 'photoswipe/lightbox';
import { CaptionProvider } from 'photoswipe-pro/ai';
const provider = new CaptionProvider({ baseUrl: '/api/ai' });
// Initialize PhotoSwipe
const lightbox = new PhotoSwipeLightbox({
gallery: '#my-gallery',
children: 'a',
pswpModule: () => import('photoswipe')
});
// Generate captions for all images on page load
const links = Array.from(document.querySelectorAll('#my-gallery a'));
const images = links.map(el => ({
url: el.href,
// Optional chaining guards against links that contain no <img>
context: { title: el.querySelector('img')?.alt || '' }
}));
const result = await provider.generateBatch({
images,
licenseKey: 'your-license-key'
});
// Update DOM with AI-generated captions (results are assumed to come back in request order)
result.results.forEach((item, index) => {
const img = links[index]?.querySelector('img');
if (!item.error && img) {
img.alt = item.alt;
img.dataset.caption = item.caption;
}
});
lightbox.init();
Performance Considerations
Concurrency
The server processes 5 images in parallel per batch to optimize throughput while respecting AI provider rate limits.
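A concurrency cap like this is commonly implemented as a small worker pool. The sketch below shows one generic way to do it; `mapWithConcurrency` is a hypothetical name, and the server's actual scheduler is not documented in this guide.

```javascript
// Generic worker-pool sketch; not the server's actual scheduler.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function run() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = await worker(items[index], index);
      } catch (err) {
        // One failing image must not abort the rest of the batch.
        results[index] = { error: 'processing_failed' };
      }
    }
  }

  // Start at most `limit` runners; each pulls items until none remain.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}
```

Calling `mapWithConcurrency(images, 5, captionOne)` would keep at most five caption requests in flight while preserving the input order in the output.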
Timeouts
- Single image: 15 seconds (configurable via AI_TIMEOUT_MS)
- Batch request: 60 seconds (configurable via BATCH_TIMEOUT_MS)
- Image fetch (Gemini only): 10 seconds
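If you want a matching client-side guard, a timeout can be layered over any promise with `Promise.race`. This is a generic sketch (`withTimeout` is a hypothetical helper, not a library export), and it only stops waiting; it does not cancel the underlying request.

```javascript
// Generic timeout wrapper; rejects if the wrapped promise takes longer than `ms`.
function withTimeout(promise, ms, label = 'timeout') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(label)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

For example, `withTimeout(provider.generateBatch(input), 60000)` would mirror the default batch timeout on the caller's side.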
Rate Limiting
Default limits (per IP address):
- 20 requests per minute
- Applies to both single and batch endpoints
- Batch of 50 images counts as 1 request
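To make the counting rule concrete, here is a fixed-window limiter sketch in which every request, single or batch, costs one unit. The algorithm and the name `createRateLimiter` are assumptions for illustration; the server may use a sliding window or another scheme.

```javascript
// Fixed-window counter sketch; the real server may use a different algorithm.
function createRateLimiter({ windowMs = 60000, max = 20 } = {}) {
  const windows = new Map(); // ip -> { start, count }
  return function allow(ip, now = Date.now()) {
    const entry = windows.get(ip);
    if (!entry || now - entry.start >= windowMs) {
      // First request from this IP, or the previous window has expired.
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    if (entry.count >= max) {
      return false; // the caller would answer with 429 rate_limited
    }
    entry.count += 1;
    return true;
  };
}
```

Under these defaults, 20 batch requests of 50 images each fit inside one window, which is up to 1,000 images per minute per IP.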
Cost Optimization
For 100 images:
| Method | API Calls | Approx Time | Cost (GPT-4o) |
|---|---|---|---|
| Single | 100 calls | ~25 minutes | $0.50-$1.00 |
| Batch | 2 calls | ~2 minutes | $0.50-$1.00 |
The cost is the same, but batching is roughly 12× faster (about 2 minutes instead of 25).
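The timing figures in the table are consistent with a worst case in which every call runs for its full default timeout (15 seconds per single call, 60 seconds per batch call). That reading is an assumption rather than something this guide states, and real requests will usually finish sooner. The arithmetic can be sketched as:

```javascript
// Worst-case estimate: assumes every call takes its full default timeout.
function worstCaseEstimate(
  imageCount,
  { batchSize = 50, singleTimeoutMs = 15000, batchTimeoutMs = 60000 } = {}
) {
  const batchCalls = Math.ceil(imageCount / batchSize);
  return {
    singleCalls: imageCount,
    singleSeconds: (imageCount * singleTimeoutMs) / 1000,
    batchCalls,
    batchSeconds: (batchCalls * batchTimeoutMs) / 1000,
  };
}
```

For 100 images this gives 100 calls and 1,500 seconds (25 minutes) for single requests versus 2 calls and 120 seconds (2 minutes) for batches, a ratio of 12.5.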
Error Handling
Partial Failures
Batch processing continues even if some images fail. Check the error field in results:
const result = await provider.generateBatch({ images, licenseKey });
const failed = result.results.filter(r => r.error);
if (failed.length > 0) {
console.warn(`${failed.length} images failed:`, failed);
// Retry failed images
const retryImages = failed.map(r => ({ url: r.url }));
const retryResult = await provider.generateBatch({
images: retryImages,
licenseKey
});
}
Common Errors
| Error Code | Description | Solution |
|---|---|---|
| 400 invalid_input | Missing or malformed URL | Check image URLs are valid HTTPS |
| 400 batch_too_large | More than 50 images | Split into smaller batches |
| 402 license_invalid | Invalid or expired license | Check license key |
| 429 rate_limited | Too many requests | Implement exponential backoff |
| 502 provider_error | AI provider failed | Check API key and provider status |
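On the client, the table above can be turned into a small lookup that decides whether an error is worth retrying. The sketch below is illustrative only: `ERROR_POLICY` and `policyFor` are hypothetical names, and marking `rate_limited` and `provider_error` as retryable is an assumption of this example, not documented behavior.

```javascript
// Hypothetical policy table keyed by the error codes listed above.
// Which codes count as retryable is an assumption of this sketch.
const ERROR_POLICY = {
  invalid_input: { status: 400, retry: false, hint: 'Check image URLs are valid HTTPS' },
  batch_too_large: { status: 400, retry: false, hint: 'Split into smaller batches' },
  license_invalid: { status: 402, retry: false, hint: 'Check license key' },
  rate_limited: { status: 429, retry: true, hint: 'Back off exponentially, then retry' },
  provider_error: { status: 502, retry: true, hint: 'Check API key and provider status' },
};

function policyFor(code) {
  // hasOwnProperty avoids matching inherited keys such as "constructor".
  if (Object.prototype.hasOwnProperty.call(ERROR_POLICY, code)) {
    return ERROR_POLICY[code];
  }
  return { status: 500, retry: false, hint: 'Unknown error code' };
}
```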
Retry Logic
async function generateWithRetry(provider, input, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await provider.generateBatch(input);
} catch (error) {
if (error.message === 'rate_limited' && i < maxRetries - 1) {
// Exponential backoff: 2s, then 4s with the default maxRetries = 3 (doubles on each retry)
await new Promise(resolve => setTimeout(resolve, 2000 * Math.pow(2, i)));
continue;
}
throw error;
}
}
}
Static Site Generation (SSG)
For static sites, pre-generate captions at build time:
// build-captions.js
import { CaptionProvider } from 'photoswipe-pro/ai';
import fs from 'fs';
const provider = new CaptionProvider({ baseUrl: 'http://localhost:4001/api/ai' });
// Read image URLs from your static site
const imageUrls = JSON.parse(fs.readFileSync('images.json', 'utf8'));
// Generate captions
const captionsMap = await provider.generateForUrls({
urls: imageUrls,
licenseKey: process.env.PHOTOSWIPE_LICENSE_KEY
});
// Save to JSON file
const captions = Object.fromEntries(captionsMap);
fs.writeFileSync('captions.json', JSON.stringify(captions, null, 2));
console.log(`Generated captions for ${captionsMap.size} images`);
Then in your build process:
# Start local server in the background (make sure it is accepting connections before the next step)
npm run server &
# Generate captions
node build-captions.js
# Build static site
npm run build
# Kill server
kill %1
Next.js Integration
// pages/api/generate-captions.js
import { CaptionProvider } from 'photoswipe-pro/ai';
export default async function handler(req, res) {
if (req.method !== 'POST') {
return res.status(405).json({ error: 'Method not allowed' });
}
const { images } = req.body;
const provider = new CaptionProvider({ baseUrl: process.env.AI_API_URL });
try {
const result = await provider.generateBatch({
images,
licenseKey: process.env.PHOTOSWIPE_LICENSE_KEY
});
res.json(result);
} catch (error) {
res.status(500).json({ error: error.message });
}
}
Monitoring and Analytics
Track batch processing performance:
const startTime = Date.now();
const result = await provider.generateBatch({ images, licenseKey });
const duration = Date.now() - startTime;
console.log('Batch Processing Stats:', {
total: result.summary.total,
success: result.summary.success,
failed: result.summary.failed,
successRate: `${(result.summary.success / result.summary.total * 100).toFixed(1)}%`,
duration: `${duration}ms`,
avgPerImage: `${(duration / result.summary.total).toFixed(0)}ms`
});
Best Practices
- Use batch processing for 3+ images - Single requests are fine for 1-2 images
- Implement caching - Store results to avoid re-processing
- Handle partial failures gracefully - Don't fail entire batch if one image fails
- Respect rate limits - Implement exponential backoff
- Monitor costs - Track API usage, especially with GPT-4o Vision
- Test with mock provider first - Validate integration before using real API
- Pre-generate for static sites - Build-time generation saves runtime costs
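The caching advice can be sketched as a thin wrapper that only requests captions for URLs it has not seen before. `withCache` is a hypothetical helper, and it assumes the wrapped function resolves to a Map of URL to result, as `generateForUrls` does in the examples above. An in-memory Map is used here; a persistent store would be needed to survive restarts.

```javascript
// Hypothetical caching wrapper; `generate` takes an array of URLs and
// resolves to a Map of url -> result (the shape used earlier in this guide).
function withCache(generate, cache = new Map()) {
  return async function cachedGenerate(urls) {
    const missing = urls.filter((url) => !cache.has(url));
    if (missing.length > 0) {
      const fresh = await generate(missing);
      for (const [url, result] of fresh) {
        cache.set(url, result);
      }
    }
    // Return only the requested URLs that now have a cached result.
    return new Map(
      urls.filter((url) => cache.has(url)).map((url) => [url, cache.get(url)])
    );
  };
}
```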
Cost Comparison
Gemini vs OpenRouter
| Provider | Model | Cost/Image | Speed | Quality |
|---|---|---|---|---|
| Gemini | gemini-1.5-flash | $0.001 | Fast | Good |
| OpenRouter | gpt-4o | $0.01 | Moderate | Excellent |
| OpenRouter | claude-3-haiku | $0.0025 | Fast | Good |
Recommendation: Use Gemini for development/high-volume, GPT-4o for production quality.
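To compare providers for a specific gallery size, the per-image figures from the table can be plugged into a small estimator. `estimateCostUsd` is a hypothetical helper, and the rates are copied from the table above, so treat the output as a rough estimate rather than a quote.

```javascript
// Per-image rates copied from the table above; treat them as rough estimates.
const COST_PER_IMAGE_USD = {
  'gemini-1.5-flash': 0.001,
  'gpt-4o': 0.01,
  'claude-3-haiku': 0.0025,
};

function estimateCostUsd(model, imageCount) {
  if (!Object.prototype.hasOwnProperty.call(COST_PER_IMAGE_USD, model)) {
    throw new Error(`No rate listed for model: ${model}`);
  }
  // Round to 4 decimals to hide floating-point noise.
  return Math.round(COST_PER_IMAGE_USD[model] * imageCount * 10000) / 10000;
}
```

With these figures, 1,000 images come to roughly $1 on Gemini and $10 on GPT-4o.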
Troubleshooting
Images timing out
Increase timeout:
AI_TIMEOUT_MS=30000 # 30 seconds
BATCH_TIMEOUT_MS=120000 # 2 minutes
Rate limits too restrictive
Adjust rate limits:
AI_RATE_LIMIT_MAX=50 # 50 requests per minute
Gemini API errors
Check API key and quota at https://console.cloud.google.com/
OpenRouter API errors
Check API key and credits at https://openrouter.ai/account
Summary
- ✅ Process 1-50 images per batch request
- ✅ 12× faster than individual requests
- ✅ Automatic retry and error handling
- ✅ Support for Gemini and OpenRouter
- ✅ Server owner provides ONE API key
- ✅ End users only need PhotoSwipe Pro license
- ✅ Works with static sites and SSR frameworks
Get started: Copy environment variables from docs/ENV-VARIABLES-TEMPLATE.md and start batch processing!