Batch Processing Guide
VoP’s batch processing system is designed to handle large-scale payment verification requests efficiently, supporting up to 250,000 records per batch while maintaining SEPA compliance and security standards.
Processing Overview
Batch Types
- Standard Verification: Up to 50,000 records
- Large-Scale Verification: 50,000 - 250,000 records
- Priority Processing: Critical verifications with expedited processing
Processing Stages
- Submission: Initial batch upload and validation
- Processing: Multi-threaded verification processing
- Results: Staged result delivery and retrieval
Batch Submission
Large Dataset Handling
```
POST /api/v1/verifications/batch
Content-Type: application/json

{
  "batchReference": "BANK_REF_20241220",
  "priority": "standard",
  "notifications": {
    "callbackUrl": "https://your-callback-url/batch-complete",
    "emailNotification": "operations@yourbank.com"
  },
  "verificationRequests": [
    {
      "requestId": "REQ_001",
      "iban": "DE89370400440532013000",
      "accountName": "Max Mustermann",
      "correlationId": "PAYMENT_001"
    }
    // ... more verification requests
  ]
}
```
Processing Time Guidelines
| Batch Size | Typical Processing Time | Max Processing Time |
|---|---|---|
| < 10,000 | 5-10 minutes | 15 minutes |
| 10,000 - 50,000 | 15-30 minutes | 45 minutes |
| 50,000 - 250,000 | 30-90 minutes | 120 minutes |
Optimized Processing Strategies
Chunked Processing
For large batches (>50,000 records):
- Split into 25,000 record chunks
- Parallel processing of chunks
- Automatic rate limiting per chunk
- Results aggregation
```python
def process_large_batch(records, chunk_size=25000):
    chunks = split_into_chunks(records, chunk_size)
    chunk_results = []
    for chunk in chunks:
        chunk_id = submit_chunk(chunk)
        monitor_chunk_progress(chunk_id)
        chunk_results.append(get_chunk_results(chunk_id))
    return aggregate_results(chunk_results)
```
Result Retrieval Methods
Webhook Notifications
- Immediate notification when a chunk completes
- Results ready for retrieval
- Includes processing statistics
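A receiver for these notifications can be sketched as follows. The payload fields (`batchReference`, `status`, `statistics`) are assumptions modeled on the submission format above, not a documented callback schema:

```python
import json

def handle_batch_callback(raw_body: bytes) -> dict:
    """Parse a batch-complete notification and decide the next action."""
    payload = json.loads(raw_body)
    summary = {
        "batch": payload["batchReference"],
        "complete": payload["status"] == "complete",
        "failed_records": payload.get("statistics", {}).get("failed", 0),
    }
    # Trigger result retrieval only once the batch reports completion.
    summary["action"] = "fetch_results" if summary["complete"] else "wait"
    return summary
```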
Polling with Exponential Backoff
```python
import time

def poll_results(batch_id, max_attempts=10, initial_wait=2.0, max_wait=60.0):
    """Poll batch status with exponential backoff until complete."""
    wait_time = initial_wait
    for attempt in range(max_attempts):
        status = check_batch_status(batch_id)
        if status.is_complete:
            return fetch_results(batch_id)
        time.sleep(wait_time)
        wait_time = min(wait_time * 2, max_wait)  # double the wait, capped
    raise TimeoutError(f"Batch {batch_id} not complete after {max_attempts} attempts")
```
Staged Result Retrieval
- Results available in chunks of 10,000
- Paginated retrieval
- Cursor-based continuation
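Cursor-based continuation can be sketched as a generator that walks all pages. The `fetch_results_page(batch_id, cursor)` callable and its `results`/`nextCursor` response fields are hypothetical, assuming pages of up to 10,000 results with `nextCursor` absent on the last page:

```python
def iter_batch_results(batch_id, fetch_results_page):
    """Yield verification results one at a time across all result pages."""
    cursor = None
    while True:
        page = fetch_results_page(batch_id, cursor)
        for record in page["results"]:
            yield record
        cursor = page.get("nextCursor")
        if cursor is None:  # no continuation cursor: final page reached
            break
```

The generator lets callers process results record by record without holding a full 250,000-record result set in memory.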
Performance Optimization
Parallel Processing
```python
import asyncio
import aiohttp

async def process_verification_batch(records):
    chunks = create_optimal_chunks(records)
    async with aiohttp.ClientSession() as session:
        tasks = [process_chunk(chunk, session) for chunk in chunks]
        results = await asyncio.gather(*tasks)
    return combine_results(results)
```
Memory Management
- Streaming upload for large files
- Chunked result processing
- Efficient data structures
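Streaming upload for large files can be sketched with a generator that reads fixed-size chunks instead of loading the whole batch file into memory; the 1 MiB chunk size is an illustrative choice, not a documented requirement:

```python
def stream_file_chunks(path, chunk_size=1024 * 1024):
    """Read a batch file in fixed-size chunks (1 MiB by default)."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:  # EOF reached
                break
            yield chunk
```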
Error Handling
Batch Level
- Invalid format
- Authentication issues
- Quota exceeded
Record Level
- Invalid IBAN
- Missing required fields
- Verification service unavailable
Recovery Procedures
- Log error details
- Retry failed records
- Generate error report
- Notify administrators
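The retry step above can be sketched as a record-level loop with backoff. The `verify_record` callable is a hypothetical stand-in for a single-record verification call, and treating every exception as retryable is a simplifying assumption:

```python
import time

def retry_failed_records(records, verify_record, max_retries=3, base_delay=1.0):
    """Retry failed records with backoff; return (succeeded, failed) lists."""
    succeeded, failed = [], list(records)
    for attempt in range(max_retries):
        still_failing = []
        for record in failed:
            try:
                succeeded.append(verify_record(record))
            except Exception:
                still_failing.append(record)  # keep for the next attempt
        failed = still_failing
        if not failed:
            break
        time.sleep(base_delay * (attempt + 1))  # back off between passes
    return succeeded, failed
```

Any records still in the `failed` list after the final attempt would feed the error report and administrator notification.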
Monitoring and Reporting
Real-time Metrics
- Processing speed
- Success/failure rates
- Queue depth
- Resource utilization
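Processing speed and success/failure rates can be derived from a running tally, as in this illustrative sketch (the class and field names are assumptions, not part of the VoP API):

```python
import time

class BatchMetrics:
    """Running tally of batch outcomes for real-time monitoring."""

    def __init__(self):
        self.start = time.monotonic()
        self.success = 0
        self.failure = 0

    def record(self, ok: bool):
        if ok:
            self.success += 1
        else:
            self.failure += 1

    def snapshot(self):
        elapsed = max(time.monotonic() - self.start, 1e-9)  # avoid div by zero
        total = self.success + self.failure
        return {
            "records_per_second": total / elapsed,
            "success_rate": self.success / total if total else 0.0,
        }
```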
Compliance Tracking
- Audit logs
- Processing timestamps
- Security events
- SEPA compliance checks
Best Practices
Pre-processing Validation
- Validate IBAN format
- Check required fields
- Remove duplicates
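The first two validation steps can be sketched with the standard ISO 13616 mod-97 IBAN check and a duplicate filter; keying duplicates on `requestId` plus IBAN is an assumption for illustration:

```python
def is_valid_iban(iban: str) -> bool:
    """Check an IBAN using the ISO 13616 mod-97 algorithm."""
    iban = iban.replace(" ", "").upper()
    if not (15 <= len(iban) <= 34) or not iban.isalnum():
        return False
    rearranged = iban[4:] + iban[:4]  # move country code + check digits to the end
    digits = "".join(str(int(ch, 36)) for ch in rearranged)  # A=10 ... Z=35
    return int(digits) % 97 == 1

def dedupe_requests(requests):
    """Drop duplicate verification requests, keeping the first occurrence."""
    seen, unique = set(), []
    for req in requests:
        key = (req["requestId"], req["iban"])
        if key not in seen:
            seen.add(key)
            unique.append(req)
    return unique
```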
Optimal Batch Sizes
- Standard hours: 25,000 records
- Off-peak hours: 50,000 records
- Consider time sensitivity
Resource Management
- Monitor memory usage
- Track processing time
- Handle timeouts appropriately
Result Handling
- Implement robust error handling
- Store results securely
- Maintain audit trail
Security Considerations
- End-to-end encryption
- Secure file transfer
- Access control
- Audit logging
Support and Maintenance
Contact Points
- Technical support
- Emergency processing
- Capacity planning
Documentation
- Processing logs
- Audit trails
- Performance reports