Problem Description
We’re experiencing severe performance issues when archiving a single Matomo site that contains 5,000+ segments. Each segment takes 40-50 minutes to archive, which makes the overall archiving run impractical.
Environment
- Matomo Version: 5.3.0
- Installation: Self-hosted
- Database: Aurora MySQL
- Data Volume:
  - Single site ID
  - ~2,000 different businesses
  - Custom dimensions used to segment by business + checkout slug
  - ~5,000 segments in total
- Deployment: AWS Fargate tasks for scheduled archiving
Current Setup
We have a single Matomo site ID that is used by thousands of businesses. We use custom dimensions to segment by business and by checkout slug (one business can have multiple checkouts), resulting in roughly 5,000 segments.
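Each segment definition combines the two custom dimensions, along these lines (the dimension indices and values below are illustrative, not our real configuration):

dimension1==acme-corp;dimension2==main-checkout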
Issue Details
When the archiver runs via scheduled Fargate tasks, the process takes approximately 40-50 minutes per segment. With 5,000+ segments, the archiving run cannot complete in any reasonable timeframe.
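To put that in numbers, at the ~45-minute midpoint a purely sequential run would need:

5,000 segments × 45 min = 225,000 min ≈ 3,750 hours ≈ 156 days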
Command Used
php /var/www/html/console core:archive --url={our domain} --force-idsegments=all --force-idsites=${siteid} --no-interaction --no-ansi
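The only concurrency in play is whatever core:archive does by default. We are considering raising the built-in parallelism, roughly along these lines (flag values are illustrative, not tuned; as we read the docs, --concurrent-requests-per-website controls parallel archiving requests per site, and --concurrent-archivers caps how many archiver processes may coexist if we launch several):

php /var/www/html/console core:archive --url={our domain} --force-idsites=${siteid} --concurrent-requests-per-website=8 --concurrent-archivers=4 --no-interaction --no-ansi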
Archive Task Configuration
- 2 vCPU
- 4 GB RAM
- Writer + Reader Aurora MySQL DBs
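We have not tuned any archiving settings in config.ini.php; everything is at the defaults. As far as we understand the documentation, the relevant [General] settings would be something like the following (values illustrative):

[General]
; re-archive "today" at most once an hour instead of the default interval
time_before_today_archive_considered_outdated = 3600
; archive newly created segments from their creation date rather than from the beginning of time
process_new_segments_from = "segment_creation_time"
; enforce that segment archiving only ever happens via the cron archiver, never in browser requests
browser_archiving_disabled_enforce = 1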
Questions
- Is this expected behavior for a site with 5000+ segments, or is there a performance issue/bug?
- Are there any recommended optimisations for archiving large numbers of segments?
- Is there a recommended maximum number of segments per site for optimal performance?
- Are there architectural recommendations for handling our use case (multiple businesses on a single Matomo instance)?
We’ve also experimented with separate site IDs (one per checkout slug), but that creates its own management challenges across 16,000+ sites. A sketch of the alternative we are considering, sharding segments across parallel archiver tasks, is below.
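Since core:archive’s --force-idsegments option accepts a comma-separated list of segment IDs, one idea is to run N Fargate tasks and give each task every Nth segment. A rough sketch, assuming the default matomo_ table prefix and hypothetical SHARD/SHARDS environment variables (we have not built this yet):

# select this shard's segment IDs straight from the segment table
# (enable_only_idsite = 0 means the segment is available to all sites)
IDS=$(mysql -N -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASS" matomo -e "
  SELECT idsegment FROM matomo_segment
  WHERE deleted = 0
    AND (enable_only_idsite = ${siteid} OR enable_only_idsite = 0)
    AND idsegment % ${SHARDS} = ${SHARD};" | paste -sd, -)

# archive only this shard's segments; --concurrent-archivers should let the
# parallel tasks coexist without stepping on each other
php /var/www/html/console core:archive --url={our domain} \
  --force-idsites=${siteid} \
  --force-idsegments="${IDS}" \
  --concurrent-archivers=${SHARDS} \
  --no-interaction --no-ansi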