Configuration for large websites

If you have a big website with thousands of pages, delivering a cached and optimized site presents some challenges. In this doc, we will cover some useful techniques you can apply when caching a large sitemap. 

Heads up! This guide covers advanced configuration of some of our features. In most cases, the default behavior of these features is adequate, and customizations are not required. The following applies only to very large sitemaps.

When you have a large sitemap, it is important to plan a good caching strategy. There are physical limitations resulting from the generation of the cache and optimizations, the delivery of the pages/assets to the users, and the cache cleanup.

When using WP Rocket on a large-scale website, we recommend taking the following into account:

1. Preload

a. Only preload the most visited pages, not the full sitemap:

Problem: If you have a website with 50,000 URLs, there is a good chance that most of the traffic is concentrated on a small percentage of them. On large sitemaps, the preload might not get far before the cache is cleared, so it may never reach a good completion rate.

Solution: Focus the preload on the most visited pages. A smaller preload is the best way to ensure a more reliable process. You can achieve this by selecting only a specific sitemap to preload, instead of the full sitemap. See: Customizing preload sitemaps

Visited URLs will still be preloaded!
This does not limit the cache preload to that initial sitemap. Any other URLs visited by users will be immediately added to the preload table and preloaded when further preload processes run.

So, the site's organic traffic will "teach" our tool which pages are important and visited, and we will keep them preloaded for future visits.
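As a rough illustration of how a reduced sitemap could be derived, the sketch below picks the most visited URLs until they cover a target share of traffic, then renders them as a minimal sitemap. It is not part of WP Rocket: the function names, the visit counts, and the 80% target are all assumptions made for the example, and the visit data would have to come from your own analytics.

```python
from xml.sax.saxutils import escape


def top_urls_by_traffic(visits, coverage=0.8):
    """Return the most-visited URLs that together reach `coverage` of all visits."""
    total = sum(visits.values())
    selected, running = [], 0
    # Walk URLs from most to least visited until the traffic target is met.
    for url, count in sorted(visits.items(), key=lambda item: item[1], reverse=True):
        if running / total >= coverage:
            break
        selected.append(url)
        running += count
    return selected


def build_sitemap(urls):
    """Render a minimal sitemap XML document for the given URLs."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


# Hypothetical visit counts, e.g. exported from an analytics tool.
visits = {
    "https://example.com/": 5000,
    "https://example.com/shop/": 3500,
    "https://example.com/blog/": 1000,
    "https://example.com/about/": 400,
    "https://example.com/archive/2012/": 100,
}

popular = top_urls_by_traffic(visits, coverage=0.8)
sitemap_xml = build_sitemap(popular)
print(sitemap_xml)
```

With this sample data, two of the five URLs already account for more than 80% of visits, so only those two end up in the reduced sitemap. The remaining pages would still be picked up once users visit them.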

b. Server infrastructure and preload speed:

Problem: Depending on the number of pages to be preloaded, the process can put a heavy load on the server infrastructure.

Solution: Talk with your hosting provider to ensure your server has enough resources for the process to run.
Additionally, WP Rocket can help by allowing control over the preload generation pace. To learn how you can tweak the preload parameters, please check this doc: Customize Preload Parameters
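To get a feel for how pace and sitemap size interact, here is a back-of-the-envelope estimate of how long one full preload pass might take. The batch size, batch interval, and cache lifespan used below are hypothetical values chosen for the example, not WP Rocket defaults, and the function name is made up for this sketch.

```python
import math


def preload_duration_hours(url_count, urls_per_batch, batch_interval_seconds):
    """Estimate how long one full preload pass takes at a given pace."""
    batches = math.ceil(url_count / urls_per_batch)
    return batches * batch_interval_seconds / 3600


# All values below are hypothetical; substitute your own settings.
full = preload_duration_hours(50_000, urls_per_batch=45, batch_interval_seconds=60)
reduced = preload_duration_hours(5_000, urls_per_batch=45, batch_interval_seconds=60)
cache_lifespan_hours = 10

print(f"Full sitemap: {full:.1f} h, reduced sitemap: {reduced:.1f} h")
print("Full pass fits within cache lifespan:", full <= cache_lifespan_hours)
print("Reduced pass fits within cache lifespan:", reduced <= cache_lifespan_hours)
```

Under these assumptions, the full 50,000-URL pass would take longer than the cache lifespan, so it could never complete before the cache is cleared, while the reduced sitemap finishes comfortably. Speeding up the pace shortens the pass, but only as far as the server can absorb the extra load.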

c. Long-term cache, disabling Automatic Cache Clearing:

Problem: The cache might get cleared before its generation is finished. 

Solution: To avoid frequent cache clearing and keep the cache for longer, you can use this technique: Disable automatic cache clearing

However, this only suits non-dynamic websites, or sites whose features do not require the cache to be purged frequently. Sites that do need frequent purging (e-commerce sites using nonces, for example) are not good candidates. You will need to check whether this applies to your case before implementing this approach.

2. Remove Unused CSS

Problem: The Used CSS generation process requires our external service to visit every uncached page. On a large sitemap, our external tool will try visiting all the pages added to the preload table. This puts stress on the server and it might be a failure point on huge sitemaps. 

Solution: Reducing the preloaded sitemap helps here too: with fewer URLs in the preload table, there are fewer pages for our external service to visit, and the process becomes more reliable. To learn how you can tweak the Remove Unused CSS parameters, please check this doc: Customize Remove Unused CSS Parameters
   

3. Separate Mobile Cache?

Problem: If you have the Separate cache for mobile devices option enabled, it doubles the preloading and Used CSS generation effort: for each URL, we need to generate two cache files and two sets of Used CSS.

Solution: If you have this option enabled, please double-check whether it is really needed. In some cases, such as when the theme has a responsive design, you can disable it.

4. Optimize the delivery of the cache

Edge cache

Problem: If you have a large amount of traffic, the server's capacity to handle incoming requests can still become the bottleneck, even when the page is cached and the cache is delivered without requiring PHP.

Solution: This goes beyond WP Rocket's features, but tools such as Cloudflare Cache Everything, Cloudflare APO, Sucuri, or a reverse proxy (Varnish, NGINX) might help.
