Duda Collections: The Microservice Way

March 17, 2021

Collections are a special data structure included in the content libraries of Duda sites. Collections are incredibly useful for managing websites at scale and especially helpful when working with dynamic pages and Duda’s native widget builder. 


Since collections are so important to so many of the web pros who use our platform, we thought it would be interesting to pull back the curtain and explain how these crucial tools work.


Let’s dive in...

A Quick Glance at Our Current Implementation

In Duda Website Builder, we have two major groups of collections:


  • Internal collections (image gallery and various other data field types) — These are stored and managed by customers inside the Duda platform.
  • External collections (Instagram, Airtable, Google Sheets, etc.) — Customers manage this data outside our platform and provide Duda with some mechanism to fetch and cache it for faster access.

There is a principal difference between these two groups. Internal collections data is stored in the database (Oracle). Externally provided data, since it's transient by nature and can always be re-fetched, is stored in a fast in-memory cache (Redis).

Current Limitations With Internally Stored Collections

The first and probably most important limitation is the database itself. Our Oracle database is heavily loaded with many kinds of requests — accounts, payments, pages, widgets, blogs, template metadata and many, many others.


One option is to scale Oracle vertically; i.e., use more powerful servers. However, vertical scaling has limits, and relational databases do not scale horizontally well. Another option is to extract some of the load to a dedicated database that scales horizontally; i.e., add servers that operate in parallel. That way, as more and more users create more and more collections with more and more data, the extra load won't affect data operations that are unrelated to collections.


Also, Oracle works best for storing data with a predefined structure. For example, the fields of a "Users" table can be defined in advance (e.g., email, login, first and last name). However, each collection has different fields and field types. A "Plants" collection would have fields like "plant name," "origin" and "care tips," but an "Employees" collection would be totally different. So, internally we store collection data as stringified JSON documents. Even though Oracle has extensions that allow convenient operations on unstructured JSON documents, other databases are specifically designed for storing and processing documents without a predefined structure.
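To make this concrete, here is a minimal sketch (in Python, with made-up field values) of why document storage suits collections: two rows from different collections share no fields, yet both serialize to the same kind of stringified JSON document.

```python
import json

# Hypothetical rows from two collections with completely different
# schemas. Neither fits a single fixed relational table definition.
plant_row = {
    "plant_name": "Monstera deliciosa",
    "origin": "Central America",
    "care_tips": "Indirect light; water when the topsoil is dry",
}
employee_row = {
    "first_name": "Dana",
    "last_name": "Levi",
    "role": "Designer",
    "start_date": "2020-06-01",
}

# Both rows fit the same storage column, because the schema lives in
# the document itself rather than in the table definition.
stored_plant = json.dumps(plant_row)
stored_employee = json.dumps(employee_row)
```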

Current Limitations With Externally Provided Collections


External collections may change at any moment, and Duda needs some way to apply these changes. We have two mechanisms for this:


  • Data expiration. This capability comes out-of-the-box with the in-memory data store Redis. All external collection data expires after two hours and needs to be re-fetched. However, if we fail to fetch fresh data, we use "fallback" data that is cached for one week (so we in fact store the same data twice — once in a "fast cache" and again in a "long cache"; see the sketch after this list).
  • A background job that runs every hour and checks whether the external collection data of published sites has changed. If it has, we clean the cache of rendered runtime pages, and the pages are re-rendered with freshly fetched data upon the next request.
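Here is a minimal sketch of how such a two-tier cache could look, assuming the redis-py client; the key names, the `fetch_external` callable, and the error handling are illustrative, not Duda's actual implementation:

```python
import redis

FAST_TTL = 2 * 60 * 60            # 2 hours, as described above
FALLBACK_TTL = 7 * 24 * 60 * 60   # 1 week

r = redis.Redis()

def get_collection(collection_id: str, fetch_external) -> bytes:
    fast_key = f"collection:fast:{collection_id}"
    fallback_key = f"collection:fallback:{collection_id}"

    data = r.get(fast_key)
    if data is not None:
        return data  # a fresh copy is still cached

    try:
        # Re-fetch from the external provider (Instagram, Airtable, ...).
        data = fetch_external(collection_id)
    except Exception:
        # Provider unreachable: serve the week-long "fallback" copy.
        return r.get(fallback_key)

    # Store the same payload twice: fast cache and long-lived fallback.
    r.set(fast_key, data, ex=FAST_TTL)
    r.set(fallback_key, data, ex=FALLBACK_TTL)
    return data
```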


So, the functional problem with this solution is that Redis is a key-value store, and we store all of a collection's data as a single chunk. Thus, in order to search or filter data, we have to do it in memory. In 99 percent of use cases this is not a big issue, but all of our big collections (>1,000 rows and >1 MB of raw data) are external collections. If you've got a long list widget connected to such a collection, your page is cached with all the data. This may make your site visitors a little unhappy if they end up loading 2 MB from the web.


At Duda, we have lots of new collections-related feature requests, as well as optimizations of existing functionality that need to be done (big collections can cause big issues). Our current solution can't fit all of these requirements, so the decision was made to extract site collections into a dedicated infrastructure — a microservice.

Why Microservices: A Global Trend of Distributed Systems

If you're familiar with software development, or have worked in an IT-related field for at least a few years, you've probably heard the term "microservice." The concept behind microservices is that instead of having one huge "monolithic" application, you split it into multiple smaller applications that are nearly independent of each other. Microservices usually have their own databases (though it's not a requirement) to better suit their needs.


There is no single "Google" application — such huge services consist of multiple applications (sometimes even thousands) that communicate behind the scenes. There might be a "Gmail" microservice, an "ads" microservice, an "authentication" microservice, etc., but you as a Google user don't really know or care. You simply type "google.com", perform your actions, and it works. So, why microservices?


  • Autonomy of scale — If millions of people suddenly start sending emails via Gmail instead of WhatsApp messages, only the "Gmail" microservice will need to handle the additional load. More servers will be started, and users will still have a good experience. With Duda collections, we definitely want our customers to have the option to create more and bigger collections with rich content. We don't want this additional load to affect other services and capabilities.
  • Autonomy of deploy — Whenever a new feature or a patch is released, a microservice can be deployed in less than an hour without the need to change other parts of the system. And since lots of relatively long automated tests may need to run before each deploy, you don't need to test the whole huge system, which can take hours! Instead, you only need to test the flows that affect a small part of it, which takes ~10 minutes in the case of Duda's collections microservice. At present, the Duda monolith is redeployed every business day, but the collections microservice doesn't need to wait for that. When needed, we release improvements up to four times a day. When not needed, we keep the same stable version for days.
  • Too many more to list... — There are many, many other reasons why we decided on microservices, but we’re not going to mention them because this article is about Duda Collections, not microservices in general (if it were, it’d be much longer).


However, it's also important to mention that microservices significantly increase the complexity of the system as a whole.


When there are multiple “moving parts,” a system is more likely to experience failures. Independent microservices still need to communicate over the network — which can fail — and the workflow needs some way to recover from those failures. 



For instance, whenever a published site tries to retrieve collection data but the collections microservice is unavailable for some reason, it's fine to render an empty list widget instead of showing "Sorry, error while rendering your page." But when the issue is resolved, all affected pages should be re-rendered.
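A hedged sketch of that graceful-degradation behavior (the helper functions below are hypothetical stand-ins, not actual Duda APIs):

```python
def fetch_rows(collection_id: str) -> list[dict]:
    # Stand-in for an HTTP call to the collections microservice.
    raise ConnectionError("collections microservice unavailable")

def mark_for_rerender(collection_id: str) -> None:
    # Stand-in: queue affected pages for re-rendering on recovery.
    print(f"queued re-render for pages using {collection_id}")

def render_list_widget(collection_id: str) -> str:
    try:
        rows = fetch_rows(collection_id)
    except ConnectionError:
        # Service is down: degrade to an empty widget, not an error page.
        mark_for_rerender(collection_id)
        rows = []
    return "<ul>" + "".join(f"<li>{row['title']}</li>" for row in rows) + "</ul>"
```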

How Do Duda Collections Benefit From Microservices?

Collections benefit in so many ways! Namely: improved performance, awesome new features, and a lower time-to-market for those features.

A Powerful Database That Opens New Horizons

As described above, a major reason for extracting all collections functionality is to increase scalability. As opposed to the Duda monolith, our collections microservice uses Amazon DocumentDB (if you've heard of MongoDB, it's almost the same). It's a non-relational database that is designed to work effectively with user-structured documents. This allows it to perform granular search and filter operations at the database level, without the need to load all of the data if you only need to show the first 10 items. The database is also designed to work in cluster mode; i.e., if the primary server fails, edits to collection data won't be available for some time, but reads won't be affected, thanks to replica servers that still respond within 30-50 milliseconds.
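Since DocumentDB speaks the MongoDB wire protocol, a standard MongoDB client illustrates the point. The connection string, database, and field names below are assumptions for the sketch, not Duda's actual schema:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://docdb-cluster.example.com:27017")
rows = client["collections_db"]["plants"]

# Filter, sort, and paginate inside the database: only the 10 matching
# documents cross the network, never the whole collection.
first_page = list(
    rows.find({"origin": "Central America"})
        .sort("plant_name", 1)
        .limit(10)
)
```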

More Frequent and Safer Releases

With microservices, the team has more freedom to choose when to implement new features. Adding a new table to the database doesn't need to be approved by our system engineers, because it won't affect the main database. Changes in code won't affect other teams working in the monolith. Thus, the "technical overhead" of feature development is reduced drastically.

The performance and behavior of a microservice are also easier to monitor. When the development team notices bad logs, the issue can be fixed in less than an hour, or the service can even be rolled back to the previous stable version in less than 10 minutes.

FAQs

Here are some answers to the most frequent questions we've received about using microservices to power Duda collections.

Is Migration Associated With Downtime or Risk of Wrong Data Displayed in Widgets?

Even though migration to a microservice is complex, we can perform it without any downtime or serious risk of unexpected behavior. Because our development processes are based on feature flags, we have granular control over the process, and at each step of the migration we can ensure backward compatibility.


We started the migration process a few months ago, and our general idea was: "even when a request comes to the microservice, users should see 'reference' values that are returned from the stable system (the monolith)." So for any request, the microservice first redirects it to the monolith to perform the relevant operation; then the same operation is executed in the microservice. The results are compared, and the response from the monolith is returned. This doesn't affect performance, because time-consuming operations (like "get data") are executed in parallel.
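Here is a minimal sketch of that shadow-comparison step, assuming Python's standard thread pool; both backend calls and the mismatch logger are hypothetical stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor

def get_data_from_monolith(request: dict) -> dict:
    return {"rows": ["reference data from the stable monolith"]}

def get_data_from_microservice(request: dict) -> dict:
    return {"rows": ["candidate data from the new microservice"]}

def log_mismatch(request: dict, reference: dict, candidate: dict) -> None:
    print("mismatch:", reference, "!=", candidate)

def handle_get_data(request: dict) -> dict:
    # Execute the same operation on both systems in parallel, so the
    # comparison adds no extra latency to the request.
    with ThreadPoolExecutor(max_workers=2) as pool:
        ref_future = pool.submit(get_data_from_monolith, request)
        cand_future = pool.submit(get_data_from_microservice, request)
        reference, candidate = ref_future.result(), cand_future.result()

    if candidate != reference:
        log_mismatch(request, reference, candidate)

    # The monolith stays the source of truth until migration completes.
    return reference
```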

Is There Any Action Required From the Duda Customer Side?

No! The process of migrating all flows to the microservice happens behind the scenes. You may not know it, but your published sites may already be served by our microservice. And some microservice-served customers already have access to new beta-stage capabilities!

