Next.js in news media - Schibsted Tech Polska

This article outlines our experiences with Next.js, from initial research sometime in 2020 to the present day, by which point we have used it in multiple projects. Over time we ran into various problems, and we are not proud of some of the solutions we came up with. To make this article interesting and informative, I will primarily focus on the issues and pain points, but let me assure you that my attitude towards Next.js is generally optimistic. It is still one of the top contenders whenever I need to set up a new project today.

News media challenges

Working with media, and especially news media, has some unique characteristics. For one, we must be ready for a sudden spike in traffic, whether it’s breaking news or a DDoS attack. Our traffic spikes often involve just a handful of articles that become extremely popular in a short time, and only briefly. We also need to be up and available to readers as much as possible to appear reliable and professional. Moreover, going down may not only hurt our reputation but, in some cases, even cause a bit of panic among readers. On the other hand, we also need a near-instant mechanism for purging any cache we have whenever an editor publishes or updates content.

What my team is working on

Also, I want to introduce my team and what we are working on. I work in the Niche Destinations team. We are maintaining and developing a few sites about topics that are not strictly news related but contain a fair amount of news articles. From time to time, we get a trending piece like, for example, an article about Uncle Ben’s rebranding that lit our monitoring like a Christmas tree for half of the day 😅. Two sites that are the basis for this article are:

tek.no – a tech website with product reviews, comparisons, tests, guides and some tech-related news. This is the bigger of the two and the first one we moved to Next.js.
godt.no – a cooking website with recipes, tips, guides and some cooking-related news.

Old stack

In the past, both tek.no and godt.no were running a custom-made stack using React, Redux, redux-first-router, Babel and webpack. We had partial SSR. godt.no rendered just metadata (title, og tags, structured content) on the server side; everything else was rendered client-side. tek.no rendered a bit more. We were consulting SEO specialists who wanted us to have full SSR. To test it without total commitment, Tek rendered not only metadata but also a basic version of the article content, meaning just text and images. That content was not re-hydrated but rather re-rendered on the client side. This was more of a test to boost SEO, but we ran it for a while.

Statically burning the content

Our first idea was to try statically burning our content so that we could serve only static files from a CDN. The first stack we tried was Gatsby. We created a quick mock that fetched our articles and rendered only text and images with Gatsby to get a rough estimate of how long it would take to render all of them. That’s when we realised that tek.no alone already has over 100 000 articles 😅. That many articles, even with just a plain view, took us well over 50 minutes to burn (I don’t remember if we managed to burn all of them or decided that this was too long and stopped at that point), and once we added internal linking between the articles, the whole task ran out of memory. The length of the process was due to both Gatsby’s speed and the rate at which we queried our backend to extract data. We were afraid to let it go full throttle in case it brought the API down, as the API was also serving normal content. At that point in time, Gatsby lacked features for incremental building. But even incremental builds wouldn’t help in cases where a change in our codebase requires processing all of the articles again. This experiment killed the idea of statically burning our content ahead of time, regardless of the framework used to achieve it.

Moving to Next.js

After we failed with Gatsby, a new release of Next.js caught my attention. It was version 9.3, specifically the new Next-gen Static Site Generation (SSG) Support feature. This is where my adventure with Next.js began.

In Schibsted, we have a concept of 10% days or lab days. It is a day you can spend learning something new or testing anything you want. It is believed that we should spend 10% of our time on that so every two weeks we have one day to do just that. I spent eight consecutive lab days coming up with an almost complete (although buggy) rewrite of tek.no to Next.js. After presenting this proof of concept that already tackled all of the significant issues we had in tek.no to my team, we agreed to involve ourselves entirely in completing the migration. From this point forward, I will be jumping through the timeline to describe problems we had or still have and how we solved or learned to live with them.

Data fetching and caching

The first order of business was to understand getStaticPaths, getStaticProps and getServerSideProps. I wanted to use getStaticProps but realised it wasn’t feasible for us. With the fallback option set to true or later to blocking, we could handle any new articles, and we could use getStaticPaths to prepare some of the recent or hot articles to be ready right after the build to speed things up. Still, those methods are burning the article locally and then serving that burned version. Back when we started, there was no way to purge those burned articles, which was unacceptable to us since editors needed to be able to update articles without us having to redeploy the application.
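To make the build-time part concrete, here is a minimal sketch of how pre-rendering a handful of hot articles and leaving the rest to the fallback could look. The page location (something like pages/i/[id]/index.js), the fetchArticle stub and the list of ids are assumptions made for illustration, not our real code; getStaticPaths, getStaticProps and the fallback option are the actual Next.js API.

```javascript
// Hypothetical stand-in for a backend client; the real API is not shown here.
async function fetchArticle(id) {
    return { id, title: `Article ${id}` };
}

// Build the getStaticPaths result for a short list of recent or hot articles.
function buildStaticPaths(hotArticleIds) {
    return {
        paths: hotArticleIds.map((id) => ({ params: { id: String(id) } })),
        // 'blocking': ids missing from `paths` are rendered on their first request
        // and the result is stored, just like the pre-built ones.
        fallback: 'blocking',
    };
}

async function getStaticPaths() {
    const hotArticleIds = [101, 102, 103]; // assumption: would come from the backend
    return buildStaticPaths(hotArticleIds);
}

async function getStaticProps({ params }) {
    const article = await fetchArticle(params.id);

    if (!article) {
        return { notFound: true };
    }

    return { props: { article } };
}

// In a real page file both functions would be exported:
// export { getStaticPaths, getStaticProps };
```

Whichever way a page gets generated, it ends up stored on the server, which is exactly the part we could not purge back then.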

Later on, the revalidate option arrived and gave us a way to mark content as stale. One thing you should know about it is that Next.js adopted the stale-while-revalidate strategy, which means that even if you set revalidate to 10 seconds, the page won’t be regenerated every 10 seconds. It is revalidated lazily, which isn’t an issue by itself, but the request that triggers the revalidation still gets the old response. This means that if no one requested a given route for an hour, the first response after that hour would still be the old one. This is not an issue unless you run a caching reverse proxy in front of the Next.js application, as we do. We use CloudFront to cache responses and to handle traffic spikes and DDoS attacks, so this was unacceptable to us: even with a low revalidate value, CloudFront would still cache the old response. We found no reliable way to eliminate those race conditions.
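The lazy behaviour is easier to see in a toy model. The sketch below is not Next.js code; it is a deliberately simplified, synchronous simulation of stale-while-revalidate for a single route, with an explicit clock so the timing is visible. All names in it are made up for the illustration.

```javascript
// Toy model of lazy stale-while-revalidate for one route.
function createSwrRoute(render, revalidateSeconds) {
    let cached = null; // { html, renderedAt }

    return function handleRequest(nowSeconds) {
        if (cached === null) {
            cached = { html: render(), renderedAt: nowSeconds };
            return cached.html;
        }

        // Always answer from the cache first...
        const response = cached.html;

        if (nowSeconds - cached.renderedAt >= revalidateSeconds) {
            // ...and only then regenerate, for the benefit of the *next* visitor.
            // (Next.js does this in the background; here it is inline for simplicity.)
            cached = { html: render(), renderedAt: nowSeconds };
        }

        return response;
    };
}

let version = 1;
const request = createSwrRoute(() => `article v${version}`, 10);

const first = request(0);          // "article v1"
version = 2;                       // the editor updates the article
const afterAnHour = request(3600); // still "article v1", although revalidate is 10 s
const nextRequest = request(3601); // "article v2"
```

If a CDN sits in front and happens to cache the afterAnHour response, the outdated version lives on for as long as the CDN keeps it.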

With the arrival of On-Demand Revalidation and Self-hosting ISR, we should now be able to switch to getStaticProps for further optimisations. Still, it would require a sizeable infrastructure redesign to handle that.

getServerSideProps works fine for us in most cases. We like the option to modify the server response. We use it to shorten the Cache-Control header when we run into a full or partial error. We can also properly return a 410 HTTP response when some content has been unpublished. I’m not sure if those things would be possible using getStaticProps.
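As a rough sketch of what modifying the response can look like: getServerSideProps receives the Node.js response object, so headers and the status code can be set before returning props. The fetchArticle stub, the shape of its result and the max-age values below are assumptions for illustration only.

```javascript
// Hypothetical backend call; assumed to return { status, article, partialError }.
async function fetchArticle(id) {
    return { status: 200, article: { id, title: `Article ${id}` }, partialError: false };
}

// Cache healthy pages longer than degraded ones (the numbers are made up).
function cacheControlFor({ status, partialError }) {
    if (status >= 500) {
        return 'public, max-age=5';
    }
    if (partialError) {
        return 'public, max-age=30';
    }
    return 'public, max-age=300';
}

async function getServerSideProps({ params, res }) {
    const result = await fetchArticle(params.id);

    res.setHeader('Cache-Control', cacheControlFor(result));

    if (result.status === 410) {
        // The content was unpublished: tell clients and crawlers it is gone.
        res.statusCode = 410;
        return { props: { article: null, gone: true } };
    }

    // Props must be JSON-serialisable, so fall back to null rather than undefined.
    return { props: { article: result.article ?? null, gone: false } };
}

// In a real page file: export { getServerSideProps };
```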

⚠️ If you have a reverse proxy in place, you should know that using getServerSideProps opens a way to bypass your cache! getServerSideProps accepts more than just GET requests, so a malicious party can use other methods to DDoS you more effectively; we have already been on the receiving end of that one 😅. The easiest way to mitigate it is to block those methods at the reverse proxy level. If you don’t have a reverse proxy, this likely doesn’t change anything for you since every request will reach your server either way. More info here: https://github.com/vercel/next.js/discussions/43341
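If blocking at the proxy is not an option, a guard inside getServerSideProps can at least stop such requests before any expensive backend work happens. This is only a sketch of that idea, not what we run in production, and it assumes Next.js skips rendering once the response has been ended; the request still reaches the server, so the proxy-level block remains the better fix.

```javascript
const CACHEABLE_METHODS = new Set(['GET', 'HEAD']);

function isCacheableMethod(method) {
    return CACHEABLE_METHODS.has(String(method).toUpperCase());
}

async function getServerSideProps({ req, res }) {
    if (!isCacheableMethod(req.method)) {
        // Reject early, before fetching anything from the backend.
        res.statusCode = 405;
        res.setHeader('Allow', 'GET, HEAD');
        res.end();
        return { props: {} };
    }

    // ...normal data fetching would go here.
    return { props: {} };
}

// In a real page file: export { getServerSideProps };
```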

Routing

Next.js file-system-based routing was quite a change for us. We were used to a routing table somewhere in the project. The file-based approach was incredibly challenging for us since our sites are in Norwegian, the parts of the URLs are also in Norwegian, and not one developer speaks Norwegian 😅. That said, once we learned the few Norwegian words used in paths, or how to search for the proper file another way (e.g. by param names), we were pleased with how easy it was to find most routes. I say “most” because we had one exception: one route ended up handling two cases because, initially, we couldn’t figure out how to separate them using the Next.js convention. The routes were:

/:section?/:subsection?/i/:articleId/:articleSlug? – single article page, note that everything but :articleId is optional
/:section/:subsection? – section page (list of articles)

Previously we could easily recognise article paths by /i/ part, but in Next.js, there were no optional parameters, and Catch all routes can be used only at the end of the URL. So we ended up with Catch all routes handling both cases. We ran this solution for some time, but it was not optimal since bundle splitting packed both the article page and section page in one bundle and Next.js treated it like this was just one page, whilst these were two pages in reality.
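To give an idea of what that single catch-all page had to do, here is a simplified sketch of the decision it made on every request. The function name and the example segments are invented for this article; the input is the array of path segments that a catch-all route such as pages/[...params].js receives.

```javascript
// Decide whether a list of path segments points at an article or at a section page.
function classifyPath(segments) {
    const marker = segments.indexOf('i');

    // An article path contains /i/ followed by at least an article id.
    if (marker !== -1 && segments[marker + 1] !== undefined) {
        return {
            type: 'article',
            sections: segments.slice(0, marker),
            articleId: segments[marker + 1],
            articleSlug: segments[marker + 2] ?? null,
        };
    }

    return { type: 'section', sections: segments };
}

classifyPath(['mobil', 'i', 'abc123', 'some-slug']); // article "abc123" in section "mobil"
classifyPath(['i', 'abc123']);                       // article without section or slug
classifyPath(['mobil', 'android']);                  // section page
```

Because both branches live in one page file, both page components end up in the same bundle, which is the downside described above.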

Some time later, we found out about Rewrites. With those, we could clean it up (and bend the Next.js convention of file-based routing a bit 😅). So now the section path is pages/[...sectionParams]/index.js, so still with the catch-all, but now it doesn’t have to handle articles because all of them are handled with pages/i/[id]/index.js path and one rewrite:

{
    source: '/:section*/i/:id/:slug?',
    destination: '/i/:id?section=:section*&slug=:slug?',
}

Operators available in rewrites are much more powerful than the ones you can use with file-based routing. :section* handles zero or more levels of sections and :slug? handles the optional slug, but the destination is the most interesting part: we pass id normally since it’s not optional, and we treat everything else as a query param. I believe that the routing documentation in Next.js should link to the Rewrites documentation for cases that are impossible to handle otherwise. This was an example from tek.no, and I hope we didn’t invent some anti-pattern, since we use the same trick twice in godt.no 😅.
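For context, this is roughly where that rewrite lives in next.config.js, together with a small helper for the page side. The rewrites() function is the standard Next.js configuration hook; the sectionList helper is hypothetical, and since I would not rely on exactly how a multi-level section arrives in the query, it accepts a string, an array, or nothing.

```javascript
const articleRewrite = {
    source: '/:section*/i/:id/:slug?',
    destination: '/i/:id?section=:section*&slug=:slug?',
};

const nextConfig = {
    async rewrites() {
        return [articleRewrite];
    },
};

// In next.config.js: module.exports = nextConfig;

// Hypothetical helper for pages/i/[id]/index.js:
// normalise the `section` query value into a list of section names.
function sectionList(section) {
    if (section === undefined || section === '') {
        return [];
    }

    const parts = Array.isArray(section) ? section : String(section).split('/');

    return parts.filter(Boolean);
}
```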

Environments

This is a minor thing, but you should know that Next.js recognises only three NODE_ENV values (development, production and test). Setting NODE_ENV to anything else triggers a warning, and the value won’t be passed correctly to the bundles built by Webpack. We have more environments than that; we use development, feature, staging, beta and production. Why so many? Well, we are pretty cautious, since any mistake we make quickly reaches many people. Changes often need approval from UX designers or the business side, so they may wait for review as a review app in the feature environment. To handle this number of environments, we had to switch from NODE_ENV to a new custom environment variable that is only responsible for loading the proper configuration. NODE_ENV wasn’t a good fit for that use case either way, but it’s good to remember that Next.js doesn’t get along with custom NODE_ENV values.
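A minimal sketch of that separation is shown below. The variable name APP_ENV, the config keys and the URLs are all placeholders, not our real setup; the point is only that NODE_ENV stays at one of the values Next.js expects while a separate variable selects the configuration.

```javascript
// One config object per deployment environment (values are placeholders).
const CONFIGS = {
    development: { apiUrl: 'http://localhost:3001' },
    feature: { apiUrl: 'https://api.feature.example.com' },
    staging: { apiUrl: 'https://api.staging.example.com' },
    beta: { apiUrl: 'https://api.beta.example.com' },
    production: { apiUrl: 'https://api.example.com' },
};

// APP_ENV is a hypothetical custom variable; NODE_ENV is left untouched.
function loadConfig(env = process.env.APP_ENV) {
    const name = env ?? 'development';

    if (!Object.prototype.hasOwnProperty.call(CONFIGS, name)) {
        throw new Error(`Unknown APP_ENV "${name}"`);
    }

    return { name, ...CONFIGS[name] };
}

const betaConfig = loadConfig('beta'); // { name: 'beta', apiUrl: 'https://api.beta.example.com' }
```

Failing loudly on an unknown value is a deliberate choice here: a typo in a deployment variable should break the start-up rather than silently load the wrong configuration.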

Hosting

Next.js may be open source, but there is a company behind it: Vercel. Vercel builds a hosting solution specially designed for Next.js. Due to Schibsted’s policies, we couldn’t use Vercel. We landed on AWS with a self-hosted solution, as there were no managed platforms available to us when we were starting, so keep in mind that I may be missing some context because of that. As far as I understand, though, everything is much easier if you use Vercel for hosting Next.js apps.

You can self-host Next.js quite easily, and there are many possible solutions to achieve that. Still, it is being heavily optimised to work with the Vercel platform, which sometimes can negatively impact how it works with your custom-made solution. I get that this is their platform, and that’s where they are making money, but there were instances where updating Next.js broke our apps somehow. At this point, we always thoroughly test the whole application each time we update the Next.js version, even if the changelog suggests no breaking changes.

One breaking change that wasn’t obvious to us was Next.js silently switching to streaming responses in version 13. It wasn’t apparent because the site worked just fine, but some time after promoting it to production, we noticed a steady increase in transferred bytes. Further investigation revealed that while streaming was meant to be added to the app directory (beta), it was also switched on for the pages directory. Streamed responses lack a Content-Length header, and without that header CloudFront skips its compression step. Without gzip compression, responses were larger, which meant more transfer for us and bigger downloads for users. Vercel’s hosting is likely not set up this way, so issues like this can slip through; this one had a relatively low priority and took around four months to fix. Unfortunately for us, Next.js is pushing its own reinvented caching solution that doesn’t account for a reverse proxy in front. You can argue that since it is an open-source project, I could have just opened a PR with a fix, and I did try to fix it myself. Unfortunately, there is quite a lot going on in that codebase, and changing something, especially something that low-level, can be challenging when you don’t know Next.js from its developers’ perspective. If you want to read more on this issue, check these links:

https://github.com/vercel/next.js/discussions/39546

https://github.com/vercel/next.js/discussions/38606

Another interesting point to consider if you are self-hosting the Next.js app is what happens with all of the Next.js features running on Edge Runtime. At the time of writing this article, there are two features mentioned using Edge Runtime: Middleware and Edge API Routes. If you are self-hosting Next.js, it is likely that your middleware is not running on the Edge; there may not even be an Edge where your app is running. I don’t think documentation provides a sufficient warning about this potential difference.

Next images

The Next.js image component is excellent. It allows us to leverage all of the best practices easily, but migrating our codebase to next/image was quite a challenge. If you are considering migrating to Next.js, this should be one of the last steps you take. I suggest moving to it only once your app is live and stable on Next.js.

Our hardships with image components are due to our “non-optimal” setup when it comes to images. We get all of the image URLs from the backend response. Those URLs have a signature query parameter.

We can’t generate those signed URLs on the client side. We tried implementing a proxy for that, but then we learned that removing them from the backend response is an even bigger challenge, so we accepted defeat and had to bend the image component to work with pre-generated URLs. We ended up with a wrapper around the Next.js image component that takes an additional urls prop and creates a custom image loader for every image to inject those URLs.

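import { useCallback } from 'react';
import NextImage from 'next/image';

// `urls` is the list of pre-signed image URLs from the backend ({ width, url }),
// assumed to be sorted by ascending width.
// `imageId` is assumed to be passed as a prop; it only makes the placeholder src unique.
function YAMSImage({ urls, imageId, ...props }) {
    const yamsLoader = useCallback(
        ({ width }) => {
            // Pick the pre-signed URL whose width matches the one Next.js asks for.
            const perfectFit = urls?.find((image) => image.width === width);

            if (perfectFit) {
                return perfectFit.url;
            }

            // Should never happen if Next.js is configured correctly
            const largestImage = urls?.[urls.length - 1];

            return largestImage?.url;
        },
        [urls]
    );

    // The loader ignores src, but next/image still requires one.
    return <NextImage layout="responsive" {...props} loader={yamsLoader} src={`yams${imageId}`} />;
}

export default YAMSImage;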

Once we configure Device Sizes and Image Sizes to the sizes we get in those url arrays, we should always end up with a perfect fit for the width given in the loader function.
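As an illustration, the configuration side could look like the sketch below. The images.deviceSizes and images.imageSizes options are the real Next.js settings; the widths and the example URLs are placeholders, and the small check at the end is a hypothetical sanity test that every width Next.js may request has a matching pre-generated URL.

```javascript
const nextConfig = {
    images: {
        // Placeholder widths: these must mirror the widths the backend generates.
        deviceSizes: [640, 750, 1080, 1200, 1920],
        imageSizes: [64, 128, 256],
    },
};

// In next.config.js: module.exports = nextConfig;

// Widths the loader can be called with, given the config above.
const requestableWidths = [...nextConfig.images.imageSizes, ...nextConfig.images.deviceSizes];

// Example backend response for one image (URLs and signatures are made up).
const urls = requestableWidths.map((width) => ({
    width,
    url: `https://images.example.com/photo-${width}.jpg?signature=abc${width}`,
}));

// Every requestable width should have a perfect fit in the backend response.
const missingWidths = requestableWidths.filter(
    (width) => !urls.some((image) => image.width === width)
);
```

If missingWidths is ever non-empty, the loader falls back to the largest image, which works but wastes bandwidth on small screens.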

Conclusion

We encountered a lot of other, more minor idiosyncrasies of Next.js; at this point I don’t even remember how many times I ended up debugging internal Next.js code. I still believe Next.js is a fantastic project with a large and dedicated community. I can’t say I agree with all of the design decisions they made, but so far, there have been no blockers for us besides occasional bugs. One thing that takes Next.js to another level for us is how developer-friendly it is. No other project this large that I have used was so easy to work with daily. Their documentation is quite clear on how to work with it and very user-friendly (I wouldn’t mind some deep dives into how it works under the hood, but you don’t need that to work with Next.js).

What we are waiting for

Until recently, we were still stuck on version 12 because of the issue described in the Hosting section, so one of the most awaited updates for us is moving to the current version of Next.js 😅. Aside from that, the new Metadata API introduced in version 13.2 looks promising. I haven’t played with it yet, so I can’t say more.