Company-wide Spring 5 migration head on

Written by Nathan Kooij | Jan 27, 2020 13:14 | 11 min read

We at Picnic have always been at the frontier of new and exciting technologies. One such technology we adopted is reactive programming. While we were already using it in our mobile applications, we also began to discover and exploit the opportunities it held for backend programming. We utilized RxJava 2’s simplified approach to concurrency to compute our supply chain’s purchase orders in parallel. We wrote operational dashboards where all data sources and transformation pipelines were connected by reactive streams. We connected our application’s edges all the way to our databases, leveraging MongoDB’s reactive driver and Spring’s async processing support. It followed logically that back in early 2018 we dove headfirst into early releases of Spring 5 to explore its new framework: WebFlux. Since then, we have been running WebFlux in production for close to two years.

Although more and more of our services started to adopt WebFlux, our older WebMVC applications lagged behind on Spring 4. A sense of urgency slowly crept up: Spring 4 would reach its end-of-life in December 2020. Earlier this year, we therefore started the process of migrating all of our services to the latest version of Spring.

[Image: How WebFlux compares to WebMVC]

Migration challenges

The migration posed many challenges. A new major version can contain an arbitrary number of breaking changes, which can require changes on your end. Moreover, we were pulling in new libraries in addition to two years’ worth of changes from Spring, which at times made narrowing down the cause of an error an arduous task.

Our apps are relatively similar: a developer can easily switch between teams, assuming they’re familiar with the new application domain. However, the application differences that did exist made for a complex migration landscape:

Some applications rely on sidecar containers for cross-cutting concerns such as security, others use built-in mechanisms;
Some builds produce a WAR that is deployed on Jetty, others produce a completely self-contained uber JAR;
Some applications use WebFlux; others use WebMVC;
We centrally provide a sane base to build an application with, but developers are free to apply their own configuration on top;
Some builds take a significant amount of time.

Early stages of the migration

Before we discuss how we actually tackled the migration, let’s first zoom out and look at how we initially adopted WebFlux within the company. We used a greenfield project as an incubator for all kinds of libraries related to WebFlux and Spring 5. These libraries were built in addition to the Platform Support Modules (PSM), a collection of existing libraries that we use to run our applications at Picnic. While these incubating libraries helped us greatly in understanding the changes that Spring 5 had brought, they also split our support libraries in two. We had, on the one hand, Spring 5 libraries (comprising WebFlux and reactive code) in a separate project, and on the other hand, the existing Spring 4 infrastructure in PSM. Rewriting all of our existing projects to use WebFlux was not an option simply due to the sheer size of the codebase.

Figuring out what to do

Next, we had to decide on our plan of attack. As with any large-scale project, scoping is an essential component. Because of the aforementioned incubator project, we had a fairly good view of what was required to make our existing Spring 4 applications run on Spring 5.

Rather than upgrading a specific application, we first turned our attention towards PSM, as this provides the basis for any Picnic application. We asked ourselves the question: “what would be the smallest changeset that would allow us to just update PSM to Spring 5?”

Thanks to our existing Spring 5 experience, many tickets already existed, ranging from solving deprecations to migrating the incubating modules to PSM. Solving these would help us unify the split sets of support libraries. However, this was complicated by the fact that the tickets were unorganized, spread over different JIRA projects, and created over a time span of two years. So not only did we collect and sift through those tickets, but we also included any changes pointed out by Spring’s own migration guides, and those changes that “naturally followed” from just trying to bump the versions and seeing what problems popped up. Basically, poke the bear and see what happens. Combining these issues, the migration guides, and a filtered set of existing tickets, we established the required changes that would allow us to (hopefully) use Spring 5 everywhere.

Getting the job done

Thanks to our scoping process, we could anticipate many of the changes required. Some of the errors, however, we could not anticipate, despite our automated test suites. As such, we spent weeks going through what was dubbed the Great Migration Cycle: change PSM, apply it to a downstream application, debug whatever broke, and repeat.

Most of these errors were encountered when we tried to apply the latest version of PSM to a downstream application, as we kept running branches for most applications. Subsequently, the debugger became our best friend as we explored the inner workings of the Java compiler itself, Spring, Jetty, and other libraries. About a quarter of the way in, we realized that remembering and understanding all the changes we were about to drop on other developers was just as complicated. As such, we started maintaining our own migration guide. Not only did this help others understand the changes that were coming, but it also gave us an overview for navigating this complex landscape ourselves. Next, we want to highlight some additional aspects of the migration and what we learned from it.

Learnings

It’s a bug’s life

There will be errors. The errors we encountered were varied: we had compiler errors, application server errors, framework errors, library errors and errors between the chair and the keyboard. Most of these will require you to dig deep, so be prepared to learn the tools of the trade. For instance, I learned how to debug the Java compiler. Moreover, perusing a changelog often provided great insights into resolving issues, although this might not be an option depending on the changelog’s size. Our automated test suite, and especially some of our integration tests, helped us catch these kinds of complex errors early. We had also kept most of our dependencies up-to-date, which helped minimize the impact of this challenge.

Deprecations and incompatible libraries

Some deprecations or incompatible libraries had a clear path forward: simply use this new function or class. Others were more complicated and required big refactors. If possible, identify these early on, as it helps keep scope creep at bay. For example, we knew early about an issue with a documentation generator (first identified as an issue for our already running WebFlux apps), and about a deprecated, commonly used REST client (identified as an issue from the Spring migration guide). Moreover, some of these changes can be handled as part of a preparation effort for the migration. This lowers the complexity of the actual migration by reducing the number of moving parts to keep in mind. In some cases we also opted to defer resolving a deprecation, because a) there were typically unsolved known issues with the new solution, b) it simplified the migration significantly, and c) it required much less effort in downstream projects. Allowing for an “imperfect” migration helped us immensely during the scoping and implementation process.
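To make the idea of a deferred deprecation a bit more concrete, here is a minimal sketch of one way to keep such a deferral contained. It is not taken from our codebase: `LegacyClient`, `StatusClient`, `LegacyStatusClient` and `DeferredDeprecationSketch` are hypothetical names, and `LegacyClient` merely stands in for whatever deprecated library class you decide to keep for now. The sketch assumes that application code talks to a small interface, so that only a single adapter touches the deprecated API.

```java
// Hypothetical example: all class and method names are made up for illustration.
final class DeferredDeprecationSketch {
    // Application code only knows the StatusClient interface.
    static String describe(StatusClient client, String path) {
        return "status=" + client.fetch(path);
    }

    public static void main(String[] args) {
        System.out.println(describe(new LegacyStatusClient(), "/orders"));
    }
}

// Small interface that shields application code from the deprecated class.
interface StatusClient {
    String fetch(String path);
}

// Stand-in for a deprecated library class (assumption, not a real API).
final class LegacyClient {
    /** @deprecated kept only to illustrate a deferred deprecation. */
    @Deprecated
    String get(String path) {
        return "legacy:" + path;
    }
}

// The only place that calls the deprecated method. The warning is suppressed
// here, so a later migration only needs to replace this one adapter.
final class LegacyStatusClient implements StatusClient {
    private final LegacyClient delegate = new LegacyClient();

    @Override
    @SuppressWarnings("deprecation")
    public String fetch(String path) {
        return delegate.get(path);
    }
}
```

With a setup along these lines, resolving the deprecation later comes down to writing a second `StatusClient` implementation and swapping it in, without touching the callers.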

Roll-out

Since we had to validate most of our work by applying the changes to downstream projects (as part of the migration cycle), we had already established upgrade branches in most application repositories. We then asked each team to review the pull request and verify the application behaviour in our dedicated test environment. As each team is responsible for its own release cycle, the rollout to production was gradual, which allowed us to observe any potential errors early. This process worked reasonably well for us and made the actual migration in production quite smooth. However, this approach will not scale as the company grows. In the future, we will have to defer more work to the individual teams, and improve our own testing setup to capture more of the characteristics of Picnic applications in a controlled environment.

Conclusion

The primary goal of the migration was to go to Spring 5. The road was bumpy and arduous at times, but persistence and effort paid off, resulting in smooth application transitions. Moreover, this migration has also helped us unify our setup further: we now produce an uber JAR for each build and are in the process of applying sidecars universally.

Key takeaways:

Be prepared to have to dive “deep”, and don’t feel bad about asking for “expert” help.
Narrow the scope of the migration as much as possible. A larger scope not only slows down the actual process (too many inputs), but also complicates verification.
Keep your dependencies up-to-date.
Plan (a lot of) time for the unexpected. There will be issues.
Accept that you can’t fix all issues in one go, prioritize, and just do it (starting helps to identify more pressing issues!).
Have individual teams help verify the application.
Make sure to test different downstream projects; each will have its own quirks.