From Machine Learning to black box testing of Java services, Picnic uses Python throughout its tech stack.
Picnic has grown to serve hundreds of thousands of customers weekly using Python as one of the core technologies co-existing next to Java. But why did we opt for Python? How can hundreds of engineers and analysts contribute across our codebases while maintaining similar style and quality standards? And how do new Python developments influence Picnic’s technical direction? Let’s answer all these questions and start with the first one:
How it started
When Picnic launched in 2015, Java was the main backend language. A few years into our journey, more and more use cases became apparent for which Python is better suited. The first three use cases identified were around:
- building scalable data products.
- component testing the most critical parts of our software supply chain.
- enabling business analysts to write software to identify and solve operational problems end-to-end.
Python’s rich ecosystem of existing (data) packages, its gentle learning curve, and its extensibility made it an excellent choice for these use cases. Furthermore, to build the best milkman on earth, we foresaw that we’d need many colleagues to help us realize the vision. It surely helps that Python is taught at many universities, so many graduates are already familiar with it when they apply at Picnic.
How it’s going
We didn’t stop after the first three use cases. Over the past few years, hundreds of thousands of lines of Python code have been added and a wide range of new internal tools has been developed to tackle our growing business needs. For instance, on the fulfillment side, there’s a team developing the warehouse slotting & layout services. These services calculate the optimal article locations in our Fulfillment Centers on a daily basis. On the analyst side, tools like our Project Resources Manager enable analysts to request service accounts for their solution (e.g. to access Picnic backends or our Snowflake DWH), all within a single pull request.
As our internal Python ecosystem grew, so did the codebases. The flexibility of Python can become its Achilles’ heel if not managed well. To combat this, we created the Python developer platform team. This team standardized CI/CD and created shared packages applicable to almost any Python application in Picnic. One of those packages is picnic-guide, an internal wrapper around various awesome open-source tools. picnic-guide is highly opinionated. It ensures there’s not an endless debate about which tools and tool settings should be enabled for a project. This in turn enables consistency in documentation and code style across projects. Other tools it wraps, like pylint and mypy, prevent common coding errors and perform type checking, which increases the overall code quality and reliability. picnic-guide is always required on CI builds (for all Python projects!) preventing unchecked code from making it to production.
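To make the kind of error a type checker catches concrete, here is a minimal sketch. The function and article names are hypothetical and not taken from any Picnic codebase. The point is only that a checker such as mypy would be expected to flag the line described in the comment, because the looked-up value may be `None`.

```python
from typing import Optional


def find_price(prices: dict[str, float], article: str) -> Optional[float]:
    """Return the price of an article, or None when the article is unknown."""
    return prices.get(article)


def price_with_deposit(prices: dict[str, float], article: str, deposit: float) -> float:
    """Add a deposit to an article price, handling unknown articles explicitly."""
    price = find_price(prices, article)
    # A type checker is expected to reject a bare `return price + deposit` here,
    # because `price` can be None. Narrowing the type first resolves that.
    if price is None:
        raise KeyError(f"unknown article: {article}")
    return price + deposit


print(price_with_deposit({"milk": 1.25}, "milk", 0.25))
```

Without the explicit `None` check, the mistake would only surface at runtime as a `TypeError`, and only for articles missing from the mapping.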
Next to tooling required for all Python projects, internal Python packages enable new use cases, reduce repetitive work, or standardize tool usage patterns across teams. Some examples:
- Picnic-forecast-warehouse: enables easy publishing of predictions by our various machine learning models to our in-house built forecast-warehouse.
- Picnic-client: a wrapper around HTTPX (a popular alternative to Requests since it supports both sync and async calls) in which OAuth 2.0 authentication against Picnic backends is automatically handled.
- Picnic-messaging: standardizes access to RabbitMQ and enforces best practices for its usage.
As of today, we have more than 30 such cross-team packages, and new ones are regularly added.
Reproducible environments
Package and dependency management is still an often debated topic in the Python community. The flexibility that comes with the imperative nature of setuptools and setup.py can lead to complicated and hard-to-reproduce project environments. Recently, however, members of the Python Software Foundation (PSF) and the community joined forces and are working towards a more standardized and seamless experience for end users. Python Enhancement Proposals (PEPs) that standardize the interface for build backends (PEP 517), the declaration of build dependencies (PEP 518), and the declaration of project metadata and dependencies (PEP 621) help transition the ecosystem towards a better out-of-the-box experience.
Over the years, we found various issues with Picnic’s Python environment tooling. Initially, we settled on Pipenv, and it served us well during the first few years. Unfortunately, we also encountered some challenges, ranging from slow dependency resolution to non-standardized dependency declarations and vague installation error messages. In the fall of 2023, we re-evaluated our options on how to provide the best setup for all our Python users. We considered a large range of other tools, like Conda, PDM, pip+pip-tools, and Rye. Another exciting recent project in this space is the uv package manager, which aims to be the ‘Cargo for Python’. While some of these tools showed great promise, we have hundreds of internal users with sometimes very specific requirements, so we need a stable and proven solution. That’s why we chose Poetry. It uses the standardized pyproject.toml file for dependency declaration, it’s considerably faster than Pipenv and, in our experience, it resolves dependencies more robustly.
Compared to the old setup, cross-platform lock files are another trait we prioritized, since we have developers on all major platforms (macOS, Linux, and Windows) and across two instruction sets (x86 and ARM64). Poetry does not fulfill all our wishes, e.g. it’s not fully compatible with PEP 621 (yet) and needs tools like pyenv alongside it to manage different Python versions. But for us, it’s a big step in the right direction.
Open-source and Community
The popularity of Python is in no small part thanks to the community. Non-profits like the Python Software Foundation (PSF), NumFOCUS and the EuroPython association provide tools and platforms to advance the Python ecosystem. Picnic contributes to this global community by open-sourcing our internal diepvries package, organizing Python & Machine Learning meetups, and sponsoring conferences like EuroPython. Internally, we organize Python and Data Science guild sessions and offer two Python courses as part of the Picnic Tech Academy. One course is specifically designed to kick-start the careers of new analysts and engineers by getting them productive with Python quickly. It teaches them how to write maintainable Pythonic code, how to use some common first- and third-party libraries, and how to set up their virtual environments. The second course is aimed at engineers fluent in other programming languages who would like to level up their Python skills. Apart from briefly covering the points of the other course, it explores parts of the Python execution model, like the GIL and the Python garbage collector, and how to write asynchronous code using asyncio.
Future
Python Release Cycle
New Python versions are released according to a fixed schedule. From Python 3.13, the schedule is as follows: every year, a new Python version (3.x) is released. For the following two years, that version receives full support and bug fixes. For the three years after that, it receives security fixes only, after which it’s considered end-of-life. While it becomes easier every year to migrate to a new Python version, we have found that a delay of roughly 6 months between the public Python release and recommending it for internal development works best for us. It gives a good trade-off between leveraging the latest features and preventing unnecessary problems when some popular packages are not immediately compatible. We encourage internal users to upgrade by only supporting recent Python versions. The set of versions we support follows an internal Python release cycle. At the moment, this means that we’re gearing up to support Python 3.12 while also supporting 3.11, 3.10, and 3.9.
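The arithmetic behind this schedule can be sketched in a few lines. The helper names below are hypothetical, and the results are approximations based on the description above (two years of full support, five years in total, and an internal delay of roughly six months). Actual dates are set by the Python release managers. Python 3.13.0, released on 7 October 2024, serves as the example.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def support_windows(release: date, internal_delay_months: int = 6) -> dict[str, date]:
    """Approximate lifecycle milestones for a version released on `release`."""
    return {
        "internal_recommendation": add_months(release, internal_delay_months),
        "full_support_ends": add_months(release, 24),  # two years of bug fixes
        "end_of_life": add_months(release, 60),        # plus three years of security fixes
    }


for milestone, day in support_windows(date(2024, 10, 7)).items():
    print(f"{milestone}: {day.isoformat()}")
```

Under these assumptions, a version released in October would be recommended internally around April of the following year.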
From standardized Python development to standardized development for all
Picnic’s Python ecosystem has matured over the past few years with standardized CI/CD, linting, package management, and testing procedures. There are still some technical differences between Java and Python, e.g. in how we do CI/CD, how we configure Renovate (which automatically updates our dependencies), and how we have set up alerting and monitoring. Another difference is the way we do formatting and linting. For Python, we use the aforementioned picnic-guide, and since it contains tools that are also relevant to Java teams, we’re going to deploy it to those teams as well. Our mission for the coming year is to take the best of both worlds and align them into a single set of tools, so that any engineer or analyst can own the implementation end-to-end, regardless of the chosen technology.
Faster Python
Some of the most exciting developments in the overall ecosystem are the community’s focus on making the base interpreter faster, easier multithreading using sub-interpreters, and the growing use of Rust in popular packages like polars, pydantic and ruff. When the base interpreter becomes faster (up to 5x compared to 3.9!), all our services run faster simply by upgrading. Better parallelism (PEP 703, which makes the GIL optional), without having to rely on multiprocessing and (de-)serializing objects back and forth, will make code for parallelizable problems more ergonomic to write and less resource-hungry to run. Finally, using Rust instead of C for high-performance packages makes writing efficient and memory-safe logic more accessible. Combined, these improvements pave the way for a new era of Python programming that is faster, more scalable, and more efficient. Perfect for what we need at Picnic.
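To illustrate what better parallelism could mean in practice, here is a small sketch of a CPU-bound task spread over threads. The workload is a toy example and not Picnic code. On interpreters with the GIL enabled, these threads take turns rather than running in parallel. On a free-threaded build (PEP 703), the same unchanged code could use multiple cores, without the pickling overhead of multiprocessing. No timings are claimed here.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit: int) -> int:
    """CPU-bound toy workload: count primes below `limit` by trial division."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % divisor for divisor in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def gil_enabled() -> bool:
    """Report whether the GIL is active; assume it is on interpreters before 3.13."""
    check = getattr(sys, "_is_gil_enabled", None)  # only present from Python 3.13
    return True if check is None else check()


limits = [20_000, 20_000, 20_000, 20_000]

# Threads share memory, so arguments and results are passed without serialization.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(count_primes, limits))

print(f"GIL enabled: {gil_enabled()}, primes found per task: {results}")
```

The design point is that the code stays the same either way. Whether it actually runs in parallel depends on the interpreter build, not on the application.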
Takeaways
- Python’s gentle learning curve and rich ecosystem allow hundreds of engineers and analysts to develop a broad array of solutions, extending well beyond just Machine Learning and Data products.
- Consistent formatting, linting and testing help manage Python’s flexibility at scale. Our internal picnic-guide library enforces these standards for all Python projects.
- Picnic contributes back to the Python community by sponsoring Python events, hosting meetups and releasing the open-source data vault library diepvries.
- By having an internal Python release cycle with a roughly 6-month delay compared to the public release, we get access to the latest features without losing time on temporary incompatibilities.
- Python’s ecosystem enhancements and speed-ups contribute to Picnic’s journey to become the best milkman on earth.
And now for something completely different
Well, almost ;). Did you know that there’s an actual Python bridge in Amsterdam? How cool is that?! Another reason to join us in Amsterdam and check it out for yourself.
Calling all Pythonistas; we have many open roles in which we require your Python skills 🐍: