Pablo Galindo, a Bloomberg software engineer and Python core developer, was elected in December 2020 as one of the five members of the 2021 Python Steering Council, which is charged with maintaining the quality and stability of the Python language and CPython interpreter.
At Bloomberg, he is part of the Python Infrastructure team, helping to drive the use of the language by roughly a third of the company’s 6,000 software engineers. In this brief interview, he opened up about Python’s use at Bloomberg, the language’s evolution, and its developer community.
Over the past few years, it seems that Python has gained a significant amount of prominence within the data-science community. We’ve talked to some developers and data scientists about it, and they suggest that the language’s usage in data science stems from its general ubiquity—since so many people already know it, the data-science community has shifted to using it over languages such as R. Is this true, in your opinion? Why has Bloomberg so aggressively embraced the language in a data-science context?
The ubiquity of Python in the data science and scientific environment is certainly an important factor in its growth, but I think there are two other important elements that have played a strategic and fundamental role: its syntax and its interoperability with native code.
First, Python syntax doesn’t normally get in one’s way, which makes Python code very easy to read and understand. This is one of the reasons scientists and others in research communities choose to use Python over other languages. These communities see Python as a tool and they are focused on their research and getting work done, so it is quite important that the language add the least amount of overhead possible.
The second reason is the ability to extend Python by binding it against native code. There is a very important ecosystem of scientific libraries written in compiled languages like C, C++, or Fortran, and it is tremendously beneficial to be able to leverage that existing work from Python. Python is not as fast as these languages, and scientific computations normally need to be fast, so being able to bind these existing code bases to Python delivers the best of both worlds. Being able to leverage existing, battle-tested native code instead of developing an equivalent version from scratch is almost as important as the performance factor.
While many other languages allow these kinds of bindings, Python exposes a particularly large surface area for this purpose. This has resulted in a very vibrant ecosystem of native extensions of Python for scientific purposes, such as NumPy and pandas.
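As a rough illustration of what binding Python against native code can look like, the sketch below uses the standard library's `ctypes` module to call the C library's `strlen` function. It is only one of several binding approaches and is not how NumPy or pandas are built. It assumes a POSIX system (Linux or macOS) where the C library can be located, and the wrapper name `native_strlen` is made up for this example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library("c") returns the C library's name;
# if it returns None, CDLL(None) falls back to the symbols already loaded
# into the current process, which include the C library on Linux and macOS.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature, size_t strlen(const char *), so that ctypes
# converts the argument and the return value correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def native_strlen(data: bytes) -> int:
    """Hypothetical wrapper: length of a byte string, computed by C code."""
    return libc.strlen(data)
```

Calling `native_strlen(b"hello")` runs the compiled C function and hands the result back as an ordinary Python integer. Scientific libraries apply the same idea at a much larger scale.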
What are the biggest challenges facing Python at the moment from a features/capability perspective? How is the developer community working to make sure those features are implemented/shortfalls eliminated?
There are several important challenges. Of course, these change depending on the set of users you ask. From my perspective, the biggest challenges are improving Python’s performance and multi-core story, bringing Python to new platforms (like mobile and the browser), and ensuring a good balance in the evolution of the language’s syntax. Let me explain each of these a bit.
Just like other interpreted languages, Python cannot fully leverage multiple cores at the same time. While workarounds already exist, it would be a big improvement if we could solve this natively. Unfortunately, it is a very hard problem that involves quite a lot of thinking, planning and experimenting.
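One commonly used workaround of this kind is to spread CPU-bound work across several processes instead of threads, so that each process runs its own interpreter on its own core. The sketch below shows the idea with the standard library's `concurrent.futures.ProcessPoolExecutor`. The function names and the prime-counting workload are invented for this illustration, and the interview does not say which workarounds he had in mind.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    """CPU-bound toy workload: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_in_parallel(limits, mp_context=None):
    """Run count_primes for each limit in a separate worker process.

    `mp_context` is an optional multiprocessing context (fork, spawn, ...);
    None uses the platform default.
    """
    with ProcessPoolExecutor(mp_context=mp_context) as pool:
        # pool.map returns the results in the same order as the inputs.
        return list(pool.map(count_primes, limits))
```

In a script, a call such as `count_primes_in_parallel([10_000, 20_000])` should sit under an `if __name__ == "__main__":` guard, because some platforms start worker processes by re-importing the main module. The cost of this approach is that arguments and results must be copied between processes, which is part of why a native solution would be an improvement.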
While Python is ubiquitous in the back-end and data science worlds, it is almost nonexistent in some big areas, such as the browser or mobile platforms. While we don’t expect Python to dethrone the big players in those areas, it would be a very good thing for Python to stay competitive there, and it would open up a world of opportunities if Python had more presence and compatibility on those platforms.
Finally, while Python is famous for its clean and concise syntax, that syntax has evolved significantly in recent years, and there is some tension around this between different user groups in the Python ecosystem. Some users welcome new syntax changes that allow them to be more expressive and to solve problems that are mainly present in their fields, while others prefer Python to maintain a clean syntax that is as easy to learn as possible, without complicated constructs or sudden changes. Balancing this tension is one of the most important challenges facing the Python community, as it is critical for ensuring everyone remains comfortable coding in Python.
Python is developed out in the open, primarily by volunteers. As a result, any problems must be resolved through a partnership between the community, the core development team, and the Python Steering Council. Sometimes this makes things a bit more challenging because there isn’t a coordinated effort to approach some of these challenges. On the other hand, many members of the core dev team and the Python community at large are interested in these problems and are active in investigating different solutions. Some of these problems require a lot of investigation and slow consideration, while others just require a considerable amount of work to be done.
The fact that the development is done in the open, primarily by volunteers, creates some challenges, but also allows anyone to give their input and participate. This often results in much better solutions than those that would be reached by a very controlling set of developers, all of whom have similar backgrounds.
The last piece of the puzzle is the Python Steering Council, which is responsible for maintaining the quality and stability of the language, as well as the CPython interpreter. One of the fundamental tasks of the Steering Council is to select precisely which changes are accepted in the language and which ones are not. They ultimately maintain and preserve this delicate balance related to many aspects of Python, including its syntax.
What specifically about the Python developer experience could be improved? Is the current state of the developer experience actually slowing the language's adoption?
Several things can be improved in the Python developer experience. One fundamental thing that I think would be a great step forward is improving the quality of error messages in general, and syntax errors in particular. While this is especially important for people learning Python, it will also be helpful for experienced developers. Sometimes, when you execute some invalid Python code, the error messages you get are difficult to comprehend. In turn, this forces the developer to spend a considerable amount of time trying to understand what’s going on and how to fix it. We can do better.
In the next version of the interpreter (Python 3.10) coming later this year, we have included quite a number of improvements in this area and are quite excited for many others we have planned. I hope that this will improve a lot in the future.
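To see what the interpreter reports for a given piece of invalid code, one option is to compile the source and inspect the resulting `SyntaxError`. The helper below is a minimal sketch of that approach. Its name, `describe_syntax_error`, is made up for this example, and the message text it returns depends on the Python version in use.

```python
from typing import Optional


def describe_syntax_error(source: str) -> Optional[str]:
    """Return a one-line description of the first syntax error, or None if valid."""
    try:
        # compile() parses the source without executing it.
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        # lineno, offset and msg are standard attributes of SyntaxError.
        return f"line {exc.lineno}, column {exc.offset}: {exc.msg}"
    return None
```

For a snippet with an unclosed parenthesis, such as `describe_syntax_error("x = (1, 2")`, Python 3.10 and later report that the opening parenthesis was never closed, while older versions give a less specific message.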
Another aspect that could improve the developer experience is a standard and robust method for packaging and deploying hermetic Python applications. Other areas where I can envision improvements are a better debugging experience for native (C/C++) extension modules and for multi-threaded applications. While all of these are hard problems, I am confident that we will see improvements in the years to come.