Why Python Is So Slow (And What Is Being Done About It)
(thenewstack.io)
If everyone had a magic lamp that told them whether performance was going to be an issue when they started a project then maybe it wouldn't matter. But in my experience people start Python projects with "performance doesn't matter", write 100k lines of code and then ask "ok it's too slow now, what do we do". To which the answer is "you fucked up, you shouldn't have used Python".
No, it's usually "microservices" or "better queries" or something like that. Python performance shouldn't be an issue in a well-architected application. Source: I work on a project with hundreds of thousands of lines of Python code.
Well yeah if by "well architected" you mean "doesn't use Python".
Not everything is a web service. Most of the slow Python code I encounter is doing real work.
We also do "real work," and that uses libraries that use C(++) under the hood, like scipy, numpy, and tensorflow. We do simulations of seismic waves, particle physics simulations, etc. Most of our app is business logic in a webapp, but there's heavy lifting as well. All of "our" code is Python. I even pitched using Rust for a project, but we were able to get the Python code "fast enough" with numba.
We separate expensive logic that can take longer into background tasks from requests that need to finish quickly. We auto-scale horizontally as needed so everything remains responsive.
That's what I mean by "architected well," everything stays responsive and we just increase our hosting costs instead of development costs. If we need to, we could always rewrite parts in a faster language, provided that costs less than the development costs. We really don't spend much time at all optimizing python code, so we're not at that point yet.
That being said, I do appreciate faster-running code. I use Rust for most of my personal projects, but that's because I don't have to pay a team to maintain my projects.
Matrix code is the very best case for offloading work from Python to something else though.
Think about something like a build system (e.g. scons) or a package installer (pip). There is no part of them that you can point to and say "that's the slow bit, write it in C" because the slowness is distributed through the entire thing.
Both of those are largely bound by i/o, but with some processing in between, so the best way to speed things up is probably an async i/o loop that feeds a worker pool. In Python, you'd use processes, which can be expensive and a little complicated, but workable.
And as you pointed out, scons and pip exist, and they're fast enough. I actually use poetry, and it's completely fine.
You could go all out and build something like cargo, but it's the architecture decisions that matter most in something i/o bound like that.
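For anyone wondering what "an async i/o loop that feeds a worker pool" might look like, here is a minimal sketch using only the standard library. The names `fetch`, `crunch`, `handle`, and `run_all` are hypothetical, the sleep stands in for real disk or network i/o, and the hash stands in for whatever processing happens in between. It is not how scons or pip are actually built.

```python
import asyncio
import concurrent.futures as cf
import hashlib


def crunch(data: bytes) -> str:
    """CPU-bound step. Hypothetical stand-in for parsing, resolving, etc."""
    return hashlib.sha256(data).hexdigest()


async def fetch(name: str) -> bytes:
    """I/O-bound step. Hypothetical: a sleep stands in for disk or network."""
    await asyncio.sleep(0.01)
    return name.encode()


async def handle(name: str, pool: cf.Executor) -> tuple[str, str]:
    data = await fetch(name)  # the event loop stays free during i/o
    loop = asyncio.get_running_loop()
    # Hand the CPU-bound part to the pool so it does not block the loop.
    return name, await loop.run_in_executor(pool, crunch, data)


async def run_all(names: list[str], pool: cf.Executor) -> dict[str, str]:
    """Overlap all fetches and feed results to the pool as they arrive."""
    return dict(await asyncio.gather(*(handle(n, pool) for n in names)))
```

With real processes this would be driven by something like `asyncio.run(run_all(names, pool))` inside a `with cf.ProcessPoolExecutor() as pool:` block, under an `if __name__ == "__main__":` guard so the workers can import the module safely. A `cf.ThreadPoolExecutor` drops in the same way and is enough when the processing step releases the GIL.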
Strong disagree. I switched from pip to uv and it sped my install time up from 58 seconds to 7. Yeah really. If pip is i/o bound where is all that speed up coming from?
That's pretty impressive! We have a bunch of compiled stuff (numpy, tensorflow, etc), so I'm guessing we wouldn't see as dramatic of an improvement.
Then again, <1 min is "good enough" for me, certainly good enough to not warrant a rewrite. But I'll have to try uv out, maybe we'll switch to it. We switched from requirements.txt -> pyproject.toml using poetry, so maybe it's worth trying out the improved pyproject.toml support. Our microservices each take ~30s to install (I think w/o cache?), which isn't terrible and it's a relatively insignificant part of our build pipelines, but rebuilding everything from scratch when we upgrade Python is a pain.
Yeah I was very impressed. The only problem with uv and third party tools in general is that the main reason we're using Python is because my boss didn't want people to have to install extra stuff to use it. I would prefer using Deno, but apparently a one-line rock solid install command is too much to ask compared to the mess of Python infra... smh.
Well, I'm kind of the boss, but I inherited the Python codebase. The original reasoning was it's easier to hire/on-board people, which I think is largely true.
If it was up to me, I'd rewrite a bunch of our code to Rust. I use it for personal projects already, so I know the ecosystem. But that's a tough sell to the product team, so it's probably not happening anytime soon. I'd also have to retrain everyone, which doesn't sound fun...
However, any change I make needs to work smoothly for our devs, and we have a few teams across 3 regions. So it needs clear advantages and whatnot to go through the pain of addressing everyone's concerns.
Sounds like you're not the boss enough!
I agree Rust has a pretty steep learning curve so it's definitely reasonable to worry about people learning it, especially existing employees. Though I don't really buy the "easier to hire people" argument. There are plenty of Rust developers actively looking for Rust jobs, so I suspect you get fewer candidates but the ones you do get are higher quality and more interested.
But anyway I don't think that argument holds for Deno. Typescript is in the same difficulty league as Python. Anyone that knows Python should be able to transition easily.
Yup, I guess not. But if I was on the product team, the customers and director are the bosses. And on it goes up to the CEO, where the board and shareholders are the boss.
If I can justify the change, we'll do it. That's close enough for me. And I did do a POC w/ Rust and could've switched one service over, but I campaigned against myself since we got good enough perf w/ Python (numpy + numba) and I was the only one who wanted it. That has changed, so I might try again with another service (prob our gateway, we have 3 and they all kinda suck).
I'll have to check out Deno again. I remember looking at it (or something like it) a couple years ago when it was first announced on Reddit.
Yeah I have yet to really use Deno in anger because so many people are like "but Python exists!" and unsurprisingly we now find ourselves with a mess of virtual environments and pip nonsense that has literally cost me weeks of my life.
Though if you're using Numpy that sounds like "proper work", not the infrastructure scripting we use Python for, so I probably would go with Rust over Deno. I don't know of mature linear algebra libraries for Typescript (though I also haven't looked).
IMO probably the biggest benefit of Rust over most languages is the lower number of bugs and reduced debugging time due to the "if it compiles it probably works" thing.
You don't have to convince me that Rust rocks. I just need to convince my team that it's worth the investment in terms of time to onboard everyone, time to port our application, and risk of introducing bugs.
We have a complex mix of CRUD, math-heavy algorithms, and data transformation logic. Fortunately, each of those are largely broken up into microservices, so they can be replaced as needed. If we decide to port, we can at least do it a little at a time.
The real question is, does the team want to maintain a Python or Rust app, and since almost nobody on the team has professional experience with low-level languages and our load is pretty small (a few thousand users; b2b), Python is preferred.
That depends on what the application needs to do. There's a reason why all performance-critical libraries for Python aren't written in Python.
Sure, and we use those, like numpy, scipy, and tensorflow. Python is best when gluing libraries together, so the more you can get out of those libraries, the better.
Python isn't fast, but it's usually fast enough to shuffle data from one library to the next.
Usually, but when it isn't then you've got a bottleneck. Multithreaded performance is a major weak point if you need to do any processing that isn't handled by one of the libraries.
Then you need to break up your problem into processes. Python doesn't really do multi-threading (hopefully that changes with the GIL going away), but most things can scale reasonably well in a process pool if you manage the worker queue properly (e.g. RabbitMQ works well).
It's not as good as proper threading, but it's a lot simpler and easier to scale horizontally. You can later rewrite certain parts if hosting costs become a larger issue than dev costs.
A process pool means extra copying of data, which incurs a huge cost, and this is made worse because parallel-processing-friendly workloads often consist of large amounts of data.
Yup, which is why you should try to limit the copying by designing your parallel processing algorithm around it. If you can't, you would handle threading with a native library or something and scale vertically instead of horizontally. Or pick a different language if it's a huge part of your app.
But in a lot of cases, it's reasonable to stick with Python and scale horizontally. That has value if you're otherwise a Python shop.
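One way to "limit the copying" with plain processes is to put the data in shared memory once and send each worker only a block name and a range. The sketch below is a hedged illustration with the standard library's `multiprocessing.shared_memory`. The helpers `publish` and `sum_slice` are hypothetical names, and the flat array of doubles is an assumption about the workload.

```python
import struct
from multiprocessing import shared_memory

DOUBLE = 8  # bytes per C double ("d" in struct format strings)


def publish(values: list[float]) -> shared_memory.SharedMemory:
    """Copy the data into a named shared-memory block once, up front."""
    shm = shared_memory.SharedMemory(create=True, size=len(values) * DOUBLE)
    struct.pack_into(f"{len(values)}d", shm.buf, 0, *values)
    return shm


def sum_slice(name: str, start: int, stop: int) -> float:
    """Worker task: attach by name and read only the assigned slice."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        chunk = struct.unpack_from(f"{stop - start}d", shm.buf, start * DOUBLE)
        return sum(chunk)
    finally:
        shm.close()  # detach only; the creator is responsible for unlink()
```

Each task submitted to the pool is then just a `(shm.name, start, stop)` tuple, so only a few bytes get pickled per task instead of the whole data set. The process that called `publish` should call `shm.close()` and `shm.unlink()` once all workers are done.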
100k lines of code doesn't mean anything.
You can make 1k lines of Python bog down your shiny new PC, and 1M lines can run just fine.
Exactly. We have hundreds of thousands of lines of code that work reasonably well. I think we made the important decisions correctly, so performance issues in one area rarely impact others.
We rewrote ~1k lines of poorly running Fortran code into well-written Python code, and that worked because we got the important parts right (reduced big-O CPU from O(n^3) to O(n^2 log n) and memory from O(n^4) to O(n^3)). Runtime went from minutes to seconds on medium-size data sets, and it made large data sets possible to run (those would OOM due to O(n^4) storage in RAM). If you get the important parts right, Python is probably good enough, and you can get linear optimizations from there by moving parts to a compiled language (or use a JIT like numba). Python wasn't why we could make it fast, it's just what we prototyped with so we could focus on the architecture, and we stopped optimizing when it was fast enough.
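To put rough numbers on those complexity changes, here is a back-of-the-envelope sketch. It ignores constant factors entirely, and the base-2 logarithm, the 8-byte values, and the sample sizes are all assumptions for illustration, not figures from the project described above.

```python
import math


def cpu_ratio(n: int) -> float:
    """How many times more steps n^3 takes than n^2 * log2(n)."""
    return n**3 / (n**2 * math.log2(n))


def ram_bytes(n: int, power: int, itemsize: int = 8) -> int:
    """Bytes needed to hold n**power values of `itemsize` bytes each."""
    return itemsize * n**power


for n in (100, 1_000, 10_000):
    # size, CPU step ratio, bytes at O(n^4), bytes at O(n^3)
    print(n, round(cpu_ratio(n)), ram_bytes(n, 4), ram_bytes(n, 3))
```

Under these assumptions, n = 1,000 would need 8 TB at O(n^4) versus 8 GB at O(n^3), which is the kind of gap that turns an OOM into a run that fits in RAM. No constant-factor win from a faster language closes a gap of that size.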
Q: what do we do? A: profile and decompose. Should not be that distant as a thought
Profiling is an extremely useful tool for optimising the system that you have. It doesn't help if you have the wrong system entirely though.
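For the "profile" half of that advice, a minimal sketch with the standard library's `cProfile` and `pstats` could look like this. `slow_total` is a hypothetical hot spot and `profiled` is a hypothetical helper. As noted above, this only shows where the existing system spends its time, not whether it is the right system.

```python
import cProfile
import io
import pstats


def slow_total(n: int) -> int:
    """Hypothetical hot spot, deliberately naive."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profiled(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries only
    return result, out.getvalue()


result, report = profiled(slow_total, 100_000)
print(report)
```

Sorting by cumulative time tends to surface the call paths worth decomposing first. For a whole script, `python -m cProfile -s cumulative script.py` gives a similar report without code changes.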
That's why you need an architect to design the project for the expected requirements. They'll ask the important questions, like:
You don't need all the answers up front, but you need enough to design a coherent system. It's like building a rail system: building a commuter line is much different than a light rail network, and the planners will need to know if those systems need to interact with anything else.
If you don't do that, you're going to end up overspending in some area, and probably significantly.
Upfront analysis and design is very close to independent from the technology, particularly at the I/O level