
Python is slow, but I do not care

Source: 博客园     Time: 2017/5/1 13:01:07     Views: 1698

English original: Yes, Python is Slow, and I Don't Care

A rant about sacrificing performance in pursuit of productivity.

Let me take a break from my discussion of asyncio in Python to talk about something I have been thinking about recently: the speed of Python. For those who do not know me, I am a big fan of Python, and I use it everywhere I possibly can. One of the biggest complaints people have about Python is that it is slow. Some people almost refuse to even try Python because it is slower than other languages. Here are my thoughts on why you should try Python anyway, despite it being slow.

Speed no longer matters

It used to be that programs took a long time to run. CPUs were expensive, and so was memory. The running time of a program was a very important metric. Computers were very expensive, and so was the electricity needed to run them. Optimizing these resources followed from an eternal business rule:

Optimize your most expensive resources.

In the past, the most expensive resource was computer run time. This is why computer science came to focus on the efficiency of different algorithms. However, this is no longer true, because silicon is now cheap. Really cheap. Run time is no longer your most expensive resource. A company's most expensive resource is now its employees' time. In other words, you. It is more important to get things done than to make them run fast. In fact, this is so important that I am going to say it again, as if it were a quote (for those who are just skimming):

It's more important to get things done than to make them run fast.

You might be saying: "My company cares about speed. I build a web application, and every response has to come back in less than x milliseconds." Or: "We have had customers cancel because they thought our app was too slow." I am not trying to say that speed does not matter at all. I am simply saying that it is no longer the most important thing; it is no longer your most expensive resource.

Speed is the only thing that matters

When you talk about speed in the context of programming, you usually mean performance, that is, CPU cycles. When your CEO talks about speed in the context of programming, he means business speed, and the most important metric is time to market. Ultimately, it does not matter how fast your product or web app is. It does not matter what language it is written in. It does not even matter how much money it takes to run. At the end of the day, the one thing that makes your company survive or die is time to market. I am not just talking about the startup idea of how long it takes until you make money, but more so the time frame from idea to customers' hands. The only way to survive in business is to innovate faster than your competitors. It does not matter how many good ideas you come up with if your competitors ship before you do. You have to be first to market, or at least keep up. Once you slow down, you lose.

The only way an enterprise can survive is to innovate faster than your competitors.

The case for microservices

Companies like Amazon, Google, and Netflix understand the importance of moving fast. They have built business systems that let them move fast and innovate quickly. Microservices are the solution to their problem. This article is not about whether you should use microservices, but at least understand why Amazon and Google think they should.

Microservices are inherently slow. The core concept of a microservice is to break up a boundary with a network call. This means you are turning what used to be a function call (a few CPU cycles) into a network call. Nothing could be worse for performance. Compared with the CPU, a network call is really slow. Yet these big companies still choose microservice-based architectures. I do not know of any architecture that is slower than microservices. Performance is the biggest drawback of microservices, and time to market is their biggest advantage. By building teams around smaller projects and code bases, a company can iterate and innovate at a much faster pace. This shows that very large companies, not just startups, care about time to market.

CPU is not your bottleneck

If you are writing a network application, such as a web server, chances are that CPU time is not the bottleneck of your program. When your web server handles a request, it probably makes a few network calls, for example to a database or to a cache server like Redis. While these services themselves may be fast, the network calls to them are slow. There is a good blog post about the speed differences of particular operations. In it, the author scales CPU cycle times to more understandable human times. If a single CPU cycle were equivalent to 1 second, then a network call from California to New York would be equivalent to 4 years. That is how slow the network is. As a rough estimate, assume that an ordinary network call within the same data center takes about 3 milliseconds. On our "human scale" that is about 3 months. Now suppose your program is very CPU intensive and takes 100,000 CPU cycles to respond to a single call. That is equivalent to just over a day. Now assume you are using a language that is five times slower, so the same work takes about 5 days. Compared with the 3-month network call, the difference hardly matters. If someone had to wait at least 3 months for a parcel, I do not think an extra 4 days would really matter to them.
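
The scaling in this paragraph can be reproduced with a few lines of arithmetic. The following is a minimal sketch that assumes a 3 GHz clock (an assumption; the text does not name a clock speed) and reuses the 3 ms and 100,000-cycle figures from above. The function and variable names are illustrative only.

```python
CLOCK_HZ = 3_000_000_000   # assumed 3 GHz clock; not stated in the article
SECONDS_PER_DAY = 86_400


def human_days(cycles):
    """Scale CPU cycles to 'human time', treating one cycle as one second."""
    return cycles / SECONDS_PER_DAY


network_cycles = 0.003 * CLOCK_HZ          # a 3 ms call inside one data center
request_cycles = 100_000                   # CPU work for a single request
slow_request_cycles = 5 * request_cycles   # same work in a 5x slower language

network_days = human_days(network_cycles)
fast_days = human_days(request_cycles)
slow_days = human_days(slow_request_cycles)

print(f"network call:        {network_days:6.1f} days")
print(f"request (fast lang): {fast_days:6.1f} days")
print(f"request (5x slower): {slow_days:6.1f} days")
```

Under that assumed clock speed, the network call scales to a bit over 3 months, while the request costs just over a day in the fast language and a little under 6 days in the slow one, which matches the rough figures in the text.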

What this ultimately means is that even though Python is slow, it does not matter. The speed of the language (or CPU time) is almost never the issue. Google actually did a study on this very concept, and they published a paper on it. The paper discusses designing a high-throughput system. In the conclusion, they say:

It may seem paradoxical to use an interpreted language in a high-throughput environment, but we have found that the CPU time is rarely the limiting factor; the expressibility of the language means that most programs are small and spend most of their time in I/O and native run-time code. Moreover, the flexibility of an interpreted implementation has been helpful, both in ease of experimentation at the linguistic level and in allowing us to explore ways to distribute the calculation across many machines.

To emphasize:

CPU time is rarely the limiting factor.

What if CPU time is a problem?

You might be saying: "That is all great, but we have had issues where the CPU was our bottleneck and it slowed our web application down a lot," or "Language X needs less hardware than language Y to run on the server." This may all be true. The wonderful thing about web servers is that you can load balance them almost infinitely. In other words, you can throw more hardware at the problem. Sure, Python may require more hardware than other languages, such as C. Just throw hardware at your CPU problem. Hardware is very cheap compared to your time. If you save even a couple of weeks of productivity in a year, that will more than pay for the added hardware cost.

So, is Python faster?

Throughout this article I have argued that the most important thing is development time. So the question remains: is Python faster than other languages when it comes to development time? Anecdotally, I, Google, and several others can tell you how productive Python is. It abstracts so many things for you, helping you focus on what you really need to figure out, without getting stuck in the weeds of trivial things such as whether you should use a vector or an array. But you may not like taking other people's word for it, so let's look at some more empirical data.

In most cases, the debate about whether Python is more productive comes down to scripting (or dynamic) languages versus statically typed languages. I think it is commonly accepted that statically typed languages are less productive, and there is an excellent paper that examines this claim in detail. As for Python specifically, there is a study that looked at how long it took to write code for string processing in different languages.

In the above study, Python is twice as productive as Java. Some other studies show similar results. Rosetta Code did a fairly in-depth study of the differences between programming languages. In the paper, they compare Python to other scripting and interpreted languages and conclude that:

Python tends to be more concise, even when compared to functional languages (1.2 to 1.6 times shorter on average).

The general trend seems to be that Python takes fewer lines of code. Lines of code may sound like a terrible metric, but a number of studies, including the two already mentioned, show that the time spent per line of code is about the same in every language. Therefore, reducing the number of lines of code increases productivity. Even codinghorror (a C# programmer himself) wrote an article on how Python is more productive.

I think it is fair to say that Python is more productive than many other languages. This is mainly because Python comes with a large standard library and has a large number of third-party libraries. Here is a simple article that discusses the differences between Python and other languages. If you do not know why Python is so concise and productive, I invite you to take this opportunity to learn a little Python and try it for yourself. Here is your first program:

import __hello__

But what if the speed is really important?

The tone of the argument above may suggest that optimization and speed do not matter at all. But the truth is that there are many times when run-time performance really does matter. One example is a web application with a specific endpoint that takes a really long time to respond. You know how fast it needs to be, and how much it needs to improve.

In our example, two things happened:

  1. We noticed that there was an endpoint that was slow to execute.

  2. We recognize it as slow because we have a metric for what counts as fast enough, and it falls short of that metric.

We do not have to micro-optimize everything in the application. Everything only needs to be "fast enough". Your users might notice if an endpoint takes a couple of seconds to respond, but they will not notice if you improve the response time from 35 milliseconds to 25 milliseconds. "Good enough" is all you really need to achieve. Disclaimer: I should say that there are some applications, such as real-time bidding, that do need micro-optimizations, and where every millisecond matters. But that is the exception rather than the rule.

To figure out how to optimize the endpoint, your first step is to profile the code and find out where the bottleneck is. After all:

Any improvements made anywhere besides the bottleneck are an illusion.

If your optimizations do not touch the bottleneck, you have wasted your time and have not solved the real problem. You will not get any meaningful improvement until you optimize the bottleneck. If you try to optimize before you know what the bottleneck is, you are just playing whack-a-mole with your code. Trying to optimize your code before you have measured and identified the bottleneck is known as "premature optimization". Donald Knuth is often credited with the following quote, though he claims he actually heard it from someone else:
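
As an illustration of measuring before optimizing, here is a minimal sketch using the standard library's cProfile and pstats modules. The three functions are hypothetical stand-ins (a sleep plays the role of a slow network call); they are not taken from any real application.

```python
import cProfile
import io
import pstats
import time


def query_database():
    """Hypothetical stand-in for a slow network call."""
    time.sleep(0.05)


def render_response():
    """Hypothetical stand-in for CPU work done by the handler."""
    return sum(i * i for i in range(10_000))


def handle_request():
    query_database()
    return render_response()


# Profile a single request.
profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Sort by cumulative time so the real bottleneck rises to the top.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

In a report like this, the simulated network call should dominate the cumulative time, which suggests that speeding up render_response would make no visible difference to the endpoint.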

Premature optimization is the root of all evil.

When talking about maintaining code bases, the fuller quote from Donald Knuth is:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. - Donald Knuth

In other words, he is saying that most of the time you should forget about optimizing your code. It is almost always good enough. In the cases where it is not good enough, we usually only need to touch 3% of the code path. You do not win any awards for making your endpoint a few nanoseconds faster because, for example, you used an if statement instead of a function.

Premature optimization includes calling certain faster functions, or even using specific data structures because they are generally faster. Computer science argues that if two methods or algorithms have the same asymptotic growth (Big-O), they are equivalent, even if one is twice as slow in practice. Computers are now so fast that how an algorithm's computation grows as data or usage increases matters more than its raw speed. In other words, if you have two O(log n) functions but one is twice as slow, it does not really matter. As the data grows, they both "slow down" at the same rate. This is why premature optimization is the root of all evil: it wastes our time and almost never helps our overall performance.

In terms of Big-O, you could argue that all languages are O(n) for your program, where n is the number of lines of code or instructions. For the same instructions, they all grow at the same rate. When it comes to asymptotic growth, the speed of a language does not matter; all languages are the same. Under this logic, you could say that choosing a language for your application simply because it is "fast" is the ultimate form of premature optimization. You are choosing something that is supposedly fast without measuring, and without understanding where the bottleneck is going to be.

Choosing a language for your application simply because it is "fast" is the ultimate form of premature optimization.
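
The point about two O(log n) functions can be sketched by counting steps instead of timing them. This is a toy model under stated assumptions: halving_steps counts the iterations of a binary-search-style loop, and the "slow" variant is assumed to cost exactly twice as much per step. All names are illustrative.

```python
def halving_steps(n):
    """Count the iterations a binary-search-style loop needs for n items."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


def cost_fast(n):
    return halving_steps(n)       # one unit of work per step


def cost_slow(n):
    return 2 * halving_steps(n)   # same algorithm, twice the work per step


for n in (1_024, 1_048_576, 1_073_741_824):
    print(n, cost_fast(n), cost_slow(n), cost_slow(n) / cost_fast(n))
```

Growing the input about a million times (from 2^10 to 2^30 items) only triples the cost of either version, and the gap between them stays at a constant factor of two. In this model the growth rate dominates the constant factor, which is the argument the paragraph makes.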

Optimizing Python

One of my favorite things about Python is that it lets you optimize code a little bit at a time. Say you have a method in Python that you find to be your bottleneck. You have optimized it several times, perhaps following some of the guidance here and there, and now you are sure that Python itself is the bottleneck. Python can call into C code, which means you can rewrite this method in C to reduce the performance problem. You can do this one method at a time. This process lets you write well-optimized bottleneck methods in any language that compiles to C-compatible assembly. It lets you keep writing in Python most of the time and drop down to a lower-level language only when you truly need it.

There is a programming language called Cython, which is a superset of Python. It is almost a merge of Python and C, and it is a gradually typed language. Any Python code is valid Cython code, and Cython code compiles down to C. With Cython, you can write a module or a method and gradually move toward more and more C types and performance. You can mix C types with Python's duck typing. With Cython you get the best of both worlds: you optimize only at the bottleneck and keep the beauty of Python everywhere else.

A screenshot of EVE Online, a space MMO game written in Python.

When you eventually run into a Python performance wall, you do not need to move your whole code base to a different language. You only need to rewrite a few functions in Cython, and you can usually get the performance you need. This is the strategy EVE Online takes. It is a massively multiplayer computer game that uses Python and Cython for the entire stack. They achieve game-level performance by optimizing the bottlenecks in C and Cython. If this strategy works for them, it should work for almost anyone. There are also other ways to optimize your Python. For example, PyPy is a JIT implementation of Python that can give significant run-time improvements for long-running applications (such as web servers) simply by swapping out CPython, the default implementation of Python, for PyPy.

Let's review the main points:
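
To show what calling C code from Python can look like, here is a minimal sketch using the standard library's ctypes module. It calls strlen from the platform's C library rather than a custom hot path, and it assumes a POSIX system (Linux or macOS) where that library can be loaded this way.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If it is not found, find_library returns
# None, and CDLL(None) falls back to the symbols already loaded into the
# current process (this fallback is POSIX-only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"bottleneck")
print(length)
```

In a real project the C function would be your own compiled bottleneck method instead of strlen, and tools such as Cython can generate this kind of glue for you.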

Let's take a look at the points:

  • Optimize your most expensive resources. That's you, not the computer.

  • Choose a language / framework / architecture that helps you develop quickly (such as Python). Do not choose technologies simply because they are fast.

  • When you run into performance problems, find the bottleneck.

  • Your bottleneck is probably not the CPU or Python itself.

  • If Python does become your bottleneck (and you have already optimized your algorithms), move only the hot code path to Cython or C.

  • Go enjoy how quickly you can get things done.

I hope you enjoyed reading this article as much as I enjoyed writing it. If you would like to say thank you, please give it some applause. Also, if you ever want to talk to me about Python, you can reach me on Twitter (@nhumrich) or find me on the Python Slack channel.



About the Author:

Nick Humrich - An advocate of continuous delivery who has written many tools for it. He is also a Python hacker and technology enthusiast, and currently works as a DevOps engineer.

Author: Nick Humrich     Translator: Zhousiyu325     Proofreading: Jasminepeng
