Author: Anthony Shaw is a member of the Python Software Foundation and a member of the Apache Software Foundation.
Python's popularity has risen sharply in recent years. The language is used for DevOps, data science, web development, and security.
However, it does not win any medals for speed.
I want to answer this question: when Python runs a comparable application 2 to 10 times slower than another language, why is it slow, and can we make it faster?
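Several explanations are commonly given: the GIL (Global Interpreter Lock), the fact that Python is interpreted rather than compiled, and the fact that it is dynamically typed.
So which of these has the greatest impact on performance?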
Modern computers come with CPUs that have multiple cores, and sometimes with multiple processors. To take advantage of all this extra processing power, the operating system defines a low-level structure called a thread: a process (such as the Chrome browser) can spawn multiple threads, each with its own stream of instructions to execute. In this way, if a process is particularly CPU-intensive, the load can be shared among many cores, which lets most applications complete their tasks faster.
As I write this article, my Chrome browser has 44 threads open. Keep in mind that thread structures and APIs differ between POSIX-based operating systems (such as macOS and Linux) and Windows. The operating system also handles the scheduling of threads.
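If you have not done multithreaded programming before, you need to become familiar with locks as soon as possible. Unlike in a single-threaded process, you have to make sure that when you change variables in memory, multiple threads do not try to access or change the same memory address at the same time. CPython applies this idea to the interpreter itself: a single interpreter-wide lock, the GIL, allows only one thread to execute Python bytecode at a time.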
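As a minimal sketch of what a lock does at the application level, the following uses threading.Lock from the standard library to protect a shared counter. The SafeCounter class and run_workers function are illustrative names, not part of any library.

```python
import threading


class SafeCounter:
    """A counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time can hold the lock, so the
        # read-modify-write of self.value is never interleaved.
        with self._lock:
            self.value += 1


def run_workers(n_threads=4, increments=10_000):
    """Increment one shared counter from several threads; return the total."""
    counter = SafeCounter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())  # 4 threads, 10,000 increments each
```

Because every increment happens while the lock is held, the final total always equals the number of threads multiplied by the number of increments.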
What does this mean for the performance of Python applications?
If you have a single-threaded, single-interpreter application, the GIL makes no difference to speed. Removing the GIL would have no effect on the performance of your code.
If you implement concurrency within a single interpreter (one Python process) using threads, and the threads are I/O-intensive (for example, network or disk I/O), you will see the consequences of contention for the GIL.
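Contention for the GIL is perhaps easiest to observe with CPU-bound threads. The rough sketch below (the function names and loop size are illustrative assumptions) times the same pure-Python loop run twice in sequence and then in two threads. On a standard CPython build with the GIL, the threaded run is usually no faster than the sequential one; exact timings depend on the machine and the Python version.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; a thread needs the GIL to execute it."""
    while n > 0:
        n -= 1
    return n


def time_sequential(n, repeats=2):
    """Run count_down `repeats` times one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats=2):
    """Run count_down in `repeats` threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 5_000_000
    print(f"sequential: {time_sequential(n):.2f}s")
    print(f"threaded:   {time_threaded(n):.2f}s")
```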
What about other Python runtimes?
PyPy has a GIL, and it is usually about 3 times faster than CPython.
Jython does not have a GIL, because Python threads in Jython are represented by Java threads and benefit from the JVM's memory management system.
I often hear the claim that Python is slow because it is an interpreted language, but I think it oversimplifies the way CPython works. If you type python myscript.py at the terminal, CPython starts a long sequence of operations that read, lex, parse, compile, interpret, and execute the code.
If you are interested in how this process works, I have written about it before: "Modifying the Python language in 7 minutes" (https://hackernoon.com/modifying-the-python-language-in-7-minutes-b94b0a99ce14).
An important step in this process is the creation of a .pyc file: at the compile stage, the bytecode sequence is written to a file inside __pycache__/ in Python 3, or in the same directory as the source in Python 2. This applies not only to your script but also to all the code it imports, including third-party modules.
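The compile and interpret steps can be observed from within Python itself. The sketch below uses the built-in compile() and exec(), the dis module, and importlib.util.cache_from_source; the one-line source string is an illustrative assumption, and myscript.py is used only as a file name label (no such file has to exist).

```python
import dis
import importlib.util

source = "result = 6 * 7\n"

# compile() parses the source and compiles it to a code object,
# which holds the bytecode that the interpreter will run.
code = compile(source, "myscript.py", "exec")

# The raw bytecode is a bytes object; dis renders it in readable form.
print(type(code.co_code))
dis.dis(code)

# Executing the code object is the "interpret and execute" step.
namespace = {}
exec(code, namespace)
print(namespace["result"])

# Path at which Python 3 would cache the compiled bytecode for this file name.
print(importlib.util.cache_from_source("myscript.py"))
```

The disassembly differs between Python versions, so no particular listing is shown here.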
JIT, or just-in-time compilation, requires an intermediate language so that the code can be split into chunks (or frames). An ahead-of-time (AOT) compiler, by contrast, is designed to ensure that the CPU can understand every line of code before any interaction takes place.
PyPy has a JIT. As mentioned above, it is much faster than CPython. The benchmark is covered in more detail in "Which is the fastest version of Python?" (https://hackernoon.com/which-is-the-fastest-version-of-python-2ae7c61a6b2b).
So why doesn't CPython use a JIT?
A JIT has several drawbacks, one of which is startup time. CPython's startup time is already comparatively slow, and PyPy's is 2 to 3 times slower than CPython's. The Java Virtual Machine is notorious for its slow startup. The .NET CLR gets around this problem by starting at system startup, but the developers of the CLR also develop the operating system on which the CLR runs.
However, CPython is a general-purpose implementation. If you were developing a command-line application in Python, having to wait for a JIT to start every time the CLI was called would be painfully slow.
CPython has to try to serve as many use cases as possible. There was an effort to plug a JIT into CPython, but that project has largely stalled.
If you want the benefits of a JIT and you have a workload that suits it, use PyPy.
In a dynamically typed language there is still the concept of types, but the type of a variable is dynamic.
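For instance, the same name can be bound first to an integer and then to a string. This is a minimal sketch; the specific values are assumptions chosen for illustration.

```python
a = 1
print(type(a).__name__)   # the name a currently refers to an int

a = "foo"
print(type(a).__name__)   # the same name now refers to a str
```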
When a name such as a is rebound from an integer to a string, Python creates a second variable with the same name, of type str, and releases the memory allocated for the first instance of a.
Statically typed languages are not designed that way to make your life hard; they are designed that way because of how the CPU operates. If everything eventually has to come down to simple binary operations, objects and types have to be converted into low-level data structures.
Python does this work for you; you simply never see it and never have to worry about it.
Not having to declare types is not what makes Python slow. The Python language is designed so that almost everything can be dynamic. You can replace the methods on objects at runtime, and you can monkey-patch low-level system calls to a value declared at runtime. Almost anything is possible.
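A small sketch of this kind of runtime modification is shown below. The Circle class and rounded_area function are hypothetical names invented for the illustration.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


def rounded_area(self):
    """Replacement method: same calculation, rounded to two decimals."""
    return round(math.pi * self.radius ** 2, 2)


c = Circle(1.0)
before = c.area()

# Monkey-patch: replace the method on the class while the program is running.
# Existing instances pick up the new behaviour on their next call.
Circle.area = rounded_area
after = c.area()

print(before, after)
```

Because any method can be swapped like this at any time, the interpreter cannot assume ahead of time which code a call to c.area() will actually run.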
It is this design that makes it difficult to optimize Python.
To illustrate my point, I will use a system call tracing tool called DTrace, which is available on macOS. CPython distributions do not come with DTrace support built in, so you have to recompile CPython. I use 3.6.6 for the demonstration.
The py_callflow tracer displays all the function calls in your application.
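DTrace requires a specially built CPython. As a rough pure-Python analogue (this is not the py_callflow script, and it shows far less detail), sys.setprofile from the standard library can record each Python-level function call. The trace_calls, square, and sum_of_squares names are illustrative assumptions.

```python
import sys


def trace_calls(func, *args):
    """Run func(*args) and return (result, names of Python functions called)."""
    calls = []

    def profiler(frame, event, arg):
        # "call" fires every time a Python-level function is entered.
        if event == "call":
            calls.append(frame.f_code.co_name)

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        result = func(*args)
    finally:
        sys.setprofile(previous)  # always restore the earlier hook
    return result, calls


def square(x):
    return x * x


def sum_of_squares(n):
    return sum(square(i) for i in range(n))


if __name__ == "__main__":
    result, calls = trace_calls(sum_of_squares, 3)
    print(result)
    print(calls)
```

Even this tiny program produces a surprisingly long list of calls, which gives a feel for how much work the interpreter does behind a single line of Python.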
So, does Python's dynamic typing make it slow?
Comparing and converting types is costly. Every time a variable is read, written, or referenced, its type has to be checked.
It is difficult to optimize a language that is this dynamic. Many of the alternatives to Python are much faster because they sacrifice flexibility for performance.
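The sketch below illustrates why those checks cannot be skipped: one function, compiled once, has to handle whatever types arrive at run time. The add function is a hypothetical example.

```python
def add(a, b):
    # The same source line and the same bytecode serve every call;
    # which operation "+" performs is decided at run time from the
    # types of the operands.
    return a + b


print(add(1, 2))          # integer addition
print(add(1.5, 2))        # mixed float/int: the int is converted to a float
print(add("py", "thon"))  # string concatenation
print(add([1], [2]))      # list concatenation

try:
    add(1, "2")           # int + str is not a supported combination
except TypeError as exc:
    print("TypeError:", exc)
```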
Cython, which combines C static types with Python, optimizes code whose types are known and can improve performance by as much as 84 times.
Python is slow mainly because of its dynamic nature and its versatility. It can be used as a tool for all sorts of problems, for which several more optimized and faster alternatives exist.
However, there are ways to optimize your Python applications, such as making full use of async, understanding profiling tools, and considering the use of multiple interpreters.
For applications where startup time is unimportant and the code would benefit from a JIT, consider PyPy.
For parts of your code where performance is critical and where more variables are statically typed, consider using Cython.
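As one illustration of the async approach, the sketch below uses asyncio from the standard library to wait on several simulated I/O operations at once. The fetch coroutine and its delays are stand-ins invented for the example; asyncio.sleep takes the place of a real network or disk call.

```python
import asyncio
import time


async def fetch(name, delay):
    """Stand-in for an I/O-bound call; asyncio.sleep simulates the wait."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather() runs the three coroutines concurrently on one thread, so the
    # total wait is close to the longest delay rather than the sum of all three.
    return await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.2),
        fetch("c", 0.2),
    )


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(main())
    print(results, f"{time.perf_counter() - start:.2f}s")
```

This helps only when the program spends its time waiting on I/O; CPU-bound work still runs one step at a time on a single thread.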