Best Python machine learning libraries

via: 博客园     time: 2015/12/10 14:32:41     views: 1667

English original: Best Machine Learning Libraries in Python


There is no doubt that neural networks, and machine learning in general, have been among the hottest topics in technology over the past few years. It is easy to see why: they solve a lot of really interesting problems, such as speech recognition, image recognition, and even music. So, for this article, I decided to compile a list of some of the better Python machine learning libraries and post them below.

In my opinion, Python is one of the best languages for learning (and implementing) machine learning techniques, for the following main reasons:

  • It is simple: Python has become the language of choice for novice programmers thanks to its simple syntax and large community.
  • It is powerful: a simple syntax does not mean a weak language. Python is also one of the most popular languages among data scientists and web programmers. The libraries created by the Python community let you do just about anything you want, including machine learning.
  • It has lots of ML libraries: a great many machine learning libraries are written for Python. You can choose the most suitable one from among the hundreds available, based on your use case, skills, and needs.

The last point is arguably the most important. The algorithms that drive machine learning are quite complex and involve a lot of math, so implementing them yourself (and getting them to work properly) would be a difficult task. Fortunately, plenty of smart, dedicated people have already done this hard work, so we can focus on the application at hand.

This is not an exhaustive list. There is a lot of code out there that is not listed here; I am only posting some of the more relevant or well-known libraries. Now, on to the list.

Most popular libraries

Below is a brief description of some of the more popular libraries and what they are good at. A more complete list of notable projects follows in the next section.


Tensorflow is the newest neural network library on this list. Released just a few days ago, it is a high-level neural network library that helps you design your network architecture so you can avoid the low-level details. The focus is on letting you express your computation as a data flow graph, which is better suited to solving complex problems.
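To make the data flow graph idea concrete, here is a minimal pure-Python sketch. It is not Tensorflow's API: the names `Node`, `placeholder`, `constant`, `add`, `mul`, and `run` are hypothetical and exist only for this illustration. The point is that you first describe the computation as a graph of operations, and only afterwards feed in data and evaluate it.

```python
# A toy data flow graph: nodes are operations, edges carry values.
# Illustration only; this is NOT Tensorflow's API.

class Node:
    """One operation in the graph, together with the nodes it reads from."""
    def __init__(self, op, inputs=(), name=None):
        self.op = op                  # callable, or None for a placeholder
        self.inputs = tuple(inputs)   # upstream nodes
        self.name = name              # only used by placeholders

def placeholder(name):
    """A node whose value is supplied when the graph is run."""
    return Node(op=None, name=name)

def constant(value):
    return Node(op=lambda: value)

def add(a, b):
    return Node(op=lambda x, y: x + y, inputs=(a, b))

def mul(a, b):
    return Node(op=lambda x, y: x * y, inputs=(a, b))

def run(node, feed, cache=None):
    """Evaluate a node by first evaluating everything it depends on."""
    if cache is None:
        cache = {}
    if id(node) in cache:             # each node is computed at most once
        return cache[id(node)]
    if node.op is None:
        result = feed[node.name]
    else:
        args = [run(dep, feed, cache) for dep in node.inputs]
        result = node.op(*args)
    cache[id(node)] = result
    return result

# Build the graph for y = w * x + b; nothing is computed yet.
x = placeholder("x")
w = constant(3.0)
b = constant(1.0)
y = add(mul(w, x), b)

# Feed a value for x and evaluate the graph.
result = run(y, {"x": 2.0})
print(result)
```

Because the graph is plain data, the same description can be evaluated with different inputs, which hints at why a library built this way can decide later where and how to execute it.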

The library is written mainly in C++ and includes Python bindings, so you don't have to worry about performance. One of my favorite features is its flexible architecture, which lets you use the same API to deploy to one or more CPUs or GPUs on a desktop, server, or mobile device. Few libraries, if any, can make that claim; Tensorflow is one of them.

It was developed for the Google Brain project and is already used by hundreds of engineers, so there is little doubt that it is capable of creating interesting solutions.

Although you may need some time to learn its API, it should be time well spent. After only a few minutes of playing with its core features, I could already tell that Tensorflow would let me spend more time implementing my network designs and less time fighting with the API.


Scikit-learn is definitely one of the most popular machine learning libraries in any language, if not the most popular. It has a large number of features for data mining and data analysis, which makes it a top choice for researchers and developers alike.

It is built on top of NumPy, SciPy, and Matplotlib, so it will feel familiar to the many people who already use those libraries. Although it is somewhat lower level than the other libraries listed below, it is often used as the foundation for many other machine learning implementations.
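As a rough illustration of what working with scikit-learn looks like, the sketch below trains a k-nearest neighbor classifier on the iris dataset that ships with the library. It assumes a reasonably recent scikit-learn is installed (the `sklearn.model_selection` module did not exist under that name when this article was written). The dataset, the value of `n_neighbors`, and the split are illustrative choices, not something taken from the article.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset as NumPy arrays (features X, labels y).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation (arbitrary choice).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Every scikit-learn estimator follows the same fit / predict / score pattern.
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)   # fraction of correct predictions
print(accuracy)
```

Swapping in another estimator usually means changing only the line that constructs `clf`, which is a large part of why the library is a common foundation for other tools.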


Theano is a machine learning library that lets you define, optimize, and evaluate mathematical expressions involving multidimensional arrays, which can be a point of frustration in other libraries. Like scikit-learn, Theano integrates tightly with NumPy. Its transparent use of the GPU makes Theano fast and painless to set up, which is very important for beginners. However, some people describe it as more of a research tool than something for production use, so use it accordingly.

One of Theano's best features is its excellent reference documentation and wealth of tutorials. In fact, thanks to the library's popularity, you should not have much trouble finding resources that show you how to get your models up and running.


Most of Pylearn2's functionality is actually built on top of Theano, so it has a very solid foundation.

According to the Pylearn2 website:

Pylearn2 differs from scikit-learn in that Pylearn2 aims to provide great flexibility and make it possible for a researcher to do almost anything, while the purpose of scikit-learn is as a

Keep in mind that Pylearn2 may wrap other libraries, such as scikit-learn, when it makes sense to do so, so you are not getting 100% custom-written code here. This is a good thing, however, because most of the bugs have already been worked out. Wrapper libraries like Pylearn2 have a very important place in this list.


One of the more exciting and different areas of neural network research is genetic algorithms. A genetic algorithm is essentially a search heuristic that mimics the process of natural selection. It tests a neural network on some data and gets feedback on the network's performance from a fitness function. It then makes small, random changes to the network, iteratively, and tests it again on the same data. The networks with the highest fitness scores win out and are used as the parents of the next generation.
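The loop described above can be sketched in a few lines of plain Python. This is a simplified, mutation-only illustration under stated assumptions, not Pyevolve's API: the "network" is a single linear neuron with two weights, the data comes from a made-up line y = 2x + 1, and the names `fitness`, `mutate`, and `evolve` are hypothetical.

```python
import random

# Toy data from y = 2x + 1; the "network" is one linear neuron (w, b).
DATA = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]

def fitness(weights):
    """Fitness function: higher is better (negative squared error)."""
    w, b = weights
    return -sum((w * x + b - y) ** 2 for x, y in DATA)

def mutate(weights, rng, scale=0.1):
    """Return a copy of the weights with a small random change to each."""
    return [v + rng.gauss(0.0, scale) for v in weights]

def evolve(generations=300, children=20, seed=0):
    rng = random.Random(seed)        # seeded so runs are repeatable
    parent = [0.0, 0.0]
    for _ in range(generations):
        # Test the parent and its mutated children on the same data.
        candidates = [parent] + [mutate(parent, rng) for _ in range(children)]
        # The fittest candidate becomes the parent of the next generation.
        parent = max(candidates, key=fitness)
    return parent

best = evolve()
print(best, fitness(best))
```

Keeping the parent among the candidates means fitness can never get worse from one generation to the next. A full genetic algorithm would typically also maintain a population and combine parents through crossover, which this sketch leaves out.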

Pyevolve provides a framework for building and executing these algorithms. The authors have stated that, as of v0.6, the framework also supports genetic programming, so in the near future it will lean more toward being an evolutionary computation framework than just a genetic algorithm framework.


Nupic is another library that offers functionality different from standard machine learning algorithms. It is based on a theory of the neocortex called Hierarchical Temporal Memory (HTM). HTMs can be viewed as a type of neural network, but some of the underlying theory is different.

Fundamentally, HTMs are a hierarchical, time-based memory system that can accept a variety of data. The aim is a new computational framework that mimics how memory and computation are intertwined in our brains. For a full explanation of the theory and its applications, see the white paper.


This library is more like a

The documentation has a good example that uses a bunch of tweets to train a classifier to tell a "win" tweet from a "fail" tweet:

from pattern.web import Twitter

The example first uses twitter.search() to collect tweets with the hashtags '#win' and '#fail'. Then a k-nearest neighbor (KNN) model is trained on the adjectives extracted from the tweets. After enough training, you have a classifier. Not bad for just 15 lines of code.
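For readers who want to see the technique itself, here is a minimal pure-Python sketch of a k-nearest neighbor classifier over bags of adjectives. It does not use Pattern and does not reproduce the example from its documentation: the `TinyKNN` class, the `cosine` helper, and the hand-written training words are all hypothetical stand-ins for adjectives that would otherwise be extracted from real tweets.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class TinyKNN:
    """Stores labeled examples and classifies by majority vote of the k nearest."""
    def __init__(self, k=3):
        self.k = k
        self.examples = []            # list of (Counter, label) pairs

    def train(self, words, label):
        self.examples.append((Counter(words), label))

    def classify(self, words):
        query = Counter(words)
        # Rank stored examples from most to least similar to the query.
        ranked = sorted(self.examples,
                        key=lambda ex: cosine(query, ex[0]),
                        reverse=True)
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]

# Hypothetical training data: adjectives as they might appear in tweets.
knn = TinyKNN(k=3)
knn.train(["sweet", "great"], "WIN")
knn.train(["awesome", "sweet"], "WIN")
knn.train(["happy", "great"], "WIN")
knn.train(["stupid", "broken"], "FAIL")
knn.train(["terrible", "stupid"], "FAIL")
knn.train(["slow", "broken"], "FAIL")

print(knn.classify(["sweet"]))
print(knn.classify(["stupid"]))
```

KNN does no real work at training time; it simply remembers the examples, so the quality of the classifier depends almost entirely on how many labeled tweets are collected.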


Caffe is a machine learning library for vision applications. You might use it to create deep neural networks that recognize objects in images, or even recognize a visual style.

Caffe offers seamless integration with GPU training, which is highly recommended when you are training on images. Although Caffe appears to be used mostly for academic work and research, it also has plenty of uses for training models for production.

Other notable libraries

Here are some other machine learning libraries for Python. Some of them provide the same functionality as the libraries above, while others have narrower goals or are better suited as learning tools.


  • Based on scikit-learn
  • Github


  • PyBrain (inactive)
  • Feature Forge
  • MLPY (inactive)
  • MDP (inactive)
  • FFnet (inactive)

The translator / Liu Diwei / Liu Xiangyu / Zhong Hao commissioning editor.

Liu Diwei is a graduate student at the School of Software, Central South University, with interests in machine learning, data mining, and bioinformatics.
