
Google releases TensorFlow Lattice: using prior knowledge to improve model generalization

via: 博客园     time: 2017/10/13 10:06:56

Google researchers recently released TensorFlow Lattice, a set of pre-built, easy-to-use TensorFlow Estimators together with TensorFlow operators for building your own lattice models. A lattice is a multi-dimensional interpolated look-up table (see references [1] through [5] for details), similar to the look-up tables at the back of a geometry textbook that approximate the sine function.
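The textbook sine table gives a feel for what "interpolated look-up table" means in one dimension. The following is a minimal sketch in plain Python, not TensorFlow Lattice code; the 10-degree spacing and the names `SINE_TABLE` and `interpolated_sine` are assumptions made for illustration.

```python
import math

# Look-up table: sine sampled every 10 degrees from 0 to 90 (assumed spacing).
STEP_DEG = 10
SINE_TABLE = [math.sin(math.radians(d)) for d in range(0, 91, STEP_DEG)]


def interpolated_sine(angle_deg):
    """Approximate sin(angle) for 0 <= angle <= 90 by linear interpolation."""
    if not 0 <= angle_deg <= 90:
        raise ValueError("angle must lie in [0, 90] degrees")
    position = angle_deg / STEP_DEG            # fractional index into the table
    left = min(int(position), len(SINE_TABLE) - 2)
    weight = position - left                   # 0 at the left entry, 1 at the right
    return (1 - weight) * SINE_TABLE[left] + weight * SINE_TABLE[left + 1]


if __name__ == "__main__":
    for angle in (30, 35, 90):
        print(angle, interpolated_sine(angle), math.sin(math.radians(angle)))
```

A lattice model works the same way, except that the table entries are learned from data rather than computed from a known function, and the table can be indexed by several inputs at once.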

Lei Feng Network (Leiphone) AI Technology Review compiled the following:

We take advantage of the look-up table's structure, which can be keyed by multiple inputs to approximate an arbitrarily flexible relationship, to satisfy the monotonic relationships you specify so that the model generalizes better. That is, the look-up table values are trained to minimize the loss on the training examples. In addition, adjacent values in the look-up table are constrained to increase along given directions of the input space, so the model's output also increases in those directions. Importantly, because lattice models interpolate between look-up table values, they are smooth and their predictions are bounded, which helps avoid spuriously large or small predictions at test time.
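The two properties described here, monotonicity along a chosen input and bounded predictions, can be seen in a small two-input table. This is a sketch under stated assumptions, not the library's implementation: the table values are hypothetical, inputs are given in lattice coordinates, and bilinear interpolation stands in for the general multi-dimensional case.

```python
def lattice_2d(table, x, y):
    """Bilinearly interpolate a 2-D look-up table at (x, y).

    table[i][j] is the value at lattice vertex (i, j). x and y are given in
    lattice coordinates and are clipped to the table's range.
    """
    rows, cols = len(table), len(table[0])
    x = min(max(x, 0.0), rows - 1)
    y = min(max(y, 0.0), cols - 1)
    i = min(int(x), rows - 2)                  # lower-left vertex of the cell
    j = min(int(y), cols - 2)
    u, v = x - i, y - j                        # position inside the cell
    return ((1 - u) * (1 - v) * table[i][j]
            + (1 - u) * v * table[i][j + 1]
            + u * (1 - v) * table[i + 1][j]
            + u * v * table[i + 1][j + 1])


def is_monotonic_in_x(table):
    """True if values never decrease when stepping to the next vertex along x."""
    return all(table[i + 1][j] >= table[i][j]
               for i in range(len(table) - 1)
               for j in range(len(table[0])))


# Hypothetical 3x2 table whose values increase along the first input only.
TABLE = [[0.0, 0.2],
         [0.5, 0.4],
         [0.9, 1.0]]

if __name__ == "__main__":
    print(is_monotonic_in_x(TABLE))
    print(lattice_2d(TABLE, 0.5, 0.5))
    print(lattice_2d(TABLE, 100, -5))          # clipped to the edge of the table
```

Because each output is a weighted average of the surrounding vertex values, it can never leave the range spanned by the table, and if the vertex values increase along x, so does the interpolated function.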

Video address:http://static.video.qq.com/TPout.swf?auto=1&vid=z0560xy9zaa

How lattice models help

Imagine that you are designing a system that recommends nearby coffee shops to a user, and you want the model to learn: "If two coffee shops are otherwise the same, prefer the closer one."

In the figure below, we show a flexible model (pink curve) that exactly fits the training data (purple dots) from users in Tokyo, where there are many coffee shops nearby.

Because the training examples are noisy, the pink curve overfits, and the model misses the general trend that a closer coffee shop is better. If you use this model to rank examples from Texas (blue), where coffee shops are more spread out, it behaves strangely and sometimes even prefers the farther coffee shop!



In contrast, a lattice model trained on the same Tokyo examples can be constrained to satisfy the monotonic relationship, yielding a flexible monotonic function (green curve). This function also fits the Tokyo training examples accurately, yet it generalizes well to Texas and never prefers a farther coffee shop.

In general, you may have many inputs for each coffee shop, such as coffee quality and price. Flexible models have a hard time capturing global relationships of the form "if all other inputs are equal, nearer is better," especially in parts of the feature space where the training data is sparse and noisy. Machine learning models that capture prior knowledge (such as how an input should affect the prediction) work better in practice, are easier to debug, and are more interpretable.

Pre-built Estimators

We provide a range of lattice model architectures as TensorFlow Estimators. The simplest estimator we provide is the calibrated linear model, which uses 1-d lattices to learn the best 1-d transformation of each feature and then combines all the calibrated features linearly. This model works well if the training dataset is very small or there are no complex nonlinear interactions between inputs.
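The structure of a calibrated linear model can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the Estimator's API: in the real model the calibrator outputs and weights are learned, whereas here the keypoints, outputs, and weights are hypothetical fixed values.

```python
from bisect import bisect_right


def calibrate(x, keypoints, outputs):
    """Piecewise-linear 1-D calibrator: interpolate outputs between keypoints."""
    if x <= keypoints[0]:
        return outputs[0]
    if x >= keypoints[-1]:
        return outputs[-1]
    k = bisect_right(keypoints, x) - 1         # index of the keypoint left of x
    t = (x - keypoints[k]) / (keypoints[k + 1] - keypoints[k])
    return (1 - t) * outputs[k] + t * outputs[k + 1]


def calibrated_linear(features, calibrators, weights, bias=0.0):
    """Calibrate each feature with its own 1-D table, then combine linearly."""
    calibrated = [calibrate(x, keypoints, outputs)
                  for x, (keypoints, outputs) in zip(features, calibrators)]
    return bias + sum(w * c for w, c in zip(weights, calibrated))


# Hypothetical calibrators for a coffee-shop score (illustrative values).
CALIBRATORS = [
    ([0.0, 1.0, 5.0], [1.0, 0.6, 0.0]),        # distance in km: closer is better
    ([1.0, 5.0], [0.0, 1.0]),                  # quality rating: higher is better
]
WEIGHTS = [0.5, 0.5]

if __name__ == "__main__":
    near = calibrated_linear([0.5, 3.0], CALIBRATORS, WEIGHTS)
    far = calibrated_linear([2.0, 3.0], CALIBRATORS, WEIGHTS)
    print(near, far)
```

Since the distance calibrator's outputs never increase with distance and its weight is positive, the score of the nearer shop is never lower than that of an otherwise identical farther one.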

Another estimator is the calibrated lattice model, which combines the calibrated features nonlinearly using a two-layer single lattice model and can therefore represent complex nonlinear interactions in a dataset. The calibrated lattice model is usually a good choice if you have 2 to 10 features. For 10 or more features, we expect an ensemble of calibrated lattices to give the best results, and you can train one using the pre-built ensemble architectures. Monotonic lattice ensembles can achieve an accuracy gain of 0.3% to 0.5% over random forests [4], and these new TensorFlow lattice estimators can achieve an accuracy gain of 0.1% to 0.4% over previous models for learning with monotonicity [5].
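One way to picture an ensemble of lattices is as an average of several small lattices, each reading only a subset of the calibrated features. The sketch below assumes 2x2 lattices over pairs of features already calibrated to [0, 1]; the pairings, the vertex values, and the plain average are illustrative assumptions, not the pre-built architecture itself.

```python
def lattice_2x2(values, a, b):
    """Bilinear interpolation on the unit square; values = [v00, v01, v10, v11]."""
    v00, v01, v10, v11 = values
    return ((1 - a) * (1 - b) * v00 + (1 - a) * b * v01
            + a * (1 - b) * v10 + a * b * v11)


def lattice_ensemble(features, members):
    """Average small lattices, each reading its own pair of calibrated features.

    members is a list of ((index_a, index_b), values) entries.
    """
    outputs = [lattice_2x2(values, features[ia], features[ib])
               for (ia, ib), values in members]
    return sum(outputs) / len(outputs)


# Hypothetical ensemble over 4 features that are already calibrated to [0, 1].
MEMBERS = [
    ((0, 1), [0.0, 0.3, 0.4, 1.0]),
    ((2, 3), [0.1, 0.5, 0.2, 0.9]),
    ((0, 3), [0.0, 0.6, 0.5, 1.0]),
]

if __name__ == "__main__":
    print(lattice_ensemble([0.0, 0.0, 0.0, 0.0], MEMBERS))
    print(lattice_ensemble([1.0, 1.0, 1.0, 1.0], MEMBERS))
```

Each member needs only 4 table values, so the total parameter count grows with the number of members rather than exponentially with the number of features, which is why an ensemble is attractive when there are many inputs.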

Build your own model

You may want to experiment with deeper lattice networks, or use partial monotonic functions as part of a deep neural network or other TensorFlow architecture. We provide the building blocks: TensorFlow operators for calibrators, lattice interpolation, and monotonicity projections. The following figure shows a 9-layer deep lattice network [5]:
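To give a sense of what a monotonicity projection does, the sketch below maps a sequence of 1-D table values to the closest non-decreasing sequence in the least-squares sense, using the classic pool-adjacent-violators algorithm. This is a standard technique swapped in for illustration; the article does not say which algorithm the TensorFlow Lattice operator uses, and the function name is hypothetical.

```python
def project_to_monotonic(values):
    """Closest non-decreasing sequence (least squares) via pool adjacent violators."""
    blocks = []                                # each block is [sum of values, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while a block's mean exceeds the mean of the next one.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    projected = []
    for total, count in blocks:
        projected.extend([total / count] * count)   # every entry gets the block mean
    return projected


if __name__ == "__main__":
    print(project_to_monotonic([1.0, 3.0, 2.0, 4.0]))
```

In a projected-gradient training loop, a step like this would be applied to the table values after each gradient update, so that the constraint holds throughout training.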


In TensorFlow Lattice, in addition to the choice of model flexibility and standard L1 and L2 regularization, we also provide new regularizers:

  • Monotonicity constraints [3] on the inputs you choose, as described above.

  • Laplacian regularization [3] on the lattices, to make the learned function flatter.

  • Torsion regularization [3], to suppress unnecessary nonlinear feature interactions.
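The last two regularizers can be illustrated on a 2-D table of lattice values. The sketch below assumes definitions in the spirit of [3]: the Laplacian penalty sums squared differences between adjacent vertices, and the torsion penalty sums, over each cell, the squared "twist" that measures how far the cell departs from a function that is additive in its two inputs. The exact scaling and form used by the library may differ.

```python
def laplacian_penalty(table):
    """Sum of squared differences between adjacent lattice vertices."""
    rows, cols = len(table), len(table[0])
    along_x = sum((table[i + 1][j] - table[i][j]) ** 2
                  for i in range(rows - 1) for j in range(cols))
    along_y = sum((table[i][j + 1] - table[i][j]) ** 2
                  for i in range(rows) for j in range(cols - 1))
    return along_x + along_y


def torsion_penalty(table):
    """Sum over cells of the squared twist v11 - v10 - v01 + v00."""
    rows, cols = len(table), len(table[0])
    return sum((table[i + 1][j + 1] - table[i + 1][j]
                - table[i][j + 1] + table[i][j]) ** 2
               for i in range(rows - 1) for j in range(cols - 1))


# Hypothetical tables: one additive in its inputs, one a pure interaction.
ADDITIVE = [[0, 1],
            [2, 3]]                            # value = 2*x + y, no interaction
INTERACTION = [[0, 0],
               [0, 1]]                         # value = x*y, pure interaction

if __name__ == "__main__":
    print(laplacian_penalty(ADDITIVE), torsion_penalty(ADDITIVE))
    print(laplacian_penalty(INTERACTION), torsion_penalty(INTERACTION))
```

The additive table has zero torsion even though its values vary, while the interaction table is penalized, which matches the stated goal of suppressing feature interactions without forcing the function to be flat.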

You can find the details and start experimenting at the following address:




[1] Lattice Regression, Eric Garcia, Maya Gupta, Advances in Neural Information Processing Systems (NIPS), 2009

[2] Optimized Regression for Efficient Function Evaluation, Eric Garcia, Raman Arora, Maya R. Gupta, IEEE Transactions on Image Processing, 2012

[3] Monotonic Calibrated Interpolated Look-Up Tables, Maya Gupta, Andrew Cotter, Jan Pfeifer, Konstantin Voevodski, Kevin Canini, Alexander Mangylov, Wojciech Moczydlowski, Alexander van Esbroeck, Journal of Machine Learning Research (JMLR), 2016

[4] Fast and Flexible Monotonic Functions with Ensembles of Lattices, Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta, Advances in Neural Information Processing Systems (NIPS), 2016

[5] Deep Lattice Networks and Partial Monotonic Functions, Seungil You, David Ding, Kevin Canini, Jan Pfeifer, Maya R. Gupta, Advances in Neural Information Processing Systems (NIPS), 2017

via:Google Research Blog
