Next week, machine learning and artificial intelligence researchers gather in Long Beach, California for NIPS 2017, where the latest research in AI and computational neuroscience will be presented. Facebook has 10 papers accepted at NIPS this year, and its researchers and engineers will take part in workshops, symposia, and tutorials throughout the week, joining discussions and presenting their results.
During NIPS, Facebook will for the first time stream conference content on Facebook Live, covering many of the sessions. Readers who cannot attend in person can watch the live stream starting at 7:30 am Beijing time on Tuesday, December 5 (5:30 pm Pacific Time on Monday, December 4), beginning with the invited talk after the opening ceremony: Google Principal Scientist John Platt on "Powering the next 100 years", about how Google uses machine learning to tackle future energy problems.
Facebook papers at NIPS 2017
"Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model"
Main content: The visual dialog task requires an AI to hold meaningful conversations with humans about visual content, in language that is natural and fluent and that responds to what the human says. For example, suppose a blind user is browsing social media and a friend uploads a photo. It would be helpful if an AI could describe the picture for him: "John just uploaded a photo from his vacation in Hawaii." The user might then ask, "Nice, is he on the beach?", and we would want the AI to answer spontaneously and accurately: "No, he's on a mountain." Or, talking to an AI assistant, you might ask, "Can you see my child on the baby monitor?" The assistant answers, "Yes, I can." You follow up, "Is he sleeping or playing?", and again expect an accurate answer. Or consider a human-robot search-and-rescue team: the robot enters a danger zone, the human asks, "Is there smoke in any room around you?", the robot replies, "Yes, in one room," and the human responds, "Then go in and look for people there."
In the paper, Facebook researchers propose a new training framework for neural sequence generation models, in particular for grounded dialog generation. The standard training paradigm for such models is maximum likelihood estimation (MLE), i.e. minimizing the cross-entropy against the human response. However, across many domains it has been found that neural dialog models (G) trained with MLE tend to produce "safe", generic replies (such as "I don't know" or "I can't say"). In contrast, discriminative dialog models (D), trained to rank a list of candidate human responses, outperform generative models on automatic metrics, richness, and informativeness. But a model like D is of little use in practice, since it cannot hold an actual dialog with a human. Facebook's researchers aim for the best of both worlds: a model as practical as G that performs as well as D. Their approach is to transfer the knowledge in D into G.
The paper's main contribution is an end-to-end trainable generative visual dialog model, in which G receives gradients from D as a perceptual loss (not an adversarial loss) on sequence samples generated by G. The authors apply the recently proposed Gumbel-Softmax (GS) approximation to the discrete distribution; concretely, an RNN augmented with a sequence of GS samplers, combined with a straight-through gradient estimator, provides the differentiability needed for end-to-end training. The authors also introduce a stronger encoder for visual dialog, add a self-attention mechanism for encoding answer sentences, and use a metric-learning loss, all of which help D better capture semantic similarity. Overall, the proposed models outperform the previous best results on the VisDial dataset by a significant margin of 2.67% (recall@10). The project code is at https://github.com/jiasenlu/visDial.pytorch
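The Gumbel-Softmax sampling with a straight-through estimator described above can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions (toy logits, made-up function names); the paper's actual implementation is the PyTorch code linked above:

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, rng=None):
    """Draw one relaxed sample from a categorical distribution.

    Gumbel(0, 1) noise is added to the logits, then a temperature-scaled
    softmax turns the result into a near-one-hot probability vector.
    """
    rng = rng or np.random.default_rng(0)
    gumbel = -np.log(-np.log(rng.uniform(1e-12, 1.0, size=logits.shape)))
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max())          # numerically stable softmax
    return y / y.sum()

def straight_through(y_soft):
    """Discretize to a hard one-hot sample.

    In an autograd framework the forward pass would use this hard sample,
    while gradients flow through the soft one (the straight-through trick).
    """
    y_hard = np.zeros_like(y_soft)
    y_hard[np.argmax(y_soft)] = 1.0
    return y_hard

logits = np.log(np.array([0.1, 0.6, 0.3]))
word = straight_through(gumbel_softmax_sample(logits, tau=0.5))
```

Lowering the temperature `tau` makes the soft sample closer to one-hot, at the cost of higher-variance gradients.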
(Note from Lei Feng Network AI Technology Review: in an earlier article, Facebook described its visual dialog research in more detail; see "AI that can look at a picture and answer questions: how far away is it?")
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
ELF is an extensible, lightweight, and highly flexible reinforcement learning platform that provides parallel simulation of game environments. On top of ELF, the authors implemented a highly customizable real-time strategy (RTS) game engine with several game environments. One of them, Mini-RTS, is a miniature version of StarCraft that captures the key dynamics of the game and runs at up to 165K frames/second on a laptop, an order of magnitude faster than other platforms. In their experiments, the authors trained a Mini-RTS AI end-to-end with just one GPU and several CPUs, beating rule-based systems with a win rate above 70%.
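The parallel-simulation idea can be sketched in plain Python. `ToyEnv` here is a hypothetical stand-in; ELF's real environments are C++ game simulators exposed through a Python interface:

```python
import concurrent.futures

class ToyEnv:
    """Hypothetical stand-in for a game environment with a step() API."""
    def __init__(self):
        self.t = 0

    def step(self, action):
        self.t += 1
        return {"obs": self.t, "reward": float(action)}

def batch_step(envs, actions):
    """Advance many environments in parallel and return one batch of results,
    so a single model can choose actions for all games at once."""
    with concurrent.futures.ThreadPoolExecutor() as ex:
        return list(ex.map(lambda pair: pair[0].step(pair[1]), zip(envs, actions)))

envs = [ToyEnv() for _ in range(4)]
batch = batch_step(envs, [1, 0, 1, 0])
```

Batching the environments behind one call is what lets a single GPU model serve actions to many simulators at once.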
(Note from Lei Feng Network AI Technology Review: see this article for a detailed explanation: "Tian Yuandong explains Facebook's NIPS 2017 paper: ELF, a deep reinforcement learning research platform everyone can afford")
Fader Networks: Manipulating Images by Sliding Attributes
In this paper, the authors propose a new encoder-decoder architecture, trained adversarially, that disentangles the salient information in an image from specific attribute values and then reconstructs the image. This disentanglement lets us control the attributes and generate many variations of a face photo, for example making a person look younger or older, while preserving naturalness. Most current state-of-the-art methods rely on training an adversarial network in pixel space over different attribute values. Compared with these, the method in the paper uses a much simpler training scheme, and it also performs well when changing several attribute values at once.
Gradient Episodic Memory for Continual Learning
Paper description: One thing machine learning has not done well is learning new problems without forgetting tasks it has already mastered. In this paper, the authors propose a new set of metrics to evaluate how models transfer knowledge across a sequence of learning tasks. Finally, the authors propose GEM (Gradient Episodic Memory), a new algorithm with top performance that lets a learner acquire new tasks without forgetting the skills it learned in the past.
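The core mechanism of GEM can be sketched as a gradient projection: before each update, the proposed gradient is checked against gradients computed on an episodic memory of past tasks, and projected if they conflict. A single-constraint NumPy sketch (the general case in the paper solves a small quadratic program over all past tasks):

```python
import numpy as np

def gem_project(grad, mem_grad):
    """Project the new task's gradient so it does not increase the loss on the
    episodic memory: if the inner product is negative (a conflict), remove the
    conflicting component along the memory gradient."""
    dot = grad @ mem_grad
    if dot < 0:
        grad = grad - (dot / (mem_grad @ mem_grad)) * mem_grad
    return grad

g = np.array([-1.0, 1.0])   # proposed update for the new task
m = np.array([1.0, 0.0])    # gradient on the memory of a past task
g_proj = gem_project(g, m)  # no longer increases the past task's loss
```

After projection, the update is guaranteed not to move against the memory gradient, which is how GEM avoids catastrophic forgetting while still making progress on the new task.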
Houdini: Fooling Deep Structured Prediction Models
Paper description: Generating adversarial examples is an important step in evaluating and improving the robustness of learning machines. So far, most approaches work only for classification tasks and cannot be used to measure a model's performance on the tasks it actually has to solve. In this paper, the authors propose a novel and flexible approach, called Houdini, for generating adversarial examples tailored to the final performance measure of the task under attack, whatever it may be. The authors successfully apply Houdini in several different settings, including speech recognition, pose estimation, and semantic segmentation. In all of them, Houdini-based attacks achieve higher success rates than traditional attacks based on the surrogate losses used to train the models, and the adversarial perturbations Houdini produces are also less perceptible.
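Houdini builds on standard gradient-based adversarial perturbations; as background, here is the fast gradient sign method (FGSM) on a toy linear classifier. Everything below is an illustrative assumption rather than the paper's setup: Houdini's contribution is replacing the surrogate loss in such attacks with a differentiable proxy of the task's actual evaluation metric.

```python
import numpy as np

def fgsm(x, grad_wrt_x, eps):
    """Perturb x by eps in the per-coordinate direction that increases the
    loss fastest, producing an adversarial example."""
    return x + eps * np.sign(grad_wrt_x)

# Toy linear classifier: predict +1 if w @ x > 0.
w = np.array([2.0, -1.0])
x = np.array([0.1, 0.1])        # correctly classified as +1 (score 0.1)
# For label +1, the gradient of the logistic loss w.r.t. x points along -w.
grad = -w
x_adv = fgsm(x, grad, eps=0.2)  # the score flips sign: the model is fooled
```

Even this tiny perturbation (at most 0.2 per coordinate) changes the sign of the score, which is exactly the failure mode adversarial attacks exploit.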
One-Sided Unsupervised Domain Mapping
About the paper: One of the notable findings of 2017 is that a mapping between two visual domains can be learned without any matched training pairs. For example, given a photo of a handbag, such methods can find a matching shoe, even though they have never seen such a pair. Recent approaches all require learning the mapping from one domain to the other together with the inverse mapping. The method proposed in this paper does not need this round trip, and is therefore much more efficient; at the same time, the resulting mapping is significantly more accurate.
On the Optimization Landscape of Tensor Decompositions
In this paper, the authors analyze the optimization landscape of the overcomplete random tensor decomposition problem. This problem has many applications in unsupervised learning, especially in learning latent variable models. In practice, the non-convex objective can be solved efficiently with gradient ascent. The authors' theoretical results show that for arbitrarily small constants…
Poincaré Embeddings for Learning Hierarchical Representations
Representation learning has become extremely important for modeling symbolic data such as text and graphs. Symbolic data often exhibits an implicit hierarchical structure: all dolphins are mammals, all mammals are animals, all animals are living things, and so on. Capturing this hierarchy would benefit many core problems in artificial intelligence, such as reasoning about entailment or modeling complex relations. In this paper, the authors propose a new representation learning method that captures hierarchical structure and similarity information at the same time. Their approach is to change the geometry underlying the embedding space, and they propose an efficient algorithm to learn these hierarchical embeddings. The authors' experiments show that for data with latent hierarchies, the proposed models outperform standard methods in both representational capacity and generalization ability.
Unbounded Cache Model for Online Language Modeling with Open Vocabulary
Modern machine learning methods are generally not robust when the distributions of the training data and the test data differ. This problem arises, for example, when training a model on Wikipedia and then testing it on news data. In this paper, the authors propose a large-scale non-parametric memory component that helps a model adapt dynamically to a new data distribution. The authors apply this approach to language modeling where the training data and test data come from two different domains (Wikipedia and news).
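The idea generalizes the classic cache language model: mix the parametric model's prediction with a non-parametric estimate from recently observed data. A toy word-count sketch follows; note that the paper's unbounded cache actually stores hidden states and retrieves them with approximate nearest-neighbor search, which this simplification omits:

```python
from collections import Counter

def cache_interpolated_prob(word, base_prob, history, lam=0.3):
    """Interpolate the parametric model's probability for a word with a cache
    estimate built from recent history, adapting predictions to a new domain."""
    counts = Counter(history)
    cache_prob = counts[word] / max(len(history), 1)
    return (1.0 - lam) * base_prob + lam * cache_prob

# A domain-specific word like "court" is rare under the base model but
# frequent in the recent (news) history, so its probability is boosted.
history = ["the", "court", "ruled", "the", "court"]
p = cache_interpolated_prob("court", base_prob=0.001, history=history)
```

Words never seen in the recent history fall back to (1 − λ) times the base model's probability, so the cache only ever redistributes mass toward the new domain's vocabulary.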
VAIN: Attentional Multi-agent Predictive Modeling
Predicting the behavior of a large social system or physical system requires modeling the interactions between individual agents. Recent advances, such as interaction networks, have dramatically improved prediction quality by modeling each pairwise interaction, but at an excessive computational cost. In this paper, the authors replace the expensive pairwise interaction model with a simple attentional model that achieves similar accuracy at a much lower computational cost. Its computational complexity, linear in the number of agents, also allows it to scale to much larger multi-agent prediction problems.
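VAIN's attentional pooling can be sketched as follows: each agent i has a feature vector e_i and an attention vector a_i, and agent j's contribution to agent i is weighted by exp(−‖a_i − a_j‖²). The vectors below are toy inputs; in the paper both are produced by learned per-agent encoders:

```python
import numpy as np

def vain_pool(features, attn):
    """For each agent, aggregate the other agents' feature vectors, weighted
    by the similarity of their attention vectors. The factorized weights avoid
    running a learned interaction network on every pair of agents."""
    n = len(features)
    pooled = np.zeros_like(features)
    for i in range(n):
        w = np.array([0.0 if j == i else np.exp(-np.sum((attn[i] - attn[j]) ** 2))
                      for j in range(n)])
        w = w / w.sum()
        pooled[i] = w @ features
    return pooled

features = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
attn = np.array([[0.0], [0.0], [5.0]])  # agents 0 and 1 attend to each other
out = vain_pool(features, attn)         # out[0] is dominated by agent 1
```

Because the pairwise weights come from a simple kernel over per-agent attention vectors rather than a neural network evaluated per pair, the per-agent encoding cost stays linear in the number of agents.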
Paper package download: https://pan.baidu.com/s/1eS3w9OY (password: kn7v)
Other Facebook activities at NIPS 2017
Geometric Deep Learning on Graphs and Manifolds tutorial. Yann LeCun will be present; Monday, December 4, 2:30 pm to 4:45 pm local time, Hall A.
Workshops and symposia (with Facebook involvement or attendance)
Black in AI Workshop
Deep Learning at Supercomputer Scale Workshop. Saturday, December 9, 8 am to 5 pm; researchers from Facebook, DeepMind, Salesforce Research, OpenAI, Google Research, and Baidu Research will give presentations.
Deep Reinforcement Learning Symposium. Thursday, December 7, 2:00 pm to 9:00 pm (posters and snacks from 5:00 to 7:00 pm). David Silver of DeepMind will give a keynote on deep reinforcement learning in AlphaGo, and Ruslan Salakhutdinov, director of AI research at Apple, will give a talk titled "Neural Map: Building Memory for Deep Learning".
Emergent Communication Workshop
Interpretable Machine Learning Symposium. Thursday, December 7, 2:00 pm to 9:30 pm (posters and dinner from 6 to 7 pm), Hall C. Yann LeCun will take part in the closing roundtable discussion starting at 8:30 pm.
Learning Disentangled Representations: from Perception to Control Workshop. Saturday, December 9, 8:30 am to 6:00 pm; Yoshua Bengio is scheduled to speak at 3:30 pm.
Learning in the Presence of Strategic Behavior Workshop
Machine Learning on the Phone and Other Consumer Devices Workshop. Saturday, December 9, 8 am to 6:30 pm, Room 102 A+B.
Machine Learning Systems Workshop. Friday, December 8, 8:45 am to 6:15 pm. Jia Yangqing will introduce Caffe2, Jeff Dean will give a talk on machine learning at 2:50 pm, and there will also be presentations on ONNX and PyTorch.
Optimization for Machine Learning Workshop
Women in Machine Learning (WiML) Workshop
Workshop on Automated Knowledge Base Construction (AKBC)
Workshop on Conversational AI: Today's Practice and Tomorrow's Potential
Workshop on Visually-Grounded Interaction and Language (ViGIL)
That concludes our overview of Facebook's NIPS 2017 papers and activities. Facebook's 10 papers can be downloaded as a package at https://pan.baidu.com/s/1eS3w9OY (password: kn7v). Lei Feng Network AI Technology Review will continue its full coverage of NIPS 2017, so stay tuned.