In our generation, nearly everyone is obsessed with Instagram. Unfortunately, I left the photo-sharing platform back in 2015. The reason is simple: I always hesitated over which photos to post and what text to add to them.
Now, down to business. This technology was developed by researchers on the Google Brain team.
To get this far, the Google Brain team had to train both the vision and the language frameworks on text descriptions written by real people. This keeps the system from mechanically naming the objects in an image. Rather than just pointing out the beach, the kite, and the person in the picture above, the system can generate a complete descriptive sentence. The key to building an accurate model is taking into account how objects relate to one another. "A man is flying a kite" and "a man with a kite above his head" are two different statements.
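The difference between naming objects and describing a scene can be illustrated with a small sketch. This is not Google's model; `OBJECT_WORDS` and `objects_in` are hypothetical names invented for this example, and the vocabulary is an assumption chosen to match the objects mentioned above. The sketch only shows that two captions can contain exactly the same objects while saying different things.

```python
# Hypothetical object vocabulary, chosen only for this illustration.
OBJECT_WORDS = {"man", "kite", "beach"}


def objects_in(caption):
    """Return the set of object words that appear in a caption."""
    words = caption.lower().replace(".", "").split()
    return {word for word in words if word in OBJECT_WORDS}


caption_a = "A man is flying a kite"
caption_b = "A man with a kite above his head"

# Both captions mention the same objects...
same_objects = objects_in(caption_a) == objects_in(caption_b)
# ...but they are not the same sentence, because the relationship differs.
same_caption = caption_a == caption_b

print("same objects:", same_objects)
print("same caption:", same_caption)
```

A system that only lists objects would treat these two captions as equivalent, which is why the relationship between objects has to be modeled as well.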
The previous version of Google's image captioning system took 3 seconds per training step on an Nvidia K20 GPU, but the version open-sourced today handles the same task in only 0.7 seconds, roughly a quarter of the previous time. This means the open-source release is more advanced than the version that tied for first place in last year's Microsoft COCO image captioning challenge.
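As a quick check of the figures quoted above, the ratio can be worked out directly. The two timings are taken from the article; the variable names are only illustrative.

```python
# Per-step training times quoted in the article, in seconds.
previous_step_seconds = 3.0
open_source_step_seconds = 0.7

# Fraction of the old time that the new version needs per step.
ratio = open_source_step_seconds / previous_step_seconds
# How many times faster a single training step has become.
speedup = previous_step_seconds / open_source_step_seconds

print(f"new time as a fraction of the old: {ratio:.2f}")  # 0.23
print(f"speedup per training step: {speedup:.1f}x")       # 4.3x
```

The result is slightly under one quarter, so "a quarter of the time" is a fair rounding.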
Earlier this year, at the Computer Vision and Pattern Recognition (CVPR) conference in Las Vegas, Google discussed an image recognition model it had developed. The model can identify the objects in an image and produce a text description by aggregating separate features from a set of training images whose descriptions were written by humans. An important strength of this model is its ability to bridge logical gaps and link objects to their context. It is this property that will eventually make the technology useful for scene recognition, where a computer vision system needs to distinguish accurately between, for example, a person running from the police and a bystander fleeing the scene of a violent incident.
Image source: WIN-INITIATIVE/GETTY IMAGES, used under a WIN-INITIATIVE license
Translation: Hao Yue