Bilinear CNN models in TensorFlow-Keras

Aug 26, 2019

An implementation in TensorFlow and Keras of the paper Bilinear CNN Models for Fine-grained Visual Recognition [1].

Fine-grained recognition tasks generally involve discriminating between categories that share a common structure but differ in subtle ways, for example distinguishing a “Toyota Prius” from a “Toyota Corolla”. This requires recognizing highly localized attributes under changes in position, viewpoint, lighting and other factors.

A bilinear model for image classification consists of a quadruple B = (fA, fB, P, C), where fA and fB are feature functions, P is a pooling function and C is a classification function. A feature function is a mapping f: L × I → R^(c × D) that takes an image I and a location L and outputs a feature of size c × D. The feature outputs are combined at each location using the matrix outer product; that is, the bilinear feature combination of fA and fB at a location l is given by:

bilinear(l, I, fA, fB) = fA(l, I)ᵀ fB(l, I)

Both fA and fB must have the same feature dimension c to be compatible.¹
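To make the definition concrete, here is a small NumPy sketch of the bilinear combination at one location, followed by pooling over all locations. The function names are made up for this example, and sum pooling is only one possible choice for P (it is an assumption here, not something fixed by the definition above).

```python
import numpy as np


def bilinear_at(f_a, f_b):
    """Bilinear combination at one location: f_a is (c, M), f_b is (c, N)."""
    # Result has shape (M, N); NumPy raises ValueError if the two c values differ.
    return f_a.T @ f_b


def sum_pool(feats_a, feats_b):
    """Sum the per-location combinations; inputs are (L, c, M) and (L, c, N)."""
    return np.einsum("lcm,lcn->mn", feats_a, feats_b)


f_a = np.array([[1.0, 2.0]])       # c = 1, M = 2
f_b = np.array([[3.0, 4.0, 5.0]])  # c = 1, N = 3
print(bilinear_at(f_a, f_b))       # 2 x 3 matrix with rows [3, 4, 5] and [6, 8, 10]
```

With CNN features, c = 1 and each location contributes the outer product of two channel vectors, which is why the two streams need matching spatial locations.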

To build the model, we will use two VGG-16 networks pre-trained on the ImageNet data set to form a bilinear network. First, we need an implementation of the bilinear function in TensorFlow.
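The following is a minimal sketch of such a bilinear function, not necessarily the exact code used for this post. It assumes two feature maps of shape (batch, height, width, channels) with the same spatial size, and it follows the steps described in the paper [1]: sum pooling of the outer products over locations, signed square root, and L2 normalization. The name `bilinear_pool` and the `eps` value are illustrative choices.

```python
import tensorflow as tf


def bilinear_pool(inputs, eps=1e-12):
    """Pooled outer product of two feature maps, returned as a flat vector."""
    feat_a, feat_b = inputs  # (batch, h, w, c_a) and (batch, h, w, c_b)
    # Outer product of the channel vectors at each location, summed over h and w.
    phi = tf.einsum("bhwc,bhwd->bcd", feat_a, feat_b)
    # Flatten the (c_a, c_b) matrix into one vector per image.
    phi = tf.reshape(phi, [-1, feat_a.shape[-1] * feat_b.shape[-1]])
    # Signed square root; eps keeps the gradient finite at zero (assumed value).
    phi = tf.sign(phi) * tf.sqrt(tf.abs(phi) + eps)
    # L2-normalize each descriptor.
    return tf.math.l2_normalize(phi, axis=-1)
```

For two VGG-16 feature maps with 512 channels each, the resulting descriptor has 512 × 512 = 262,144 entries per image.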

Then we load the VGG-16 network and add two dropout layers. The idea behind the dropout layers is to simulate two different networks, but you can also use two different models (with the same output size, so that the outer product is defined). Then we build the full bilinear model.
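One way this could look in Keras is sketched below. It follows the dropout idea described above: a single VGG-16 backbone whose last convolutional output goes through two independent dropout layers that act as the two streams. The layer name `block5_conv3` comes from the Keras VGG-16 model; the dropout rate, the softmax classifier, and the names `BilinearPooling` and `build_bilinear_vgg16` are assumptions made for this example.

```python
import tensorflow as tf
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG16


class BilinearPooling(layers.Layer):
    """Sum-pooled outer product, then signed square root and L2 normalization."""

    def call(self, inputs):
        feat_a, feat_b = inputs
        phi = tf.einsum("bhwc,bhwd->bcd", feat_a, feat_b)
        phi = tf.reshape(phi, [-1, feat_a.shape[-1] * feat_b.shape[-1]])
        phi = tf.sign(phi) * tf.sqrt(tf.abs(phi) + 1e-12)
        return tf.math.l2_normalize(phi, axis=-1)


def build_bilinear_vgg16(n_classes, input_shape=(224, 224, 3),
                         weights="imagenet", dropout_rate=0.5):
    # Convolutional part of VGG-16 only; weights=None skips the download.
    vgg = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    conv_out = vgg.get_layer("block5_conv3").output
    # Two dropout branches over the same features stand in for two networks.
    stream_a = layers.Dropout(dropout_rate)(conv_out)
    stream_b = layers.Dropout(dropout_rate)(conv_out)
    pooled = BilinearPooling(name="bilinear_pooling")([stream_a, stream_b])
    outputs = layers.Dense(n_classes, activation="softmax",
                           name="predictions")(pooled)
    return Model(inputs=vgg.input, outputs=outputs)
```

Calling `build_bilinear_vgg16(n_classes=200)` would give a model with a 262,144-dimensional bilinear descriptor feeding a single dense layer; the class count here is only an example.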

Now we only need to train. In my experience, it is better to first train the fully connected part of the network with the weights of the convolutional part frozen, and then train the whole model. This is a good approach when using pre-trained models such as VGG-16 with ImageNet weights.
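A rough sketch of this two-stage schedule is shown below. It works on any Keras model that contains `Conv2D` layers; the optimizer, learning rates, epoch counts and loss are placeholder assumptions, not the values used for this post, and the helper names are made up for the example.

```python
import tensorflow as tf


def set_conv_trainable(model, trainable):
    """Freeze or unfreeze every convolutional layer of the model."""
    for layer in model.layers:
        if isinstance(layer, tf.keras.layers.Conv2D):
            layer.trainable = trainable


def train_two_stages(model, x, y, head_epochs=5, full_epochs=5,
                     head_lr=1e-3, full_lr=1e-5, batch_size=8):
    # Stage 1: convolutional weights frozen, only the remaining layers learn.
    set_conv_trainable(model, False)
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=head_lr, momentum=0.9),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    head_history = model.fit(x, y, epochs=head_epochs,
                             batch_size=batch_size, verbose=0)

    # Stage 2: unfreeze everything and fine-tune with a smaller learning rate.
    # The model must be compiled again for the trainable change to take effect.
    set_conv_trainable(model, True)
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=full_lr, momentum=0.9),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    full_history = model.fit(x, y, epochs=full_epochs,
                             batch_size=batch_size, verbose=0)
    return head_history, full_history
```

The labels are expected as integer class indices because of the sparse loss; with one-hot labels, `categorical_crossentropy` would be the matching choice.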

My name is Sebastian Correa. I’m an engineer with experience in machine learning, modeling based on NN, RNN, deep learning and… If you want to see more of my projects, my web page is scorrea92.github.io.

[1] T.-Y. Lin, A. RoyChowdhury and S. Maji, “Bilinear CNN Models for Fine-grained Visual Recognition,” Transactions on Pattern Analysis and Machine Intelligence, p. 14, 2017.
