Check out our recent work on segmenting one instance at a time using a recurrent neural network: http://arxiv.org/abs/1511.08250.
You can also find the Torch implementation of the method at https://github.com/bernard24/ris.
You can try the live demo we have implemented using the model described in our paper “Conditional Random Fields as Recurrent Neural Networks”.
Here are a few tricky examples:
In the paper “An embarrassingly simple approach to zero-shot learning”, which we presented at ICML this year, we describe an extremely efficient zero-shot learning method that can be implemented in one line and outperforms the previous state-of-the-art methods.
The code for the experiments can be downloaded here.
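To give a flavour of what "one line" means here, below is a minimal NumPy sketch, assuming the regularized closed-form solution V = (X Xᵀ + γI)⁻¹ X Y Sᵀ (S Sᵀ + λI)⁻¹ given in the paper. The function names, argument names, default regularizer values, and the {-1, +1} label encoding are illustrative assumptions for this sketch; it is not the released experiment code.

```python
import numpy as np

def eszsl_train(X, Y, S, gamma=1.0, lam=1.0):
    """Sketch of the closed-form training step (names are hypothetical).

    X: (d, m) training features, one column per instance.
    Y: (m, z) ground-truth labels over z seen classes, entries in {-1, +1}.
    S: (a, z) attribute signatures of the z seen classes.
    Returns V with shape (d, a).
    """
    d, a = X.shape[0], S.shape[0]
    # The "one line": V = (X X^T + gamma I)^-1 X Y S^T (S S^T + lam I)^-1
    return np.linalg.solve(X @ X.T + gamma * np.eye(d), X @ Y @ S.T) @ np.linalg.inv(S @ S.T + lam * np.eye(a))

def eszsl_predict(V, X_new, S_new):
    """Score each new instance against each unseen class signature.

    X_new: (d, n) features, S_new: (a, z') signatures of unseen classes.
    Returns the index of the highest-scoring class for each instance.
    """
    return np.argmax(X_new.T @ V @ S_new, axis=1)
```

At test time only the attribute signatures of the unseen classes are needed, so `eszsl_predict` can be called with a signature matrix for classes that never appeared in training. The values of `gamma` and `lam` would normally be chosen by validation.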
Head-mounted displays (HMDs) have attracted growing interest recently. They can enable people to communicate with each other from anywhere, at any time. However, since most HMDs today are only equipped with cameras pointing outwards, the remote party cannot see the user wearing the HMD. In this paper, we present a system for facial expression tracking based on head-mounted, inward-looking cameras, so that the user can be represented to the remote party by an animated avatar. The main challenge is that the cameras can only observe partial faces since they are very close to the face. We experiment with multiple machine learning algorithms to estimate facial expression parameters based on training data collected with the assistance of a Kinect depth sensor. Our results show that we can reliably track people’s facial expressions even from the very limited view angles of the cameras.