About Me

I’m currently an Assistant Professor at the Department of Mathematics, Computer Science and Physics, University of Udine, Italy.
I received the M.Sc. (with honors) and the Ph.D. in Multimedia Communications in 2010 and 2014, respectively, under the supervision of Prof. Christian Micheloni. I also received a Ph.D. in Information Engineering, with a major in hierarchical learning architectures, in 2017 under the supervision of Prof. Gian Luca Foresti.

I have published more than 50 papers in leading journals and conferences in machine learning, computer vision, and image processing. I have helped organize international conferences and have served on the program/technical committees of top-tier conferences and workshops (e.g., CVPR/ICCV/ECCV). I was a visiting researcher at the University of California, Riverside, under the guidance of Prof. Amit K. Roy-Chowdhury. I also maintain ongoing collaborations with other research centers, both national and international.

Apart from purely academic work, I am actively involved in transferring the acquired research knowledge to industry as part of a University of Udine spin-off working in vision and multimedia-related fields.

My research interests include machine (deep/self-supervised/active) learning, computer vision, wide area scene analysis and feature transformations.

You can find a more detailed Curriculum Vitae here. You can also visit my Google Scholar profile.

For Students:
If you are a prospective student interested in Computer Vision/Machine Learning/Deep Learning Research at the University of Udine, please read about our Ph.D. admissions process and contact me. If you are applying to our Ph.D. course in Computer Science/Industrial and Information Engineering and are interested in my research, please explicitly state it in your statement of purpose.

If you are a Master's student looking for a thesis, please check the dedicated section.

Latest News

Paper on Aggregating Deep Pyramidal Representations accepted for publication at CVPR2019

Our paper “Aggregating Deep Pyramidal Representations for Person Re-Identification” has been accepted for publication/presentation at the International Conference on Computer Vision and Pattern Recognition (CVPR2019).

Description: Learning discriminative, view-invariant and multi-scale representations of person appearance with different semantic levels is of paramount importance for person Re-Identification (Re-ID). A surge of effort has been spent by the community to learn deep Re-ID models capturing a holistic, single-semantic-level feature representation. To improve the achieved results, additional visual attributes and body part-driven models have been considered. However, these require extensive human annotation labor or demand additional computational effort. We argue that a pyramid-inspired method capturing multi-scale information may overcome such requirements. Precisely, multi-scale stripes that represent visual information of a person can be used by a novel architecture that factorizes them into latent discriminative factors at multiple semantic levels. A multi-task loss is combined with a curriculum learning strategy to learn a discriminative and invariant person representation, which is exploited for triplet-similarity learning. Results on three benchmark Re-ID datasets demonstrate that better performance than existing methods is achieved.
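
As a rough illustration of the triplet-similarity idea mentioned in the description, here is a minimal sketch of a standard margin-based triplet loss. It is not the paper's implementation: the function names, the Euclidean distance, and the margin value of 0.3 are all assumptions made for this example.

```python
import math

def euclidean(u, v):
    # Euclidean distance between two equal-length feature vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    # Hypothetical margin value. The loss is zero once the negative sample
    # is farther from the anchor than the positive sample by at least `margin`.
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy 2-D "embeddings" (made up for illustration).
anchor = [0.0, 0.0]
same_person = [0.0, 1.0]    # distance 1 from the anchor
other_person = [3.0, 4.0]   # distance 5 from the anchor

loss_satisfied = triplet_loss(anchor, same_person, other_person)
loss_violated = triplet_loss(anchor, other_person, same_person)
```

In the first call the ranking constraint already holds, so the loss vanishes; in the second the roles are swapped and the loss is positive.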

Paper on low rank metric learning for person re-identification accepted by PRL

Our paper “Accelerated low-rank sparse metric learning for person re-identification” has been accepted for publication by the Pattern Recognition Letters (PRL) journal.

Description: Person re-identification is an open and challenging problem in computer vision. A surge of effort has been spent on designing the best feature representation and on learning either the transformation of such features across cameras or an optimal matching metric. Metric learning solutions, which are currently in vogue in the field, generally require a dimensionality reduction pre-processing stage to handle the high dimensionality of the adopted feature representation. Such an approach is suboptimal, and a better solution can be achieved by integrating this step into the metric learning process. Towards this objective, a low-rank matrix which projects the high-dimensional vectors to a low-dimensional manifold with a discriminative Euclidean distance is introduced. The goal is achieved with a stochastic accelerated proximal gradient method. Experiments on two public benchmark datasets show that better performance than state-of-the-art methods is achieved.
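
To make the "low-rank projection with a Euclidean distance" idea concrete, the sketch below computes a squared distance after projecting two vectors with a small matrix. The matrix `L`, the helper names, and the toy vectors are hypothetical; the learning step (the stochastic accelerated proximal gradient method) is not shown.

```python
def project(L, x):
    # Multiply an r x d matrix L (list of rows) by a d-dimensional vector x.
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]

def low_rank_distance(L, x, y):
    # Squared Euclidean distance in the projected space: ||L (x - y)||^2.
    diff = [a - b for a, b in zip(x, y)]
    return sum(v * v for v in project(L, diff))

# Hypothetical 2 x 4 projection that keeps only the first two coordinates.
L = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 0.0, 9.0, 9.0]

d = low_rank_distance(L, x, y)
```

With this toy `L`, the large differences in the last two coordinates are ignored, so only the first two contribute to the distance.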

Two papers accepted at the International Conference on Distributed Smart Cameras’18

We are proud to announce that two papers on unsupervised learning from our group have been accepted for publication at the International Conference on Distributed Smart Cameras (ICDSC’18).

Unsupervised Hashing with Neural Trees for Image Retrieval and Person Re-Identification

The paper is a joint work coauthored by Niki Martinel, Gian Luca Foresti, and Christian Micheloni.

Unsupervised Smoke Detection in Normally Smoking Environments
Description: Smoke detection through visual analytics is an open and challenging problem. The existing literature has addressed it mainly by working on the best feature representation and by exploiting supervised solutions that treat smoke detection as a binary classification problem. Differently from such works, we consider the possibility that in some contexts sensing smoke is a common situation, and we want to detect when there are significant fluctuations within this normal situation. In light of this consideration, we propose an unsupervised solution that leverages the concept of anomaly detection. Different visual representations are used together with a temporal smoothing function to reduce the effects of noisy measurements. Such temporally smoothed representations are then exploited to learn a robust “normality” model by means of a One-Class Support Vector Machine. A real prototype has been developed and used to collect a new dataset, on which the proposed solution has been evaluated.

The paper is a joint work coauthored by Matteo Chini, Niki Martinel, Matteo Dunnhofer, Carlo Ceschia, and Christian Micheloni.
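
The temporal smoothing step described above could look something like the following sketch. The description does not say which smoothing function is used, so the exponential moving average, the `alpha` value, and the function name are assumptions; the One-Class SVM that would be fit on the smoothed features is not shown.

```python
def ema_smooth(frames, alpha=0.5):
    # Exponential moving average over a sequence of per-frame feature vectors.
    # `alpha` (hypothetical value) weights the current frame against the history.
    smoothed = []
    state = None
    for features in frames:
        if state is None:
            state = list(features)
        else:
            state = [alpha * f + (1 - alpha) * s for f, s in zip(features, state)]
        smoothed.append(state)
    return smoothed

# Toy 1-D per-frame measurements (made up for illustration).
frames = [[0.0], [1.0], [1.0]]
smoothed = ema_smooth(frames)
```

A jump in the raw measurements is damped in the smoothed sequence, which is the kind of noise reduction a normality model would benefit from.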

Paper on Food Recognition with Deep Neural Networks Presented at WACV’18

Our paper “Wide-Slice Residual Networks for Food Recognition” has been accepted and presented at WACV 2018.

Description: Food diary applications represent a tantalizing market. Such applications, based on food image recognition, have opened new challenges for computer vision and pattern recognition algorithms. Recent works in the field focus either on hand-crafted representations or on learning these by exploiting deep neural networks. Despite the success of the latter family of works, these generally exploit off-the-shelf deep architectures to classify food dishes. Thus, the architectures are not tailored to the specific problem. We believe that better results can be obtained if the deep architecture is defined with respect to an analysis of the food composition. Following this intuition, this work introduces a new deep scheme that is designed to handle the food structure. Specifically, inspired by the recent success of deep residual networks, we exploit such a learning scheme and introduce a slice convolution block to capture the vertical food layers. Outputs of the deep residual blocks are combined with the sliced convolution to produce the classification score for specific food categories. To evaluate our proposed architecture, we have conducted experiments on three benchmark datasets. Results demonstrate that our solution shows better performance than existing approaches (e.g., a top-1 accuracy of 90.27% on the challenging Food-101 dataset).

The paper is a joint work coauthored by Niki Martinel, Gian Luca Foresti, and Christian Micheloni.

Paper on Group Person Re-Identification Accepted at ICCV’17

Our paper “Group Re-Identification via Unsupervised Transfer of Sparse Features Encoding” has been accepted for publication at ICCV 2017.

Description: The existing literature has mainly addressed single-person re-identification, neglecting the fact that people usually move in groups, as in crowded scenarios. We believe that the additional information carried by neighboring individuals provides a relevant visual context that can be exploited to obtain a more robust match of single persons within the group. To this end, we propose a solution for group re-identification that builds on transferring knowledge from single-person re-identification to group re-identification by exploiting sparse dictionary learning.

The paper is a joint work coauthored by Giuseppe Lisanti, Niki Martinel, Alberto Del Bimbo and Gian Luca Foresti.