Learning Output Embeddings in Structured Prediction (submitted to AISTATS’21)

By Luc Brogat-Motte, Alessandro Rudi, Céline Brouard, Juho Rousu, Florence d’Alché-Buc.

Submitted to AISTATS, 2021


Abstract. A powerful and flexible approach to structured prediction consists in embedding the structured objects to be predicted into a feature space of possibly infinite dimension by means of output kernels, and then solving a regression problem in this output space. A prediction in the original space is computed by solving a pre-image problem. In such an approach, the embedding, linked to the target loss, is defined prior to the learning phase. In this work, we propose to jointly learn a finite approximation of the output embedding and the regression function into the new feature space. For that purpose, we leverage a priori information on the outputs and also unexploited unsupervised output data, both of which are often available in structured prediction problems. We prove that the resulting structured predictor is a consistent estimator, and derive an excess risk bound. Moreover, the novel structured prediction tool enjoys a significantly smaller computational complexity than former output kernel methods. Tested empirically on various structured prediction problems, the approach proves versatile and able to handle large datasets.
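The generic pipeline the abstract starts from (embed outputs, regress into the embedding space, then solve a pre-image problem) can be sketched as follows. This is only an illustration under assumptions that are not from the paper: the output embedding is a fixed, explicit finite-dimensional vector, the regressor is kernel ridge regression with a Gaussian input kernel, and the pre-image is found by a nearest-neighbour search over a finite candidate set. The function names and the parameters `gamma` and `lam` are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def fit_predict(X_train, Y_emb_train, X_test, candidates_emb, lam=1e-3):
    """Regress into the output embedding space, then decode.

    Y_emb_train:    embeddings of the training outputs, shape (n, d).
    candidates_emb: embeddings of the candidate outputs, shape (m, d).
    Returns, for each test input, the index of the selected candidate.
    """
    n = X_train.shape[0]
    K = gaussian_kernel(X_train, X_train)
    # Kernel ridge regression with vector-valued targets (the embeddings).
    alpha = np.linalg.solve(K + n * lam * np.eye(n), Y_emb_train)
    pred_emb = gaussian_kernel(X_test, X_train) @ alpha
    # Pre-image step (assumed here to be a search over a finite candidate
    # set): keep the candidate whose embedding is closest to the prediction.
    dists = ((pred_emb[:, None, :] - candidates_emb[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)
```

The paper's contribution, learning the embedding itself, is not shown here; the sketch fixes the embedding in advance, which is exactly the setting the abstract sets out to improve on.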

Investigating CoordConv for Fully and Weakly Supervised Medical Image Segmentation (in Proc. IPTA’20)

By Rosana El Jurdi, Thomas Dargent, Caroline Petitjean, Paul Honeine, Fahed Abdallah.

In Proceedings of the 10th International Conference on Image Processing Theory, Tools and Applications (IPTA), Paris, France, 9 – 12 November 2020.

doi:10.1109/IPTA50016.2020.9286633

Abstract. Convolutional neural networks (CNNs) have established state-of-the-art performance in computer vision tasks such as object detection and segmentation. One of the major remaining challenges concerns their ability to capture consistent spatial attributes, especially in medical image segmentation. One way to address this issue is to integrate a localization prior into the system architecture. CoordConv layers are extensions of convolutional layers in which convolution is conditioned on spatial coordinates. This paper investigates CoordConv as a substitute for convolutional layers for organ segmentation in both fully and weakly supervised settings. Experiments are conducted on two public datasets: SegTHOR, which focuses on the segmentation of thoracic organs at risk in computed tomography (CT) images, and ACDC, which addresses ventricular endocardium segmentation of the heart in MR images. We show that although CoordConv does not significantly increase accuracy with respect to standard convolution, it may speed up model convergence at almost no additional computational cost.
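The idea of conditioning a convolution on spatial coordinates is commonly realized by appending coordinate channels to the input before an ordinary convolution. A minimal NumPy sketch of that step follows. The function name, the channels-first layout and the scaling of coordinates to [-1, 1] are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def add_coord_channels(images):
    """Append normalized row and column coordinate channels.

    images: array of shape (batch, channels, height, width).
    Returns an array of shape (batch, channels + 2, height, width).
    """
    batch, _, height, width = images.shape
    # Coordinates scaled to [-1, 1] (an assumed normalization).
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")  # each (height, width)
    coords = np.stack([yy, xx])[None]            # (1, 2, height, width)
    coords = np.broadcast_to(coords, (batch, 2, height, width))
    return np.concatenate([images, coords.astype(images.dtype)], axis=1)

x = np.zeros((4, 1, 8, 8), dtype=np.float32)
out = add_coord_channels(x)  # shape (4, 3, 8, 8)
```

A standard convolution applied to `out` then sees the pixel position alongside the intensities, which is where the added cost stays small: two extra input channels.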

BB-UNet: U-Net with Bounding Box Prior (in IEEE Journal of Selected Topics in Signal Processing 2020)

By Rosana El Jurdi, Caroline Petitjean, Paul Honeine, Fahed Abdallah.

In IEEE Journal of Selected Topics in Signal Processing, 14(6): 1189-1198. October 2020.

doi:10.1109/JSTSP.2020.3001502

Abstract. Medical image segmentation is the process of anatomically isolating organs for analysis and treatment. Leading works within this domain emerged with the well-known U-Net. Despite its success, recent works have shown the limitations of U-Net when segmenting images affected by noise, corruption or lack of contrast. Integrating prior knowledge helps overcome segmentation ambiguities. This paper introduces BB-UNet (Bounding Box U-Net), a deep learning model that integrates location and shape priors into model training. The proposed model is inspired by U-Net and incorporates priors through a novel convolutional layer introduced at the level of the skip connections. The proposed architecture introduces attention kernels into training in order to guide the model on where to look for the organs. Moreover, it fine-tunes the encoder layers based on positional constraints. The proposed model is used within two main paradigms: as a standalone model in a fully supervised framework, and as an ancillary model in a weakly supervised setting. In the current experiments, manual bounding boxes are fed at inference, so BB-UNet is used in a semi-automatic setting; however, BB-UNet has the potential to be part of a fully automated process if it relies on a preliminary object detection step. To validate the performance of the proposed model, experiments are conducted on two public datasets: the SegTHOR dataset, which focuses on the segmentation of thoracic organs at risk in computed tomography (CT) images, and the Cardiac dataset, a mono-modal MRI dataset released as part of the Decathlon challenge and dedicated to segmentation of the left atrium. Results show that the proposed method outperforms state-of-the-art methods in fully supervised learning frameworks and achieves relevant results in the weakly supervised setting.
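To make the notion of a bounding box prior concrete, the sketch below rasterizes a box into a binary mask of the same size as the image, the kind of map that could be fed to a network alongside the image. It is not the BB-UNet layer itself. The function name and the `(row_min, col_min, row_max, col_max)` convention with inclusive indices are assumptions.

```python
import numpy as np

def box_to_mask(height, width, box):
    """Rasterize a bounding box into a binary prior mask.

    box: (row_min, col_min, row_max, col_max), inclusive pixel indices
    (an assumed convention).
    """
    row_min, col_min, row_max, col_max = box
    mask = np.zeros((height, width), dtype=np.float32)
    mask[row_min:row_max + 1, col_min:col_max + 1] = 1.0
    return mask

prior = box_to_mask(6, 8, (1, 2, 3, 5))  # 3 rows by 4 columns set to 1
```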

Interpretable time series kernel analytics by pre-image estimation (in Artificial Intelligence 2020)

By Thi Phuong Thao Tran, Ahlame Douzal-Chouakria, Saeed Varasteh Yazdi, Paul Honeine, Patrick Gallinari.

In Artificial Intelligence, Volume 286, September 2020, 103342.

Abstract. Kernel methods are known to be effective for analysing complex objects by implicitly embedding them into some feature space. To interpret and analyse the obtained results, it is often necessary to map results obtained in the feature space back to the input space, using pre-image estimation methods. This work proposes a new closed-form pre-image estimation method for time series kernel analytics that consists of two steps. In the first step, a time warp function, driven by distance constraints in the feature space, is defined to embed time series in a metric space where analytics can be performed conveniently. In the second step, the time series pre-image estimation is cast as learning a linear (or a nonlinear) transformation that ensures a local isometry between the time series embedding space and the feature space. The proposed method is compared to the state of the art on three major tasks that require pre-image estimation: 1) time series averaging, 2) time series reconstruction and denoising and 3) time series representation learning. Extensive experiments conducted on 33 publicly available datasets show the benefits of pre-image estimation for time series kernel analytics.

Duality in RKHSs with Infinite Dimensional Outputs: Application to Robust Losses (in ICML’20)

By Pierre Laforgue, Alex Lambert, Luc Brogat-Motte, Florence d’Alché-Buc.

In Proceedings of the 37th International Conference on Machine Learning (ICML), Online, PMLR 119, 2020.


Abstract. Operator-Valued Kernels (OVKs) and associated vector-valued Reproducing Kernel Hilbert Spaces provide an elegant way to extend scalar kernel methods when the output space is a Hilbert space. Although primarily used in finite dimension for problems like multi-task regression, the ability of this framework to deal with infinite dimensional output spaces unlocks many more applications, such as functional regression, structured output prediction, and structured data representation. However, these sophisticated schemes crucially rely on the kernel trick in the output space, so that most previous works have focused on the square norm loss function, completely neglecting robustness issues that may arise in such surrogate problems. To overcome this limitation, this paper develops a duality approach that makes it possible to solve OVK machines for a wide range of loss functions. The infinite dimensional Lagrange multipliers are handled through a Double Representer Theorem, and algorithms for epsilon-insensitive losses and the Huber loss are thoroughly detailed. Robustness benefits are emphasized by a theoretical stability analysis, as well as empirical improvements on structured data applications.
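For reference, the two loss families named in the abstract can be written for a vector-valued residual as functions of its norm. The sketch below uses the usual textbook definitions on finite-dimensional vectors, which is an assumption: the paper works in general Hilbert spaces and its exact variants may differ. The function names and the thresholds `kappa` and `epsilon` are illustrative.

```python
import numpy as np

def huber_loss(residual, kappa=1.0):
    """Quadratic for small residual norms, linear beyond kappa."""
    norm = np.linalg.norm(residual)
    if norm <= kappa:
        return 0.5 * norm ** 2
    return kappa * (norm - 0.5 * kappa)

def epsilon_insensitive_loss(residual, epsilon=0.5):
    """Zero inside a ball of radius epsilon, linear outside."""
    return max(np.linalg.norm(residual) - epsilon, 0.0)
```

Both grow linearly for large residuals, unlike the square norm loss, which is the usual reason such losses are less sensitive to outliers.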

Pixel-wise linear/nonlinear nonnegative matrix factorization for unmixing of hyperspectral data (in ICASSP’20)

By Fei Zhu, Paul Honeine, Jie Chen.

In Proc. 45th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4 – 8 May 2020.

doi:10.1109/ICASSP40776.2020.9053239

Abstract. Nonlinear spectral unmixing is a challenging and important task in hyperspectral image analysis. The kernel-based bi-objective non-negative matrix factorization (Bi-NMF) has shown its usefulness in nonlinear unmixing; however, it suffers from several issues that prevent its practical application. In this work, we propose an unsupervised nonlinear unmixing method that overcomes these weaknesses. Specifically, the new method introduces for each pixel a parameter that adjusts the nonlinearity at that pixel. These parameters are jointly optimized with the endmembers and abundances by applying multiplicative update rules to a carefully designed objective function. Experiments on synthetic and real datasets confirm the effectiveness of the proposed method.
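As background on multiplicative update rules, here is a sketch of the classical Lee-Seung updates for plain linear NMF under a Frobenius-norm objective. It is deliberately not the pixel-wise linear/nonlinear method of the paper, whose objective and updates are not given in the abstract. The function name, the variable names `E` (endmembers) and `A` (abundances), and the small constant `eps` guarding against division by zero are assumptions.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix X (bands x pixels) as E @ A.

    E: (bands, rank) nonnegative endmember matrix.
    A: (rank, pixels) nonnegative abundance matrix.
    """
    rng = np.random.default_rng(seed)
    n_bands, n_pixels = X.shape
    E = rng.random((n_bands, rank)) + eps
    A = rng.random((rank, n_pixels)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled elementwise by a nonnegative ratio,
        # so nonnegativity is preserved without any projection step.
        A *= (E.T @ X) / (E.T @ E @ A + eps)
        E *= (X @ A.T) / (E @ A @ A.T + eps)
    return E, A
```

This plain version imposes no sum-to-one constraint on the abundances, which unmixing methods often add.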

Organ Segmentation in CT Images With Weak Annotations: A Preliminary Study (in GRETSI’19)

By Rosana El Jurdi, Caroline Petitjean, Paul Honeine, Fahed Abdallah.

In Proceedings of the 27th Colloque GRETSI sur le Traitement du Signal et des Images, Lille, France, 26 – 29 August 2019.


Abstract. Medical image segmentation poses unprecedented challenges compared to natural image segmentation, in particular because of the scarcity of annotated datasets. Of particular interest is the ongoing 2019 SegTHOR competition, which consists in Segmenting THoracic Organs at Risk in CT images. While the fully supervised framework (i.e., pixel-level annotation) is considered in this competition, this paper seeks to push the competition toward a new paradigm: weakly supervised segmentation, namely training with only bounding boxes that enclose the organs. After a pre-processing step, the proposed method applies the GrabCut algorithm to transform the bounding-box annotations into pixel-level ones. A deep neural network is then trained on the medical images, and several segmentation loss functions are examined. Experiments show the relevance of the proposed method, which provides results comparable to those of the ongoing fully supervised segmentation competition.
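When a dataset ships with pixel-level masks, bounding-box annotations of the kind described above can be simulated by taking the tightest box around each mask. The sketch below shows that step only. It is an assumption about how weak labels might be derived, not a description of the paper's pre-processing, and the function name and the `(row_min, col_min, row_max, col_max)` convention are hypothetical.

```python
import numpy as np

def mask_to_box(mask):
    """Tightest bounding box around the nonzero pixels of a 2-D mask.

    Returns (row_min, col_min, row_max, col_max) with inclusive indices,
    or None if the mask is empty.
    """
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])
```

A box obtained this way could then initialize a GrabCut-style refinement, for instance through OpenCV's `cv2.grabCut` in rectangle mode, which expects the rectangle as `(x, y, width, height)` rather than the corner convention used here.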