Researcher at NEC Labs America
Media Analytics Department
francescopittaluga at nec-labs dot com
|03/21 -||Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction selected as a CVPR 2021 Oral.|
|10/20 -||Voting-based Approaches For Differentially Private Federated Learning published on arXiv.|
|10/20 -||Towards a MEMS-based Adaptive LIDAR accepted to 3DV 2020.|
|07/20 -||SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction accepted to ECCV 2020.|
|02/20 -||Revealing Scenes by Inverting SfM Reconstructions featured in Computer Vision News.|
|10/19 -||Joined the Media Analytics Department at NEC Labs America as a researcher.|
|06/19 -||Revealing Scenes by Inverting SfM Reconstructions selected as a Best Paper Finalist at CVPR 2019.|
|06/19 -||Presented Privacy Preserving Action Recognition using Coded Aperture Videos at CVPR 2019.|
|06/19 -||Revealing Scenes by Inverting SfM Reconstructions featured in the Microsoft Research Blog.|
|05/19 -||Received a PhD in Electrical Engineering from the University of Florida.|
|04/19 -||Received Attributes of a Gator Engineer Award from UF Herbert Wertheim College of Engineering.|
|03/19 -||Successfully defended my dissertation: Privacy Preserving Computational Cameras. Thank you Sanjeev J. Koppal, José Principe, Baba Vemuri and Kevin Butler for serving on my dissertation committee.|
|01/19 -||Presented Learning Privacy Preserving Encodings through Adversarial Training at WACV 2019.|
|06/18 -||Awarded Microsoft Research Dissertation Grant for work on Privacy Preserving Computational Cameras.|
|05/18 -||Joined Microsoft Research as a research intern.|
I am a researcher in the Media Analytics Department at NEC Labs America interested in computer vision, machine learning and computational photography. Prior to joining NEC Labs, I received a Ph.D. in Electrical Engineering from the University of Florida, where I worked at the FOCUS Lab under the direction of Sanjeev J. Koppal. As a Ph.D. candidate, I was awarded a Microsoft Research Dissertation Grant for my work on Privacy Preserving Computational Cameras. I also interned at the Toyota Technological Institute at Chicago (TTIC), where I worked with Ayan Chakrabarti; Magic Leap's Advanced Technology Lab, where I worked with Laura Trutoiu and Brian Schowengerdt; and Microsoft Research, where I worked with Sudipta Sinha and Sing Bing Kang. Prior to beginning my doctoral studies, I attended Tufts University, where I received a B.S. in Electrical Engineering with a second major in Computer Science and worked as an undergraduate researcher under the direction of Karen Panetta. During this time, I also interned at GE Intelligent Platforms and participated in the National Science Foundation Research Experience for Undergraduates Program at Florida International University.
University of Florida
Institute at Chicago
Advanced Tech. Lab
NEC Labs America
We address two key challenges in trajectory prediction: learning multimodal outputs and improving predictions by imposing constraints using driving knowledge. Recent methods have achieved strong performance using Multi-Choice Learning objectives like winner-takes-all (WTA), but are highly dependent on their initialization to provide diverse outputs. We propose a novel Divide-And-Conquer (DAC) approach that acts as a better initialization for the WTA objective, resulting in diverse outputs without any spurious modes. Further, we introduce a novel trajectory prediction framework called ALAN that uses existing lane center lines as anchors to constrain predicted trajectories.
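As a rough illustration of the winner-takes-all objective mentioned above, the sketch below scores K hypothesis trajectories against one ground-truth trajectory and penalizes only the closest one. This is a minimal sketch under stated assumptions, not the DAC or ALAN method: the function name `wta_loss`, the average-displacement error, and the toy trajectories are all hypothetical choices made for this example.

```python
import math

def wta_loss(hypotheses, ground_truth):
    """Winner-takes-all loss over K hypothesis trajectories.

    hypotheses: list of K trajectories, each a list of (x, y) points.
    ground_truth: one trajectory with the same number of points.
    Returns (loss, index of the winning hypothesis).
    """
    def average_displacement(traj):
        # Mean Euclidean distance between matching points (an assumed metric).
        return sum(math.dist(p, q) for p, q in zip(traj, ground_truth)) / len(ground_truth)

    errors = [average_displacement(h) for h in hypotheses]
    # Only the best hypothesis contributes to the loss; the others are ignored.
    winner = min(range(len(errors)), key=errors.__getitem__)
    return errors[winner], winner

# Toy data (hypothetical): a straight ground truth and two candidate futures.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
hyps = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # veers away from the ground truth
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)],  # stays close to the ground truth
]
loss, winner = wta_loss(hyps, gt)
print(winner, round(loss, 3))
```

Because only the winning hypothesis is updated in training, a hypothesis that never wins never improves, which is one way to see why the starting point of the hypotheses matters for output diversity.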
While federated learning enables distributed agents to collaboratively train a centralized model without sharing data with each other, it fails to protect users against inference attacks that mine private information from the centralized model. Thus, facilitating federated learning methods with differential privacy becomes attractive. Existing algorithms based on privately aggregating clipped gradients require many rounds of communication, which may not converge, and cannot scale up to large-capacity models due to explicit dimension-dependence in their added noise. In this paper, we adapt the knowledge transfer model of private learning from PATE, as well as the recent alternative PrivateKNN, to the federated learning setting. The key difference is that our method privately aggregates the labels from the agents in a voting scheme, instead of aggregating the gradients, hence avoiding the dimension dependence and achieving significant savings in communication cost.
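The label-voting idea can be sketched as a noisy plurality vote: each agent submits one predicted label for a query, noise is added to the per-class counts, and the class with the highest noisy count is released. This is only a sketch in the spirit of PATE-style aggregation, not the paper's exact mechanism. The Laplace noise, the function name `noisy_vote`, and the `noise_scale` parameter are assumptions for illustration, and no privacy accounting is shown.

```python
import random
from collections import Counter

def noisy_vote(agent_labels, num_classes, noise_scale, rng):
    """Return the class with the highest noisy vote count.

    agent_labels: one predicted class index per agent.
    noise_scale: scale of the Laplace noise added to each count (must be > 0).
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    counts = Counter(agent_labels)
    noisy_counts = []
    for c in range(num_classes):
        # The difference of two exponential samples is Laplace-distributed.
        noise = rng.expovariate(1.0 / noise_scale) - rng.expovariate(1.0 / noise_scale)
        noisy_counts.append(counts.get(c, 0) + noise)
    return max(range(num_classes), key=noisy_counts.__getitem__)

# Hypothetical example: five agents label one unlabeled query.
rng = random.Random(0)
votes = [0, 1, 1, 2, 1]
released_label = noisy_vote(votes, num_classes=3, noise_scale=1.0, rng=rng)
print(released_label)
```

Each agent contributes a single label per query rather than a full gradient vector, so the amount of noise and communication in this sketch does not grow with the number of model parameters.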
Unlike most artificial sensors, animal eyes foveate, or distribute resolution where it is needed. This is computationally efficient, since neuronal resources are concentrated on regions of interest. Similarly, we believe that an adaptive LIDAR would be useful on resource-constrained small platforms, such as micro-UAVs. We present a proof-of-concept LIDAR design that allows adaptive real-time measurements according to dynamically specified measurement patterns. We describe our optical setup and calibration, which enables fast sparse depth measurements using a scanning MEMS (micro-electro-mechanical) mirror. We validate the efficacy of our prototype LIDAR design by testing on over 75 static and dynamic scenes spanning a range of environments. We show CNN-based depth-map completion experiments which demonstrate that our sensor can realize adaptive depth sensing for dynamic scenes.
We propose advances that address two key challenges in future trajectory prediction: (i) multimodality in both training data and predictions and (ii) constant time inference regardless of number of agents. Existing trajectory predictions are fundamentally limited by lack of diversity in training data, which is difficult to acquire with sufficient coverage of possible modes. Our first contribution is an automatic method to simulate diverse trajectories in the top-view. It uses pre-existing datasets and maps as initialization, mines existing trajectories to represent realistic driving behaviors and uses a multi-agent vehicle dynamics simulator to generate diverse new trajectories that cover various modes and are consistent with scene layout constraints. Our second contribution is a novel method that generates diverse predictions while accounting for scene semantics and multi-agent interactions, with constant-time inference independent of the number of agents. We propose a convLSTM with novel state pooling operations and losses to predict scene-consistent states of multiple agents in a single forward pass, along with a CVAE for diversity.
Many 3D vision systems utilize pose and localization from a pre-captured 3D point cloud. Such 3D models are often obtained using structure from motion (SfM), after which the images are discarded to preserve privacy. In this paper, we show, for the first time, that SfM point clouds retain enough information to reveal scene appearance and compromise privacy. We present a privacy attack that reconstructs color images of the scene from the point cloud. Our method is based on a cascaded U-Net that takes as input a 2D image of the points from a chosen viewpoint, along with point depth, color, and SIFT descriptors, and outputs an image of the scene from that viewpoint. Unlike previous SIFT inversion methods, we handle highly sparse and irregular inputs and tackle the issue of many unknowns, namely, SIFT keypoint orientation and scale, image source, and 3D point visibility. We evaluate our attack algorithm on public datasets (MegaDepth and NYU Depth V2) and analyze the significance of the point cloud attributes. Finally, we synthesize novel views to create compelling virtual tours of scenes.
The risk of unauthorized remote access of streaming video from networked cameras underlines the need for stronger privacy safeguards. Towards this end, we simulate a lens-free coded aperture (CA) camera as an appearance encoder, i.e., the first layer of privacy protection. Our goal is human action recognition from coded aperture videos for which the coded aperture mask is unknown and does not require reconstruction. We insert a second layer of privacy protection by using non-invertible motion features based on phase correlation and log-polar transformation. Phase correlation encodes translation, while the log-polar transformation encodes in-plane rotation and scaling. We show the key property of the translation features being mask-invariant. This property allows us to simplify the training of classifiers by removing reliance on a specific mask design. Results based on a subset of the UCF and NTU datasets show the feasibility of our system.
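To illustrate how phase correlation encodes translation, and why the result need not depend on the mask, here is a minimal 1D sketch using only the standard library. It is a simplified stand-in for the 2D video setting: the signal, the shift, the toy mask, and all helper names are hypothetical, the coded aperture is modeled as a circular convolution with the mask, and the mask is assumed to have no zeros in its spectrum.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); fine for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def phase_correlation_shift(f, g, eps=1e-12):
    """Estimate the circular shift d such that g[t] == f[t - d]."""
    F, G = dft(f), dft(g)
    cross = [a.conjugate() * b for a, b in zip(F, G)]
    # Keep only the phase of the cross-power spectrum.
    normalized = [c / (abs(c) + eps) for c in cross]
    response = [v.real for v in dft(normalized, inverse=True)]
    # The response peaks at the shift between the two signals.
    return max(range(len(response)), key=response.__getitem__)

def circular_shift(x, d):
    n = len(x)
    return [x[(t - d) % n] for t in range(n)]

def circular_convolve(x, kernel):
    n = len(x)
    return [sum(x[(t - j) % n] * kernel[j] for j in range(len(kernel))) for t in range(n)]

# Hypothetical 1D "frame", the same frame shifted by 3 samples, and a toy mask.
frame = [0, 0, 1, 3, 2, 0, 0, 0, 5, 1, 0, 0, 0, 0, 2, 0]
moved = circular_shift(frame, 3)
mask = [1.0, 0.5, 0.25]

plain_shift = phase_correlation_shift(frame, moved)
coded_shift = phase_correlation_shift(circular_convolve(frame, mask),
                                      circular_convolve(moved, mask))
print(plain_shift, coded_shift)
```

In this toy model the mask multiplies both spectra by the same factor, and normalizing the cross-power spectrum removes that factor, so the estimated shift is the same with or without the mask.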
We present a framework to learn privacy preserving encodings of images that inhibit inference of chosen private attributes, while allowing recovery of other desirable information. Rather than simply inhibiting a given fixed pretrained estimator, our goal is that an estimator be unable to learn to accurately predict the private attributes even with knowledge of the encoding function. We use a natural adversarial optimization-based formulation for this, training the encoding function against a classifier for the private attribute, with both modeled as deep neural networks. The key contribution of our work is a stable and convergent optimization approach that is successful at learning an encoder with our desired properties: maintaining utility while inhibiting inference of private attributes, not just within the adversarial optimization, but also by classifiers that are trained after the encoder is fixed. We adopt a rigorous experimental protocol for verification wherein classifiers are trained exhaustively till saturation on the fixed encoders. We evaluate our approach on tasks of real-world complexity, learning high-dimensional encodings that inhibit detection of different scene categories, and find that it yields encoders that are resilient at maintaining privacy.
The next wave of micro and nano devices will create a world with trillions of small networked cameras. This will lead to increased concerns about privacy and security. Most privacy preserving algorithms for computer vision are applied after image/video data has been captured. We propose to use privacy preserving optics that filter or block sensitive information directly from the incident light-field before sensor measurements are made, adding a new layer of privacy. In addition to balancing the privacy and utility of the captured data, we address trade-offs unique to miniature vision sensors, such as achieving high-quality field-of-view and resolution within the constraints of mass and volume. Our privacy preserving optics enable applications such as depth and thermal sensing and full-body motion tracking. While we demonstrate applications on macro-scale devices (smartphones, webcams, etc.), our theory has impact for smaller devices.
As cameras turn ubiquitous, balancing privacy and utility becomes crucial. To achieve both, we enforce privacy at the sensor level, as incident photons are converted into an electrical signal and then digitized into image measurements. We present sensor protocols and accompanying algorithms that degrade facial information for thermal sensors, where there is usually a clear distinction between humans and the scene. By manipulating the sensor processes of gain, digitization, exposure time, and bias voltage, we are able to provide privacy during the actual image formation process and the original face data is never directly captured or stored. We show privacy-preserving thermal imaging applications such as temperature segmentation, night vision, gesture recognition and HDR imaging.
First responders' ability to respond rapidly to emergency situations is limited by a lack of real time intelligence. To ensure the safety of the responders, the situation must first be evaluated for dangerous conditions including life-threatening hazards. Live visual feeds let remote experts gauge the safety levels and assess damages of a situation, but do not perform adequately when images are captured in poor lighting or in harsh environmental conditions. We present a novel approach that leverages HVS-based (Human Visual System) object detection in combination with low-cost commercial off-the-shelf UAVs, to deliver efficient real time image enhancement and detection. This approach enables our system to deliver timely information in low visibility environments making it ideal for aiding first responders in their search for critical objects such as wounded victims, human bodies and threat objects.