Researcher at NEC Labs America
Media Analytics Department
francescopittaluga at nec-labs dot com
|• 07/01/20 -||SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction accepted to ECCV 2020.|
|• 03/24/20 -||A MEMS-based Foveating LIDAR to enable Real-time Adaptive Depth Sensing published to arXiv.|
|• 02/03/20 -||Revealing Scenes by Inverting SfM Reconstructions featured in Computer Vision News.|
|• 10/03/19 -||Joined the Media Analytics Department at NEC Labs America as a researcher.|
|• 06/18/19 -||Revealing Scenes by Inverting SfM Reconstructions selected as a Best Paper Finalist at CVPR 2019.|
|• 06/18/19 -||Presented Revealing Scenes by Inverting SfM Reconstructions at CVPR 2019 (Oral Session on 3D Multiview).|
|• 06/17/19 -||Presented Revealing Scenes by Inverting SfM Reconstructions at DYNAVIS 2019 (CVPRW on Dynamic Scene Reconstruction).|
|• 06/16/19 -||Presented Privacy Preserving Action Recognition using Coded Aperture Videos at CV-COPS 2019 (CVPRW on the Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security).|
|• 06/13/19 -||Revealing Scenes by Inverting SfM Reconstructions featured in the Microsoft Research Blog.|
|• 05/03/19 -||Received a PhD in Electrical Engineering from the University of Florida.|
|• 04/03/19 -||Awarded UF Herbert Wertheim College of Engineering Attributes of a Gator Engineer Award.|
|• 03/13/19 -||Successfully defended my dissertation: Privacy Preserving Computational Cameras. Thank you Sanjeev J. Koppal, José Principe, Baba Vemuri and Kevin Butler for serving on my dissertation committee.|
|• 01/07/19 -||Presented Learning Privacy Preserving Encodings through Adversarial Training at WACV 2019.|
|• 10/29/18 -||Presented on Privacy Preserving Computational Cameras at Magic Leap's Advanced Technology Lab.|
|• 06/09/18 -||Presented on Privacy Preserving Computational Cameras at Microsoft Research PhD Summit.|
|• 06/14/18 -||Awarded 2018-2019 Microsoft Research Dissertation Grant for work on Privacy Preserving Computational Cameras.|
|• 11/01/17 -||Joined the Interactive Media Group at Microsoft Research as a research intern.|
|• 07/05/17 -||Pre-Capture Privacy Cameras for Small Vision Sensors accepted for publication in PAMI.|
|• 05/20/17 -||Presented on Privacy Preserving Computational Cameras at CV-COPS 2017.|
|• 09/02/16 -||Joined the Advanced Technology Lab at Magic Leap as an intern.|
|• 09/02/16 -||Joined the Toyota Technological Institute at Chicago (TTIC) as a visiting researcher.|
|• 07/15/16 -||Presented on Pre-Capture Privacy Cameras at the Safe Autonomous Cyber Physical Systems Workshop 2016.|
|• 07/05/16 -||Presented on Pre-Capture Privacy Cameras at the 9th Annual DNDO ARI Conference in Atlanta, GA.|
|• 05/13/16 -||Presented Sensor-level Privacy for Thermal Cameras at ICCP 2016.|
|• 06/08/15 -||Presented Privacy Preserving Optics for Miniature Vision Sensors at CVPR 2015.|
I am a researcher in the Media Analytics Department at NEC Labs America interested in computer vision, machine learning and computational photography. Prior to joining NEC Labs, I received a Ph.D. in Electrical Engineering from the University of Florida, where I worked at the FOCUS Lab under the direction of Sanjeev J. Koppal. As a Ph.D. candidate, I was awarded a Microsoft Research Dissertation Grant for my work on Privacy Preserving Computational Cameras. I also interned at the Toyota Technological Institute at Chicago (TTIC), where I worked with Ayan Chakrabarti; Magic Leap's Advanced Technology Lab, where I worked with Laura Trutoiu and Brian Schowengerdt; and Microsoft Research, where I worked with Sudipta Sinha and Sing Bing Kang. Prior to beginning my doctoral studies, I attended Tufts University, where I received a B.S. in Electrical Engineering with a second major in Computer Science and worked as an undergraduate researcher under the direction of Karen Panetta. During this time, I also interned at GE Intelligent Platforms and participated in the National Science Foundation Research Experience for Undergraduates Program at Florida International University.
University of Florida
Institute at Chicago
Advanced Tech. Lab
NEC Labs America
We propose advances that address two key challenges in future trajectory prediction: (i) multimodality in both training data and predictions and (ii) constant time inference regardless of number of agents. Existing trajectory predictions are fundamentally limited by lack of diversity in training data, which is difficult to acquire with sufficient coverage of possible modes. Our first contribution is an automatic method to simulate diverse trajectories in the top-view. It uses pre-existing datasets and maps as initialization, mines existing trajectories to represent realistic driving behaviors and uses a multi-agent vehicle dynamics simulator to generate diverse new trajectories that cover various modes and are consistent with scene layout constraints. Our second contribution is a novel method that generates diverse predictions while accounting for scene semantics and multi-agent interactions, with constant-time inference independent of the number of agents. We propose a convLSTM with novel state pooling operations and losses to predict scene-consistent states of multiple agents in a single forward pass, along with a CVAE for diversity.
Most active depth sensors sample their visual field using a fixed pattern, decided by accuracy, speed and cost trade-offs, rather than scene content. However, a number of recent works have demonstrated that adapting measurement patterns to scene content can offer significantly better trade-offs. We propose a hardware LIDAR design that allows flexible real-time measurements according to dynamically specified measurement patterns. Our flexible depth sensor design consists of a controllable scanning LIDAR that can foveate, or increase resolution in regions of interest, and that can fully leverage the power of adaptive depth sensing. We describe our optical setup and calibration, which enables fast sparse depth measurements using a scanning MEMS (micro-electro mechanical) mirror. We validate the efficacy of our prototype LIDAR design by testing on over 75 static and dynamic scenes spanning a range of environments. We also show CNN-based depth-map completion from measurements obtained by our sensor. Our experiments show that our sensor can realize adaptive depth sensing systems.
Many 3D vision systems utilize pose and localization from a pre-captured 3D point cloud. Such 3D models are often obtained using structure from motion (SfM), after which the images are discarded to preserve privacy. In this paper, we show, for the first time, that SfM point clouds retain enough information to reveal scene appearance and compromise privacy. We present a privacy attack that reconstructs color images of the scene from the point cloud. Our method is based on a cascaded U-Net that takes as input a 2D image of the points from a chosen viewpoint, as well as point depth, color, and SIFT descriptors, and outputs an image of the scene from that viewpoint. Unlike previous SIFT inversion methods, we handle highly sparse and irregular inputs and tackle the issue of many unknowns, namely, SIFT keypoint orientation and scale, image source, and 3D point visibility. We evaluate our attack algorithm on public datasets (MegaDepth and NYU Depth V2) and analyze the significance of the point cloud attributes. Finally, we synthesize novel views to create compelling virtual tours of scenes.
The risk of unauthorized remote access of streaming video from networked cameras underlines the need for stronger privacy safeguards. Towards this end, we simulate a lens-free coded aperture (CA) camera as an appearance encoder, i.e., the first layer of privacy protection. Our goal is human action recognition from coded aperture videos for which the coded aperture mask is unknown and does not require reconstruction. We insert a second layer of privacy protection by using non-invertible motion features based on phase correlation and log-polar transformation. Phase correlation encodes translation while the log polar transformation encodes in-plane rotation and scaling. We show the key property of the translation features being mask-invariant. This property allows us to simplify the training of classifiers by removing reliance on a specific mask design. Results based on a subset of the UCF and NTU datasets show the feasibility of our system.
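The mask-invariance of the translation features can be illustrated with a small NumPy sketch. It is not the paper's implementation: it assumes the coded aperture image is a circular convolution of the scene with the mask, uses a random binary mask and a random synthetic scene, and the helper names (`phase_correlation`, `estimate_shift`, `circular_convolve`) are hypothetical. Under that model the mask multiplies both spectra by the same factor, so it cancels when the cross-power spectrum is normalized to unit magnitude.

```python
import numpy as np

def phase_correlation(a, b, eps=1e-12):
    """Phase-correlation surface between two equally sized 2D arrays."""
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    cross /= np.abs(cross) + eps  # discard magnitude, keep phase only
    return np.real(np.fft.ifft2(cross))

def estimate_shift(a, b):
    """Circular (row, col) shift that maps b onto a, taken from the peak."""
    surface = phase_correlation(a, b)
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    return tuple(int(i) for i in peak)

def circular_convolve(image, mask):
    """Assumed coded-aperture model: circular convolution with the mask."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(mask)))

rng = np.random.default_rng(0)
scene = rng.random((64, 64))                           # synthetic frame
shifted = np.roll(scene, shift=(5, 12), axis=(0, 1))   # translated frame
mask = rng.integers(0, 2, size=(64, 64)).astype(float) # hypothetical mask

# Shift estimated from the plain frames and from the coded frames.
shift_plain = estimate_shift(shifted, scene)
shift_coded = estimate_shift(circular_convolve(shifted, mask),
                             circular_convolve(scene, mask))
print(shift_plain, shift_coded)
```

Both estimates come from the location of the correlation peak, so the same translation is recovered without ever knowing or inverting the mask.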
We present a framework to learn privacy preserving encodings of images that inhibit inference of chosen private attributes, while allowing recovery of other desirable information. Rather than simply inhibiting a given fixed pretrained estimator, our goal is that an estimator be unable to learn to accurately predict the private attributes even with knowledge of the encoding function. We use a natural adversarial optimization-based formulation for this, training the encoding function against a classifier for the private attribute, with both modeled as deep neural networks. The key contribution of our work is a stable and convergent optimization approach that is successful at learning an encoder with our desired properties (maintaining utility while inhibiting inference of private attributes), not just within the adversarial optimization, but also by classifiers that are trained after the encoder is fixed. We adopt a rigorous experimental protocol for verification wherein classifiers are trained exhaustively till saturation on the fixed encoders. We evaluate our approach on tasks of real-world complexity (learning high-dimensional encodings that inhibit detection of different scene categories) and find that it yields encoders that are resilient at maintaining privacy.
The next wave of micro and nano devices will create a world with trillions of small networked cameras. This will lead to increased concerns about privacy and security. Most privacy preserving algorithms for computer vision are applied after image/video data has been captured. We propose to use privacy preserving optics that filter or block sensitive information directly from the incident light-field before sensor measurements are made, adding a new layer of privacy. In addition to balancing the privacy and utility of the captured data, we address trade-offs unique to miniature vision sensors, such as achieving high-quality field-of-view and resolution within the constraints of mass and volume. Our privacy preserving optics enable applications such as depth and thermal sensing and full-body motion tracking. While we demonstrate applications on macro-scale devices (smartphones, webcams, etc.) our theory has impact for smaller devices.
As cameras turn ubiquitous, balancing privacy and utility becomes crucial. To achieve both, we enforce privacy at the sensor level, as incident photons are converted into an electrical signal and then digitized into image measurements. We present sensor protocols and accompanying algorithms that degrade facial information for thermal sensors, where there is usually a clear distinction between humans and the scene. By manipulating the sensor processes of gain, digitization, exposure time, and bias voltage, we are able to provide privacy during the actual image formation process and the original face data is never directly captured or stored. We show privacy-preserving thermal imaging applications such as temperature segmentation, night vision, gesture recognition and HDR imaging.
First responders' ability to respond rapidly to emergency situations is limited by a lack of real-time intelligence. To ensure the safety of the responders, the situation must first be evaluated for dangerous conditions including life-threatening hazards. Live visual feeds let remote experts gauge the safety levels and assess damages of a situation, but do not perform adequately when images are captured in poor lighting or in harsh environmental conditions. We present a novel approach that leverages HVS-based (Human Visual System) object detection in combination with low-cost commercial off-the-shelf UAVs to deliver efficient real-time image enhancement and detection. This approach enables our system to deliver timely information in low-visibility environments, making it ideal for aiding first responders in their search for critical objects such as wounded victims, human bodies and threat objects.
A de-identification assembly comprising an object tracking sensor to track features of an object; and a mask generator to produce rays of light in response to the tracked features of the object, the rays of light representing a de-identification mask of the object. The assembly includes a beamsplitter having a first side configured to receive rays of light representing the object and a second side configured to receive the rays of light of the mask from the mask generator. The beamsplitter produces a composite image of the object superimposed with the de-identification mask to anonymize an image of the object. A system including the de-identification assembly and a method are also provided.
Embodiments herein relate to an optical privatizing device, system and method of use. An aspect of the embodiments includes a device comprising: a removable frame removably attachable to a sensor housing; and a blurring lens coupled to the removable frame and configured to optically modify light passing to a depth sensor. The optically modified light has a privatizing blur level to neutralize a profile of an object sensed by the depth sensor within a working volume of the depth sensor to an un-identifiable state while maintaining a depth parameter sensed by the depth sensor. Embodiments also include an optical privatizing device and a device having a depth sensor with a working volume and an RGB sensor.
Rules: The game starts with five frogs, which are counted as the player's lives. Losing all five frogs results in the end of the game. The objective of the game is to guide each frog to one of the designated spaces at the top of the screen. The frog starts at the bottom of the screen. The player must guide the frog between opposing lanes of traffic to avoid becoming roadkill, which results in a loss of one life. The upper portion of the screen consists of a river with logs and turtles, all moving horizontally across the screen. By jumping on swiftly moving logs and the backs of turtles the player can guide their frog to safety. The player may catch bugs which appear periodically for bonuses.
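The life-counting rules above can be sketched as a tiny state model. This is only an illustration, not the project's code: the class name `FroggerState`, its method names, and the bonus value of 100 points per bug are assumptions, and movement, traffic, and river logic are left out.

```python
class FroggerState:
    """Minimal bookkeeping for the rules: lives, frogs home, bonus."""

    def __init__(self, lives=5):
        self.lives = lives      # the game starts with five frogs
        self.frogs_home = 0     # frogs guided to a designated space
        self.bonus = 0          # bonus from bugs (point value assumed)

    @property
    def game_over(self):
        # Losing all frogs ends the game.
        return self.lives <= 0

    def lose_frog(self):
        """A frog was hit by traffic: one life is lost."""
        if not self.game_over:
            self.lives -= 1

    def reach_home(self):
        """A frog reached a designated space at the top of the screen."""
        if not self.game_over:
            self.frogs_home += 1

    def catch_bug(self, points=100):
        """A bug was caught: add a bonus (100 is a placeholder value)."""
        if not self.game_over:
            self.bonus += points


state = FroggerState()
state.reach_home()
state.catch_bug()
for _ in range(5):
    state.lose_frog()
state.lose_frog()  # ignored: the game is already over
print(state.lives, state.frogs_home, state.bonus, state.game_over)
```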
GPUs: One NVIDIA TitanX, Nine NVIDIA 1080ti
CPUs: Two Intel Xeon E5-2680 v4
RAM: 512GB (16x32GB) DDR4 2400
Storage: 10TB (5x2TB) SSD
Cables: 8 Pin Male to Dual 2x8 Pin Male PCI Express
Motherboard, Case, Fans, Power Supplies: Supermicro SYS-4028GR-TRT2