
AI for Robotics

We develop models that help robots perceive and understand their surroundings, enabling them to follow natural-language instructions and navigate autonomously. On the perception side, we focus on multispectral imagery (mostly satellite data) and radio-frequency (RF) signals for wireless sensing. On the action side, we study drone navigation in complex environments. We have built a strong vision encoder for satellite imagery with robust cross-satellite generalization. We have also developed neural networks for processing RF data to perform device localization and environment reconstruction. Finally, we fine-tune Qwen and Isaac vision-language models for navigation in simulated urban environments.

2025

Do Satellite Tasks Need Special Pretraining?

We show that pretraining specifically for remote sensing applications does not always improve over general-purpose pretraining of visual models.

Poster @ICLR ML4RS Workshop

2025

Teaching Visual Language Models to Navigate using Maps

We show that Qwen-based vision-language models do not understand simple maps. We fine-tune them to teach this specific skill, which improves navigation based on maps and visual input.

Poster @ICLR Robot Learning

2025

Vision Transformers for Efficient Indoor Pathloss Radio Map Prediction

We describe a ViT-based solution for predicting indoor radio maps, developed as part of the ICASSP 2025 SPGC Challenge.

Paper @Electronics

2025

GeoCrossBench: Cross-Band Generalization for Remote Sensing

We design a benchmark for evaluating remote sensing foundation models in cross-satellite generalization settings. We also design a strong self-supervised baseline.

Poster @ICML TerraBytes Workshop

2025

Less is More? Data Specialization for Self-Supervised Remote Sensing Models

Data specialization means selecting a subset of the dataset such that training on it improves model performance in a compute-controlled setting. We show an example of how it works on a dataset of satellite images from Maxar.

Poster @ICML DataWorld & TerraBytes Workshops

2025

Fusion of Pervasive RF Data with Spatial Images via Vision Transformers for Enhanced Mapping in Smart Cities

We show how incorrect maps from online mapping services can be improved on the ground by leveraging radio signal parameters across antennas and devices.

Preprint

2025

U-Net for Indoor Pathloss Prediction from Sparse Measurements with Physics-Informed Features

A U-Net-based method for predicting indoor radio maps when only a few sparse measurements are available. This is our solution for the MLSP 2025 Challenge.

Paper @IEEE MLSP

2025

Bridging the Sim-to-Real Gap in RF Localization with Large-Scale Synthetic Pretraining

This work provides a systematic study of synthetic-to-real transfer in RF localization for wireless communication and highlights the value of simulation-aware pretraining for generalizing deep learning models to real-world scenarios.

Paper @Information Fusion

2025

Scalable Generation of Synthetic IoT Network Datasets: A Case Study with Cooja

This work introduces an automated pipeline for generating large-scale IoT network datasets by bringing together the Contiki-NG firmware, parameterized topology generation, and Slurm-based orchestration of Cooja simulations. The system supports a variety of network structures, scalable node counts, randomized battery allocations, and routing protocols to reproduce diverse failure modes.

Paper @Future Internet

2025

Towards Fine-tuning a Small Vision-Language Model for Aerial Navigation

This paper addresses the CityNav aerial navigation benchmark by fine-tuning a small, open-source Vision-Language Model, Qwen2.5-VL-3B.

Poster @NeurIPS Embedded World Models Workshop

2024

In-context learning in presence of spurious correlations

Spurious correlations cause serious problems in all ML algorithms. In this paper, we investigate the challenges they pose for image classification within the in-context learning paradigm with transformers.

Under review @TMLR

2024

Analyzing Local Representations of Self-supervised Vision Transformers

In this paper we investigate the differences between various self-supervised algorithms for learning visual encoders (e.g. MAE, DINO). We highlight critical issues with MAE-like methods.

Preprint

2024

Deep learning with synthetic data for wireless NLOS positioning with a single base station

We show that having perfect information about radio signals received at a device from even a single base antenna can be enough to localize the device.

Paper @Ad Hoc Networks

2023

Identifying and disentangling spurious features in pretrained image representations

We show that pretrained image representations, including DINOv2, contain spurious correlations that can harm classification accuracy. We propose a method to remove such correlations from the representations.

Poster @ICML SCIS Workshop

2021

Failure Modes of Domain Generalization Algorithms

In many practical applications, training and test data come from slightly different distributions. This paper provides a comprehensive analysis of how ML models fail in such scenarios.

Paper @CVPR

2021

Deep Semi-Supervised Image Classification Algorithms: a Survey

A comprehensive analysis of semi-supervised learning methods for computer vision.

Paper @JUCS

2020

Robust classification under class-dependent domain shift

Poster @ICML URDL Workshop
