Currently, I am interested in the following topics:

Efficient ML: Deep learning models are becoming increasingly large, making them difficult to train and to deploy on edge devices. I aim to build models that are smaller and faster while maintaining high performance. Specifically, I'm looking into exploiting low-rank structures that emerge during training through manifold-based learning (a minimal sketch of the idea follows the list below). On this topic, I am working on or have worked on:

  • Dynamic low-rank training of tensor-compressed neural networks based on the tensor-train decomposition.
  • Dynamic low-rank training of Kolmogorov-Arnold networks for physics-informed neural networks.
  • Knowledge distillation for LLM compression.
  • Zeroth-order training of spiking neural networks for neuromorphic hardware.
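As a rough illustration of the low-rank idea (not the manifold-based training scheme itself), the sketch below replaces a dense linear layer with a rank-r factorization W ≈ U V, which cuts the parameter count from m·n to r·(m+n). The layer name, dimensions, and rank are made up for the example.

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Dense weight W (m x n) replaced by a rank-r factorization U @ V.

    Parameter count drops from m*n to r*(m + n). This is only a static
    factorization; dynamic low-rank training additionally adapts the rank
    and keeps the factors on the low-rank manifold during optimization.
    """
    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, rank) / rank**0.5)
        self.V = nn.Parameter(torch.randn(rank, in_features) / in_features**0.5)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Multiply by V first so the intermediate activation is only rank-dimensional.
        return (x @ self.V.t()) @ self.U.t() + self.bias

# Example: a 1024 -> 1024 layer at rank 32 uses ~6% of the dense parameters.
layer = LowRankLinear(1024, 1024, rank=32)
print(sum(p.numel() for p in layer.parameters()))  # 66,560 vs 1,049,600 dense
```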

Previously:

Safe ML: Existing machine learning models are vulnerable to adversarial attacks and distributional shifts. In the real world, underlying data distributions are rarely stationary, and simply optimizing for accuracy on the training set results in models that are brittle and unreliable. I aim to build models that are robust to such perturbations and shifts, and that generalize to unseen data. On this topic, I've worked on:

  • Scaling up zeroth-order optimization to larger models; zeroth-order methods are a powerful tool for black-box adversarial attacks and contrastive explanations (sketched below).
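
For context, the snippet below sketches the standard two-point randomized gradient estimator that underlies much of zeroth-order optimization: it queries only function values, which is what makes it usable in black-box settings. The toy objective, step size, and sample count are placeholders chosen for the illustration, not values from any of the projects above.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_samples=20, rng=None):
    """Two-point randomized gradient estimate of f at x.

    Uses only function evaluations: for a Gaussian direction u,
    (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u approximates the directional
    derivative times u; averaging over directions estimates the gradient
    of a Gaussian-smoothed version of f.
    """
    rng = rng or np.random.default_rng()
    grad = np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)
        grad += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return grad / num_samples

# Toy usage: minimize a quadratic with zeroth-order gradient descent.
f = lambda x: np.sum((x - 1.0) ** 2)
x = np.zeros(10)
for _ in range(200):
    x -= 0.05 * zo_gradient(f, x)
print(f(x))  # should be close to 0
```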