Poster in Workshop: 5th Workshop on practical ML for limited/low resource settings (PML4LRS) @ ICLR 2024

SCAN-Edge: Finding MobileNet-speed Hybrid Networks for Commodity Edge Devices

Hung-Yueh Chiang · Diana Marculescu

Abstract: Designing low-latency, high-efficiency hybrid networks for diverse low-cost commodity edge devices is costly and tedious, which has led to the use of neural architecture search (NAS) for finding optimal architectures. However, the challenges of unifying NAS for a wide range of edge devices lie in the sheer number of hardware designs, supported operations, and compilation optimizations. Existing methods fix the search space of architecture choices (e.g., activation, convolution, or self-attention) for network stages and estimate latency with hardware-agnostic proxies (e.g., FLOPs), which fail to achieve the claimed latency on a wide variety of edge devices. We address this issue and propose a unified NAS framework, termed SCAN-Edge, which jointly searches Self-attention, Convolution, and ActivatioN to best accommodate the diversity of Edge devices, such as CPU-, GPU-, and hardware accelerator-based devices. During the search, SCAN-Edge accurately estimates the end-to-end latency with pre-built calibrated latency lookup tables and addresses the resulting large search space with a hardware-aware evolutionary algorithm, which accelerates the sampling process. Experiments on large-scale datasets show that, compared with prior art, our hybrid networks match actual MobileNetV2 latency for $224 \times 224$ input resolution on various commodity edge devices.
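The abstract's two main ingredients, a latency lookup table and a latency-constrained evolutionary search, can be illustrated with a minimal sketch. This is not the authors' implementation: the table entries, the `SCORE` accuracy proxy, the four-stage search space, and every function name below are hypothetical placeholders chosen only to show the general idea.

```python
import random

# Hypothetical per-block latency lookup table (milliseconds) for one device.
# Values are illustrative placeholders, not measurements.
LATENCY_LUT_MS = {
    ("conv", "relu"): 1.0,
    ("conv", "gelu"): 1.4,
    ("attention", "relu"): 2.2,
    ("attention", "gelu"): 2.6,
}

# Hypothetical accuracy proxy per block; a real search would use a
# trained predictor or an actual evaluation instead.
SCORE = {
    ("conv", "relu"): 1.0,
    ("conv", "gelu"): 1.2,
    ("attention", "relu"): 1.8,
    ("attention", "gelu"): 2.0,
}

CHOICES = list(LATENCY_LUT_MS)
NUM_STAGES = 4  # assumed number of searchable stages


def estimate_latency(arch, calibration=1.0):
    """Sum per-block table entries; `calibration` is an assumed scale factor."""
    return calibration * sum(LATENCY_LUT_MS[block] for block in arch)


def fitness(arch):
    """Toy stand-in for accuracy: higher is better."""
    return sum(SCORE[block] for block in arch)


def sample_feasible(budget_ms, rng):
    """Rejection-sample a random architecture that fits the latency budget."""
    while True:
        arch = tuple(rng.choice(CHOICES) for _ in range(NUM_STAGES))
        if estimate_latency(arch) <= budget_ms:
            return arch


def mutate(arch, rng):
    """Replace the block at one randomly chosen stage."""
    i = rng.randrange(len(arch))
    return arch[:i] + (rng.choice(CHOICES),) + arch[i + 1:]


def evolutionary_search(budget_ms, generations=20, population_size=16, seed=0):
    """Keep the fittest half each generation; refill with feasible mutants."""
    if budget_ms < NUM_STAGES * min(LATENCY_LUT_MS.values()):
        raise ValueError("latency budget is below the fastest architecture")
    rng = random.Random(seed)
    population = [sample_feasible(budget_ms, rng) for _ in range(population_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[: population_size // 2]
        children = []
        while len(children) < population_size - len(parents):
            child = mutate(rng.choice(parents), rng)
            # Hardware-aware filter: discard candidates over the budget.
            if estimate_latency(child) <= budget_ms:
                children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

Calling `evolutionary_search(6.0)` returns a tuple of four `(operator, activation)` pairs whose estimated latency stays within the 6 ms budget. Filtering candidates by table lookup rather than by on-device measurement is what keeps each sampling step cheap in this sketch.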
