Abstract:
Binary Neural Network (BNN) is a meaningful machine learning model for the data plane. However, due to chip limitations, its scalability, especially the number of hidden layers in one pipeline, is limited. For better inference performance, existing methods reuse the hidden layers through packet recirculation, which leads to poor processing latency. Additionally, the simplified operations of the in-network BNN model restrict its flexibility: neurons cannot take arbitrary input lengths, which causes additional resource consumption compared with a normal BNN model. In this paper, we present TBNN, an optimized in-network BNN model that achieves both scalability and flexibility. This approach eliminates deployment constraints while maximizing hardware utilization, advancing the feasibility of complex BNN models on resource-limited data planes. By replacing computational bottleneck actions with Lookup Tables (LUTs), TBNN fits up to 4× more neurons per pipeline and reduces per-packet latency by 50% through minimized recirculation. The LUT-based implementation also supports pruning, trading an accuracy loss of 1.69% for a saving of about 24% of instructions. © 2025 IEEE.
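The abstract's core mechanism is replacing a binary neuron's arithmetic actions with a lookup table. A minimal Python sketch of the general idea follows; the function names (bnn_neuron, build_neuron_lut) and parameters are illustrative assumptions, not the paper's implementation.

```python
# Sketch (not the authors' code) of the LUT idea behind TBNN: a binary
# neuron's XNOR + popcount + threshold chain is a pure function of its
# n-bit input, so for small n it can be precomputed into a 2^n-entry
# table and evaluated per packet with one table read instead of
# per-packet arithmetic actions.

def bnn_neuron(x_bits: int, w_bits: int, n: int, threshold: int) -> int:
    """Standard binary neuron: XNOR inputs with weights, popcount, threshold."""
    matches = bin(~(x_bits ^ w_bits) & ((1 << n) - 1)).count("1")
    return 1 if matches >= threshold else 0

def build_neuron_lut(w_bits: int, n: int, threshold: int) -> list[int]:
    """Precompute the neuron's output for every possible n-bit input."""
    return [bnn_neuron(x, w_bits, n, threshold) for x in range(1 << n)]

# Example: a 4-input neuron with weights 0b1011 and threshold 3.
# The control plane builds the table once; the data plane indexes it.
lut = build_neuron_lut(w_bits=0b1011, n=4, threshold=3)
assert lut[0b1011] == bnn_neuron(0b1011, 0b1011, 4, 3) == 1
```

Because the table enumerates inputs explicitly, entries can be merged or dropped, which is presumably how the pruning trade-off reported in the abstract (1.69% accuracy for about 24% fewer instructions) arises.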
ISSN: 1548-615X
Year: 2025
Language: English