Abstract:
Recently, Graph Attention Networks (GATs) have shown good performance for representation learning on graphs. GAT leverages a masked self-attention mechanism to obtain more expressive feature representations than graph convolutional networks (GCNs). However, GAT incurs a large amount of irregularity in computation and memory access, which prevents the efficient use of traditional neural network accelerators. Moreover, existing dedicated GAT accelerators demand high memory volumes and are difficult to deploy on resource-limited edge devices. To address these issues, this paper proposes an FPGA-based accelerator, called H-GAT, which achieves excellent acceleration and energy efficiency for GAT inference. H-GAT decomposes the GAT operation into matrix multiplication and activation function units. We first design an efficient, fully pipelined processing element (PE) for sparse matrix multiplication (SpMM) and dense matrix-vector multiplication (DMVM). Moreover, we optimize the softmax dataflow so that the computational efficiency of softmax is improved dramatically. We evaluate our design on a Xilinx Kintex-7 FPGA with three popular datasets. Compared with an existing CPU, a GPU, and a state-of-the-art FPGA-based GAT accelerator, H-GAT achieves speedups of up to 585x, 2.7x, and 11x, and improves power efficiency by up to 2095x, 173x, and 65x, respectively.
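As context for the decomposition the abstract describes, below is a minimal sketch of one GAT attention head, assuming the standard GAT formulation (Velickovic et al., 2018) rather than the paper's own implementation. It shows the three pieces the abstract names: a dense feature projection (the DMVM part), a numerically stable masked softmax over each node's neighbors, and an SpMM-style weighted aggregation over the sparse adjacency. All names (gat_head, a_src, a_dst, etc.) are illustrative and not taken from the paper.

```python
import numpy as np
import scipy.sparse as sp

def gat_head(adj: sp.csr_matrix, feats: np.ndarray, W: np.ndarray,
             a_src: np.ndarray, a_dst: np.ndarray, slope: float = 0.2) -> np.ndarray:
    """One GAT attention head: DMVM projection, masked softmax, SpMM aggregation."""
    h = feats @ W                        # dense projection of node features (DMVM)
    src, dst = adj.nonzero()             # edge list of the sparse adjacency
    # Raw attention logit per edge: LeakyReLU(a_src . h_i + a_dst . h_j)
    e = h[src] @ a_src + h[dst] @ a_dst
    e = np.where(e > 0.0, e, slope * e)  # LeakyReLU
    # Masked softmax over each destination node's incoming edges; a single
    # global shift keeps exp() from overflowing without changing the result.
    e = e - e.max()
    num = np.exp(e)
    denom = np.zeros(feats.shape[0])
    np.add.at(denom, dst, num)           # per-node normalizer
    att = num / denom[dst]
    # SpMM-style aggregation: sum attention-weighted neighbor features.
    out = np.zeros_like(h)
    np.add.at(out, dst, att[:, None] * h[src])
    return out

# Tiny usage example on a 4-node ring graph (hypothetical data).
rng = np.random.default_rng(0)
adj = sp.csr_matrix(np.array([[0, 1, 0, 1], [1, 0, 1, 0],
                              [0, 1, 0, 1], [1, 0, 1, 0]]))
out = gat_head(adj, rng.normal(size=(4, 3)), rng.normal(size=(3, 2)),
               rng.normal(size=2), rng.normal(size=2))
```

The sketch makes the irregularity the abstract refers to concrete: the projection is a regular dense matrix multiplication, while the softmax and aggregation are driven by the graph's edge list, whose access pattern depends entirely on the sparsity structure of the adjacency.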
Source: JOURNAL OF APPLIED SCIENCE AND ENGINEERING
ISSN: 2708-9967
Year: 2023
Volume: 27
Issue: 3
Pages: 2233-2240
Impact Factor: 1.100 (JCR@2023)
JCR Journal Grade: 3
SCOPUS Cited Count: 1
ESI Highly Cited Papers on the List: 0