Abstract:
Copy number variation (CNV) refers to variation in the number of copies of a specific sequence in a genome and is a type of chromatin structural variation. The development of the Hi-C technique has empowered research on the spatial structure of chromatin by capturing interactions between DNA fragments. We used machine-learning methods, including a linear transformation model and a graph convolutional network (GCN), to detect CNV events from Hi-C data and to reveal how CNV relates to three-dimensional interactions between genomic fragments in terms of the one-dimensional read count signal and features of the chromatin structure. The experimental results demonstrated a specific linear relation between the Hi-C read count and CNV for each chromosome that can be well quantified by the linear transformation model. In addition, the GCN-based model could accurately extract features of the spatial structure from Hi-C data and infer the corresponding CNV across different chromosomes in a cancer cell line. We performed a series of experiments, including dimension reduction, transfer learning, and Hi-C data perturbation, to comprehensively evaluate the utility and robustness of the GCN-based model. This work provides a benchmark for using machine learning to infer CNV from Hi-C data and serves as a necessary foundation for a deeper understanding of the relationship between Hi-C data and CNV. © 2024 The Author(s). Quantitative Biology published by John Wiley & Sons Australia, Ltd on behalf of Higher Education Press.
Source:
Quantitative Biology
ISSN: 2095-4689
Year: 2024
Issue: 3
Volume: 12
Page: 231-244
0.600 (JCR@2023)
ESI Highly Cited Papers on the List: 0