10.6084/M9.FIGSHARE.21235586.V1
Xin Zhang
Xin
Zhang
Jia Liu
Jia
Liu
Zhengyuan Zhu
Zhengyuan
Zhu
Learning Coefficient Heterogeneity over Networks: A Distributed Spanning-Tree-Based Fused-Lasso Regression
<p>Identifying latent cluster structure based on model heterogeneity is a fundamental but challenging task that arises in many machine learning applications. In this paper, we study the clustered coefficient regression problem in distributed network systems, where the data are locally collected and held by nodes. Our work aims to improve regression estimation efficiency by aggregating neighbors’ information while also identifying the cluster membership of nodes. To achieve efficient estimation and clustering, we develop a distributed spanning-tree-based fused-lasso regression (DTFLR) approach. In particular, we propose an adaptive spanning-tree-based fusion penalty for low-complexity clustered coefficient regression. We show that our proposed estimator satisfies statistical oracle properties. Additionally, to solve the problem in parallel, we design a distributed generalized alternating direction method of multipliers algorithm, which has a simple node-based implementation scheme and enjoys a linear convergence rate. Collectively, our results contribute to the theories of low-complexity clustered coefficient regression and distributed optimization over networks. Thorough numerical experiments and real-world data analysis are conducted to verify our theoretical results, which show that our approach outperforms existing works in terms of estimation accuracy, computation speed, and communication costs.</p>
Space Science
Biotechnology
Environmental Sciences not elsewhere classified
Biological Sciences not elsewhere classified
Information Systems not elsewhere classified
Mathematical Sciences not elsewhere classified
Science Policy
Taylor & Francis
2022
2022-09-29
2024-02-15
Dataset
1129509 Bytes
10.6084/m9.figshare.21235586
10.1080/01621459.2022.2126363
CC BY 4.0