
GEA-COPE: AN EFFECTIVE MODEL FOR CROSS-DOMAIN GRAPH PRE-TRAINING


(Received: 10-Sep.-2025, Revised: 21-Nov.-2025, 10-Dec.-2025 and 13-Jan.-2026, Accepted: 14-Jan.-2026)
This paper addresses the negative transfer problem in cross-domain graph pre-training under few-shot learning scenarios by proposing a multi-component pre-training framework called Graph External Attention-enhanced Coordinators for Pre-training (GEA-CoPe), which integrates multi-head external attention with a graph coordinator. Tackling the structural and semantic discrepancies between cross-domain graphs is crucial for mitigating negative transfer; however, conventional methods often lack adaptability to complex, dynamic inter-domain variations and impose no explicit constraints on intermediate feature-distribution consistency. The proposed framework leverages an external attention-based coordinator to mediate between different graph datasets, dynamically generating cross-graph semantic-alignment strategies that alleviate the negative transfer induced by structural heterogeneity. It further employs a dual-feature normalization strategy that adds a cross-layer distribution-alignment loss on top of intra-layer node-similarity constraints, effectively suppressing feature drift. Finally, Kolmogorov-Arnold Networks (KANs) are introduced; their parameter-adaptive activation functions better capture non-linear topological dependencies and enhance model interpretability. Experiments on ten real-world graph datasets demonstrate that GEA-CoPe exhibits superior cross-domain generalization and significantly improves performance on few-shot node classification tasks, with an average improvement of about 13.3% over competing methods. The model focuses more accurately on critical graph structures, providing a theoretical foundation and a practical paradigm for deploying graph neural networks in complex scenarios.
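To make the coordinator mechanism concrete, the following is a minimal PyTorch sketch of multi-head external attention over node features, the building block the abstract attributes to GEA-CoPe's coordinator. It follows the standard external-attention formulation (two small learnable memories shared across all inputs, with double normalization); the class name, the number of memory slots and all hyper-parameters are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical sketch: multi-head external attention for graph node features.
# Not the authors' code; shapes and normalization follow the generic
# external-attention recipe (softmax over nodes, then l1 over memory slots).
import torch
import torch.nn as nn

class MultiHeadExternalAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, num_slots: int = 64):
        super().__init__()
        assert dim % num_heads == 0
        self.h, self.dh = num_heads, dim // num_heads
        # Two small external memories shared across every input graph; this
        # sharing is what would let a coordinator mediate between domains.
        self.mk = nn.Linear(self.dh, num_slots, bias=False)  # key memory
        self.mv = nn.Linear(num_slots, self.dh, bias=False)  # value memory
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, dim) node embeddings from any source-domain graph.
        n = x.shape[0]
        q = x.view(n, self.h, self.dh)                  # split into heads
        attn = self.mk(q)                               # (N, heads, slots)
        attn = attn.softmax(dim=0)                      # normalize over nodes
        attn = attn / (attn.sum(dim=-1, keepdim=True) + 1e-9)  # l1 over slots
        out = self.mv(attn).reshape(n, self.h * self.dh)
        return self.proj(out)

# Usage: apply the same module to node embeddings from each pre-training
# graph, so all domains attend to one shared memory.
coord = MultiHeadExternalAttention(dim=128)
z = coord(torch.randn(500, 128))  # 500 nodes -> (500, 128)
```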
