(Received: 15-Mar.-2024, Revised: 12-May-2024 and 19-Jun.-2024 , Accepted: 23-Jun.-2024)

The performance of sorting algorithms has a great impact on many computationally intensive applications.
Researchers worked on parallelizing many sorting algorithms on various interconnection networks to improve
their sequential counterpart performance. One of these interconnection networks is the optical chained-cubic
tree (OCCT). In this paper, a parallel bucket sort (PBS) algorithm is presented and applied to the OCCT
interconnection network. This PBS algorithm is evaluated analytically and by simulation in terms of various
performance metrics including parallel runtime, computation time, communication time, concatenation time,
speedup and efficiency, for a different number of processors, dataset sizes and data distributions including
random and descending distributions. Simulation results show that the highest obtained speedup is
approximately 912x on OCCT using 1020 processors, which means that the parallel runtime of the PBS on 1020
processors is 912 times faster than the sequential runtime of BS on a single processor.

[1] L. Rashid, W. M. Hassanein and M. A. Hammad, "Analyzing and Enhancing the Parallel SortOperation on Multithreaded Architectures," Journal of Supercomputing, vol. 53, pp. 293–312, 2010.

[2] N. R. Nitin, "Analysis of Multi-sort Algorithm on Multi-mesh of Trees (MMT) Architecture," Journalof Supercomputing, vol. 57, pp. 276–313, 2011.

[3] S. Al-Haj Baddar and B. Mahafzah, "Bitonic Sort on a Chained-cubic Tree Interconnection Network,"Journal of Parallel and Distributed Computing, vol. 74, pp. 1744–1761, 2014.

[4] F. Dehne and H. Zaboli, "Parallel Sorting for GPUs," In: Adamatzky, A., editor. EmergentComputation. Emergence, Complexity and Computation (ECC), vol. 24, DOI: 10.1007/978-3-319-46376-6_12, Springer, Cham., 2017.

[5] A. Al-Adwan, R. Zaghloul, B. Mahafzah and A. Sharieh, "Parallel Quicksort Algorithm on OTIS HyperHexa-Cell Optoelectronic Architecture," Journal of Parallel and Distributed Computing, vol. 141, pp. 61–73, DOI: 10.1016/j.jpdc.2020.03.015, 2020.

[6] M. K. I. Rahmani, "Smart Bubble Sort: A Novel and Dynamic Variant of Bubble Sort Algorithm,"Computers, Materials & Continua, vol. 71, no. 3, pp. 4895–4913, 2022.

[7] P. Preethi, K. G. Mohan, S. Kumar and K. K. Mahapatra, "Sorter Design with Structured Low PowerTechniques," SN COMPUT. SCI., vol. 4, no. 129, DOI: 10.1007/s42979-022-01546-7, 2023.

[8] Y. Han and X. He, "More Efficient Parallel Integer Sorting," International Journal of Foundations ofComputer Science, vol. 33, no. 5, pp. 411–427, DOI: 10.1142/S0129054122500071, 2022.

[9] B. Bramas, "A Fast Vectorized Sorting Implementation Based on the ARM Scalable Vector Extension(SVE)," PeerJ Computer Science, vol. 7, p. e769, DOI: 10.7717/peerj-cs.769, 2021.

[10] O. Obeya, E. Kahssay, E. M. Fan and J. Shun, "Theoretically Efficient and Practical Parallel In-placeRadix Sorting," Proc. of the 31st ACM Symposium on Parallelism in Algorithms and Architectures, DOI: 10.1145/3323165.3323198, 2019.

[11] T. Tokuue and T. Ishiyama, "Performance Evaluation of Parallel Sortings on the SupercomputerFugaku," Journal of Information Processing, vol. 31, pp. 452–458, 2023.

[12] S. K. Gill, V. P. Singh, P. Sharma and D. Kumar, "A Comparative Study of Various SortingAlgorithms," Int. Journal of Advanced Studies of Scientific Research, vol. 4, pp. 367–372, 2019.

[13] A. Grama, A. Gupta, G. Karypis and V. Kumar, Introduction to Parallel Computing, 2ed Edition,Reading: Addison-Wesley, 2003.

[14] T. Cormen, C. Leiserson, R. Rivest and C. Stein, Introduction to Algorithms, 4th Edition,Massachusetts: The MIT Press, 2022.

[15] H. Wang et al., "PMS-Sorting: A New Sorting Algorithm Based on Similarity," Computers, Materials& Continua, vol. 59, no. 1, pp. 229–237, DOI: 10.32604/cmc.2019.04628, 2019.

[16] M. Nowicki, "Comparison of Sort Algorithms in Hadoop and PCJ," Journal of Big Data, vol. 7, no. 101,DOI: 10.1186/s40537-020-00376-9, 2020.

[17] M. Garland, "Chapter 13 – Sorting," in the Book: Programming Massively Parallel Processors: AHands-on Approach, 4th Edition, Morgan Kaufmann Publisher, pp. 293–310, DOI: 10.1016/B978-0-323-91231-0.00019-7, 2023.

[18] W. X. Zhang and Z. Wen, "Efficient Parallel Algorithms for Some Integer Problems," Proc. of the 19thAnnual Conference on Computer Science (CSC '91), pp. 11–20, San Antonio, USA, DOI: 10.1145/327164.327169, 1991.

[19] T. Rożen, K. Boryczko and W. Alda, "GPU Bucket Sort Algorithm with Applications to Nearest-Neighbour Search," Journal of WSCG, vol. 16, pp. 161–167, 2008.

[20] M. Amirul et al., "Sorting Very Large Text Data in Multi GPUs," Proc. of the 2012 IEEE Int. Conf. onControl System, Computing and Engineering, pp. 160–165, Penang, Malaysia, DOI: 10.1109/ICCSCE.2012.6487134, 2012.

[21] M. Asiatici, D. Maiorano and P. Ienne, "How Many CPU Cores Is an FPGA Worth? Lessons Learnedfrom Accelerating String Sorting on a CPU-FPGA System," Journal of Signal Processing Systems, vol. 93, pp. 1405–1417, DOI: 10.1007/s11265-021-01686-8, 2021.

[22] H. Chen, S. Madaminov, M. Ferdman and P. Milder, "Sorting Large Datasets with FPGA-acceleratedSample Sort," Proc. of 27th IEEE Int. Symposium on Field-Programmable Custom Computing Machines (FCCM 2019), Art. no. 8735541, p. 326, DOI: 10.1109/FCCM.2019.00067, 2019.

[23] M. Kaur and V. Kumar, "Parallel Non-dominated Sorting Genetic Algorithm-II-based ImageEncryption Technique," The Imaging Science Journal, vol. 66, no. 8, pp. 453–462, 2018.

[24] J. Xie, Z. Li, H. Wu, L. Li, B. Pan, P. Guo and G. Sun, “Application of Quicksort Algorithm inInformation Retrieval," Journal on Big Data, vol. 3, no. 4, pp. 135–145, 2021.

[25] N. Faujdar and S. Saraswat, "The Detailed Experimental Analysis of Bucket Sort," Proc. of the 7th Int.Conf. on Cloud Computing, Data Science & Engineering (Confluence), pp. 1–6, Noida, India, DOI: 10.1109/confluence.2017.7943114, 2017.

[26] M. Khurana, N. Faujdar and S. Saraswat, "Hybrid Bucket Sort Switching Internal Sorting Based on theData Inside the Bucket," Proc. of the 6th Int. Conf. on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), pp. 476–482, DOI: 10.1109/icrito.2017.8342474, Noida, India, 2017.

[27] H. Hong, "Parallel Bucket Sorting Algorithm Hiep Hong," [Online], Available:https://api.semanticscholar.org/CorpusID:33533373, 2014.

[28] G. Beliakov, G. Li and S. Liu, "Parallel Bucket Sorting on Graphics Processing Units Based on ConvexOptimization," Optimization, vol. 64, pp. 1033–1055, 2015.

[29] H. I. S. Wijayabandara, Performance Analysis of Parallel Bucket Sort, Thesis for Master's Degree inComputer Science, University of Colombo, School of Computing, 2018.

[30] K. Chen, H. Chen and C. Wang, "Bucket MapReduce: Relieving the Disk I/O Intensity of Data-intensive Applications in MapReduce Frameworks," Proc. of the 29th Euromicro Int. Conf. on Parallel, Distributed and Network-based Processing (PDP), DOI: 10.1109/pdp52278.2021.00013, 2021.

[31] B. Mahafzah, M. Alshraideh, T. Abu-Kabeer, E. Ahmad and N. Hamad, "The Optical Chained-cubicTree Interconnection Network: Topological Structure and Properties," Computers & Electrical Engineering, vol. 38, pp. 330–345, DOI: 10.1016/j.compeleceng.2011.11.023, 2012.

[32] B. Mahafzah, A. Al-Adwan and R. Zaghloul, "Topological Properties Assessment of Opto-electronicArchitectures," Telecomm. Systems, vol. 80, pp. 599–627, DOI: 10.1007/s11235-022-00910-5, 2022.

[33] G. C. Marsden, P. J. Marchand, P. Harvey and S. C. Esener, "Optical Transpose Interconnection SystemArchitectures," Optics Letters, vol. 18, pp. 1083–1085, 1993.

[34] B. A. Mahafzah, A. Sleit, N. A. Hamad, E. F. Ahmad and T. M. Abu-Kabeer, "The OTIS Hyper Hexa-Cell Optoelectronic Architecture," Computing, vol. 94, pp. 411–432, 2012.

[35] M. Abdullah, E. Abuelrub and B. Mahafzah, "The Chained-cubic Tree Interconnection Network," Int.Arab Journal of Information Technology, vol. 8, pp. 334–343, 2011.

[36] O. Kibar, P. J. Marchand and S. C. Esener, "High Speed CMOS Switch Designs for Free-space Opto-electronic MINs," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 6, pp. 372–386, DOI: 10.1109/92.711309, 1998.

[37] I. Kaminow, T. Li and A. Willner, "Optical Fiber Telecommunications VB: Systems and Networks," 5thEdition, Academic Press, 2008.

[38] D. K. J. Lin and J. Chen, "Adaptive Order-of-Addition Experiments via the Quick-sort Algorithm,"Technometrics, vol. 65, no. 3, pp. 396–405, DOI: 10.1080/00401706.2023.2174601, 2023.

[39] M. S. Mohammed and G. A. Abandah, "Characterization of Shared-memory Multi-core Applications,"Jordanian Journal of Computers and Information Technology (JJCIT), vol. 2, no. 1, pp. 37–54, DOI: 10.5455/jjcit.71-1448574289, 2016.

[40] J. Al-Azzeh, "Distributed Mutual Inter-unit Test Method for D-Dimensional Mesh-connectedMultiprocessors with Round-Robin Collision Resolution," Jordanian Journal of Computers and Information Technology (JJCIT), vol. 5, no. 1, pp. 1–16, DOI: 10.5455/jjcit.71-1539688899, 2019.