DISTRIBUTED MUTUAL INTER-UNIT TEST METHOD FOR D-DIMENSIONAL MESH-CONNECTED MULTIPROCESSORS WITH ROUND-ROBIN COLLISION RESOLUTION


(Received: 2018-10-16, Revised: 2018-11-23, Accepted: 2018-12-08)
Jamil Al-Azzeh
A collision-free extension to the mutual inter-unit test methodology for d-dimensional VLSI multiprocessors is proposed. It guarantees that any processor core is tested by only one neighboring node at a time, so no special care is needed in choosing the moments at which test actions start. Collision resolution hardware based on the round-robin arbitration routine is discussed in detail. A parallel collision-resolution-aware mutual inter-unit test algorithm is formulated and diagrammed. The proposed approach is shown to improve the testability of mesh-connected multiprocessors by increasing the probability of successful fault detection compared with the distributed self-checking methodology. Further, the new approach drastically reduces the extra connectivity in the multiprocessor relative to known mutual inter-unit test methods, which makes the multiprocessor fabric easier to manufacture. For example, in a 4-dimensional system, our approach requires 55% fewer extra connections.
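The round-robin arbitration idea mentioned in the abstract can be illustrated in software. The following is a minimal sketch, not the hardware described in the paper: it assumes that an interior node of a d-dimensional mesh has 2d neighbors, that each neighbor raises a single test-request line, and that priority rotates to the position just after the most recently granted neighbor. The names `RoundRobinArbiter`, `grant`, and `neighbor_count` are hypothetical and chosen only for this illustration.

```python
from typing import List, Optional


def neighbor_count(d: int) -> int:
    """Number of neighbors of an interior node in a d-dimensional mesh
    (assumption: two neighbors per dimension, no wraparound or diagonals)."""
    return 2 * d


class RoundRobinArbiter:
    """Illustrative round-robin arbiter: at most one requester is granted
    per call, and priority rotates past the most recent winner."""

    def __init__(self, num_requesters: int) -> None:
        if num_requesters <= 0:
            raise ValueError("num_requesters must be positive")
        self.n = num_requesters
        self.next_priority = 0  # requester index checked first on the next call

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the index of the granted requester, or None if nobody asks."""
        if len(requests) != self.n:
            raise ValueError("one request flag per requester is expected")
        for offset in range(self.n):
            idx = (self.next_priority + offset) % self.n
            if requests[idx]:
                # Rotate priority so the winner is checked last next time.
                self.next_priority = (idx + 1) % self.n
                return idx
        return None  # no pending requests; priority pointer is unchanged


if __name__ == "__main__":
    # A 2-dimensional mesh node with 4 neighbors; neighbors 0, 2 and 3
    # request to test the node simultaneously (a collision).
    arbiter = RoundRobinArbiter(neighbor_count(2))
    pending = [True, False, True, True]
    print([arbiter.grant(pending) for _ in range(4)])  # prints [0, 2, 3, 0]
```

With three neighbors requesting at once, the sketch serves them one at a time in rotating order, so the tested node never sees two testers simultaneously and no requester is starved.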
