[1] K. Olukotun, B. A. Nayfeh, L. Hammond, K. Wilson and K. Chang, "The Case for a Single-chip Multiprocessor," ACM SIGPLAN Notices, vol. 31, no. 9, pp. 2–11, 1996.
[2] D. Geer, "Chip Makers Turn to Multicore Processors," Computer, vol. 38, no. 5, pp. 11–13, 2005.
[3] C. van Berkel, "Multi-core for Mobile Phones," in Design, Automation Test in Europe Conference Exhibition, pp. 1260–1265, 2009.
[4] G. Blake, R. G. Dreslinski and T. Mudge, "A Survey of Multicore Processors," Signal Processing Magazine, IEEE, vol. 26, no. 6, pp. 26–37, 2009.
[5] B. A. Mahafzah, "Performance Assessment of Multithreaded Quicksort Algorithm on Simultaneous Multithreaded Architecture," The Journal of Supercomputing, vol. 66, no. 1, pp. 339–363, 2013.
[6] B. A. Mahafzah, "Parallel Multithreaded IDA* Heuristic Search: Algorithm Design and Performance Evaluation," International Journal of Parallel, Emergent and Distributed Systems, vol. 26, no. 1, pp. 61–82, 2011.
[7] G. A. Abandah and E. S. Davidson, "Origin 2000 Design Enhancements for Communication Intensive Applications," in Proc. of the International Conference Parallel Architectures and Compilation Techniques (PACT’98), pp. 30–39, 1998.
[8] J. Dongarra, S. Moore, P. Mucci, K. Seymour and H. You, "Accurate Cache and TLB Characterization Using Hardware Counters," in Computational Science-ICCS 2004, Springer, pp. 432–439, 2004.
[9] M. Bhadauria, V. M. Weaver and S. A. McKee, "Understanding PARSEC Performance on Contemporary CMPs," in IEEE Int’l Symp. Workload Characterization, pp. 98–107, 2009.
[10] M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A. Ailamaki and B. Falsafi, "Clearing the Clouds: A Study of Emerging Scale-out Workloads on Modern Hardware," ACM SIGARCH Computer Architecture News, vol. 40, no. 1, pp. 37–48, 2012.
[11] Z. Jia, L. Wang, J. Zhan, L. Zhang and C. Luo, "Characterizing Data Analysis Workloads in Data Centers," in IEEE Int’l Symp. Workload Characterization (IISWC), pp. 66–76, 2013.
[12] W. E. Cohen and B. A. Mahafzah, "Statistical Analysis of Message Passing Programs to Guide Computer Design," in Proceedings of the IEEE Thirty-First Hawaii International Conference on System Sciences, vol. 7, pp. 544–553, 1998.
[13] S. R. Alam, R. F. Barrett, J. A. Kuehn, P. C. Roth and J. S. Vetter, "Characterization of Scientific Workloads on Systems with Multi-core Processors," in IEEE International Symposium on Workload Characterization, pp. 225–236, 2006.
[14] L. Chai, Q. Gao and D. K. Panda, "Understanding the Impact of Multicore Architecture in Cluster Computing: A Case Study with Intel Dual-core System," in 7th IEEE Int’l Symp. Cluster Computing and the Grid, pp. 471–478, 2007.
[15] G. A. Abandah, Reducing Communication Cost in Scalable Shared Memory Systems, Ph.D. dissertation, The University of Michigan, 1998.
[16] A. Jaleel, R. S. Cohn, C.-K. Luk and B. Jacob, "CMP$im: A Pin-based on-the-fly Multi-core Cache Simulator," in Proc. 4th Annual Workshop on Modeling, Benchmarking and Simulation, pp. 28–36, 2008.
[17] G. Contreras and M. Martonosi, "Characterizing and Improving the Performance of Intel Threading Building Blocks," in Proc. of the IEEE International Symposium on Workload Characterization (IISWC 2008), pp. 57–66, 2008.
[18] A. Bhattacharjee and M. Martonosi, "Characterizing the TLB Behavior of Emerging Parallel Workloads on Chip Multiprocessors," in 18th Int’l Conf. Parallel Architectures and Compilation Techniques, pp. 29–40, 2009.
[19] T. Dey, W. Wang, J. W. Davidson and M. L. Soffa, "Characterizing Multi-threaded Applications Based on Shared-resource Contention," in IEEE Int’l Symp. Performance Analysis of Systems and Software, pp. 76–86, 2011.
[20] R. Natarajan and M. Chaudhuri, "Characterizing Multi-threaded Applications for Designing Sharing-aware Last-level Cache Replacement Policies," in IEEE International Symposium on Workload Characterization, pp. 1–10, 2013.
[21] G. A. Abandah and E. S. Davidson, "Configuration Independent Analysis for Characterizing Shared-memory Applications," in Proc. of the 12th International Parallel Processing Symp. (IPPS), pp. 485–491, 1998.
[22] C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi and K. Hazelwood, "Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation," SIGPLAN Not., vol. 40, no. 6, pp. 190–200, 2005.
[23] Intel, "Pin-A Dynamic Binary Instrumentation Tool," https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool/, 2015, [Online; accessed 22-March-2015].
[24] M. S. Mohammed, Hardware Configuration-independent Characterization of Multi-core Applications, Master’s Thesis, The University of Jordan, Amman, 2015.
[25] S. C. Woo, M. Ohara, E. Torrie, J. P. Singh and A. Gupta, "The SPLASH-2 Programs: Characterization and Methodological Considerations," ACM SIGARCH Computer Architecture News, vol. 23, no. 2, pp. 24–36, 1995.
[26] C. Bienia, S. Kumar, J. P. Singh and K. Li, "The PARSEC Benchmark Suite: Characterization and Architectural Implications," in Proc. of the 17th Int’l Conf. Parallel Architectures and Compilation Techniques, pp. 72–81, 2008.