USING STATIC ANALYSIS TOOLS FOR ANALYZING STUDENT BEHAVIOR IN AN INTRODUCTORY PROGRAMMING COURSE

(Received: 15-Mar-2020, Revised: 2-May-2020, Accepted: 18-May-2020)
Analyzing student coding data can help researchers understand how novice programmers learn and can inform practitioners on how best to teach them. This work explores how using static analysis tools in programming assignments can provide insight into student behavior and performance. We analyzed the use of three static analysis tools in the assignments of an introductory programming course. Our findings confirm previous work showing that formatting and documentation issues are the most common issues in student code, that this holds regardless of major and course performance, and that certain error types are more strongly correlated with performance than others. We also found that total error frequency in the course correlates with final course grade and that the presence of any kind of error in final submissions correlates with low performance on exams. Furthermore, we found that female students produced fewer documentation and style errors than male students, and that students who worked with a partner produced fewer errors overall than students working alone. Our results also raise concerns about the use of certain metrics for assessing how difficult errors are for students to fix.
