Roy, Devjeet; Fakhoury, Sarah; Arnaoudova, Venera
Re-assessing Automatic Evaluation Metrics for Code Summarization Tasks
Inproceedings
In: ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 1105–1116, 2021.
Tags: automatic evaluation metrics, empirical study, machine translation, source code summarization

Roy, Devjeet; Zhang, Ziyi; Ma, Maggie; Arnaoudova, Venera; Panichella, Annibale; Panichella, Sebastiano; Gonzalez, Danielle; Mirakhorli, Mehdi
DeepTC-Enhancer: Improving the Readability of Automatically Generated Tests
Inproceedings
In: International Conference on Automated Software Engineering (ASE), pp. 287–298, 2020.
Tags: empirical study, source code readability, source code summarization

Roy, Devjeet; Fakhoury, Sarah; Lee, John; Arnaoudova, Venera
A model to detect readability improvements in incremental changes
Inproceedings
In: Proceedings of the International Conference on Program Comprehension (ICPC), pp. 25–36, 2020.
Tags: developers' perception, empirical study, machine learning, source code readability

Fakhoury, Sarah; Roy, Devjeet; Ma, Yuzhan; Arnaoudova, Venera; Adesope, Olusola
Measuring the Impact of Inconsistencies on Developers' Cognitive Load during Bug Localization
Journal Article
In: Empirical Software Engineering (EMSE), vol. 25, pp. 2140–2178, 2020.
Tags: Biometrics, empirical study, linguistic antipatterns, program comprehension

Roy, Devjeet; Fakhoury, Sarah; Arnaoudova, Venera
VITALSE: Visualizing Eye Tracking and Biometric Data
Inproceedings
In: Proceedings of the International Conference on Software Engineering (ICSE) - Demonstrations Track, pp. 57–60, 2020.
Tags: Biometrics, empirical study, program comprehension, Tool

Fakhoury, Sarah; Roy, Devjeet; Hassan, Sk. Adnan; Arnaoudova, Venera
Improving Source Code Readability: Theory and Practice
Inproceedings
In: Proceedings of the International Conference on Program Comprehension (ICPC), pp. 2–12, 2019.
Tags: empirical study, readability, source code identifiers

Fakhoury, Sarah; Ma, Yuzhan; Arnaoudova, Venera; Adesope, Olusola
The Effect of Poor Source Code Lexicon and Readability on Developers' Cognitive Load
Inproceedings
In: Proceedings of the International Conference on Program Comprehension (ICPC), pp. 286–296, 2018, (Distinguished Paper Award).
Tags: Biometrics, empirical study, linguistic antipatterns, program comprehension, source code identifiers

Fakhoury, Sarah; Arnaoudova, Venera; Noiseux, Cedric; Khomh, Foutse; Antoniol, Giuliano
Keep it simple: is deep learning good for linguistic smell detection?
Inproceedings
In: Proceedings of the International Conference on Software Analysis, Evolution, and Reengineering (SANER)—REproducibility Studies and NEgative Results (RENE) Track, 2018.
Tags: deep learning, empirical study, linguistic antipatterns, machine learning, source code identifiers, source code readability

Sabané, Aminata; Guéhéneuc, Yann-Gaël; Arnaoudova, Venera; Antoniol, Giuliano
Fragile base-class problem, problem?
Journal Article
In: Empirical Software Engineering (EMSE), vol. 22, no. 5, pp. 2612–2657, 2017.
Tags: change proneness, empirical study, fault proneness, inheritance

Panichella, Sebastiano; Arnaoudova, Venera; Penta, Massimiliano Di; Antoniol, Giuliano
Would Static Analysis Tools Help Developers with Code Reviews?
Inproceedings
In: International Conference on Software Analysis, Evolution, and Reengineering (SANER), pp. 161–170, 2015.
Tags: Code Review, empirical study, mining software repositories, static analysis

Arnaoudova, Venera; Penta, Massimiliano Di; Antoniol, Giuliano
Linguistic Antipatterns: What They are and How Developers Perceive Them
Journal Article
In: Empirical Software Engineering (EMSE), vol. 21, no. 1, pp. 104–158, 2015.
Tags: developers' perception, empirical study, linguistic antipatterns, natural language processing, source code identifiers

Arnaoudova, Venera
Towards Improving the Code Lexicon and its Consistency
PhD Thesis
Polytechnique Montréal, 2014.
Tags: developers' perception, empirical study, fault prediction, linguistic antipatterns, program comprehension, renaming, source code identifiers

Arnaoudova, Venera; Eshkevari, Laleh Mousavi; Penta, Massimiliano Di; Oliveto, Rocco; Antoniol, Giuliano; Guéhéneuc, Yann-Gaël
REPENT: Analyzing the Nature of Identifier Renamings
Journal Article
In: IEEE Transactions on Software Engineering (TSE), vol. 40, no. 5, pp. 502–532, 2014.
Tags: empirical study, mining software repositories, refactoring, renaming, source code identifiers

Medini, Soumaya; Arnaoudova, Venera; Penta, Massimiliano Di; Antoniol, Giuliano; Guéhéneuc, Yann-Gaël; Tonella, Paolo
SCAN: An Approach to Label and Relate Execution Trace Segments
Journal Article
In: Journal of Software: Evolution and Process (JSEP), vol. 26, no. 11, pp. 962–995, 2014.
Tags: concept identification, dynamic analysis, empirical study, formal concept analysis, information retrieval

Arnaoudova, Venera; Eshkevari, Laleh Mousavi; Sharifabadi, Elaheh Safari; Constantinides, Constantinos
Overcoming comprehension barriers in the AspectJ programming language
Journal Article
In: Journal of Object Technology (JOT), vol. 7, no. 6, pp. 121–142, 2008.
Tags: aspect-oriented programming, empirical study, program comprehension

2021
@inproceedings{Devjeet:fse21:BLEU,
title = {Re-assessing Automatic Evaluation Metrics for Code Summarization Tasks},
author = {Devjeet Roy and Sarah Fakhoury and Venera Arnaoudova},
url = {http://veneraarnaoudova.com/wp-content/uploads/2021/09/2021-FSE-CR-Reassessing-Automatic-Evaluation-Metrics-for-Code-Summarization-Tasks.pdf},
year = {2021},
date = {2021-05-20},
booktitle = {ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)},
pages = {1105--1116},
keywords = {automatic evaluation metrics, empirical study, machine translation, source code summarization},
pubstate = {published},
tppubtype = {inproceedings}
}
2020
@inproceedings{Devjeet:20:DeepTC-Enhancer,
title = {DeepTC-Enhancer: Improving the Readability of Automatically Generated Tests},
author = {Devjeet Roy and Ziyi Zhang and Maggie Ma and Venera Arnaoudova and Annibale Panichella and Sebastiano Panichella and Danielle Gonzalez and Mehdi Mirakhorli},
url = {http://veneraarnaoudova.com/wp-content/uploads/2020/09/2020-ASE-PREPRINT-DeepTC-Enhancer-Improving-the-Readability-of-Automatically-Generated-Tests.pdf},
year = {2020},
date = {2020-07-30},
booktitle = {International Conference on Automated Software Engineering (ASE)},
pages = {287--298},
keywords = {empirical study, source code readability, source code summarization},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{Roy:icpc20:ReadabilityModel,
title = {A model to detect readability improvements in incremental changes},
author = {Devjeet Roy and Sarah Fakhoury and John Lee and Venera Arnaoudova},
url = {http://veneraarnaoudova.com/wp-content/uploads/2020/07/2020-ICPC-PREPRINT-A-Model-to-Detect-Readability-Improvements-in-Incremental-Changes.pdf},
year = {2020},
date = {2020-05-24},
booktitle = {Proceedings of the International Conference on Program Comprehension (ICPC)},
pages = {25--36},
keywords = {developers' perception, empirical study, machine learning, source code readability},
pubstate = {published},
tppubtype = {inproceedings}
}
@article{Fakhoury:emse19:CognitiveLoad,
title = {Measuring the Impact of Inconsistencies on Developers' Cognitive Load during Bug Localization},
author = {Sarah Fakhoury and Devjeet Roy and Yuzhan Ma and Venera Arnaoudova and Olusola Adesope},
url = {http://veneraarnaoudova.ca/wp-content/uploads/2019/07/2019-EMSE-PREPRINT-Measuring-the-Impact-of-Lexical-and-Structural-Inconsistencies-on-Developers-Cognitive-Load-during-Bug-Localization.pdf},
year = {2020},
date = {2020-05-14},
journal = {Empirical Software Engineering (EMSE)},
volume = {25},
pages = {2140--2178},
keywords = {Biometrics, empirical study, linguistic antipatterns, program comprehension},
pubstate = {published},
tppubtype = {article}
}
@inproceedings{Roy:icseTool:VITALSE,
title = {VITALSE: Visualizing Eye Tracking and Biometric Data},
author = {Devjeet Roy and Sarah Fakhoury and Venera Arnaoudova},
url = {http://veneraarnaoudova.com/wp-content/uploads/2020/02/2020-ICSE_Tool-PREPRINT-VITALSE-Visualizing-Eye-Tracking-and-Biometric-Data.pdf},
year = {2020},
date = {2020-05-01},
booktitle = {Proceedings of the International Conference on Software Engineering (ICSE) - Demonstrations Track},
pages = {57--60},
keywords = {Biometrics, empirical study, program comprehension, Tool},
pubstate = {published},
tppubtype = {inproceedings}
}
2019
@inproceedings{Fakhoury:icpc18:Readability,
title = {Improving Source Code Readability: Theory and Practice},
author = {Sarah Fakhoury and Devjeet Roy and Sk. Adnan Hassan and Venera Arnaoudova},
url = {http://veneraarnaoudova.ca/wp-content/uploads/2019/03/2019-ICPC-Reverse_Engineering_Readability_Metrics.pdf},
year = {2019},
date = {2019-03-18},
booktitle = {Proceedings of the International Conference on Program Comprehension (ICPC)},
pages = {2--12},
keywords = {empirical study, readability, source code identifiers},
pubstate = {published},
tppubtype = {inproceedings}
}
2018
@inproceedings{Fakhoury:ICPC18:CognitiveLoad,
title = {The Effect of Poor Source Code Lexicon and Readability on Developers' Cognitive Load},
author = {Sarah Fakhoury and Yuzhan Ma and Venera Arnaoudova and Olusola Adesope},
url = {http://veneraarnaoudova.ca/wp-content/uploads/2018/03/2018-ICPC-Effect-lexicon-cognitive-load.pdf},
year = {2018},
date = {2018-03-03},
booktitle = {Proceedings of the International Conference on Program Comprehension (ICPC)},
pages = {286--296},
note = {Distinguished Paper Award},
keywords = {Biometrics, empirical study, linguistic antipatterns, program comprehension, source code identifiers},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{Fakhoury:saner:CNN,
title = {Keep it simple: is deep learning good for linguistic smell detection?},
author = {Sarah Fakhoury and Venera Arnaoudova and Cedric Noiseux and Foutse Khomh and Giuliano Antoniol},
url = {http://veneraarnaoudova.ca/wp-content/uploads/2018/02/2018-SANER_RENE-preprint-simple-deep-learning.pdf},
year = {2018},
date = {2018-02-22},
booktitle = {Proceedings of the International Conference on Software Analysis, Evolution, and Reengineering (SANER)—REproducibility Studies and NEgative Results (RENE) Track},
keywords = {deep learning, empirical study, linguistic antipatterns, machine learning, source code identifiers, source code readability},
pubstate = {published},
tppubtype = {inproceedings}
}
2017
@article{Sabane:emse16:FBCP,
title = {Fragile base-class problem, problem?},
author = {Aminata Sabané and Yann-Gaël Guéhéneuc and Venera Arnaoudova and Giuliano Antoniol},
url = {https://urldefense.proofpoint.com/v2/url?u=http-3A__em.rdcu.be_wf_click-3Fupn-3DKP7O1RED-2D2BlD0F9LDqGVeSILuP3Pf-2D2F66xBYhaXrLLbVQ-2D3D-5FyWA3lQa11O-2D2BAiN-2D2BPLKTKSTkgdaT552EEbnrI10AtIGfXZA2SWwt0Ta0K7qb01t-2D2Bnkg5rKs4tRfeZQclfwwgzS9CjKsm-2D2BE8XVgKY-2D2FMBEeTbI4ZBUMxgYmkYDoUeqUNbmIS8mKc68Mn5V2y5VKD2DQvzAybMlvoI-2D2FEtH9rTW9Hhrn1xOS-2D2B-2D2FoDWwNAOHe9gCqHo3Zc-2D2FzTqelwMQnNej2RwFloPdYlHFSizzTJAaI2PcH2aAqm5f-2D2FfhkkROBvlrNzi3YgY-2D2F4TRyJevfUqSpbVjiHEA-2D3D-2D3D&d=DwMFaQ&c=C3yme8gMkxg_ihJNXS06ZyWk4EJm8LdrrvxQb-Je7sw&r=M3gB2eDIMjDnNPLag7B_NDsy5HOP9LrK_38NGrK0iSc&m=9N6oWIn0OxKScz7kmYCSE7aO4YLyGQHwz_yLrYeCpPM&s=0yPHFDTvblFeC4dHQreNjBaq547xGZ7LrEnzMLOz0nU&e=},
year = {2017},
date = {2017-10-01},
journal = {Empirical Software Engineering (EMSE)},
volume = {22},
number = {5},
pages = {2612--2657},
keywords = {change proneness, empirical study, fault proneness, inheritance},
pubstate = {published},
tppubtype = {article}
}
2015
@inproceedings{Panichella:saner15:CodeReviewsWarnings,
title = {Would Static Analysis Tools Help Developers with Code Reviews?},
author = {Sebastiano Panichella and Venera Arnaoudova and Massimiliano {Di Penta} and Giuliano Antoniol},
url = {http://www.veneraarnaoudova.ca/wp-content/uploads/2015/02/2015-SANER-Panichella-et-al-preprint.pdf},
year = {2015},
date = {2015-01-01},
booktitle = {International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
pages = {161--170},
keywords = {Code Review, empirical study, mining software repositories, static analysis},
pubstate = {published},
tppubtype = {inproceedings}
}
@article{LAsPerception-15,
title = {Linguistic Antipatterns: What They are and How Developers Perceive Them},
author = {Venera Arnaoudova and Massimiliano {Di Penta} and Giuliano Antoniol},
url = {/wp-content/uploads/2014/10/2014-EMSE-Arnaodova-et-al-Perception-LAs.pdf},
year = {2015},
date = {2015-01-01},
journal = {Empirical Software Engineering (EMSE)},
volume = {21},
number = {1},
pages = {104--158},
abstract = {Antipatterns are known as poor solutions to recurring problems. For example, Brown et al. and Fowler define practices concerning poor design or implementation solutions. However, we know that the source code lexicon is part of the factors that affect the psychological complexity of a program, i.e., factors that make a program difficult to understand and maintain by humans. The aim of this work is to identify recurring poor practices related to inconsistencies among the naming, documentation, and implementation of an entity—called Linguistic Antipatterns (LAs)—that may impair program understanding. To this end, we first mine examples of such inconsistencies in real open-source projects and abstract them into a catalog of 17 recurring LAs related to methods and attributes. Then, to understand the relevancy of LAs, we perform two empirical studies with developers—30 external (i.e., not familiar with the code) and 14 internal (i.e., people developing or maintaining the code). Results indicate that the majority of the participants perceive LAs as poor practices and therefore must be avoided—69% and 51% of the external and internal developers, respectively. As further evidence of LAs’ validity, open source developers that were made aware of LAs reacted to the issue by making code changes in 10% of the cases. Finally, in order to facilitate the use of LAs in practice, we identified a sub-set of LAs which were universally agreed upon as being problematic; those which had a clear dissonance between code behavior and lexicon.
},
keywords = {developers' perception, empirical study, linguistic antipatterns, natural language processing, source code identifiers},
pubstate = {published},
tppubtype = {article}
}
2014
@phdthesis{Arnaoudova:phd14:Lexicon,
title = {Towards Improving the Code Lexicon and its Consistency},
author = {Venera Arnaoudova},
url = {/wp-content/uploads/2014/09/2014-PhD_Thesis-Arnaoudova-LexiconConsistency.pdf},
year = {2014},
date = {2014-08-25},
school = {Polytechnique Montréal},
keywords = {developers' perception, empirical study, fault prediction, linguistic antipatterns, program comprehension, renaming, source code identifiers},
pubstate = {published},
tppubtype = {phdthesis}
}
@article{REPENT-14,
title = {REPENT: Analyzing the Nature of Identifier Renamings},
author = {Venera Arnaoudova and Laleh {Mousavi Eshkevari} and Massimiliano {Di Penta} and Rocco Oliveto and Giuliano Antoniol and Yann-Gaël Guéhéneuc},
year = {2014},
date = {2014-01-01},
journal = {IEEE Transactions on Software Engineering (TSE)},
volume = {40},
number = {5},
pages = {502--532},
abstract = {Source code lexicon plays a paramount role in software quality: poor lexicon can lead to poor comprehensibility and even increase software fault-proneness. For this reason, renaming a program entity, i.e., altering the entity identifier, is an important activity during software evolution. Developers rename when they feel that the name of an entity is not (anymore) consistent with its functionality, or when such a name may be misleading. A survey that we performed with 71 developers suggests that 39 percent perform renaming from a few times per week to almost every day and that 92 percent of the participants consider that renaming is not straightforward. However, despite the cost that is associated with renaming, renamings are seldom if ever documented—for example, less than 1 percent of the renamings in the five programs that we studied. This explains why participants largely agree on the usefulness of automatically documenting renamings. In this paper we propose REANAMING PROGRAM ENTITIES (REPENT), an approach to automatically document—detect and classify—identifier renamings in source code. REPENT detects renamings based on a combination of source code differencing and data flow analyses. Using a set of natural language tools, REPENT classifies renamings into the different dimensions of a taxonomy that we defined. Using the documented renamings, developers will be able to, for example, look up methods that are part of the public API (as they impact client applications), or look for inconsistencies between the name and the implementation of an entity that underwent a high risk renaming (e.g., towards the opposite meaning). We evaluate the accuracy and completeness of REPENT on the evolution history of five open-source Java programs. The study indicates a precision of
88 percent and a recall of 92 percent. In addition, we report an exploratory study investigating and discussing how identifiers are renamed in the five programs, according to our taxonomy.},
keywords = {empirical study, mining software repositories, refactoring, renaming, source code identifiers},
pubstate = {published},
tppubtype = {article}
}
@article{SCAN-14,
title = {SCAN: An Approach to Label and Relate Execution Trace Segments},
author = {Soumaya Medini and Venera Arnaoudova and Massimiliano {Di Penta} and Giuliano Antoniol and Yann-Gaël Guéhéneuc and Paolo Tonella},
year = {2014},
date = {2014-01-01},
journal = {Journal of Software: Evolution and Process (JSEP)},
volume = {26},
number = {11},
pages = {962--995},
abstract = {Program comprehension is a prerequisite to any maintenance and evolution task. In particular, when performing feature location, developers perform program comprehension by abstracting software features and identifying the links between high-level abstractions (features) and program elements.
We present Segment Concept AssigNer (SCAN), an approach to support developers in feature location. SCAN uses a search-based approach to split execution traces into cohesive segments. Then, it labels the segments with relevant keywords and, finally, uses formal concept analysis to identify relations among segments. In a first study, we evaluate the performances of SCAN on six Java programs by 31 participants. We report an average precision of 69% and a recall of 63% when comparing the manual and automatic labels and a precision of 63% regarding the relations among segments identified by SCAN. After that, we evaluate the usefulness of SCAN for the purpose of feature location on two Java programs. We provide evidence that SCAN (i) identifies 69% of the gold set methods and (ii) is effective in reducing the quantity of information that developers must process to locate features—reducing the number of methods to understand by an average of 43% compared to the entire execution traces.},
keywords = {concept identification, dynamic analysis, empirical study, formal concept analysis, information retrieval},
pubstate = {published},
tppubtype = {article}
}
2008
@article{2008-JOT-Arnaoudova-AspectJ,
title = {Overcoming comprehension barriers in the AspectJ programming language},
author = {Venera Arnaoudova and Laleh {Mousavi Eshkevari} and Elaheh {Safari Sharifabadi} and Constantinos Constantinides},
year = {2008},
date = {2008-01-01},
journal = {Journal of Object Technology (JOT)},
volume = {7},
number = {6},
pages = {121--142},
keywords = {aspect-oriented programming, empirical study, program comprehension},
pubstate = {published},
tppubtype = {article}
}