Abstract: |
The underrepresentation of women in open-source software is frequently
attributed to an innate difference in aptitude between women and men, that is,
to natural gender differences in technical ability (Trinkenreich et al., 2021). Approaching code
as a form of communication, I conduct a novel empirical study of gender
differences in Python programming on GitHub. Based on 1,728 open-source
projects, I ask whether there is a gender difference in the quality and style of
Python code, measured by adherence to PEP-8 guidelines. I find significant
gender differences in the structure and organization of Python files. While
there is gendered variation in programming style, there is no evidence of
a gender difference in code quality. Using a Random Forest model, I show that
the gender of a programmer can be predicted from the style of their Python
code. The study concludes that gender differences in Python code are a matter
of style, not quality. |