{"id":110,"date":"2023-11-17T11:41:32","date_gmt":"2023-11-17T11:41:32","guid":{"rendered":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/chapter\/issues-with-data-bias-and-fairness\/"},"modified":"2024-01-31T08:11:06","modified_gmt":"2024-01-31T08:11:06","slug":"issues-with-data-bias-and-fairness","status":"publish","type":"chapter","link":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/chapter\/issues-with-data-bias-and-fairness\/","title":{"raw":"Issues with Data: Bias and Fairness","rendered":"Issues with Data: Bias and Fairness"},"content":{"raw":"<p class=\"no-indent\">Bias is prejudice towards or against an identity, whether\u00a0 good or bad, intentional or unintentional<sup>1<\/sup>. Fairness is the counter to this bias and more: when everyone is treated fairly, regardless of their identity and situation. Clear processes have to be set and followed to make sure everyone is treated equitably and has equal access to opportunity<sup>1<\/sup>.<\/p>\nHuman-based systems often comprise a lot of bias and discrimination. Every person has their own unique set of opinions and prejudices. They too are black boxes whose decisions, such as how they score answer sheets, can be difficult to understand. But we have developed strategies and established structures to watch out for and question such practices.\n\nAutomated systems are sometimes touted as the panacea to human subjectivity: Algorithms are based on numbers, so how can they have biases? Algorithms based on flawed data, among other things, can not only pick up and learn existing biases pertaining to gender, race, culture or disability \u2013 they can amplify existing\u00a0biases<sup>1,2,3<\/sup>. 
And even if they are not locked behind proprietary walls, they can\u2019t be called upon to explain their actions due to the inherent lack of explainability in some systems such as those based on <a href=\"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/chapter\/deep-neural-networks\/\" target=\"_blank\" rel=\"noopener\">Deep Neural Networks<\/a>.\n<h3 style=\"text-align: left\">Examples of bias entering AIED systems<\/h3>\n<ol>\n \t<li style=\"text-align: justify\">When programmers code rule-based systems, they can put their personal biases and stereotypes into the system<sup>1<\/sup>.<\/li>\n \t<li style=\"font-weight: 400;text-align: justify\">A data-based algorithm can conclude that it should not propose a STEM-based career path for girls because female students feature less in the STEM graduate dataset. Is the smaller number of female mathematicians due to existing stereotypes and societal norms, or is it due to some inherent property of being female? Algorithms have no way to distinguish between the two situations. Since existing data reflects existing stereotypes, the algorithms that train on it replicate existing inequalities and social dynamics<sup>4<\/sup>. Further, if such recommendations are implemented, more girls will opt for non-STEM subjects and the new data will reflect this \u2013 a case of a self-fulfilling prophecy<sup>3<\/sup>.<\/li>\n \t<li style=\"text-align: justify\">Students from a culture that is under-represented in the training dataset might have different behaviour patterns and different ways of showing motivation. How would a learning analytics system calculate metrics for them? If the data is not representative of all categories of students, systems trained on this data might penalise the minority whose behavioural tendencies are not what the program was optimised to reward. If we\u2019re not careful, learning algorithms will generalise, based on the majority culture, leading to a high error rate for minority groups<sup>4,5<\/sup>. 
Such decisions might discourage those who could bring diversity, creativity and unique talents and those who have different experiences, interests and motivations<sup>2<\/sup>.<\/li>\n \t<li style=\"text-align: justify\">A British student judged by a US essay correction software would be penalised for their spelling mistakes. Local language, changes in spelling and accent, local geography and culture would always be tricky for systems that are designed and trained for another country and another context.<\/li>\n \t<li style=\"text-align: justify\">Some teachers penalise phrases common to a class or region, either consciously or because of biased social associations. If an essay-grading software trains on essays graded by these teachers, it will replicate the same bias.<\/li>\n \t<li style=\"text-align: justify\">Machine learning systems need a <a href=\"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/chapter\/the-flip-side-of-als-some-paradigms-to-take-note-of\/\" target=\"_blank\" rel=\"noopener\">target variable and proxies for which to optimise<\/a>. Let us say high-school test scores were taken as the proxy for academic excellence. The system will now train exclusively to boost patterns that are consistent with students who do well under the stress and narrowed contexts of exam halls. Such systems will boost test scores, and not knowledge, when recommending resources and practice exercises to students. While this might also be true in many of today\u2019s classrooms, the traditional approach at least makes it possible to express multiple goals<sup>4<\/sup>.<\/li>\n \t<li style=\"text-align: justify\">Adaptive learning systems suggest resources to students that will remedy a lack of skill or knowledge. 
If these resources need to be bought or require a home internet connection, then it is not fair to those students who don\u2019t have the means to follow the recommendations: <em>\u201cWhen an algorithm suggests hints, next steps, or resources to a student, we have to check whether the help-giving is unfair because one group systematically does not get useful help which is discriminatory<\/em>\u201d<sup>2<\/sup>.<\/li>\n \t<li style=\"text-align: justify\">The concept of personalising education according to a student\u2019s current knowledge level and tastes might in itself constitute a bias<sup>1<\/sup>. Aren\u2019t we also stopping this student from exploring new interests and alternatives? Wouldn\u2019t this make him or her one-dimensional and reduce overall skills, knowledge and access to opportunities?<\/li>\n<\/ol>\n[caption id=\"attachment_638\" align=\"aligncenter\" width=\"1024\"]<img class=\"wp-image-638 size-large\" src=\"http:\/\/aiopentext.itd.cnr.it\/wp-content\/uploads\/sites\/10\/2023\/11\/ch3-page6-bias-scaled-1.jpg\" alt=\"\" width=\"1024\" height=\"724\"> \"'Data and algorithmic bias in the web'\" by jennychamux is licenced under CC BY 2.0. To view a copy of this licence, visit <a href=\"https:\/\/creativecommons.org\/licenses\/by\/2.0\/?ref=openverse\">https:\/\/creativecommons.org\/licenses\/by\/2.0\/?ref=openverse<\/a>.[\/caption]\n<h3 style=\"text-align: left\">What can the teacher do to reduce the effects of AIED biases?<\/h3>\n<p class=\"no-indent\">Researchers are constantly proposing and analysing different ways to reduce bias. 
But not all methods are easy to implement \u2013 fairness goes deeper than mitigating bias.<\/p>\n<p class=\"indent\">For example, if existing data is full of stereotypes \u2013 \u201c<em>do we have an obligation to question the data and to design our systems to conform to some notion of equitable behaviour, regardless of whether or not that\u2019s supported by the data currently available to us?<\/em>\u201d<sup>4<\/sup>. Methods are often in tension with each other, and some interventions to reduce one kind of bias can introduce another bias!<\/p>\n<p class=\"indent\">So, what can the teacher do?<\/p>\n\n<ol>\n \t<li><strong>Question the seller<\/strong> \u2013 before subscribing to an AIED system, ask what type of datasets were used to train it, where, by whom and for whom it was conceived and designed, and how it was evaluated.<\/li>\n \t<li><strong>Don\u2019t swallow the metrics<\/strong> when you invest in an AIED system. An overall error rate of, say, five percent might hide the fact that a model performs badly for a minority group<sup>4<\/sup>.<\/li>\n \t<li><strong>Look at the documentation <\/strong>\u2013 what measures, if any, have been taken to detect and counter bias and enforce fairness<sup>1<\/sup>?<\/li>\n \t<li><strong>Find out about the developers<\/strong> \u2013 are they solely computer science experts or were educational researchers and teachers involved? Is the system based solely on machine learning or were learning theory and practices integrated<sup>2<\/sup>?<\/li>\n \t<li><strong>Give preference to transparent and open learner models which allow you to override decisions<\/strong><sup>2<\/sup>. Many AIED models have flexible designs whereby the teacher and student can monitor the system, ask for explanations or ignore the machine\u2019s decisions.<\/li>\n \t<li><strong>Examine the product\u2019s accessibility<\/strong>. 
Can everyone access it equally, regardless of ability<sup>1<\/sup>?<\/li>\n \t<li><strong>Watch out for the effects of using a technology<\/strong>, both long-term and short-term, on students, and be ready to offer assistance when necessary.<\/li>\n<\/ol>\n<p class=\"no-indent\">Despite the problems of AI-based technology, we can be optimistic about the future of AIED:<\/p>\n\n<ul>\n \t<li>With increased awareness of these topics, methods to detect and correct bias are being researched and trialled;<\/li>\n \t<li>Rule-based and data-based systems can uncover hidden biases in existing educational practices. Exposed thus, these biases can then be dealt with;<\/li>\n \t<li style=\"text-align: justify\">With the potential for customisation in AI systems, many aspects of education could be tailored. Resources could become more responsive to students\u2019 knowledge and experience. Perhaps they could integrate local communities and cultural assets, and meet specific local needs<sup>2<\/sup>.<\/li>\n<\/ul>\n\n<hr>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>1\u00a0<\/sup><a href=\"https:\/\/education.ec.europa.eu\/news\/ethical-guidelines-on-the-use-of-artificial-intelligence-and-data-in-teaching-and-learning-for-educators\">Ethical guidelines on the use of artificial intelligence and data in teaching and learning for educators<\/a>, European Commission, October 2022.<\/p>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>2\u00a0<\/sup>U.S. 
Department of Education, Office of Educational Technology, <em>Artificial Intelligence and Future of Teaching and Learning: Insights and Recommendations<\/em>, Washington, DC, 2023.<\/p>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>3 <\/sup>Kelleher, J.D, Tierney, B, <em>Data Science<\/em>, MIT Press, London, 2018.<\/p>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>4 <\/sup>Barocas, S.,\u00a0 Hardt, M., Narayanan, A., <em><a href=\"https:\/\/fairmlbook.org\/\" target=\"_blank\" rel=\"noopener\" data-cke-saved-href=\"https:\/\/fairmlbook.org\/\">Fairness and machine learning Limitations and Opportunities<\/a>, <\/em>MIT Press, 2023.<\/p>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>5\u00a0<\/sup>Milano, S., Taddeo, M., Floridi, L., Recommender systems and their ethical challenges, AI &amp; Soc 35, 957\u2013967, 2020.<\/p>","rendered":"<p class=\"no-indent\">Bias is prejudice towards or against an identity, whether\u00a0 good or bad, intentional or unintentional<sup>1<\/sup>. Fairness is the counter to this bias and more: when everyone is treated fairly, regardless of their identity and situation. Clear processes have to be set and followed to make sure everyone is treated equitably and has equal access to opportunity<sup>1<\/sup>.<\/p>\n<p>Human-based systems often comprise a lot of bias and discrimination. Every person has their own unique set of opinions and prejudices. They too are black boxes whose decisions, such as how they score answer sheets, can be difficult to understand. But we have developed strategies and established structures to watch out for and question such practices.<\/p>\n<p>Automated systems are sometimes touted as the panacea to human subjectivity: Algorithms are based on numbers, so how can they have biases? 
Algorithms based on flawed data, among other things, can not only pick up and learn existing biases pertaining to gender, race, culture or disability \u2013 they can amplify existing\u00a0biases<sup>1,2,3<\/sup>. And even if they are not locked behind proprietary walls, they can\u2019t be called upon to explain their actions due to the inherent lack of explainability in some systems such as those based on <a href=\"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/chapter\/deep-neural-networks\/\" target=\"_blank\" rel=\"noopener\">Deep Neural Networks<\/a>.<\/p>\n<h3 style=\"text-align: left\">Examples of bias entering AIED systems<\/h3>\n<ol>\n<li style=\"text-align: justify\">When programmers code rule-based systems, they can put their personal biases and stereotypes into the system<sup>1<\/sup>.<\/li>\n<li style=\"font-weight: 400;text-align: justify\">A data-based algorithm can conclude that it should not propose a STEM-based career path for girls because female students feature less in the STEM graduate dataset. Is the smaller number of female mathematicians due to existing stereotypes and societal norms, or is it due to some inherent property of being female? Algorithms have no way to distinguish between the two situations. Since existing data reflects existing stereotypes, the algorithms that train on it replicate existing inequalities and social dynamics<sup>4<\/sup>. Further, if such recommendations are implemented, more girls will opt for non-STEM subjects and the new data will reflect this &#8211; a case of a self-fulfilling prophecy<sup>3<\/sup>.<\/li>\n<li style=\"text-align: justify\">Students from a culture that is under-represented in the training dataset might have different behaviour patterns and different ways of showing motivation. How would a learning analytics system calculate metrics for them? 
If the data is not representative of all categories of students, systems trained on this data might penalise the minority whose behavioural tendencies are not what the program was optimised to reward. If we\u2019re not careful, learning algorithms will generalise, based on the majority culture, leading to a high error rate for minority groups<sup>4,5<\/sup>. Such decisions might discourage those who could bring diversity, creativity and unique talents and those who have different experiences, interests and motivations<sup>2<\/sup>.<\/li>\n<li style=\"text-align: justify\">A British student judged by a US essay correction software would be penalised for their spelling mistakes. Local language, changes in spelling and accent, local geography and culture would always be tricky for systems that are designed and trained for another country and another context.<\/li>\n<li style=\"text-align: justify\">Some teachers penalise phrases common to a class or region, either consciously or because of biased social associations. If an essay-grading software trains on essays graded by these teachers, it will replicate the same bias.<\/li>\n<li style=\"text-align: justify\">Machine learning systems need a <a href=\"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/chapter\/the-flip-side-of-als-some-paradigms-to-take-note-of\/\" target=\"_blank\" rel=\"noopener\">target variable and proxies for which to optimise<\/a>. Let us say high-school test scores were taken as the proxy for academic excellence. The system will now train exclusively to boost patterns that are consistent with students who do well under the stress and narrowed contexts of exam halls. Such systems will boost test scores, and not knowledge, when recommending resources and practice exercises to students. 
While this might also be true in many of today\u2019s classrooms, the traditional approach at least makes it possible to express multiple goals<sup>4<\/sup>.<\/li>\n<li style=\"text-align: justify\">Adaptive learning systems suggest resources to students that will remedy a lack of skill or knowledge. If these resources need to be bought or require a home internet connection, then it is not fair to those students who don\u2019t have the means to follow the recommendations: <em>\u201cWhen an algorithm suggests hints, next steps, or resources to a student, we have to check whether the help-giving is unfair because one group systematically does not get useful help which is discriminatory<\/em>\u201d<sup>2<\/sup>.<\/li>\n<li style=\"text-align: justify\">The concept of personalising education according to a student\u2019s current knowledge level and tastes might in itself constitute a bias<sup>1<\/sup>. Aren\u2019t we also stopping this student from exploring new interests and alternatives? Wouldn\u2019t this make him or her one-dimensional and reduce overall skills, knowledge and access to opportunities?<\/li>\n<\/ol>\n<figure id=\"attachment_638\" aria-describedby=\"caption-attachment-638\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-638 size-large\" src=\"http:\/\/aiopentext.itd.cnr.it\/wp-content\/uploads\/sites\/10\/2023\/11\/ch3-page6-bias-scaled-1.jpg\" alt=\"\" width=\"1024\" height=\"724\" \/><figcaption id=\"caption-attachment-638\" class=\"wp-caption-text\">&#8220;&#8216;Data and algorithmic bias in the web'&#8221; by jennychamux is licenced under CC BY 2.0. 
To view a copy of this licence, visit <a href=\"https:\/\/creativecommons.org\/licenses\/by\/2.0\/?ref=openverse\">https:\/\/creativecommons.org\/licenses\/by\/2.0\/?ref=openverse<\/a>.<\/figcaption><\/figure>\n<h3 style=\"text-align: left\">What can the teacher do to reduce the effects of AIED biases?<\/h3>\n<p class=\"no-indent\">Researchers are constantly proposing and analysing different ways to reduce bias. But not all methods are easy to implement \u2013 fairness goes deeper than mitigating bias.<\/p>\n<p class=\"indent\">For example, if existing data is full of stereotypes \u2013 \u201c<em>do we have an obligation to question the data and to design our systems to conform to some notion of equitable behaviour, regardless of whether or not that\u2019s supported by the data currently available to us?<\/em>\u201d<sup>4<\/sup>. Methods are often in tension with each other, and some interventions to reduce one kind of bias can introduce another bias!<\/p>\n<p class=\"indent\">So, what can the teacher do?<\/p>\n<ol>\n<li><strong>Question the seller<\/strong> \u2013 before subscribing to an AIED system, ask what type of datasets were used to train it, where, by whom and for whom it was conceived and designed, and how it was evaluated.<\/li>\n<li><strong>Don\u2019t swallow the metrics<\/strong> when you invest in an AIED system. An overall error rate of, say, five percent might hide the fact that a model performs badly for a minority group<sup>4<\/sup>.<\/li>\n<li><strong>Look at the documentation <\/strong>\u2013 what measures, if any, have been taken to detect and counter bias and enforce fairness<sup>1<\/sup>?<\/li>\n<li><strong>Find out about the developers<\/strong> \u2013 are they solely computer science experts or were educational researchers and teachers involved? 
Is the system based solely on machine learning or were learning theory and practices integrated<sup>2<\/sup>?<\/li>\n<li><strong>Give preference to transparent and open learner models which allow you to override decisions<\/strong><sup>2<\/sup>. Many AIED models have flexible designs whereby the teacher and student can monitor the system, ask for explanations or ignore the machine\u2019s decisions.<\/li>\n<li><strong>Examine the product\u2019s accessibility<\/strong>. Can everyone access it equally, regardless of ability<sup>1<\/sup>?<\/li>\n<li><strong>Watch out for the effects of using a technology<\/strong>, both long-term and short-term, on students, and be ready to offer assistance when necessary.<\/li>\n<\/ol>\n<p class=\"no-indent\">Despite the problems of AI-based technology, we can be optimistic about the future of AIED:<\/p>\n<ul>\n<li>With increased awareness of these topics, methods to detect and correct bias are being researched and trialled;<\/li>\n<li>Rule-based and data-based systems can uncover hidden biases in existing educational practices. Exposed thus, these biases can then be dealt with;<\/li>\n<li style=\"text-align: justify\">With the potential for customisation in AI systems, many aspects of education could be tailored. Resources could become more responsive to students\u2019 knowledge and experience. Perhaps they could integrate local communities and cultural assets, and meet specific local needs<sup>2<\/sup>.<\/li>\n<\/ul>\n<hr \/>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>1\u00a0<\/sup><a href=\"https:\/\/education.ec.europa.eu\/news\/ethical-guidelines-on-the-use-of-artificial-intelligence-and-data-in-teaching-and-learning-for-educators\">Ethical guidelines on the use of artificial intelligence and data in teaching and learning for educators<\/a>, European Commission, October 2022.<\/p>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>2\u00a0<\/sup>U.S. 
Department of Education, Office of Educational Technology, <em>Artificial Intelligence and Future of Teaching and Learning: Insights and Recommendations<\/em>, Washington, DC, 2023.<\/p>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>3 <\/sup>Kelleher, J.D, Tierney, B, <em>Data Science<\/em>, MIT Press, London, 2018.<\/p>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>4 <\/sup>Barocas, S.,\u00a0 Hardt, M., Narayanan, A., <em><a href=\"https:\/\/fairmlbook.org\/\" target=\"_blank\" rel=\"noopener\" data-cke-saved-href=\"https:\/\/fairmlbook.org\/\">Fairness and machine learning Limitations and Opportunities<\/a>, <\/em>MIT Press, 2023.<\/p>\n<p class=\"hanging-indent\" style=\"text-align: left\"><sup>5\u00a0<\/sup>Milano, S., Taddeo, M., Floridi, L., Recommender systems and their ethical challenges, AI &amp; Soc 35, 957\u2013967, 2020.<\/p>\n","protected":false},"author":1,"menu_order":6,"template":"","meta":{"pb_show_title":"","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"part":84,"_links":{"self":[{"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/pressbooks\/v2\/chapters\/110"}],"collection":[{"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/wp\/v2\/users\/1"}],"version-history":[{"count":1,"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/pressbooks\/v2\/chapters\/110\/revisions"}],"predecessor-version":[{"id":111,"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/pressbooks\/v2\/chapters\/110\/revisions\/111"}],"part":[{"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/pressbooks\/v2\/parts\/84"}],"metadata":[{"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-js
on\/pressbooks\/v2\/chapters\/110\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/wp\/v2\/media?parent=110"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/pressbooks\/v2\/chapter-type?post=110"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/wp\/v2\/contributor?post=110"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/aiopentext.itd.cnr.it\/aiforteacher\/wp-json\/wp\/v2\/license?post=110"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}