The challenges of using machine learning for organ allocation: Reply to Sinnott-Armstrong and Skorburg. By Esther Braun, Noah Broestl, Dorothy Chou, and Robert Vandersluis. Published October 15, 2021, in response to: How AI Can Aid Bioethics.
In their paper “How AI Can Aid Bioethics”, Sinnott-Armstrong and Skorburg propose an AI system for kidney allocation based on the preferences of survey participants. Their claim that AI systems trained on large-scale survey data will result in more “informed, rational, and impartial” outcomes is optimistic. In an earlier paper, Sinnott-Armstrong and others state that “aggregating the moral views of multiple humans […] may result in a morally better system than that of any individual human, for example because idiosyncratic moral mistakes made by individual humans are washed out in the aggregate” (Conitzer, Sinnott-Armstrong et al., 2017). We argue, to the contrary, that without integration into a moral framework, survey results alone cannot “provide evidence of what is morally right or wrong” or be “helpful to us in deciding what we should believe and do in complex moral situations” (Sinnott-Armstrong and Skorburg, 2021), such as kidney transplant allocation decisions.
Attempting to build “morality into AI” (Sinnott-Armstrong and Skorburg, 2021) by exclusively drawing on public attitudes seems to commit the is/ought fallacy that has been extensively criticized in the literature on empirical ethics (Salloch et al., 2014). Survey results can only provide information on the majority view, but not on what is morally right. Drawing prescriptive claims from empirical data on the public’s attitudes seems to assume that the majority view equals the morally correct view. However, popular acceptance of a certain practice cannot serve as an ethical justification for that practice, as demonstrated by many historical examples of morally unacceptable practices which were widely accepted within society.
Popular opinion alone cannot serve as a proxy for ethical decision-making, as the outcomes are likely to exacerbate existing inequities and injustices. Sinnott-Armstrong and Skorburg state that humans are “biased, ignorant and confused”. However, the proposed AI system is likely to be just as biased, ignorant, and confused as the opinions it is based on. The authors believe that bias in the AI system can be avoided by asking survey participants to identify factors they regard as biased. But merely excluding attributes deemed biased by participants, without further ethical analysis, may have unforeseen consequences. To illustrate this point, we consider the role of race in greater detail, arguing that simply excluding race from organ allocation decisions – as superficially advocated in survey responses – would neither reflect an appropriately nuanced approach to eliminating bias, nor be technically achievable through the simple aggregation of survey participants’ preferences.
The Example of Race
The authors state that participants in the preliminary survey viewed race as a morally irrelevant and inherently biased criterion that should not affect organ allocation. The participants seem to endorse what has been called a “colourblind” approach that assumes ignoring race will prevent bias and discrimination. The term colourblindness refers to claims of not “seeing” race and only noticing the “relevant” attributes of a person. However, a colourblind approach may actually contribute to rather than prevent injustice (Braddock, 2021).
While black patients have the highest incidence of kidney failure among ethnic groups in the U.S., they are less likely to receive kidney transplants compared to white patients (Wesselman et al., 2021). This holds true even after adjusting for socioeconomic factors and differences in comorbidities (Ku et al., 2020). Black patients in the U.S. are also less likely to be waitlisted for kidney transplantation than white patients, even after adjusting for medical factors and social determinants of health (Ng et al., 2020). Unconscious provider bias may play a role in whether an individual achieves waitlisting and ultimately transplantation (Reed and Locke, 2020).
Moreover, brain-dead black patients are less likely to become organ donors. Families of potential black organ donors are less likely to be approached by organ procurement organisations, and white families are more often correctly perceived as receptive to donation than black families. Black families are also given fewer opportunities to consider the decision with healthcare staff or representatives of organ procurement organisations (Siminoff et al., 2003). Since donors and recipients of the same ethnicity have a higher probability of matching tissue markers, which are necessary for successful transplantation, a lower percentage of black organ donors results in a lower number of black organ recipients.
While survey participants may have good intentions in excluding race, a “colourblind” approach could have harmful consequences. Race is a significant factor in determining the likelihood of receiving a kidney transplant, even where racial status is superficially excluded from allocation criteria. Contrary to the intuitions of the participants in the study conducted by Sinnott-Armstrong and Skorburg, it can be argued that race may, or even should, be taken into account in organ allocation. For example, if a transplant organ matches a white and a black patient with similar clinical need, it may be justified to prioritize the black patient, as it is less likely that another organ matching their tissue markers will become available soon. If the probability of receiving another organ in the future is much lower for black patients, an algorithm that gives a black and a white patient with identical characteristics an equal chance of receiving an organ may not actually be just.
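The intuition behind this argument can be made concrete with a simple expected-value calculation. The figures below are purely hypothetical and chosen only for illustration; they are not drawn from transplant data, and the geometric waiting-time model is a deliberate simplification.

```python
# Hypothetical per-year probabilities that a compatible organ becomes
# available for each patient, reflecting the lower tissue-match rate
# for patients from an under-represented donor pool. Illustrative only.
p_match_patient_a = 0.5   # patient drawn from the majority donor pool
p_match_patient_b = 0.1   # patient from an under-represented donor pool

# Under a simple geometric model of independent yearly draws, the
# expected waiting time (in years) for a compatible organ is 1/p.
expected_wait_a = 1 / p_match_patient_a   # 2.0 years
expected_wait_b = 1 / p_match_patient_b   # 10.0 years

print(expected_wait_a, expected_wait_b)
```

Under these assumptions, a coin-flip between the two patients for the current organ leaves patient B facing a far longer expected wait upon losing, which is the sense in which a formally “equal” lottery may not be substantively just.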
Furthermore, a superficially “colourblind” approach might be far from colourblind in reality, as other data points seemingly unrelated to race can act as powerful racial proxies within datasets, amplifying existing inequalities. For example, participants in the survey conducted by Sinnott-Armstrong and Skorburg considered factors such as “mental health, record of violent or non-violent crime” as relevant for organ allocation. Both crime and mental illness can be associated with lower socioeconomic status (Anglin et al., 2021, Sharkey et al., 2016), and are reportedly more prevalent among ethnic minority groups for a number of reasons, including over-surveillance (Privacy International, 2020). Including such factors could therefore embed powerful, inequality-magnifying racial proxies within an organ allocation process. This dynamic has been observed in algorithmic approaches to prison sentencing and predictive policing, which have been heavily criticised for compounding biased outcomes in the criminal justice system (Angwin et al., 2016, Heaven, 2021). Thus, even though survey participants support the exclusion of race from allocation decisions, they also support the inclusion of attributes that can serve as proxies for race. The proposed approach does not provide a solution for situations in which attributes considered relevant for allocation serve as proxies for attributes that were considered discriminatory.
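The proxy problem can be illustrated with a small synthetic simulation. Everything below is stylised and assumed for the sketch – the group labels, the 0.4 versus 0.1 record rates (standing in for over-surveillance), and the allocation rule – none of it is an empirical claim about real allocation systems.

```python
import random

random.seed(0)

# Synthetic population: group membership is recorded only so we can
# audit the outcome; it is never shown to the allocation rule below.
def make_patient(group):
    # Over-surveillance is modelled as a higher rate of having a
    # criminal record in group B, for otherwise identical patients.
    record_rate = 0.4 if group == "B" else 0.1
    return {"group": group, "record": random.random() < record_rate}

patients = ([make_patient("A") for _ in range(10_000)]
            + [make_patient("B") for _ in range(10_000)])

# "Colourblind" allocation rule: race is excluded, but a criminal
# record (an attribute respondents deemed relevant) lowers priority.
def allocated(patient):
    return not patient["record"]

def allocation_rate(group):
    members = [p for p in patients if p["group"] == group]
    return sum(allocated(p) for p in members) / len(members)

print(allocation_rate("A"))  # ~0.90
print(allocation_rate("B"))  # ~0.60
```

Even though group membership never enters the rule, the two groups end up with systematically different allocation rates, because the included attribute functions as a proxy for the excluded one.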
AI is not a Magic Bullet
While novel AI techniques such as machine learning have proven capable of solving complex problems, they are not a magic bullet. AI systems require careful problem definition, construction, and testing in order to be a suitable solution for a specific problem. After exploring several more complex AI techniques, Sinnott-Armstrong and Skorburg propose a rather naive AI system with the aim of eliminating bias in the decision-making process. However, AI systems can also carry inherent bias – replacing existing decision-making systems with AI may therefore merely replace certain types of bias with others.
The authors claim that the proposed AI system will be able to remove bias by surveying a large number of people. Aggregating the views of many individuals might eliminate some biases, such as those held by only a minority of the group, or differing and conflicting biases. However, a machine learning model trained on the aggregated opinions of many individuals will also amplify the biases held by the majority of survey respondents. While the biases of individuals might be removed, the biases of the group will be reproduced and exacerbated. One cannot assume that an outcome from an AI system will be less biased than one determined by a medical expert, because any system based on survey data will reflect and amplify the biases inherent in public opinion. If survey respondents are less aware of the morally and medically relevant issues than experts, as we can assume of the general population, then the amplified biases will lead an AI system trained on this data to make less morally and medically sound decisions.
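The amplification point is statistical rather than philosophical, and a toy simulation makes it concrete. The 60 % figure for how many respondents share a given bias, and the 10 % rate of idiosyncratic flips, are arbitrary assumptions chosen only for illustration.

```python
import random

random.seed(1)

# Assumption: 60% of respondents share one bias, favouring option X
# on a given allocation question for a morally irrelevant reason.
# Individually, a random respondent is therefore biased ~60% of the time.
BIASED_SHARE = 0.6
N_RESPONDENTS = 1001
N_QUESTIONS = 1000

def majority_is_biased():
    # Each respondent votes; biased respondents vote X, others vote Y,
    # with 10% idiosyncratic flips on both sides (individual "noise").
    votes_x = 0
    for _ in range(N_RESPONDENTS):
        biased = random.random() < BIASED_SHARE
        noise = random.random() < 0.1
        votes_x += biased != noise  # vote X unless flipped by noise
    return votes_x > N_RESPONDENTS / 2

# Fraction of majority-vote training labels that encode the shared bias.
biased_labels = sum(majority_is_biased() for _ in range(N_QUESTIONS))
print(biased_labels / N_QUESTIONS)  # close to 1.0
```

In the aggregate, the idiosyncratic flips cancel out, but the shared 60 % bias determines nearly every majority label: a model trained on these labels would inherit the group’s bias at close to full strength, not at the 60 % rate of any individual respondent.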
To further illustrate this trade-off in biases, consider the scenario mentioned by Sinnott-Armstrong and Skorburg in which a doctor wakes up at 3 a.m. and must decide who receives a kidney transplant. The idea that she would consult an online voting system, awarding the kidney to whoever receives the most votes, is neither morally nor medically justifiable. Yet, in effect, that is what is being proposed by a machine learning system built on a large-scale survey.
We recognise that empirical research can assist in identifying ethical problems (Salloch et al., 2014), and drawing on the experiences of relevant stakeholder groups can help to recognize new moral arguments (Ives and Draper, 2009). Empirical research can also aid in identifying gaps in public knowledge that subsequent education efforts can be based on, as well as informing policy makers about a population’s stance on new technologies (Levitt, 2003). The understanding of and general consensus with the ethical principles underlying the decisions made by AI systems are important for the public’s acceptance and endorsement of these tools (Awad et al., 2018). Additionally, bias in AI may be avoided by consulting minority voices (Gebru, 2020).
Despite the benefits of empirically informed ethical approaches, we argue that it is not possible to use AI systems trained based on large quantitative survey data to eliminate bias or deliver fairer or more ethical outcomes. To answer complex moral questions on organ allocation, adequate empirical information, medical expertise, as well as ethical analysis are all needed. Quantitative survey results are therefore less than ideal as a basis for ethical decision-making.
Anglin, D. M., Ereshefsky, S., Klaunig, M. J., Bridgwater, M. A., Niendam, T. A., Ellman, L. M., Devylder, J., Thayer, G., Bolden, K., Musket, C. W., Grattan, R. E., Lincoln, S. H., Schiffman, J., Lipner, E., Bachman, P., Corcoran, C. M., Mota, N. B. & Van Der Ven, E. 2021. From Womb to Neighborhood: A Racial Analysis of Social Determinants of Psychosis in the United States. Am J Psychiatry, appiajp202020071091.
Angwin, J., Larson, J., Mattu, S. & Kirchner, L. 2016. Machine Bias [Online]. Available: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing [Accessed 7/7/2021].
Awad, E., Dsouza, S., Kim, R., Schulz, J., Henrich, J., Shariff, A., Bonnefon, J. F. & Rahwan, I. 2018. The Moral Machine experiment. Nature, 563, 59-64.
Braddock, C. H., 3rd 2021. Racism and Bioethics: The Myth of Color Blindness. Am J Bioeth, 21, 28-32.
Conitzer, V., Sinnott-Armstrong, W., Borg, J. S., Deng, Y. & Kramer, M. 2017. Moral decision making frameworks for artificial intelligence. AAAI'17: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 4831-4835.
Gebru, T. 2020. Race and Gender. In: DUBBER, M. D., PASQUALE, F. & DAS, S. (eds.) The Oxford Handbook of Ethics of AI. Oxford: Oxford University Press.
Heaven, W. D. 2021. Predictive policing is still racist—whatever data it uses. MIT Technology Review.
Ives, J. & Draper, H. 2009. Appropriate methodologies for empirical bioethics: it's all relative. Bioethics, 23, 249-58.
Ku, E., Lee, B. K., Mcculloch, C. E., Roll, G. R., Grimes, B., Adey, D. & Johansen, K. L. 2020. Racial and Ethnic Disparities in Kidney Transplant Access Within a Theoretical Context of Medical Eligibility. Transplantation, 104, 1437-1444.
Levitt, M. 2003. Public consultation in bioethics. What's the point of asking the public when they have neither scientific nor ethical expertise? Health Care Anal, 11, 15-25.
Ng, Y. H., Pankratz, V. S., Leyva, Y., Ford, C. G., Pleis, J. R., Kendall, K., Croswell, E., Dew, M. A., Shapiro, R., Switzer, G. E., Unruh, M. L. & Myaskovsky, L. 2020. Does Racial Disparity in Kidney Transplant Waitlisting Persist After Accounting for Social Determinants of Health? Transplantation, 104, 1445-1455.
Privacy International. 2020. Ethnic Minorities at Greater Risk of Oversurveillance After Protests [Online]. Available: https://privacyinternational.org/news-analysis/3926/ethnic-minorities-greater-risk-oversurveillance-after-protests [Accessed 13/07/2021].
Reed, R. D. & Locke, J. E. 2020. Social Determinants of Health: Going Beyond the Basics to Explore Racial Disparities in Kidney Transplantation. Transplantation, 104, 1324-1325.
Salloch, S., Vollmann, J. & Schildmann, J. 2014. Ethics by opinion poll? The functions of attitudes research for normative deliberations in medical ethics. J Med Ethics, 40, 597-602.
Sharkey, P., Besbris, M. & Friedson, M. 2016. Poverty and Crime. In: BRADY, D. & BURTON, L. M. (eds.) The Oxford Handbook of the Social Science of Poverty. Oxford: Oxford University Press.
Siminoff, L. A., Lawrence, R. H. & Arnold, R. M. 2003. Comparison of black and white families' experiences and perceptions regarding organ donation requests. Crit Care Med, 31, 146-51.
Sinnott-Armstrong, W. & Skorburg, J. A. 2021. How AI can aid Bioethics. Journal of Practical Ethics (forthcoming).
Wesselman, H., Ford, C. G., Leyva, Y., Li, X., Chang, C. H., Dew, M. A., Kendall, K., Croswell, E., Pleis, J. R., Ng, Y. H., Unruh, M. L., Shapiro, R. & Myaskovsky, L. 2021. Social Determinants of Health and Race Disparities in Kidney Transplant. Clin J Am Soc Nephrol, 16, 262-274.