Abstract:
In this work, we study settings where a classifier is required to be both fair and interpretable, and show that trade-offs between these two properties are unavoidable. We establish theoretical results that demonstrate this tension. More specifically, we consider a formal framework in which interpretability is attained by restricting attention to simple classifiers, and show that simple classifiers are strictly improvable: every simple classifier can be replaced by a more complex classifier that strictly improves both fairness and accuracy.
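As a toy illustration of the strict-improvability claim (not the paper's formal construction; the data, thresholds, and choice of equal-opportunity gap as the fairness measure below are all invented for the sketch), a "simple" single-threshold classifier can be strictly beaten on both accuracy and fairness by a slightly more complex classifier that uses group-dependent thresholds:

```python
# Hypothetical toy data: (score, group, true_label).
data = [
    (1, 0, 0), (2, 0, 0), (3, 0, 1), (4, 0, 1),   # group 0
    (1, 1, 0), (2, 1, 1), (3, 1, 1), (4, 1, 1),   # group 1
]

def accuracy(predict):
    """Fraction of examples the classifier labels correctly."""
    return sum(predict(s, g) == y for s, g, y in data) / len(data)

def tpr_gap(predict):
    """Equal-opportunity gap: |TPR(group 0) - TPR(group 1)|."""
    tprs = []
    for grp in (0, 1):
        pos = [(s, g) for s, g, y in data if g == grp and y == 1]
        tprs.append(sum(predict(s, g) for s, g in pos) / len(pos))
    return abs(tprs[0] - tprs[1])

# "Simple" classifier: one global threshold on the score.
simple = lambda s, g: int(s >= 3)

# More complex classifier: a per-group threshold.
complex_ = lambda s, g: int(s >= (3 if g == 0 else 2))

print(accuracy(simple), round(tpr_gap(simple), 3))     # 0.875 0.333
print(accuracy(complex_), round(tpr_gap(complex_), 3)) # 1.0 0.0
```

On this toy data the group-dependent classifier strictly improves both accuracy (0.875 to 1.0) and the fairness gap (1/3 to 0), mirroring the shape of the trade-off the abstract describes.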