Week 9: Literature Review and Related Work Section
This past week, my task was to summarize the papers I have read pertaining to ranking and fairness in the Related Works section of our paper draft. The objective of our paper is to show how fairness, which has previously been defined for classification-type problems, translates to ranking. Additionally, we believe there are differences in the way fairness can be interpreted with regard to ranking, and we hope to identify these in our paper.
Next week, I will summarize two other papers that I have read/am going to read and add them to the Related Works section. From the summaries that Caitlin and I have compiled so far, it appears there are many related but potentially distinct ways to define fairness, which may or may not be applicable before, during, and after model building (pre-, in-, and post-processing). I'll define these in more detail next week. I will also try to draw some general conclusions about the definitions of fairness and identify which of them (if any, or all) are applicable at each of these stages of model building.
In the remainder of this post, I will summarize the three papers that I read and added to the Related Works section, and I will try to suggest ways in which they can guide our next steps on this project.
Inherent Trade-Offs in the Fair Determination of Risk Scores
This paper focuses specifically on the goal of predicting whether someone from one of two groups belongs to the positive class or the negative class (in the COMPAS example mentioned last week, a member of the positive class would be a defendant who will later commit another crime, i.e., recidivism).
They identify three different ways of defining fairness and show that, except in very constrained circumstances, all three cannot hold at the same time. The three definitions are "calibration within groups", "balance for the positive class", and "balance for the negative class". Calibration within groups states that, within each group, among the people assigned a given risk score, the fraction who actually belong to the positive class should match that score (for example, of the black defendants given a score of 0.7, about 70% should reoffend, and the same should hold for white defendants given a score of 0.7). Balance for the positive class states that people who truly belong to the positive class should receive the same average score in each group; if black defendants who go on to reoffend receive an average score of 0.6, then so should white defendants who go on to reoffend. Balance for the negative class is the mirror image: people who truly belong to the negative class should receive the same average score in each group.
Of note is that individuals are not assigned a binary prediction of "positive" or "negative". Instead, they are assigned a score representing the probability that they belong to the positive class (for example, person A is 90% likely to be positive).
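To make these conditions concrete, here is a minimal sketch (with made-up toy data, not taken from the paper) of one way to check the two balance conditions and calibration within groups on a small table of risk scores:

```python
import pandas as pd

# Toy data (made up for illustration): each row is a defendant with a group
# label, a model-assigned risk score in [0, 1], and the true outcome
# (1 = positive class, e.g., reoffended; 0 = negative class).
df = pd.DataFrame({
    "group":   ["A", "A", "A", "A", "B", "B", "B", "B"],
    "score":   [0.9, 0.7, 0.3, 0.2, 0.8, 0.7, 0.4, 0.1],
    "outcome": [1,   1,   0,   0,   1,   1,   0,   0],
})

# Balance for the positive class: the average score among people who truly
# belong to the positive class should be (roughly) equal across groups.
pos_balance = df[df["outcome"] == 1].groupby("group")["score"].mean()

# Balance for the negative class: the same, but among true negatives.
neg_balance = df[df["outcome"] == 0].groupby("group")["score"].mean()

# Calibration within groups: within each score bin and each group, the
# fraction of people who are actually positive should match the bin's score.
df["bin"] = pd.cut(df["score"], bins=[0.0, 0.25, 0.5, 0.75, 1.0])
calibration = df.groupby(["group", "bin"], observed=True)["outcome"].mean()

print(pos_balance, neg_balance, calibration, sep="\n\n")
```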
This use of scores may have interesting implications for ranking. In the ranking setting, we would also want a continuous variable (like this score) rather than a binary label, because we might want to rank people by the likelihood that they fall into a particular class. However, it is not yet apparent to me how conditions like calibration play out in the ranking domain.
On the (im)possibility of fairness
This paper argues that the mapping from the "observed space" (the attributes that can actually be collected) to the "decision space" (the output of the model) is implicitly mediated by a "construct space": the space of meaningful but unobservable quantities we actually care about. They define fairness as the fidelity of the mapping from the construct space to the decision space, i.e., how well distances between pairs of individuals are preserved. They argue that enforcing fairness requires assumptions about how the observed space relates to the construct space, and that these assumptions are shaped by one's worldview.
Two worldviews they identify are (1) What You See Is What You Get (WYSIWYG), in which the observable attributes are assumed to accurately reflect the construct space, and (2) We're All Equal (WAE), in which all groups (for example, different racial groups) are assumed to have the same distribution in the construct space. An example that clarifies the difference is using SAT scores (observable) in place of academic potential (unobservable, in the construct space) to decide whether someone should be admitted to a college (the decision). Under WYSIWYG, if high SAT scores are associated with men, you would conclude that men have more academic potential than women and admit more of them. Under WAE, you assume that women have just as much academic potential as men, so an equal number should be admitted (perhaps by picking the men and women with the highest SAT scores within their respective groups).
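To see how the two worldviews can lead to different decisions, here is a small hypothetical sketch (the applicant pool and all numbers are invented) of admitting four people from a toy pool of SAT scores under each assumption:

```python
import pandas as pd

# Toy applicant pool (invented numbers): SAT scores happen to run higher for
# one gender group, and we can admit four applicants.
applicants = pd.DataFrame({
    "name":   ["a", "b", "c", "d", "e", "f", "g", "h"],
    "gender": ["M", "M", "M", "M", "F", "F", "F", "F"],
    "sat":    [1550, 1500, 1480, 1400, 1450, 1430, 1300, 1250],
})
k = 4

# WYSIWYG: trust the observed SAT scores as a faithful proxy for academic
# potential, so admit the top k applicants overall (here, mostly men).
wysiwyg_admits = applicants.nlargest(k, "sat")

# WAE: assume the groups have the same underlying potential, so admit the
# top k/2 from each group (here, two men and two women).
wae_admits = (applicants.sort_values("sat", ascending=False)
                        .groupby("gender", group_keys=False)
                        .head(k // 2))

print("WYSIWYG:", sorted(wysiwyg_admits["name"]))
print("WAE:    ", sorted(wae_admits["name"]))
```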
I don't think one of these worldviews is any more valid than the other, but the paper does show how they can lead to two different determinations, which may have interesting implications for our ranking paradigm. It would also be interesting to see whether the fidelity measure they use (a Gromov-Wasserstein-style distance) still makes sense for ranking, where order matters but the scores themselves do not (so we are less constrained by the idea of measurable distance).
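As a rough illustration of what "fidelity" could mean, here is a sketch of a much simpler stand-in for the paper's Gromov-Wasserstein-style measure: how badly pairwise distances are distorted between a (hypothetical) construct-space representation and the decision-space output. The numbers are invented.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Hypothetical construct-space values (e.g., true academic potential) and
# decision-space outputs (e.g., admission scores) for five individuals.
construct = np.array([[0.9], [0.8], [0.5], [0.3], [0.1]])
decision  = np.array([[0.95], [0.6], [0.55], [0.4], [0.05]])

# A crude fidelity check: how much are pairwise distances distorted when we
# move from the construct space to the decision space? Zero distortion would
# mean the mapping perfectly preserves the geometry between individuals.
distortion = np.abs(pdist(construct) - pdist(decision))
print(f"worst-case pairwise distortion: {distortion.max():.3f}")
print(f"average pairwise distortion:    {distortion.mean():.3f}")
```

For ranking, one could imagine replacing the pairwise distances with something order-based (for example, whether pairs of individuals keep their relative order), which is part of what I want to think about.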
Identifying Significant Predictive Bias in Classifiers
This paper is a bit different from the other two. It gives an algorithm for finding anomalous subgroups in a model's predictions; for example, in COMPAS, African American females who had previously committed a misdemeanor had an overestimated recidivism risk. The algorithm runs in linear time and identifies the most anomalous subgroup. Each subgroup is given a score, so if all subgroups score below a certain threshold, you could say your model does not exhibit significant predictive bias; if some subgroup scores above that threshold, you could say your model is biased against that subgroup.
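The actual algorithm in the paper is a penalized, linear-time subset scan; as a much simpler illustration of the general idea (not the paper's method), here is a brute-force sketch that scores every subgroup defined by a single attribute value by the gap between its average predicted risk and its actual positive rate, and flags subgroups above a threshold. All data and the threshold are made up.

```python
import pandas as pd

# Toy data (invented): predicted recidivism risk vs. actual outcome, along
# with a couple of descriptive attributes that define subgroups.
df = pd.DataFrame({
    "race":      ["black", "black", "black", "white", "white", "white"],
    "sex":       ["F", "F", "M", "F", "M", "M"],
    "predicted": [0.8, 0.7, 0.5, 0.4, 0.5, 0.6],
    "actual":    [0,   0,   1,   0,   1,   1],
})

def subgroup_scores(df, attributes):
    """Score each single-attribute subgroup by |mean predicted - mean actual|.

    This is a crude stand-in for the paper's penalized likelihood-ratio
    score; it only illustrates the idea of scoring and flagging subgroups.
    """
    rows = []
    for attr in attributes:
        for value, sub in df.groupby(attr):
            gap = abs(sub["predicted"].mean() - sub["actual"].mean())
            rows.append({"subgroup": f"{attr}={value}", "n": len(sub), "gap": gap})
    return pd.DataFrame(rows).sort_values("gap", ascending=False)

scores = subgroup_scores(df, ["race", "sex"])
threshold = 0.25  # arbitrary cutoff, for illustration only
print(scores)
print("flagged:", scores.loc[scores["gap"] > threshold, "subgroup"].tolist())
```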
This may be relevant to ranking, especially when translating these ideas to the ranking toolkit (RANKIT) that the other group of undergraduates is building.
Ultimately, we would like to create an algorithm, or a series of algorithms, that can eventually be "plugged in" to the RANKIT toolkit. The goal would be to define fairness and bias in ranking in such a way that the toolkit covers all possible scenarios (or at least a reasonable collection of them).
Until next week! MV