Formalizing Fairness
Multiple frameworks have been introduced to formalize the notion of fairness and its implications for machine learning data, methods, and results (Mouzannar2019, Skirpan2017, Green2020). A general framing of fairness centers on three questions within the application context: (i) Is it fair to apply machine learning to a problem at all? (ii) If so, is there a fair way to do so? and (iii) Even if a fair method exists, are the results it produces fair? (Skirpan2017). A system can be fair only when it provides contextual answers to these questions and allows those affected by it to challenge or confirm its fairness.
Recent studies utilize frameworks from areas outside Computer Science to conceptualize fairness. One such line of work borrows from critical race theory and advocates accounting for the instability and multi-dimensional nature of demographic categories (such as race and gender), using this notion of instability to inform both the design and evaluation of algorithms (Hanna2020, Hu2020). Other popular examples include economic models of fairness such as equality of opportunity (Heidari2019) and transparent, accountable models for sensitive applications such as criminal sentencing and credit scoring (Kroll2016). Most relevant to the data systems community is a recent study that advocates developing a shared definition of fairness across engineering and data teams (Passi2019).
Fairness in Data
Deep learning pipelines rely heavily on training data sets, which can replicate pre-existing social biases and inequalities such as ethnicity- or gender-based discrimination. A recent report highlights a large number of misleading gender stereotypes within data sets at Google and shows how they carry over to the resulting word embeddings (Papakyriakopoulos2020).
There is a growing body of research addressing the question of fairness in data. First, there are approaches that ensure fairness at the data collection level by advocating for consent, power, inclusivity, and transparency (Geiger2020, Marda2020). For data sets that have already been collected, there are proposals to accompany them with metadata explaining their composition and collection process so that users can use them in an informed manner (Gebru2018, Stoyanovich2019). Complementary to this are data pre-processing techniques that filter already collected data to generate a training set that is less biased and more diverse (Celis2016, Schelter2019). Finally, there is work on augmenting and generating data to obtain better privacy and fairness guarantees (Ping2017, Rodriguez2018).
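As an illustration of the pre-processing idea, the sketch below rebalances a tabular training set so that every value of a protected attribute is equally represented. The data frame, column names, and downsampling strategy are illustrative assumptions, not the method of any cited work.

```python
# Minimal sketch: rebalancing a training set across a protected attribute.
# The DataFrame, column names, and strategy below are illustrative assumptions.
import pandas as pd

def rebalance(df: pd.DataFrame, protected_col: str, seed: int = 0) -> pd.DataFrame:
    """Downsample every group to the size of the smallest group so that each
    value of the protected attribute is equally represented."""
    n = df.groupby(protected_col).size().min()
    balanced = (
        df.groupby(protected_col, group_keys=False)
          .apply(lambda g: g.sample(n=n, random_state=seed))
    )
    return balanced.reset_index(drop=True)

# Example usage with a toy data set.
df = pd.DataFrame({
    "feature": range(10),
    "gender":  ["F", "F", "F", "M"] * 2 + ["F", "M"],
    "label":   [0, 1] * 5,
})
print(rebalance(df, "gender")["gender"].value_counts())
```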
Fairness in Algorithms
Unfairness can also occur at the algorithmic level, i.e., during the design and training of the deep neural network. In one such example of algorithmic unfairness, a deep neural network model could infer the gender of a person from images of their retina even though gender was not included in the training data set (Wired2019, Du2020).
There are efforts to mitigate unfairness at the algorithmic level both during and after training. A popular family of techniques uses adversarial learning, in which two models are trained: a predictor that learns the most informative representation possible from the data, and an adversary that penalizes the predictor whenever protected attributes can be recovered from that representation (Elazar2018). Other methods remove bias from an already trained deep neural network by detecting and removing neurons or parameters that are strongly correlated with protected attributes (Kim2018).
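The sketch below illustrates the adversarial setup in PyTorch using a gradient reversal layer; the architecture, dimensions, and loss weighting are illustrative assumptions and not the exact formulation of the cited work.

```python
# Minimal sketch of adversarial debiasing (all names and sizes are illustrative).
# An encoder feeds a task predictor and an adversary; gradient reversal makes the
# adversary learn to recover the protected attribute while the encoder is pushed
# to produce a representation from which that attribute cannot be recovered.
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the gradient flowing back into the encoder.
        return -ctx.lambd * grad_output, None

encoder   = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
predictor = nn.Linear(32, 2)   # main task head
adversary = nn.Linear(32, 2)   # tries to predict the protected attribute

opt = torch.optim.Adam(list(encoder.parameters()) +
                       list(predictor.parameters()) +
                       list(adversary.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 16)            # toy batch of inputs
y = torch.randint(0, 2, (64,))     # task labels
a = torch.randint(0, 2, (64,))     # protected attribute

h = encoder(x)
task_loss = loss_fn(predictor(h), y)
adv_loss  = loss_fn(adversary(GradientReversal.apply(h, 1.0)), a)
(task_loss + adv_loss).backward()  # one illustrative training step
opt.step()
```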
Data Management Opportunities
Both aspects of unfairness mentioned above deal with concepts and properties familiar to the data management and data systems community. Properly managing, ensuring, and propagating data properties (for example, through ontologies), together with systems optimizations that make room for more complex models or a larger number of models (and thus better accuracy), are critical directions through which data systems can push deep learning toward more ethical outcomes.
Dimensionality Reduction
Deep learning pipelines are replete with extremely high-dimensional data, such as training data sets and evolving neural network parameters. Dimensionality reduction techniques enable understanding high-dimensional data by converting it into a low-dimensional representation while preserving meaningful properties (Roweis2000, Tenenbaum2000, Van2008). A widely-used dimensionality reduction algorithm in deep learning pipelines is t-distributed Stochastic Neighbor Embedding (t-SNE), which preserves local similarities present in high-dimensional data sets (Van2008). For instance, t-SNE can convert the MNIST training data set (with 784-dimensional images) into a two-dimensional representation while maintaining the clusters present in the data set, i.e., images belonging to different classes in the high-dimensional data stay in different clusters in the low-dimensional representation. t-SNE and related techniques, such as Isomap and Locally Linear Embedding, can also be applied to the parameters of a deep neural network and to its outputs. We can use the resulting low-dimensional representation for debugging, exploration, and visualization, making it much easier to understand the data and any biases present in it.
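A minimal sketch of such a projection using scikit-learn's t-SNE implementation is shown below; the small digits data set stands in for MNIST so the example remains self-contained, and the hyperparameters are illustrative.

```python
# Minimal sketch: projecting a high-dimensional image data set to 2-D with t-SNE.
# scikit-learn's small digits data set stands in for MNIST to stay self-contained.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

digits = load_digits()                       # 1797 images, 64 dimensions each
embedding = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(digits.data)

plt.scatter(embedding[:, 0], embedding[:, 1], c=digits.target, cmap="tab10", s=5)
plt.title("t-SNE projection: points with the same label should form clusters")
plt.show()
```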
Visualization of Relationships
Various methods help visualize different aspects of deep learning to understand the trilateral relationship between input data items, the parameters of a deep learning model, and the outputs it produces (Qin2018). For example, such visualization can be very useful when a data scientist wants to map out the sub-parts of a deep neural network responsible for recognizing certain features present in an input image. Activation Maximization is one widely-used technique for this: it synthesizes an input that maximally activates a specific part of the neural network (Zeiler2014). This synthetic input indicates the features that a specific part of the network recognizes. Another set of techniques, called DeconvNet, takes a specific layer in a convolutional neural network and operates in the reverse direction to figure out the patterns in an input image responsible for the activation produced by that layer (Tjoa2020). This is often used to debug cases where the network produces incorrect outputs. Finally, the Network Inversion technique takes only the local information present at a layer in a neural network and reconstructs the input (Saad2007), visualizing which aspects of an input (e.g., an image) are preserved at every layer. We can apply all these techniques and their variants at various resolutions of a neural network, ranging from a single neuron to a layer or even a set of layers. Together, they can construct detailed visualizations of how inputs to the deep neural network get converted to decisions. Such approaches do not fix bias in the data automatically, but they can go a long way toward helping human designers spot bias in the data or the design and act to fix it.
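The following sketch illustrates Activation Maximization by gradient ascent on the input; the toy model, the chosen output unit, and the regularization weight are illustrative assumptions.

```python
# Minimal sketch of Activation Maximization (model and unit index are illustrative).
# Starting from noise, gradient ascent on the input finds a pattern that strongly
# activates a chosen output unit of the network.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(),
                      nn.Linear(128, 10))
model.eval()

unit = 3                                           # which output unit to maximize
x = torch.randn(1, 1, 28, 28, requires_grad=True)  # start from random noise
opt = torch.optim.Adam([x], lr=0.1)

for _ in range(200):
    opt.zero_grad()
    activation = model(x)[0, unit]
    # Maximize the activation (minimize its negative) with a small L2 penalty
    # so the synthesized input stays bounded.
    loss = -activation + 1e-3 * x.pow(2).sum()
    loss.backward()
    opt.step()

# x now holds a synthetic input that the chosen unit responds to strongly.
```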
Model Surrogacy
Finally, a very standard approach used in practice is to approximate a deep neural network's decision function with self-explanatory surrogate models (Samek2020). These surrogate models are simpler and easier to interpret, such as linear classifiers, mixtures of decision trees, or even less complex neural network models. One popular approach is Local Interpretable Model-Agnostic Explanations (LIME) (Ribeiro2016). Given an input and a deep neural network, LIME produces a linear surrogate model that explains the contribution made by each input feature to the decision made by the deep neural network. LIME proceeds by first defining a probability distribution around the input data point and then learning a linear model that best matches the outputs produced by the neural network on samples drawn from that distribution. Another approach is Knowledge Distillation, where the surrogate takes the form of a less complex deep neural network that mimics the original network's decision function. Overall, we can combine model surrogacy approaches to produce explanations at different semantic levels, for instance, at the level of pixels, image features, or classes of images.
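The sketch below re-implements the core LIME idea for tabular inputs rather than calling the LIME library itself; the black-box model, kernel width, and number of samples are illustrative choices, not the library's defaults.

```python
# Minimal sketch of a LIME-style local linear surrogate for tabular inputs.
# The black-box model, kernel width, and sample count are illustrative choices.
import numpy as np
from sklearn.linear_model import Ridge

def local_linear_explanation(predict_fn, x, num_samples=1000, scale=0.5,
                             kernel_width=1.0):
    """Fit a weighted linear surrogate around input x for a black-box predict_fn
    that returns the probability of the positive class."""
    rng = np.random.default_rng(0)
    # 1. Sample perturbations from a distribution centered on x.
    samples = x + rng.normal(0.0, scale, size=(num_samples, x.shape[0]))
    # 2. Query the black-box model on the perturbed inputs.
    preds = predict_fn(samples)
    # 3. Weight samples by proximity to x and fit a local linear model.
    distances = np.linalg.norm(samples - x, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    surrogate = Ridge(alpha=1.0).fit(samples, preds, sample_weight=weights)
    return surrogate.coef_           # per-feature contributions near x

# Toy black box: a linear score squashed through a sigmoid.
black_box = lambda X: 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.0 * X[:, 1])))
print(local_linear_explanation(black_box, np.array([0.2, -0.4, 1.0])))
```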
Frameworks and Systems
Methods to interpret deep learning models have been implemented both as part of existing deep learning frameworks and as standalone packages in various programming languages. TensorBoard is a neural network visualization and debugging framework integrated with TensorFlow. It has tools for visualizing an end-to-end deep learning pipeline and can provide visual summaries of the training data, the training process, and the trained deep neural network. TorchRay and Captum provide implementations of various interpretable deep learning algorithms in PyTorch. Other examples of such tools include DeepExplain and iNNvestigate, which support different methods to visualize, debug, and answer what-if questions and can be used with Keras and TensorFlow. In addition to these frameworks, there are proposals for full systems for efficient visualization and debugging of trained deep learning models. DeepVis is a system to visualize activations in deep neural networks as they train (Yosinski2015). MISTIQUE is a system to efficiently store, manage, and query deep learning models (Vartak2018). DeepBase provides a declarative interface to specify and test hypotheses and what-if queries on trained models (Sellam2019).
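As a usage sketch, the snippet below logs training curves, parameter histograms, and the model graph to TensorBoard from PyTorch; the log directory, toy model, and metrics are illustrative.

```python
# Minimal sketch of logging to TensorBoard from PyTorch
# (the log directory, model, and metrics are illustrative).
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/example")
model = nn.Linear(10, 2)

for step in range(100):
    x = torch.randn(32, 10)
    loss = nn.functional.cross_entropy(model(x), torch.randint(0, 2, (32,)))
    writer.add_scalar("train/loss", loss.item(), step)    # training curve
    writer.add_histogram("weights", model.weight, step)   # parameter drift

writer.add_graph(model, torch.randn(1, 10))               # model structure
writer.close()
# Inspect with: tensorboard --logdir runs
```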
Data Management Opportunities
Various techniques related to interpretable deep learning, such as dimensionality reduction and data visualization, have been extensively explored in database research for understanding relational data. Many optimizations explored in the data management context, such as smart caching and aggregation, can likewise be applied to improve deep learning interpretability at scale. In addition, there are opportunities to design end-to-end deep learning systems with built-in data and model tracking during the design, training, and deployment phases. Here, ideas explored in provenance-aware systems can be applied to build interpretable deep learning systems.
Carbon Footprints
As a first step, it is critical to capture the energy efficiency of deep learning models and use it as a metric in model design. For instance, the Machine Learning Emissions Calculator and the Green Algorithms project can provide a detailed breakdown of a model's carbon footprint based on hardware, cloud provider, and region (Lacoste2019, Lannelongue2020).
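The sketch below shows the kind of back-of-the-envelope calculation such calculators perform; the power draw, PUE, and grid carbon intensity are placeholder numbers, not figures from the cited tools.

```python
# Minimal sketch of the calculation behind carbon-footprint calculators
# (all numbers below are illustrative placeholders, not authoritative figures).
def training_emissions_kg(gpu_power_watts: float, num_gpus: int, hours: float,
                          pue: float, grid_kg_co2e_per_kwh: float) -> float:
    """Estimate the CO2-equivalent emissions of a training run."""
    energy_kwh = (gpu_power_watts * num_gpus * hours / 1000.0) * pue
    return energy_kwh * grid_kg_co2e_per_kwh

# Hypothetical run: 8 GPUs at 300 W for 72 hours, PUE 1.1, 0.4 kg CO2e/kWh grid.
print(f"{training_emissions_kg(300, 8, 72, 1.1, 0.4):.1f} kg CO2e")
```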
Green Hardware and Cloud Providers
Next, there are opportunities to evaluate hardware and cloud providers. One such method is to track the Power Usage Effectiveness (PUE) of cloud providers and the FLOPs/W of hardware, and to make choices that minimize PUE and maximize FLOPs/W for a given workload (Teich2018, Lacoste2019, Google2020PUE). Additionally, there is growing research investigating new paradigms such as photonics and quantum computing to design specialized hardware that drastically improves FLOPs/W (Hamerly2019).
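A minimal sketch of using these two metrics to rank options is shown below; the provider and hardware names and numbers are made-up placeholders.

```python
# Minimal sketch of comparing deployment options by PUE and FLOPs/W
# (provider/hardware names and values are made-up placeholders).
providers = {"provider_a": 1.10, "provider_b": 1.58, "provider_c": 1.25}   # PUE
hardware  = {"gpu_x": 50e9, "gpu_y": 120e9, "accel_z": 400e9}              # FLOPs/W

best_provider = min(providers, key=providers.get)  # lower PUE wastes less energy
best_hardware = max(hardware, key=hardware.get)    # higher FLOPs/W: more work per joule
print(best_provider, best_hardware)
```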
Resolution-setting Frameworks
Last, but not least, setting resolutions and tracking progress is a critical aspect. For instance, Microsoft plans to be carbon-negative by 2030 (Smith2020), while Apple and Amazon plan to attain carbon neutrality by 2030 and 2050, respectively (Calma2019, Hern2020). These plans involve utilizing diverse renewable power sources such as wind, solar, advanced nuclear, enhanced geothermal, and green hydrogen for their data centers (Walton2020, Google2020).
Data Management Opportunities
Because these issues originate in large data sizes and computational costs, data management and systems research can play a significant role here. We outline ongoing and open directions that data systems researchers and practitioners can take to reduce deep learning's environmental impact. First, there are opportunities to rethink model design, training, and deployment to better utilize existing hardware, e.g., exploiting the large mismatch between the compute and I/O capacities of modern GPUs to design models that perform more compute per I/O, or scheduling deep learning jobs in the cloud to minimize energy waste. We can also build deep learning systems that enable reuse and caching across all stages, including data sourcing, design, training, and deployment.