
1 Introduction

The remarkable computational capabilities unlocked by neural networks have led to the emergence of a rapidly growing class of neural-network-based software applications. Unlike traditional software applications, whose logic is derived from explicit input-output specifications, neural networks are inherently opaque, as their logic is learned from examples of input-output pairs. The lack of high-level abstractions makes it challenging to interpret the logical reasoning employed by a neural network and hinders the use of standard software engineering practices, such as automated testing, debugging, requirements analysis, and formal verification, that have been established for producing high-quality software.

In this work, we aim to address this challenge by proposing a feature-guided approach to neural network engineering. Our proposed approach is illustrated in Figure 1. We draw on the insight that, in a neural network, the early layers typically extract the important features of the inputs, while the dense layers close to the output contain the logic, in terms of these features, for making decisions [12]. The approach therefore first extracts high-level, human-understandable feature representations from the trained neural network, which allows us to formally link domain-specific, human-understandable features to the internal logic of a trained model. This in turn enables us to reason about the model through the lens of the features and to drive the software engineering activities mentioned above.

For feature representations, we seek to extract associations between the activation values at the intermediate layers and higher-level abstractions that have clear semantic meaning (e.g., objects in a scene or weather conditions). We present an algorithm to extract these high-level feature representations in the form of rules (\(\texttt {pre}\implies \texttt {post}\)) where the precondition (\(\texttt {pre}\)) is a box over the latent space at an internal layer and the postcondition (\(\texttt {post}\)) denotes the presence (or absence) of the feature.
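As a purely illustrative example (the neuron indices and bounds below are hypothetical, not extracted from any of our models), a rule for a feature such as "shadow present" could take the form \((N_{3}(x) \in [0.1, 0.9]) \wedge (N_{17}(x) \in [2.0, 5.5]) \implies \texttt{post} = 1\), where \(N_{j}(x)\) denotes the output of the j-th neuron at the chosen internal layer for input x.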

The formal, checkable rules enable us to evaluate the quality of the datasets, retrieve and label new data, understand scenarios where models make correct and incorrect predictions, detect incorrect (or out-of-distribution) samples at run-time, and verify models against human-understandable requirements.

We evaluate our algorithm for extracting feature representations and the downstream analyses using two networks trained for computer vision tasks, namely TaxiNet [4, 9], a regression model for center-line tracking on airport runways, and YOLOv4-Tiny [14], an object detection model trained on the nuImages [6] dataset for autonomous driving.

Fig. 1. Proposed Approach

2 Extracting Feature Representations

Algorithm 2.1 describes the method for extracting the representation of a particular feature from a trained neural network. A feed-forward neural network \(f: \mathbb {R}^n \rightarrow \mathbb {R}^m\) is organized in multiple layers, each consisting of computational units called neurons. Each neuron takes a weighted sum of the outputs from the previous layer and applies a non-linear activation function to it. The algorithm requires a small dataset d where each raw input is labeled with 0 or 1, indicating whether the feature under consideration is absent or present. The algorithm takes as inputs a neural network f, the dataset d, and the index l of the layer used for extracting the feature representations. The first step of the algorithm (line 2) is to construct a new dataset A where each raw input x is replaced by the corresponding activation value a output by layer l (\(f^l(x)\) denotes the output of f at layer l for input x). Next, the algorithm invokes a learning procedure to learn a classifier r that separates the activation values for which the feature is present from those for which it is absent (line 3).

Algorithm 2.1. Extracting feature representations.

We use decision tree learning on line 3 to extract feature representations as a set of rules of the form \(\texttt {pre}\implies \{0,1\}\); \(\texttt {pre}\) in each rule is a condition on neuron values at layer l, and 0 or 1 indicates whether the rule corresponds to the feature being absent or present. \(\texttt {pre}\) is a box in the activation space of layer l, i.e., \(\bigwedge _{N_{j} \in {\mathcal N}_{l}} (N_{j}(x) \in [v^L_{j},v^U_{j}])\). Here \({\mathcal N}_{l}\) is the set of neurons at layer l, and \(v^L_{j}\) and \(v^U_{j}\) are lower and upper bounds for the output of neuron \(N_{j}\). The rules mined by decision-tree learning partition the activation space at a given inner layer. Some partitions may be impure, containing inputs both with and without the feature. We only select pure rules, i.e., rules with 100% precision on d, and return them as r. Note that there can be activation values for which no rule in r is satisfied, in which case we are unable to say whether the feature is absent or present.
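A minimal sketch of how Algorithm 2.1 could be realized with scikit-learn's decision trees is shown below. It is illustrative only: the helper f_layer_output (returning the activations of layer l for an input) and all other names are assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def extract_feature_rules(f_layer_output, d, l, max_leaf_nodes=20):
    """Sketch of Algorithm 2.1: mine box-shaped rules (pre => {0, 1}) over the
    activations of layer l that separate feature-present from feature-absent inputs."""
    X = np.array([f_layer_output(x, l) for x, _ in d])  # dataset A of activation values
    y = np.array([label for _, label in d])             # 1: feature present, 0: absent
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaf_nodes).fit(X, y)

    t, rules = tree.tree_, []

    def walk(node, box):
        # Each root-to-leaf path yields a box: a conjunction of per-neuron intervals.
        if t.children_left[node] == -1:                  # leaf
            counts = t.value[node][0]
            if counts.min() == 0:                        # keep only pure rules (100% precision on d)
                label = int(tree.classes_[counts.argmax()])
                rules.append((dict(box), label, int(t.n_node_samples[node])))
            return
        j, thr = int(t.feature[node]), float(t.threshold[node])
        lo, hi = box.get(j, (-np.inf, np.inf))
        walk(t.children_left[node],  {**box, j: (lo, min(hi, thr))})   # N_j(x) <= thr
        walk(t.children_right[node], {**box, j: (max(lo, thr), hi)})   # N_j(x) >  thr

    walk(0, {})
    return rules  # each rule: ({neuron index: (lower, upper)}, feature value, support)
```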

3 Feature-Guided Analyses

The extracted feature representations as formal, checkable rules enable multiple analyses, as listed below.

  • Data analysis and testing. We can assess the quality of the training and testing data in terms of coverage of different features. We can leverage the extracted feature representations to automatically retrieve new data that has the necessary features, by checking that the (unlabeled) data satisfies the corresponding rules. We can also use the extracted rules to label new data with their corresponding features, enabling further data-coverage analysis.

  • Debugging and explanations of network behavior. We can leverage the feature rules to uncover the high-level, human-understandable reasons for a neural network model making correct and incorrect predictions. In the latter case, we can repair the model by automatically selecting inputs with the features that caused incorrect predictions and retraining on them.

  • Formulation and analysis of requirements. Extracted feature representations are the key to enabling verification of models with respect to high-level safety requirements (\(\texttt {pre}\implies \texttt {post}\)). Specifically, the constraint \(\texttt {pre}\) in the requirement expressed over features can be translated into a constraint \(\texttt {pre}'\) expressed over activation values, by substituting the features with their corresponding representations. The modified requirement \(\texttt {pre}' \implies \texttt {post}\) can be checked automatically using off-the-shelf verification tools [10].

  • Run-time monitoring. We can also enforce safety properties at run-time. For instance, we can use \(\texttt {pre}'\) as above to check (at run-time) whether inputs satisfy a desired precondition, and reject the ones that don’t; a minimal monitoring sketch is given after this list.

  • Conformance with the operational design domain (ODD). This is a particular instance of the case above, where we use the rules to formally capture the model’s expected domain of operation and use a run-time guard to ensure that the model is not used in scenarios outside its ODD. A related problem is out-of-distribution detection, where we can similarly formulate the conditions under which the model is not supposed to operate and use run-time monitoring to enforce it.
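As a concrete illustration of the monitoring and ODD-conformance uses above, the following sketch checks, at run time, whether the activations of layer l fall inside the box precondition of an extracted rule. It reuses the rule format from the extraction sketch in Section 2; the helper names are hypothetical.

```python
def satisfies(box, activation):
    """Return True if the activation vector lies inside a rule's box precondition."""
    return all(lo <= activation[j] <= hi for j, (lo, hi) in box.items())

def odd_guard(f_layer_output, l, present_rules, absent_rules, x):
    """Run-time guard (sketch): pass an input only if a 'feature present' rule fires
    and no 'feature absent' rule does; otherwise reject or fall back to a safe action."""
    a = f_layer_output(x, l)
    if any(satisfies(box, a) for box, _, _ in absent_rules):
        return "reject"    # e.g., no center-line visible: outside the ODD
    if any(satisfies(box, a) for box, _, _ in present_rules):
        return "accept"
    return "unknown"       # no rule fires; defer to a conservative fallback
```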

One can also check overlap between feature rules, using off-the-shelf decision procedures, to uncover spurious correlations between the different features that are learned by the network. We envision many other applications for these rules, whose exploration we leave for the future.
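For box-shaped preconditions, the overlap check amounts to interval intersection, but phrasing it as a satisfiability query keeps the door open for richer preconditions. A sketch using the Z3 Python bindings (an assumed tooling choice, not prescribed above):

```python
import math
from z3 import Real, Solver, sat

def rules_overlap(box1, box2):
    """Ask Z3 whether two rule preconditions (boxes over the same layer)
    admit a common activation vector, i.e., whether the rules overlap."""
    variables = {j: Real(f"n_{j}") for j in set(box1) | set(box2)}
    solver = Solver()
    for box in (box1, box2):
        for j, (lo, hi) in box.items():
            if math.isfinite(lo):
                solver.add(variables[j] >= lo)
            if math.isfinite(hi):
                solver.add(variables[j] <= hi)
    return solver.check() == sat
```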

4 Case Studies

We use two case studies to present initial empirical evidence in support of our ideas. In particular, we show that Algorithm 2.1 with decision tree learning is successful in extracting feature representations. We also demonstrate how these representations can be used for analyzing the behavior of neural networks.

4.1 Center-line Tracking with TaxiNet

We first analyzed TaxiNet, a perception model for center-line tracking on airport runways [4, 9]. It takes runway images as input and produces two outputs, cross-track error (CTE) and heading angle error (HE), which indicate the lateral and angular distance, respectively, of the nose of the plane from the center-line of the runway. We analyzed a CNN model provided by our industry partner, with 24 layers including three dense layers (100/50/10 neurons) before the output layer. It is critical that the TaxiNet model functions correctly and keeps the plane from running off the taxiway. The domain experts provided a specification for correct output behavior: \(|y_0-y_{0,ideal}| \le 1.0\) m \(\wedge\) \(|y_1-y_{1,ideal}| \le 5\) degrees. One can evaluate the model correctness using the Mean Absolute Error (MAE) on a test set (CTE: 0.366, HE: 1.645).
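A minimal sketch of how this specification and the MAE can be evaluated on a labeled test set (the array layout and function name are assumptions made for illustration):

```python
import numpy as np

def evaluate_taxinet(y_pred, y_ideal):
    """y[:, 0]: cross-track error CTE (meters), y[:, 1]: heading error HE (degrees).
    Returns the per-output MAE and the fraction of inputs violating the
    correctness property |CTE error| <= 1.0 m and |HE error| <= 5 degrees."""
    err = np.abs(np.asarray(y_pred) - np.asarray(y_ideal))
    mae_cte, mae_he = err.mean(axis=0)
    violation_rate = np.mean((err[:, 0] > 1.0) | (err[:, 1] > 5.0))
    return mae_cte, mae_he, violation_rate
```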

Feature Elicitation We first need to identify the high-level features that are relevant for the task. These could be some of the simulator parameters (for images generated from a simulator) and/or could be derived from high-level system (natural language) requirements. This is a challenging process requiring several iterations in collaboration with the domain experts. We obtained a list of 10 features: center-line, shadow, skid, position, heading, time-of-day, weather, visibility, intersection (junction), and objects (runway lights, birds, etc.), together with the values of interest for each feature.

Data Analysis and Annotations We manually annotated a subset of 450 images from the test set with values for each feature. An initial data-coverage analysis of the distribution of the values for every feature across all the images revealed many gaps. For instance, there were only day-time images, all with cloudy weather and high visibility. Also, apart from runway lights, there were no images with any other objects on the runway. The analysis already proved useful, providing feedback to the experts with regard to the type of images that need to be added to improve the training and testing of the model.

Table 1. Rules for TaxiNet: d: annotated dataset; \(\# d\): total number of instances for that feature value in d; \(R_{d}\): recall (%) on d; \(P_{v}\), \(R_{v}\): precision (%) and recall (%) on the validation set. Rules with the highest \(R_{d}\) are shown.

Extracting Feature Rules We invoke Algorithm 2.1 to obtain rules in terms of the values of the neurons at the three dense layers of the network. Note that for each feature, we mined a separate rule for every value of interest. We used half of the annotated set of 450 images for extraction (d in Algorithm 2.1) and the remaining half for validation of the rules. Multiple rules are extracted for each feature; each rule is associated with a support value (the number of instances in d satisfying the rule) and has 100% precision on d, since we only extract pure rules. The results are summarized in Table 1, indicating some high-quality rules (for "center-line present", "shadow present", "light skid", "position left", "position right"), as measured on the validation set.
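The validation-set precision and recall of a rule can be computed as sketched below, reusing the satisfies helper from the monitoring sketch in Section 3 (again illustrative, with assumed names):

```python
import numpy as np

def rule_precision_recall(box, target_value, activations, labels):
    """Precision: fraction of inputs satisfying the rule whose feature value equals
    target_value. Recall: fraction of inputs with that value that satisfy the rule."""
    fires = np.array([satisfies(box, a) for a in activations])
    matches = (np.asarray(labels) == target_value)
    precision = (fires & matches).sum() / max(int(fires.sum()), 1)
    recall = (fires & matches).sum() / max(int(matches.sum()), 1)
    return 100.0 * precision, 100.0 * recall   # reported as percentages, as in Table 1
```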

Fig. 2. Images satisfying rules for features

Figure 2 displays some of the images satisfying different rules. The corresponding heat maps were created by computing the image pixels impacting the neurons in the feature rule [7]. Note that for the "center-line present" rule, the part of the image impacting the rule (highlighted in red) is the center-line, indicating that the rules indeed identify the feature. On the other hand, in the absence of the center-line, it is unclear what information is used by the model (and such images lead to erroneous predictions). The heat maps for the shadow and skid also correctly highlight the parts of the image with the shadow of the nose and the skid marks. We used such visualization techniques to further validate the rules.
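The heat maps follow the technique of [7]; as a rough, hypothetical approximation of the idea (not the authors' implementation), one could compute the gradient of the rule's neurons with respect to the input pixels, e.g., in PyTorch:

```python
import torch

def rule_saliency(model, layer, image, rule_neuron_indices):
    """Gradient-based heat map (sketch): how strongly each input pixel influences
    the neurons that appear in a feature rule."""
    captured = {}
    handle = layer.register_forward_hook(lambda m, inp, out: captured.update(act=out))
    x = image.clone().detach().requires_grad_(True)
    model(x.unsqueeze(0))                    # forward pass; the hook captures the layer output
    handle.remove()
    score = captured["act"].flatten()[list(rule_neuron_indices)].sum()
    score.backward()                         # gradients of the rule neurons w.r.t. pixels
    return x.grad.abs().sum(dim=0)           # aggregate over channels: (H, W) heat map
```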

Labeling New Data The rules extracted based on a small set of manually annotated data can be leveraged to annotate a much larger dataset. We used the rules for center-line (present/absent) to label all of the test data (2000 images). We chose the rule with the highest \(R_d\) for the experiments; more rules could be chosen to increase coverage. 1822 of the images satisfied the rule for "center-line present" and 79 images the rule for "center-line absent". We visually checked some of the images to estimate the accuracy of the labeling. We similarly annotated more images for the shadow and skid features. These new labels enable further data-coverage analysis over the train and test datasets.

Table 2. Feature-Guided Analysis Results

Feature-Guided Analysis We performed preliminary experiments to demonstrate the potential of feature-guided analyses. We first calculated the model accuracy (MAE) on subsets of the data labeled with the feature present and absent, respectively. We also determined the percentage of inputs in the respective subsets violating the correctness property. The results are summarized in Table 2.

These results can be used by developers to better understand and debug the model behavior. For instance, the model accuracy computed for the subsets with "shadow present" and "dark skid", respectively, is poor, and a high percentage of the respective inputs violate the correctness property. This information can be used by developers to retrieve more images with shadows and dark skids, to retrain the model and improve its performance. The extracted rules can be leveraged to automate the retrieval.

Furthermore, we observe that in the absence of the center-line feature, the model has difficulty making correct predictions. This is not surprising, as the presence of the center-line can be considered a (rudimentary) input requirement for the center-line tracking application. Indeed, in the absence of the center-line, it is hard to envision how the network can correctly estimate the airplane position from the image. The network may use other cues on the runway, leading to errors. We can thus consider the presence of the center-line feature as part of the ODD for the application. The rules for the center-line feature can be deployed as a run-time monitor to either pass inputs satisfying the rules for "present" or reject those that satisfy the rules for "absent", ensuring that the model operates in the safe zone defined by the ODD, while at the same time increasing its accuracy.

We also experimented with generating rules to explain correct and incorrect behavior in terms of combinations of features, such as: \((\textit{center-line present}) \wedge (\textit{shadow absent}) \wedge (\textit{on position}) \implies \textit{correct}\), and \(\lnot (\textit{center-line present}) \wedge (\textit{heading away}) \wedge (\textit{position right}) \implies \textit{incorrect}\). These rules could be further used by developers to better understand and debug the model behavior.
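One way such feature-level rules could be mined (a sketch assuming each image carries binary feature annotations and a correct/incorrect label; not necessarily the procedure used for the results above):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

def explain_with_features(feature_matrix, feature_names, is_correct, max_depth=3):
    """Learn rules over high-level feature annotations (not activations) that
    separate inputs with correct predictions from those with incorrect ones."""
    tree = DecisionTreeClassifier(max_depth=max_depth).fit(feature_matrix, is_correct)
    # Root-to-leaf paths correspond to rules such as
    # (center-line present) AND (shadow absent) AND (on position) => correct.
    return export_text(tree, feature_names=feature_names)
```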

Table 3. Rules for YOLOv4-Tiny (same metrics as in Table 1).

4.2 Object Detection with YOLOv4-Tiny

We conducted another case study with a more challenging network, an object detector, to evaluate the quality of the extracted feature representations. For this study, we use the nuImages dataset, a public large-scale dataset for autonomous driving [1, 6]. It contains 93,000 images collected while driving in real cities. To facilitate computer vision tasks such as object detection for autonomous driving, each image comes labeled with 2D bounding boxes and the corresponding object labels (from one of 23 object classes). Each labeled object also comes with additional attribute annotations. For instance, objects labeled vehicle carry additional annotations such as vehicle.moving, vehicle.stopped, and vehicle.parked. Overall, the dataset has 12 categories of additional attribute annotations. We trained a YOLOv4-Tiny object detection model [2, 14] on this dataset. YOLOv4-Tiny has 37 layers, including 21 convolutional layers and 2 YOLO layers.

We leveraged the attribute annotations associated with each object as the feature labels (thus no manual labeling was necessary). For extracting feature representations, we run Algorithm 2.1 on a subset of 2000 images from the nuImages dataset, and then evaluate the extracted representations on a separate validation set of 2000 images.

Table 3 describes our results. We used layer 28 of the YOLOv4-Tiny model to extract the feature representations. For brevity, in Table 3 we only report the number of terms in each rule precondition, i.e., the number of neurons that appear in the constraints, instead of describing the exact rule. Note that layer 28 has 798,720 neurons. Strikingly, the extracted rules only have between 10 and 25 terms in their preconditions, and yet achieve precision (\(P_v\)) between 69% and 74%. The recall (\(R_v\)) values are also encouraging, and can be improved further by considering more than one rule for each feature value (here, we only consider the pure rule with the highest recall \(R_d\) on the dataset d used for feature extraction).

4.3 Challenges and Mitigations

Identifying relevant features is non-trivial and requires refinement and extensive discussions with domain experts. The feature annotations may need to be provided manually, which is expensive and error-prone. However, we only need a small annotated dataset to extract the representations, which can then be used to further annotate unlabeled data. The extracted rules may be incorrect (e.g., due to unbalanced annotated data). We mitigate this by carefully validating the rules using a separate validation set and visualization techniques. It could also be that the network did not learn some important features. To address this issue, in future work we plan to investigate neuro-symbolic approaches to build networks that are aware of high-level features and satisfy (by construction) the safety requirements.

5 Related Work

There is growing interest in developing software engineering approaches for machine learning in general, and neural networks especially, investigating requirements for neural networks [3], automated testing [16], and debugging and fault localization [8], to name a few. Our work contributes a feature-centric view of neural network behavior that links high-level requirements with the internal logic of the trained models to enable better testing and analysis of neural networks.

A closely related work [18] uses high-level features to guide neural network analysis. However, the features are extracted from input images, not from the internal neural network representation. Further, the work only considers testing, not other software engineering activities.

Our work is also related to concept analysis [11, 13, 15, 17], which seeks to develop explanations of deep neural network behavior in terms of concepts specified by users. We propose to use high-level features for multiple software engineering activities, which go beyond explanations. Moreover, the use of decision tree learning makes our representations relatively cheap to extract. Note that there are other works that use decision tree learning to distill neural network input-output behavior, e.g., [5]; however, none of them extracts high-level features from the network’s internal representation.

6 Conclusion

We proposed to extract high-level feature representations related to domain-specific requirements to enable analysis and explanation of neural network behavior. We presented initial empirical evidence in support of our ideas. In future work, we plan to further investigate meaningful requirements for neural networks and effective techniques for checking them. We also plan to apply Marabou [10] for the verification of safety properties expressed in terms of high-level features. Finally, we plan to investigate neuro-symbolic techniques to develop high-assurance neural network models.