Introducing ShaTS: A Shapley-based method for time-series models

Introduction

Shapley-based methods are among the most popular tools for explaining machine learning (ML) and deep learning (DL) models. However, with time-series data, these methods often fall short because they do not account for the temporal dependencies inherent in such datasets. In a recent article, we (Ángel Luis Perales Gómez, Lorenzo Fernández Maimó and I) presented ShaTS, a novel Shapley-based explainability method designed specifically for time-series models. ShaTS addresses the limitations of traditional Shapley methods by incorporating grouping strategies that improve both computational efficiency and explainability.

Shapley values: The basics

Shapley values come from cooperative game theory and fairly distribute the total payoff among the players based on their individual contributions to the collaborative effort. A player's Shapley value is calculated by considering every coalition that player could join and measuring the player's marginal contribution to each one.

Formally, the Shapley value $\varphi_i$ for player $i$ is:

\[ \varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \left( v(S \cup \{i\}) - v(S) \right) \]

where:

  • $N$ is the set of all players.
  • $S$ is a coalition of players that does not include $i$.
  • $v(S)$ is the value function that assigns a value to each coalition (i.e., the total payoff that coalition $S$ can achieve).

This formula averages the marginal contributions of player $i$ over all possible coalitions, each weighted by the probability of that coalition forming.
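The formula can be evaluated directly when the number of players is small. The sketch below is a minimal illustration, not part of the ShaTS library: `shapley_values` and the three-player `toy_value` game are hypothetical names invented for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values by enumerating every coalition S that excludes i."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):
            # The weight |S|! (|N| - |S| - 1)! / |N|! depends only on |S|.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (v(s | {i}) - v(s))
        phi[i] = total
    return phi


# Hypothetical toy game: a coalition earns 10 only if it contains both A and B.
# Player C never changes the payoff.
def toy_value(coalition):
    return 10.0 if {"A", "B"} <= coalition else 0.0


phi = shapley_values(["A", "B", "C"], toy_value)
print(phi)
```

In this toy game A and B split the payoff equally and C receives nothing, and the three values add up to the payoff of the full coalition.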

From game theory to xAI: Shapley values in machine learning

In the context of explainable AI (xAI), Shapley values attribute a model's output to its input features. This is especially useful for understanding complex models, such as deep neural networks, where the relationship between input and output is not always clear.

Shapley-based methods can be very expensive, especially as the number of features increases, because the number of possible coalitions grows exponentially. However, approximation methods, especially those implemented in the popular SHAP library, have made them practical. These methods estimate Shapley values by sampling a subset of coalitions rather than evaluating all possible combinations, greatly reducing the computational burden.
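To make the idea of sampling concrete, here is a minimal sketch of one generic estimator: it averages marginal contributions over random player orderings. It is an illustration of sampling-based approximation in general, not the algorithm used by the SHAP library, and `sampled_shapley` and `toy_value` are hypothetical names.

```python
import random


def sampled_shapley(players, v, n_permutations=2000, seed=0):
    """Estimate Shapley values from random orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = frozenset()
        previous = v(coalition)
        for p in order:
            # Marginal contribution of p when it joins the players before it.
            coalition = coalition | {p}
            current = v(coalition)
            totals[p] += current - previous
            previous = current
    return {p: total / n_permutations for p, total in totals.items()}


# Hypothetical toy game: payoff 10 only when A and B are both present.
def toy_value(coalition):
    return 10.0 if {"A", "B"} <= coalition else 0.0


estimate = sampled_shapley(["A", "B", "C"], toy_value)
print(estimate)
```

Each ordering costs one model evaluation per player, so the budget is controlled by `n_permutations` instead of growing with the number of coalitions. The estimates still sum to the payoff of the full coalition, but individual values carry sampling noise.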

Consider an industrial scenario with three components: a water tank, a thermometer and an engine. Suppose we have an anomaly detection (AD) ML/DL model that detects malicious activity based on the readings of these components. Using SHAP, we can determine how much each component contributes to the model's prediction of whether the activity is malicious or benign.

Overview of the industrial anomaly detection scenario. Image created by the authors.

However, in practical situations the model uses not only the current reading from each sensor but also the previous readings (a time window) to make predictions. This allows the model to pick up temporal patterns and trends, thereby improving its performance. Applying SHAP in this situation to assign responsibility to each physical component becomes more challenging because there is no longer a one-to-one mapping between features and sensors. Each sensor now contributes multiple features, one per time step. A common approach is to calculate the Shapley value of each feature at each time step and then aggregate these values post hoc.

Overview of the anomaly detection scenario with windowed data. Image created by the authors.

This method has two main drawbacks:

  • Computational complexity: The computational cost grows exponentially with the number of features, making it impractical for large time-series datasets.
  • Neglecting temporal dependence: SHAP explainers are designed for tabular data without temporal dependence. Post hoc aggregation can lead to inaccurate explanations because it fails to capture temporal relationships between features.
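The post hoc approach and its cost can be sketched in a few lines. The attribution numbers below are made up for illustration and the sensor names are taken from the example above; nothing here comes from a real model.

```python
# Hypothetical per-cell attributions for one window: rows are time steps,
# columns are the three sensors (values are invented for illustration).
sensors = ["water_tank", "thermometer", "engine"]
cell_attributions = [
    [0.01, 0.00, 0.02],   # t0
    [0.03, -0.01, 0.05],  # t1
    [0.02, 0.01, 0.30],   # t2
    [0.04, 0.00, 0.45],   # t3
]

window_len = len(cell_attributions)
n_sensors = len(sensors)

# Post hoc aggregation: sum each sensor's column over the time axis.
per_sensor = {
    name: sum(row[j] for row in cell_attributions)
    for j, name in enumerate(sensors)
}

# Every (time step, sensor) cell is a separate player, so the coalition
# space is 2 ** (window_len * n_sensors).
n_players = window_len * n_sensors
n_coalitions = 2 ** n_players

print(per_sensor)
print(n_players, n_coalitions)
```

Even this tiny window of four time steps and three sensors yields 12 players. The coalition space doubles with every additional cell, which is why the per-cell computation becomes the bottleneck as windows grow.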

The ShaTS approach: Grouping before computing importance

In the Shapley framework, the value of a player is determined only by comparing the performance of a coalition with and without that player. Although the method is defined at the level of individual players, nothing prevents it from being applied to groups of players instead. So, if the set of players $N$ is divided into $p$ groups $G = \{G_1, \dots, G_p\}$, we can compute the Shapley value of each group $G_i$ by evaluating the marginal contribution of the entire group to every possible coalition of the remaining groups. Formally, the Shapley value of group $G_i$ can be expressed as:

\[ \varphi(G_i) = \sum_{T \subseteq G \setminus G_i} \frac{|T|!\,(|G| - |T| - 1)!}{|G|!} \left( v(T \cup G_i) - v(T) \right) \]

where:

  • $G$ is the set of all groups.
  • $T$ is a coalition of groups that does not include $G_i$.
  • $v(T)$ is the value function that assigns a value to each coalition of groups.
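The group-level formula can be sketched in the same way as the player-level one, with whole groups entering or leaving a coalition together. This is a minimal illustration under stated assumptions: `group_shapley`, the `groups` dictionary and `toy_value` are hypothetical names, and the code is not the ShaTS implementation.

```python
from itertools import combinations
from math import factorial


def group_shapley(groups, v):
    """groups maps a group name to a frozenset of players; v scores a set of players."""
    names = list(groups)
    g = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(g):
            weight = factorial(size) * factorial(g - size - 1) / factorial(g)
            for subset in combinations(others, size):
                # T is a coalition of whole groups; flatten it to its players.
                members = frozenset().union(*(groups[m] for m in subset))
                total += weight * (v(members | groups[name]) - v(members))
        phi[name] = total
    return phi


# Hypothetical players are (sensor, time step) pairs over a window of length 3.
groups = {
    "tank": frozenset(("tank", t) for t in range(3)),
    "engine": frozenset(("engine", t) for t in range(3)),
}


# Hypothetical value function: each engine reading adds 2, any tank reading adds 1.
def toy_value(players):
    engine_readings = sum(1 for name, _ in players if name == "engine")
    tank_present = any(name == "tank" for name, _ in players)
    return 2.0 * engine_readings + (1.0 if tank_present else 0.0)


phi = group_shapley(groups, toy_value)
print(phi)
```

With two groups there are only four coalitions to evaluate, compared with 64 if each of the six (sensor, time step) pairs were treated as its own player.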

Building on this idea, ShaTS works with time windows and offers three levels of grouping, depending on the explanatory goal:

Temporal

Each group contains all measurements recorded at a specific instant within the time window. This strategy helps identify critical instants that significantly influence the model's predictions.

An example of the temporal grouping strategy. Image created by the authors.

Feature

Each group contains the measurements of an individual feature over the time window. This strategy isolates the impact of specific features on the model's decisions.

An example of the feature grouping strategy. Image created by the authors.

Multi-feature

Each group contains the combined measurements over the time window of features that share a logical relationship or represent a functional unit. This strategy analyzes the joint effect of interdependent features, ensuring that their combined influence is captured.

An example of the multi-feature grouping strategy. Image created by the authors.
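The three strategies differ only in how the cells of a window are partitioned. The sketch below expresses each strategy as a mapping from group name to a set of (time step, feature index) cells. The helper functions and the `units` dictionary are hypothetical and do not reflect how the ShaTS library represents groups internally.

```python
def temporal_groups(window_len, feature_names):
    """One group per instant: every feature measured at time step t."""
    n_features = len(feature_names)
    return {
        f"t{t}": {(t, f) for f in range(n_features)}
        for t in range(window_len)
    }


def feature_groups(window_len, feature_names):
    """One group per feature: that feature at every time step in the window."""
    return {
        name: {(t, f) for t in range(window_len)}
        for f, name in enumerate(feature_names)
    }


def multifeature_groups(window_len, feature_names, units):
    """One group per functional unit: several related features over the window."""
    index = {name: f for f, name in enumerate(feature_names)}
    return {
        unit: {(t, index[name]) for name in members for t in range(window_len)}
        for unit, members in units.items()
    }


features = ["X", "Y", "Z"]
temporal = temporal_groups(10, features)
per_feature = feature_groups(10, features)
# Hypothetical functional units: X and Y belong together, Z stands alone.
per_unit = multifeature_groups(10, features, {"unit_XY": ["X", "Y"], "unit_Z": ["Z"]})

print(len(temporal), len(per_feature), len(per_unit))
```

For a window of length 10 with three features, the temporal strategy yields 10 groups, the feature strategy 3 and this multi-feature split 2. Only the temporal strategy produces more groups as the window gets longer.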

Once the groups are defined, Shapley values are computed exactly as in the individual case, but using group-level marginal contributions instead of per-feature contributions.

ShaTS overview. Image created by the authors.
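A group-level marginal contribution needs a value function that scores a window when only some groups are "present". One common way to define it, assumed here for illustration and not confirmed as the ShaTS implementation, is to keep the real values for present groups and fill every other cell from background samples. `coalition_value` and `toy_model` are hypothetical names.

```python
def coalition_value(model, sample, background, groups, present):
    """Average model output when only the groups in `present` keep their real values."""
    keep = set().union(*(groups[name] for name in present))
    outputs = []
    for reference in background:
        # Cells outside the coalition are replaced by the background reference.
        mixed = [
            [
                sample[t][f] if (t, f) in keep else reference[t][f]
                for f in range(len(sample[t]))
            ]
            for t in range(len(sample))
        ]
        outputs.append(model(mixed))
    return sum(outputs) / len(outputs)


# Hypothetical model: the score is the mean of feature 0 over the window.
def toy_model(window):
    return sum(row[0] for row in window) / len(window)


sample = [[1.0, 5.0], [1.0, 5.0]]        # 2 time steps, 2 features
background = [[[0.0, 0.0], [0.0, 0.0]]]  # a single all-zero reference window
groups = {"X": {(0, 0), (1, 0)}, "Y": {(0, 1), (1, 1)}}

v_empty = coalition_value(toy_model, sample, background, groups, [])
v_x = coalition_value(toy_model, sample, background, groups, ["X"])
v_y = coalition_value(toy_model, sample, background, groups, ["Y"])
print(v_x - v_empty, v_y - v_empty)
```

Here the marginal contribution of group X to the empty coalition is the full score, while group Y contributes nothing because the toy model ignores feature 1.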

ShaTS custom visualization

ShaTS includes a visualization designed specifically for sequential data and the three grouping strategies. The horizontal axis shows consecutive windows. The left vertical axis lists the groups, and the right vertical axis shows the model's anomaly score for each window. Each heatmap cell $(i, G_j)$ represents the importance of group $G_j$ for window $i$. Warmer reds indicate a stronger positive contribution to the anomaly, cooler blues indicate a stronger negative contribution, and near-white indicates negligible influence. The purple dashed line traces the anomaly score across windows, and the horizontal line at 0.5 marks the decision threshold.

For example, consider a model that processes windows of length 10 built from three features, X, Y and Z. When an operator receives an alert and wants to know which signal triggered it, they inspect the results of the feature grouping. In the following image, around windows 10-11 the anomaly score rises above the threshold while the attribution for X intensifies. This pattern indicates that the decision is primarily driven by X.

ShaTS custom visualization for the feature strategy. Image generated by the ShaTS library.

If the next question is when, within each window, the anomaly occurs, the operator switches to the temporal grouping view. The following figure shows that the last instant of each window ($t_9$) consistently carries the strongest positive attribution, which reveals that the model has learned to rely on the final time step to classify the window as anomalous.

ShaTS custom visualization for the temporal strategy. The left y-axis lists the window's time slots from $t_0$ (earliest) to $t_9$ (most recent). Image generated by the ShaTS library.
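The same reading of the heatmap can be done programmatically: pick the windows whose anomaly score exceeds the threshold and look up the group with the largest attribution in each. The numbers below are invented for illustration and do not come from the ShaTS library or from the figures above.

```python
threshold = 0.5

# Hypothetical attributions: one row per group, one value per window.
attributions = {
    "X": [0.01, 0.02, 0.40, 0.55],
    "Y": [0.00, 0.01, 0.03, 0.02],
    "Z": [0.02, -0.01, 0.01, 0.04],
}
# Hypothetical anomaly score of the model for each window.
anomaly_scores = [0.10, 0.15, 0.70, 0.90]

# Windows the detector flags as anomalous.
flagged = [i for i, score in enumerate(anomaly_scores) if score > threshold]

# For each flagged window, the group with the strongest positive attribution.
top_group = {
    i: max(attributions, key=lambda group: attributions[group][i])
    for i in flagged
}

print(flagged, top_group)
```

In this made-up data the last two windows cross the threshold and X dominates both, mirroring the visual pattern described for the feature strategy.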

Experimental results: Testing ShaTS on the SWaT dataset

In our recent publication, we validated ShaTS on the Secure Water Treatment (SWaT) testbed, an industrial water facility with sensors/actuators organized in six plant stages (P1-P6). A Bi-LSTM model trained on the windowed signals serves as the detector, and we compare ShaTS with post hoc KernelSHAP from three viewpoints: temporal (which instant in the window matters), sensor/actuator (which device), and process (which of the six stages).

Across the attacks, ShaTS produced tight, interpretable bands that pinpoint the true source of the anomaly, whether at the sensor/actuator level or at the plant-stage level. Post hoc SHAP, by contrast, tended to spread importance across many groups, which complicates root-cause analysis. ShaTS was also faster and more scalable: grouping shrinks the player set, so the coalition space decreases significantly; the runtime stays stable as the window length increases because the number of groups does not change; and GPU execution speeds up the method further, making near real-time use more practical.

Hands-on example: Integrating ShaTS into your workflow

This walkthrough shows how to plug ShaTS into a standard Python workflow: import the library, choose a grouping strategy, initialize the explainer with your trained model and background data, compute group-level Shapley values on a test set, and visualize the results. The example assumes a PyTorch time-series model and windowed data (e.g., each sample has shape [window_len, n_features]).

1. Import ShaTS and configure the explainer

In your Python script or notebook, first import the required components from the ShaTS library. While the repository exposes an abstract ShaTS class, in practice you instantiate one of its concrete implementations (e.g., FastShaTS).

import shats
from shats.grouping import TimeGroupingStrategy
from shats.grouping import FeaturesGroupingStrategy
from shats.grouping import MultifeaturesGroupingStrategy

2. Initialize the model and data

Assume that you have a trained PyTorch time-series model and a background dataset, which should be a list of typical data samples that the model has seen during training. If you want to better understand the role of the background dataset, check out this blog post by Christoph Molnar.

import random

# MyTrainedModel, trainDataset and variable_names stand in for your own objects.
model = MyTrainedModel()
# Sample 100 training windows as the background (support) dataset.
random_samples = random.sample(range(len(trainDataset)), 100)
background = [trainDataset[idx] for idx in random_samples]

shapley_class = shats.FastShaTS(
    model,
    support_dataset=background,
    grouping_strategy=FeaturesGroupingStrategy(names=variable_names),
)

3. Compute Shapley values

Once the explainer is initialized, compute the ShaTS values for your test data. The test dataset should be formatted in the same way as the background dataset.

# shapley_class is the FastShaTS explainer created in step 2.
shats_values = shapley_class.compute(testDataset)

4. Visualize the results

Finally, use the built-in visualization function to plot the ShaTS values. You can specify which class (e.g., anomalous or normal) you want to explain.

# shapley_class is the FastShaTS explainer created in step 2.
shapley_class.plot(shats_values, test_dataset=testDataset, class_to_explain=1)

Key takeaways

  • Focused attribution: ShaTS provides more focused attributions than post hoc SHAP, making it easier to identify the root cause in time-series models.
  • Efficiency: By grouping players, ShaTS significantly reduces the number of coalitions to evaluate, resulting in faster computation times.
  • Scalability: ShaTS maintains consistent performance even as the window size increases, thanks to its fixed group structure.
  • GPU acceleration: ShaTS can take advantage of GPU resources to further improve its speed and efficiency.

Try it yourself

Interactive demo

Compare ShaTS with post hoc SHAP on synthetic time series here. You can find a tutorial in the following video.

https://www.youtube.com/watch?v=eihgqWodga

Open Source

The ShaTS module is fully documented and ready to plug into your ML/DL pipeline. Get the code on GitHub.

I hope you like it! You are welcome to contact me if you have any questions, want to share feedback, or simply feel like showing off your projects.
