 !!! info "What is the requirement of the project?"
 - The project focuses on identifying anomalies in time-series data using an LSTM autoencoder. The model learns normal patterns and detects deviations indicating anomalies.
@@ -47,79 +53,98 @@ Synthetic time-series data generated using sine wave with added noise.

 ---

-### Model Architecture
-- The LSTM autoencoder learns normal time-series behavior and reconstructs it. Any deviation is considered an anomaly.
-- Encoder: Extracts patterns using LSTM layers.
-- Bottleneck: Compresses the data representation.
-- Decoder: Reconstructs the original sequence.
-- The reconstruction error determines anomalies.
-
-### Model Structure
-- Input: Time-series sequence (50 time steps)
-- LSTM Layers for encoding
-- Repeat Vector to retain sequence information
-- LSTM Layers for decoding
-- TimeDistributed Dense Layer for reconstruction
-- Loss Function: Mean Squared Error (MSE)
+## 🔍 PROJECT EXPLANATION
+
+### 🧩 DATASET OVERVIEW & FEATURE DETAILS
+
+??? example "📂 Synthetic dataset"
+
+- The dataset consists of a sine wave with added noise.
+
+| Feature Name | Description | Datatype |
+|--------------|-------------|:------------:|
+| time | Timestamp | int64 |
+| value | Sine wave value with noise | float64 |

 ---

-#### WHAT I HAVE DONE
+### 🛤 PROJECT WORKFLOW

-=== "Step 1"
+!!! success "Project workflow"

-Exploratory Data Analysis
+``` mermaid
+graph LR
+A[Start] --> B{Generate Data};
+B --> C[Normalize Data];
+C --> D[Create Sequences];
+D --> E[Train LSTM Autoencoder];
+E --> F[Compute Reconstruction Error];
+F --> G[Identify Anomalies];
+```

+=== "Step 1"
 - Generate synthetic data (sine wave with noise)
 - Normalize data using MinMaxScaler
 - Split data into training and validation sets
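
A minimal sketch of Step 1, for readers who want to see the three bullets as code. It is not the project's actual script: the point count, noise level, sine frequency, and 80/20 split are illustrative assumptions, and the scaling is written out in NumPy (the same `(x - min) / (max - min)` formula that `MinMaxScaler` uses with its default range) so the example has no dependency beyond NumPy.

```python
import numpy as np


def make_noisy_sine(n_points=1000, noise_std=0.1, seed=0):
    """Return (time, value): an integer time index and a noisy sine wave.

    n_points, noise_std and the 0.02 frequency are assumed values.
    """
    rng = np.random.default_rng(seed)
    time = np.arange(n_points, dtype=np.int64)
    value = np.sin(0.02 * time) + rng.normal(0.0, noise_std, size=n_points)
    return time, value


def min_max_scale(values):
    """Rescale values to [0, 1] using the min-max formula."""
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo)


def chronological_split(values, train_fraction=0.8):
    """Split without shuffling so the validation data follows the training data in time."""
    cut = int(len(values) * train_fraction)
    return values[:cut], values[cut:]


time, value = make_noisy_sine()
scaled = min_max_scale(value)
train, val = chronological_split(scaled)
```

The split is chronological rather than random because shuffling a time series would leak future points into the training set.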

 === "Step 2"
-
-Data Cleaning and Preprocessing
-
 - Create sequential data using a rolling window approach
 - Reshape data for LSTM compatibility
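
The rolling-window step could look roughly like the sketch below. The window length of 50 comes from the "50 time steps" input described in the removed Model Structure list; the helper name `make_windows` and the single-feature layout are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(series, window=50):
    """Turn a 1-D series into overlapping windows shaped for an LSTM.

    Returns an array of shape (n_windows, window, 1), i.e.
    (samples, time steps, features), where n_windows = len(series) - window + 1.
    """
    series = np.asarray(series, dtype=np.float64)
    views = sliding_window_view(series, window)  # read-only view, shape (n_windows, window)
    # Copy so the result is writable, then add the trailing feature axis.
    return views.copy()[..., np.newaxis]


demo = make_windows(np.arange(60.0), window=50)
print(demo.shape)
```

Consecutive windows overlap by `window - 1` points, so a series of length `n` yields `n - window + 1` training samples.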

 === "Step 3"
+- Implement LSTM autoencoder for anomaly detection
+- Optimize model using Adam optimizer

-Feature Engineering and Selection
+=== "Step 4"
+- Compute reconstruction error for anomaly detection
+- Identify threshold for anomalies using percentile-based method

-- Use LSTM layers for sequence modeling
-- Implement autoencoder-based reconstruction
+=== "Step 5"
+- Visualize detected anomalies using Matplotlib

-=== "Step 4"
+---

-Modeling
+### 🖥 CODE EXPLANATION

-- Train an LSTM autoencoder
-- Optimize loss function using Adam optimizer
-- Monitor validation loss for overfitting prevention
+=== "LSTM Autoencoder"
+- The model consists of an encoder, bottleneck, and decoder.
+- It learns normal time-series behavior and reconstructs it.
+- Deviations from normal patterns are considered anomalies.
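
The encoder, bottleneck, and decoder layout might be expressed in Keras along the following lines. This is a sketch, not the project's model: it assumes TensorFlow's bundled Keras is installed, uses one LSTM layer on each side, and picks `latent_units=32` arbitrarily. The layer types (LSTM, RepeatVector, TimeDistributed Dense), the 50-step input, the MSE loss, and the Adam optimizer are the ones named elsewhere in this diff.

```python
def build_lstm_autoencoder(timesteps=50, n_features=1, latent_units=32):
    """Build a small LSTM autoencoder; latent_units is an assumed size."""
    # Imported here so the module can be loaded without TensorFlow present.
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(timesteps, n_features)),
        # Encoder: the final LSTM state is the compressed (bottleneck) vector.
        layers.LSTM(latent_units),
        # Repeat the bottleneck vector once per time step for the decoder.
        layers.RepeatVector(timesteps),
        # Decoder: emit one hidden vector per time step.
        layers.LSTM(latent_units, return_sequences=True),
        # Map each time step back to the original feature dimension.
        layers.TimeDistributed(layers.Dense(n_features)),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```

Training would then pass the same windows as both input and target, for example `model.fit(x_train, x_train, validation_data=(x_val, x_val))`, since the network is asked to reproduce what it is given.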

-=== "Step 5"
+---

-Result Analysis
+### ⚖️ PROJECT TRADE-OFFS AND SOLUTIONS

-- Compute reconstruction error for anomaly detection
-- Identify threshold for anomalies using percentile-based method
-- Visualize detected anomalies using Matplotlib
+=== "Reconstruction Error Threshold Selection"
+- Setting a high threshold may miss subtle anomalies, while a low threshold might increase false positives.
+- **Solution**: Use the 95th percentile of reconstruction errors as the threshold to balance false positives and false negatives.
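
The 95th-percentile rule can be sketched in a few lines of NumPy. The function names and the toy data are hypothetical; in the real pipeline `reconstructions` would come from the trained autoencoder's predictions rather than from hand-built offsets.

```python
import numpy as np


def reconstruction_errors(windows, reconstructions):
    """Per-window mean squared error; inputs are shaped (samples, time steps, features)."""
    return np.mean((windows - reconstructions) ** 2, axis=(1, 2))


def flag_anomalies(errors, percentile=95.0):
    """Flag windows whose error exceeds the given percentile of all errors."""
    threshold = np.percentile(errors, percentile)
    return errors > threshold, threshold


# Toy data: 100 all-zero windows whose "reconstructions" drift further off
# with each window, so the error grows steadily from window 0 to window 99.
windows = np.zeros((100, 50, 1))
offsets = (np.arange(100) / 100.0).reshape(-1, 1, 1)
reconstructions = windows + offsets

errors = reconstruction_errors(windows, reconstructions)
is_anomaly, threshold = flag_anomalies(errors)
print(int(is_anomaly.sum()), "windows flagged")
```

One consequence of this rule is worth keeping in mind: when the threshold is computed on the same errors it is applied to, roughly 5% of windows are flagged by construction, whether or not they are truly anomalous. Computing the threshold on errors from known-normal data and applying it to new data avoids that.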

 ---

-#### PROJECT TRADE-OFFS AND SOLUTIONS
+## 🖼 SCREENSHOTS

-=== "Trade Off 1"
+!!! tip "Visualizations and EDA of different features"

-**Reconstruction Error Threshold Selection:**
-Setting a high threshold may miss subtle anomalies, while a low threshold might increase false positives.
+=== "Synthetic Data Plot"
+

-- **Solution**: Use the 95th percentile of reconstruction errors as the threshold to balance false positives and false negatives.
+??? example "Model performance graphs"
+
+=== "Reconstruction Error Plot"

 ---

-### CONCLUSION
+## 📉 MODELS USED AND THEIR EVALUATION METRICS

-#### WHAT YOU HAVE LEARNED
+| Model | Reconstruction Error (MSE) |
+|------------------|---------------------------|
+| LSTM Autoencoder | 0.015 |
+
+---
+
+## ✅ CONCLUSION
+
+### 🔑 KEY LEARNINGS

 !!! tip "Insights gained from the data"
 - Time-series anomalies often appear as sudden deviations from normal patterns.
@@ -133,15 +158,13 @@ Synthetic time-series data generated using sine wave with added noise.

 ---

-#### USE CASES OF THIS MODEL
-
-=== "Application 1"
+### 🌍 USE CASES

-- Financial fraud detection through irregular transaction patterns.
+=== "Financial Fraud Detection"
+- Detect irregular transaction patterns using anomaly detection.

-=== "Application 2"
+=== "Predictive Maintenance"
+- Identify equipment failures in industrial settings before they occur.

-- Predictive maintenance in industrial settings by identifying equipment failures.