---
license: mit
tags:
- deepfake-detection
- video-classification
- computer-vision
- xception
- lstm
model-index:
- name: Deepfake Detection Model
  results:
  - task:
      type: video-classification
      name: Video Classification
    dataset:
      name: FaceForensics++ & CelebDFv2
      type: image-folder # processed face frames extracted from the videos
      split: test
    metrics:
      - type: accuracy
        value: 0.9593
        name: Test Accuracy
      - type: f1
        value: 0.94 # carried over from an earlier evaluation; no separate test F1 is reported
        name: F1 Score
---
# Deepfake Detection Model

This repository contains a deepfake detection model that combines a pre-trained Xception network with an LSTM layer. The model classifies videos as either "Real" or "Fake" by analyzing sequences of facial frames extracted from each video.

## Model Architecture

The model architecture consists of the following components:

1.  **Input**: Accepts a sequence of `TIME_STEPS` frames, each resized to `299x299` pixels.
2.  **Feature Extraction**: A **TimeDistributed Xception network** processes each frame, extracting key features.
3.  **Temporal Learning**: An **LSTM layer** with `256` units learns temporal dependencies between these extracted frame features.
4.  **Regularization**: A **Dropout layer** (`0.5` rate) prevents overfitting.
5.  **Output**: A **Dense layer** with `softmax` activation predicts probabilities for "Real" and "Fake" classes.
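
A Keras implementation of this stack is given in the `build_model` listing under *How to Use* below.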

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
The model was tested on unseen samples from the FaceForensics++ and CelebDFv2 datasets.

#### Metrics
* **Accuracy**: the proportion of samples classified correctly.
* **F1 Score**: the harmonic mean of precision and recall, balancing the two (see the sketch below).
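
As a quick illustration of how these two metrics are computed, here is a minimal sketch; the label and prediction arrays below are made-up placeholders, not values from the actual test set:

```python
import numpy as np

# Placeholder ground-truth labels and model predictions (1 = Fake, 0 = Real).
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

# Accuracy: fraction of correct classifications.
accuracy = np.mean(y_true == y_pred)

# F1: harmonic mean of precision and recall on the positive class.
tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy: {accuracy:.4f}, F1: {f1:.4f}")  # both 0.7500 for this toy data
```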

### Results

| Metric             | Value   |
| :----------------- | :------ |
| Training Accuracy  | 98.44%  |
| Validation Accuracy| 97.05%  |
| Test Accuracy      | 95.93%  |

**Disclaimer**: These results were obtained using the FaceForensics++ and CelebDFv2 datasets. Performance in real-world scenarios may vary.


## How to Use

#### 1. Setup

Clone the repository and install the required libraries:

```bash
pip install tensorflow opencv-python numpy mtcnn Pillow
```

#### 2. Model Loading

The model weights are loaded from `COMBINED_best_Phase1.keras`. Ensure this file is accessible at the specified `model_path`.

```python
model_path = 'COMBINED_best_Phase1.keras'  # path to the downloaded weights file
model = build_model()  # architecture defined in the `build_model` function below
model.load_weights(model_path)
```
The `build_model` function defines the architecture as:
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Global parameters for the model input shape; these must match the
# configuration used during training.
TIME_STEPS = 30  # number of frames per sequence
HEIGHT = 299     # frame height expected by Xception
WIDTH = 299      # frame width expected by Xception

def build_model(lstm_hidden_size=256, num_classes=2, dropout_rate=0.5):
    # Input shape: (batch_size, TIME_STEPS, HEIGHT, WIDTH, 3)
    inputs = layers.Input(shape=(TIME_STEPS, HEIGHT, WIDTH, 3))
    # TimeDistributed layer to apply the base model to each frame
    base_model = keras.applications.Xception(weights='imagenet', include_top=False, pooling='avg')
    # The trainable flag does not matter for inference; to fine-tune with the
    # backbone frozen, set: base_model.trainable = False
    # Apply TimeDistributed wrapper
    x = layers.TimeDistributed(base_model)(inputs)
    # x shape: (batch_size, TIME_STEPS, 2048)
    # LSTM layer
    x = layers.LSTM(lstm_hidden_size)(x)
    x = layers.Dropout(dropout_rate)(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    model = keras.Model(inputs, outputs)
    return model
```
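
After `load_weights`, a quick `model.summary()` is a handy sanity check: it should show the TimeDistributed Xception extractor producing 2048 features per frame, the 256-unit LSTM, and the final 2-way softmax output.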

#### 3. Prediction

The model expects a `video_array` of shape `(1, TIME_STEPS, HEIGHT, WIDTH, 3)`: a batch of one sequence of preprocessed face crops.
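
The card does not document the exact preprocessing pipeline, so the following is only a sketch of one plausible approach, assuming even frame sampling, MTCNN face detection (the `mtcnn` package installed in Setup), and Xception's `preprocess_input` normalization; the helper name `extract_face_sequence` is illustrative, not part of this repository:

```python
import cv2
import numpy as np
from mtcnn import MTCNN
from tensorflow.keras.applications.xception import preprocess_input

def extract_face_sequence(video_path, time_steps=TIME_STEPS, size=(WIDTH, HEIGHT)):
    """Sample frames evenly, crop the first detected face in each, and stack
    the crops into a model-ready array. Assumes the video yields at least
    `time_steps` readable frames."""
    detector = MTCNN()
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, total - 1, time_steps, dtype=int)

    faces = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            continue
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        detections = detector.detect_faces(rgb)
        if detections:
            x, y, w, h = detections[0]['box']
            x, y = max(x, 0), max(y, 0)  # MTCNN boxes can have negative corners
            crop = rgb[y:y + h, x:x + w]
        else:
            crop = rgb  # fall back to the full frame if no face is detected
        faces.append(cv2.resize(crop, size))
    cap.release()

    video_array = preprocess_input(np.stack(faces).astype('float32'))  # scale to [-1, 1]
    return np.expand_dims(video_array, axis=0)  # (1, time_steps, HEIGHT, WIDTH, 3)
```

Once `video_array` is ready, make a prediction with the loaded model: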

```python
import numpy as np

predictions = model.predict(video_array)
predicted_class = np.argmax(predictions, axis=1)[0]
probabilities = predictions[0]

class_names = ['Real', 'Fake']
print(f"Predicted Class: {class_names[predicted_class]}")
print(f"Class Probabilities: Real: {probabilities[0]:.4f}, Fake: {probabilities[1]:.4f}")
```
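
End to end, the pieces compose as follows (`path/to/video.mp4` is a placeholder path, and `extract_face_sequence` is the illustrative helper sketched above):

```python
video_array = extract_face_sequence('path/to/video.mp4')
predictions = model.predict(video_array)
print(f"Verdict: {class_names[np.argmax(predictions, axis=1)[0]]}")
```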
