---
tags:
- bertopic
library_name: bertopic
---

# BERTopic_Multimodal

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model. 
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets. 

This model was trained on the roughly 8,000 images of the Flickr8k dataset **without** their captions. It demonstrates that BERTopic can perform topic modeling using only images as input.

A few examples of generated topics:

!["multimodal.png"](multimodal.png)

## Usage 

To use this model, please install BERTopic:

```bash
pip install -U "bertopic[vision]"
pip install -U safetensors
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("MaartenGr/BERTopic_Multimodal")

topic_model.get_topic_info()
```

You can view all information about a topic as follows:

```python
topic_id = 0  # example value: any Topic ID from the overview table below
topic_model.get_topic(topic_id, full=True)
```

## Topic overview

* Number of topics: 29
* Number of training documents: 8091

<details>
  <summary>Click here for an overview of all topics.</summary>
  
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------| 
| -1 | while - air - the - in - jumping | 34 | -1_while_air_the_in | 
| 0 | bench - sitting - people - woman - street | 1132 | 0_bench_sitting_people_woman | 
| 1 | grass - running - dog - grassy - field | 1693 | 1_grass_running_dog_grassy | 
| 2 | boy - girl - little - young - holding | 1290 | 2_boy_girl_little_young | 
| 3 | dog - frisbee - running - water - mouth | 1224 | 3_dog_frisbee_running_water | 
| 4 | skateboard - ramp - doing - trick - cement | 415 | 4_skateboard_ramp_doing_trick | 
| 5 | snow - dog - covered - running - through | 309 | 5_snow_dog_covered_running | 
| 6 | mountain - range - slope - standing - person | 205 | 6_mountain_range_slope_standing | 
| 7 | pool - blue - boy - toy - water | 189 | 7_pool_blue_boy_toy | 
| 8 | trail - bike - down - riding - person | 166 | 8_trail_bike_down_riding | 
| 9 | snowboarder - mid - jump - air - after | 126 | 9_snowboarder_mid_jump_air | 
| 10 | rock - climbing - up - wall - tree | 124 | 10_rock_climbing_up_wall | 
| 11 | wave - surfboard - top - riding - of | 112 | 11_wave_surfboard_top_riding | 
| 12 | beach - surfboard - people - with - walking | 102 | 12_beach_surfboard_people_with | 
| 13 | jumping - track - horse - racquet - dog | 98 | 13_jumping_track_horse_racquet | 
| 14 | snowboard - snow - girl - hill - slope | 95 | 14_snowboard_snow_girl_hill | 
| 15 | game - being - football - played - professional | 91 | 15_game_being_football_played | 
| 16 | soccer - kicking - team - ball - player | 80 | 16_soccer_kicking_team_ball | 
| 17 | dirt - bike - person - rider - going | 75 | 17_dirt_bike_person_rider | 
| 18 | soccer - boys - field - ball - kicking | 69 | 18_soccer_boys_field_ball | 
| 19 | baseball - player - bat - swinging - into | 63 | 19_baseball_player_bat_swinging | 
| 20 | basketball - up - and - playing - jumping | 59 | 20_basketball_up_and_playing | 
| 21 | bird - body - flying - over - long | 55 | 21_bird_body_flying_over | 
| 22 | motorcycle - track - race - racer - racing | 55 | 22_motorcycle_track_race_racer | 
| 23 | boat - sitting - water - lake - hose | 53 | 23_boat_sitting_water_lake | 
| 24 | street - riding - down - bike - woman | 52 | 24_street_riding_down_bike | 
| 25 | paddle - suit - paddling - water - in | 49 | 25_paddle_suit_paddling_water | 
| 26 | pair - scissors - stage - white - shirt | 42 | 26_pair_scissors_stage_white | 
| 27 | tennis - court - racket - racquet - swinging | 34 | 27_tennis_court_racket_racquet |
  
</details>
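
Judging from the table, each entry in the Label column is the topic ID followed by the first four keywords, joined with underscores. The sketch below splits a label back into those parts. `parse_label` is a hypothetical helper, not part of BERTopic, and it assumes single-word keywords, which holds for the `n_gram_range` of (1, 1) used here.

```python
def parse_label(label):
    """Split a label such as '3_dog_frisbee_running_water' into (3, [keywords])."""
    # Everything before the first underscore is the topic ID (may be -1 for outliers).
    topic_id, _, rest = label.partition("_")
    return int(topic_id), rest.split("_")


print(parse_label("-1_while_air_the_in"))
print(parse_label("27_tennis_court_racket_racquet"))
```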

## Training Procedure

The data was retrieved as follows:

```python
import os
import glob
import zipfile
import numpy as np
import pandas as pd
from tqdm import tqdm
from sentence_transformers import util

# Flickr 8k images
img_folder = 'photos/'
caps_folder = 'captions/'
if not os.path.exists(img_folder) or len(os.listdir(img_folder)) == 0:
    os.makedirs(img_folder, exist_ok=True)

    if not os.path.exists('Flickr8k_Dataset.zip'):  # Download both archives if the image archive is missing
        util.http_get('https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip', 'Flickr8k_Dataset.zip')
        util.http_get('https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_text.zip', 'Flickr8k_text.zip')

    for folder, file in [(img_folder, 'Flickr8k_Dataset.zip'), (caps_folder, 'Flickr8k_text.zip')]:
        with zipfile.ZipFile(file, 'r') as zf:
            for member in tqdm(zf.infolist(), desc='Extracting'):
                zf.extract(member, folder)
images = list(glob.glob('photos/Flicker8k_Dataset/*.jpg'))
``` 

Then, to perform topic modeling on multimodal data with BERTopic:

```python
from bertopic import BERTopic
from bertopic.backend import MultiModalBackend
from bertopic.representation import VisualRepresentation, KeyBERTInspired

# Image embedding model
embedding_model = MultiModalBackend('clip-ViT-B-32', batch_size=32)

# Image to text representation model
representation_model = {
    "Visual_Aspect": VisualRepresentation(image_to_text_model="nlpconnect/vit-gpt2-image-captioning", image_squares=True),
    "KeyBERT": KeyBERTInspired()
}

# Train our model with images only
topic_model = BERTopic(representation_model=representation_model, verbose=True, embedding_model=embedding_model, min_topic_size=30)
topics, probs = topic_model.fit_transform(documents=None, images=images)
```

The above shows that the input consisted only of images. BERTopic clusters these images and extracts a small subset of representative images from each cluster. The representative images are captioned using `nlpconnect/vit-gpt2-image-captioning`, which yields a small textual dataset over which c-TF-IDF and the additional `KeyBERTInspired` representation model can be run.
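
To make the c-TF-IDF step more concrete, here is a minimal pure-Python sketch of the weighting as the BERTopic documentation describes it: the frequency of a word within a cluster, normalized by the cluster's word count, multiplied by `log(1 + A / f)`, where `A` is the average number of words per cluster and `f` is the word's frequency across all clusters. The captions are made up for illustration, `c_tf_idf` and `top_words` are hypothetical helper names, and this is not the library's own implementation, which works on sparse matrices.

```python
import math
from collections import Counter


def c_tf_idf(captions_per_cluster):
    """Score every word per cluster with a simplified c-TF-IDF weighting."""
    # Treat all captions of a cluster as a single bag of words.
    counts = {
        cluster: Counter(" ".join(captions).lower().split())
        for cluster, captions in captions_per_cluster.items()
    }
    # Frequency of each word across all clusters.
    totals = Counter()
    for cluster_counts in counts.values():
        totals.update(cluster_counts)
    # Average number of words per cluster (the "A" term).
    avg_words = sum(totals.values()) / len(counts)

    scores = {}
    for cluster, cluster_counts in counts.items():
        n_words = sum(cluster_counts.values())
        scores[cluster] = {
            word: (freq / n_words) * math.log(1 + avg_words / totals[word])
            for word, freq in cluster_counts.items()
        }
    return scores


def top_words(scores, n=3):
    """Return the n highest-scoring words for each cluster."""
    return {
        cluster: [w for w, _ in sorted(ws.items(), key=lambda kv: kv[1], reverse=True)[:n]]
        for cluster, ws in scores.items()
    }


# Made-up captions standing in for the output of the captioning model.
captions = {
    0: ["dog running through grass", "dog catching frisbee", "dog in water"],
    1: ["person on skateboard", "skateboard trick on ramp", "skateboard in air"],
}
scores = c_tf_idf(captions)
print(top_words(scores, n=1))
```

Words that are frequent within one cluster but rare elsewhere score highest, while a word shared by both clusters, such as "in", is weighted down.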

## Training hyperparameters

* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 30
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: True

## Framework versions

* Numpy: 1.23.5
* HDBSCAN: 0.8.29
* UMAP: 0.5.3
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 2.2.2
* Transformers: 4.29.2
* Numba: 0.56.4
* Plotly: 5.14.1
* Python: 3.10.10
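
When reproducing the model, it may help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library. `version_mismatches` is a hypothetical helper, and the dictionary keys are the PyPI distribution names assumed to correspond to the listed packages.

```python
from importlib import metadata

# Versions copied from the list above; keys are PyPI distribution names.
EXPECTED = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.29.2",
    "numba": "0.56.4",
    "plotly": "5.14.1",
}


def version_mismatches(expected):
    """Return {package: (expected, installed)} for every package that differs.

    The installed value is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in version_mismatches(EXPECTED).items():
    print(f"{name}: model card lists {wanted}, installed: {found}")
```

A differing version does not necessarily prevent the model from loading; the check only reports where the environment deviates from the one used for training.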