---
license: mit
base_model:
- timm/swin_base_patch4_window7_224.ms_in22k_ft_in1k
pipeline_tag: image-classification
library_name: timm
---

# PowerPoint slide classifier

This model classifies PowerPoint slide images into five layout types. It is fine-tuned from `timm/swin_base_patch4_window7_224.ms_in22k_ft_in1k` on 10k PowerPoint slide images.

* `0`: Common content slide
* `1`: End slide
* `2`: Start slide
* `3`: Subtitle slide
* `4`: Subtitle list slide

## Usage

### Install timm and dependencies

```bash
pip install timm==1.0.15 torch==2.7.0 torchvision==0.22.0
```

### Inference

Use the following code to classify all PNG images in a folder.

```python
import os

import timm
import torch
from PIL import Image
from torchvision import transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
image_folder = 'path_to_images'

# Resize to the 224x224 model input and apply ImageNet normalization.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

# Build the architecture with a 5-class head, then load the fine-tuned weights.
model = timm.create_model('swin_base_patch4_window7_224', pretrained=False, num_classes=5)
# map_location lets a checkpoint saved on GPU load on a CPU-only machine.
model.load_state_dict(torch.load('pytorch_model.bin', map_location=device))
model.to(device)
model.eval()

image_files = [f for f in os.listdir(image_folder) if f.lower().endswith('.png')]

idx_to_class = {
    0: 'content',
    1: 'end',
    2: 'start',
    3: 'subt',
    4: 'subtl'
}

with torch.no_grad():
    for image_name in image_files:
        image_path = os.path.join(image_folder, image_name)
        image = Image.open(image_path).convert('RGB')
        # Add a batch dimension: (3, 224, 224) -> (1, 3, 224, 224).
        input_tensor = transform(image).unsqueeze(0).to(device)

        output = model(input_tensor)
        predicted_class = torch.argmax(output, dim=1).item()
        predicted_label = idx_to_class[predicted_class]

        print(f"{image_name} --> {predicted_label}")
```
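
### Prediction confidence

The inference code above reports only the argmax label. If a confidence score per class is also useful, the logits can be passed through a softmax. The following is a minimal sketch, assuming the logits for one image have already been converted to a plain Python list (for example with `output.squeeze(0).tolist()`). The helper name `rank_labels` is hypothetical and not part of this model's repository, and the example logits are made up for illustration.

```python
import math

# Same index-to-label mapping as in the inference code.
IDX_TO_CLASS = {0: 'content', 1: 'end', 2: 'start', 3: 'subt', 4: 'subtl'}


def rank_labels(logits, idx_to_class=IDX_TO_CLASS):
    """Hypothetical helper: return (label, probability) pairs, most likely first."""
    if len(logits) != len(idx_to_class):
        raise ValueError("expected one logit per class")
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by probability, highest first.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(idx_to_class[i], p) for i, p in ranked]


# Example with made-up logits (not real model output).
for label, prob in rank_labels([2.0, 0.1, -1.0, 0.5, 0.0]):
    print(f"{label}: {prob:.3f}")
```

A low top probability can be used as a signal to flag a slide for manual review; any threshold for that would need to be chosen on your own data.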