---
title: Image Colorization
emoji: 🐒
colorFrom: purple
colorTo: yellow
sdk: docker
pinned: false
license: apache-2.0
app_port: 5000
---

The YAML block above is the Hugging Face Spaces config; `app_port` must match the port the app listens on inside the container (5000 here, the same port published by the `docker run` command below).

Image Colorization
==============================

A deep-learning-based image colorization project.

## FINDINGS
- the task we want to learn is `image-colorization`, but we can accomplish it through different kinds of tasks. I call these **sub-tasks**; in our context they could be `regression based image colorization`, `classification (by binning) based colorization`, `GAN based colorization`, or `image colorization + scene classification` (the "Let there be Color!" paper did this).
- while trying to come up with a project file structure, I realized that the data, model, loss, metrics, and dataloader are all tightly coupled when dealing with the task (`image-colorization`) as a whole, but within a single **sub-task** we have much more freedom.
- within a sub-task (e.g., regression-unet-learner) the interfaces are already fixed, so we can swap in different models without changing the data, or swap datasets while keeping the same model (see the registry sketch after this list), **so it is important to fix the sub-task we want to do first.**
- making a folder for each sub-task therefore seems right, since a sub-task has high cohesion and no coupling with any other sub-task.
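
To make the last two points concrete, here is a minimal sketch of the registry idea behind `register_models.py` (and, symmetrically, `register_datasets.py`). All names here (`MODEL_REGISTRY`, `register_model`, `build_model`, the model classes) are illustrative assumptions, not the project's actual API:

```python
# minimal registry sketch (hypothetical names, not the project's actual API)

MODEL_REGISTRY = {}

def register_model(name):
    """Decorator that records a model class under a string key."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator

@register_model("simple_model")
class SimpleModel:
    def __init__(self, config):
        self.config = config

@register_model("complex_model")
class ComplexModel:
    def __init__(self, config):
        self.config = config

def build_model(config):
    """Resolve the model by name from the experiment config."""
    return MODEL_REGISTRY[config["model"]](config)

model = build_model({"model": "simple_model"})
```

Because the model is resolved from the config by name, swapping models is a one-line config change; the data pipeline, losses, and metrics never need to know which model is in use.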

## RULES
- use **lower_snake_case** for **functions**
- use **lower_snake_case** for **file & folder names**
- use **UpperCamelCase** for **class names**
- **sub-task** name should be in **lower-kebab-case**
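
Put together, the conventions look like this (all names below are made up for illustration):

```python
# file: src/sub_task_1/model/models/simple_model.py   <- lower_snake_case files/folders

def build_unet_backbone():            # lower_snake_case function
    pass

class SimpleColorizationModel:        # UpperCamelCase class
    pass

SUB_TASK = "regression-unet-learner"  # sub-task names are lower-kebab-case
```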

## Project File Structure
------------
    .
    ├── LICENSE
    ├── README.md          <- The top-level README for developers using this project.
    ├── data/
    │   ├── external       <- Data from third party sources.
    │   ├── interim        <- Intermediate data that has been transformed.
    │   ├── processed      <- The final, canonical data sets for modeling.
    │   └── raw            <- The original, immutable data dump.
    ├── models/            <- Trained models
    ├── notebooks/         <- Jupyter notebooks
    ├── configs/
    │   ├── experiment1.yaml
    │   ├── experiment2.yaml
    │   ├── experiment3.yaml
    │   └── ...
    └── src/
        ├── sub_task_1/
        │   ├── validate_config.py
        │   ├── data/
        │   │   ├── register_datasets.py
        │   │   └── datasets/
        │   │       ├── dataset1.py
        │   │       └── dataset2.py
        │   ├── model/
        │   │   ├── base_model_interface.py
        │   │   ├── register_models.py
        │   │   ├── models/
        │   │   │   ├── simple_model.py
        │   │   │   └── complex_model.py
        │   │   ├── losses.py
        │   │   ├── metrics.py
        │   │   ├── callbacks.py
        │   │   └── dataloader.py
        │   └── scripts/
        │       ├── create_dataset.py
        │       └── create_model.py
        ├── sub_task_2/
        │   └── ...
        ├── sub_task_3/
        │   └── ...
        ├── scripts/
        │   ├── create_sub_task.py
        │   ├── prepare_dataset.py
        │   ├── visualize_dataset.py
        │   ├── visualize_results.py
        │   ├── train.py
        │   ├── evaluate.py
        │   └── inference.py
        └── utils/
            ├── data_utils.py
            └── model_utils.py
--------
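
Each experiment is driven by a config in `configs/`, validated by the sub-task's `validate_config.py`. A minimal sketch of how that could look, assuming a hypothetical schema with `sub_task`, `model`, `dataset`, and `epochs` keys (the real keys may differ):

```python
# sketch of validate_config.py (hypothetical schema; the real keys may differ)
import yaml  # pip install pyyaml

REQUIRED_KEYS = {"sub_task", "model", "dataset", "epochs"}

def validate_config(path):
    """Load an experiment YAML and fail fast if required keys are missing."""
    with open(path) as f:
        config = yaml.safe_load(f)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"{path} is missing required keys: {sorted(missing)}")
    return config

config = validate_config("configs/experiment1.yaml")
```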


<p><small>Project based on the <a target="_blank" href="https://drivendata.github.io/cookiecutter-data-science/">cookiecutter data science project template</a>. #cookiecutterdatascience</small></p>


Kaggle API docs: https://github.com/Kaggle/kaggle-api/blob/main/docs/README.md

## Kaggle Commands
- `kaggle kernels pull anujpanthri/training-image-colorization-model -p kaggle/`
- `kaggle kernels push -p kaggle/`
- `echo "{\"username\":\"$KAGGLE_USERNAME\",\"key\":\"$KAGGLE_KEY\"}" > kaggle.json`
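
The Kaggle CLI looks for credentials in `~/.kaggle/kaggle.json` (or reads `KAGGLE_USERNAME`/`KAGGLE_KEY` directly from the environment), so move the generated `kaggle.json` there unless those environment variables are already exported.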

## Docker Commands
- `docker buildx build --secret id=COMET_API_KEY,env=COMET_API_KEY -t testcontainer .`
- `docker run -it -p 5000:5000 -e COMET_API_KEY=$COMET_API_KEY testcontainer`
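
Note that `--secret` requires BuildKit and only exposes the key to build steps that explicitly mount it (e.g. `RUN --mount=type=secret,id=COMET_API_KEY ...` in the Dockerfile); the secret is not baked into the final image, which is why `docker run` passes `COMET_API_KEY` again via `-e`.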

## Git Commands
- `git lfs migrate info --everything --include="*.zip,*.png,*.jpg"`
- `git lfs migrate import --everything --include="*.zip,*.png,*.jpg"`
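
`migrate info` is a dry run that reports which files would be converted; `migrate import --everything` rewrites history across all refs, so the rewritten branches must be force-pushed and existing clones should be re-cloned.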

### Version 1

- I'm going to skip logging for now and use print statements instead.


## Dataset

![Train/val dataset sample](outputs/artifacts/dataset/trainval_image.png)
![Test dataset sample](outputs/artifacts/dataset/test_image.png)

## Result

![Colorization results on train images](outputs/artifacts/result/train_image.png)
![Colorization results on val images](outputs/artifacts/result/val_image.png)