Prototype V12 - Patch16 - 128x128 images - 1.2M ImageNet images

Well, the 64x64 image set worked just fine, so it's time to upgrade and test the limits of the architecture.

Can it simply... scale? Or do we need more solvers along the way to compensate?

Dataset: benjamin-paine/imagenet-1k-128x128

Can we actually solve it?

YES WE DID!

It seems geometric manifolds learn... differently than standard manifolds, don't they?

Prototype V11 - Patch16 - MSE 0.0005 - 64x64 Tiny ImageNet

I'd say it works.

The images show it works. It works.

Using geolip-core SVD (fp64 Gram+eigh (FL=available, N<=12))
PatchSVAE - 16 patches of 16×16
  Dataset: tiny_imagenet (64×64, 200 classes)
  Per-patch: (256, 16) = 4096 elements, rows on S^15
  Encoder/Decoder: hidden=768, depth=4 (residual blocks)
  Cross-attention: 2 layers on S vectors (2,272 params)
  Soft hand: boost=1.5x near CV=0.125, penalty=0.3 far
  Total params: 16,942,419
===============================================================================================
  ep |    loss   recon  t/ep |   t_rec |     S0     SD ratio erank |  row_cv  prox    rw | S_delta
-----------------------------------------------------------------------------------------------
   1 |  0.2595  0.1806  12.2 |  0.1024 |  5.036  3.254  1.55 15.87 |  0.2007 0.905  1.45 | 0.09694 a:0.0242/0.0247
   2 |  0.1216  0.0845  12.3 |  0.0675 |  5.071  3.298  1.54 15.88 |  0.2018 0.885  1.44 | 0.17411 a:0.0251/0.0257
   3 |  0.0847  0.0587  12.3 |  0.0470 |  5.093  3.312  1.54 15.88 |  0.2046 0.869  1.43 | 0.19894 a:0.0258/0.0265
   4 |  0.0623  0.0432  12.3 |  0.0430 |  5.115  3.323  1.54 15.88 |  0.2006 0.864  1.43 | 0.20848 a:0.0264/0.0272
   6 |  0.0359  0.0248  12.3 |  0.0198 |  5.129  3.332  1.54 15.88 |  0.2006 0.907  1.45 | 0.21832 a:0.0273/0.0281
   8 |  0.0225  0.0155  12.2 |  0.0196 |  5.149  3.341  1.54 15.87 |  0.2017 0.876  1.44 | 0.22351 a:0.0279/0.0287
  10 |  0.0170  0.0116  12.3 |  0.0100 |  5.151  3.352  1.54 15.88 |  0.2035 0.924  1.46 | 0.22671 a:0.0283/0.0290
  12 |  0.0141  0.0096  12.3 |  0.0114 |  5.159  3.354  1.54 15.88 |  0.2009 0.909  1.45 | 0.22924 a:0.0285/0.0293
  14 |  0.0121  0.0082  12.3 |  0.0073 |  5.156  3.362  1.53 15.88 |  0.2018 0.855  1.43 | 0.23137 a:0.0288/0.0296
  16 |  0.0105  0.0072  12.3 |  0.0108 |  5.161  3.363  1.53 15.88 |  0.2003 0.860  1.43 | 0.23316 a:0.0290/0.0298
  18 |  0.0094  0.0064  12.3 |  0.0055 |  5.158  3.365  1.53 15.88 |  0.2017 0.879  1.44 | 0.23467 a:0.0292/0.0300
  20 |  0.0086  0.0058  12.3 |  0.0050 |  5.157  3.367  1.53 15.88 |  0.2023 0.805  1.40 | 0.23601 a:0.0293/0.0301
  22 |  0.0079  0.0054  12.4 |  0.0045 |  5.157  3.369  1.53 15.88 |  0.1996 0.872  1.44 | 0.23726 a:0.0295/0.0303
  24 |  0.0074  0.0050  12.2 |  0.0064 |  5.146  3.380  1.52 15.88 |  0.2044 0.879  1.44 | 0.23848 a:0.0296/0.0305
  26 |  0.0068  0.0046  12.4 |  0.0039 |  5.155  3.372  1.53 15.88 |  0.2036 0.884  1.44 | 0.23955 a:0.0297/0.0306
  28 |  0.0063  0.0042  12.3 |  0.0036 |  5.155  3.378  1.53 15.88 |  0.2077 0.841  1.42 | 0.24057 a:0.0299/0.0307
  30 |  0.0058  0.0038  12.3 |  0.0038 |  5.155  3.380  1.53 15.88 |  0.2027 0.911  1.46 | 0.24149 a:0.0300/0.0309
  32 |  0.0055  0.0036  12.2 |  0.0032 |  5.150  3.383  1.52 15.88 |  0.2045 0.807  1.40 | 0.24239 a:0.0301/0.0310
  34 |  0.0054  0.0036  12.3 |  0.0037 |  5.145  3.388  1.52 15.88 |  0.1996 0.875  1.44 | 0.24329 a:0.0302/0.0311
  36 |  0.0049  0.0032  12.3 |  0.0031 |  5.154  3.385  1.52 15.88 |  0.2054 0.828  1.41 | 0.24409 a:0.0303/0.0312
  38 |  0.0046  0.0030  12.3 |  0.0027 |  5.152  3.390  1.52 15.88 |  0.2038 0.847  1.42 | 0.24490 a:0.0304/0.0313
  40 |  0.0044  0.0029  12.3 |  0.0032 |  5.155  3.392  1.52 15.89 |  0.2046 0.855  1.43 | 0.24566 a:0.0305/0.0314
  42 |  0.0043  0.0028  12.3 |  0.0024 |  5.152  3.395  1.52 15.89 |  0.2064 0.905  1.45 | 0.24637 a:0.0305/0.0315
  44 |  0.0042  0.0027  12.3 |  0.0023 |  5.150  3.395  1.52 15.89 |  0.2084 0.844  1.42 | 0.24705 a:0.0306/0.0316
  46 |  0.0039  0.0025  12.3 |  0.0022 |  5.149  3.400  1.51 15.89 |  0.2057 0.868  1.43 | 0.24776 a:0.0307/0.0317
  48 |  0.0037  0.0024  12.3 |  0.0024 |  5.152  3.403  1.51 15.89 |  0.2138 0.831  1.42 | 0.24843 a:0.0308/0.0318
  50 |  0.0038  0.0024  12.3 |  0.0025 |  5.149  3.406  1.51 15.89 |  0.2078 0.810  1.40 | 0.24906 a:0.0309/0.0319
  52 |  0.0034  0.0021  12.3 |  0.0019 |  5.154  3.405  1.51 15.89 |  0.2082 0.872  1.44 | 0.24965 a:0.0309/0.0320
  54 |  0.0033  0.0020  12.2 |  0.0019 |  5.156  3.406  1.51 15.89 |  0.2085 0.894  1.45 | 0.25022 a:0.0310/0.0320
  56 |  0.0033  0.0020  12.4 |  0.0019 |  5.150  3.412  1.51 15.89 |  0.2058 0.866  1.43 | 0.25079 a:0.0311/0.0321
  58 |  0.0031  0.0019  12.3 |  0.0033 |  5.147  3.416  1.51 15.89 |  0.2071 0.774  1.39 | 0.25135 a:0.0311/0.0322
  60 |  0.0030  0.0018  12.4 |  0.0017 |  5.153  3.415  1.51 15.89 |  0.2134 0.840  1.42 | 0.25187 a:0.0312/0.0323
  62 |  0.0030  0.0018  12.3 |  0.0016 |  5.155  3.416  1.51 15.89 |  0.2080 0.764  1.38 | 0.25235 a:0.0313/0.0323
  64 |  0.0028  0.0017  12.2 |  0.0014 |  5.156  3.416  1.51 15.89 |  0.2100 0.666  1.33 | 0.25285 a:0.0313/0.0324
  66 |  0.0028  0.0017  12.3 |  0.0017 |  5.151  3.419  1.51 15.89 |  0.2101 0.865  1.43 | 0.25333 a:0.0314/0.0324
  68 |  0.0026  0.0015  12.3 |  0.0014 |  5.158  3.419  1.51 15.89 |  0.2078 0.838  1.42 | 0.25381 a:0.0314/0.0325
  70 |  0.0025  0.0015  12.4 |  0.0021 |  5.160  3.422  1.51 15.89 |  0.2112 0.806  1.40 | 0.25428 a:0.0315/0.0326
  72 |  0.0026  0.0015  12.2 |  0.0013 |  5.158  3.422  1.51 15.89 |  0.2126 0.835  1.42 | 0.25471 a:0.0316/0.0326
  74 |  0.0024  0.0014  12.2 |  0.0015 |  5.154  3.427  1.50 15.89 |  0.2143 0.838  1.42 | 0.25514 a:0.0316/0.0327
  76 |  0.0024  0.0014  12.2 |  0.0012 |  5.161  3.424  1.51 15.89 |  0.2151 0.847  1.42 | 0.25553 a:0.0317/0.0327
  78 |  0.0023  0.0013  12.3 |  0.0014 |  5.157  3.428  1.50 15.89 |  0.2121 0.686  1.34 | 0.25592 a:0.0317/0.0328
  80 |  0.0024  0.0013  12.3 |  0.0012 |  5.160  3.428  1.51 15.89 |  0.2068 0.824  1.41 | 0.25630 a:0.0317/0.0328
  82 |  0.0027  0.0016  12.2 |  0.0015 |  5.146  3.394  1.52 15.89 |  0.2065 0.899  1.45 | 0.25687 a:0.0318/0.0329
  84 |  0.0022  0.0013  12.2 |  0.0013 |  5.156  3.405  1.51 15.89 |  0.2092 0.875  1.44 | 0.25709 a:0.0319/0.0329
  86 |  0.0022  0.0012  12.3 |  0.0022 |  5.154  3.413  1.51 15.89 |  0.2091 0.835  1.42 | 0.25726 a:0.0319/0.0329
  88 |  0.0021  0.0012  12.3 |  0.0014 |  5.154  3.417  1.51 15.89 |  0.2074 0.840  1.42 | 0.25740 a:0.0319/0.0329
  90 |  0.0022  0.0013  12.3 |  0.0030 |  5.147  3.416  1.51 15.89 |  0.2191 0.848  1.42 | 0.25753 a:0.0319/0.0329
  92 |  0.0021  0.0011  12.3 |  0.0012 |  5.157  3.418  1.51 15.89 |  0.2123 0.775  1.39 | 0.25766 a:0.0319/0.0329
  94 |  0.0021  0.0011  12.3 |  0.0010 |  5.156  3.419  1.51 15.89 |  0.2117 0.710  1.35 | 0.25779 a:0.0319/0.0329
  96 |  0.0020  0.0011  12.3 |  0.0013 |  5.154  3.420  1.51 15.89 |  0.2166 0.940  1.47 | 0.25793 a:0.0319/0.0330
  98 |  0.0019  0.0011  12.2 |  0.0010 |  5.156  3.421  1.51 15.89 |  0.2143 0.762  1.38 | 0.25807 a:0.0320/0.0330
 100 |  0.0020  0.0010  12.3 |  0.0009 |  5.155  3.422  1.51 15.89 |  0.2173 0.642  1.32 | 0.25821 a:0.0320/0.0330
 102 |  0.0020  0.0010  12.2 |  0.0009 |  5.156  3.423  1.51 15.89 |  0.2165 0.868  1.43 | 0.25835 a:0.0320/0.0330
 104 |  0.0019  0.0010  12.4 |  0.0009 |  5.157  3.423  1.51 15.89 |  0.2125 0.788  1.39 | 0.25850 a:0.0320/0.0330
 106 |  0.0019  0.0009  12.3 |  0.0009 |  5.156  3.424  1.51 15.89 |  0.2219 0.666  1.33 | 0.25866 a:0.0320/0.0331
 108 |  0.0019  0.0009  12.3 |  0.0009 |  5.153  3.425  1.50 15.89 |  0.2202 0.671  1.34 | 0.25881 a:0.0321/0.0331
 110 |  0.0020  0.0009  12.3 |  0.0011 |  5.153  3.427  1.50 15.89 |  0.2163 0.726  1.36 | 0.25896 a:0.0321/0.0331
 112 |  0.0019  0.0009  12.4 |  0.0009 |  5.155  3.427  1.50 15.89 |  0.2205 0.837  1.42 | 0.25911 a:0.0321/0.0331
 114 |  0.0019  0.0008  12.3 |  0.0008 |  5.155  3.427  1.50 15.89 |  0.2220 0.803  1.40 | 0.25926 a:0.0321/0.0332
 116 |  0.0018  0.0008  12.3 |  0.0009 |  5.155  3.427  1.50 15.89 |  0.2211 0.852  1.43 | 0.25942 a:0.0321/0.0332
 118 |  0.0019  0.0008  12.3 |  0.0008 |  5.153  3.429  1.50 15.89 |  0.2207 0.694  1.35 | 0.25957 a:0.0321/0.0332
 120 |  0.0018  0.0008  12.2 |  0.0008 |  5.156  3.429  1.50 15.89 |  0.2271 0.664  1.33 | 0.25972 a:0.0322/0.0332
 122 |  0.0018  0.0008  12.3 |  0.0008 |  5.154  3.429  1.50 15.89 |  0.2266 0.658  1.33 | 0.25986 a:0.0322/0.0333
 124 |  0.0017  0.0007  12.3 |  0.0008 |  5.154  3.431  1.50 15.89 |  0.2201 0.771  1.39 | 0.26000 a:0.0322/0.0333
 126 |  0.0017  0.0007  12.4 |  0.0008 |  5.157  3.430  1.50 15.89 |  0.2253 0.862  1.43 | 0.26014 a:0.0322/0.0333
 128 |  0.0018  0.0007  12.3 |  0.0008 |  5.153  3.431  1.50 15.89 |  0.2222 0.638  1.32 | 0.26027 a:0.0322/0.0333
 130 |  0.0018  0.0007  12.3 |  0.0007 |  5.155  3.431  1.50 15.89 |  0.2255 0.786  1.39 | 0.26040 a:0.0322/0.0333
 132 |  0.0018  0.0007  12.3 |  0.0007 |  5.153  3.432  1.50 15.89 |  0.2325 0.778  1.39 | 0.26053 a:0.0323/0.0333
 134 |  0.0017  0.0007  12.3 |  0.0007 |  5.154  3.433  1.50 15.89 |  0.2250 0.876  1.44 | 0.26065 a:0.0323/0.0334
 136 |  0.0017  0.0006  12.3 |  0.0007 |  5.157  3.432  1.50 15.89 |  0.2269 0.866  1.43 | 0.26077 a:0.0323/0.0334
 138 |  0.0017  0.0006  12.3 |  0.0007 |  5.156  3.433  1.50 15.89 |  0.2247 0.760  1.38 | 0.26088 a:0.0323/0.0334
 140 |  0.0016  0.0006  12.3 |  0.0006 |  5.157  3.433  1.50 15.89 |  0.2242 0.725  1.36 | 0.26099 a:0.0323/0.0334
 142 |  0.0016  0.0006  12.3 |  0.0006 |  5.157  3.433  1.50 15.89 |  0.2241 0.909  1.45 | 0.26109 a:0.0323/0.0334
 144 |  0.0017  0.0006  12.3 |  0.0006 |  5.157  3.433  1.50 15.89 |  0.2287 0.815  1.41 | 0.26119 a:0.0323/0.0334
 146 |  0.0016  0.0006  12.3 |  0.0007 |  5.158  3.434  1.50 15.90 |  0.2205 0.722  1.36 | 0.26128 a:0.0324/0.0334
 148 |  0.0016  0.0006  12.3 |  0.0008 |  5.157  3.434  1.50 15.90 |  0.2286 0.691  1.35 | 0.26137 a:0.0324/0.0335
 150 |  0.0016  0.0006  12.3 |  0.0006 |  5.158  3.434  1.50 15.90 |  0.2259 0.845  1.42 | 0.26146 a:0.0324/0.0335
 152 |  0.0017  0.0006  12.3 |  0.0006 |  5.158  3.434  1.50 15.90 |  0.2295 0.757  1.38 | 0.26154 a:0.0324/0.0335
 154 |  0.0016  0.0005  12.3 |  0.0006 |  5.159  3.435  1.50 15.90 |  0.2304 0.751  1.38 | 0.26162 a:0.0324/0.0335
 156 |  0.0018  0.0005  12.3 |  0.0006 |  5.159  3.435  1.50 15.90 |  0.2264 0.796  1.40 | 0.26169 a:0.0324/0.0335
 158 |  0.0017  0.0005  12.3 |  0.0006 |  5.160  3.434  1.50 15.90 |  0.2282 0.788  1.39 | 0.26176 a:0.0324/0.0335
 160 |  0.0017  0.0005  12.3 |  0.0005 |  5.161  3.434  1.50 15.90 |  0.2291 0.766  1.38 | 0.26183 a:0.0324/0.0335
 162 |  0.0016  0.0005  12.3 |  0.0005 |  5.161  3.434  1.50 15.90 |  0.2282 0.716  1.36 | 0.26189 a:0.0324/0.0335
 164 |  0.0016  0.0005  12.3 |  0.0005 |  5.161  3.435  1.50 15.90 |  0.2344 0.792  1.40 | 0.26196 a:0.0324/0.0335
 166 |  0.0016  0.0005  12.3 |  0.0006 |  5.162  3.434  1.50 15.90 |  0.2305 0.707  1.35 | 0.26202 a:0.0324/0.0335
 168 |  0.0016  0.0005  12.3 |  0.0005 |  5.162  3.434  1.50 15.90 |  0.2353 0.816  1.41 | 0.26207 a:0.0324/0.0335
 170 |  0.0016  0.0005  12.3 |  0.0005 |  5.163  3.434  1.50 15.90 |  0.2296 0.756  1.38 | 0.26213 a:0.0325/0.0335
 172 |  0.0018  0.0005  12.4 |  0.0005 |  5.163  3.434  1.50 15.90 |  0.2391 0.742  1.37 | 0.26218 a:0.0325/0.0336
 174 |  0.0016  0.0005  12.4 |  0.0005 |  5.163  3.434  1.50 15.90 |  0.2307 0.863  1.43 | 0.26224 a:0.0325/0.0336
 176 |  0.0016  0.0005  12.3 |  0.0005 |  5.163  3.434  1.50 15.90 |  0.2329 0.854  1.43 | 0.26228 a:0.0325/0.0336
 178 |  0.0017  0.0005  12.3 |  0.0005 |  5.164  3.434  1.50 15.90 |  0.2287 0.803  1.40 | 0.26233 a:0.0325/0.0336
 180 |  0.0017  0.0005  12.3 |  0.0005 |  5.164  3.434  1.50 15.90 |  0.2361 0.819  1.41 | 0.26237 a:0.0325/0.0336
 182 |  0.0018  0.0005  12.3 |  0.0005 |  5.164  3.434  1.50 15.90 |  0.2360 0.729  1.36 | 0.26241 a:0.0325/0.0336
 184 |  0.0016  0.0005  12.3 |  0.0005 |  5.164  3.434  1.50 15.90 |  0.2395 0.774  1.39 | 0.26245 a:0.0325/0.0336
 186 |  0.0015  0.0005  12.3 |  0.0005 |  5.165  3.434  1.50 15.90 |  0.2325 0.777  1.39 | 0.26248 a:0.0325/0.0336
 188 |  0.0018  0.0005  12.3 |  0.0005 |  5.165  3.434  1.50 15.90 |  0.2430 0.669  1.33 | 0.26250 a:0.0325/0.0336
 190 |  0.0016  0.0005  12.3 |  0.0005 |  5.165  3.434  1.50 15.90 |  0.2293 0.781  1.39 | 0.26252 a:0.0325/0.0336
 192 |  0.0017  0.0005  12.3 |  0.0005 |  5.165  3.434  1.50 15.90 |  0.2246 0.763  1.38 | 0.26254 a:0.0325/0.0336
 194 |  0.0019  0.0005  12.3 |  0.0005 |  5.165  3.434  1.50 15.90 |  0.2324 0.764  1.38 | 0.26255 a:0.0325/0.0336
 196 |  0.0016  0.0005  12.2 |  0.0005 |  5.165  3.434  1.50 15.90 |  0.2301 0.868  1.43 | 0.26256 a:0.0325/0.0336
 198 |  0.0016  0.0005  12.3 |  0.0005 |  5.165  3.434  1.50 15.90 |  0.2332 0.696  1.35 | 0.26256 a:0.0325/0.0336
 200 |  0.0017  0.0005  12.3 |  0.0005 |  5.165  3.434  1.50 15.90 |  0.2397 0.808  1.40 | 0.26256 a:0.0325/0.0336

==========================================================================================
FINAL ANALYSIS
==========================================================================================

  PatchSVAE: 16 patches × (256, 16)
  Target CV: 0.125
  Recon MSE: 0.000489 +/- 0.000743
  Row CV: 0.2397
  Cross-attention S delta: 0.26256

  Learned alpha per mode (coordination strength):
    Layer 0: mean=0.0327  max=0.0336  min=0.0323
      α[ 0]: 0.0324  ######################################
      α[ 1]: 0.0323  ######################################
      α[ 2]: 0.0327  ######################################
      α[ 3]: 0.0326  ######################################
      α[ 4]: 0.0325  ######################################
      α[ 5]: 0.0326  ######################################
      α[ 6]: 0.0332  #######################################
      α[ 7]: 0.0336  #######################################
      α[ 8]: 0.0326  ######################################
      α[ 9]: 0.0324  ######################################
      α[10]: 0.0326  ######################################
      α[11]: 0.0325  ######################################
      α[12]: 0.0328  #######################################
      α[13]: 0.0324  ######################################
      α[14]: 0.0331  #######################################
      α[15]: 0.0326  ######################################
    Layer 1: mean=0.0323  max=0.0327  min=0.0315
      α[ 0]: 0.0324  #######################################
      α[ 1]: 0.0326  #######################################
      α[ 2]: 0.0323  #######################################
      α[ 3]: 0.0326  #######################################
      α[ 4]: 0.0327  #######################################
      α[ 5]: 0.0326  #######################################
      α[ 6]: 0.0320  #######################################
      α[ 7]: 0.0315  ######################################
      α[ 8]: 0.0324  #######################################
      α[ 9]: 0.0327  #######################################
      α[10]: 0.0325  #######################################
      α[11]: 0.0324  #######################################
      α[12]: 0.0321  #######################################
      α[13]: 0.0324  #######################################
      α[14]: 0.0317  ######################################
      α[15]: 0.0322  #######################################

  Coordinated singular value profile:
    S[ 0]:   5.1650  cum=  9.2%  #############################
    S[ 1]:   4.9525  cum= 17.6%  ############################
    S[ 2]:   4.8142  cum= 25.6%  ###########################
    S[ 3]:   4.6335  cum= 32.9%  ##########################
    S[ 4]:   4.5199  cum= 40.0%  ##########################
    S[ 5]:   4.4203  cum= 46.7%  #########################
    S[ 6]:   4.3376  cum= 53.2%  #########################
    S[ 7]:   4.2448  cum= 59.3%  ########################
    S[ 8]:   4.1641  cum= 65.3%  ########################
    S[ 9]:   4.0915  cum= 71.1%  #######################
    S[10]:   4.0086  cum= 76.6%  #######################
    S[11]:   3.9144  cum= 81.9%  ######################
    S[12]:   3.7995  cum= 86.8%  ######################
    S[13]:   3.6926  cum= 91.5%  #####################
    S[14]:   3.5949  cum= 95.9%  ####################
    S[15]:   3.4336  cum=100.0%  ###################

  Saving reconstruction grid...
  Saved to /content/svae_patch_recon.png
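
For concreteness, the shapes in the config at the top of this log: a 64x64 image splits into sixteen 16x16 patches, and each patch is encoded to a (256, 16) latent matrix whose rows are normalized onto the unit sphere S^15. A minimal sketch of that layout (the encoder itself, hidden=768 with depth-4 residual blocks, is omitted; the random `z` below just stands in for its output):

```python
import torch
import torch.nn.functional as F

def patchify(imgs: torch.Tensor, p: int = 16) -> torch.Tensor:
    """(B, 3, 64, 64) -> (B, 16, 3*p*p): sixteen non-overlapping 16x16 patches."""
    B, C, H, W = imgs.shape
    x = imgs.unfold(2, p, p).unfold(3, p, p)      # (B, C, H/p, W/p, p, p)
    return x.permute(0, 2, 3, 1, 4, 5).reshape(B, (H // p) * (W // p), C * p * p)

def rows_to_sphere(z: torch.Tensor) -> torch.Tensor:
    """Project each row of the latent matrix onto S^(D-1)."""
    return F.normalize(z, dim=-1)

patches = patchify(torch.randn(8, 3, 64, 64))     # (8, 16, 768)
z = rows_to_sphere(torch.randn(8, 16, 256, 16))   # per-patch (256, 16), rows on S^15
```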

Prototype V10.3 - Patch16 - The ViT size.

It might need some tweaks, but I don't think so. We're approaching actual ViT prototype accuracy now.

Let's see how the SVAE performs.

Prototype V10.2 - Patch32 - Patchwork Cross-Attention with Edge Smoothing

This eliminates the edge cutting of the last version, and in the process the recon accuracy has gone up.

The model still escapes the discharge within 2 epochs and has robust recon.

Defeated the last version.
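
The exact smoothing method isn't stated in these notes; one standard way to eliminate edge cutting is to decode overlapping patches and blend them with a smooth window (overlap-add), which is what the sketch below assumes. The names here are hypothetical:

```python
import torch

def hann2d(p: int) -> torch.Tensor:
    """Separable 2D Hann window, used to feather patch edges."""
    w = torch.hann_window(p, periodic=False)
    return w[:, None] * w[None, :]

def blend_patches(patches: torch.Tensor, positions, out_hw: tuple, p: int) -> torch.Tensor:
    """Overlap-add decoded patches, then renormalize by the accumulated weight.
    patches: (N, C, p, p); positions: iterable of (y, x) top-left corners."""
    C = patches.shape[1]
    out = torch.zeros(C, *out_hw)
    weight = torch.zeros(1, *out_hw)
    win = hann2d(p)
    for patch, (y, x) in zip(patches, positions):
        out[:, y:y + p, x:x + p] += patch * win
        weight[:, y:y + p, x:x + p] += win
    return out / weight.clamp_min(1e-8)
```

With ~50% patch overlap the windows sum to a roughly constant weight in the interior, so seams between patches average out instead of cutting hard edges.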

Prototype V10.1 Patchwork Cross-Attention - Stabilized

The patchwork has stabilized, and the output is more accurate than the original now that it supports SVD 32 at higher speed.

Epoch 28 hit the unstable point, but gradient-clipped attention was the ticket that ensured solidity.

The discharge recovered immediately.

Give or take 97% accurate recall; let's get those numbers up before we move on to more powerful image sets. Roughly 28M params.

  ep |    loss   recon  t/ep |   t_rec |     S0     SD ratio erank |  row_cv  prox    rw | S_delta
-----------------------------------------------------------------------------------------------
 174 |  0.0471  0.0315   8.0 |  0.0318 |  4.038  2.092  1.93 31.52 |  0.1271 0.995  1.50 | 0.26864 a:0.0471/0.0476
 176 |  0.0471  0.0314   7.9 |  0.0318 |  4.038  2.092  1.93 31.52 |  0.1343 1.000  1.50 | 0.26874 a:0.0471/0.0476
 178 |  0.0471  0.0314   7.9 |  0.0318 |  4.038  2.093  1.93 31.52 |  0.1313 0.995  1.50 | 0.26883 a:0.0471/0.0477
 180 |  0.0471  0.0314   8.0 |  0.0317 |  4.038  2.093  1.93 31.52 |  0.1312 1.000  1.50 | 0.26892 a:0.0471/0.0477
 182 |  0.0471  0.0314   7.9 |  0.0317 |  4.038  2.093  1.93 31.52 |  0.1310 0.995  1.50 | 0.26899 a:0.0471/0.0477
 184 |  0.0471  0.0314   7.9 |  0.0317 |  4.038  2.092  1.93 31.52 |  0.1350 0.993  1.50 | 0.26906 a:0.0472/0.0477
 186 |  0.0470  0.0314   7.9 |  0.0317 |  4.038  2.092  1.93 31.52 |  0.1338 1.000  1.50 | 0.26911 a:0.0472/0.0477
 188 |  0.0470  0.0314   8.0 |  0.0317 |  4.038  2.093  1.93 31.52 |  0.1305 0.999  1.50 | 0.26916 a:0.0472/0.0477
 190 |  0.0470  0.0314   8.0 |  0.0317 |  4.038  2.093  1.93 31.52 |  0.1358 0.999  1.50 | 0.26919 a:0.0472/0.0477
 192 |  0.0470  0.0314   8.0 |  0.0317 |  4.038  2.092  1.93 31.52 |  0.1354 0.999  1.50 | 0.26922 a:0.0472/0.0477
 194 |  0.0470  0.0314   7.9 |  0.0317 |  4.038  2.093  1.93 31.52 |  0.1301 0.992  1.50 | 0.26923 a:0.0472/0.0477
 196 |  0.0470  0.0314   7.9 |  0.0317 |  4.038  2.093  1.93 31.52 |  0.1330 0.998  1.50 | 0.26924 a:0.0472/0.0477
 198 |  0.0470  0.0314   7.9 |  0.0317 |  4.038  2.093  1.93 31.52 |  0.1312 1.000  1.50 | 0.26925 a:0.0472/0.0477
 200 |  0.0470  0.0314   7.9 |  0.0317 |  4.038  2.093  1.93 31.52 |  0.1300 1.000  1.50 | 0.26925 a:0.0472/0.0477

==========================================================================================
FINAL ANALYSIS
==========================================================================================

  PatchSVAE: 4 patches × (256, 32)
  Target CV: 0.125
  Recon MSE: 0.031701 +/- 0.024789
  Row CV: 0.1300
  Cross-attention S delta: 0.26925

  Learned alpha per mode (coordination strength):
    Layer 0: mean=0.0471  max=0.0477  min=0.0466
      α[ 0]: 0.0470  #######################################
      α[ 1]: 0.0473  #######################################
      α[ 2]: 0.0474  #######################################
      α[ 3]: 0.0473  #######################################
      α[ 4]: 0.0471  #######################################
      α[ 5]: 0.0474  #######################################
      α[ 6]: 0.0469  #######################################
      α[ 7]: 0.0472  #######################################
      α[ 8]: 0.0470  #######################################
      α[ 9]: 0.0475  #######################################
      α[10]: 0.0467  #######################################
      α[11]: 0.0471  #######################################
      α[12]: 0.0477  #######################################
      α[13]: 0.0466  #######################################
      α[14]: 0.0471  #######################################
      α[15]: 0.0472  #######################################
      α[16]: 0.0472  #######################################
      α[17]: 0.0471  #######################################
      α[18]: 0.0470  #######################################
      α[19]: 0.0475  #######################################
      α[20]: 0.0466  #######################################
      α[21]: 0.0477  #######################################
      α[22]: 0.0470  #######################################
      α[23]: 0.0469  #######################################
      α[24]: 0.0472  #######################################
      α[25]: 0.0472  #######################################
      α[26]: 0.0471  #######################################
      α[27]: 0.0471  #######################################
      α[28]: 0.0474  #######################################
      α[29]: 0.0472  #######################################
      α[30]: 0.0466  #######################################
      α[31]: 0.0475  #######################################
    Layer 1: mean=0.0472  max=0.0477  min=0.0466
      α[ 0]: 0.0474  #######################################
      α[ 1]: 0.0472  #######################################
      α[ 2]: 0.0470  #######################################
      α[ 3]: 0.0471  #######################################
      α[ 4]: 0.0474  #######################################
      α[ 5]: 0.0470  #######################################
      α[ 6]: 0.0474  #######################################
      α[ 7]: 0.0473  #######################################
      α[ 8]: 0.0473  #######################################
      α[ 9]: 0.0470  #######################################
      α[10]: 0.0477  #######################################
      α[11]: 0.0472  #######################################
      α[12]: 0.0466  #######################################
      α[13]: 0.0477  #######################################
      α[14]: 0.0473  #######################################
      α[15]: 0.0471  #######################################
      α[16]: 0.0472  #######################################
      α[17]: 0.0471  #######################################
      α[18]: 0.0476  #######################################
      α[19]: 0.0470  #######################################
      α[20]: 0.0475  #######################################
      α[21]: 0.0470  #######################################
      α[22]: 0.0472  #######################################
      α[23]: 0.0475  #######################################
      α[24]: 0.0472  #######################################
      α[25]: 0.0471  #######################################
      α[26]: 0.0475  #######################################
      α[27]: 0.0474  #######################################
      α[28]: 0.0472  #######################################
      α[29]: 0.0469  #######################################
      α[30]: 0.0476  #######################################
      α[31]: 0.0466  #######################################

  Coordinated singular value profile:
    S[ 0]:   4.0376  cum=  5.3%  #############################
    S[ 1]:   3.9321  cum= 10.3%  #############################
    S[ 2]:   3.8501  cum= 15.1%  ############################
    S[ 3]:   3.7785  cum= 19.8%  ############################
    S[ 4]:   3.7092  cum= 24.2%  ###########################
    S[ 5]:   3.6414  cum= 28.5%  ###########################
    S[ 6]:   3.5771  cum= 32.7%  ##########################
    S[ 7]:   3.5158  cum= 36.7%  ##########################
    S[ 8]:   3.4554  cum= 40.6%  #########################
    S[ 9]:   3.3961  cum= 44.3%  #########################
    S[10]:   3.3371  cum= 48.0%  ########################
    S[11]:   3.2788  cum= 51.5%  ########################
    S[12]:   3.2230  cum= 54.8%  #######################
    S[13]:   3.1681  cum= 58.1%  #######################
    S[14]:   3.1141  cum= 61.2%  #######################
    S[15]:   3.0607  cum= 64.3%  ######################
    S[16]:   3.0088  cum= 67.2%  ######################
    S[17]:   2.9568  cum= 70.1%  #####################
    S[18]:   2.9075  cum= 72.8%  #####################
    S[19]:   2.8572  cum= 75.5%  #####################
    S[20]:   2.8067  cum= 78.0%  ####################
    S[21]:   2.7584  cum= 80.5%  ####################
    S[22]:   2.7075  cum= 82.9%  ####################
    S[23]:   2.6574  cum= 85.2%  ###################
    S[24]:   2.6060  cum= 87.4%  ###################
    S[25]:   2.5535  cum= 89.5%  ##################
    S[26]:   2.4991  cum= 91.5%  ##################
    S[27]:   2.4413  cum= 93.5%  ##################
    S[28]:   2.3770  cum= 95.3%  #################
    S[29]:   2.2906  cum= 97.0%  #################
    S[30]:   2.2012  cum= 98.6%  ################
    S[31]:   2.0926  cum=100.0%  ###############

  Saving reconstruction grid...
  Saved to /content/svae_patch_recon.png

Prototype V10 Patchwork Cross-Attention - Unstable

Tiny ImageNet can't draw enough information from a single monolithic MLP projection, so I'm breaking the structure into quadrant-based MLP patches with cross-attention for a prototype.

Each patch is 32x32, each independently represented with SVD 24 and tied together with patchwork cross-attention. Similar to a ViT, so I'm building it toward a full ViT structure over time to ensure solidity and solidarity.

Current proto is more stable but requires a bit more oomph.

The CV is enjoying its drift a BIT too much.

I'll try attention alpha rather than rigid alpha. 4 patches is a bit unstable, so let's get some stability.
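
For reference, the coordination mechanism as the logs describe it: each patch contributes an S vector (its singular-value profile), cross-attention mixes them across patches, and a learned per-mode alpha controls how much of the mixed update each mode accepts. A minimal sketch; the logged layers total only ~2.3k params, so the real module is considerably leaner than a full MultiheadAttention, and the gating details here are assumptions:

```python
import torch
import torch.nn as nn

class PatchCrossAttention(nn.Module):
    """Coordinate per-patch S vectors with a learned per-mode alpha gate."""
    def __init__(self, d_modes: int, n_heads: int = 1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_modes, n_heads, batch_first=True)
        # One learned coordination strength per singular-value mode.
        self.alpha = nn.Parameter(torch.full((d_modes,), 0.05))

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        # s: (B, n_patches, d_modes) -- one singular-value vector per patch
        mixed, _ = self.attn(s, s, s)
        return s + self.alpha * mixed    # alpha-gated residual update

layer = PatchCrossAttention(d_modes=24)  # V10: SVD 24 per patch
s = torch.rand(8, 4, 24)                 # 4 quadrant patches
print(layer(s).shape)                    # torch.Size([8, 4, 24])
```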

Prototype V9 prod

Should run on Colab. Install the necessary repos.

https://huggingface.co/AbstractPhil/geolip-SVAE/blob/main/prototype_v9_prod.py

Prototype V8 Soft Hand Loss

Stable prototype found. Scaling with the CV ratio within this band acts as a stable attractor for the structural response.

The soft hand loss is acting like a stable attractor. Correct utilization of this behavior can directly attune a model's structural internals to align to certain trajectory-based routes.

The alignment can be directly tuned at runtime, shifted to learn implicit rules, altered to teach specific behaviors, and more.

0.034 MSE, which is a different gauge of loss entirely.
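
The exact functional form of the soft hand isn't spelled out in these notes. A plausible reading of the logged config ("boost=1.5x near CV=0.125, penalty=0.3 far") is a smooth multiplier on the structural loss term that rises toward the boost inside a band around the target CV and decays toward the penalty outside it. A minimal sketch, with the band width and blending curve as assumptions:

```python
import torch

def coeff_var(s: torch.Tensor) -> torch.Tensor:
    """Coefficient of variation of a singular-value vector: std / mean."""
    return s.std() / s.mean().clamp_min(1e-8)

def soft_hand_weight(cv: torch.Tensor, target: float = 0.125,
                     band: float = 0.05, boost: float = 1.5,
                     penalty: float = 0.3) -> torch.Tensor:
    """Blend smoothly between a boost near the target CV and a penalty far away."""
    near = torch.exp(-((cv - target) / band) ** 2)  # ~1 near target, ~0 far
    return penalty + (boost - penalty) * near

# Applied as a multiplier on the structural term (assumed), e.g.:
#   loss = recon + soft_hand_weight(coeff_var(S)) * structure_loss
# versus the rigid CV penalty of the earlier runs:
#   loss = recon + 0.1 * (coeff_var(S) - target) ** 2
```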

Prototype V7

Normalized spherical without magnitude; expected to be considerably faster, with less accuracy in the early stages.

What happens if you train with the wrong CV value?

Using geolip-core SVD (Gram + eigh)
SVAE - V=96, D=24 (Validated: CV=0.3668)
  Matrix: (96, 24) = 2304 elements
  SVD: geolip-core Gram+eigh
  Losses: recon + CV(w=0.1, target=0.3668)
  Params: 6,036,736
=====================================================================================
 ep |    loss   recon    cv_l  t/ep |   t_rec |     S0     SD ratio erank |  row_cv
-------------------------------------------------------------------------------------
  1 |  0.4174  0.4169  0.0037   7.3 |  0.2843 |   5.39  1.977  2.73 23.15 |  0.3039
  2 |  0.2492  0.2489  0.0031   7.3 |  0.2286 |   5.43  1.978  2.75 23.14 |  0.3148
  3 |  0.2096  0.2093  0.0030   7.3 |  0.1946 |   5.50  1.982  2.77 23.13 |  0.3352
  4 |  0.1858  0.1855  0.0021   7.2 |  0.1812 |   5.48  1.980  2.77 23.13 |  0.3460
  6 |  0.1586  0.1581  0.0046   7.3 |  0.1541 |   5.31  1.873  2.83 23.09 |  0.3938
  8 |  0.1419  0.1407  0.0096   7.3 |  0.1377 |   5.33  1.815  2.93 23.03 |  0.4565
 10 |  0.1314  0.1283  0.0385   7.3 |  0.1279 |   5.42  1.778  3.05 22.97 |  0.5373
 12 |  0.1226  0.1160  0.0599   7.2 |  0.1162 |   5.67  1.738  3.26 22.86 |  0.6060
 14 |  0.1189  0.1087  0.0847   7.1 |  0.1109 |   5.78  1.705  3.39 22.79 |  0.6643
 16 |  0.1175  0.1014  0.1935   7.1 |  0.0996 |   6.17  1.701  3.63 22.67 |  0.7598
 18 |  0.1170  0.0952  0.2238   7.2 |  0.0974 |   6.50  1.671  3.89 22.52 |  0.8211
 20 |  0.1173  0.0905  0.1539   7.2 |  0.0907 |   6.69  1.649  4.06 22.43 |  0.8383
 22 |  0.1200  0.0852  0.3335   7.1 |  0.0903 |   7.11  1.655  4.30 22.30 |  0.9128
 24 |  0.1233  0.0817  0.2770   7.2 |  0.0831 |   7.51  1.646  4.56 22.15 |  0.9654
 26 |  0.1286  0.0785  0.3243   7.1 |  0.0778 |   7.71  1.646  4.68 22.09 |  1.0196
 28 |  0.1328  0.0752  0.4244   7.2 |  0.0780 |   7.84  1.636  4.80 22.02 |  1.1002
 30 |  0.1373  0.0726  0.8786   7.1 |  0.0752 |   8.24  1.631  5.05 21.87 |  1.1243
 32 |  0.1437  0.0703  0.6946   7.2 |  0.0704 |   8.52  1.631  5.23 21.76 |  1.2061
 34 |  0.6025  0.6020  0.0062   7.1 |  0.5194 |  28.25 10.261  2.75 23.14 |  0.2935
 36 |  0.4995  0.4990  0.0062   7.2 |  0.4949 |  29.82 10.939  2.73 23.15 |  0.2982
 38 |  0.4947  0.4942  0.0058   7.2 |  0.4915 |  28.37 10.433  2.72 23.15 |  0.2988
 40 |  0.4579  0.4574  0.0053   7.2 |  0.4557 |  26.31  9.585  2.74 23.14 |  0.3041
 42 |  0.4333  0.4328  0.0051   7.1 |  0.4259 |  22.03  7.996  2.75 23.14 |  0.2984
 44 |  0.4057  0.4054  0.0038   7.1 |  0.3880 |  21.15  7.656  2.76 23.15 |  0.3177
 46 |  0.3670  0.3667  0.0024   7.2 |  0.3634 |  19.33  6.943  2.78 23.13 |  0.3280
 48 |  0.3495  0.3493  0.0005   7.1 |  0.3457 |  18.34  6.569  2.79 23.13 |  0.3336
 50 |  0.3341  0.3340  0.0010   7.3 |  0.3326 |  17.55  6.298  2.79 23.13 |  0.3424
 52 |  0.3205  0.3204  0.0003   7.2 |  0.3182 |  16.94  6.069  2.79 23.12 |  0.3549

SNAP. Right there at epoch 34. The tension was too strong and the model simply snapped. I had the target set to around 0.366, but the model requires the value it snapped to: 0.2935.

The bulk embedding tests show the actual value: CV=0.2992 is the stable attractor, almost precisely where the model snapped to.

The effect was so strong that the entire model underwent a forced reset when it hit the fundamental invalidity.

Why? I don't know yet.
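
As an aside on the "Gram + eigh" SVD named in the config: for a tall matrix A of shape (V, D) with V >= D, the eigenvalues of the D x D Gram matrix A^T A are the squared singular values of A, and a symmetric eigendecomposition of that small matrix is much cheaper than a full SVD. A minimal fp64 version of the trick (the geolip-core internals may differ):

```python
import torch

def gram_eigh_singular_values(a: torch.Tensor) -> torch.Tensor:
    """Singular values of a (..., V, D) matrix via eigh on the D x D Gram matrix.
    eigvalsh returns ascending eigenvalues, so flip to the usual SVD order."""
    a64 = a.double()                                # fp64 for stability
    gram = a64.transpose(-1, -2) @ a64              # (..., D, D), symmetric PSD
    evals = torch.linalg.eigvalsh(gram)             # ascending, may dip below 0
    return evals.clamp_min(0).sqrt().flip(-1).to(a.dtype)

a = torch.randn(256, 16)
print(torch.allclose(gram_eigh_singular_values(a),
                     torch.linalg.svdvals(a), atol=1e-3))  # expect: True
```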

Models

V4 5M SVD+EIGH 100 epochs 48x24

V4 111M 200x24 SVD+EIGH KL_DIV - Undercooked, needs more epochs -> sequel faulty, collapse

V3 v1024 - SVD 24

V2 16 modes

V1 8 modes
