tangledgroup/tangled-alpha-0.1-core
Text Generation
Transformers
20 datasets
107 languages
chat
core
base
instruct
reason
conversational
License: mit
Commit History: tangled-alpha-0.1-core/scripts (branch: main)
reqs (9b182f7), Marko Tasic, committed on Feb 24
grokadamw.GrokAdamW (da80ae1), mtasic85, committed on Feb 24
grokadamw.GrokAdamW (1386dd6), mtasic85, committed on Feb 24
global_batch_size: 256; micro_batch_size: 2 (afa8f6e), mtasic85, committed on Feb 24
global_batch_size: 256; micro_batch_size: 4 (bee417a), mtasic85, committed on Feb 24
micro_batch_size: 3 (0f5ef2e), mtasic85, committed on Feb 24
micro_batch_size: 4 (1c9b116), mtasic85, committed on Feb 24
micro_batch_size: 1 (056e2c6), mtasic85, committed on Feb 24
class_path: torch.optim.AdamW (fcc1668), mtasic85, committed on Feb 24
micro_batch_size: 2 (578a7be), mtasic85, committed on Feb 24
class_path: bitsandbytes.optim.AdamW8bit (ed2b433), mtasic85, committed on Feb 24
class_path: bitsandbytes.optim.PagedAdamW8bit (14f2503), mtasic85, committed on Feb 24
micro_batch_size: 4 (50de401), mtasic85, committed on Feb 24
class_path: torchao.prototype.low_bit_optim.AdamW8bit (dfc4418), mtasic85, committed on Feb 22
class_path: torchao.prototype.low_bit_optim.AdamW4bit (4afec17), mtasic85, committed on Feb 22
class_path: torchao.prototype.low_bit_optim.AdamW8bit (779fd25), mtasic85, committed on Feb 22
torchao (fa76479), mtasic85, committed on Feb 22
max_seq_length: 8192 (b68070c), mtasic85, committed on Feb 22
pretrain model (193a28c), mtasic85, committed on Feb 22