Auto-PreTrain (auto-pretrain)

FlameF0X

updated a Space 2 days ago

Auto PreTrain

🚂

FlameF0X

updated a model 3 days ago

Auto-PreTrain/APT-GPT

Text Generation • 16.1M • Updated 3 days ago • 13

FlameF0X

published a model 3 days ago

Auto-PreTrain/APT-GPT

Text Generation • 16.1M • Updated 3 days ago • 13

FlameF0X

posted an update 5 months ago

Post

4089

I am very sad to say that the budget for creating the SnowflakeCore-G1 1B and 7B MoE models has run out, and I can't pre-train them anymore.

7 replies

FlameF0X

posted an update 6 months ago

Post

535

The training for SnowflakeCore-G1-1B and 7B will be restarted because I have now implemented DeepSpeed and managed to use two GPUs.

FlameF0X

posted an update 6 months ago

Post

273

The development of SnowflakeCore-G1-7B-MoE is getting delayed. In the meantime, I am working on SnowflakeCore-G1-1B-MoE, which will be a pre-trained chatbot.

1 reply

FlameF0X

posted an update 6 months ago

Post

2956

An update on the development of SnowflakeCore-G1-7B-MoE: I can't say yet when it will be published, because it's big and requires a lot of computational power.

1 reply

FlameF0X

posted an update 6 months ago

Post

291

I just finished the benchmarks for https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny and https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny2 in comparison with openai-community/gpt2.

FlameF0X

posted an update 6 months ago

Post

313

Hello! Important announcement: I will rename SnowflakeCore-G1-Medium to SnowflakeCore-G1-Tiny2 because it will have the same number of parameters as the Tiny version, but it is trained on more data.

1 reply

FlameF0X

posted an update 6 months ago

Post

745

Currently working on SnowflakeCore-G1-Medium. [Updated loss curve]

3 replies

FlameF0X

posted an update 6 months ago

Post

155

Hello there world! I am happy to announce that you can now fine-tune https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny ; the code for that is in the model card.

I also lost the training log 😐

FlameF0X

posted an update 6 months ago

Post

1206

Hello! I am sad to say that fine-tuning https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny is complicated, and the instruct version will have to wait for some time.

2 replies

FlameF0X

posted an update 7 months ago

Post

229

SnowflakeCore-G1-Tiny has landed on Hugging Face! 🚀 Give it a try and let me know what you think: https://huggingface.co/FlameF0X/SnowflakeCore-G1-Tiny.

FlameF0X

posted an update 7 months ago

Post

256

SnowflakeCore-G1 Update:
Got it running and training! Context window is currently set to 2048 tokens.
Training is active and stable. Will share results once I have some metrics to report.

2 replies

FlameF0X

posted an update 7 months ago

Post

1937

SnowflakeCore-G1 development update: We're building a 24-layer transformer with 32K context and 1024 embedding dimensions - pretty ambitious! Even running at batch_size=1 with heavy gradient accumulation, we're hitting memory walls at 300GB RAM. Scaling up to ~1TB will take some time, but the architecture is looking promising. Thanks for following along with the journey! 😅
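
As a rough illustration of why a configuration like this can hit memory limits even at batch_size=1, here is a back-of-the-envelope sketch. Only the 24 layers, 1024 embedding dimensions, and 32K context come from the post. Everything else is an assumption made for the example: a 4x feed-forward width, 16 attention heads, fp32 values, 32K read as 32 * 1024 tokens, and naive attention that keeps the full score matrix of every layer for the backward pass. The function names are hypothetical.

```python
# Rough sizing sketch; not the author's code.
# Assumed (not stated in the post): 4x FFN width, 16 heads, fp32 values,
# naive attention that materializes the full seq_len x seq_len score matrix.

def non_embedding_params(n_layers: int, d_model: int) -> int:
    """Approximate weights: per layer 4*d^2 (attention) + 8*d^2 (4x FFN)."""
    return 12 * n_layers * d_model ** 2


def attention_score_bytes(seq_len: int, n_heads: int, bytes_per_value: int = 4) -> int:
    """Bytes for one layer's score matrices across all heads at batch size 1."""
    return seq_len * seq_len * n_heads * bytes_per_value


GIB = 2 ** 30
N_LAYERS, D_MODEL, SEQ_LEN, N_HEADS = 24, 1024, 32 * 1024, 16

params = non_embedding_params(N_LAYERS, D_MODEL)
per_layer = attention_score_bytes(SEQ_LEN, N_HEADS)

print(f"non-embedding parameters: {params:,}")
print(f"attention scores per layer: {per_layer / GIB:.0f} GiB")
print(f"kept for backward over all layers: {N_LAYERS * per_layer / GIB:.0f} GiB")
```

Under these assumptions the weights are small (about 0.3B parameters), while the attention score matrices alone reach 64 GiB per layer, so the sequence length rather than the model size dominates memory. Activation checkpointing or a memory-efficient attention kernel would change these numbers substantially.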

1 reply

FlameF0X

posted an update 7 months ago

Post

1152

Hello there!
I just found out that all the models in the SnowflakeCore-G0 series are masked language models instead of LLMs.
The development of SnowflakeCore-G0-Release-3 will be delayed even more.

Edit: I am officially ending the development of SnowflakeCore-G0 and starting the development of SnowflakeCore-G1, which SHOULD be the text generator.

Edit-2: After some evaluation of the code, the models are actually text generators. So the development of G0 will continue.
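
For readers unfamiliar with the distinction at issue here, one common way to tell the two model types apart is the attention mask. A text generator (causal LM) lets each token attend only to itself and earlier tokens, while a masked language model attends in both directions. The sketch below is a generic illustration with hypothetical helper names, not code from the SnowflakeCore repositories.

```python
# Generic illustration of causal vs. bidirectional attention masks.
# Helper names are hypothetical; True means "position i may attend to position j".

def causal_mask(n: int) -> list[list[bool]]:
    """Causal LM: token i sees positions 0..i only (lower-triangular mask)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n: int) -> list[list[bool]]:
    """Masked LM: every token sees every position in the sequence."""
    return [[True] * n for _ in range(n)]


def show(mask: list[list[bool]]) -> str:
    """Render a mask as rows of 'x' (visible) and '.' (hidden)."""
    return "\n".join("".join("x" if ok else "." for ok in row) for row in mask)


print("causal:")
print(show(causal_mask(4)))
print("bidirectional:")
print(show(bidirectional_mask(4)))
```

Checking whether a model's forward pass applies a lower-triangular mask like this is one quick way to confirm it can be used for left-to-right generation.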

FlameF0X

posted an update 7 months ago

Post

1375

Hi everyone!
The release of https://huggingface.co/FlameF0X/SnowflakeCore-G0-Release-3-1B is delayed due to hardware limitations: I currently lack the compute resources needed to complete training. I'm exploring options and will keep you updated on any progress.
Thank you for your patience and support!

FlameF0X

posted an update 7 months ago

Post

212

Hi there! I'm currently developing https://huggingface.co/FlameF0X/SnowflakeCore-G0-Release-3-1B and have estimated the model size at approximately 1.06B parameters. However, this value is only an estimate, and I don't have the exact parameter count.

FlameF0X

posted an update 7 months ago

Post

1980

I released a short guide on how to build your own LM architecture, called [LM-From-Scratch](https://github.com/FlameF0X/LM-From-Scratch).

Team members 1
