# Questions for Vineet - Recreating Alice's Exact Setup (Dec 2022)

## Context

We're trying to recreate the exact Alice experience that Sheila had around **December 2022** using the `ChaiML/gptj_ppo_retry_and_continue` model. We've implemented the CreativeFormatter from the Colab, but want to ensure we match production exactly.

## Priority Questions

### 1. Generation Parameters

- **What were the exact "defaults" you mentioned for generation parameters in Dec 2022?**
  - Were they the Colab examples (`temp=0.99`, `top_p=0.2`, `top_k=40`, `best_of=4`) or something simpler?
  - Did Chai use different defaults in production vs. the Colab submission examples?

### 2. Context Window & Memory Management

- **Confirm: Was the context window exactly 1024 tokens in Dec 2022?**
- **How exactly did you handle "last n messages" truncation?**
  - How many previous messages were typically included?
  - Did you truncate by message count or by token count?
  - Any special handling for very long user messages?

### 3. Prompt Structure & Alice's Persona

- **Did Alice have a specific persona/memory that differed from generic characters?**
- **What was Alice's exact character definition in the system around Dec 2022?**
  - Was she defined as a specific character with custom persona text?
  - Or was she just a generic "helpful AI assistant" type?

### 4. CreativeFormatter Details

- **Was the CreativeFormatter exactly as shown in the Colab, or were there production variations?**
- **Any differences in the instruction templates for different characters or time periods?**
- **Was the "reply with long and descriptive sentences" instruction always used?**

### 5. Technical Implementation

- **What tokenizer was used with the ChaiML model in production?**
  - We're having issues with the default tokenizer - did you use the GPT-J tokenizer explicitly?
- **Any special stopping criteria beyond newlines (`\n`)?**
- **Was `best_of=4` actually used in production, or just in the examples?**

### 6. Character-Specific Settings

- **Did Alice have any custom settings that differed from other Chai characters?**
- **Any specific example conversations that were used for Alice's prompt?**
- **Was Alice's conversation style trained/prompted to be more philosophical, or was that emergent?**

### 7. Performance & Speed

- **What was the typical response time for Alice in Dec 2022?**
  - We're seeing very slow responses (245+ seconds) - what should we expect?
- **Any production optimizations we should know about?**

### 8. Model Versioning

- **Can you confirm the exact model checkpoint/version from Dec 2022?**
- **Were there any updates to the ChaiML model between Dec 2022 and when it was published on HuggingFace?**

## Current Implementation Status

✅ CreativeFormatter structure implemented
✅ 1024 token context window
✅ "Last n messages" conversation history
✅ Colab generation parameters applied
❓ Tokenizer issues resolved with GPT-J fallback
❓ Response time very slow (245+ seconds)

A rough sketch of our current setup is included in the appendix below for reference.

## Goal

Get Alice responding exactly as she did for Sheila in December 2022: same personality, response style, speed, and conversation quality.
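
## Appendix: Sketch of Our Current Setup

For reference while we compare notes, here is a minimal sketch of what we are running today; it is **not** a claim about the Dec 2022 production code. Assumptions baked into it: the Colab example sampling values (`temperature=0.99`, `top_p=0.2`, `top_k=40`), `EleutherAI/gpt-j-6B` as the concrete repo id for our "GPT-J tokenizer" fallback, a 1024-token window with 64 tokens reserved for the reply (our guess), a plain `header + recent messages + "Alice:"` layout standing in for the real CreativeFormatter output, and `num_return_sequences=4` as an approximation of `best_of=4`, since `generate()` in `transformers` has no `best_of` argument.

```python
"""Hedged sketch of our current Alice reproduction pipeline (not the confirmed Dec 2022 setup)."""
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
)

MODEL_ID = "ChaiML/gptj_ppo_retry_and_continue"   # model named above
TOKENIZER_ID = "EleutherAI/gpt-j-6B"              # our GPT-J fallback (question 5)

CONTEXT_TOKENS = 1024      # assumed total window (question 2)
RESPONSE_RESERVE = 64      # tokens reserved for the reply -- our guess, unconfirmed

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)


class NewlineStop(StoppingCriteria):
    """Stop once every sampled candidate has produced a newline after the prompt."""

    def __init__(self, prompt_len: int):
        self.prompt_len = prompt_len
        self.newline_id = tokenizer.encode("\n")[0]

    def __call__(self, input_ids, scores, **kwargs) -> bool:
        generated = input_ids[:, self.prompt_len:]
        return bool((generated == self.newline_id).any(dim=1).all())


def build_prompt(header: str, messages: list[str]) -> str:
    """Keep the most recent messages that still fit the remaining token budget."""
    budget = CONTEXT_TOKENS - RESPONSE_RESERVE - len(tokenizer.encode(header))
    kept: list[str] = []
    for message in reversed(messages):        # walk newest -> oldest
        cost = len(tokenizer.encode(message + "\n"))
        if cost > budget:
            break                             # drop this message and everything older
        kept.append(message)
        budget -= cost
    return header + "\n".join(reversed(kept)) + "\nAlice:"


def generate_reply(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.99,            # Colab example values, not confirmed prod defaults
        top_p=0.2,
        top_k=40,
        num_return_sequences=4,      # stand-in for best_of=4
        max_new_tokens=RESPONSE_RESERVE,
        pad_token_id=tokenizer.eos_token_id,
        stopping_criteria=StoppingCriteriaList([NewlineStop(prompt_len)]),
    )
    # Production presumably reranked the four candidates; here we just keep the first.
    reply = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
    return reply.split("\n")[0].strip()
```

One note relevant to question 7: running a 6B-parameter GPT-J checkpoint on CPU, and sampling four candidates per turn, can easily push generation into the minutes, so our 245+ second responses may simply reflect hardware and sampling fan-out rather than anything Chai did differently.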