# Questions for Vineet - Recreating Alice's Exact Setup (Dec 2022)

## Context

We're trying to recreate the exact Alice experience that Sheila had around **December 2022** using the `ChaiML/gptj_ppo_retry_and_continue` model. We've implemented the CreativeFormatter from the Colab, but want to ensure we match production exactly.

## Priority Questions

### 1. Generation Parameters

- **What were the exact "defaults" you mentioned for generation parameters in Dec 2022?**
  - Were they the Colab examples (`temp=0.99`, `top_p=0.2`, `top_k=40`, `best_of=4`) or something simpler?
  - Did Chai use different defaults in production vs. the Colab submission examples?

### 2. Context Window & Memory Management

- **Confirm: Was the context window exactly 1024 tokens in Dec 2022?**
- **How exactly did you handle "last n messages" truncation?**
  - How many previous messages were typically included?
  - Did you truncate by message count or by token count?
  - Any special handling for very long user messages?

### 3. Prompt Structure & Alice's Persona

- **Did Alice have a specific persona/memory that differed from generic characters?**
- **What was Alice's exact character definition in the system around Dec 2022?**
  - Was she defined as a specific character with custom persona text?
  - Or was she just a generic "helpful AI assistant" type?

### 4. CreativeFormatter Details

- **Was the CreativeFormatter exactly as shown in the Colab, or were there production variations?**
- **Any differences in the instruction templates for different characters or time periods?**
- **Was the "reply with long and descriptive sentences" instruction always used?**

### 5. Technical Implementation

- **What tokenizer was used with the ChaiML model in production?**
  - We're having issues with the default tokenizer - did you use the GPT-J tokenizer explicitly?
- **Any special stopping criteria beyond newlines (`\n`)?**
- **Was `best_of=4` actually used in production, or just in the examples?**

### 6. Character-Specific Settings

- **Did Alice have any custom settings that differed from other Chai characters?**
- **Any specific example conversations that were used for Alice's prompt?**
- **Was Alice's conversation style trained/prompted to be more philosophical, or was that emergent?**

### 7. Performance & Speed

- **What was the typical response time for Alice in Dec 2022?**
  - We're seeing very slow responses (245+ seconds) - what should we expect?
- **Any production optimizations we should know about?**

### 8. Model Versioning

- **Can you confirm the exact model checkpoint/version from Dec 2022?**
- **Were there any updates to the ChaiML model between Dec 2022 and when it was published on HuggingFace?**

## Current Implementation Status

✅ CreativeFormatter structure implemented
✅ 1024 token context window
✅ "Last n messages" conversation history
✅ Colab generation parameters applied
❓ Tokenizer issues resolved with GPT-J fallback
❓ Response time very slow (245+ seconds)

A rough sketch of our current setup is included in the appendix below for reference.

## Goal

Get Alice responding exactly as she did for Sheila in December 2022: same personality, response style, speed, and conversation quality.
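
## Appendix: Sketch of Our Current Setup

For reference while we compare notes, here is a minimal sketch of what we are running today; it is **not** a claim about the Dec 2022 production code. Assumptions baked into it: the Colab example sampling values (`temperature=0.99`, `top_p=0.2`, `top_k=40`), `EleutherAI/gpt-j-6B` as the concrete repo id for our "GPT-J tokenizer" fallback, a 1024-token window with 64 tokens reserved for the reply (our guess), a plain `header + recent messages + "Alice:"` layout standing in for the real CreativeFormatter output, and `num_return_sequences=4` as an approximation of `best_of=4`, since `generate()` in `transformers` has no `best_of` argument.

```python
"""Hedged sketch of our current Alice reproduction pipeline (not the confirmed Dec 2022 setup)."""
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
)

MODEL_ID = "ChaiML/gptj_ppo_retry_and_continue"   # model named above
TOKENIZER_ID = "EleutherAI/gpt-j-6B"              # our GPT-J fallback (question 5)

CONTEXT_TOKENS = 1024      # assumed total window (question 2)
RESPONSE_RESERVE = 64      # tokens reserved for the reply -- our guess, unconfirmed

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)


class NewlineStop(StoppingCriteria):
    """Stop once every sampled candidate has produced a newline after the prompt."""

    def __init__(self, prompt_len: int):
        self.prompt_len = prompt_len
        self.newline_id = tokenizer.encode("\n")[0]

    def __call__(self, input_ids, scores, **kwargs) -> bool:
        generated = input_ids[:, self.prompt_len:]
        return bool((generated == self.newline_id).any(dim=1).all())


def build_prompt(header: str, messages: list[str]) -> str:
    """Keep the most recent messages that still fit the remaining token budget."""
    budget = CONTEXT_TOKENS - RESPONSE_RESERVE - len(tokenizer.encode(header))
    kept: list[str] = []
    for message in reversed(messages):        # walk newest -> oldest
        cost = len(tokenizer.encode(message + "\n"))
        if cost > budget:
            break                             # drop this message and everything older
        kept.append(message)
        budget -= cost
    return header + "\n".join(reversed(kept)) + "\nAlice:"


def generate_reply(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.99,            # Colab example values, not confirmed prod defaults
        top_p=0.2,
        top_k=40,
        num_return_sequences=4,      # stand-in for best_of=4
        max_new_tokens=RESPONSE_RESERVE,
        pad_token_id=tokenizer.eos_token_id,
        stopping_criteria=StoppingCriteriaList([NewlineStop(prompt_len)]),
    )
    # Production presumably reranked the four candidates; here we just keep the first.
    reply = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
    return reply.split("\n")[0].strip()
```

One note relevant to question 7: running a 6B-parameter GPT-J checkpoint on CPU, and sampling four candidates per turn, can easily push generation into the minutes, so our 245+ second responses may simply reflect hardware and sampling fan-out rather than anything Chai did differently.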