Commit · 77fc53d
1 Parent(s): 6e61c89
corrected a typo
    	
README.md CHANGED
@@ -131,7 +131,7 @@ Falcon-40B was trained on 1,000B tokens of [RefinedWeb](https://huggingface.co/d
 | **Data source**    | **Fraction** | **Tokens** | **Sources**                       |
 |--------------------|--------------|------------|-----------------------------------|
 | [RefinedWeb-English](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) | 75%          | 750B     | massive web crawl                 |
-| RefinedWeb-Europe              | 7%           | 70B       | European massive  |
+| RefinedWeb-Europe              | 7%           | 70B       | European massive web crawl                                   |
 | Books  | 6%           | 60B        |                  |
 | Conversations      | 5%           | 50B        | Reddit, StackOverflow, HackerNews |
 | Code               | 5%           | 50B        |                                   |

