# Troubleshooting Guide
## Hugging Face Deployment Issues
### Permission Errors
If you encounter permission errors like:
```
PermissionError: [Errno 13] Permission denied: 'temp_files'
```
The app now handles these automatically by trying the following locations in order:
1. A system temp directory (`/tmp/docling_temp`)
2. A fallback under the current working directory
3. The current directory itself, as a last resort
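
The fallback order above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the helper name `pick_writable_dir` and the exact candidate list are assumptions.

```python
import tempfile
from pathlib import Path


def pick_writable_dir(candidates):
    """Return the first candidate directory that can be created and written to."""
    candidates = [Path(c) for c in candidates]
    for path in candidates:
        try:
            path.mkdir(parents=True, exist_ok=True)
            # Creating the directory is not enough; prove a file can be written in it.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
            return path
        except OSError:
            # PermissionError is a subclass of OSError, so it is covered here.
            continue
    raise RuntimeError(f"No writable directory among: {candidates}")


# Assumed candidate order, mirroring the list above.
TEMP_DIR = pick_writable_dir([
    Path(tempfile.gettempdir()) / "docling_temp",  # usually /tmp/docling_temp on Linux
    Path.cwd() / "temp_files",                     # fallback under the working directory
    Path.cwd(),                                    # last resort
])
```

Writing a probe file matters because a directory can exist and still reject writes for the current user, which is the typical situation on a read-only container filesystem.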
### Streamlit Configuration Issues
If you see errors related to Streamlit configuration:
```
PermissionError: [Errno 13] Permission denied: '/.streamlit'
```
The app now:
1. Disables usage statistics collection
2. Runs in headless mode
3. Disables the file watcher
4. Reads its settings from the configuration files in `.streamlit/`
### Testing the Environment
You can test if the environment is working correctly by running:
```bash
python test_permissions.py
```
This will check:
- Directory creation permissions
- File write permissions
- Environment variable configuration
- Current directory access
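
The contents of `test_permissions.py` are not reproduced in this guide. As a rough idea of what such a script might do, here is a hypothetical sketch covering the four checks listed above; the function name `run_checks`, the probe directory name, and the list of expected variables are all assumptions.

```python
import os
import tempfile
from pathlib import Path

# Assumed list; adjust to the variables your deployment relies on.
EXPECTED_ENV_VARS = ["STREAMLIT_SERVER_HEADLESS", "STREAMLIT_BROWSER_GATHER_USAGE_STATS"]


def run_checks(base_dir=None):
    """Run each check and return a dict mapping check name to True/False."""
    base = Path(base_dir) if base_dir else Path(tempfile.gettempdir())
    probe_dir = base / "docling_permission_probe"
    results = {}

    # Directory creation permissions
    try:
        probe_dir.mkdir(parents=True, exist_ok=True)
        results["create_directory"] = True
    except OSError:
        results["create_directory"] = False

    # File write permissions
    try:
        probe_file = probe_dir / "probe.txt"
        probe_file.write_text("ok")
        probe_file.unlink()
        results["write_file"] = True
    except OSError:
        results["write_file"] = False

    # Environment variable configuration
    results["env_vars_set"] = all(name in os.environ for name in EXPECTED_ENV_VARS)

    # Current directory access (read and traverse)
    results["cwd_access"] = os.access(os.getcwd(), os.R_OK | os.X_OK)

    # Best-effort cleanup of the probe directory
    try:
        probe_dir.rmdir()
    except OSError:
        pass
    return results


if __name__ == "__main__":
    for name, ok in run_checks().items():
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
```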
### Common Solutions
1. **Clear all data**: Use the "Clear All Data" button in the app
2. **Restart the app**: Sometimes a simple restart fixes permission issues
3. **Check logs**: Look for detailed error messages in the app logs
### Environment Variables
The app automatically sets these environment variables:
- `STREAMLIT_SERVER_FILE_WATCHER_TYPE=none`
- `STREAMLIT_SERVER_HEADLESS=true`
- `STREAMLIT_BROWSER_GATHER_USAGE_STATS=false`
- `STREAMLIT_SERVER_ENABLE_CORS=false`
- `STREAMLIT_SERVER_ENABLE_XSRF_PROTECTION=false`
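
One way to apply these values from Python is sketched below. The helper name `apply_streamlit_env` is hypothetical, and using `setdefault` (so that values already defined by the deployment are kept) is a design assumption rather than something this guide specifies.

```python
import os

STREAMLIT_ENV = {
    "STREAMLIT_SERVER_FILE_WATCHER_TYPE": "none",
    "STREAMLIT_SERVER_HEADLESS": "true",
    "STREAMLIT_BROWSER_GATHER_USAGE_STATS": "false",
    "STREAMLIT_SERVER_ENABLE_CORS": "false",
    "STREAMLIT_SERVER_ENABLE_XSRF_PROTECTION": "false",
}


def apply_streamlit_env(environ=os.environ):
    """Set each variable unless the environment already defines it."""
    for name, value in STREAMLIT_ENV.items():
        environ.setdefault(name, value)
    return environ
```

Streamlit reads its configuration when the server starts, so values set after startup may not take effect. In practice they are usually put in place before `streamlit run` is invoked, for example in the Dockerfile or in a small launcher script.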
### File Structure
The app creates these directories:
- `.streamlit/` - Streamlit configuration
- `temp_files/` or `/tmp/docling_temp/` - Temporary files
- `src/` - Application source code
### Docker Configuration
The Dockerfile has been updated to:
- Create necessary directories with proper permissions
- Copy Streamlit configuration files
- Set up proper environment variables
### EasyOCR Permission Errors
If you encounter EasyOCR permission errors like:
```
PermissionError: [Errno 13] Permission denied: '/.EasyOCR'
```
The app now handles these by:
1. Setting `EASYOCR_MODULE_PATH` to a writable directory
2. Setting `HOME`, `USERPROFILE`, and XDG directories to temp locations
3. Creating all necessary directories with proper permissions
4. Using fallback directories if the primary ones fail
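
The steps above might look roughly like this in code. It is a sketch under assumptions: `ensure_dir`, `configure_easyocr_env`, and the `.docling_fallback` location are hypothetical names, not taken from the app.

```python
import os
import tempfile
from pathlib import Path


def ensure_dir(primary, fallback):
    """Create `primary` if possible, otherwise `fallback`; return the one that worked."""
    for candidate in (Path(primary), Path(fallback)):
        try:
            candidate.mkdir(parents=True, exist_ok=True)
            return candidate
        except OSError:
            continue
    raise RuntimeError(f"Could not create {primary} or {fallback}")


def configure_easyocr_env(tmp_root=None, environ=os.environ):
    """Point EasyOCR, HOME, and the XDG directories at writable locations."""
    tmp_root = Path(tmp_root or tempfile.gettempdir())
    fallback_root = Path.cwd() / ".docling_fallback"  # hypothetical fallback location
    targets = {
        "EASYOCR_MODULE_PATH": "easyocr_models",
        "HOME": "docling_temp",
        "USERPROFILE": "docling_temp",
        "XDG_CACHE_HOME": "cache",
        "XDG_CONFIG_HOME": "config",
        "XDG_DATA_HOME": "data",
    }
    for var, name in targets.items():
        environ[var] = str(ensure_dir(tmp_root / name, fallback_root / name))
    return environ
```

A call such as `configure_easyocr_env()` most likely needs to run before `easyocr` is imported, because the library resolves its model directory from the environment early on.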
### Environment Variables (EasyOCR)
In addition to the Streamlit variables listed above, the app automatically sets:
- `EASYOCR_MODULE_PATH=/tmp/easyocr_models` (or fallback)
- `HOME=/tmp/docling_temp` (or fallback)
- `XDG_CACHE_HOME=/tmp/cache` (or fallback)
- `XDG_CONFIG_HOME=/tmp/config` (or fallback)
- `XDG_DATA_HOME=/tmp/data` (or fallback)
### Hugging Face Hub Permission Errors
If you encounter Hugging Face Hub permission errors like:
```
PermissionError: [Errno 13] Permission denied: '/.cache'
```
The app now handles these by:
1. Setting `HF_HOME`, `HF_CACHE_HOME`, `TRANSFORMERS_CACHE`, and `HF_DATASETS_CACHE` to writable directories
2. Creating all necessary Hugging Face cache directories with proper permissions
3. Using fallback directories if the primary ones fail
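
A possible shape for this logic is sketched below: each cache variable is pointed at the first root where its directory can be created and is writable. The function name `configure_hf_cache` and the default fallback root (`.cache` under the working directory) are assumptions for illustration.

```python
import os
import tempfile
from pathlib import Path

HF_CACHE_VARS = {
    "HF_HOME": "huggingface",
    "HF_CACHE_HOME": "huggingface_cache",
    "TRANSFORMERS_CACHE": "transformers_cache",
    "HF_DATASETS_CACHE": "datasets_cache",
}


def configure_hf_cache(primary_root=None, fallback_root=None, environ=os.environ):
    """Set each cache variable to a writable directory; return the chosen paths."""
    roots = [
        Path(primary_root or tempfile.gettempdir()),
        Path(fallback_root or Path.cwd() / ".cache"),  # hypothetical fallback root
    ]
    chosen = {}
    for var, subdir in HF_CACHE_VARS.items():
        for root in roots:
            target = root / subdir
            try:
                target.mkdir(parents=True, exist_ok=True)
            except OSError:
                continue  # try the next root
            if os.access(target, os.W_OK):
                environ[var] = str(target)
                chosen[var] = target
                break
    return chosen
```

Variables missing from the returned dict had no writable location, which makes them easy to log. As with other cache settings, this likely needs to run before `huggingface_hub` or `transformers` is imported, since those libraries read their cache locations from the environment when they load.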
### Environment Variables (Hugging Face)
In addition to the Streamlit and EasyOCR variables listed above, the app automatically sets:
- `HF_HOME=/tmp/huggingface` (or fallback)
- `HF_CACHE_HOME=/tmp/huggingface_cache` (or fallback)
- `TRANSFORMERS_CACHE=/tmp/transformers_cache` (or fallback)
- `HF_DATASETS_CACHE=/tmp/datasets_cache` (or fallback)