Commit e966906 (1 parent: 7e8e988), committed by Istvan-Adem

add original text with filtering

Files changed (1)
  1. ocr/api/message/prompts.py +3 -13
ocr/api/message/prompts.py CHANGED
@@ -26,19 +26,9 @@ The report must be structured as follows, with each section containing only rele
 
 [/INST]"""
 extract_original_text = """## Task
-
-You must extract all text from the provided images and return it in the **text** field. However, you must **strictly** exclude any information related to the **patient’s name, contact details, or demographic data**.
 
-## Requirements
+You must extract all text from the attached images and return it in the **text** field. You must not include the patient's name, contact details, or demographic data.
 
-- Extract **all readable text** from the images.
-- **Do not** include any **patient-identifiable information**, including:
-- Names (first, last, middle, initials)
-- Contact details (phone numbers, email addresses, addresses)
-- Demographic information (age, date of birth, gender, ethnicity, etc.)
-- Preserve **the structure and order** of the text as much as possible.
+## Important notes
 
-## Formatting Guidelines
-
-- Do not alter or interpret the content—your task is **only extraction**.
-- If a section contains both medical and personal data, extract only the medical data and redact the personal information."""
+- You must extract all text but exclude any information related to the name, contact details, and demographic data."""
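
Because the shortened prompt relies entirely on the model to leave out contact details, a reviewer might want a cheap backstop on the returned text. The following is a minimal sketch under stated assumptions, not part of this commit: `redact_contact_details`, `EMAIL_RE`, and `PHONE_RE` are hypothetical names, and the patterns only cover e-mail addresses and phone-like digit runs, not names or demographic data.

```python
import re

# Hypothetical backstop patterns (assumptions, not taken from the repository).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# A digit, then at least 7 digits/separators, then a digit: a loose phone shape.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_contact_details(text: str) -> str:
    """Replace e-mail addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[REDACTED EMAIL]", text)
    return PHONE_RE.sub("[REDACTED PHONE]", text)


if __name__ == "__main__":
    sample = "Call +1 (555) 123-4567 or mail jane.doe@example.com. Hb 13.5 g/dL"
    print(redact_contact_details(sample))
```

The phone pattern is deliberately loose and will also match other long digit runs such as full dates with times, so in a medical report it would likely need tightening before use.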