#Anything in this file, followed by a period (and an upper-case word), does NOT indicate an end-of-sentence marker.
#Special cases are included for prefixes that ONLY appear before 0-9 numbers.
#any single upper case letter followed by a period is not a sentence ender (excluding I occasionally, but we leave it in)
#usually upper case letters are initials in a name
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
#List of titles. These are often followed by upper-case names, but do not indicate sentence breaks
Adj
Adm
Adv
Amn
Arch
Asst
Avv
Bart
Bcc
Bldg
Brig
Bros
C.A.P
C.P
Capt
Cc
Cmdr
Co
Col
Comdr
Con
Corp
Cpl
DR
Dott
Dr
Drs
Egr
Ens
Gen
Geom
Gov
Hon
Hosp
Hr
Id
Ing
Insp
Lt
MM
MR
MRS
MS
Maj
Messrs
Mlle
Mme
Mo
Mons
Mr
Mrs
Ms
Msgr
N.B
Op
Ord
P.S
P.T
Pfc
Ph
Prof
Pvt
RP
RSVP
Rag
Rep
Reps
Res
Rev
Rif
Rt
S.A
S.B.F
S.P.M
S.p.A
S.r.l
Sen
Sens
Sfc
Sgt
Sig
Sigg
Soc
Spett
Sr
St
Supt
Surg
V.P
# other
a.c
acc
all
banc
c.a
c.c.p
c.m
c.p
c.s
c.v
corr
dott
e.p.c
ecc
es
fatt
gg
int
lett
ogg
on
p.c
p.c.c
p.es
p.f
p.r
p.v
post
pp
racc
ric
s.n.c
seg
sgg
ss
tel
u.s
v.r
v.s
#misc - odd period-ending items that NEVER indicate breaks (p.m. does NOT fall into this category - it sometimes ends a sentence)
v
vs
i.e
rev
e.g
#Numbers only. These are treated as non-breaking prefixes ONLY when followed by a numeric sequence
# add #NUMERIC_ONLY# after the word for this function
#This case is mostly for the English "No.", which can either be a sentence of its own or,
#if followed by a number, a non-breaking prefix
No #NUMERIC_ONLY#
Nos
Art #NUMERIC_ONLY#
Nr
pp #NUMERIC_ONLY#
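#The rules described in the comments above can be sketched in Python as follows.
#This is a simplified illustration, not the Moses implementation: the names
#SAMPLE, load_prefixes and is_sentence_break are hypothetical, the sample is a
#small excerpt rather than the full list, and letting a later entry override an
#earlier one for the same prefix is an assumption.

```python
import re

# Hypothetical excerpt in the same format as the list above.
SAMPLE = """\
#comment lines start with '#'
Dott
Sig
S.p.A
No #NUMERIC_ONLY#
Art #NUMERIC_ONLY#
"""


def load_prefixes(text):
    """Map each prefix to True if it is numeric-only, else False."""
    prefixes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = re.match(r"^(.*?)\s+#NUMERIC_ONLY#\s*$", line)
        if match:
            prefixes[match.group(1)] = True
        else:
            prefixes[line] = False
    return prefixes


def is_sentence_break(token, next_token, prefixes):
    """Decide whether the period ending `token` closes a sentence."""
    if not token.endswith("."):
        return False
    prefix = token[:-1]
    if prefix not in prefixes:
        return True  # ordinary word followed by a period
    if prefixes[prefix]:
        # Numeric-only prefix: non-breaking only before a digit.
        return not next_token[:1].isdigit()
    return False  # listed prefix: never a sentence break


prefixes = load_prefixes(SAMPLE)
print(is_sentence_break("Dott.", "Rossi", prefixes))
print(is_sentence_break("Art.", "12", prefixes))
print(is_sentence_break("Art.", "Moderna", prefixes))
```

#A real sentence splitter applies further checks (for example on the case of the
#following word); this sketch covers only the prefix lookup.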