Saturday, June 27, 2015

How to isolate only the first space in a string using regex?

I have a foreign-language-to-English dictionary that I'm trying to import into a SQL database. The dictionary is a text file whose lines look like this:

field1 field2 [romanization] /definition 1/definition 2/definition 3/

I'm using regex in Python to identify the delimiters. So far I've been able to isolate every delimiter except the space between field 1 and field 2.

(?<=\S)\s\[|\]\s/(?=[A-Za-z])|/
# (?<=\S)\s\[ matches the space and opening square bracket after field 2
# \]\s/(?=[A-Za-z]) matches the closing square bracket after the romanization, plus the space and first slash
# / matches the forward slashes between definitions
# ????????? should match the space between field 1 and field 2
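
For illustration, here is a minimal sketch of two ways this might be approached. It assumes that field 1 and field 2 never contain whitespace and that the romanization never contains a closing square bracket. The names `SAMPLE`, `LINE_RE`, `parse_line`, and `replace_first_space` are hypothetical and not part of the original question. Python's `re` module only supports fixed-width lookbehind, so a pattern like `(?<=^\S+)\s` is not available; the sketch either limits the number of substitutions or captures each field with a group instead of matching the delimiters.

```python
import re

# Hypothetical sample line in the format described in the question.
SAMPLE = "field1 field2 [romanization] /definition 1/definition 2/definition 3/"

# Assumption: field 1 and field 2 contain no whitespace, and the
# romanization contains no closing square bracket.
LINE_RE = re.compile(r"^(\S+)\s(\S+)\s\[([^\]]*)\]\s/(.*)/\s*$")


def parse_line(line):
    """Return (field1, field2, romanization, definitions), or None if no match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    field1, field2, romanization, definitions = match.groups()
    # The definitions are separated by forward slashes.
    return field1, field2, romanization, definitions.split("/")


def replace_first_space(line, delimiter="\t"):
    """Replace only the first whitespace character in the line."""
    # count=1 stops after the first substitution.
    return re.sub(r"\s", delimiter, line, count=1)


if __name__ == "__main__":
    print(parse_line(SAMPLE))
    print(replace_first_space(SAMPLE, "|"))
```

Capturing the fields directly avoids having to describe every delimiter in one alternation. If only the first space matters, `line.split(" ", 1)` would be a regex-free alternative.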
