How To Remove Leading Underscores And Numbers In A String In Python
I need to sanitize some strings by removing invalid leading (non-alphabetic) characters from them. For example:
'3_hello' -> 'hello'
'_hello' -> 'hello'
'__hello' -> 'hello'
Solution 1:
You can try the regular expression ^[^A-Za-z]*. Some test cases:
import re
re.sub('^[^A-Za-z]*', '', "3_hello")
# 'hello'
re.sub('^[^A-Za-z]*', '', "_hello")
# 'hello'
re.sub('^[^A-Za-z]*', '', "++hello")
# 'hello'
re.sub('^[^A-Za-z]*', '', "__hello")
# 'hello'
- The first ^ anchors the match at the beginning of the string.
- Inside the character class [], a second ^ negates it, so [^A-Za-z] matches any character that is not an ASCII letter.
- * is a greedy quantifier, so every non-alphabetic character at the start of the string is removed.
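As a quick check of the explanation above, the sketch below wraps the same pattern in a helper and shows that only the leading run is removed, while non-letters later in the string are left alone. The names LEADING_NON_LETTERS and strip_leading_non_letters are made up for illustration.

```python
import re

# The pattern from this answer: a run of characters that are not
# ASCII letters, anchored at the start of the string.
LEADING_NON_LETTERS = re.compile(r'^[^A-Za-z]*')

def strip_leading_non_letters(s):
    """Hypothetical helper: drop everything before the first ASCII letter."""
    return LEADING_NON_LETTERS.sub('', s)

print(strip_leading_non_letters('3_hello'))         # hello
print(strip_leading_non_letters('__hello_world2'))  # hello_world2
print(repr(strip_leading_non_letters('123')))       # ''
```

A string with no letters at all comes back empty, because the whole string counts as the leading run.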
Solution 2:
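You can iterate over the string, test each character with str.isalpha(), and collect the alphabetic characters in another variable. Note that this keeps only letters, so it also drops non-alphabetic characters that appear later in the string.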
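This answer comes without code, so here is one possible reading of it, as a sketch rather than the answerer's exact intent. The first function follows the description literally and keeps only letters. The second uses itertools.dropwhile to skip just the leading non-letters, which matches the question more closely. Both function names are invented for this example, and str.isalpha() also accepts non-ASCII letters.

```python
from itertools import dropwhile

def letters_only(s):
    # Literal reading: keep every alphabetic character, wherever it appears.
    result = ''
    for ch in s:
        if ch.isalpha():
            result += ch
    return result

def strip_leading(s):
    # Skip characters while they are non-alphabetic, then keep the rest unchanged.
    return ''.join(dropwhile(lambda ch: not ch.isalpha(), s))

print(letters_only('3_hello_world'))   # helloworld
print(strip_leading('3_hello_world'))  # hello_world
```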
Solution 3:
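import re
for string in ["++hello", "__hello", "3_hello"]:
    # findall returns every ASCII letter; join them back into one string.
    # This drops non-letters anywhere in the string, not only leading ones.
    print("".join(re.findall("[a-zA-Z]", string)))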
hello
hello
hello
Solution 4:
I suggest using re.search to find the first run of alphabetic characters:
m = re.search("[a-zA-Z]+", "3_hello")
print(m.group(0))
hello
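Two caveats are worth sketching here. re.search returns None when the string contains no letters, so m.group(0) would raise AttributeError, and [a-zA-Z]+ captures only the first run of letters. The hypothetical helper below (first_letter_onward is not from the answer) guards against None and slices from the match position to keep the rest of the string.

```python
import re

def first_letter_onward(s):
    # Find the first ASCII letter; re.search returns None if there is none.
    m = re.search('[a-zA-Z]', s)
    if m is None:
        return ''  # assumption: a string with no letters sanitizes to empty
    return s[m.start():]

print(re.search('[a-zA-Z]+', '3_hello_world').group(0))  # hello
print(first_letter_onward('3_hello_world'))              # hello_world
print(repr(first_letter_onward('123')))                  # ''
```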
Solution 5:
You can use a loop and str.isalpha to strip the leading non-alphabetic characters
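a = '3_hello'
for i in range(len(a)):
    # Stop at the first alphabetic character and keep everything from there.
    # Testing a[i] rather than a[i:] keeps later non-letters, e.g. 'hello_world'.
    if a[i].isalpha():
        a = a[i:]
        break
print(a)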
hello