How To Remove Leading Underscores And Numbers In A String In Python

I need to sanitize some strings by removing invalid leading (non-alphabetic) characters from them. For example:

'3_hello' -> 'hello'
'_hello' -> 'hello'
'__hello' -> 'hello'

Solution 1:

You can try the pattern ^[^A-Za-z]* with re.sub. Some test cases:

import re
re.sub('^[^A-Za-z]*', '', "3_hello")
# 'hello'

re.sub('^[^A-Za-z]*', '', "_hello")
# 'hello'

re.sub('^[^A-Za-z]*', '', "++hello")
# 'hello'

re.sub('^[^A-Za-z]*', '', "__hello")
# 'hello'
  • The first ^ anchors the match at the beginning of the string;
  • Inside the character class [], a second ^ negates the set, so [^A-Za-z] matches any character that is not an ASCII letter;
  • The * is a greedy quantifier, so the whole run of non-letters at the start of the string is matched and removed.
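
If the same cleanup is needed in several places, the pattern can be compiled once and wrapped in a small function. This is only a sketch: the names LEADING_NON_LETTERS and strip_leading_non_letters are made up for illustration, and the pattern is the one from this answer.

```python
import re

# Hypothetical names; the pattern is the one shown in this answer.
LEADING_NON_LETTERS = re.compile(r'^[^A-Za-z]*')

def strip_leading_non_letters(text):
    """Remove the leading run of characters that are not ASCII letters."""
    return LEADING_NON_LETTERS.sub('', text)

for sample in ['3_hello', '_hello', '++hello', '__hello', 'hello_2']:
    print(repr(sample), '->', repr(strip_leading_non_letters(sample)))
```

Because the pattern is anchored with ^, characters after the first letter are left alone, so 'hello_2' comes back unchanged.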

Solution 2:

You can use str.isalpha() and iterate over the string, collecting only the alphabetic characters into another variable. Note that this drops non-letters everywhere in the string, not only at the start.
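
A minimal sketch of that idea, assuming it is acceptable to lose non-letters in the middle and at the end of the string as well. The helper name keep_letters is hypothetical.

```python
def keep_letters(text):
    # Hypothetical helper: collect only the alphabetic characters into a new string.
    return ''.join(ch for ch in text if ch.isalpha())

print(keep_letters('3_hello'))     # hello
print(keep_letters('__hello'))     # hello
print(keep_letters('3_hello_2u'))  # hellou
```

Keep in mind that str.isalpha() also returns True for non-ASCII letters such as 'é', so this is slightly more permissive than the [A-Za-z] character class.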

Solution 3:

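import re

# Keeps every ASCII letter; non-letters are dropped anywhere in the string, not only at the start.
for string in ["++hello", "__hello", "3_hello"]:
    print("".join(re.findall("[a-zA-Z]", string)))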

hello
hello
hello

Solution 4:

I suggest using re.search to find the first run of alphabetic characters:

import re

m = re.search("[a-zA-Z]+", "3_hello")
print(m.group(0))

hello
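
One thing to watch with this approach: re.search returns None when the string contains no letters at all, so calling .group(0) directly would raise AttributeError. A guarded version might look like the sketch below; the helper name first_letter_run is hypothetical.

```python
import re

def first_letter_run(text):
    # Hypothetical helper: return the first run of ASCII letters, or '' if there is none.
    m = re.search('[a-zA-Z]+', text)
    return m.group(0) if m else ''

print(first_letter_run('3_hello'))        # hello
print(first_letter_run('3_hello_world'))  # hello
print(repr(first_letter_run('123')))      # ''
```

The match stops at the first non-letter, so '3_hello_world' gives 'hello' and the rest of the string is lost. If the tail should be kept, the anchored re.sub approach from Solution 1 is a better fit.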

Solution 5:

You can use a loop and str.isalpha to strip the leading non-alphabetic characters:

a = '3_hello'
for i in range(len(a)):
    # Stop at the first alphabetic character and keep the rest of the string.
    if a[i].isalpha():
        a = a[i:]
        break
print(a)

hello
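
The same loop can be written more compactly with itertools.dropwhile from the standard library, which skips items while a condition holds and then yields everything after it. This is a sketch of an alternative, not part of the original answer, and the helper name drop_leading_non_letters is hypothetical.

```python
from itertools import dropwhile

def drop_leading_non_letters(text):
    # Hypothetical helper: skip characters until the first alphabetic one, keep the rest.
    return ''.join(dropwhile(lambda ch: not ch.isalpha(), text))

print(drop_leading_non_letters('3_hello'))    # hello
print(drop_leading_non_letters('3_hello_2'))  # hello_2
print(repr(drop_leading_non_letters('123')))  # ''
```

Unlike the explicit loop, this returns an empty string when the input contains no letters at all.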
