String handling in NumPy

String concatenation

numpy.char.add(x1, x2)
Return element-wise string concatenation for two arrays of str or unicode.

Concatenates x1 and x2 element by element.

>>> import numpy as np
>>> np.char.add("aaa", "bbb")
array('aaabbb', dtype='<U6')
>>> np.char.add(["aaa"], ["bbb"])
array(['aaabbb'], dtype='<U6')
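
With more than one element per array the element-wise pairing is easier to see. The following is a minimal sketch; the array names first and second are made up for illustration.

```python
import numpy as np

# Two arrays of the same shape; element i of the first is paired with element i of the second.
first = np.array(["aaa", "ccc"])
second = np.array(["bbb", "ddd"])

pairs = np.char.add(first, second)
print(pairs.tolist())  # ['aaabbb', 'cccddd']
```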

numpy.char.join(sep, seq)
Return a string which is the concatenation of the strings in the sequence seq.

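join works element by element: for each string in seq, sep is inserted between the characters of that string. It does not merge the elements of seq into one string. In the example below every element is a single character, so there is nothing to separate and the output equals the input.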

>>> np.char.join('-', ['a', 'b', 'c'])
array(['a', 'b', 'c'], dtype='<U1')
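
The effect of sep only shows up when the elements have more than one character. A small sketch follows, with made-up inputs, that also contrasts this with Python's built-in str.join, which is the tool to use when the goal is to merge a list of strings into one.

```python
import numpy as np

# np.char.join puts the separator between the characters of each element.
joined = np.char.join("-", ["abc", "de"])
print(joined.tolist())  # ['a-b-c', 'd-e']

# Plain str.join merges the items of a sequence into a single string.
merged = "-".join(["a", "b", "c"])
print(merged)  # a-b-c
```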

Converting strings to lower/upper case

To lower case: numpy.char.lower(a)

To upper case: numpy.char.upper(a)

Demo:

>>> np.char.lower('aAbBcC')
array('aabbcc', dtype='<U6')
>>> np.char.upper('aAbBcC')
array('AABBCC', dtype='<U6')

Stripping leading/trailing characters

numpy.char.lstrip(a, chars=None)
For each element in a, return a copy with the leading characters removed.

Parameters:
chars {str, unicode}, optional
The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix; rather, all combinations of its values are stripped.

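For each element of a, leading characters are removed for as long as they belong to the set given by chars. chars is a set of characters, not a prefix that has to match as a whole.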

numpy.char.rstrip(a, chars=None)
For each element in a, return a copy with the trailing characters removed.

>>> c = np.array(['aAaAaA', '  aA  ', 'abBABba'])
>>> c
array(['aAaAaA', '  aA  ', 'abBABba'], dtype='<U7')
>>> np.char.lstrip(c, 'a')
array(['AaAaA', '  aA  ', 'bBABba'], dtype='<U7')
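
rstrip behaves the same way on the right-hand side. The sketch below reuses the array c from the example above and also shows two points from the parameter description: chars is treated as a set of characters, and omitting it strips whitespace.

```python
import numpy as np

c = np.array(["aAaAaA", "  aA  ", "abBABba"])

# Trailing 'a' characters are removed; elements that do not end in 'a' are unchanged.
right = np.char.rstrip(c, "a")
print(right.tolist())  # ['aAaAaA', '  aA  ', 'abBABb']

# chars is a set: every leading 'a' or 'A' is stripped, in any order.
left_set = np.char.lstrip(c, "aA")
print(left_set.tolist())  # ['', '  aA  ', 'bBABba']

# Without chars, trailing whitespace is stripped.
no_space = np.char.rstrip(c)
print(no_space.tolist())  # ['aAaAaA', '  aA', 'abBABba']
```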

Splitting strings

numpy.char.split(a, sep=None, maxsplit=None)
For each element in a, return a list of the words in the string, using sep as the delimiter string.

Parameters:
sep str or unicode, optional
If sep is not specified or None, any whitespace string is a separator.

Each string in a is cut wherever sep occurs, and the pieces are returned as a list.

Example

>>> np.char.split('a b c', ' ')
array(list(['a', 'b', 'c']), dtype=object)
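
The difference between an explicit sep and the default, and the effect of maxsplit, are easiest to see side by side. This is a minimal sketch with made-up input strings; the result is an object array holding one Python list per input element.

```python
import numpy as np

words = np.array(["a b c", "d  e"])  # note the double space in the second string

# Default sep=None: any run of whitespace counts as one separator.
by_whitespace = np.char.split(words)
print(by_whitespace.tolist())  # [['a', 'b', 'c'], ['d', 'e']]

# Explicit sep=' ': every single space splits, so a double space yields an empty piece.
by_space = np.char.split(words, " ")
print(by_space.tolist())  # [['a', 'b', 'c'], ['d', '', 'e']]

# maxsplit=1 stops after the first cut.
first_cut = np.char.split(words, " ", maxsplit=1)
print(first_cut.tolist())  # [['a', 'b c'], ['d', ' e']]
```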