Python Cool Libraries Tour: Third-Party Library Pandas (079)

CSDN 2024-09-10 16:05:01, 76 views

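I. Usage in Detail

326. pandas.Series.str.normalize method

326-1. Syntax

326-2. Parameters

326-3. Function

326-4. Return value

326-5. Notes

326-6. Usage

326-6-1. Data preparation

326-6-2. Code example

326-6-3. Output

327. pandas.Series.str.pad method

327-1. Syntax

327-2. Parameters

327-3. Function

327-4. Return value

327-5. Notes

327-6. Usage

327-6-1. Data preparation

327-6-2. Code example

327-6-3. Output

328. pandas.Series.str.partition method

328-1. Syntax

328-2. Parameters

328-3. Function

328-4. Return value

328-5. Notes

328-6. Usage

328-6-1. Data preparation

328-6-2. Code example

328-6-3. Output

329. pandas.Series.str.removeprefix method

329-1. Syntax

329-2. Parameters

329-3. Function

329-4. Return value

329-5. Notes

329-6. Usage

329-6-1. Data preparation

329-6-2. Code example

329-6-3. Output

330. pandas.Series.str.removesuffix method

330-1. Syntax

330-2. Parameters

330-3. Function

330-4. Return value

330-5. Notes

330-6. Usage

330-6-1. Data preparation

330-6-2. Code example

330-6-3. Output

II. Recommended Reading

1. Python Foundations Tour

2. Python Functions Tour

3. Python Algorithms Tour

4. Python Magic Tour

5. Blog homepage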

I. Usage in Detail

326. pandas.Series.str.normalize method

326-1. Syntax

# 326. pandas.Series.str.normalize method

pandas.Series.str.normalize(form)

Return the Unicode normal form for the strings in the Series/Index.

For more information on the forms, see unicodedata.normalize().

Parameters:

form

{'NFC', 'NFKC', 'NFD', 'NFKD'}

Unicode form.

Returns:

Series/Index of objects.

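326-2. Parameters

326-2-1. form (required): the Unicode normalization form to apply. It must be one of the following four:

'NFC': Normalization Form C (Canonical Composition). Decomposed sequences are composed into a single character where possible. For example, "e" followed by the combining acute accent U+0301 becomes the single character "é" (U+00E9).

'NFD': Normalization Form D (Canonical Decomposition). Characters are decomposed into a base character plus combining marks. For example, "é" (U+00E9) becomes "e" followed by U+0301.

'NFKC': Normalization Form KC (Compatibility Composition). Applies compatibility decomposition and then canonical composition, so compatibility-equivalent characters are replaced by their plain equivalents and the result is composed.

'NFKD': Normalization Form KD (Compatibility Decomposition). Characters are decomposed into their compatibility-equivalent base characters and combining marks.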
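The four forms are easiest to tell apart by counting characters after normalization. The following is a minimal sketch; the sample strings are illustrative assumptions and do not come from the original article. It uses a precomposed "é", a decomposed "é", and a word containing the compatibility ligature "ﬁ" (U+FB01).

```python
import pandas as pd

# "é" written two ways, plus a word with the ligature "ﬁ" (U+FB01)
# and a decomposed "é" at the end. Sample data is illustrative only.
s = pd.Series(["\u00e9", "e\u0301", "\ufb01ance\u0301"])

forms = ["NFC", "NFD", "NFKC", "NFKD"]

# Character count of each element after each normalization form.
lengths = pd.DataFrame(
    {form: s.str.normalize(form).str.len() for form in forms}
)
print(lengths)
```

The composed forms (NFC, NFKC) give the shortest accented characters, the decomposed forms (NFD, NFKD) the longest, and only the two compatibility forms expand the ligature into "f" and "i".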

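326-3. Function

Normalizes strings so that canonically equivalent text is represented by one consistent sequence of code points. This is useful when combining data from different sources, unifying string formats, and making string comparisons consistent.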
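As a small illustration of why this matters for comparisons, the sketch below stores the same word in two encodings and counts distinct values before and after normalization. The variable names and data are assumptions chosen for the example.

```python
import pandas as pd

# The same word stored two ways: precomposed "é" vs. "e" + combining accent.
names = pd.Series(["caf\u00e9", "cafe\u0301"])

# Before normalization the strings differ, so they count as two values.
print(names.iloc[0] == names.iloc[1])  # False
print(names.nunique())                 # 2

# After NFC normalization both hold the same code point sequence.
nfc = names.str.normalize("NFC")
print(nfc.iloc[0] == nfc.iloc[1])      # True
print(nfc.nunique())                   # 1
```

Normalizing before operations such as deduplication, grouping, or merging on string keys avoids treating visually identical values as different.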

326-4. Return value

Returns a new pandas.Series (or Index) in which every string has been normalized to the specified form.

326-5. Notes

None.

326-6. Usage

326-6-1. Data preparation

None.

326-6-2. Code example

# 326. pandas.Series.str.normalize method
import pandas as pd

# Sample data
data = pd.Series(['café', 'e\u0301clair', 'cafe\u0301'])

# Normalize with NFC
normalized_data = data.str.normalize('NFC')
print(normalized_data)

# Normalize with NFD
normalized_data = data.str.normalize('NFD')
print(normalized_data)

# Normalize with NFKC
normalized_data = data.str.normalize('NFKC')
print(normalized_data)

# Normalize with NFKD
normalized_data = data.str.normalize('NFKD')
print(normalized_data)

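326-6-3. Output

# 326. pandas.Series.str.normalize method
# 0 café
# 1 éclair
# 2 café
# dtype: object
# 0 café
# 1 éclair
# 2 café
# dtype: object
# 0 café
# 1 éclair
# 2 café
# dtype: object
# 0 café
# 1 éclair
# 2 café
# dtype: object

All four results look the same when printed, because composed and decomposed forms render identically; they differ only in the underlying code points.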

327. pandas.Series.str.pad method

327-1. Syntax

# 327. pandas.Series.str.pad method

pandas.Series.str.pad(width, side='left', fillchar=' ')

Pad strings in the Series/Index up to width.

Parameters:

width

int

Minimum width of resulting string; additional characters will be filled with character defined in fillchar.

side

{'left', 'right', 'both'}, default 'left'

Side from which to fill resulting string.

fillchar

str, default ' '

Additional character for filling, default is whitespace.

Returns:

Series or Index of object

Returns Series or Index with minimum number of char in object.

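327-2. Parameters

327-2-1. width (required): integer, the minimum width of each resulting string. A string shorter than width is padded on the chosen side until it reaches width; a string that is already at least width characters long is returned unchanged.

327-2-2. side (optional, default 'left'): the side on which to pad. Options:

'left': pad on the left of the string.
'right': pad on the right of the string.
'both': pad on both sides. When the number of fill characters is odd, they cannot be split evenly and one side receives one more than the other; in the example below (width 10), the extra character is placed on the right.

327-2-3. fillchar (optional, default ' '): string, the character used for padding. It must be a string of exactly one character.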
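Two details of these parameters are easy to miss: width is a minimum rather than a fixed length, and fillchar is validated. The sketch below illustrates both; the sample strings are assumptions made for the example.

```python
import pandas as pd

s = pd.Series(["cat", "elephant", "hippopotamus"])

# width is a minimum: "hippopotamus" (12 characters) exceeds 10
# and is returned unchanged rather than truncated.
left = s.str.pad(width=10, side="left", fillchar=".")
print(left.tolist())
# ['.......cat', '..elephant', 'hippopotamus']

# Every result is at least `width` characters long.
print(left.str.len().tolist())
# [10, 10, 12]

# fillchar must be exactly one character; a longer string is rejected.
try:
    s.str.pad(width=10, fillchar="ab")
except TypeError as exc:
    print("TypeError:", exc)
```

If a fixed width is required, padding can be combined with slicing (for example `.str.slice(0, 10)`), since pad itself never shortens a string.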
 
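327-3. Function
Pads each string to the specified width, which is useful for aligning text or formatting output. Fill characters can be added on the left, on the right, or on both sides.

327-4. Return value
Returns a new pandas.Series (or Index) in which each string has been padded on the specified side with the fill character, so that it is at least width characters long.

327-5. Notes
None.

327-6. Usage
327-6-1. Data preparation
None.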
327-6-2. Code example
# 327. pandas.Series.str.pad method
import pandas as pd
# Sample data
data = pd.Series(['cat', 'dog', 'elephant'])
# Pad on the left with '*' so each string is at least 10 characters long
padded_left = data.str.pad(width=10, side='left', fillchar='*')
# Pad on the right with '-' so each string is at least 10 characters long
padded_right = data.str.pad(width=10, side='right', fillchar='-')
# Pad on both sides with '~' so each string is at least 10 characters long
padded_both = data.str.pad(width=10, side='both', fillchar='~')
print("Left Padded:\n", padded_left)
print("Right Padded:\n", padded_right)
print("Both Sides Padded:\n", padded_both)
327-6-3. Output
# 327. pandas.Series.str.pad method
# Left Padded:
# 0 *******cat
# 1 *******dog
# 2 **elephant
# dtype: object
# Right Padded:
# 0 cat-------
# 1 dog-------
# 2 elephant--
# dtype: object
# Both Sides Padded:
# 0 ~~~cat~~~~
# 1 ~~~dog~~~~
# 2 ~elephant~
# dtype: object 
328. pandas.Series.str.partition method
328-1. Syntax
# 328. pandas.Series.str.partition method
pandas.Series.str.partition(sep=' ', expand=True)
Split the string at the first occurrence of sep.
This method splits the string at the first occurrence of sep, and returns 3 elements containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return 3 elements containing the string itself, followed by two empty strings.
Parameters:
sep
str, default whitespace
String to split on.
expand
bool, default True
If True, return DataFrame/MultiIndex expanding dimensionality. If False, return Series/Index.
Returns:
DataFrame/MultiIndex or Series/Index of objects. 
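328-2. Parameters
328-2-1. sep (optional, default ' '): string, the separator to split on. It can be a single character or a longer string. If the separator is not found in a string, the result holds the original string followed by two empty strings.

328-2-2. expand (optional, default True): boolean, controls the form of the return value.

If True, the method returns a DataFrame with three columns: the part before the separator, the separator itself, and the part after the separator.
If False, the method returns a Series in which each element is a tuple (before, sep, after).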
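328-3. Function
Splits each string into three parts at the first occurrence of the separator. It is well suited to strings that contain a known delimiter, giving quick access to the content on either side of it.

328-4. Return value
The return value depends on the expand parameter:

When expand=True, returns a DataFrame whose columns hold the part before the separator, the separator itself, and the part after the separator.
When expand=False, returns a Series in which each element is a (before, sep, after) tuple.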
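The two return shapes can be compared side by side. This is a minimal sketch using hypothetical "key=value" strings (the data and the column names "key", "sep", and "value" are assumptions for illustration); note that only the first "=" splits each string.

```python
import pandas as pd

# Hypothetical "key=value" settings; only the first "=" is used as the split point.
settings = pd.Series(["host=localhost", "query=a=b", "verbose"])

# expand=True (the default) gives a three-column DataFrame.
parts = settings.str.partition("=")
parts.columns = ["key", "sep", "value"]  # default column labels are 0, 1, 2
print(parts)

# expand=False keeps one (before, sep, after) tuple per element.
tuples = settings.str.partition("=", expand=False)
print(tuples.tolist())
# [('host', '=', 'localhost'), ('query', '=', 'a=b'), ('verbose', '', '')]
```

The DataFrame form is convenient when the parts become separate columns; the tuple form keeps the result aligned with the original Series as a single column.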
328-5. Notes
None.

328-6. Usage
328-6-1. Data preparation
None.
328-6-2. Code example
# 328. pandas.Series.str.partition method
import pandas as pd
# Sample data
data = pd.Series(['apple-pie', 'banana-split', 'cherry'])
# Split on '-' with expand=True, which returns a DataFrame
partitioned_df = data.str.partition(sep='-', expand=True)
# Split on '-' with expand=False, which returns a Series of tuples
partitioned_series = data.str.partition(sep='-', expand=False)
print("Partitioned DataFrame:\n", partitioned_df)
print("Partitioned Series:\n", partitioned_series)
328-6-3. Output
# 328. pandas.Series.str.partition method
# Partitioned DataFrame:
# 0 1 2
# 0 apple - pie
# 1 banana - split
# 2 cherry 
# Partitioned Series:
# 0 (apple, -, pie)
# 1 (banana, -, split)
# 2 (cherry, , )
# dtype: object 
329. pandas.Series.str.removeprefix method
329-1. Syntax
# 329. pandas.Series.str.removeprefix method
pandas.Series.str.removeprefix(prefix)
Remove a prefix from an object series.
If the prefix is not present, the original string will be returned.
Parameters:
prefix
str
Remove the prefix of the string.
Returns:
Series/Index: object
The Series or Index with given prefix removed. 
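329-2. Parameters
329-2-1. prefix (required): string, the prefix to remove. If a string starts with prefix, that leading part is removed; if it does not start with prefix, the string is left unchanged.

329-3. Function
Removes the specified prefix from the start of each string. It is particularly handy during data cleaning, when a common leading identifier or fixed-format prefix needs to be dropped.

329-4. Return value
Returns a new Series (or Index) in which the specified prefix has been removed from each string. A string that does not start with the prefix is returned unchanged.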
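A common source of confusion is the difference between removeprefix and lstrip. The sketch below contrasts the two on made-up identifiers, assuming pandas 1.4 or later, where removeprefix is available.

```python
import pandas as pd

# Illustrative identifiers; the data is an assumption for this example.
ids = pd.Series(["id_idea", "id_42", "other"])

# removeprefix drops the exact leading substring "id_" once, if present.
print(ids.str.removeprefix("id_").tolist())
# ['idea', '42', 'other']

# lstrip treats its argument as a set of characters and keeps stripping,
# so it also removes the leading "id" of "idea".
print(ids.str.lstrip("id_").tolist())
# ['ea', '42', 'other']
```

When the goal is to drop a fixed literal prefix, removeprefix is the safer choice because it never removes more than that one substring.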
 
329-5. Notes
None.

329-6. Usage
329-6-1. Data preparation
None.
329-6-2. Code example
# 329. pandas.Series.str.removeprefix method
import pandas as pd
# Sample data
data = pd.Series(['prefix_text1', 'prefix_text2', 'no_prefix_text'])
# Remove the prefix 'prefix_' with the removeprefix method
removed_prefix = data.str.removeprefix('prefix_')
print("Original Series:\n", data)
print("Series after removing prefix:\n", removed_prefix)
329-6-3. Output
# 329. pandas.Series.str.removeprefix method
# Original Series:
# 0 prefix_text1
# 1 prefix_text2
# 2 no_prefix_text
# dtype: object
# Series after removing prefix:
# 0 text1
# 1 text2
# 2 no_prefix_text
# dtype: object 
330. pandas.Series.str.removesuffix method
330-1. Syntax
# 330. pandas.Series.str.removesuffix method
pandas.Series.str.removesuffix(suffix)
Remove a suffix from an object series.
If the suffix is not present, the original string will be returned.
Parameters:
suffix
str
Remove the suffix of the string.
Returns:
Series/Index: object
The Series or Index with given suffix removed. 
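330-2. Parameters
330-2-1. suffix (required): string, the suffix to remove. If a string ends with suffix, that trailing part is removed; if it does not end with suffix, the string is left unchanged.

330-3. Function
Removes the specified suffix from the end of each string. It is particularly handy during data cleaning, when a common trailing identifier or fixed-format suffix needs to be dropped.

330-4. Return value
Returns a new Series (or Index) in which the specified suffix has been removed from each string. A string that does not end with the suffix is returned unchanged.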
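A typical use is stripping a known file extension. The sketch below uses made-up file names and assumes pandas 1.4 or later; it also shows why rstrip is not a substitute.

```python
import pandas as pd

# Illustrative file names; the data is an assumption for this example.
files = pd.Series(["report.csv", "stats.csv", "notes.txt"])

# removesuffix drops the exact trailing substring ".csv" once, if present.
print(files.str.removesuffix(".csv").tolist())
# ['report', 'stats', 'notes.txt']

# rstrip strips any trailing characters from the set {'.', 'c', 's', 'v'},
# so it also removes the final "s" of "stats".
print(files.str.rstrip(".csv").tolist())
# ['report', 'stat', 'notes.txt']
```

As with prefixes, removesuffix removes at most one literal occurrence at the end of the string, which makes the result predictable.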
 
330-5. Notes
None.

330-6. Usage
330-6-1. Data preparation
None.
330-6-2. Code example
# 330. pandas.Series.str.removesuffix method
import pandas as pd
# Sample data
data = pd.Series(['text1_suffix', 'text2_suffix', 'text3_nosuffix'])
# Remove the suffix '_suffix' with the removesuffix method
removed_suffix = data.str.removesuffix('_suffix')
print("Original Series:\n", data)
print("Series after removing suffix:\n", removed_suffix)
330-6-3. Output
# 330. pandas.Series.str.removesuffix method
# Original Series:
# 0 text1_suffix
# 1 text2_suffix
# 2 text3_nosuffix
# dtype: object
# Series after removing suffix:
# 0 text1
# 1 text2
# 2 text3_nosuffix
# dtype: object 
II. Recommended Reading
1. Python Foundations Tour
2. Python Functions Tour
3. Python Algorithms Tour
4. Python Magic Tour
5. Blog homepage

 
 