String Cleaning
clean_strcol
pywrangle.str_cleaning.clean_strcol.clean_strcol(df: DataFrame, colname: str, case: Union['l', 't', 'u'] = 'l', trim: bool = True) → DataFrame
Cleans the specified column in a DataFrame according to the case and trim arguments.
- Parameters
df (DataFrame) – DataFrame to clean.
colname (str) – Column to clean.
case (Union['l', 't', 'u']) – Case to standardize the column to: 'l' (lower), 't' (title), or 'u' (upper). The values are defined in the constants.py module. Defaults to 'l' (lowercase).
trim (bool, optional) – Whether to trim whitespace from the column's values. Defaults to True.
- Returns
DataFrame with cleaned strings in the specified column.
- Return type
DataFrame
Example
>>> df1 = pw.clean_strcol(df1, 'animals', 'l')
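To make the case and trim arguments concrete, here is a minimal sketch of equivalent behavior written directly against pandas. It is not pywrangle's implementation: the name clean_strcol_sketch and the CASE_METHODS mapping are hypothetical, and the sketch assumes that trimming means stripping leading and trailing whitespace.

```python
import pandas as pd

# Hypothetical mapping from the documented case codes to pandas string methods.
CASE_METHODS = {"l": "lower", "t": "title", "u": "upper"}


def clean_strcol_sketch(df, colname, case="l", trim=True):
    """Illustrative stand-in for clean_strcol; not the library's own code."""
    if case not in CASE_METHODS:
        raise ValueError(f"case must be one of {sorted(CASE_METHODS)}")
    out = df.copy()  # leave the caller's DataFrame untouched
    col = out[colname]
    if trim:
        # Assumption: "trim" removes leading and trailing whitespace only.
        col = col.str.strip()
    # Apply the chosen case method, e.g. col.str.lower().
    out[colname] = getattr(col.str, CASE_METHODS[case])()
    return out


df1 = pd.DataFrame({"animals": ["  Cat ", "DOG", "bIrD  "]})
df1 = clean_strcol_sketch(df1, "animals", "t")
print(df1)
```

Like the documented function, the sketch returns a new DataFrame, so the result is assigned back to the DataFrame variable and not to a single column.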
clean_all_strcol
pywrangle.str_cleaning.clean_all_strcol.clean_all_strcols(df: DataFrame, columns: Optional[Union[list, tuple]] = None, col_cases: Optional[Union[list, tuple]] = None, trim: bool = True, clean_case: Union['l', 't', 'u'] = 'l') → DataFrame
Returns DataFrame with cleaned string columns.
- Parameters
df (DataFrame) – DataFrame to clean.
columns (Union[list, tuple, None], optional) – Names of the columns to clean. If not specified, the function attempts to clean all string columns.
col_cases (Union[list, tuple, None], optional) – Cases to apply to the corresponding entries in columns. If not specified, every column uses the clean_case parameter.
trim (bool, optional) – Whether to trim the string data in the columns. Defaults to True.
clean_case (Union['l', 't', 'u']) – Default case applied to string columns when col_cases is not given. Defaults to 'l' (lowercase).
- Returns
Returns DataFrame with cleaned string columns.
- Return type
DataFrame
Notes
If columns is not specified, the function cleans all string columns in the DataFrame.
You may optionally pass columns and col_cases to specify which columns to clean and how.
The available clean_case arguments 'l', 't', and 'u' represent lower, title, and upper case respectively.
Example
>>> df = create_df.create_mixed_df_size(10, 10)
>>> df = pw.clean_all_strcols(df)
Record | Column | Is Str Col | Clean Method
------ | ------ | ---------- | ------------
1      | A      | False      | None
2      | B      | True       | lower
3      | C      | False      | None
4      | D      | True       | lower
5      | E      | False      | None
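The interplay between columns, col_cases, and clean_case described in the notes can be sketched in plain pandas as follows. This is an illustration under stated assumptions, not pywrangle's code: clean_all_strcols_sketch and CASE_METHODS are hypothetical names, string columns are detected with pandas' is_string_dtype, and the summary table printed by the real function is not reproduced.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

# Hypothetical mapping from the documented case codes to pandas string methods.
CASE_METHODS = {"l": "lower", "t": "title", "u": "upper"}


def clean_all_strcols_sketch(df, columns=None, col_cases=None,
                             trim=True, clean_case="l"):
    """Illustrative stand-in for clean_all_strcols; not the library's own code."""
    out = df.copy()
    if columns is None:
        # Assumption: "string column" means pandas reports a string dtype.
        columns = [c for c in out.columns if is_string_dtype(out[c])]
    if col_cases is None:
        # Fall back to the single default case for every selected column.
        col_cases = [clean_case] * len(columns)
    if len(columns) != len(col_cases):
        raise ValueError("columns and col_cases must have the same length")
    for col, case in zip(columns, col_cases):
        series = out[col]
        if trim:
            series = series.str.strip()
        out[col] = getattr(series.str, CASE_METHODS[case])()
    return out


df = pd.DataFrame({
    "A": [1, 2],
    "B": [" Foo", "BAR "],
    "C": [1.5, 2.5],
    "D": ["x Y", " z"],
})
print(clean_all_strcols_sketch(df))
print(clean_all_strcols_sketch(df, columns=["B", "D"], col_cases=["u", "t"]))
```

Pairing columns with col_cases by position keeps the call compact, but it requires that both sequences have the same length, which is why the sketch checks for a mismatch up front.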