downcast_numeric_columns
- pyhelpers.ops.downcast_numeric_columns(*dataframes)
Downcasts numeric columns in pandas or polars DataFrames to the smallest suitable dtypes to reduce memory usage.
- This function processes multiple DataFrames in one pass, converting:
Integer columns to the smallest signed integer dtype (int8, int16, int32, or int64)
Floating-point columns to the smallest floating dtype (float32 or float64) that can safely represent their values.
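The conversion rules above can be illustrated with a minimal pandas-only sketch. It is not the pyhelpers implementation: the helper name `downcast_sketch` and the sample data are hypothetical, and the sketch assumes that `pandas.to_numeric` with its `downcast` argument is an acceptable stand-in for the behaviour described.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype


def downcast_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with numeric columns downcast."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if is_integer_dtype(col.dtype):
            # Smallest signed integer dtype that holds every value
            out[name] = pd.to_numeric(col, downcast="integer")
        elif is_float_dtype(col.dtype):
            # float32 if the values survive the conversion, otherwise float64
            out[name] = pd.to_numeric(col, downcast="float")
        # Non-numeric columns are left untouched
    return out


# Illustrative data only (not taken from pyhelpers)
df = pd.DataFrame(
    {
        "small": [1, 2, 3],
        "large": [1000, 2000, 70000],
        "ratio": [0.5, 1.5, 2.25],
        "label": ["a", "b", "c"],
    }
)
result = downcast_sketch(df)
print(result.dtypes)
```

In this sketch, `small` fits in int8, `large` needs int32 because 70000 exceeds the int16 range, `ratio` becomes float32, and `label` is skipped.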
- Parameters:
dataframes (pandas.DataFrame | polars.DataFrame) – One or more pandas or polars DataFrames to optimise.
- Returns:
New DataFrame(s) with downcast numeric columns; when several DataFrames are passed, they are returned as a tuple in the same order.
- Return type:
None | pandas.DataFrame | polars.DataFrame | tuple[pandas.DataFrame | polars.DataFrame]
Note
Modifies DataFrames in place for memory efficiency.
Skips non-numeric columns automatically.
Uses pandas’ built-in optimisations for batch processing.
Examples:
>>> from pyhelpers.ops import downcast_numeric_columns
>>> from pyhelpers._cache import example_dataframe
>>> import polars as pl

>>> df1 = example_dataframe().copy()
>>> df1.dtypes
Longitude    float64
Latitude     float64
dtype: object
>>> df2 = example_dataframe().T.copy()
>>> df2.dtypes
City
London        float64
Birmingham    float64
Manchester    float64
Leeds         float64
dtype: object

>>> df11, df21 = downcast_numeric_columns(df1, df2)
>>> df11.dtypes
Longitude    float32
Latitude     float32
dtype: object
>>> df21.dtypes
City
London        float32
Birmingham    float32
Manchester    float32
Leeds         float32
dtype: object

>>> df1, df2 = map(pl.from_pandas, (df1, df2))
>>> df1.dtypes
[Float64, Float64]
>>> df2.dtypes
[Float64, Float64, Float64, Float64]
>>> df21, df22 = downcast_numeric_columns(df1, df2)
>>> df21.dtypes
[Float32, Float32]
>>> df22.dtypes
[Float32, Float32, Float32, Float32]
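To gauge how much memory a downcast saves, one option is to compare `DataFrame.memory_usage` before and after. The sketch below is a hedged illustration that does not call pyhelpers: it uses `pandas.to_numeric` as an assumed stand-in for the downcast step, and the column names and data are made up for the example.

```python
import numpy as np
import pandas as pd

# Illustrative data: 1000 rows of 64-bit integers and 64-bit floats
df = pd.DataFrame(
    {
        "x": np.arange(1000, dtype="int64"),
        "y": np.linspace(0.0, 1.0, 1000),
    }
)

# Bytes used by the column data, excluding the index
before = int(df.memory_usage(index=False).sum())

# Stand-in for the downcast step (assumption: pandas.to_numeric behaves similarly)
smaller = df.copy()
smaller["x"] = pd.to_numeric(smaller["x"], downcast="integer")
smaller["y"] = pd.to_numeric(smaller["y"], downcast="float")

after = int(smaller.memory_usage(index=False).sum())
print(f"before: {before} bytes, after: {after} bytes")
```

Here `x` fits in int16 and `y` in float32, so the column data shrinks from 8 bytes per value to 2 and 4 bytes per value respectively.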