cryptodatapy.transform.impute
Classes
Impute – Handles missing values.
Module Contents
- class cryptodatapy.transform.impute.Impute(filtered_df: pandas.DataFrame, plot: bool = False, plot_series: tuple = ('BTC', 'close'))
Handles missing values.
- filtered_df
- plot = False
- plot_series = ('BTC', 'close')
- imputed_df = None
- fwd_fill() pandas.DataFrame
Imputes missing values by carrying forward the latest non-missing value.
- Returns:
imputed_df – MultiIndex DataFrame with DatetimeIndex (level 0), ticker (level 1) and fields (cols), with missing values filled using the forward fill method.
- Return type:
pd.DataFrame - MultiIndex
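The forward fill described above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation: the sample data is hypothetical, and grouping by ticker is an assumption made here so that one ticker's values are never carried into another's.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two tickers, three dates, one field.
dates = pd.date_range("2023-01-01", periods=3, freq="D")
index = pd.MultiIndex.from_product([dates, ["BTC", "ETH"]], names=["date", "ticker"])
filtered_df = pd.DataFrame(
    {"close": [100.0, 10.0, np.nan, 11.0, 102.0, np.nan]},
    index=index,
)

# Forward fill within each ticker (assumption: fills should not cross tickers).
imputed_df = filtered_df.groupby(level="ticker").ffill()
```

In this sketch the missing BTC value on the second date takes the first date's BTC close, and the missing ETH value on the third date takes the second date's ETH close.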
- interpolate(method: str = 'linear', order: int | None = None, axis: int = 0, limit: int | None = None) pandas.DataFrame
Imputes missing values by interpolating using various methods.
- Parameters:
method (str, {'linear', 'nearest', 'zero', 'slinear', 'quadratic', 'cubic', 'spline', 'barycentric', 'polynomial', 'krogh', 'piecewise_polynomial', 'pchip', 'akima', 'cubicspline'}, default 'linear') – Interpolation method to use.
order (int, optional, default None) – Order of polynomial or spline.
axis ({0 or 'index', 1 or 'columns'}, default 0) – Axis to interpolate along.
limit (int, optional, default None) – Maximum number of consecutive NaNs to fill. Must be greater than 0.
- Returns:
imputed_df – MultiIndex DataFrame with DatetimeIndex (level 0), ticker (level 1) and fields (cols), with missing values filled using the chosen interpolation method.
- Return type:
pd.DataFrame - MultiIndex
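As a rough illustration of the interpolation above, the sketch below uses pandas directly. It is not the library's implementation. The sample data is hypothetical, and reshaping to wide format before interpolating is an assumption made here so that each ticker's series is interpolated along the date axis on its own.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: BTC has a two-day gap, ETH is complete.
dates = pd.date_range("2023-01-01", periods=4, freq="D")
index = pd.MultiIndex.from_product([dates, ["BTC", "ETH"]], names=["date", "ticker"])
filtered_df = pd.DataFrame(
    {"close": [100.0, 10.0, np.nan, 11.0, np.nan, 12.0, 106.0, 13.0]},
    index=index,
)

# Reshape to wide format: one column per (field, ticker), indexed by date.
wide = filtered_df.unstack(level="ticker")

# Interpolate down the date axis; each column is one ticker's series.
wide_filled = wide.interpolate(method="linear", axis=0)

# Same call with limit=1: at most one consecutive NaN is filled per gap.
wide_limited = wide.interpolate(method="linear", axis=0, limit=1)

# Restore the (date, ticker) MultiIndex layout.
imputed_df = wide_filled.stack(level="ticker")
```

With linear interpolation the two missing BTC closes fall evenly between 100 and 106. With limit=1 only the first of the two is filled and the second stays missing.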
- fcst(yhat_df: pandas.DataFrame) pandas.DataFrame
Imputes missing values with forecasts from an outlier detection algorithm.
- Parameters:
yhat_df (pd.DataFrame - MultiIndex) – MultiIndex DataFrame with DatetimeIndex (level 0), tickers (level 1) and fields (cols) with forecasted values.
- Returns:
imputed_df – MultiIndex DataFrame with DatetimeIndex (level 0), ticker (level 1) and fields (cols), with missing values filled using forecasts from the outlier detection method.
- Return type:
pd.DataFrame - MultiIndex
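Filling gaps from a forecast frame can be sketched in pandas as follows. This is an illustration under stated assumptions rather than the library's code: both frames are hypothetical sample data sharing the same index and columns, and `DataFrame.fillna` is used here as one plausible way to take forecasts only where the observed value is missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: observed values with gaps, plus forecasts.
dates = pd.date_range("2023-01-01", periods=3, freq="D")
index = pd.MultiIndex.from_product([dates, ["BTC", "ETH"]], names=["date", "ticker"])
filtered_df = pd.DataFrame(
    {"close": [100.0, 10.0, np.nan, 11.0, 102.0, np.nan]},
    index=index,
)
yhat_df = pd.DataFrame(
    {"close": [100.5, 10.2, 101.0, 10.8, 101.5, 11.4]},
    index=index,
)

# Replace only the missing entries with the aligned forecast values;
# observed values are left untouched.
imputed_df = filtered_df.fillna(yhat_df)
```

Because `fillna` aligns on both index and columns, each forecast is matched to the same date, ticker and field as the gap it fills.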
- plot_imputed() None
Plots filtered time series.