base_wrangler
Attributes
- logger
Classes
- BaseDataWrangler: Base class for vendor-specific data wranglers.
Functions
- _load_field_map(): Loads fields.csv from the config path and creates a nested dictionary map.
Module Contents
- base_wrangler.logger
- base_wrangler._load_field_map() -> Dict[str, Dict[str, str]]
Loads fields.csv from the config path and creates a nested dictionary map: {'vendor_name': {'vendor_field_lower': 'CRYPTODATAPY_FIELD'}}
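The nested map described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the CSV layout (an `id` column plus one column per vendor), the sample rows, and the helper name `load_field_map` are all assumptions, since the actual structure of fields.csv is not shown here.

```python
import csv
import io
from typing import Dict

# Hypothetical fields.csv content: one row per standard field, one column per vendor.
SAMPLE_CSV = """id,vendor_a,vendor_b
open,Open,o
close,Close,c
volume,Volume,
"""


def load_field_map(csv_text: str, id_col: str = "id") -> Dict[str, Dict[str, str]]:
    """Build {vendor: {vendor_field_lower: standard_field}} from CSV text."""
    field_map: Dict[str, Dict[str, str]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        standard_field = row[id_col]
        for vendor, vendor_field in row.items():
            if vendor == id_col or not vendor_field:
                continue  # skip the id column and vendors that lack this field
            # Keys are lower-cased so lookups are case-insensitive.
            field_map.setdefault(vendor, {})[vendor_field.lower()] = standard_field
    return field_map


field_map = load_field_map(SAMPLE_CSV)
```

Lower-casing the vendor field names at load time means a later lookup only has to lower-case the incoming column name once.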
- class base_wrangler.BaseDataWrangler(data_req: cryptodatapy.extract.datarequest.DataRequest, data_resp: Dict | pandas.DataFrame)
Bases: abc.ABC
Base class for vendor-specific data wranglers. Handles common data cleaning, filtering, and field mapping operations.
- _FIELD_MAP
- _DEFAULT_AGG_MAP
- data_req
- data_resp
- field_map
- _convert_fields_to_lib(data_source: str) -> None
Convert columns from vendor field names to CryptoDataPy standard field names using the dictionary map. Mutates self.data_resp.
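A rough sketch of this kind of renaming with pandas, assuming the response is already a DataFrame. The `vendor_map` contents and the column names are hypothetical, and leaving unmapped columns unchanged is an assumption rather than documented behavior.

```python
import pandas as pd

# Hypothetical map for a single vendor: lower-cased vendor field -> standard field.
vendor_map = {"o": "open", "c": "close", "v": "volume"}

df = pd.DataFrame({"O": [1.0], "C": [2.0], "V": [10.0], "extra": ["x"]})

# Look up each column by its lower-cased name; unmapped columns keep their name.
df = df.rename(columns=lambda col: vendor_map.get(col.lower(), col))
```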
- _set_index_and_sort(index_cols: str | List[str] = 'date') -> None
Sets the index and sorts the DataFrame by the index.
If 'date' is part of the index, it is normalized to a date-only Timestamp (time component set to 00:00:00) while retaining the datetime64[ns] dtype, which keeps index operations fast.
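One way to get a date-only value without leaving the datetime64 family is `Series.dt.normalize()`. The sketch below uses made-up data and a hypothetical `ticker` index level. It illustrates the idea and is not the method's actual code.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": ["2024-01-02 15:30:00", "2024-01-01 09:00:00"],
        "ticker": ["BTC", "BTC"],
        "close": [2.0, 1.0],
    }
)

# Parse to a datetime64 column, then normalize() sets the time to 00:00:00
# without converting the values to Python date objects.
df["date"] = pd.to_datetime(df["date"]).dt.normalize()

# Set a (date, ticker) MultiIndex and sort by it.
df = df.set_index(["date", "ticker"]).sort_index()
```

Converting with `.dt.date` instead would produce an object-dtype column, which is slower to index and slice.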
- _filter_dates() -> None
Filters data response based on start and end dates in data_req.
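A minimal sketch of date filtering on a DatetimeIndex. The variables `start_date` and `end_date` stand in for the request's start and end dates, and treating both bounds as inclusive is an assumption.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=5, freq="D", name="date")
df = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Hypothetical stand-ins for the start and end dates held by the data request.
start_date = pd.Timestamp("2024-01-02")
end_date = pd.Timestamp("2024-01-04")

# Boolean mask keeps rows whose date falls within [start_date, end_date].
mask = (df.index >= start_date) & (df.index <= end_date)
filtered = df.loc[mask]
```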
- _resample(agg_func: str | Dict[str, str] | None = None) -> None
Resamples the DataFrame to the frequency in the data_req.
Logic:
1. If agg_func is a string ('last', 'sum'), it applies to all columns.
2. If agg_func is None, _DEFAULT_AGG_MAP is used for known columns.
3. If a column isn't in the map, it defaults to 'last'.
- Parameters:
agg_func (str or dict, optional) – Aggregation function(s) to use during resampling. If a string is provided, it applies to all columns. If a dict is provided, it should map column names to aggregation functions. If None, the default aggregation map is used.
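The aggregation rules above could be sketched like this. The contents of `DEFAULT_AGG`, the standalone `resample` helper, and the explicit `freq` argument are assumptions for illustration. The real method reads the frequency from the data request and uses the class's own default map.

```python
import pandas as pd

# Hypothetical default map; the real _DEFAULT_AGG_MAP contents are not shown here.
DEFAULT_AGG = {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}


def resample(df: pd.DataFrame, freq: str, agg_func=None) -> pd.DataFrame:
    if isinstance(agg_func, str):
        # A single string applies to every column.
        agg = {col: agg_func for col in df.columns}
    else:
        # Use the supplied dict, or the default map when agg_func is None;
        # columns missing from the map fall back to 'last'.
        agg_map = agg_func if agg_func is not None else DEFAULT_AGG
        agg = {col: agg_map.get(col, "last") for col in df.columns}
    return df.resample(freq).agg(agg)


idx = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 12:00", "2024-01-02 00:00", "2024-01-02 12:00"]
)
df = pd.DataFrame(
    {
        "close": [1.0, 2.0, 3.0, 4.0],
        "volume": [10.0, 20.0, 30.0, 40.0],
        "funding_rate": [0.1, 0.2, 0.3, 0.4],  # not in the default map
    },
    index=idx,
)

daily = resample(df, "D")              # default map, 'last' fallback
daily_sum = resample(df, "D", "sum")   # one function for all columns
```

Per-column aggregation matters here because summing a price or taking the last value of a volume would both give misleading daily bars.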
- _reorder_columns(requested_fields: bool = False) -> None
Reorders columns based on the provided column order list.
- Parameters:
requested_fields (bool) – If True, only requested fields are kept and ordered. If False, all columns are kept.
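The effect of the flag can be illustrated with a small helper. Where the column order comes from is not stated here, so the `fields` list, the helper name, and the choice to append non-requested columns after the requested ones are all assumptions.

```python
from typing import List

import pandas as pd


def reorder_columns(df: pd.DataFrame, fields: List[str], requested_only: bool = False) -> pd.DataFrame:
    # Requested fields come first, in the order given, if present in the data.
    ordered = [f for f in fields if f in df.columns]
    if not requested_only:
        # Keep all remaining columns after the requested ones.
        ordered += [c for c in df.columns if c not in ordered]
    return df[ordered]


df = pd.DataFrame({"volume": [10.0], "close": [2.0], "open": [1.0]})
fields = ["open", "close"]  # hypothetical requested fields

all_cols = reorder_columns(df, fields)
only_requested = reorder_columns(df, fields, requested_only=True)
```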
- _clean_data() -> None
Removes duplicate rows, rows and columns that are entirely NaN, and 0 values.
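A sketch of this kind of cleaning as a pandas chain. The order of the steps and the choice to treat zeros as missing values (rather than dropping the rows that contain them) are assumptions, since the description does not specify either.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "close": [1.0, 1.0, 0.0, np.nan],
        "volume": [5.0, 5.0, 7.0, np.nan],
        "empty": [np.nan] * 4,
    }
)

cleaned = (
    df.drop_duplicates()            # drop repeated rows
    .replace(0, np.nan)             # assumption: zeros are treated as missing
    .dropna(how="all", axis=0)      # drop rows that are entirely NaN
    .dropna(how="all", axis=1)      # drop columns that are entirely NaN
)
```

Using `how="all"` keeps partially filled rows, so a missing close does not discard a valid volume.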
- _convert_types() -> None
Converts columns to appropriate numeric types, explicitly excluding known string/metadata columns, and uses standard pandas dtypes.
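The conversion can be approximated with `pd.to_numeric`, skipping an exclusion set. The `STRING_COLS` set is hypothetical, and coercing unparseable values to missing is an assumption about how bad values are handled.

```python
import pandas as pd

# Hypothetical set of string/metadata columns to leave untouched.
STRING_COLS = {"ticker", "exchange"}

df = pd.DataFrame(
    {
        "ticker": ["BTC", "ETH"],
        "close": ["1.5", "2.5"],
        "volume": ["10", "n/a"],
    }
)

for col in df.columns:
    if col not in STRING_COLS:
        # errors="coerce" turns unparseable values into missing values
        # instead of raising.
        df[col] = pd.to_numeric(df[col], errors="coerce")
```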
- abstractmethod wrangle() -> pandas.DataFrame
Abstract method for wrangling. Must be implemented by child classes.
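The abstract-method contract can be shown with a self-contained stand-in. `MiniWrangler` and `VendorWrangler` are hypothetical classes written for this example. They mimic the pattern only and do not reproduce BaseDataWrangler's constructor or helpers.

```python
from abc import ABC, abstractmethod

import pandas as pd


class MiniWrangler(ABC):
    """Hypothetical stand-in for an abstract base wrangler."""

    def __init__(self, data_resp: pd.DataFrame):
        self.data_resp = data_resp

    @abstractmethod
    def wrangle(self) -> pd.DataFrame:
        """Child classes must implement this."""


class VendorWrangler(MiniWrangler):
    """Hypothetical vendor-specific child class."""

    def wrangle(self) -> pd.DataFrame:
        # A real implementation would chain the shared helper steps;
        # here only a sort stands in for them.
        self.data_resp = self.data_resp.sort_index()
        return self.data_resp


raw = pd.DataFrame({"close": [2.0, 1.0]}, index=[1, 0])
result = VendorWrangler(raw).wrangle()
```

Instantiating the base class directly raises `TypeError`, which is how the requirement on child classes is enforced.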