BPt.Dataset.get_Xy
- Dataset.get_Xy(problem_spec='default', **problem_spec_params)
This function returns a sklearn-style pairing of input data (X) and target data (y) from the Dataset, according to the passed problem_spec.
Note: X and y are returned as plain pandas objects, not Datasets, so none of the Dataset metadata is accessible through the returned X and y.
- Parameters
- problem_spec : ProblemSpec or 'default', optional
This argument accepts an instance of the params class ProblemSpec. This object is essentially a wrapper around the commonly used parameters needed to define the context in which the model pipeline should be evaluated. It includes parameters like problem_type, scorer, n_jobs, random_state, etc.
If left as 'default', a ProblemSpec is initialized with default params.
See ProblemSpec for more information and for how to create an instance of this object.
default = 'default'
- problem_spec_params : ProblemSpec params, optional
You may also pass any valid parameter-value pairs here, e.g.
X, y = get_Xy(problem_spec=problem_spec, problem_type='binary')
Any parameters passed here override the corresponding values in problem_spec. This is useful when you want the default values for everything except one parameter, e.g., when you only want to change random_state:
X, y = get_Xy(problem_spec='default', random_state=5)
- Returns
- X : pandas DataFrame
DataFrame with the input data and columns as specified by the passed problem_spec. Note: the index is sorted identically between X and y.
- y : pandas Series
Series with the target values as requested by the passed problem_spec. Note: the index is sorted identically between X and y.
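The override behavior described for problem_spec_params can be illustrated with a small standalone sketch. It does not use BPt: SpecSketch and resolve_spec are hypothetical stand-ins written for this example, the field defaults are placeholders rather than ProblemSpec's real defaults, and leaving the caller's spec unmodified is an assumption of the sketch, not a documented property of get_Xy.

```python
from dataclasses import dataclass, replace
from typing import Optional, Union


@dataclass(frozen=True)
class SpecSketch:
    """Hypothetical stand-in for ProblemSpec; not BPt's actual class."""
    # Placeholder defaults chosen for illustration only.
    problem_type: str = "default"
    scorer: str = "default"
    n_jobs: int = 1
    random_state: Optional[int] = None


def resolve_spec(problem_spec: Union[str, SpecSketch] = "default",
                 **problem_spec_params) -> SpecSketch:
    """Return the spec in effect after applying keyword overrides."""
    # 'default' means: start from a spec built with all default values.
    if isinstance(problem_spec, str) and problem_spec == "default":
        base = SpecSketch()
    else:
        base = problem_spec
    # Keyword arguments take precedence over the values stored in the spec.
    # replace() builds a new object, so the passed spec is left unchanged.
    return replace(base, **problem_spec_params)


spec = SpecSketch(problem_type="regression", random_state=1)

# Mirrors get_Xy(problem_spec=problem_spec, problem_type='binary')
overridden = resolve_spec(spec, problem_type="binary")

# Mirrors get_Xy(problem_spec='default', random_state=5)
defaults_plus_seed = resolve_spec("default", random_state=5)
```

In this sketch, only the named parameter changes; every other value is taken from the passed spec, or from the defaults when 'default' is given.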