Preprocess

klar_eda.preprocessing.preprocess_csv(csv, target_column=None, index_column=None)

Preprocesses a CSV file or a pandas DataFrame and writes the preprocessed data to a CSV file, Preprocess_file.csv, in the current directory.

Parameters

csv (pandas.DataFrame / string) – Either a pandas DataFrame (with column names as row 0) or the path to a CSV file
target_column (string, optional) – Name of the target column, defaults to the last column in the DataFrame
index_column (list of string, optional) – List of names of columns that contain indexes or otherwise do not contribute to the data, defaults to None
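A minimal usage sketch for the parameters above, assuming klar_eda is installed. The sample data, the column names (id, age, income, label) and the helper write_sample_csv are made up for illustration; only the preprocess_csv signature comes from this page. The call is skipped when the package cannot be imported.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(path):
    """Write a tiny illustrative CSV with a header row (hypothetical data)."""
    rows = [
        ["id", "age", "income", "label"],
        [1, 34, 52000, "yes"],
        [2, 27, 48000, "no"],
        [3, 45, 61000, "yes"],
    ]
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return path


sample_path = write_sample_csv(Path(tempfile.mkdtemp()) / "sample.csv")

# Keyword arguments follow the documented signature; the column names are
# assumptions that match the sample file written above.
call_kwargs = {"target_column": "label", "index_column": ["id"]}

try:
    from klar_eda.preprocessing import preprocess_csv
except ImportError:
    preprocess_csv = None  # klar_eda is not installed; the call is skipped

if preprocess_csv is not None:
    # According to this page, the result is written to Preprocess_file.csv
    # in the current working directory.
    preprocess_csv(str(sample_path), **call_kwargs)
```

Passing "id" through index_column marks it as a non-contributing column, and "label" is named explicitly rather than relying on the last-column default.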

klar_eda.preprocessing.preprocess_images(data_path, dataset_type='other', save=True, show=False)

Preprocesses the image data and generates folders containing the preprocessed images.

Parameters

data_path (string) – Path to the folder containing the image data (Caution: make sure the folder contains only images)
dataset_type (string, optional) – One of ‘ocr’, ‘face’ or ‘other’ (preprocessing differs for each category), defaults to ‘other’
save (bool, optional) – Whether to save the results to a directory, defaults to True
show (bool, optional) – Whether to preview the results, defaults to False
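Because the folder must contain only images, it can help to check it before calling the function. The sketch below is one way to do that, not part of klar_eda: the helpers non_image_entries and run_image_preprocessing are hypothetical, and the extension list is an assumption, since this page does not state which image formats the library accepts.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats actually in use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}

# Values documented for the dataset_type parameter.
DATASET_TYPES = {"ocr", "face", "other"}


def non_image_entries(folder):
    """Return the sorted names of entries in `folder` that are not image files."""
    return sorted(
        entry.name
        for entry in Path(folder).iterdir()
        if not (entry.is_file() and entry.suffix.lower() in IMAGE_SUFFIXES)
    )


def run_image_preprocessing(folder, dataset_type="other"):
    """Validate the inputs, then call preprocess_images if klar_eda is available.

    Returns True if the library call was made, False if klar_eda is missing.
    """
    if dataset_type not in DATASET_TYPES:
        raise ValueError(f"dataset_type must be one of {sorted(DATASET_TYPES)}")
    strays = non_image_entries(folder)
    if strays:
        raise ValueError(f"Folder contains non-image entries: {strays}")
    try:
        from klar_eda.preprocessing import preprocess_images
    except ImportError:
        return False  # klar_eda is not installed; nothing is processed
    # save=True and show=False are the documented defaults, spelled out here.
    preprocess_images(str(folder), dataset_type=dataset_type, save=True, show=False)
    return True
```

A call such as run_image_preprocessing("scans", dataset_type="ocr") would then fail early with a clear message if the hypothetical scans folder held anything other than images.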
