Preprocess

klar_eda.preprocessing.preprocess_csv(csv, target_column=None, index_column=None)

Preprocesses a CSV file or a pandas DataFrame and writes the preprocessed data to a CSV file, Preprocess_file.csv, in the current directory.

Parameters
  • csv (pandas.DataFrame / string) – Either a pandas DataFrame (with column names as row 0) or the path to a CSV file

  • target_column (string, optional) – Name of the target column, defaults to the last column in the DataFrame.

  • index_column (list of string, optional) – List of names of columns that contain indexes or otherwise contribute nothing to the analysis, defaults to None
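
A minimal usage sketch for the parameters above, offered as an illustration rather than a definitive recipe. The sample data, the column names (id, age, income, label) and the helper names write_sample_csv and run_preprocess are hypothetical. The import path is inferred from the documented name klar_eda.preprocessing.preprocess_csv, and the call is skipped if the package is not installed.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(directory):
    """Write a tiny CSV of made-up data and return its path."""
    path = Path(directory) / "sample.csv"
    rows = [
        ["id", "age", "income", "label"],  # header row with column names
        [1, 34, 52000, "yes"],
        [2, 45, 61000, "no"],
        [3, 29, 48000, "yes"],
    ]
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return path


def run_preprocess(csv_path):
    """Call preprocess_csv with the documented parameters.

    Returns False without doing anything if klar_eda is not installed.
    """
    try:
        # Assumed import path, based on the documented qualified name.
        from klar_eda.preprocessing import preprocess_csv
    except ImportError:
        return False
    preprocess_csv(
        str(csv_path),          # path to the CSV file (a DataFrame is also accepted)
        target_column="label",  # hypothetical target column
        index_column=["id"],    # hypothetical index column, passed as a list
    )
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = write_sample_csv(tmp)
        print("preprocess_csv ran:", run_preprocess(sample))
```

According to the description above, a successful call writes Preprocess_file.csv to the current working directory, not to the folder holding the input file.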

klar_eda.preprocessing.preprocess_images(data_path, dataset_type='other', save=True, show=False)

Processes the image data and generates folders containing the preprocessed images.

Parameters
  • data_path (string) – Path to the folder containing the image data (Caution: make sure the folder contains only images)

  • dataset_type (string, optional) – One of ‘ocr’, ‘face’ or ‘other’; preprocessing differs for each category, defaults to ‘other’

  • save (bool, optional) – Save the results to directory, defaults to True

  • show (bool, optional) – Preview the results, defaults to False
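
Because data_path must point to a folder that contains only images, it can help to check the folder before calling the function. The sketch below is one possible approach, not part of the library: the helper names non_image_entries and run_preprocess_images and the list of accepted file extensions are assumptions. The import path is inferred from the documented name klar_eda.preprocessing.preprocess_images, and the call is skipped if the package is not installed.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}

# The three values documented for dataset_type.
VALID_DATASET_TYPES = {"ocr", "face", "other"}


def non_image_entries(data_path):
    """Return sorted names of entries that are folders or lack an image suffix."""
    return sorted(
        entry.name
        for entry in Path(data_path).iterdir()
        if entry.is_dir() or entry.suffix.lower() not in IMAGE_SUFFIXES
    )


def run_preprocess_images(data_path, dataset_type="other"):
    """Validate the inputs, then call preprocess_images with documented parameters.

    Returns False without processing if klar_eda is not installed.
    """
    if dataset_type not in VALID_DATASET_TYPES:
        raise ValueError(f"dataset_type must be one of {sorted(VALID_DATASET_TYPES)}")
    offenders = non_image_entries(data_path)
    if offenders:
        raise ValueError(f"folder contains non-image entries: {offenders}")
    try:
        # Assumed import path, based on the documented qualified name.
        from klar_eda.preprocessing import preprocess_images
    except ImportError:
        return False
    # save=True and show=False are the documented defaults, spelled out here.
    preprocess_images(str(data_path), dataset_type=dataset_type, save=True, show=False)
    return True
```

Checking by file extension is only a heuristic; a file with an image extension may still fail to load.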