Faiz, T (2019) Multi-approaches on scrubbing data for medium-sized enterprises. In: 2019 International Conference on Digitization (ICD), 18-19 November 2019, Sharjah, United Arab Emirates.
Abstract
Tidy, fit-for-purpose data is a prerequisite for data analysis and for sound business decisions. Data scrubbing, or data cleaning, is the process of identifying errors and inconsistencies in data and fixing them before analysis. An organization's decisions rely on data quality, which makes data scrubbing an important step towards productivity. Untidy data arises when importing data from multiple sources and shows up as missing values, corrupt records, data type mismatches, stray special characters, or duplicate records. Current research lacks coverage of the latest data scrubbing techniques practiced by medium-sized enterprises. This article covers possible data errors, a literature review, and the data science project life cycle. It explains how to clean data using Python libraries for exploratory data analysis, such as Pandas, NumPy, and scikit-learn, and libraries for data visualization, such as Matplotlib, Seaborn, and Plotly.
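The error types named in the abstract can be illustrated with a short Pandas sketch. This is not code from the paper: the `scrub` helper, the column names `name` and `amount`, the sample records, and the choice of median imputation are all assumptions made for illustration.

```python
import pandas as pd


def scrub(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of df (hypothetical helper, not from the paper)."""
    out = df.copy()

    # Special character removal: keep only letters, digits, and spaces.
    out["name"] = (
        out["name"].str.replace(r"[^A-Za-z0-9 ]", "", regex=True).str.strip()
    )

    # Data type mismatch: amounts arrive as strings; unparseable values become NaN.
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce")

    # Corrupt records: drop rows that have no usable name.
    out = out.dropna(subset=["name"]).copy()

    # Missing values: fill missing amounts with the column median (an assumed policy).
    out["amount"] = out["amount"].fillna(out["amount"].median())

    # Duplicates: discard rows that are identical after cleaning.
    return out.drop_duplicates().reset_index(drop=True)


# Made-up sample data showing each kind of error.
raw = pd.DataFrame(
    {
        "name": ["Acme#", "Acme", "Beta Ltd", None, "Gamma!"],
        "amount": ["100", "100", "n/a", "50", "300"],
    }
)
clean = scrub(raw)
print(clean)
```

The order of steps matters in this sketch: special characters are stripped before deduplication so that "Acme#" and "Acme" are recognized as the same record. Whether to impute or drop missing values depends on the data set and is a judgment call in practice.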
Affiliation: | Skyline University College |
---|---|
SUC Author(s): | Faiz, T |
All Author(s): | Faiz, T |
Item Type: | Conference or Workshop Item (Paper) |
Uncontrolled Keywords: | Data Scrubbing, Data Cleaning, Data Cleansing, Exploratory data analysis, Python – Data Cleaning, Data Quality, Pandas Library, Data Pre-processing, Data transformation |
Subjects: | B Information Technology > BQ Data Analytics; B Information Technology > BT Data Management |
Divisions: | Skyline University College > School of IT |
Depositing User: | Mr Veeramani Rasu |
Date Deposited: | 28 Nov 2021 09:07 |
Last Modified: | 28 Nov 2021 09:07 |
URI: | https://research.skylineuniversity.ac.ae/id/eprint/55 |
Publisher URL: | https://doi.org/10.1109/ICD47981.2019.9105739 |