Data preparation can take up to 80% of a data scientist's time. Nonetheless, it remains a necessary step: errors, duplicates and irrelevant data must be identified and fixed before any analysis can be effective.
In this talk, I will show how data preparation can be made easy with AWS Python SDK for Pandas, an open-source Python library, and AWS Glue DataBrew, a visual data preparation tool. AWS Python SDK for Pandas lets you load data into your dataframes from various AWS data stores such as OpenSearch, DynamoDB and Redshift with just one line of code. With Glue DataBrew, you can choose from over 250 pre-built transformations to automate data preparation tasks, all without writing any code. The talk will include a live demonstration.
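To give a feel for the "one line of code" claim, here is a minimal sketch rather than the demo from the talk. It assumes the library is installed from PyPI as `awswrangler`, that AWS credentials are already configured, and that the data sits in S3 as Parquet files. The function name `load_parquet_from_s3` and the bucket path are hypothetical.

```python
def load_parquet_from_s3(s3_path: str):
    """Return a pandas DataFrame built from the Parquet files under s3_path."""
    # Imported lazily so this module can be imported without the library installed.
    import awswrangler as wr  # assumption: installed with `pip install awswrangler`

    # The single load call: read an S3 Parquet dataset into a DataFrame.
    return wr.s3.read_parquet(path=s3_path)


# Hypothetical usage (requires AWS credentials and a real bucket):
# df = load_parquet_from_s3("s3://example-bucket/sales/")
# df = df.drop_duplicates()  # ordinary pandas clean-up from here on
```

Once the data is in a DataFrame, the usual pandas operations apply, which is where steps such as removing duplicates would happen.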
Date and Time: October 29, 2022 / 14:00-14:30 (UTC+8)
Language: English
Speaker: Mr. Fortune Hui / AWS / Hong Kong
Speaker Introduction
Mr. Fortune Hui
Fortune Hui is a Solutions Architect at Amazon Web Services specializing in data analytics, including the design of serverless data platforms, data warehouses and text analytics platforms. Prior to joining AWS, he engineered data pipelines in various industries, including retail, aviation and insurance.