This talk provides a practical deep dive into building industry-ready machine learning and data pipelines in Python. The presentation builds up from the basics of Airflow and shows how to build scalable, distributed machine learning data pipelines using a producer-consumer backend based on Celery. I will share key lessons learned throughout my career building machine learning systems, as well as caveats and best practices for deploying scalable data pipeline systems in production environments.
Buzzwords: Python, Airflow, Scalable, Industry-ready, best practices, Machine Learning, ML Pipelines, Data Pipelines, Data Engineering, DataOps, DevOps
Level: Intermediate (aimed at audiences with intermediate experience in Python programming)
Requirements for Audience: Interest in Production Machine Learning
Language: English
Speaker: Alejandro Saucedo (London, United Kingdom)
Speaker Bio: Alejandro Saucedo is a technology leader with over 10 years of software development experience. He is currently the Chief Scientist at The Institute for Ethical AI & Machine Learning, a UK-based research centre. Throughout his career, Alejandro has held technical leadership positions across hyper-growth scale-ups and tech giants including Eigen Technologies, Bloomberg LP and Hack Partners. Alejandro has a strong track record of building multiple departments of machine learning engineers from scratch, and of leading the delivery of numerous large-scale machine learning systems across the financial, insurance, legal, transport, manufacturing and construction sectors (in Europe, the US and Latin America).
GitHub: https://github.com/axsauze
LinkedIn: http://linkedin.com/in/axsaucedo
Twitter: https://twitter.com/axsaucedo