Part 1 – Digitizing documents with powerful Python libraries
In the digital era, it is easy to forget that until very recently much valuable information was not kept in a machine-friendly form: there are tons of Word and PDF documents containing reports, meeting minutes, contracts and so on, waiting to be rediscovered. With Python libraries such as PDFMiner and PyPDF2, we will unlock the power of text data in PDF format.
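As a rough illustration of the kind of extraction Part 1 refers to, the sketch below reads the text of each page with PyPDF2 and then tidies it up. It assumes a recent PyPDF2 release that provides `PdfReader` and `page.extract_text()` (older releases used `PdfFileReader` and `extractText()`); the function names `extract_pdf_text` and `clean_text` are hypothetical helpers made up for this example, not part of any library.

```python
import re


def extract_pdf_text(path):
    """Return the text of each page of the PDF at `path` as a list of strings."""
    # Imported here so that clean_text() below can be used without PyPDF2.
    # Assumption: a recent PyPDF2 release that exposes PdfReader.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    # Scanned (image-only) pages usually yield no text, so fall back to "".
    return [page.extract_text() or "" for page in reader.pages]


def clean_text(raw):
    """Normalise raw extracted text into a single line of plain words."""
    # Re-join words split by a hyphen at a line break ("assess-\nment").
    # Caveat: this also joins genuine hyphenated compounds broken at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse runs of spaces, tabs and newlines into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

Typical use would be `pages = [clean_text(p) for p in extract_pdf_text("report.pdf")]`, where `report.pdf` is a placeholder file name. Scanned documents with no text layer would need OCR first, which this sketch does not cover.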
Part 2 – Optimise risk assessment with advanced machine learning models
Traditional risk assessment in the insurance process is often based on a set of hand-written rules that classify cases likely to default. With powerful machine learning libraries such as scikit-learn and XGBoost, a probability-of-default model can instead be trained on historical data to predict future cases. To understand the model's decision-making process, we can evaluate its output using the ELI5 library and create visualisation tools for end users.
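To make the idea of a probability-of-default model concrete, here is a minimal sketch using scikit-learn only. Everything in it is an assumption made for illustration: the data are synthetic, the three features (income, debt ratio, late payments) and their coefficients are invented, and `GradientBoostingClassifier` stands in for whichever scikit-learn or XGBoost model the talk actually uses.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def make_synthetic_history(n_cases=2000, seed=0):
    """Generate invented 'historical' cases with a binary default label."""
    rng = np.random.default_rng(seed)
    income = rng.normal(50.0, 15.0, n_cases)    # hypothetical, in thousands
    debt_ratio = rng.uniform(0.0, 1.0, n_cases)  # hypothetical
    late_payments = rng.poisson(1.0, n_cases)    # hypothetical
    X = np.column_stack([income, debt_ratio, late_payments])
    # Made-up relationship between the features and the chance of default.
    logit = -2.0 - 0.03 * (income - 50.0) + 2.5 * debt_ratio + 0.6 * late_payments
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    return X, y


def train_pd_model(X, y, seed=0):
    """Fit a classifier and return it with held-out default probabilities and AUC."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = GradientBoostingClassifier(random_state=seed)
    model.fit(X_train, y_train)
    # Column 1 of predict_proba is the estimated probability of class 1 (default).
    pd_test = model.predict_proba(X_test)[:, 1]
    auc = roc_auc_score(y_test, pd_test)
    return model, pd_test, auc


X, y = make_synthetic_history()
model, pd_test, auc = train_pd_model(X, y)
```

Unlike a fixed rule set, the model returns a score between 0 and 1 for each case, so the cut-off for flagging a case can be tuned to the business need. The fitted model's `feature_importances_` attribute gives a first, global view of what drives the predictions; per-case explanations of the kind the abstract attributes to ELI5 are not shown here.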
Buzzwords: Insurtech, machine learning, PDF parsing, digitization
Level: Intermediate (for audiences with intermediate experience in Python programming)
Requirements to Audiences: Nil
Language: English
Speaker: Calvin Cheung, Rodrigo Acosta (Hong Kong)
Speaker Bio: Calvin is a data scientist with several years of experience in the fintech startup and insurance sectors. Prior to starting a career as a data scientist, Calvin was a climate scientist whose research involved using statistical models and machine learning algorithms to predict the impacts of climate change. With a PhD from HKU and a postdoc at CUHK, Calvin is a member of Climate Informatics and is passionate about applying data science to climate applications. Apart from data crunching, attending hackathons and participating in open source events, Calvin also enjoys various sports such as running, swimming and rock climbing.
Rodrigo is a data scientist passionate about knowledge discovery and about encouraging companies to make data-driven decisions. Most of his career has been in the insurance and e-payments industries. He holds an M.Sc. in Big Data from HKUST and an M.Sc. in Operations Research from Los Andes University. When not working, he is very likely reading or planning his next trip.
GitHub: https://github.com/calvinccs
LinkedIn: (Calvin) https://www.linkedin.com/in/thisiscalvin/ (Rodrigo) https://hk.linkedin.com/in/rodrigo-acosta/en