MACHINE LEARNING FOR MATERIALS RESEARCH BOOTCAMP 2021
Bootcamp (Days 1-4)
Four days of lectures and hands-on exercises covering a range of data analysis topics, from an introduction to Python and data preprocessing to advanced machine learning analysis techniques. Example topics include:
- Identifying important features in complex/high-dimensional data.
- Visualizing high-dimensional data to facilitate user analysis.
- Identifying the 'descriptors' that best predict variance in functional properties.
- Quantifying similarities between materials using complex/high-dimensional data.
- Identifying the most informative experiment to perform next.
Hands-on exercises will include practical use of machine learning tools on real experimental materials data (scalar values, spectra, micrographs, etc.).
Scientists will also demonstrate how they performed recently published research, from loading and preprocessing data to analyzing and visualizing results, all in Jupyter notebooks. Day 4 will include hands-on exercises on how to use the online AFLOW database.
Registration
If you are a student (graduate, undergraduate, or high school), write to us at MLMR@umd.edu BEFORE you register so we can send you a student discount code. If you work at an academic institution (university, etc.), write to us as well for an academic discount code BEFORE you register.
Register here
Bootcamp Schedule
The bootcamp will run daily from 9:00 am to 5:00 pm (EDT). Sessions will be held live on Zoom, and recordings will be made available to participants afterwards.
Day 1:
- Welcome and High-Level Introduction to Machine Learning
- Introduction to Python
- Data Preprocessing
- Filtering / Noise Smoothing
- Normalization / Standardization
- Background Subtraction
- Feature Extraction: e.g., cross-correlation, wavelets, edges, boundaries, shapes
Day 2: Unsupervised Learning
- Review of Linear Algebra and Notation
- Dissimilarity Measures
- Latent Variable Analysis
- Spectral Unmixing / Matrix Factorization under constraints
- Clustering
Day 3: Supervised Learning
- Data Handling
- Algorithms:
  - Regularized Linear Regression
  - The Kernel Trick
  - Gaussian Processes
  - Neural Networks
  - Decision Trees & Ensemble Learning
  - Symbolic Regression
Day 4: Active Learning, DFT, and Natural Language Processing
- Introduction to DFT and Tutorial on AFLOW
- Machine Learning for DFT-based Data
- Natural Language Processing
- Active Learning, Bayesian Optimization, and Gaussian Processes
Day 5: Workshop on Advanced ML Techniques for Materials Discovery