Course Outline
Module 1: Essential Python for Machine Learning Workflows
• Programme introduction and workspace preparation
Align on learning objectives and establish a reproducible Python ML environment
• Core Python language features (accelerated review)
Refresh syntax, control structures, functions, and patterns prevalent in ML codebases
• Python data structures for ML
Utilising lists, dictionaries, sets, and tuples for features, labels, and metadata
• Comprehensions and functional programming tools
Implementing transformations via comprehensions and higher-order functions
• Object-oriented Python for ML developers
Classes, methods, composition, and practical design choices
• dataclasses and lightweight modelling
Using typed containers for configuration, examples, and results
• Decorators and context managers
Implementing timing, caching, logging, and resource-safe execution patterns
• File handling and path management
Handling datasets robustly and choosing appropriate serialisation formats
• Exceptions and defensive programming
Writing ML scripts that fail safely and transparently
• Modules, packages, and project structure
Organising reusable ML codebases effectively
• Typing and code quality
Applying type hints, documentation, and lint-friendly structures
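As a flavour of the Module 1 material, the sketch below combines three of the listed topics: a dataclass as a typed configuration container, a timing decorator, and a context manager that guarantees clean-up. It is a minimal illustration using only the standard library, not course material; the names TrainConfig, timed, stage, and train are hypothetical.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical typed container for run settings (immutable once created)."""
    learning_rate: float = 0.01
    epochs: int = 3


def timed(func):
    """Decorator that records the wall-clock duration of the latest call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_seconds = time.perf_counter() - start
        return result
    wrapper.last_seconds = None
    return wrapper


@contextmanager
def stage(name, log):
    """Context manager that logs entry and exit, even if the body raises."""
    log.append(f"start {name}")
    try:
        yield
    finally:
        log.append(f"end {name}")


@timed
def train(config):
    # Stand-in for real training: a toy loss that shrinks every epoch.
    return [1.0 / (epoch + 1) for epoch in range(config.epochs)]


log = []
with stage("train", log):
    losses = train(TrainConfig(epochs=3))
```

Freezing the dataclass is one possible design choice: it keeps a configuration from being changed accidentally halfway through a run.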
Module 2: NumPy, SciPy, and Data Handling Essentials
• NumPy foundations for vectorised computing
Efficient array operations and performance-conscious coding
• Indexing, slicing, broadcasting, and shapes
Safe tensor manipulation and shape reasoning
• Linear algebra basics with NumPy and SciPy
Stable matrix operations and decompositions relevant to ML
• Deep dive into SciPy
Statistics, optimisation, curve fitting, and sparse matrices
• Pandas for tabular ML data
Cleaning, joining, aggregating, and preparing datasets
• Deep dive into scikit-learn
Estimator interface, pipelines, and reproducible workflows
• Data visualisation essentials
Creating diagnostic plots for data exploration and model behaviour analysis
Module 3: Design Patterns for Machine Learning Applications
• Transitioning from notebooks to maintainable projects
Refactoring exploratory code into structured packages
• Configuration management
Externalising parameters and implementing startup validation
• Logging, warnings, and observability
Structured logging for debuggable ML systems
• Building reusable components via OOP and composition
Designing extensible transformers and predictors
• Practical design patterns
Implementing Pipeline, Factory, Registry, Strategy, and Adapter patterns
• Data validation and schema checks
Preventing silent data issues
• Performance profiling and optimisation
Identifying bottlenecks and applying optimisation techniques
• Model I/O and inference interfaces
Ensuring safe persistence and clean prediction interfaces
• End-to-end mini-build
Constructing a production-style ML pipeline with configuration and logging
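To illustrate two of the patterns named in Module 3, the sketch below pairs a Registry with a small Factory function so that a component can be selected by name from configuration. It is a hedged, standard-library-only example; SCALERS, register, build_scaler, and the two scaler classes are hypothetical names, not part of any library or of the course code.

```python
from typing import Callable, Dict

# Registry: maps a configuration name to the class that implements it.
SCALERS: Dict[str, Callable[[], object]] = {}


def register(name):
    """Class decorator that adds a scaler to the registry under `name`."""
    def decorator(cls):
        if name in SCALERS:
            raise ValueError(f"Scaler {name!r} is already registered")
        SCALERS[name] = cls
        return cls
    return decorator


@register("identity")
class IdentityScaler:
    def transform(self, values):
        return list(values)


@register("minmax")
class MinMaxScaler:
    def transform(self, values):
        lo, hi = min(values), max(values)
        span = hi - lo
        if span == 0:
            # Constant input: avoid division by zero.
            return [0.0 for _ in values]
        return [(v - lo) / span for v in values]


def build_scaler(config):
    """Factory: look up the configured scaler and fail loudly if it is unknown."""
    name = config["scaler"]
    try:
        cls = SCALERS[name]
    except KeyError:
        raise ValueError(
            f"Unknown scaler {name!r}; choose from {sorted(SCALERS)}"
        ) from None
    return cls()


scaled = build_scaler({"scaler": "minmax"}).transform([2.0, 4.0, 6.0])
```

Raising on an unknown name at construction time is in the spirit of the startup validation listed under configuration management: a misspelt option stops the run before any data is processed.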
Module 4: Statistical Learning for Tabular, Text, and Image Data
• Evaluation fundamentals
Train/validation splits, rigorous cross-validation, and business-aligned metrics
• Advanced tabular machine learning
Regularised GLMs, tree ensembles, and leakage-free preprocessing
• Calibration and uncertainty estimation
Platt scaling, isotonic regression, bootstrap methods, and conformal prediction
• Classical NLP techniques
Tokenisation trade-offs, TF-IDF, linear models, and Naive Bayes
• Topic modelling
LDA fundamentals and practical limitations
• Classical computer vision methods
HOG, PCA, and feature-based pipelines
• Error analysis
Detecting bias, label noise, and spurious correlations
• Hands-on labs
Leakage-proof tabular pipeline
Text baseline comparison and interpretation
Classical vision baseline with structured failure analysis
Module 5: Neural Networks for Tabular, Text, and Image Data
• Mastering the training loop
Implementing clean PyTorch loops with automatic mixed precision (AMP), gradient clipping, and reproducibility measures
• Optimisation and regularisation techniques
Initialisation, normalisation, optimisers, and schedulers
• Mixed precision and scaling strategies
Gradient accumulation and checkpointing approaches
• Neural networks for tabular data
Categorical embeddings, feature crosses, and ablation studies
• Neural networks for text data
Embeddings, CNNs, BiLSTMs, GRUs, and sequence handling
• Neural networks for vision data
CNN fundamentals and ResNet-style architectures
• Hands-on labs
Developing a reusable training framework
Comparing a tabular neural network with gradient boosting
CNN experiments with augmentation and scheduling
Module 6: Advanced Neural Architectures
• Transfer learning strategies
Freeze-and-unfreeze patterns and discriminative learning rates
• Transformer architectures for text
Self-attention internals and fine-tuning approaches
• Vision backbones and dense prediction
ResNet, EfficientNet, Vision Transformers, and U-Net concepts
• Advanced tabular architectures
TabTransformer, FT-Transformer, and Deep & Cross Networks (DCN)
• Time series considerations
Temporal splits and covariate shift detection
• PEFT and efficiency techniques
LoRA, distillation, and quantisation trade-offs
• Hands-on labs
Fine-tuning a pretrained text transformer
Fine-tuning a pretrained vision model
Comparing a tabular transformer with gradient-boosted decision trees (GBDT)
Module 7: Generative AI Systems
• Fundamentals of prompting
Structured prompting and controlled generation techniques
• Foundations of LLMs
Tokenisation, instruction tuning, and mitigating hallucinations
• Retrieval-Augmented Generation (RAG)
Chunking, embeddings, hybrid search, and evaluation metrics
• Fine-tuning strategies
LoRA and QLoRA with data quality controls
• Diffusion models
Understanding latent diffusion and practical adaptation
• Synthetic tabular data generation
CTGAN and privacy considerations
• Hands-on labs
Building a production-style RAG mini-application
Validating structured output with schema enforcement
Optional diffusion experimentation
Module 8: AI Agents and MCP
• Agent loop design
The observe, plan, act, reflect, and persist stages of an agent loop
• Agent architectures
ReAct, plan-and-execute, and multi-agent coordination
• Memory management
Episodic, semantic, and scratchpad approaches
• Tool integration and safety
Tool contracts, sandboxing, and defending against prompt injection
• Evaluation frameworks
Replayable traces, task suites, and regression testing
• MCP and protocol-based interoperability
Designing MCP servers with secure tool exposure
• Hands-on labs
Building an agent from scratch
Exposing tools via an MCP-style server
Creating an evaluation harness with safety constraints
Requirements
Participants must possess a functional understanding of Python programming.
This programme is designed for technical professionals at intermediate to advanced levels.
Testimonials (2)
The ML ecosystem: not only MLflow but also Optuna, Hyperopt, Docker, and docker-compose
Guillaume GAUTIER - OLEA MEDICAL
Course - MLflow
I enjoyed participating in the Kubeflow training, which was held remotely. This training allowed me to consolidate my knowledge of AWS services, K8s, and all the DevOps tools around Kubeflow, which are the necessary foundations for tackling the subject properly. I want to thank Malawski Marcin for his patience and professionalism in the training and for his advice on best practices. Malawski approaches the subject from different angles and with different deployment tools: Ansible, EKS, kubectl, and Terraform. Now I am definitely convinced that I am going into the right field of application.