WU | Intelligent Optimization
An advanced course covering the theory and application of intelligent optimization techniques, spanning classical gradient methods, modern evolutionary meta-heuristics, multi-objective optimization, surrogate-assisted models, and federated machine learning applications.
Course Overview
📚 Content Summary
An advanced course exploring the intersection of classical mathematical optimization, evolutionary computation, and contemporary machine learning. It covers single and multi-objective optimization, data-driven strategies, and advanced topics like Neural Architecture Search and Federated Learning.
Master the evolution of intelligence through advanced data-driven and multi-objective optimization.
🎯 Learning Objectives
- Apply advanced optimization techniques to complex, data-driven mathematical problems.
- Design and evaluate evolutionary and machine learning architectures for multi-objective scenarios.
🔹 Lesson 1: Fundamentals of Optimization
Overview: This lesson provides a comprehensive introduction to the mathematical foundations of optimization, ranging from classical gradient-based techniques to biologically inspired evolutionary strategies. It establishes the rigorous definitions of decision variables, objective functions, and constraints while exploring the taxonomy of optimization problems. The material transitions from local search methods like Newton-Raphson and Gradient Descent into the principles of natural evolution and the mechanics of Genetic Algorithms.
Learning Outcomes:
- Define and Classify Optimization Problems: Distinguish between continuous, discrete, combinatorial, and convex optimization based on mathematical properties and constraints.
- Execute Classical Gradient-Based Methods: Apply the Newton-Raphson method and Gradient Descent (including variants like RPROP and Momentum) to find local optima of differentiable functions.
- Translate Biological Evolution to Computational Logic: Explain the relationship between genotype/phenotype, natural selection, and the structure of Evolutionary Algorithms (EAs).
- Address Algorithmic Challenges: Identify and mitigate common issues in optimization, such as the "Hamming cliff" in binary coding, the "vanishing gradient problem," and the computational cost of Hessian inversion.
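The gradient-based methods named above can be illustrated with a minimal Python sketch: plain gradient descent with a momentum term minimizing a simple quadratic. The test function, learning rate, and momentum coefficient are illustrative choices, not values prescribed by the course.

```python
# Gradient descent with momentum on f(x, y) = x^2 + 10*y^2.
# Hyperparameters (lr, beta) are illustrative, not prescribed values.

def grad(x, y):
    # Analytic gradient of f(x, y) = x^2 + 10*y^2.
    return 2.0 * x, 20.0 * y

def gd_momentum(x, y, lr=0.05, beta=0.9, steps=200):
    vx = vy = 0.0
    for _ in range(steps):
        gx, gy = grad(x, y)
        # Momentum accumulates an exponentially decaying sum of gradients,
        # damping oscillation along the ill-conditioned y direction.
        vx = beta * vx + gx
        vy = beta * vy + gy
        x -= lr * vx
        y -= lr * vy
    return x, y

x_opt, y_opt = gd_momentum(5.0, 5.0)
print(x_opt, y_opt)  # both coordinates converge toward the minimum at (0, 0)
```

Newton-Raphson would instead scale the step by the inverse Hessian, trading per-step cost (Hessian inversion in higher dimensions) for faster local convergence.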
🔹 Lesson 2: Genetic Algorithms: Coding, Selection, and Genetic Operators
Overview: This lesson covers the fundamental mechanics of Canonical and Real-Coded Genetic Algorithms (RCGAs). It explores variable representation including Binary, Gray, and Real coding, alongside reproduction mechanisms such as crossover and mutation. The material also introduces complex problem-solving techniques for combinatorial optimization and constraint handling through penalty functions and repair methods.
Learning Outcomes:
- Analyze different representation schemes (Binary vs. Gray coding) and their impact on search causality and "Hamming cliffs."
- Calculate selection probabilities and decode binary strings into integer or real-valued decision variables.
- Differentiate between various crossover operators for both binary (n-point, uniform) and real-coded (BLX, SBX, linear) genetic algorithms.
- Formulate strategies for handling constraints and combinatorial optimization in NP-hard problems like the Traveling Salesman Problem (TSP).
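The "Hamming cliff" mentioned above can be demonstrated in a few lines of Python: the adjacent integers 7 and 8 differ in every bit under plain binary coding but in a single bit under binary-reflected Gray coding.

```python
def to_gray(n):
    # Binary-reflected Gray code of a non-negative integer.
    return n ^ (n >> 1)

def hamming(a, b):
    # Number of differing bit positions.
    return bin(a ^ b).count("1")

# Hamming cliff: 7 (0111) and 8 (1000) are neighbours in value,
# yet a mutation-based search must flip all four bits to cross them.
print(hamming(7, 8))                    # 4 bits flip under binary coding
print(hamming(to_gray(7), to_gray(8)))  # 1 bit flips under Gray coding
```

Gray coding thus preserves causality: small genotypic changes correspond to small phenotypic changes.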
🔹 Lesson 3: Evolution Strategies and Step-size Adaptation
Overview: This lesson explores advanced evolutionary computation techniques, specifically focusing on constraint-handling mechanisms and the mechanics of Evolution Strategies (ES). It details the transition from binary to continuous optimization, emphasizing the importance of step-size adaptation. Key methods discussed include the 1/5 success rule and Covariance Matrix Adaptation (CMA-ES).
Learning Outcomes:
- Evaluate techniques for handling constraints, including gradient-based repair methods for continuous problems and stochastic ranking for infeasible solutions.
- Apply Real-Coded Genetic Algorithm operators, specifically Simulated Binary Crossover (SBX) and Polynomial Mutation.
- Analyze the algorithmic structure of Evolution Strategies (ES), including (1+1)-ES and population-based variants using (μ, λ) and (μ + λ) selection.
- Explain the mathematical logic behind step-size adaptation, comparing the 1/5 success rule, global step-size, and individual step-size strategies.
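The 1/5 success rule can be sketched as a (1+1)-ES minimizing the sphere function. The damping factor 0.85 and the 20-generation adaptation window are common textbook choices, not values fixed by the course.

```python
import random

def sphere(x):
    return sum(xi * xi for xi in x)

def one_plus_one_es(dim=5, sigma=1.0, gens=2000, c=0.85, seed=1):
    # (1+1)-ES with Rechenberg's 1/5 success rule; c is the step-size
    # damping factor (0.85 is a common textbook choice).
    rng = random.Random(seed)
    x = [rng.uniform(-5, 5) for _ in range(dim)]
    fx = sphere(x)
    successes = 0
    for g in range(1, gens + 1):
        y = [xi + sigma * rng.gauss(0, 1) for xi in x]
        fy = sphere(y)
        if fy < fx:           # offspring replaces parent only if better
            x, fx = y, fy
            successes += 1
        if g % 20 == 0:       # adapt sigma on a fixed-length window
            rate = successes / 20
            sigma = sigma / c if rate > 0.2 else sigma * c
            successes = 0
    return fx

print(one_plus_one_es())  # final fitness close to 0
```

CMA-ES generalizes this idea by adapting a full covariance matrix instead of a single global step size.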
🔹 Lesson 4: Genetic Programming and Tree-based Representations
Overview: This lesson covers Genetic Programming (GP) as a method for solving optimization and machine learning problems by representing mathematical functions and code as tree structures. Students explore how GP evolves these structures through specific initialization methods and genetic operators. The lesson also addresses constraint-handling techniques like penalized fitness functions within symbolic regression.
Learning Outcomes:
- Differentiate between primitive function sets and terminal sets in tree-based representations.
- Execute initialization strategies, including the Full, Grow, and Ramped-half-and-half methods.
- Analyze the mechanics of tree-based recombination and mutation, including their effects on population size and structure.
- Evaluate techniques for handling constraints in evolutionary optimization, specifically hard, dynamic, and adaptive penalties.
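A minimal sketch of tree-based representation and the Full and Grow initialization methods, assuming a toy primitive set {+, -, *} and terminal set {x, 1.0}; Ramped-half-and-half simply mixes Full and Grow trees over a range of depths.

```python
import random

FUNCTIONS = ["+", "-", "*"]   # primitive function set (illustrative)
TERMINALS = ["x", "1.0"]      # terminal set (illustrative)

def gen_tree(depth, method, rng):
    # "full": only functions until max depth is reached;
    # "grow": terminals may appear early, giving irregular tree shapes.
    if depth == 0 or (method == "grow" and rng.random() < 0.3):
        return rng.choice(TERMINALS)
    op = rng.choice(FUNCTIONS)
    return (op, gen_tree(depth - 1, method, rng), gen_tree(depth - 1, method, rng))

def evaluate(tree, x):
    # Recursively evaluate the expression tree at a given x.
    if tree == "x":
        return x
    if isinstance(tree, str):
        return float(tree)
    op, left, right = tree
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == "+" else a - b if op == "-" else a * b

rng = random.Random(42)
t = gen_tree(3, "full", rng)
print(evaluate(t, 2.0))
```

In symbolic regression, such trees are scored against data, with a penalized fitness term discouraging excessive tree size (bloat).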
🔹 Lesson 5: Differential Evolution and Particle Swarm Optimization Variants
Overview: This lesson explores advanced meta-heuristic optimization techniques, beginning with repair-based constraint handling for both combinatorial and continuous problems. It provides a deep dive into the mechanisms of Differential Evolution (DE) and Canonical Particle Swarm Optimization (PSO). The content extends into specialized variants such as the Competitive Swarm Optimizer (CSO) and Social Learning PSO (SL-PSO) for large-scale optimization.
Learning Outcomes:
- Analyze constraint-handling techniques, specifically repair methods for combinatorial (gene replacement) and continuous (gradient-based) optimization.
- Execute the mathematical operators of Differential Evolution, including vector-difference mutation, crossover, and selection.
- Formulate the velocity and position update equations for Canonical PSO and its variants (Local PSO, Constriction Factor, and CSO).
- Distinguish between various swarm intelligence models based on their learning strategies (e.g., global best vs. social learning from demonstrators).
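The DE operators listed above can be sketched as the classic DE/rand/1/bin variant; the sphere objective and the settings F=0.5, CR=0.9 are illustrative textbook defaults.

```python
import random

def sphere(x):
    return sum(xi * xi for xi in x)

def de_rand_1_bin(f, dim=5, np_=20, F=0.5, CR=0.9, gens=300, seed=0):
    # Classic DE/rand/1/bin; F and CR are typical textbook settings.
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(np_)]
    fit = [f(ind) for ind in pop]
    for _ in range(gens):
        for i in range(np_):
            r1, r2, r3 = rng.sample([j for j in range(np_) if j != i], 3)
            # Mutation: scaled vector difference added to a base vector.
            v = [pop[r1][d] + F * (pop[r2][d] - pop[r3][d]) for d in range(dim)]
            # Binomial crossover with one guaranteed dimension jrand.
            jrand = rng.randrange(dim)
            u = [v[d] if (rng.random() < CR or d == jrand) else pop[i][d]
                 for d in range(dim)]
            fu = f(u)
            if fu <= fit[i]:   # greedy one-to-one selection
                pop[i], fit[i] = u, fu
    return min(fit)

print(de_rand_1_bin(sphere))  # best fitness near 0
```

PSO replaces the vector-difference mutation with velocity updates pulled toward personal and global (or neighbourhood) best positions.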
🔹 Lesson 6: Multi-objective Optimization: Pareto Dominance and Classical Methods
Overview: This lesson covers the fundamental transition from Single-objective Optimization (SOO) to Multi-objective Optimization (MOO). It emphasizes that MOO results in a set of trade-off solutions rather than a single optimum. The lesson evaluates classical methods like Weighted Aggregation against modern evolutionary decomposition approaches like MOEA/D and RVEA.
Learning Outcomes:
- Define Pareto dominance, Pareto optimal sets, and Pareto fronts mathematically.
- Distinguish between A priori, A posteriori, and Interactive approaches to multi-objective optimization.
- Evaluate the limitations of classical weighted aggregation, particularly regarding concave Pareto fronts.
- Explain the mechanisms of decomposition-based algorithms, including weight adaptation and reference vector guidance.
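Pareto dominance for minimization reduces to a one-line predicate; the objective vectors below are illustrative.

```python
def dominates(a, b):
    # a dominates b (minimization) iff a is no worse in every objective
    # and strictly better in at least one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

print(dominates((1.0, 2.0), (2.0, 3.0)))  # True: better in both objectives
print(dominates((1.0, 3.0), (2.0, 2.0)))  # False: a trade-off, so incomparable
```

Solutions not dominated by any other form the Pareto optimal set; their image in objective space is the Pareto front.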
🔹 Lesson 7: Dominance-based MOO Approaches and Performance Indicators
Overview: This lesson explores dominance-based approaches in Multi-Objective Optimization (MOO), focusing on how selection strategies transition from single-objective methods to dominance- and diversity-based selection. It details specific fitness assignment techniques, including rank mapping and niching for diversity. The lesson also evaluates the procedural steps and computational complexities of the major non-dominated sorting algorithms used in the NSGA-II framework.
Learning Outcomes:
- Analyze the transition from single-objective selection to dominance-based and diversity-based fitness assignment.
- Calculate solution ranks, fitness mapping, and niche counts using sharing functions.
- Compare the procedural logic and computational complexity of Basic, Fast, and Efficient Non-Dominated Sorting algorithms.
- Outline the execution flow of the Non-Dominated Sorting Genetic Algorithm II (NSGA-II).
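The fast non-dominated sorting procedure compared above can be sketched as follows (minimization; the five objective vectors are made up for illustration).

```python
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_nondominated_sort(objs):
    # Fronts as lists of indices, in the spirit of NSGA-II's
    # O(M N^2) fast non-dominated sorting.
    n = len(objs)
    dominated_by = [[] for _ in range(n)]  # S_p: solutions p dominates
    count = [0] * n                        # n_p: how many solutions dominate p
    fronts = [[]]
    for p in range(n):
        for q in range(n):
            if dominates(objs[p], objs[q]):
                dominated_by[p].append(q)
            elif dominates(objs[q], objs[p]):
                count[p] += 1
        if count[p] == 0:
            fronts[0].append(p)
    i = 0
    while fronts[i]:
        nxt = []
        for p in fronts[i]:
            for q in dominated_by[p]:
                count[q] -= 1
                if count[q] == 0:   # q belongs to the next front
                    nxt.append(q)
        fronts.append(nxt)
        i += 1
    return fronts[:-1]  # drop the trailing empty front

objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(fast_nondominated_sort(objs))  # [[0, 1, 2], [3], [4]]
```

The "basic" algorithm recomputes dominance for every front, while "efficient" variants reduce redundant comparisons further.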
🔹 Lesson 8: Estimation of Distribution Algorithms and Modeling Regularity
Overview: This lesson explores advanced selection mechanisms in Multi-Objective Optimization, specifically focusing on NSGA-II's crowding distance and performance indicators like IGD and hypervolume. It transitions into Estimation of Distribution Algorithms (EDAs), which replace traditional genetic operators with probabilistic models. Finally, the material introduces "Modeling Regularity" (MR-MOO), leveraging the geometric properties of Pareto sets to improve search efficiency.
Learning Outcomes:
- Calculate and Implement NSGA-II Selection: Define and compute crowding distance and apply crowded tournament selection rules.
- Evaluate MOO Performance: Differentiate between performance indicators such as Generational Distance (GD), Inverted Generational Distance (IGD), and hypervolume.
- Analyze EDA Mechanisms: Explain the shift from random search operators to building and sampling from probabilistic models (UMDA, UGM, MGM).
- Model Regularity in MOPs: Describe the (m-1)-dimensional manifold property of Pareto sets and how local PCA and latent space mapping are used in MR-MOO.
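NSGA-II's crowding distance, covered above, can be sketched for a single non-dominated front; the three-point front below is illustrative.

```python
def crowding_distance(objs):
    # NSGA-II crowding distance for one non-dominated front:
    # boundary solutions get infinity, interior solutions get the
    # normalized cuboid side length per objective.
    n, m = len(objs), len(objs[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: objs[i][k])
        lo, hi = objs[order[0]][k], objs[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        for j in range(1, n - 1):
            dist[order[j]] += (objs[order[j + 1]][k] - objs[order[j - 1]][k]) / (hi - lo)
    return dist

front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
print(crowding_distance(front))  # boundaries get inf; the middle point gets 2.0
```

Crowded tournament selection then prefers the lower rank, breaking ties by the larger crowding distance.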
🔹 Lesson 9: Machine Learning: Neural Networks, Backpropagation, and Model Selection
Overview: This lesson covers the transition from many-objective optimization to advanced machine learning architectures and model evaluation. It details the mechanics of Multi-Layer Perceptrons (MLPs), the mathematical foundation of backpropagation, and various gradient descent optimization techniques. Furthermore, it addresses critical issues in model selection, such as the bias-variance trade-off and ensemble methods.
Learning Outcomes:
- Identify remedies for selection pressure loss in Many-Objective Optimization, including hypervolume-based and decomposition methods.
- Differentiate between Generative and Discriminative models and their respective roles in machine learning.
- Apply the Delta training rule and backpropagation using differentiation rules to update neural network weights.
- Evaluate strategies for alleviating overfitting, specifically regularization, early stopping, and various cross-validation techniques.
- Analyze ensemble learning methods (Bagging, Boosting, Stacking) to manage the bias-variance trade-off and improve model diversity.
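The delta training rule can be sketched for a single sigmoid unit on the (linearly separable) logical-OR data; the learning rate and epoch count are illustrative, and a full MLP chains this update through hidden layers via backpropagation.

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_delta(data, lr=0.5, epochs=2000, seed=0):
    # Delta rule for one sigmoid unit under squared error:
    # w += lr * (t - o) * o * (1 - o) * x.
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(3)]  # two weights + bias
    for _ in range(epochs):
        for x, t in data:
            o = sigmoid(w[0] * x[0] + w[1] * x[1] + w[2])
            delta = (t - o) * o * (1.0 - o)  # derivative of squared error
            w[0] += lr * delta * x[0]
            w[1] += lr * delta * x[1]
            w[2] += lr * delta
    return w

# Toy data: logical OR, which a single unit can separate.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_delta(data)
preds = [round(sigmoid(w[0] * x[0] + w[1] * x[1] + w[2])) for x, _ in data]
print(preds)  # [0, 1, 1, 1]
```

Regularization, early stopping, and cross-validation all address the risk of a more expressive network overfitting such training data.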
🔹 Lesson 10: Memetic Algorithms and Influence of Lifetime Learning
Overview: This lesson explores Memetic Algorithms (MA), which integrate global genetic exploration with local phenotypic refinement through lifetime learning. It examines the mechanical distinctions between Lamarckian and Baldwinian evolution and the application of local search in multi-objective optimization. The material also covers the use of evolutionary strategies to optimize neural network structures and parameters.
Learning Outcomes:
- Differentiate between standard Evolutionary Algorithms and Memetic Algorithms based on the inclusion of lifetime learning and local search.
- Contrast the Lamarckian and Baldwinian paradigms regarding the transmission of acquired phenotypic changes to offspring.
- Evaluate the influence of the Baldwin Effect versus the Hiding Effect on the rate of evolution and selection pressure.
- Describe the methodology for optimizing Neural Network (NN) connection matrices and weight parameters through evolutionary and gradient-based training.
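The Lamarckian/Baldwinian distinction can be shown mechanically: both evaluate the individual after lifetime learning, but only Lamarckian evolution writes the learned traits back into the genotype. The hill climber and sphere function here are illustrative stand-ins for the local search and fitness landscape.

```python
import random

def sphere(x):
    return sum(xi * xi for xi in x)

def local_search(x, f, step=0.1, iters=20, rng=None):
    # Simple hill climber standing in for "lifetime learning".
    rng = rng or random.Random(0)
    best, fbest = list(x), f(x)
    for _ in range(iters):
        cand = [xi + rng.gauss(0, step) for xi in best]
        fc = f(cand)
        if fc < fbest:
            best, fbest = cand, fc
    return best, fbest

rng = random.Random(1)
genotype = [rng.uniform(-2, 2) for _ in range(3)]
learned, f_learned = local_search(genotype, sphere, rng=rng)

# Lamarckian: learned traits are written back into the genotype.
lamarck_genotype, lamarck_fitness = learned, f_learned
# Baldwinian: the genotype is unchanged; only the fitness reflects learning.
baldwin_genotype, baldwin_fitness = genotype, f_learned

print(lamarck_fitness == baldwin_fitness)   # same evaluated fitness either way
print(lamarck_genotype == baldwin_genotype) # genotypes differ when learning improved the individual
```

Under the Baldwin effect, learning smooths the fitness landscape and can still accelerate evolution, even though acquired traits are not inherited.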
🔹 Lesson 11: Evolutionary Machine Learning and MOML
Overview: This lesson explores Multi-Objective Machine Learning (MOML) as a framework for balancing competing objectives in neural network design, such as accuracy versus complexity. Using algorithms like NSGA-II, the material demonstrates how to optimize network structures and parameters to derive Pareto-optimal models. The content extends these principles to practical applications including rule extraction, multi-objective clustering, and noise-robust feature extraction.
Learning Outcomes:
- Define the mathematical basis for Pareto-based regularization in MOML, specifically the tradeoff between error and model complexity.
- Explain the mechanisms of network representation and mutation within an evolutionary framework.
- Analyze Pareto-optimal fronts to extract interpretable decision rules from neural networks.
- Describe how multi-objective optimization is applied to clustering and feature extraction.
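Pareto-based model selection in MOML can be sketched by filtering candidate networks on (error, complexity) pairs; the model names and scores below are invented for illustration.

```python
def dominates(a, b):
    # Minimization in both objectives: (validation error, complexity).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Each candidate network scored by (validation error, number of connections);
# the values are made up for illustration.
models = {"net_a": (0.10, 120), "net_b": (0.08, 200),
          "net_c": (0.12, 150), "net_d": (0.05, 400)}

# Keep only Pareto-optimal models: no other model is better in both objectives.
pareto = {name for name, score in models.items()
          if not any(dominates(other, score)
                     for other in models.values() if other != score)}
print(sorted(pareto))  # ['net_a', 'net_b', 'net_d']
```

net_c is dominated by net_a (worse error and more connections); the surviving front exposes the accuracy-versus-complexity trade-off, from which compact, interpretable models can be chosen for rule extraction.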
🔹 Lesson 12: Data-Driven Evolutionary Optimization and Model Management
Overview: This lesson covers the motivations and methodologies for Data-Driven Evolutionary Optimization (DDEO), focusing on scenarios where objective functions are computationally expensive. It details the critical role of Model Management and explores advanced strategies like individual-based and generation-based approaches. Bayesian optimization approaches using surrogate models are examined to balance exploration and exploitation.
Learning Outcomes:
- Identify the core motivations for data-driven optimization, including computational intensity and the expense of physical simulations.
- Categorize model management strategies (Individual-based, Generation-based, and Population-based) and their specific implementation techniques.
- Explain the workflow of Bayesian Optimization and the role of Gaussian Processes (GP) as non-parametric surrogate models.
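Individual-based model management can be sketched with a simple inverse-distance-weighted surrogate standing in for the Gaussian-process models discussed in the lesson; the budget numbers and candidate-generation scheme are illustrative.

```python
import random

def expensive_f(x):
    # Stands in for a costly simulation (cheap here only for illustration).
    return sum(xi * xi for xi in x)

def idw_surrogate(x, archive, eps=1e-9):
    # Inverse-distance-weighted prediction from truly evaluated samples;
    # a cheap stand-in for a Gaussian-process surrogate.
    num = den = 0.0
    for p, fp in archive:
        d2 = sum((a - b) ** 2 for a, b in zip(p, x)) + eps
        num += fp / d2
        den += 1.0 / d2
    return num / den

rng = random.Random(0)
dim = 3
points = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(10)]
archive = [(p, expensive_f(p)) for p in points]  # initial true evaluations
true_evals = len(archive)

for _ in range(30):
    best = min(archive, key=lambda rec: rec[1])[0]
    candidates = [[xi + rng.gauss(0, 0.5) for xi in best] for _ in range(20)]
    # Individual-based model management: only the candidate the surrogate
    # ranks as most promising is re-evaluated with the expensive function.
    chosen = min(candidates, key=lambda c: idw_surrogate(c, archive))
    archive.append((chosen, expensive_f(chosen)))
    true_evals += 1

print(true_evals)  # 40 true evaluations instead of 10 + 30 * 20
```

Generation-based strategies would instead evaluate whole generations with the true function at fixed or adaptive intervals.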
🔹 Lesson 13: Bayesian Optimization and Acquisition Functions
Overview: This lesson covers the advanced application of Bayesian Optimization (BO) within evolutionary frameworks, focusing on surrogate-assisted optimization. It details the transition from standard Gaussian Process (GP) models to ensembles to address the "Curse of Dimensionality." The material further explores specialized strategies for Heterogeneously Expensive Multi-Objective Problems (HE-MOPs) and knowledge transfer.
Learning Outcomes:
- Define and compare common acquisition functions (LCB, EI, Thompson Sampling) used to balance exploration and exploitation.
- Identify the computational limitations of Gaussian Processes and explain strategies for dimension reduction and surrogate replacement.
- Analyze methods for handling heterogeneously expensive objectives through parameter-based and instance-based knowledge transfer.
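The LCB and EI acquisition functions compare as follows for minimization under a Gaussian posterior N(mu, sigma^2); Thompson Sampling would instead draw one random function from the posterior and minimize that draw. The numeric inputs are illustrative.

```python
import math

def lcb(mu, sigma, beta=2.0):
    # Lower confidence bound for minimization: smaller is more promising;
    # beta trades off exploitation (mu) against exploration (sigma).
    return mu - beta * sigma

def expected_improvement(mu, sigma, f_best):
    # EI for minimization under a Gaussian posterior N(mu, sigma^2).
    if sigma == 0.0:
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (f_best - mu) * cdf + sigma * pdf

# A confident prediction below f_best vs an uncertain one slightly above it:
print(expected_improvement(mu=0.9, sigma=0.01, f_best=1.0))  # about 0.1
print(expected_improvement(mu=1.1, sigma=0.5, f_best=1.0))   # positive: uncertainty still rewards exploration
```

This exploration bonus is what lets Bayesian optimization escape regions the surrogate currently believes are poor.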
🔹 Lesson 14: Evolutionary Neural Architecture Search (E-NAS)
Overview: This lesson explores the automation of neural network design through evolutionary algorithms, focusing on the transition from hand-crafted architectures to automated search over formally defined search spaces. It covers the E-NAS pipeline, weight sharing in supernets, and advanced techniques like surrogate-assisted evaluation. The material also extends into biologically inspired models, including evolved plasticity and co-evolution of morphology.
Learning Outcomes:
- Define the five-step E-NAS process and the distinction between macro and micro search spaces.
- Explain the concept of weight sharing in supernets and identify the primary challenges of one-shot methods.
- Describe the mechanisms of node inheritance, including how crossover and mutation operators are applied to Directed Acyclic Graphs (DAGs).
- Evaluate strategies for reducing computational costs, such as proxy metrics, surrogate models, and sampled training.
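Structure-level mutation over a macro search space can be sketched as point mutation on a sequence of layer genes; the layer vocabulary and mutation probability are invented for illustration, and real E-NAS operators act on richer DAG (cell) encodings.

```python
import random

# Toy macro search space: a network is a sequence of layer genes.
LAYER_CHOICES = ["conv3x3", "conv5x5", "maxpool", "identity"]

def mutate(arch, rng, p=0.3):
    # Point mutation over the architecture encoding: each gene is
    # resampled with probability p (structure only, not weights).
    return [rng.choice(LAYER_CHOICES) if rng.random() < p else gene
            for gene in arch]

rng = random.Random(0)
parent = ["conv3x3", "conv3x3", "maxpool", "conv5x5"]
child = mutate(parent, rng)
print(child)  # same length; some genes resampled
```

With weight sharing, the child would inherit the supernet weights for its unchanged genes instead of training from scratch, which is the idea behind node inheritance.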
🔹 Lesson 15: Privacy-Preserving and Federated Machine Learning
Overview: This lesson explores the transition from centralized cloud-based machine learning to distributed, privacy-preserving frameworks. It covers the fundamental architectures of Federated Learning (Horizontal and Vertical) and techniques for enhancing communication efficiency. The lesson also discusses the integration of evolutionary algorithms for Federated Neural Architecture Search (NAS) and Secure Federated Evolutionary Algorithms.
Learning Outcomes:
- Distinguish between centralized cloud learning and distributed on-device learning in terms of privacy, security, and model accuracy.
- Explain the mechanics of core privacy-preserving techniques, including Differential Privacy, Homomorphic Encryption, and Secure Multi-party Computation.
- Analyze the operational flow of Horizontal and Vertical Federated Learning and their respective challenges with non-IID data.
- Evaluate methods for communication efficiency, such as Layer-wise Asynchronous Updates and Ternary Quantization.
- Describe the implementation of Federated Bayesian Optimization and Secure Federated Evolutionary Algorithms using Diffie-Hellman masking.
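The central aggregation step of horizontal federated learning can be sketched as federated averaging (FedAvg): the server averages client parameters weighted by local sample counts. The two-parameter "model" and the client counts are illustrative.

```python
def fed_avg(client_updates):
    # client_updates: list of (num_samples, parameter_vector).
    # The server never sees raw data, only parameters and counts.
    total = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    return [sum(n * w[d] for n, w in client_updates) / total
            for d in range(dim)]

clients = [
    (100, [0.2, -1.0]),  # client A: 100 local samples
    (300, [0.4, -0.6]),  # client B: 300 local samples
]
print(fed_avg(clients))  # [0.35, -0.7]
```

Ternary quantization and layer-wise asynchronous updates compress or stagger exactly these parameter uploads, while Diffie-Hellman masking hides each client's individual contribution from the server.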