diff --git a/phiml-blog-supporting-code/FNO/FNO_README.md b/phiml-blog-supporting-code/FNO/FNO_README.md new file mode 100644 index 0000000..902f3bf --- /dev/null +++ b/phiml-blog-supporting-code/FNO/FNO_README.md @@ -0,0 +1,196 @@ + 
# Fourier Neural Operator for Nonlinear Pendulum

This example demonstrates how to use a Fourier Neural Operator (FNO) to model the dynamics of a nonlinear pendulum with a time\-dependent forcing term. The governing equation is given by the following second\-order ordinary differential equation (ODE):


 $\ddot{\theta} (t)=-\omega_0^2 \sin (\theta (t))+f(t)$, 


where $\theta$ is the angular position, $\omega_0$ is the natural frequency, and $f$ is an external forcing function.


Traditionally, solving this system for a new forcing function $f$ requires numerically solving the ODE from scratch. In contrast, an FNO learns a mapping from the input forcing function $f$ to the corresponding solution $\theta$, enabling rapid inference without re\-solving the equation.


The FNO is trained on a dataset of input\-output pairs $(f_i (t),\theta_i (t))$, and once trained, it can predict $\theta$ for unseen forcing functions. While this approach may be excessive for simple ODEs like the pendulum, it can be advantageous for complex simulations involving partial differential equations (PDEs), where traditional solvers (e.g. the finite element method) are computationally expensive.

```matlab
rng(0); % for reproducibility
```
# Generate or Import Data
```matlab
% Get the path to the main directory
mainDir = findProjectRoot('generatePendulumDataFNO.m');
% If first time generating data, set generateData to true. Else, set to false.
res = 512;
generateData = 1;
if generateData
    g = 9.81; r = 1;
    omega0 = sqrt(g/r);
    x0 = [0;1.99*sqrt(9.81)];
    numSamples = 3000;
    doPlot = 0;
    generatePendulumDataFNO(omega0,x0,numSamples,res,doPlot);
end
```

```matlabTextOutput
FNO data of resolution 512 written to fno_data_512.mat
```

```matlab
% Construct full path to the data file
dataFile = fullfile(mainDir, 'pendulumData', sprintf('fno_data_%d.mat',res));
% Read the data
load(dataFile,'data');
res = 512;

f = data.fSamples;
theta = data.thetaSamples;
t = data.tGrid;

% normalize and center the data
fMean = mean(f, 'all');
fStd = std(f, 0, 'all');
thetaMean = mean(theta, 'all');
thetaStd = std(theta, 0, 'all');
f = (f - fMean) / fStd;
theta = (theta - thetaMean) / thetaStd;

% visualize some of the training data
numPlots = 3;
figure
tiledlayout(numPlots,2)
for i = 1:numPlots
    nexttile
    plot(t,f(i,:));
    title("Observation " + i + newline + "Forcing Function")
    xlabel("$t$",Interpreter='latex');
    ylabel("$f(t)$",Interpreter='latex');

    nexttile
    plot(t,theta(i,:));
    title("ODE Solution")
    xlabel("$t$",Interpreter='latex')
    ylabel("$\theta(t)$",Interpreter='latex')
end
```
# Prepare Training Data
```matlab
numObservations = size(f,1);
[idxTrain,idxValidation,idxTest] = trainingPartitions(numObservations,[0.8 0.1 0.1]);
fTrain = f(idxTrain,:);
thetaTrain = theta(idxTrain,:);
fValidation = f(idxValidation,:);
thetaValidation = theta(idxValidation,:);
fTest = f(idxTest,:);
thetaTest = theta(idxTest,:);
```

The FNO requires input data that contains spatio\-temporal information, so concatenate the time grid with the input data. 
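
The concatenation appends the time grid as a second channel, so each sample carries both the forcing values and the coordinates at which they were sampled, in the "BSC" (batch, spatial, channel) order used below. The following minimal sketch illustrates the resulting shapes with hypothetical toy sizes (not the actual data):

```matlab
% Toy illustration of the grid-concatenation step (hypothetical sizes)
fToy = rand(4,8);                      % 4 observations on an 8-point grid (B x S)
tToy = linspace(0,1,8);                % shared time grid
fToy = cat(3,fToy,repmat(tToy,[4 1])); % append the grid as a channel (B x S x C)
size(fToy)                             % returns 4 8 2
```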

```matlab
tGridTrain = repmat(t, [numel(idxTrain) 1]);
tGridValidation = repmat(t, [numel(idxValidation) 1]);
fTrain = cat(3,fTrain,tGridTrain);
fValidation = cat(3, fValidation, tGridValidation);

size(fTrain)
```

```matlabTextOutput
ans = 1x3
        2400          512            2

```

```matlab
size(fValidation)
```

```matlabTextOutput
ans = 1x3
   300   512     2

```

# Define Neural Network Architecture

The network consists of multiple Fourier\-GeLU blocks connected in series.

```matlab
numModes = 16;
tWidth = 128;
numBlocks = 8;

fourierBlock = [
    fourierLayer(numModes,tWidth)
    geluLayer];

layers = [
    inputLayer([NaN tWidth 2],"BSC");
    convolution1dLayer(1,tWidth,Name="fc0")
    repmat(fourierBlock,[numBlocks 1])
    convolution1dLayer(1,128)
    geluLayer
    convolution1dLayer(1,1)];
```
# Specify Training Options
```matlab
schedule = piecewiseLearnRate(DropFactor=0.5);

options = trainingOptions("adam", ...
    InitialLearnRate=1e-3, ...
    LearnRateSchedule=schedule, ...
    MaxEpochs=50, ...
    MiniBatchSize=64, ...
    Shuffle="every-epoch", ...
    InputDataFormats="BSC", ...
    Plots="training-progress", ...
    ValidationData={fValidation,thetaValidation}, ...
    Verbose=false);
```
# Train the Network
```matlab
rng(0); % for reproducibility
net = trainnet(fTrain,thetaTrain,layers,"mse",options);
```

![figure_0.png](FNO_README_media/figure_0.png)
# Test the Model and Visualize Results
```matlab
tGridTest = repmat(t, [numel(idxTest) 1]);

fTest = cat(3,fTest,tGridTest);
mseTest = testnet(net,fTest,thetaTest,"mse")
```

```matlabTextOutput
mseTest = 0.0741
```


Visualize the predictions on the test set.

```matlab
Y = minibatchpredict(net, fTest);
numTestPlots = 3;
for i = 1:numTestPlots
    figure();
    subplot(2,1,1);
    plot(t,fTest(i,:,1),LineWidth=2.5)
    title("Forcing Function")
    xlabel("$t$",Interpreter="latex")
    ylabel("$f(t)$",Interpreter="latex")
    set(gca,FontSize=14,LineWidth=2.5)
    subplot(2,1,2)
    plot(t,Y(i,:),'b-',LineWidth=2.5,DisplayName='FNO'); hold on
    plot(t,thetaTest(i,:),'k--',LineWidth=2.5,DisplayName='True Solution'); hold off
    title("Angular Position")
    xlabel("$t$",Interpreter="latex")
    ylabel("$\theta(t)$",Interpreter="latex")
    legend(Location='best');
    set(gca,FontSize=14,LineWidth=2.5)
end
```
diff --git a/phiml-blog-supporting-code/FNO/FNO_README_media/figure_0.png b/phiml-blog-supporting-code/FNO/FNO_README_media/figure_0.png new file mode 100644 index 0000000..27c1e09 Binary files /dev/null and b/phiml-blog-supporting-code/FNO/FNO_README_media/figure_0.png differ diff --git a/phiml-blog-supporting-code/FNO/FNO_nonlinear_pendulum.mlx b/phiml-blog-supporting-code/FNO/FNO_nonlinear_pendulum.mlx new file mode 100644 index 0000000..ec57edf Binary files /dev/null and b/phiml-blog-supporting-code/FNO/FNO_nonlinear_pendulum.mlx differ diff --git a/phiml-blog-supporting-code/FNO/FNO_nonlinear_pendulum_code.m b/phiml-blog-supporting-code/FNO/FNO_nonlinear_pendulum_code.m new file mode 100644 index 0000000..f50ee08 --- /dev/null +++ b/phiml-blog-supporting-code/FNO/FNO_nonlinear_pendulum_code.m @@ -0,0 +1,156 @@ +%% Fourier Neural Operator for Nonlinear Pendulum
% This example demonstrates how to use a Fourier Neural Operator (FNO) to model
% the dynamics of a nonlinear pendulum with a time-dependent forcing term. 
The
% governing equation is given by the following second-order ordinary differential
% equation (ODE):
%
% $\ddot{\theta}(t) = -\omega_0^2\sin(\theta(t)) + f(t)$,
%
% where $\theta$ is the angular position, $\omega_0$ is the natural frequency,
% and $f$ is an external forcing function.
%
% Traditionally, solving this system for a new forcing function $f$ requires
% numerically solving the ODE from scratch. In contrast, an FNO learns a mapping
% from the input forcing function $f$ to the corresponding solution $\theta$,
% enabling rapid inference without re-solving the equation.
%
% The FNO is trained on a dataset of input-output pairs $(f_i(t), \theta_i(t))$,
% and once trained, it can predict $\theta$ for unseen forcing functions. While
% this approach may be excessive for simple ODEs like the pendulum, it can be
% advantageous for complex simulations involving partial differential equations
% (PDEs), where traditional solvers (e.g. the finite element method) are computationally
% expensive.

rng(0); % for reproducibility
%% Generate or Import Data

% Get the path to the main directory
mainDir = findProjectRoot('generatePendulumDataFNO.m');
% If first time generating data, set generateData to true. Else, set to false.
res = 512;
generateData = 1;
if generateData
    g = 9.81; r = 1;
    omega0 = sqrt(g/r);
    x0 = [0;1.99*sqrt(9.81)];
    numSamples = 3000;
    doPlot = 0;
    generatePendulumDataFNO(omega0,x0,numSamples,res,doPlot);
end
% Construct full path to the data file
dataFile = fullfile(mainDir, 'pendulumData', sprintf('fno_data_%d.mat',res));
% Read the data
load(dataFile,'data');
res = 512;

f = data.fSamples;
theta = data.thetaSamples;
t = data.tGrid;

% normalize and center the data
fMean = mean(f, 'all');
fStd = std(f, 0, 'all');
thetaMean = mean(theta, 'all');
thetaStd = std(theta, 0, 'all');
f = (f - fMean) / fStd;
theta = (theta - thetaMean) / thetaStd;

% visualize some of the training data
numPlots = 3;
figure
tiledlayout(numPlots,2)
for i = 1:numPlots
    nexttile
    plot(t,f(i,:));
    title("Observation " + i + newline + "Forcing Function")
    xlabel("$t$",Interpreter='latex');
    ylabel("$f(t)$",Interpreter='latex');

    nexttile
    plot(t,theta(i,:));
    title("ODE Solution")
    xlabel("$t$",Interpreter='latex')
    ylabel("$\theta(t)$",Interpreter='latex')
end
%% Prepare Training Data

numObservations = size(f,1);
[idxTrain,idxValidation,idxTest] = trainingPartitions(numObservations,[0.8 0.1 0.1]);
fTrain = f(idxTrain,:);
thetaTrain = theta(idxTrain,:);
fValidation = f(idxValidation,:);
thetaValidation = theta(idxValidation,:);
fTest = f(idxTest,:);
thetaTest = theta(idxTest,:);
%%
% The FNO requires input data that contains spatio-temporal information. Concatenate
% the time grid with the input data.

tGridTrain = repmat(t, [numel(idxTrain) 1]);
tGridValidation = repmat(t, [numel(idxValidation) 1]);
fTrain = cat(3,fTrain,tGridTrain);
fValidation = cat(3, fValidation, tGridValidation);
%% Define Neural Network Architecture
% The network consists of multiple Fourier-GeLU blocks connected in series.

numModes = 16;
tWidth = 128;
numBlocks = 8;

fourierBlock = [
    fourierLayer(numModes,tWidth)
    geluLayer];

layers = [
    inputLayer([NaN tWidth 2],"BSC");
    convolution1dLayer(1,tWidth,Name="fc0")
    repmat(fourierBlock,[numBlocks 1])
    convolution1dLayer(1,128)
    geluLayer
    convolution1dLayer(1,1)];
%% Specify Training Options

schedule = piecewiseLearnRate(DropFactor=0.5);

options = trainingOptions("adam", ... 
    InitialLearnRate=1e-3, ...
    LearnRateSchedule=schedule, ...
    MaxEpochs=50, ...
    MiniBatchSize=64, ...
    Shuffle="every-epoch", ...
    InputDataFormats="BSC", ...
    Plots="training-progress", ...
    ValidationData={fValidation,thetaValidation}, ...
    Verbose=false);
%% Train the Network

rng(0); % for reproducibility
net = trainnet(fTrain,thetaTrain,layers,"mse",options);
%% Test the Model and Visualize Results

tGridTest = repmat(t, [numel(idxTest) 1]);

fTest = cat(3,fTest,tGridTest);
mseTest = testnet(net,fTest,thetaTest,"mse");
%%
% Visualize the predictions on the test set.

Y = minibatchpredict(net, fTest);
numTestPlots = 3;
for i = 1:numTestPlots
    figure();
    subplot(2,1,1);
    plot(t,fTest(i,:,1),LineWidth=2.5)
    title("Forcing Function")
    xlabel("$t$",Interpreter="latex")
    ylabel("$f(t)$",Interpreter="latex")
    set(gca,FontSize=14,LineWidth=2.5)
    subplot(2,1,2)
    plot(t,Y(i,:),'b-',LineWidth=2.5,DisplayName='FNO'); hold on
    plot(t,thetaTest(i,:),'k--',LineWidth=2.5,DisplayName='True Solution'); hold off
    title("Angular Position")
    xlabel("$t$",Interpreter="latex")
    ylabel("$\theta(t)$",Interpreter="latex")
    legend(Location='best');
    set(gca,FontSize=14,LineWidth=2.5)
end \ No newline at end of file diff --git a/phiml-blog-supporting-code/FNO/fnoDataHelper.m b/phiml-blog-supporting-code/FNO/fnoDataHelper.m new file mode 100644 index 0000000..89ab5f3 --- /dev/null +++ b/phiml-blog-supporting-code/FNO/fnoDataHelper.m @@ -0,0 +1,82 @@ +function [fSamples, thetaSamples, grid] = fnoDataHelper(numModes, numSamples, omega0, x0, gridSize, doPlot)
    % Generate samples of the forcing function f from N(0, (-Δ + I)^(-1))
    % numModes: number of Fourier modes
    % numSamples: number of samples to generate
    % omega0: natural frequency
    % x0: initial condition
    % gridSize: number of points in t domain
    % doPlot: if true, plot a few of the generated samples

    % Example usage
    % [fSamples, thetaSamples, grid] = fnoDataHelper(5, 2048, sqrt(9.81), [0; 1.99*sqrt(9.81)], 512, 0);

    rng(0); % for reproducibility

    % Define the time domain
    xMax = 10;
    tSol = linspace(0, xMax, gridSize);

    % Initialize storage for samples
    fSamples = zeros(numSamples, length(tSol));
    thetaSamples = zeros(numSamples, length(tSol));

    % scaling factor
    sigma = 0.1;

    % Generate samples
    for j = 1:numSamples
        % Initialize the sample
        f = zeros(size(tSol));

        % Generate the Fourier series
        for k = 1:numModes
            % Compute the eigenvalue for the covariance operator
            lambda_k = 1 / (k^2 + 1);

            % Generate random coefficients
            a_k = sigma*sqrt(lambda_k) * randn;
            b_k = sigma*sqrt(lambda_k) * randn;

            % Update the sample with the cosine and sine terms
            f = f + a_k * cos(2 * pi * k * tSol/xMax) + b_k * sin(2 * pi * k * tSol/xMax);
        end

        % Store the sample
        fSamples(j, :) = f;

        % Solve the ode for theta
        F = ode;
        fInterp = @(x) interp1(tSol, f, x, 'spline', 'extrap');
        F.ODEFcn = @(t, y) [y(2); -omega0^2 * sin(y(1)) + fInterp(t)];
        F.InitialValue = x0;
        F.Solver = "ode45";
        sol = solve(F,tSol);
        thetaSamples(j, :) = sol.Solution(1,:);

    end

    if doPlot
        % Plot a few samples
        figure;
        hold on;
        for i = 1:min(numSamples, 5)
            plot(tSol, fSamples(i, :));
        end
        title('Samples of Forcing Function f');
        xlabel('t');
        ylabel('f(t)');
        hold off;

        figure;
        hold on;
        for i = 1:min(numSamples, 5)
            plot(tSol, thetaSamples(i, :));
        end
        title('Samples of Solution Function \theta');
        xlabel('t');
        ylabel('\theta(t)');
        hold off;
    end

    grid = tSol;
end

diff --git a/phiml-blog-supporting-code/FNO/fourierLayer.m 
b/phiml-blog-supporting-code/FNO/fourierLayer.m new file mode 100644 index 0000000..7389f2b --- /dev/null +++ b/phiml-blog-supporting-code/FNO/fourierLayer.m @@ -0,0 +1,27 @@ +function layer = fourierLayer(numModes,tWidth,args)

arguments
    numModes
    tWidth
    args.Name = ""
end
name = args.Name;

net = dlnetwork;

layers = [
    identityLayer(Name="in")
    spectralConvolution1dLayer(numModes,tWidth,Name="specConv")
    additionLayer(2,Name="add")];

net = addLayers(net,layers);

layer = convolution1dLayer(1,tWidth,Name="fc");
net = addLayers(net,layer);

net = connectLayers(net,"in","fc");
net = connectLayers(net,"fc","add/in2");

layer = networkLayer(net,Name=name);

end \ No newline at end of file diff --git a/phiml-blog-supporting-code/FNO/spectralConvolution1dLayer.m b/phiml-blog-supporting-code/FNO/spectralConvolution1dLayer.m new file mode 100644 index 0000000..9248193 --- /dev/null +++ b/phiml-blog-supporting-code/FNO/spectralConvolution1dLayer.m @@ -0,0 +1,97 @@ +classdef spectralConvolution1dLayer < nnet.layer.Layer ...
        & nnet.layer.Formattable ...
        & nnet.layer.Acceleratable
    % spectralConvolution1dLayer   1-D Spectral Convolution Layer

    properties
        NumChannels
        OutputSize
        NumModes
    end

    properties (Learnable)
        Weights
    end

    methods
        function this = spectralConvolution1dLayer(numModes,outChannels,args)
            % spectralConvolution1dLayer   1-D Spectral Convolution Layer
            %
            %   layer = spectralConvolution1dLayer(numModes, outChannels)
            %   creates a spectral convolution 1d layer. outChannels
            %   specifies the number of channels in the layer output.
            %   numModes specifies the number of modes which are combined
            %   in Fourier space.
            %
            %   layer = spectralConvolution1dLayer(numModes, outChannels,
            %   Name=Value) specifies additional options using one or more
            %   name-value arguments:
            %
            %   Name    - Name for the layer. The default value is "".
            %
            %   Weights - Complex learnable array of size
            %             (inChannels)x(outChannels)x(numModes). The
            %             default value is [].

            arguments
                numModes (1,1) double
                outChannels (1,1) double
                args.Name {mustBeTextScalar} = "spectralConv1d"
                args.Weights = []
            end

            this.OutputSize = outChannels;
            this.NumModes = numModes;
            this.Name = args.Name;
            this.Weights = args.Weights;
        end

        function this = initialize(this, ndl)
            inChannels = ndl.Size( finddim(ndl,'C') );
            outChannels = this.OutputSize;
            numModes = this.NumModes;

            if isempty(this.Weights)
                this.NumChannels = inChannels;
                this.Weights = 1./(inChannels*outChannels).*( ...
                    rand([inChannels outChannels numModes]) + ... 
                    1i.*rand([inChannels outChannels numModes]) );
            else
                assert( inChannels == this.NumChannels, 'The input channel size must match the layer' );
            end
        end

        function y = predict(this, x)
            % First compute the rfft, normalized and one-sided
            x = real(x);
            x = stripdims(x);
            N = size(x, 1);
            xft = iRFFT(x, 1, N);

            % Multiply selected Fourier modes
            xft = permute(xft(1:this.NumModes, :, :), [3 2 1]);
            yft = pagemtimes( xft, this.Weights );
            yft = permute(yft, [3 2 1]);

            S = floor(N/2)+1 - this.NumModes;
            yft = cat(1, yft, zeros([S size(yft, 2:3)], 'like', yft));

            % Return to physical space via irfft, normalized and one-sided
            y = iIRFFT(yft, 1, N);

            % Re-apply labels
            y = dlarray(y, 'SCB');
            y = real(y);
        end
    end
end

function y = iRFFT(x, dim, N)
y = fft(x, [], dim);
y = y(1:floor(N/2)+1, :, :)./N;
end

function y = iIRFFT(x, dim, N)
x(end+1:N, :, :, :) = conj( x(ceil(N/2):-1:2, :, :, :) );
y = ifft(N.*x, [], dim, 'symmetric');
end
diff --git a/phiml-blog-supporting-code/FNO/trainingPartitions.m b/phiml-blog-supporting-code/FNO/trainingPartitions.m new file mode 100644 index 0000000..28cbefa --- /dev/null +++ b/phiml-blog-supporting-code/FNO/trainingPartitions.m @@ -0,0 +1,49 @@ +function varargout = trainingPartitions(numObservations,splits)
%TRAININGPARTITIONS Random indices for splitting training data
%   [idx1,...,idxN] = trainingPartitions(numObservations,splits) returns
%   random vectors of indices to help split a data set with the specified
%   number of observations, where SPLITS is a vector of length N of
%   partition sizes that sum to one.
%
%   % Example: Get indices for 50%-50% training-test split of 500
%   % observations.
%   [idxTrain,idxTest] = trainingPartitions(500,[0.5 0.5])
%
%   % Example: Get indices for 80%-10%-10% training, validation, test split
%   % of 500 observations.
%   [idxTrain,idxValidation,idxTest] = trainingPartitions(500,[0.8 0.1 0.1])

arguments
    numObservations (1,1) {mustBePositive}
    splits {mustBeVector,mustBeInRange(splits,0,1,"exclusive"),mustSumToOne}
end

rng(0); % for reproducibility

numPartitions = numel(splits);
varargout = cell(1,numPartitions);

idx = randperm(numObservations);

idxEnd = 0;

for i = 1:numPartitions-1
    idxStart = idxEnd + 1;
    idxEnd = idxStart + floor(splits(i)*numObservations) - 1;

    varargout{i} = idx(idxStart:idxEnd);
end

% Last partition.
varargout{end} = idx(idxEnd+1:end);

end

function mustSumToOne(v)
% Validate that value sums to one.

if sum(v,"all") ~= 1
    error("Value must sum to one.")
end

end
diff --git a/phiml-blog-supporting-code/HNN/HNN_README.md b/phiml-blog-supporting-code/HNN/HNN_README.md new file mode 100644 index 0000000..dedee65 --- /dev/null +++ b/phiml-blog-supporting-code/HNN/HNN_README.md @@ -0,0 +1,291 @@ + 



# Hamiltonian Neural Network for Nonlinear Pendulum

This example demonstrates how to use a **Hamiltonian Neural Network (HNN)** to learn the dynamics of a nonlinear pendulum from noisy observational data. HNNs incorporate physical structure by learning the system's Hamiltonian and using Hamilton's equations to compute the time evolution of the state.


The true dynamics of the nonlinear pendulum are governed by the second\-order differential equation:


 $\ddot{\theta} =-\omega_0^2 \sin \theta$, 


where $\theta$ is the angular position and $\omega_0$ is the natural frequency. 
We assume access to noisy measurements of $\theta$ and the angular velocity $\dot{\theta}$, as well as their time derivatives, which in this example are approximated numerically.


The input to the HNN is the state vector $[q,p]=[\theta ,\dot{\theta} ]$. The network outputs the scalar\-valued Hamiltonian $\mathcal{H}(q,p)$. Using automatic differentiation, the training penalizes deviations from Hamilton's equations via a custom loss function:

 $$ L\left(q,p\right)=\left\|\frac{dq}{dt}-\frac{\partial \mathcal{H}}{\partial p}\right\|_2^2 +\left\|\frac{dp}{dt}+\frac{\partial \mathcal{H}}{\partial q}\right\|_2^2 $$ 

to ensure the learned Hamiltonian produces physically consistent dynamics. Once the network is trained, predictions are made by solving Hamilton's equations

 $ \frac{dq}{dt}=\frac{\partial \mathcal{H}}{\partial p}, $ $ \frac{dp}{dt}=-\frac{\partial \mathcal{H}}{\partial q} $ 

using the learned Hamiltonian.

```matlab
rng(0); % for reproducibility
```


# Prepare Data for Training

Load the data contained in `pendulum_qp_dqdp.mat` if it already exists, or generate and save the data if not.

```matlab
% Get the path to the main directory
mainDir = findProjectRoot('generatePendulumData.m');
% If first time generating data, set generateData to true. Else, set to false.
generateData = 1;
if generateData
    g = 9.81; r = 1;
    omega0 = sqrt(g/r);
    x0 = [0;1.99*sqrt(9.81)];
    tSpan = linspace(0,20,400);
    noiseLevel = 0.01;
    doPlot = 0;
    generatePendulumData(omega0,x0,tSpan,noiseLevel,doPlot);
end
```

```matlabTextOutput
Pendulum data written to pendulum_qp_dqdp.mat
```

```matlab
% Construct full path to the data file
dataFile = fullfile(mainDir, 'pendulumData', 'pendulum_qp_dqdp.mat');
% Read the data
load(dataFile,'data');
theta = data.thetaNoisy; % Noisy angle measurements (radians)
omega = data.omegaNoisy; % Noisy angular velocity measurements (radians/sec)
t = data.t; % Time vector

% Numerical derivatives (targets)
qdot = data.thetaDot; % d(theta)/dt
pdot = data.omegaDot; % d(omega)/dt
```

Use data up to $t=10$ for training.

```matlab
inds = find(t <= 10);
tTrain = t(inds);
q = theta(inds);
p = omega(inds);
qdot = qdot(inds);
pdot = pdot(inds);
```

Prepare the data.

```matlab
inputs = [q p]; % (N)x(2) array
targets = [qdot pdot]; % (N)x(2) array
```

Convert the data to an array datastore.

```matlab
inputds = arrayDatastore(inputs, IterationDimension=1);
targetds = arrayDatastore(targets, IterationDimension=1);
ds = combine(inputds, targetds);
```


# Build the HNN Architecture

We represent the Hamiltonian $\mathcal{H}$ using a simple multilayer perceptron (MLP). The network takes a 2\-dimensional input and outputs a single scalar value for the Hamiltonian. 

The architecture consists of:

- Input: 2\-dimensional state vector $[q,p]=[\theta ,\dot{\theta} ]$
- Hidden layers: 2 [fully connected layers](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.fullyconnectedlayer.html), each with 64 neurons
- Activation function: [Hyperbolic tangent (tanh)](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.tanhlayer.html)
- Output: 1\-dimensional value representing the Hamiltonian $\mathcal{H}(q,p)$
```matlab
inputSize = 2;
outputSize = 1;
numHiddenLayers = 2;
hiddenSize = 64;

fcBlock = [
    fullyConnectedLayer(hiddenSize)
    tanhLayer];
layers = [
    featureInputLayer(inputSize)
    repmat(fcBlock,[numHiddenLayers 1])
    fullyConnectedLayer(outputSize)];
rng(0); % for reproducibility
net = dlnetwork(layers);
net = dlupdate(@double, net);
```


# Specify Training Options

Here we define some of the options we will use for the Adam optimizer in our custom training loop.

```matlab
numEpochs = 1200;
miniBatchSize = 10;
executionEnvironment = "auto";
initialLearnRate = 0.01;
decayRate = 1e-4;
N = length(tTrain);
numIterationsPerEpoch = ceil(N/miniBatchSize);
numIterations = numEpochs*numIterationsPerEpoch;
```

Create a mini\-batch queue.

```matlab
mbq = minibatchqueue(ds, ...
    MiniBatchSize=miniBatchSize, ...
    MiniBatchFormat="BC", ...
    OutputEnvironment=executionEnvironment, ...
    OutputCast="double");
averageGrad = [];
averageSqGrad = [];
```

Accelerate training.

```matlab
accfun = dlaccelerate(@modelLoss);
```

Create a monitor to track the training progress.

```matlab
monitor = trainingProgressMonitor( ...
    Metrics="TrainingLoss", ...
    XLabel="Iteration", ...
    Info="Epoch");
```

![figure_0.png](HNN_README_media/figure_0.png)

```matlab
yscale(monitor,"TrainingLoss","log");
```


# Train the Network

We use a custom training loop. For more information on custom training loops, see [Custom Training Loops](https://www.mathworks.com/help/deeplearning/custom-training-loops.html).

```matlab
iteration = 0;
epoch = 0;
monitor.Status = "Running";
tol = 1e-2;
loss = 1000;
while epoch < numEpochs && ~monitor.Stop && loss > tol
    epoch = epoch + 1;
    rng(0); % for reproducibility
    shuffle(mbq);
    while hasdata(mbq) && ~monitor.Stop
        iteration = iteration + 1;
        [X, T] = mbq.next();
        [loss, grad] = dlfeval(accfun, net, X, T);
        % update learning rate
        learningRate = initialLearnRate/(1+decayRate*iteration);
        % Update the network parameters using the adamupdate function.
        [net,averageGrad,averageSqGrad] = adamupdate(net,grad,averageGrad, ...
            averageSqGrad,iteration,learningRate);
        recordMetrics(monitor, iteration, TrainingLoss=loss);
        updateInfo(monitor,Epoch=string(epoch)+" of "+string(numEpochs));
    end
    monitor.Progress = 100*epoch/numEpochs;
end
```

Update the training status.

```matlab
if monitor.Stop == 1
    monitor.Status = "Training stopped";
else
    monitor.Status = "Training complete";
end
```

![figure_1.png](HNN_README_media/figure_1.png)


# Predict the Pendulum Trajectory

Use the model to make predictions with the Hamiltonian NN. In order to make predictions, we need to solve the ODE system:


 $\frac{dq}{dt}=\frac{\partial H}{\partial p}$, $\frac{dp}{dt}=-\frac{\partial H}{\partial q}$. 
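
Before integrating, you can optionally sanity\-check the learned Hamiltonian itself. The sketch below is illustrative and not part of the original workflow: it assumes `omega0` is still in scope from the data\-generation step, and it mean\-shifts both surfaces because the Hamiltonian is only learned up to an additive constant. For this pendulum the analytic Hamiltonian is $\mathcal{H}(q,p)=\frac{p^2}{2}+\omega_0^2 (1-\cos q)$.

```matlab
% Optional sanity check (illustrative sketch): compare level sets of the
% learned Hamiltonian with the analytic pendulum Hamiltonian.
[qGrid,pGrid] = meshgrid(linspace(-pi,pi,50),linspace(-6,6,50));
HLearned = predict(net,dlarray([qGrid(:)'; pGrid(:)'],"CB"));
HLearned = reshape(extractdata(HLearned),size(qGrid));
HTrue = pGrid.^2/2 + omega0^2*(1-cos(qGrid));
figure
contour(qGrid,pGrid,HLearned-mean(HLearned,'all'),20); hold on
contour(qGrid,pGrid,HTrue-mean(HTrue,'all'),20,'--'); hold off
xlabel('$q$',Interpreter='latex'); ylabel('$p$',Interpreter='latex')
title('Learned vs. analytic Hamiltonian (mean-shifted)')
```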
+ +```matlab +accModel = dlaccelerate(@model); +tspan = tTrain; +x0 = dlarray([q(1); p(1)]); % initial conditions +odeFcn = @(ts,xs) dlfeval(accModel, net, xs); +[~, qp] = ode45(odeFcn, tspan, x0); +qp = qp'; % Transpose to return to (2)x(N) +``` + + +# Visualize the Results +```matlab +% Plot phase portrait +figure(); +plot(q,p,'ko',DisplayName='Noisy Data',LineWidth=1); hold on +plot(qp(1,:),qp(2,:),'b-',DisplayName='HNN',LineWidth=3); +xlabel('$\theta$ (radians)',Interpreter='latex') +ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex') +title('Phase-Space: $\theta$ vs. $\dot{\theta}$',Interpreter='latex'); +legend('Location', 'best'); +set(gca,FontSize=14,LineWidth=2.5) +hold off +``` + +![figure_2.png](HNN_README_media/figure_2.png) + +```matlab +% Plot time vs angular position +figure; +subplot(2,1,1); +plot(tspan, q, 'ko', DisplayName='Noisy Data', LineWidth=1); hold on +plot(tspan, qp(1,:), 'b-', DisplayName='HNN', LineWidth=3); +legend(); +ylabel('$\theta$ (radians)',Interpreter='latex'); +set(gca,FontSize=14,LineWidth=2.5) + +% Plot time vs angular velocity +subplot(2,1,2); +plot(tspan, p, 'ko', DisplayName='Noisy Data', LineWidth=1); hold on +plot(tspan, qp(2,:), 'b-', DisplayName='HNN', LineWidth=3); +ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex'); +xlabel('Time (s)',Interpreter='latex'); +set(gca,FontSize=14,LineWidth=2.5) +legend(); +``` + +![figure_3.png](HNN_README_media/figure_3.png) + + +## Helper Functions +```matlab +function [loss,gradients] = modelLoss(net, qp, qpdotTarget) +qpdot = model(net, qp); +loss = l2loss(qpdot, qpdotTarget, DataFormat="CB"); +gradients = dlgradient(loss,net.Learnables); +end + +function qpdot = model(net, qp) +H = forward(net,dlarray(qp,'CB')); +H = stripdims(H); +qp = stripdims(qp); +dH = dljacobian(H,qp,1); +qpdot = [dH(2,:); -1.*dH(1,:)]; +end +``` diff --git a/phiml-blog-supporting-code/HNN/HNN_README_media/figure_0.png b/phiml-blog-supporting-code/HNN/HNN_README_media/figure_0.png new file mode 100644 index 0000000..394a828 Binary files /dev/null and b/phiml-blog-supporting-code/HNN/HNN_README_media/figure_0.png differ diff --git a/phiml-blog-supporting-code/HNN/HNN_README_media/figure_1.png b/phiml-blog-supporting-code/HNN/HNN_README_media/figure_1.png new file mode 100644 index 0000000..bf5e3ad Binary files /dev/null and b/phiml-blog-supporting-code/HNN/HNN_README_media/figure_1.png differ diff --git a/phiml-blog-supporting-code/HNN/HNN_README_media/figure_2.png b/phiml-blog-supporting-code/HNN/HNN_README_media/figure_2.png new file mode 100644 index 0000000..4468e6e Binary files /dev/null and b/phiml-blog-supporting-code/HNN/HNN_README_media/figure_2.png differ diff --git a/phiml-blog-supporting-code/HNN/HNN_README_media/figure_3.png b/phiml-blog-supporting-code/HNN/HNN_README_media/figure_3.png new file mode 100644 index 0000000..ae945a1 Binary files /dev/null and b/phiml-blog-supporting-code/HNN/HNN_README_media/figure_3.png differ diff --git a/phiml-blog-supporting-code/HNN/HNN_nonlinear_pendulum.mlx b/phiml-blog-supporting-code/HNN/HNN_nonlinear_pendulum.mlx new file mode 100644 index 0000000..6197dad Binary files /dev/null and b/phiml-blog-supporting-code/HNN/HNN_nonlinear_pendulum.mlx differ diff --git a/phiml-blog-supporting-code/HNN/HNN_nonlinear_pendulum_code.m b/phiml-blog-supporting-code/HNN/HNN_nonlinear_pendulum_code.m new file mode 100644 index 0000000..25f3b88 --- /dev/null +++ b/phiml-blog-supporting-code/HNN/HNN_nonlinear_pendulum_code.m @@ -0,0 +1,238 @@ +%% Hamiltonian Neural Network for Nonlinear 
Pendulum
% This example demonstrates how to use a *Hamiltonian Neural Network (HNN)*
% to learn the dynamics of a nonlinear pendulum from noisy observational data.
% HNNs incorporate physical structure by learning the system's Hamiltonian and
% using Hamilton's equations to compute the time evolution of the state.
%
% The true dynamics of the nonlinear pendulum are governed by the second-order
% differential equation:
%
% $\ddot{\theta} = -\omega_0^2\sin\theta$,
%
% where $\theta$ is the angular position and $\omega_0$ is the natural frequency.
% We assume access to noisy measurements of $\theta$ and the angular velocity
% $\dot{\theta}$, as well as their time derivatives, which in this example are
% approximated numerically.
%
% The input to the HNN is the state vector $[q,p]=[\theta,\dot{\theta}]$. The
% network outputs the scalar-valued Hamiltonian $\mathcal{H}(q,p)$. Using automatic
% differentiation, the training penalizes deviations from Hamilton's equations
% via a custom loss function:
%
% $$L\left(q,p\right)=\left\|\frac{dq}{dt}-\frac{\partial
% \mathcal{H}}{\partial p}\right\|_2^2 +\left\|\frac{dp}{dt}+\frac{\partial
% \mathcal{H}}{\partial q}\right\|_2^2$$
%
% to ensure the learned Hamiltonian produces physically consistent dynamics.
% Once the network is trained, predictions are made by solving Hamilton's equations
%
% $$\frac{dq}{dt}=\frac{\partial \mathcal{H}}{\partial p},$$ $$\frac{dp}{dt}
% = -\frac{\partial \mathcal{H}}{\partial q}$$
%
% using the learned Hamiltonian.

rng(0); % for reproducibility
%% Prepare Data for Training
% Load the data contained in |pendulum_qp_dqdp.mat| if it already exists, or
% generate and save the data if not.

% Get the path to the main directory
mainDir = findProjectRoot('generatePendulumData.m');
% If first time generating data, set generateData to true. Else, set to false.
generateData = 1;
if generateData
    g = 9.81; r = 1;
    omega0 = sqrt(g/r);
    x0 = [0;1.99*sqrt(9.81)];
    tSpan = linspace(0,20,400);
    noiseLevel = 0.01;
    doPlot = 0;
    generatePendulumData(omega0,x0,tSpan,noiseLevel,doPlot);
end
% Construct full path to the data file
dataFile = fullfile(mainDir, 'pendulumData', 'pendulum_qp_dqdp.mat');
% Read the data
load(dataFile,'data');
theta = data.thetaNoisy; % Noisy angle measurements (radians)
omega = data.omegaNoisy; % Noisy angular velocity measurements (radians/sec)
t = data.t; % Time vector

% Numerical derivatives (targets)
qdot = data.thetaDot; % d(theta)/dt
pdot = data.omegaDot; % d(omega)/dt
%%
% Use data up to $t = 10$ for training.

inds = find(t <= 10);
tTrain = t(inds);
q = theta(inds);
p = omega(inds);
qdot = qdot(inds);
pdot = pdot(inds);
%%
% Prepare the data.

inputs = [q p]; % (N)x(2) array
targets = [qdot pdot]; % (N)x(2) array
%%
% Convert the data to an array datastore.

inputds = arrayDatastore(inputs, IterationDimension=1);
targetds = arrayDatastore(targets, IterationDimension=1);
ds = combine(inputds, targetds);
%% Build the HNN Architecture
% We represent the Hamiltonian $\mathcal{H}$ using a simple multilayer perceptron
% (MLP). The network takes a 2-dimensional input and outputs a single scalar value
% for the Hamiltonian. 
+% +% The architecture consists of: +%% +% * Input: 2-dimensional state vector $[q,p]=[\theta,\dot{\theta}]$ +% * Hidden layers: 2 , each with 64 neurons +% * Activation function: +% * Output: 1-dimensional value representing the Hamiltonian $\mathcal{H}(q,p)$ + +inputSize = 2; +outputSize = 1; +numHiddenLayers = 2; +hiddenSize = 64; + +fcBlock = [ + fullyConnectedLayer(hiddenSize) + tanhLayer]; +layers = [ + featureInputLayer(inputSize) + repmat(fcBlock,[numHiddenLayers 1]) + fullyConnectedLayer(outputSize)]; +rng(0); % for reproducibility +net = dlnetwork(layers); +net = dlupdate(@double, net); +%% Specify Training Options +% Here we define some of the options we will use for the ADAM optimizer in our +% custom training loop. + +numEpochs = 1200; +miniBatchSize = 10; +executionEnvironment = "auto"; +initialLearnRate = 0.01; +decayRate = 1e-4; +N = length(tTrain); +numIterationsPerEpoch = ceil(N/miniBatchSize); +numIterations = numEpochs*numIterationsPerEpoch; +%% +% Create a mini-batch queue. + +mbq = minibatchqueue(ds, ... + MiniBatchSize=miniBatchSize, ... + MiniBatchFormat="BC", ... + OutputEnvironment=executionEnvironment, ... + OutputCast="double"); +averageGrad = []; +averageSqGrad = []; +%% +% Accelerate training. + +accfun = dlaccelerate(@modelLoss); +%% +% Create a monitor to track the training progress. + +monitor = trainingProgressMonitor( ... + Metrics="TrainingLoss", ... + XLabel="Iteration", ... + Info="Epoch"); +yscale(monitor,"TrainingLoss","log"); +%% Train the Network +% We use a custom training loop. For more information on custom training loops, +% see + +iteration = 0; +epoch = 0; +monitor.Status = "Running"; +tol = 1e-2; +loss = 1000; +while epoch < numEpochs && ~monitor.Stop && loss > tol + epoch = epoch + 1; + rng(0); % for reproducibility + shuffle(mbq); + while hasdata(mbq) && ~monitor.Stop + iteration = iteration + 1; + [X, T] = mbq.next(); + [loss, grad] = dlfeval(accfun, net, X, T); + % update learning rate + learningRate = initialLearnRate/(1+decayRate*iteration); + % Update the network parameters using the adamupdate function. + [net,averageGrad,averageSqGrad] = adamupdate(net,grad,averageGrad, ... + averageSqGrad,iteration,learningRate); + recordMetrics(monitor, iteration, TrainingLoss=loss); + updateInfo(monitor,Epoch=string(epoch)+" of "+string(numEpochs)); + end + monitor.Progress = 100*epoch/numEpochs; +end +%% +% Update the training status. + +if monitor.Stop == 1 + monitor.Status = "Training stopped"; +else + monitor.Status = "Training complete"; +end +%% Predict the Pendulum Trajectory +% Use the model to make predictions with the Hamiltonian NN. In order to make +% predictions, we need to solve the ODE system: +% +% $\frac{dq}{dt} = \frac{\partial H}{\partial p}$, $\frac{dp}{dt} = -\frac{\partial +% H}{\partial q}$. + +accModel = dlaccelerate(@model); +tspan = tTrain; +x0 = dlarray([q(1); p(1)]); % initial conditions +odeFcn = @(ts,xs) dlfeval(accModel, net, xs); +[~, qp] = ode45(odeFcn, tspan, x0); +qp = qp'; % Transpose to return to (2)x(N) +%% Visualize the Results + +% Plot phase portrait +figure(); +plot(q,p,'ko',DisplayName='Noisy Data',LineWidth=1); hold on +plot(qp(1,:),qp(2,:),'b-',DisplayName='HNN',LineWidth=3); +xlabel('$\theta$ (radians)',Interpreter='latex') +ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex') +title('Phase-Space: $\theta$ vs. 
$\dot{\theta}$',Interpreter='latex'); +legend('Location', 'best'); +set(gca,FontSize=14,LineWidth=2.5) +hold off +% Plot time vs angular position +figure; +subplot(2,1,1); +plot(tspan, q, 'ko', DisplayName='Noisy Data', LineWidth=1); hold on +plot(tspan, qp(1,:), 'b-', DisplayName='HNN', LineWidth=3); +legend(); +ylabel('$\theta$ (radians)',Interpreter='latex'); +set(gca,FontSize=14,LineWidth=2.5) + +% Plot time vs angular velocity +subplot(2,1,2); +plot(tspan, p, 'ko', DisplayName='Noisy Data', LineWidth=1); hold on +plot(tspan, qp(2,:), 'b-', DisplayName='HNN', LineWidth=3); +ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex'); +xlabel('Time (s)',Interpreter='latex'); +set(gca,FontSize=14,LineWidth=2.5) +legend(); +% Helper Functions + +function [loss,gradients] = modelLoss(net, qp, qpdotTarget) +qpdot = model(net, qp); +loss = l2loss(qpdot, qpdotTarget, DataFormat="CB"); +gradients = dlgradient(loss,net.Learnables); +end + +function qpdot = model(net, qp) +H = forward(net,dlarray(qp,'CB')); +H = stripdims(H); +qp = stripdims(qp); +dH = dljacobian(H,qp,1); +qpdot = [dH(2,:); -1.*dH(1,:)]; +end \ No newline at end of file diff --git a/phiml-blog-supporting-code/NeuralODE/NeuralODE_README.md b/phiml-blog-supporting-code/NeuralODE/NeuralODE_README.md new file mode 100644 index 0000000..a10623b --- /dev/null +++ b/phiml-blog-supporting-code/NeuralODE/NeuralODE_README.md @@ -0,0 +1,202 @@ + +# Neural ODE for Noisy Pendulum Data + +This example demonstrates how to use a **Neural Ordinary Differential Equation (Neural ODE)** to model the dynamics of a nonlinear pendulum from noisy data. The Neural ODE learns the time evolution of the pendulum's angle $\theta$ and angular velocity $\dot{\theta}$ directly from observations. + + +The true dynamics of the nonlinear pendulum are governed by the second\-order differential equation: + + + $\ddot{\theta} =-\omega_0^2 \sin \theta$. + + +We rewrite this as a first\-order system: + + $$ \frac{d}{\textrm{dt}}\left\lbrack \begin{array}{c} \theta \\ \dot{\theta} \end{array}\right\rbrack =\mathit{\mathbf{f}}\left(\theta ,\dot{\theta} \right) $$ + +In the Neural ODE framework, the unknown dynamics $\mathbf{f}$ are approximated using a neural network, which is trained to fit the observed trajectories. + +```matlab +rng(0); % for reproducibility +warning('off'); +``` +# Prepare Data for Training + +Load the data contained in `pendulum_qp_dqdp.mat` if it already exists, or generate and save the data if not. + +```matlab +% Get the path to the main directory +mainDir = findProjectRoot('generatePendulumData.m'); +% If first time generating data, set generateData to true. Else, set to false. 
generateData = 1;
if generateData
    g = 9.81; r = 1;
    omega0 = sqrt(g/r);
    x0 = [0;1.99*sqrt(9.81)];
    tSpan = linspace(0,20,400);
    noiseLevel = 0.01;
    doPlot = 0;
    generatePendulumData(omega0,x0,tSpan,noiseLevel,doPlot);
end
% Construct full path to the data file
dataFile = fullfile(mainDir, 'pendulumData', 'pendulum_qp_dqdp.mat');
% Read the data
load(dataFile,'data');
theta = data.thetaNoisy; % Noisy angle measurements (radians)
omega = data.omegaNoisy; % Noisy angular velocity measurements (radians/sec)
t = data.t; % Time vector
```

Train on the dataset where $t\le 10$.

```matlab
inds = find(t <= 10);
tTrain = t(inds);
```

Combine into a single matrix: $[\theta ,\omega ]$

```matlab
Y = [theta(inds), omega(inds)]; % N x 2
```

Prepare target data (excluding the initial condition)

```matlab
Ytarget = Y(2:end, :); % (N-1) x 2
```

Format for Neural ODE: (Channels x Batch x Time)

```matlab
Ytrain = permute(Ytarget, [2 3 1]); % Now size is 2 x 1 x (N-1) (C x B x T)
Ytrain = dlarray(Ytrain, "CBT");
```

Initial condition (first row of `theta` and `omega`)

```matlab
Xtrain = reshape([theta(1); omega(1)], [2 1]); % 2 x 1
Xtrain = dlarray(Xtrain, "CB");
```
# Define the Neural ODE model

To model the right\-hand side of the ODE, we define a neural network that approximates the unknown dynamics function $f$. Here we use a multilayer perceptron (MLP) with the following architecture:

- Input: 2\-dimensional state vector $[\theta ,\dot{\theta} ]$
- Hidden layers: 2 [fully connected layers](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.fullyconnectedlayer.html), each with 32 neurons
- Activation function: [Gaussian error linear unit (GELU)](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.gelulayer.html)
- Output: 2\-dimensional vector representing $\frac{d}{dt}[\theta ,\dot{\theta} ]$
```matlab
fLayers = [
    featureInputLayer(2, Name="input")
    fullyConnectedLayer(32)
    geluLayer
    fullyConnectedLayer(32)
    geluLayer
    fullyConnectedLayer(2, Name="output")
];
rng(0); % for reproducibility
fNet = dlnetwork(fLayers);
```

Instead of training the model on the derivatives $\dot{\theta}$ and $\ddot{\theta}$, we use an [ODE solver](https://www.mathworks.com/help/deeplearning/ref/dlarray.dlode45.html) to integrate the output of `fNet` over time, starting from the initial condition `Xtrain`. The solver produces a predicted trajectory of the state $[\theta ,\dot{\theta} ]$, which is then compared to the observed data `Ytrain`, and the network is optimized to minimize the error between the predicted and actual trajectories.


The [neuralODELayer](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.neuralodelayer.html) approximately solves the system


 $\frac{d}{\textrm{dt}}\left\lbrack \begin{array}{c} \theta \\ \dot{\theta} \end{array}\right\rbrack =f\left(\theta ,\dot{\theta} \right)$, 


where the right\-hand side is the output of `fNet`.

```matlab
nODElayers = [
    featureInputLayer(2, Name="IC_in")
    neuralODELayer(fNet, tTrain, ...
        GradientMode="adjoint", ...
        Name="ODElayer")
];
rng(0); % for reproducibility
nODEnet = dlnetwork(nODElayers);
```
# Specify Training Options

Here we use the Adam optimizer with a mini\-batch size of 50 and a maximum of 500 epochs.

```matlab
opts = trainingOptions("adam", ...
    Plots="training-progress", ...
    MiniBatchSize=50, ...
    MaxEpochs=500, ...
    LearnRateSchedule="piecewise", ...
    LearnRateDropFactor=0.1, ... 
    LearnRateDropPeriod=1200, ...
    InitialLearnRate=0.01, ...
    ExecutionEnvironment="cpu", ...
    Verbose=false);
```
# Train the Network

Train the model to fit the noisy pendulum data.

```matlab
rng(0); % for reproducibility
nODEnet = trainnet(Xtrain, Ytrain, nODEnet, "l2loss", opts);
```

![figure_0.png](NeuralODE_nonlinear_pendulum_media/figure_0.png)
# Predict the Pendulum Trajectory

Use the trained network to predict the pendulum's trajectory.

```matlab
Ypred = predict(nODEnet, Xtrain); % Output: 2 x 1 x (N-1)
Ypred = extractdata(Ypred); % Convert from dlarray to numeric
Ypred = squeeze(Ypred)'; % (N-1) x 2
```
# Visualize the Results

Plot the angle $\theta$ over time.

```matlab
figure;
subplot(2,1,1);
plot(tTrain, theta(inds), 'ko', DisplayName='Noisy Data',LineWidth=1); hold on
plot(tTrain(2:end), Ypred(:,1), 'b-', LineWidth=3,DisplayName='Neural ODE'); hold off
legend();
ylabel('$\theta$ (radians)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
```

Plot the angular velocity $\dot{\theta}$ over time.

```matlab
subplot(2,1,2);
plot(tTrain,omega(inds),'ko',DisplayName='Noisy Data',LineWidth=1); hold on
plot(tTrain(2:end), Ypred(:,2), 'b-',DisplayName='Neural ODE',LineWidth=3);
legend();
ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex');
xlabel('Time (s)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
```

![figure_1.png](NeuralODE_nonlinear_pendulum_media/figure_1.png)

Plot the phase\-space: $\theta$ vs $\dot{\theta}$.

```matlab
figure;
plot(theta(inds), omega(inds), 'ko', LineWidth=1, DisplayName='Noisy Data'); hold on;
plot(Ypred(:,1), Ypred(:,2), 'b-', LineWidth=3, DisplayName='Neural ODE Prediction');
hold off;
legend('Location', 'best');
xlabel('$\theta$ (radians)',Interpreter='latex');
ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
title('Phase-Space: $\theta$ vs. $\dot{\theta}$',Interpreter='latex');
```

![figure_2.png](NeuralODE_nonlinear_pendulum_media/figure_2.png)
diff --git a/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum.mlx b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum.mlx new file mode 100644 index 0000000..c7cea1b Binary files /dev/null and b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum.mlx differ diff --git a/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_code.m b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_code.m new file mode 100644 index 0000000..8c94455 --- /dev/null +++ b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_code.m @@ -0,0 +1,173 @@ +%% Neural ODE for Noisy Pendulum Data
% This example demonstrates how to use a *Neural Ordinary Differential Equation
% (Neural ODE)* to model the dynamics of a nonlinear pendulum from noisy data.
% The Neural ODE learns the time evolution of the pendulum's angle $\theta$ and
% angular velocity $\dot{\theta}$ directly from observations.
%
% The true dynamics of the nonlinear pendulum are governed by the second-order
% differential equation:
%
% $\ddot{\theta} = -\omega_0^2\sin\theta$. 
+% +% We rewrite this as a first-order system: +% +% $$\frac{d}{\textrm{dt}}\left\lbrack \begin{array}{c}\theta \\\dot{\theta} +% \end{array}\right\rbrack =\mathit{\mathbf{f}}\left(\theta ,\dot{\theta} \right)$$ +% +% In the Neural ODE framework, the unknown dynamics $\bf\it{f}$ are approximated +% using a neural network, which is trained to fit the observed trajectories. + +rng(0); % for reproducibility +warning('off'); +%% Prepare Data for Training +% Load the data contained in |pendulum_qp_dqdp.mat| if it already exists, or +% generate and save the data if not. + +% Get the path to the main directory +mainDir = findProjectRoot('generatePendulumData.m'); +% If first time generating data, set generateData to true. Else, set to false. +generateData = 1; +if generateData + g = 9.81; r = 1; + omega0 = sqrt(g/r); + x0 = [0;1.99*sqrt(9.81)]; + tSpan = linspace(0,20,400); + noiseLevel = 0.01; + doPlot = 0; + generatePendulumData(omega0,x0,tSpan,noiseLevel,doPlot); +end +% Construct full path to the data file +dataFile = fullfile(mainDir, 'pendulumData', 'pendulum_qp_dqdp.mat'); +% Read the data +load(dataFile,'data'); +theta = data.thetaNoisy; % Noisy angle measurements (radians) +omega = data.omegaNoisy; % Noisy angular velocity measurements (radians/sec) +t = data.t; % Time vector +%% +% Train on dataset where $t \leq 10$ + +inds = find(t <= 10); +tTrain = t(inds); +%% +% Combine into a single matrix: $[\theta, \omega]$ + +Y = [theta(inds), omega(inds)]; % N x 2 +%% +% Prepare target data (excluding the initial condition) + +Ytarget = Y(2:end, :); % (N-1) x 2 +%% +% Format for Neural ODE: (Channels x Batch x Time) + +Ytrain = permute(Ytarget, [2 3 1]); % Now size is 2 x 1 x (N-1) (C x B x T) +Ytrain = dlarray(Ytrain, "CBT"); +%% +% Initial condition (first row of |theta| and |omega|) + +Xtrain = reshape([theta(1); omega(1)], [2 1]); % 2 x 1 +Xtrain = dlarray(Xtrain, "CB"); +%% Define the Neural ODE model +% To model the right-hand side of the ODE, we define a neural network that approximates +% the unknown dynamics function $f$. Here we use a multilayer perceptron (MLP) +% with the following architecture: +%% +% * Input: 2-dimensional state vector $[\theta,\dot{\theta}]$ +% * Hidden layers: 2 , each with 32 neurons +% * Activation function: +% * Ouput: 2-dimensional vector representing $\frac{d}{dt}[\theta,\dot{\theta}]$ + +fLayers = [ + featureInputLayer(2, Name="input") + fullyConnectedLayer(32) + geluLayer + fullyConnectedLayer(32) + geluLayer + fullyConnectedLayer(2, Name="output") +]; +rng(0); % for reproducibility +fNet = dlnetwork(fLayers); +%% +% Instead of training the model on the derivatives $\dot{\theta}$ and $\ddot{\theta}$, +% we use an , to integrate the output of fNet over time, starting from the initial +% condition |Xtrain|. The solver produces a predicted trajectory of the state +% $[\theta,\dot{\theta}]$, which is then compared to the observed data |Ytrain|, +% and the network is optimized to minimize the error bewteen the predicted and +% actual trajectories. +% +% The approximately solves the system +% +% $\frac{d}{\textrm{dt}}\left\lbrack \begin{array}{c}\theta \\\dot{\theta} \end{array}\right\rbrack +% =f\left(\theta ,\dot{\theta} \right)$, +% +% where the righthand side is the output of |fNet.| + +nODElayers = [ + featureInputLayer(2, Name="IC_in") + neuralODELayer(fNet, tTrain, ... + GradientMode="adjoint", ... 
        Name="ODElayer")
];
rng(0); % for reproducibility
nODEnet = dlnetwork(nODElayers);
%% Specify Training Options
% Here we use the Adam optimizer with a mini-batch size of 50 and a maximum of
% 500 epochs.

opts = trainingOptions("adam", ...
    Plots="training-progress", ...
    MiniBatchSize=50, ...
    MaxEpochs=500, ...
    LearnRateSchedule="piecewise", ...
    LearnRateDropFactor=0.1, ...
    LearnRateDropPeriod=1200, ...
    InitialLearnRate=0.01, ...
    ExecutionEnvironment="cpu", ...
    Verbose=false);
%% Train the Network
% Train the model to fit the noisy pendulum data.

rng(0); % for reproducibility
nODEnet = trainnet(Xtrain, Ytrain, nODEnet, "l2loss", opts);
%% Predict the Pendulum Trajectory
% Use the trained network to predict the pendulum's trajectory.

Ypred = predict(nODEnet, Xtrain); % Output: 2 x 1 x (N-1)
Ypred = extractdata(Ypred); % Convert from dlarray to numeric
Ypred = squeeze(Ypred)'; % (N-1) x 2
%% Visualize the Results
% Plot the angle $\theta$ over time.

figure;
subplot(2,1,1);
plot(tTrain, theta(inds), 'ko', DisplayName='Noisy Data',LineWidth=1); hold on
plot(tTrain(2:end), Ypred(:,1), 'b-', LineWidth=3,DisplayName='Neural ODE'); hold off
legend();
ylabel('$\theta$ (radians)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
%%
% Plot the angular velocity $\dot{\theta}$ over time.

subplot(2,1,2);
plot(tTrain,omega(inds),'ko',DisplayName='Noisy Data',LineWidth=1); hold on
plot(tTrain(2:end), Ypred(:,2), 'b-',DisplayName='Neural ODE',LineWidth=3);
legend();
ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex');
xlabel('Time (s)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
%%
% Plot the phase-space: $\theta$ vs $\dot{\theta}$.

figure;
plot(theta(inds), omega(inds), 'ko', LineWidth=1, DisplayName='Noisy Data'); hold on;
plot(Ypred(:,1), Ypred(:,2), 'b-', LineWidth=3, DisplayName='Neural ODE Prediction');
hold off;
legend('Location', 'best');
xlabel('$\theta$ (radians)',Interpreter='latex');
ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
title('Phase-Space: $\theta$ vs. 
$\dot{\theta}$',Interpreter='latex'); \ No newline at end of file diff --git a/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_0.png b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_0.png new file mode 100644 index 0000000..2be8f3f Binary files /dev/null and b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_0.png differ diff --git a/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_1.png b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_1.png new file mode 100644 index 0000000..a3c2166 Binary files /dev/null and b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_1.png differ diff --git a/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_2.png b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_2.png new file mode 100644 index 0000000..f0f4550 Binary files /dev/null and b/phiml-blog-supporting-code/NeuralODE/NeuralODE_nonlinear_pendulum_media/figure_2.png differ diff --git a/phiml-blog-supporting-code/PINN/PINN_README.md b/phiml-blog-supporting-code/PINN/PINN_README.md new file mode 100644 index 0000000..664c64b --- /dev/null +++ b/phiml-blog-supporting-code/PINN/PINN_README.md @@ -0,0 +1,258 @@ + 
# Physics\-Informed Neural Network for Nonlinear Pendulum

This example demonstrates how to use a Physics\-Informed Neural Network (PINN) to model the dynamics of a nonlinear pendulum governed by the second\-order differential equation:


 $\ddot{\theta} =-\omega_0^2 \sin \theta$. 


We assume access to sparse, noisy measurements of the pendulum's angular position $\theta$ over a short time interval $[0,3]$, and aim to predict its behavior over a longer interval $[0,10]$. Rather than relying solely on the data, the PINN incorporates the governing ODE directly into the training process through the loss function.


The network is trained to minimize a composite loss function that includes:

- a data loss, which penalizes the difference between the network's predictions and the given measurements of $\theta$, and
- a physics loss, which penalizes violations of the governing pendulum equation.
```matlab
rng(0); % for reproducibility
```
# Prepare Data for Training

Load the data contained in `pendulum_qp_dqdp.mat` if it already exists, or generate and save the data if not.

```matlab
% Get the path to the main directory
mainDir = findProjectRoot('generatePendulumData.m');
% If first time generating data, set generateData to true. Else, set to false.
generateData = 1;
if generateData
    g = 9.81; r = 1;
    omega0 = sqrt(g/r);
    x0 = [0;1.99*sqrt(9.81)];
    tSpan = linspace(0,20,400);
    noiseLevel = 0.01;
    doPlot = 0;
    generatePendulumData(omega0,x0,tSpan,noiseLevel,doPlot);
end
% Construct full path to the data file
dataFile = fullfile(mainDir, 'pendulumData', 'pendulum_qp_dqdp.mat');
% Read the data
load(dataFile,'data');
theta = data.thetaNoisy; % Noisy angle measurements (radians)
t = data.t; % Time vector
```

Use data up to $t=3$ for training.

```matlab
inds = find(t <= 3);
tData = t(inds);
thetaData = theta(inds);
```

Remove points from the training set to model the scenario where we have sparse data.

```matlab
tData = tData(1:5:end);
thetaData = thetaData(1:5:end);
```

Create points in $[0,10]$ at which to enforce the physics loss. 

```matlab
% physics loss points
numInternalCollocationPoints = 200;
pointSet = sobolset(1);
points = 10*net(pointSet,numInternalCollocationPoints); % net here is the sobolset method that returns the first N quasirandom points
tPhysics = points(2:end); % remove 0 from the training data as this is the initial time and is handled in the ICs.

% visualize the training data
hold on;
scatter(tData,thetaData,24,[0.6350 0.0780 0.1840],DisplayName='noisy measurement data',LineWidth=2);
scatter(tPhysics,zeros(length(tPhysics),1),30,[0.4660 0.6740 0.1880],"filled",DisplayName='physics loss points');
xlabel('$t$',Interpreter='latex');
legend()
set(gca,FontSize=14,LineWidth=2.5)
hold off
```

![figure_0.png](PINN_README_media/figure_0.png)

Prepare data for training.

```matlab
tData = dlarray(tData',"CB");
thetaData = dlarray(thetaData',"CB");
t0 = dlarray(0,"CB");
theta0 = dlarray(0,"CB");
ds = arrayDatastore(tPhysics);
```
# Define the PINN Architecture

We represent the solution $\theta$ using a multilayer perceptron (MLP) that maps time $t$ to the pendulum angle $\theta$. The network architecture consists of:

- Input: 1\-dimensional input representing time $t$,
- Hidden layers: 6 [fully connected layers](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.fullyconnectedlayer.html), each with 64 neurons,
- Activation function: [Hyperbolic tangent (tanh)](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.tanhlayer.html),
- Output: 1\-dimensional output representing the predicted angle $\theta$
```matlab
numHiddenLayers = 6;
numNeurons = 64;
fcBlock = [ % fully connected block
    fullyConnectedLayer(numNeurons)
    tanhLayer];
layers = [
    featureInputLayer(1) % one input (t)
    repmat(fcBlock,[numHiddenLayers 1]) % repeat fully connected block 6 times
    fullyConnectedLayer(1)]; % one output (theta)
rng(0); % for reproducibility
pinnNet = dlnetwork(layers);
pinnNet = dlupdate(@double,pinnNet);
```
# Define the Loss Function

To train the PINN, we define a composite loss function that combines physics, data, and initial condition constraints.


**Physics\-Informed Loss**


The physics loss enforces the differential equation by penalizing deviations from the governing pendulum ODE. Functions like [`dllaplacian`](https://www.mathworks.com/help/deeplearning/ref/dlarray.dllaplacian.html) and [`dljacobian`](https://www.mathworks.com/help/deeplearning/ref/dlarray.dljacobian.html) make it simple to define derivative terms appearing in ordinary and partial differential equations for use in a PINN.

```matlab
function loss = physicsInformedLoss(net,t)
    omega0 = sqrt(9.81/1);
    theta = forward(net,t);
    thetatt = dllaplacian(stripdims(theta),stripdims(t),1);
    residual = thetatt + omega0^2*sin(theta);
    loss = mean(residual.^2, 'all');
end
```

Alternatively, the physics loss can be created using the supporting function `ode2PINNLossFunction` available for download at [PINN Loss Function Generation with Symbolic Math](https://www.mathworks.com/matlabcentral/fileexchange/172049-pinn-loss-function-generation-with-symbolic-math).


**Initial Condition Loss**


The initial condition loss penalizes the network during training for violating the initial condition $\theta (t=0)=\theta_0$.

```matlab
function loss = initialConditionLoss(net,t0,theta0)
    thetaPred0 = forward(net,t0);
    loss = l2loss(thetaPred0,theta0);
end
```

**Data Loss**


The data loss penalizes the difference between the network's predictions and the given measurements of $\theta$. 
# Define the PINN Architecture

We represent the solution $\theta$ using a multilayer perceptron (MLP) that maps time $t$ to the pendulum angle $\theta$. The network architecture consists of:

- Input: 1\-dimensional input representing time $t$,
- Hidden layers: 6 [fully connected layers](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.fullyconnectedlayer.html), each with 64 neurons,
- Activation function: [Hyperbolic tangent (tanh)](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.tanhlayer.html),
- Output: 1\-dimensional output representing the predicted angle $\theta$
```matlab
numHiddenLayers = 6;
numNeurons = 64;
fcBlock = [ % fully connected block
    fullyConnectedLayer(numNeurons)
    tanhLayer];
layers = [
    featureInputLayer(1) % one input (t)
    repmat(fcBlock,[numHiddenLayers 1]) % repeat fully connected block 6 times
    fullyConnectedLayer(1)]; % one output (theta)
rng(0); % for reproducibility
pinnNet = dlnetwork(layers);
pinnNet = dlupdate(@double,pinnNet);
```
# Define the Loss Function

To train the PINN, we define a composite loss function that combines physics, data, and initial condition constraints.


**Physics\-Informed Loss**


The physics loss enforces the differential equation by penalizing deviations from the governing pendulum ODE. Functions like [`dllaplacian`](https://www.mathworks.com/help/deeplearning/ref/dlarray.dllaplacian.html) and [`dljacobian`](https://www.mathworks.com/help/deeplearning/ref/dlarray.dljacobian.html) make it simple to define derivative terms appearing in ordinary and partial differential equations for use in a PINN.

```matlab
function loss = physicsInformedLoss(net,t)
    omega0 = sqrt(9.81/1);
    theta = forward(net,t);
    thetatt = dllaplacian(stripdims(theta),stripdims(t),1);
    residual = thetatt + omega0^2*sin(theta);
    loss = mean(residual.^2, 'all');
end
```

Alternatively, the physics loss can be created using the supporting function `ode2PINNLossFunction` available for download at [PINN Loss Function Generation with Symbolic Math](https://www.mathworks.com/matlabcentral/fileexchange/172049-pinn-loss-function-generation-with-symbolic-math).


**Initial Condition Loss**


The initial condition loss penalizes the network during training for violating the initial condition $\theta (t=0)=\theta_0$.

```matlab
function loss = initialConditionLoss(net,t0,theta0)
    thetaPred0 = forward(net,t0);
    loss = l2loss(thetaPred0,theta0);
end
```

**Data Loss**


The data loss penalizes the difference between the network's predictions and the given measurements of $\theta$.

```matlab
function loss = dataLoss(net, t, thetaTarget)
    theta = forward(net, t);
    loss = l2loss(theta, thetaTarget);
end
```

The total loss is a weighted combination of the physics, data, and initial condition losses.

```matlab
function [loss, gradients, lossPinn, lossData, lossIC] = modelLoss(net, tData, thetaData, tPhysics, t0, theta0)
    lossPinn = physicsInformedLoss(net,tPhysics);
    lossData = dataLoss(net, tData, thetaData);
    lossIC = initialConditionLoss(net,t0,theta0);
    loss = 0.15*lossPinn + lossData + lossIC; % weight the individual loss terms
    gradients = dlgradient(loss, net.Learnables);
end
```
# Specify Training Options

We will use the L\-BFGS optimizer via the [`lbfgsupdate`](https://www.mathworks.com/help/deeplearning/ref/lbfgsupdate.html) function in a custom training loop in the next section.


Accelerate the model loss function.

```matlab
accFcn = dlaccelerate(@modelLoss);
lossFcn = @(net) dlfeval(accFcn, net, tData, thetaData, dlarray(tPhysics,"BC"), t0, theta0);
solverState = [];
maxIterations = 5000;
gradientTolerance = 1e-6;
stepTolerance = 1e-6;
```

Monitor the training progress. Plot the total loss.

```matlab
monitor = trainingProgressMonitor( ...
    Metrics="TrainingLoss", ...
    Info=["Iteration" "GradientsNorm" "StepNorm"], ...
    XLabel="Iteration");
```

![figure_1.png](PINN_README_media/figure_1.png)
# Train the Network

In order to train a PINN, we create a [custom training loop](https://www.mathworks.com/help/deeplearning/deep-learning-custom-training-loops.html).

```matlab
iteration = 0;

while iteration < maxIterations && ~monitor.Stop
    iteration = iteration + 1;

    [pinnNet, solverState] = lbfgsupdate(pinnNet, lossFcn, solverState);

    updateInfo(monitor, ...
        Iteration=iteration, ...
        GradientsNorm=solverState.GradientsNorm, ...
        StepNorm=solverState.StepNorm);

    recordMetrics(monitor, iteration, TrainingLoss=solverState.Loss);
    monitor.Progress = 100 * iteration / maxIterations;

    if solverState.GradientsNorm < gradientTolerance || ...
            solverState.StepNorm < stepTolerance || ...
            solverState.LineSearchStatus == "failed"
        break
    end
end
```

![figure_2.png](PINN_README_media/figure_2.png)
# Visualize the Results

Use the PINN to predict the value of $\theta$ at 100 evenly spaced points in $[0,10]$.

```matlab
tTest = linspace(0,10,100)';
thetaPred = predict(pinnNet,tTest);
figure;
plot(tTest, thetaPred, 'b-', DisplayName='PINN prediction', LineWidth=3);
hold on
```

Compare against the true solution.

```matlab
gValue = 9.81;
rValue = 1;
omega0 = sqrt(gValue/rValue);
thetaDot0 = 1.99*omega0;
sol = solveNonlinearPendulum(tTest,omega0,thetaDot0);
yTrue = sol.Solution;
plot(tTest,yTrue(1,:),'k--',DisplayName='Exact solution',LineWidth=3);

% show the training data
scatter(extractdata(tData),extractdata(thetaData),30,[0.6350 0.0780 0.1840],LineWidth=2.5,DisplayName='data points');
scatter(tPhysics,zeros(length(tPhysics),1),36,[0.4660 0.6740 0.1880],"filled",DisplayName='physics points');
xlabel('Time (s)',Interpreter='latex');
ylabel('$\theta$ (radians)',Interpreter='latex');
hold off
set(gca,FontSize=14,LineWidth=2.5)
legend();
```

![figure_3.png](PINN_README_media/figure_3.png)
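
As an additional check that is not part of the original script, one can evaluate the physics residual of the trained network on a fresh time grid by reusing the `physicsInformedLoss` helper defined above; a small value indicates the ODE is satisfied away from the measurement points. The grid size and variable names here are illustrative choices.

```matlab
% Hedged sanity check: mean squared ODE residual of the trained PINN on [0,10]
tCheck = dlarray(linspace(0,10,200),"CB");
residualMSE = dlfeval(@physicsInformedLoss, pinnNet, tCheck)
```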
diff --git a/phiml-blog-supporting-code/PINN/PINN_README_media/figure_0.png b/phiml-blog-supporting-code/PINN/PINN_README_media/figure_0.png new file mode 100644 index 0000000..e16fb1b Binary files /dev/null and b/phiml-blog-supporting-code/PINN/PINN_README_media/figure_0.png differ diff --git a/phiml-blog-supporting-code/PINN/PINN_README_media/figure_1.png b/phiml-blog-supporting-code/PINN/PINN_README_media/figure_1.png new file mode 100644 index 0000000..742a1ab Binary files /dev/null and b/phiml-blog-supporting-code/PINN/PINN_README_media/figure_1.png differ diff --git a/phiml-blog-supporting-code/PINN/PINN_README_media/figure_2.png b/phiml-blog-supporting-code/PINN/PINN_README_media/figure_2.png new file mode 100644 index 0000000..16d1879 Binary files /dev/null and b/phiml-blog-supporting-code/PINN/PINN_README_media/figure_2.png differ diff --git a/phiml-blog-supporting-code/PINN/PINN_README_media/figure_3.png b/phiml-blog-supporting-code/PINN/PINN_README_media/figure_3.png new file mode 100644 index 0000000..d12bbd2 Binary files /dev/null and b/phiml-blog-supporting-code/PINN/PINN_README_media/figure_3.png differ diff --git a/phiml-blog-supporting-code/PINN/PINN_nonlinear_pendulum.mlx b/phiml-blog-supporting-code/PINN/PINN_nonlinear_pendulum.mlx new file mode 100644 index 0000000..8d25109 Binary files /dev/null and b/phiml-blog-supporting-code/PINN/PINN_nonlinear_pendulum.mlx differ diff --git a/phiml-blog-supporting-code/PINN/PINN_nonlinear_pendulum_code.m b/phiml-blog-supporting-code/PINN/PINN_nonlinear_pendulum_code.m new file mode 100644 index 0000000..d11d622 --- /dev/null +++ b/phiml-blog-supporting-code/PINN/PINN_nonlinear_pendulum_code.m @@ -0,0 +1,230 @@ +%% Physics-Informed Neural Network for Nonlinear Pendulum +% This example demonstrates how to use a Physics-Informed Neural Network (PINN) +% to model the dynamics of a nonlinear pendulum governed by the second-order differential +% equation: +% +% $\ddot{\theta} = -\omega_0^2 \sin\theta$. +% +% We assume access to sparse, noisy measurements of the pendulum's angular position +% $\theta$ over a short time interval $[0,3]$, and aim to predict its behavior +% over a longer interval $[0,10]$. Rather than relying solely on the data, the +% PINN incorporates the governing ODE directly into the training process through +% the loss function. +% +% The network is trained to minimize a composite loss function that includes: +%% +% * a data loss, which penalizes the difference between the network's predictions +% and the given measurements of $\theta$, and +% * a physics loss, which penalizes violations of the governing pendulum equation. + +rng(0); % for reproducibility +%% Prepare Data for Training
% Load the data contained in |pendulum_qp_dqdp.mat| if it already exists, or
% generate and save the data if not.

% Get the path to the main directory
mainDir = findProjectRoot('generatePendulumData.m');
% If first time generating data, set generateData to true. Else, set to false.
generateData = 1;
if generateData
    g = 9.81; r = 1;
    omega0 = sqrt(g/r);
    x0 = [0;1.99*sqrt(9.81)];
    tSpan = linspace(0,20,400);
    noiseLevel = 0.01;
    doPlot = 0;
    generatePendulumData(omega0,x0,tSpan,noiseLevel,doPlot);
end
% Construct full path to the data file
dataFile = fullfile(mainDir, 'pendulumData', 'pendulum_qp_dqdp.mat');
% Read the data
load(dataFile,'data');
theta = data.thetaNoisy; % Noisy angle measurements (radians)
t = data.t; % Time vector
%%
% Use data up to $t = 3$ for training.

inds = find(t <= 3);
tData = t(inds);
thetaData = theta(inds);
%%
% Remove points from the training set to model the scenario where we have sparse
% data.

tData = tData(1:5:end);
thetaData = thetaData(1:5:end);
%%
% Create points in $[0,10]$ at which to enforce the physics loss.

% physics loss points
numInternalCollocationPoints = 200;
pointSet = sobolset(1);
points = 10*net(pointSet,numInternalCollocationPoints); % first 200 Sobol points, scaled to [0,10]
tPhysics = points(2:end); % remove 0 from the training data as this is the initial time and is handled in the ICs.

% visualize the training data
hold on;
scatter(tData,thetaData,24,[0.6350 0.0780 0.1840],DisplayName='noisy measurement data',LineWidth=2);
scatter(tPhysics,zeros(length(tPhysics),1),30,[0.4660 0.6740 0.1880],"filled",DisplayName='physics loss points');
xlabel('$t$',Interpreter='latex');
legend()
set(gca,FontSize=14,LineWidth=2.5)
hold off
%%
% Prepare data for training.

tData = dlarray(tData',"CB");
thetaData = dlarray(thetaData',"CB");
t0 = dlarray(0,"CB");
theta0 = dlarray(0,"CB");
ds = arrayDatastore(tPhysics);
%% Define the PINN Architecture
% We represent the solution $\theta$ using a multilayer perceptron (MLP) that
% maps time $t$ to the pendulum angle $\theta$. The network architecture consists
% of:
%%
% * Input: 1-dimensional input representing time $t$,
% * Hidden layers: 6 fully connected layers, each with 64 neurons,
% * Activation function: hyperbolic tangent (tanh),
% * Output: 1-dimensional output representing the predicted angle $\theta$

numHiddenLayers = 6;
numNeurons = 64;
fcBlock = [ % fully connected block
    fullyConnectedLayer(numNeurons)
    tanhLayer];
layers = [
    featureInputLayer(1) % one input (t)
    repmat(fcBlock,[numHiddenLayers 1]) % repeat fully connected block 6 times
    fullyConnectedLayer(1)]; % one output (theta)
rng(0); % for reproducibility
pinnNet = dlnetwork(layers);
pinnNet = dlupdate(@double,pinnNet);
%% Define the Loss Function
% To train the PINN, we define a composite loss function that combines physics,
% data, and initial condition constraints.
%
% *Physics-Informed Loss*
%
% The physics loss enforces the differential equation by penalizing deviations
% from the governing pendulum ODE. Functions like |dllaplacian| and |dljacobian|
% make it simple to define derivative terms appearing in ordinary
% and partial differential equations for use in a PINN.

function loss = physicsInformedLoss(net,t)
    omega0 = sqrt(9.81/1);
    theta = forward(net,t);
    thetatt = dllaplacian(stripdims(theta),stripdims(t),1);
    residual = thetatt + omega0^2*sin(theta);
    loss = mean(residual.^2, 'all');
end
%%
% Alternatively, the physics loss can be created using the supporting function
% |ode2PINNLossFunction|, available for download at PINN Loss Function Generation
% with Symbolic Math on the MATLAB File Exchange.
+% +% *Initial Condition Loss* +% +% The initial condition loss penalizes the network during training for violating +% the initial condition $\theta(t=0)=\theta_0$. + +function loss = initialConditionLoss(net,t0,theta0) + thetaPred0 = forward(net,t0); + loss = l2loss(thetaPred0,theta0); +end +%% +% *Data Loss* +% +% The data loss penalizes the difference between the network's predictions and +% the given measurements of $\theta$. + +function loss = dataLoss(net, t, thetaTarget) + theta = forward(net, t); + loss = l2loss(theta, thetaTarget); +end +%% +% The total loss is a weighted combination of the physics, data, and initial +% condition losses. + +function [loss, gradients, lossPinn, lossData, lossIC] = modelLoss(net, tData, thetaData, tPhysics, t0, theta0) + lossPinn = physicsInformedLoss(net,tPhysics); + lossData = dataLoss(net, tData, thetaData); + lossIC = initialConditionLoss(net,t0,theta0); + loss = 0.15*lossPinn + lossData + lossIC; % weight the individual loss terms + gradients = dlgradient(loss, net.Learnables); +end +%% Specify Training Options +% We will use the L-BFGS optimizer via the function in a custom training loop in the next section. +% +% Accelerate the model loss function. + +accFcn = dlaccelerate(@modelLoss); +lossFcn = @(net) dlfeval(accFcn, net, tData, thetaData, dlarray(tPhysics,"BC"), t0, theta0); +solverState = []; +maxIterations = 5000; +gradientTolerance = 1e-6; +stepTolerance = 1e-6; +%% +% Monitor the training progress. Plot the total loss. + +monitor = trainingProgressMonitor( ... + Metrics="TrainingLoss", ... + Info=["Iteration" "GradientsNorm" "StepNorm"], ... + XLabel="Iteration"); +%% Train the Network +% In order to train a PINN, we create a . + +iteration = 0; + +while iteration < maxIterations && ~monitor.Stop + iteration = iteration + 1; + + [pinnNet, solverState] = lbfgsupdate(pinnNet, lossFcn, solverState); + + updateInfo(monitor, ... + Iteration=iteration, ... + GradientsNorm=solverState.GradientsNorm, ... + StepNorm=solverState.StepNorm); + + recordMetrics(monitor, iteration, TrainingLoss=solverState.Loss); + monitor.Progress = 100 * iteration / maxIterations; + + if solverState.GradientsNorm < gradientTolerance || ... + solverState.StepNorm < stepTolerance || ... + solverState.LineSearchStatus == "failed" + break + end +end +%% Visualize the Results +% Use the PINN to predict the value of $\theta$ at 100 evenly spaced points +% between $[0,10]$. + +tTest = linspace(0,10,100)'; +thetaPred = predict(pinnNet,tTest); +figure; +plot(tTest, thetaPred, 'b-', DisplayName='PINN prediction', LineWidth = 3, Color='b'); +hold on +%% +% Compare against the true solution. 
diff --git a/phiml-blog-supporting-code/README.md b/phiml-blog-supporting-code/README.md new file mode 100644 index 0000000..47cf4d6 --- /dev/null +++ b/phiml-blog-supporting-code/README.md @@ -0,0 +1,32 @@ +# Physics Informed Machine Learning Methods and Implementation supporting code
This collection provides examples demonstrating a variety of Physics-Informed Machine Learning (PhiML) techniques, applied to a simple pendulum system. The examples provided here are discussed in detail in the accompanying blog post "Physics-Informed Machine Learning: Methods and Implementation".

![pendulum](https://www.mathworks.com/help/examples/symbolic/win64/SimulateThePhysicsOfAPendulumsPeriodicSwingExample_10.gif)

The techniques described for the pendulum are categorized by their primary objectives:
- **Modeling complex systems from data**
  1. **Neural Ordinary Differential Equation**: learns the underlying dynamics $f$ in the differential equation $\frac{d \mathbf{x}}{d t} = f(\mathbf{x},\mathbf{u})$ using a neural network.
  2. **Neural State-Space**: learns both the system dynamics $f$ and the measurement function $g$ in the state-space model $\frac{d \mathbf{x}}{dt} = f(\mathbf{x},\mathbf{u}), \mathbf{y}=g(\mathbf{x},\mathbf{u})$ using neural networks. Not shown in this repository, but illustrated in the documentation example [Neural State-Space Model of Simple Pendulum System](https://www.mathworks.com/help/ident/ug/training-a-neural-state-space-model-for-a-simple-pendulum-system.html).
  3. **Universal Differential Equation**: combines known dynamics $g$ with unknown dynamics $h$, for example $\frac{d\mathbf{x}}{dt} = f(\mathbf{x},\mathbf{u}) = g(\mathbf{x},\mathbf{u}) + h(\mathbf{x},\mathbf{u})$, learning the unknown part $h$ using a neural network.
  4. **Hamiltonian Neural Network**: learns the system's Hamiltonian $\mathcal{H}$, and accounts for energy conservation by enforcing Hamilton's equations $\frac{dq}{dt} = \frac{\partial \mathcal{H}}{\partial p}, \frac{dp}{dt} = -\frac{\partial \mathcal{H}}{\partial q}$.
- **Discovering governing equations from data:**
  1. **SINDy (Sparse Identification of Nonlinear Dynamics)**: learns a mathematical representation of the system dynamics $f$ by performing sparse regression on a library of candidate functions.
- **Solving known ordinary and partial differential equations:**
  1. **Physics-Informed Neural Networks**: learns the solution to a differential equation by embedding the governing equations directly into the neural network's loss function.
  2. **Fourier Neural Operator**: learns the solution operator that maps input functions (e.g. forcing function) to the solution of a differential equation, utilizing Fourier transforms to parameterize the integral kernel and efficiently capture global dependencies.

## Setup
Run startup.m to correctly set the path.

## MathWorks Products ([https://www.mathworks.com](https://www.mathworks.com/))
Created and tested with MATLAB® R2025a
- [Deep Learning Toolbox™](https://www.mathworks.com/products/deep-learning.html)

## License
The license is available in the [license.txt](license.txt) file in this GitHub repository.

## Community Support
[MATLAB Central](https://www.mathworks.com/matlabcentral)

Copyright 2025 The MathWorks, Inc.

diff --git a/phiml-blog-supporting-code/UDE_SINDy/UDE_SINDy_README.md b/phiml-blog-supporting-code/UDE_SINDy/UDE_SINDy_README.md new file mode 100644 index 0000000..e60493a --- /dev/null +++ b/phiml-blog-supporting-code/UDE_SINDy/UDE_SINDy_README.md @@ -0,0 +1,352 @@ + 
# Universal Differential Equation for Noisy Pendulum Data

This example demonstrates how to use a **Universal Differential Equation (UDE)** to model the dynamics of a nonlinear pendulum from noisy observational data. A UDE extends the concept of a neural ordinary differential equation (neural ODE) by incorporating known physical laws alongside neural networks, enabling the model to learn unknown or unmodeled dynamics.


The true dynamics of a frictionless nonlinear pendulum are governed by the second\-order differential equation:


 $\ddot{\theta} =-\omega_0^2 \sin \theta$.


In this example, we assume the presence of an unknown frictional or damping force, denoted by $h$:


 $\ddot{\theta} =-\omega_0^2 \sin \theta +h(\theta ,\dot{\theta} )$.


We reformulate this as a system of first\-order ODEs:


 $\frac{d}{dt}\left\lbrack \begin{array}{c} \theta \\ \dot{\theta} \end{array}\right\rbrack =\left\lbrack \begin{array}{c} \dot{\theta} \\ -\omega_0^2 \sin \theta +h(\theta ,\dot{\theta} ) \end{array}\right\rbrack =\mathbf{g}(\theta ,\dot{\theta} )+\left\lbrack \begin{array}{c} 0\\ h(\theta ,\dot{\theta} ) \end{array}\right\rbrack$,


where $\mathbf{g}$ captures the known pendulum dynamics, and the unknown component $h$ is approximated using a neural network.

```matlab
rng(0); % for reproducibility
warning('off');
```
# Prepare Data for Training

Import the data contained in `pendulum_with_damping_qp_dqdp_F.mat` if it already exists, or generate and save the data if not.

```matlab
% Get the path to the main directory
mainDir = findProjectRoot('generatePendulumDataWithDamping.m');
% If first time generating data, set generateData to true. Else, set to false.
generateData = 1;
if generateData
    g = 9.81; r = 1;
    omega0 = sqrt(g/r);
    x0 = [0;1.99*sqrt(9.81)];
    tSpan = linspace(0,20,400);
    noiseLevel = 0.01;
    doPlot = 0;
    generatePendulumDataWithDamping(omega0,x0,tSpan,noiseLevel,doPlot);
end
```

```matlabTextOutput
Pendulum data written to pendulum_with_damping_qp_dqdp_F.mat
```

```matlab
% Construct full path to the data file
dataFile = fullfile(mainDir, 'pendulumData', 'pendulum_with_damping_qp_dqdp_F.mat');
% Read the data
load(dataFile,'data');

theta = data.thetaNoisy; % Noisy angle measurements (radians)
omega = data.omegaNoisy; % Noisy angular velocity measurements (radians/sec)
t = data.t; % Time vector

% Numerical derivatives
thetaDot = data.thetaDot; % d(theta)/dt
omegaDot = data.omegaDot; % d(omega)/dt

% Force
frictionForce = data.F_data;
```

Use only the dataset where $t\le 10$ to train.

```matlab
inds = find(t <= 10);
tTrain = t(inds);
```

Combine into a single state vector: $X=[\theta ,\omega ]$.

```matlab
Y = [theta(inds), omega(inds)]; % N x 2
```

Prepare the target data (excluding the initial condition).

```matlab
Ytarget = Y(2:end, :); % (N-1) x 2
```

Format for the neural ODE: channels x batch x time (CBT).

```matlab
Ytrain = permute(Ytarget, [2 3 1]); % Now size is 2 x 1 x (N-1) (C x B x T)
Ytrain = dlarray(Ytrain, "CBT");
```

Initial condition (first row of theta and omega).

```matlab
Xtrain = reshape([theta(1); omega(1)], [2 1]); % 2 x 1
Xtrain = dlarray(Xtrain, "CB");
```
# Define the Neural ODE Model

We define a function layer `gLayer` that encodes the known part of the dynamics, $\mathbf{g}$.

```matlab
stateSize = 2; % X = [theta; omega]
omega0 = sqrt(9.81/1);
gFcn = @(X) [X(2,:); -omega0^2*sin(X(1,:))];
gLayer = functionLayer(gFcn, Acceleratable=true, Name="g");
```

We use a small multilayer perceptron (MLP) to approximate the unknown dynamics, $h$:

```matlab
hLayers = [
    fullyConnectedLayer(16,Name="fc1")
    geluLayer
    fullyConnectedLayer(1,Name="h")
];
```

Define a custom layer to add the known and learned parts of the dynamics.

```matlab
combineFcn = @(x, y) [x(1,:); x(2,:) + y];
combineLayer = functionLayer(combineFcn,Name="combine",Acceleratable=true);
```

Construct the full network by connecting the input to both the known physics and the neural network, and combining their outputs.

```matlab
fNet = [featureInputLayer(stateSize,Name="input")
    gLayer
    combineLayer];
rng(0); % for reproducibility
fNet = dlnetwork(fNet,Initialize=false);
fNet = addLayers(fNet,hLayers);
fNet = connectLayers(fNet,"input","fc1");
fNet = connectLayers(fNet,"h","combine/in2");
```
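
Before wiring the network into a neural ODE layer, it can help to see the two branches side by side. This optional snippet (not part of the original script) simply plots the layer graph of `fNet`, showing the input feeding both the known\-physics layer `g` and the MLP branch ending at `h`, which meet at the `combine` layer.

```matlab
% Optional inspection: visualize the two branches of the UDE dynamics network
figure
plot(fNet)
```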
Use a [`neuralODELayer`](https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.neuralodelayer.html) to integrate the system over time, starting from the initial condition.

```matlab
nODElayers = [
    featureInputLayer(stateSize, Name="IC_in")
    neuralODELayer(fNet, tTrain, ...
        GradientMode="adjoint", ...
        Name="ODElayer")
];
rng(0); % for reproducibility
nODEnet = dlnetwork(nODElayers);
```
# Specify Training Options

Here, we use the Adam optimizer with a mini\-batch size of 50 and a maximum of 200 epochs.

```matlab
opts = trainingOptions("adam", ...
    Plots="training-progress", ...
    MiniBatchSize=50, ...
    MaxEpochs=200, ...
    LearnRateSchedule="piecewise", ...
    LearnRateDropFactor=0.1, ...
    LearnRateDropPeriod=1200, ...
    InitialLearnRate=0.01, ...
    ExecutionEnvironment="cpu", ...
    Verbose=false);
```
# Train the Network
```matlab
rng(0); % for reproducibility
nODEnet = trainnet(Xtrain, Ytrain, nODEnet, "l2loss", opts);
```

![figure_0.png](UDE_SINDy_README_media/figure_0.png)
# Predict the Pendulum Trajectory

Use the trained network to predict $[\theta ,\dot{\theta} ]$ from an initial condition.

```matlab
Ypred = predict(nODEnet, Xtrain); % Output: 2 x 1 x (N-1)
Ypred = extractdata(Ypred); % Convert from dlarray to numeric
Ypred = squeeze(Ypred)'; % (N-1) x 2
```
# Visualize the Results

Plot the angle $\theta$ over time.

```matlab
figure;
subplot(2,1,1);
plot(tTrain, theta(inds), 'ko', DisplayName='Noisy Data', LineWidth=1); hold on
plot(tTrain, [Xtrain(1); Ypred(:,1)], 'b-', DisplayName='UDE', LineWidth=3); hold off
legend();
ylabel('$\theta$ (radians)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
```

Plot the angular velocity $\dot{\theta}$ over time.

```matlab
subplot(2,1,2);
plot(tTrain, omega(inds), 'ko', DisplayName='Noisy Data', LineWidth=1); hold on
plot(tTrain, [Xtrain(2); Ypred(:,2)], 'b-', DisplayName='UDE', LineWidth=3); hold off
legend();
ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex');
xlabel('Time (s)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
```

![figure_1.png](UDE_SINDy_README_media/figure_1.png)

Plot the phase space: $\theta$ vs. $\dot{\theta}$.

```matlab
figure;
plot(theta(inds), omega(inds), 'ko', DisplayName='Noisy Data', LineWidth=1); hold on
plot([Xtrain(1); Ypred(:,1)], [Xtrain(2); Ypred(:,2)], 'b-', LineWidth=3, DisplayName = 'UDE'); hold off;
legend('Location', 'best');
xlabel('$\theta$ (radians)',Interpreter='latex');
ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
title('Phase-Space: $\theta$ vs. $\dot{\theta}$',Interpreter='latex');
```

![figure_2.png](UDE_SINDy_README_media/figure_2.png)
# Identify Equations for the Universal Approximator

Extract the learned component $h$ from the trained neural ODE network.

```matlab
fNetTrained = nODEnet.Layers(2).Network;
hNetTrained = removeLayers(fNetTrained,["g","combine"]);
lrn = hNetTrained.Learnables;
lrn = dlupdate(@dlarray, lrn);
hNetTrained = initialize(hNetTrained);
hNetTrained.Learnables = lrn;
```

Use the training data $Y$ as the sample points $X_j$ and evaluate $h(X_j )$.

```matlab
hEval = predict(hNetTrained,Y',InputDataFormats="CB");
```

Denote $X=(\theta ,\dot{\theta} )$ and specify the basis functions $e_1 (\theta ,\dot{\theta} )=\dot{\theta}$, $e_2 (\theta ,\dot{\theta} )={\dot{\theta} }^2$, $e_3 (\theta ,\dot{\theta} )={\dot{\theta} }^3$. Note that the dynamics $\mathbf{g}$ are not included, as they are already known.

```matlab
e1 = @(X) X(2,:);
e2 = @(X) X(2,:).^2;
e3 = @(X) X(2,:).^3;
E = @(X) [e1(X); e2(X); e3(X)];
```

Evaluate the basis functions at the training points.

```matlab
EEval = E(Y');
```

Solve $h=WE$ by sequentially thresholded least squares: over 10 iterations, set coefficients with absolute value less than 0.05 to 0 and re\-fit the remaining terms.

```matlab
iters = 10;
threshold = 0.05;
Ws = cell(iters,1);
% Initial least squares solution for W
W = hEval/EEval;
Ws{1} = W;
for iter = 2:iters
    % Zero out small coefficients for sparsity
    belowThreshold = abs(W) < threshold;
    W(belowThreshold) = 0;
    % Re-fit the surviving coefficients by least squares
    keep = ~belowThreshold;
    W(keep) = hEval/EEval(keep,:);
    Ws{iter} = W;
end
```
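
As a hedged illustration of how one might read off the identified model (this step is not part of the original script; the variable names below are illustrative and the basis order matches `E` above), the nonzero coefficients in the final iterate give a closed\-form candidate for $h$:

```matlab
% Illustration only: list the surviving basis terms from the final iterate
Wfinal = Ws{end};
basisNames = ["thetaDot" "thetaDot^2" "thetaDot^3"];
for k = find(Wfinal ~= 0)
    fprintf('h(theta,thetaDot) includes %+.4f * %s\n', Wfinal(k), basisNames(k));
end
```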
% Use a |neuralODELayer| to integrate the system over time, starting from the initial
% condition.

nODElayers = [
    featureInputLayer(stateSize, Name="IC_in")
    neuralODELayer(fNet, tTrain, ...
        GradientMode="adjoint", ...
        Name="ODElayer")
];
rng(0); % for reproducibility
nODEnet = dlnetwork(nODElayers);
%% Specify Training Options
% Here, we use the Adam optimizer with a mini-batch size of 50 and a maximum
% of 200 epochs.

opts = trainingOptions("adam", ...
    Plots="training-progress", ...
    MiniBatchSize=50, ...
    MaxEpochs=200, ...
    LearnRateSchedule="piecewise", ...
    LearnRateDropFactor=0.1, ...
    LearnRateDropPeriod=1200, ...
    InitialLearnRate=0.01, ...
    ExecutionEnvironment="cpu", ...
    Verbose=false);
%% Train the Network

rng(0); % for reproducibility
nODEnet = trainnet(Xtrain, Ytrain, nODEnet, "l2loss", opts);
%% Predict the Pendulum Trajectory
% Use the trained network to predict $[\theta,\dot{\theta}]$ from an initial
% condition.

Ypred = predict(nODEnet, Xtrain); % Output: 2 x 1 x (N-1)
Ypred = extractdata(Ypred); % Convert from dlarray to numeric
Ypred = squeeze(Ypred)'; % (N-1) x 2
%% Visualize the Results
% Plot the angle $\theta$ over time.

figure;
subplot(2,1,1);
plot(tTrain, theta(inds), 'ko', DisplayName='Noisy Data', LineWidth=1); hold on
plot(tTrain, [Xtrain(1); Ypred(:,1)], 'b-', DisplayName='UDE', LineWidth=3); hold off
legend();
ylabel('$\theta$ (radians)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
%%
% Plot the angular velocity $\dot{\theta}$ over time.

subplot(2,1,2);
plot(tTrain, omega(inds), 'ko', DisplayName='Noisy Data', LineWidth=1); hold on
plot(tTrain, [Xtrain(2); Ypred(:,2)], 'b-', DisplayName='UDE', LineWidth=3); hold off
legend();
ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex');
xlabel('Time (s)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
%%
% Plot the phase space: $\theta$ vs. $\dot{\theta}$.

figure;
plot(theta(inds), omega(inds), 'ko', DisplayName='Noisy Data', LineWidth=1); hold on
plot([Xtrain(1); Ypred(:,1)], [Xtrain(2); Ypred(:,2)], 'b-', LineWidth=3, DisplayName = 'UDE'); hold off;
legend('Location', 'best');
xlabel('$\theta$ (radians)',Interpreter='latex');
ylabel('$\dot{\theta}$ (radians/s)',Interpreter='latex');
set(gca,FontSize=14,LineWidth=2.5)
title('Phase-Space: $\theta$ vs. $\dot{\theta}$',Interpreter='latex');
%% Identify Equations for the Universal Approximator
% Extract the learned component $h$ from the trained neural ODE network.

fNetTrained = nODEnet.Layers(2).Network;
hNetTrained = removeLayers(fNetTrained,["g","combine"]);
lrn = hNetTrained.Learnables;
lrn = dlupdate(@dlarray, lrn);
hNetTrained = initialize(hNetTrained);
hNetTrained.Learnables = lrn;
%%
% Use the training data $Y$ as the sample points $X_j$ and evaluate $h(X_j)$.

hEval = predict(hNetTrained,Y',InputDataFormats="CB");
%%
% Denote $X = (\theta,\dot{\theta})$ and specify the basis functions $e_1(\theta,\dot{\theta})
% = \dot{\theta}$, $e_2(\theta,\dot{\theta}) = \dot{\theta}^2$, $e_3(\theta,\dot{\theta})=\dot{\theta}^3$.
% Note that the dynamics $\mathbf{g}$ are not included, as they are already known.

e1 = @(X) X(2,:);
e2 = @(X) X(2,:).^2;
e3 = @(X) X(2,:).^3;
E = @(X) [e1(X); e2(X); e3(X)];
%%
% Evaluate the basis functions at the training points.

EEval = E(Y');
%%
% Solve $h = WE$ by sequentially thresholded least squares: over 10 iterations,
% set coefficients with absolute value less than 0.05 to 0 and re-fit the
% remaining terms.

iters = 10;
threshold = 0.05;
Ws = cell(iters,1);
% Initial least squares solution for W
W = hEval/EEval;
Ws{1} = W;
for iter = 2:iters
    % Zero out small coefficients for sparsity
    belowThreshold = abs(W) < threshold;
    W(belowThreshold) = 0;
    % Re-fit the surviving coefficients by least squares
    keep = ~belowThreshold;
    W(keep) = hEval/EEval(keep,:);
    Ws{iter} = W;
end