Convergence of parameters of heat equation #199

AliaNajwaMY · 2024-10-15T18:18:47Z

AliaNajwaMY
Oct 15, 2024

I'm currently trying to solve the problem u_t - a*u_xx - b*u_x - c*u = 0 for x \in [0,1] and t \in [0,1]. The boundary conditions are u(t,0) = u(t,1) = 0 and the initial condition is u(0,x) = sin(pi x). In my code, I'm trying to recover a, b and c (as trainable variables) when the true values of a, b and c are nonzero. I have tried to solve this problem using the PINN model from the Neuromancer package.

To code up this problem, I used the code in https://colab.research.google.com/github/pnnl/neuromancer/blob/master/examples/PDEs/Part_3_PINN_BurgersEquation_inverse.ipynb as a reference and changed the data generation. The rest of the code should be similar. My question is: why am I unable to recover the parameters a, b and c?

Here is my code:

# %% [markdown]
# ## Imports

# %%
# torch and numpy imports
import torch
import torch.nn as nn
import numpy as np

# data imports
from scipy.io import loadmat

# plotting imports
import matplotlib.pyplot as plt

# filter some user warnings from torch broadcast
import warnings
warnings.filterwarnings("ignore")


#Set default dtype to float32
torch.set_default_dtype(torch.float)

#PyTorch random number generator
torch.manual_seed(1234)

# Random number generators in other libraries
np.random.seed(1234)

# Device configuration
if torch.backends.mps.is_available():
    device = torch.device('mps')
elif torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')




def heat_eq_exact_solution(x, t):
    """Returns the exact solution for a given x and t (for sinusoidal initial conditions).

    Uses the module-level PDE parameters a, b and c.

    Parameters
    ----------
    x : np.ndarray
    t : np.ndarray
    """
    return np.sin(np.pi * x + b * np.pi * t) * np.exp((-a * np.pi ** 2 + c) * t)



def gen_exact_solution():
    """Generates exact solution for the heat equation for the given values of x and t."""
    # Number of points in each dimension:
    x_dim, t_dim = (100, 100)

    # Bounds of 'x' and 't':
    x_min, t_min = (0, 0.0)
    x_max, t_max = (1.0, 1.0)

    # Create coordinate arrays (column vectors):
    t = np.linspace(t_min, t_max, num=t_dim).reshape(t_dim, 1)
    x = np.linspace(x_min, x_max, num=x_dim).reshape(x_dim, 1)

    # Evaluate the exact solution on the whole grid by broadcasting:
    # x is (x_dim, 1) and t.T is (1, t_dim), so usol[i, j] = u(x[i], t[j]).
    usol = heat_eq_exact_solution(x, t.T)

    # Min-max normalize the solution to [0, 1]:
    usol = (usol - usol.min()) / (usol.max() - usol.min())
    
    # Save solution:
    np.savez("heat_eq_data_neuro", x=x, t=t, usol=usol)

a = 1.0
b = 0
c = 4.0

L = 1
gen_exact_solution()


data = np.load('heat_eq_data_neuro.npz')
x = data['x']                                   # space:      100 points between 0 and 1 [100x1]
t = data['t']                                   # time:       100 time points between 0 and 1 [100x1]
ysol = data['usol'].real                        # solution:   normalized PDE solution [100x100]

X, T = np.meshgrid(x,t)                         # 2D grids; after the transposes below, ysol[i][j] = u(X[i][j], T[i][j])
X = torch.tensor(X.T).float()
T = torch.tensor(T.T).float()
y_real = torch.tensor(ysol).float()

# print(x, t, ysol)

# %%
print(X.shape, T.shape, y_real.shape)

# %% [markdown]
# ### Plot the solution

# %%
def plot3D(X, T, y):
    X = X.detach().numpy()
    T = T.detach().numpy()
    y = y.detach().numpy()

    #     2D
    fig = plt.figure()
    ax1 = fig.add_subplot(121)
    cm = ax1.contourf(T, X, y, 20,cmap="viridis")
    fig.colorbar(cm, ax=ax1) # Add a colorbar to a plot
    ax1.set_title('u(x,t)')
    ax1.set_xlabel('t')
    ax1.set_ylabel('x')
    ax1.set_aspect('equal')
        #     3D
    ax2 = fig.add_subplot(122, projection='3d')
    ax2.plot_surface(T, X, y,cmap="viridis")
    ax2.set_xlabel('t')
    ax2.set_ylabel('x')
    ax2.set_zlabel('u(x,t)')
    fig.tight_layout()

# %%
plot3D(X, T, y_real)


X_test = X.reshape(-1,1)
T_test = T.reshape(-1,1)
Y_test = y_real.reshape(-1,1)

print(X_test.shape, T_test.shape, Y_test.shape)

# %%
total_points = 10000
# train_points = total_points * 0.8
train_points = 500

id_f = np.random.choice(total_points, int(train_points), replace=False)# Randomly chosen points for Interior
X_train = X_test[id_f]
T_train = T_test[id_f]
Y_train = Y_test[id_f] 



print(X_train.shape, T_train.shape, Y_train.shape)


# %%
print("We have",total_points,"points. We will select",X_train.shape[0],"points to train our model.")


plt.figure()
plt.scatter(X_train.detach().numpy(), T_train.detach().numpy(),
            s=4., c='blue', marker='o', label='CP')
plt.title('Samples of the PDE solution y(x,t) for training')
plt.xlim(0, 1.0)
plt.ylim(0., 1.0)
plt.grid(True)
plt.xlabel('x')
plt.ylabel('t')
plt.legend(loc='upper right')
plt.show()
plt.show(block=True)


from neuromancer.dataset import DictDataset

# turn on gradients for PINN
X_train.requires_grad=True
T_train.requires_grad=True

# Training dataset
train_data = DictDataset({'x': X_train, 't':T_train, 'y':Y_train}, name='train')
# test dataset
test_data = DictDataset({'x': X_test, 't':T_test, 'y':Y_test}, name='test')

# torch dataloaders
batch_size = X_train.shape[0]  # full batch training
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           collate_fn=train_data.collate_fn,
                                           shuffle=False)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size,
                                         collate_fn=test_data.collate_fn,
                                         shuffle=False)



from neuromancer.modules import blocks
from neuromancer.system import Node

# neural net to solve the PDE problem bounded in the PDE domain
net = blocks.MLP(insize=2, outsize=1, hsizes=[32, 32], nonlin=nn.Tanh)


pde_net = Node(net, ['x', 't'], ['y_hat'], name='net')

# %%
print("symbolic inputs  of the pde_net:", pde_net.input_keys)
print("symbolic outputs of the pde_net:", pde_net.output_keys)

# %%
# evaluate forward pass on the train data
net_out = pde_net(train_data.datadict)
net_out['y_hat'].shape

from neuromancer.constraint import variable

# symbolic Neuromancer variables
y = variable('y')           # PDE measurements from the dataset
y_hat = variable('y_hat')  # PDE solution generated as the output of a neural net (pde_net)
t = variable('t')          # temporal domain
x = variable('x')          # spatial domain
# trainable parameters with initial values
a_var = variable(torch.nn.Parameter(torch.tensor(0.5)), display_name='a')       # trainable PDE parameter a
b_var = variable(torch.nn.Parameter(torch.tensor(0.5)), display_name='b')             # trainable PDE parameter b
c_var = variable(torch.nn.Parameter(torch.tensor(0.5)), display_name='c')             # trainable PDE parameter c

# %%
# get the symbolic derivatives
dy_dt = y_hat.grad(t)
dy_dx = y_hat.grad(x)
d2y_d2x = dy_dx.grad(x)
# get the PINN form

f_pinn = dy_dt - a_var * d2y_d2x - b_var * dy_dx -c_var * y_hat

# %%
# computational graph of the PINN neural network
f_pinn.show()


# scaling factor for better convergence
scaling = 400

# PDE CP loss
ell_f = (f_pinn == 0.)^2

# PDE supervised learning loss
ell_2 = scaling * (y_hat == y)^2

ymax = 1
ymin = -1

# output constraints to bound the PINN solution in the PDE output domain [-1.0, 1.0]
# (defined for reference; they are not passed to PenaltyLoss below)
con_1 = (y_hat <= ymax)^2
con_2 = (y_hat >= ymin)^2

print(type(ell_f))


from neuromancer.loss import PenaltyLoss
from neuromancer.problem import Problem

# create Neuromancer optimization loss
pinn_loss = PenaltyLoss(objectives=[ell_f, ell_2], constraints=[])



# construct the PINN optimization problem
problem = Problem(nodes=[pde_net],      # list of nodes (neural nets) to be optimized
                  loss=pinn_loss,       # physics-informed loss function
                  grad_inference=True   # argument for allowing computation of gradients at the inference time)
                 )


from neuromancer.trainer import Trainer

optimizer = torch.optim.AdamW(problem.parameters(), lr=0.001)
epochs = 10000

#  Neuromancer trainer
trainer = Trainer(
    problem.to(device),
    train_loader,
    optimizer=optimizer,
    epochs=epochs,
    epoch_verbose=1000,
    train_metric='train_loss',
    dev_metric='train_loss',
    eval_metric="train_loss",
    warmup=epochs,
    device=device
)

# %%
# Train PINN
best_model = trainer.train()

# load best trained model
problem.load_state_dict(best_model)

print('True parameter a = ', a)
print('Estimated parameter a = ', float(a_var.value))
print('True parameter b = ', b)
print('Estimated parameter b = ', float(b_var.value))
print('True parameter c = ', c)
print('Estimated parameter c = ', float(c_var.value))

# %%
# evaluate trained PINN on test data
PINN = problem.nodes[0].cpu()
y1 = PINN(test_data.datadict)['y_hat']

# arrange data for plotting
y_pinn = y1.reshape(shape=[100, 100]).detach().cpu()

# %%
# plot PINN solution
plot3D(X, T, y_pinn)

# %%
# plot exact PDE solution
plot3D(X, T, y_real)

# %%
# plot residuals PINN - exact PDE
plot3D(X, T, y_pinn-y_real)

brunopjacob · 2024-10-22T23:57:14Z

brunopjacob
Oct 22, 2024
Maintainer

Hi @AliaNajwaMY! Thank you for using Neuromancer!

I've reviewed your issue, and I was able to reproduce what you mentioned regarding the parameters a, b and c not matching the expected values. I wanted to point out a few things so that we can help you solve this problem:

1. You mention in your question that you are trying to solve this advection-diffusion-reaction equation: u_tt - a*u_xx - b*u_x - c*u = 0. However, when constructing the loss function for the PDE, I noticed you used dy_dt - a_var * d2y_d2x - b_var * dy_dx - c_var * y_hat. The mismatch between u_tt and dy_dt may be causing the issue.
2. I noticed you are using an analytical solution to generate u, namely u(t,x) = sin(pi*x) * exp((-a*(pi^2) + c)*t). I believe, however, that this is not a solution of the PDE that you are trying to solve, even with b = 0; it seems that a source term is missing from your PDE, which should probably take the form d2y_dt2 - a_var * d2y_d2x - b_var * dy_dx - c_var * y_hat - f, where f is the source term that you get from plugging your manufactured solution into the PDE. I tested this on Wolfram Alpha and by hand, just to make sure, and I believe this is the case. Please double-check.
3. Notice that the issues mentioned in 1) and 2) are not necessarily affecting the solution found by the PINN, as you are using a scaling of 400x in the data loss ell_2 relative to the PDE loss ell_f. This is not necessarily wrong (we did that in the tutorial you mentioned, and it is good practice when using a data loss as a means of imposing initial/boundary conditions), but since the network is learning u(t,x) mostly in a supervised manner (the data loss carries no clear signal about a, b and c, as it doesn't know the PDE), it might be hard for the model to actually learn a, b and c when u is not a solution of the PDE.
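
The remark that the data loss carries no signal about a, b and c can be checked directly with automatic differentiation. What follows is a minimal plain-PyTorch sketch, not Neuromancer code: the tiny network, the 64 random sample points, and the assumption that each loss term is a mean-squared penalty are illustrative choices that do not come from the thread.

```python
import torch

torch.manual_seed(0)

# Small stand-in network and trainable PDE parameters (illustrative sizes).
net = torch.nn.Sequential(
    torch.nn.Linear(2, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)
a = torch.nn.Parameter(torch.tensor(0.5))
b = torch.nn.Parameter(torch.tensor(0.5))
c = torch.nn.Parameter(torch.tensor(0.5))

# Random sample points in [0, 1] x [0, 1] and synthetic targets (true a=1, b=0, c=4).
x = torch.rand(64, 1, requires_grad=True)
t = torch.rand(64, 1, requires_grad=True)
y = torch.sin(torch.pi * x.detach()) * torch.exp((4.0 - torch.pi**2) * t.detach())


def d(out, inp):
    """Derivative of out with respect to inp, keeping the graph for higher orders."""
    return torch.autograd.grad(
        out, inp, grad_outputs=torch.ones_like(out), create_graph=True
    )[0]


y_hat = net(torch.cat([x, t], dim=1))
y_t, y_x = d(y_hat, t), d(y_hat, x)
y_xx = d(y_x, x)

# Assumed form of the two loss terms: mean-squared penalties, data term scaled by 400.
loss_pde = ((y_t - a * y_xx - b * y_x - c * y_hat) ** 2).mean()
loss_data = 400.0 * ((y_hat - y) ** 2).mean()

# Gradients of each term with respect to the PDE parameters.
g_data = torch.autograd.grad(loss_data, [a, b, c], retain_graph=True, allow_unused=True)
g_pde = torch.autograd.grad(loss_pde, [a, b, c], retain_graph=True)
print("data-loss grads:", g_data)  # (None, None, None): a, b, c are not in this graph
print("PDE-loss grads: ", g_pde)
```

In this sketch the PDE parameters receive gradients only through the residual term, so a heavily weighted data term shapes the network output while a, b and c are driven by the comparatively small residual term.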

Please let me know if any of these points need clarification. I'd be happy to follow up on these.

Best wishes,
Bruno

AliaNajwaMY · 2024-10-24T13:36:02Z

AliaNajwaMY
Oct 24, 2024
Author

Hello Bruno,

Thanks for replying! I just realised that the right PDE is actually the one in my code file, meaning that I wanted to solve u_t - a*u_xx - b*u_x - c*u = 0, sorry for that! In that case, I think your last suggestion applies the most, since the function that I use for data generation would actually satisfy the PDE. What scaling do you suggest I use instead? I tried several scalings, but when I increased the scaling of ell_2 instead of ell_f, it just made the PDE solution inaccurate without actually helping to recover the parameters.

brunopjacob · 2024-10-24T19:01:41Z

brunopjacob
Oct 24, 2024
Maintainer

Hi @AliaNajwaMY!

I think the function that you are using (unless you changed it) does not satisfy the PDE. Here's what I got from Wolfram Alpha:

[Screenshot of the Wolfram Alpha result: Screenshot.2024-10-24.at.11.44.28.AM.png]

In other words, the function you are using to generate the training data does not satisfy the PDE unless you add that difference (the -b*exp(c-a*pi^2)... term shown in the last equality of the Wolfram Alpha result), i.e., unless you modify your PDE to the following:

du/dt - a * d2u/dx2 - b * du/dx - c * u - f = 0

where f is the -b*exp(c-a*pi^2)... term. That way, minimizing the loss ell_f actually makes sense, as ell_f = fpinn = du/dt - a * d2u/dx2 - b * du/dx - c * u - f -> 0. What you are currently doing, on the other hand, is ell_f = fpinn = du/dt - a * d2u/dx2 - b * du/dx - c * u, which won't work, as du/dt - a * d2u/dx2 - b * du/dx - c * u will, at best, approach f here.

So you can either i) add the source term f to the PDE, by changing your fpinn to have that f term; or ii) change the way you are generating your exact solution u. I don't think this PDE has an analytical solution such that f = 0, so I think you can either go with option i) or you can propose a new u(x,t) and recover a new f from plugging u back into the pde.
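
Options i) and ii) both come down to plugging a candidate u into the PDE operator and reading off what is left over. A minimal SymPy sketch of that step follows; the helper name `source_term` and the trial function sin(pi x) exp(-t) are hypothetical and chosen only for illustration.

```python
import sympy as sp

x, t, a, b, c = sp.symbols("x t a b c", real=True)


def source_term(u):
    """Return f = u_t - a*u_xx - b*u_x - c*u for a candidate solution u(x, t)."""
    f = sp.diff(u, t) - a * sp.diff(u, x, 2) - b * sp.diff(u, x) - c * u
    return sp.simplify(f)


# Hypothetical manufactured solution (not the one used in the thread).
u_trial = sp.sin(sp.pi * x) * sp.exp(-t)
f_trial = source_term(u_trial)
print(f_trial)
```

Whatever f comes out would be subtracted inside the residual, giving f_pinn = u_t - a*u_xx - b*u_x - c*u - f. If f simplifies to 0, the candidate already solves the homogeneous PDE and no source term is needed.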

Regarding the recovering of parameters, you'll need to first fix the PDE / u / f incompatibility that I described above in order to actually recover a, b and c accurately. In this case, it's likely that you'd be able to recover a, b and c with the scaling as is.

I hope that helps! Please don't hesitate to ask for clarification :)

Best,
Bruno

AliaNajwaMY · 2024-10-24T19:44:13Z

AliaNajwaMY
Oct 24, 2024
Author

Hello Bruno,

Thank you so much for getting back to me! It's odd that they didn't equate to 0, because if I rewrite the Wolfram Alpha result you included, I can cancel the terms out to get 0. Am I missing something here?

brunopjacob · 2024-10-24T20:29:53Z

brunopjacob
Oct 24, 2024
Maintainer

Hi @AliaNajwaMY, I'm sorry, you are absolutely right: it does reduce to zero, so the u you proposed is indeed a solution of the PDE! In this case, chances are that the optimization is getting stuck in a local minimum, so a, b and c get somewhat close to the true values but do not reach the global minimum. This is an ongoing research problem in PINNs (e.g., see https://arxiv.org/pdf/2203.13648, https://www.sciencedirect.com/science/article/pii/S2590123024001841), and things like adaptive weights for the different loss terms, as well as adaptive collocation point creation, may help.
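
The cancellation can be reproduced symbolically. The sketch below uses SymPy (an assumption; the thread used Wolfram Alpha) with the exact solution from `heat_eq_exact_solution`. Its second half is a side observation that is not made in the thread: for b = 0, shifting a by d and c by pi^2 * d also leaves the residual at zero on this particular solution. The symbol `d` and the helper name `pde_residual` are hypothetical.

```python
import sympy as sp

x, t, a, b, c, d = sp.symbols("x t a b c d", real=True)


def pde_residual(u, a_, b_, c_):
    """Residual u_t - a_*u_xx - b_*u_x - c_*u of the PDE from the thread."""
    return sp.diff(u, t) - a_ * sp.diff(u, x, 2) - b_ * sp.diff(u, x) - c_ * u


# Exact solution used by the data generator in the question.
u = sp.sin(sp.pi * x + b * sp.pi * t) * sp.exp((-a * sp.pi**2 + c) * t)
res_true = sp.simplify(sp.expand(pde_residual(u, a, b, c)))
print(res_true)  # 0

# Side check for b = 0: evaluate the residual with shifted parameters
# (a + d, 0, c + pi^2 * d) on the solution generated with (a, 0, c).
u0 = u.subs(b, 0)
res_shifted = sp.simplify(sp.expand(pde_residual(u0, a + d, 0, c + sp.pi**2 * d)))
print(res_shifted)  # 0
```

Since both residuals vanish, data generated with b = 0 appears to constrain only the combination c - a*pi^2 (here 4 - pi^2) rather than a and c separately. Comparing that combination for the estimated parameters may be a useful diagnostic before attributing the mismatch to the optimizer alone.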

You'll notice as well that the characteristic length of the domain L, the number of layers, the number of collocation points, the scaling of each loss term, the dimensionless Péclet number (here bL/a), and the Damköhler number (cL^2/a) should all affect the presence of these local minima, because all of these parameters affect the loss landscape, directly or indirectly. Sometimes a minimum is somewhat sticky (e.g., when the PDE has a fixed point that attracts and traps the optimization process); that's the case for equations like Allen-Cahn (see Fig. 3 of https://arxiv.org/pdf/2203.13648), and sometimes this can be resolved by running your optimization for a larger number of epochs.
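
For reference, the two dimensionless groups named above can be computed from the values in the question. This is a small sketch; the function names are hypothetical, and the definitions Pe = bL/a and Da = cL^2/a are the ones given in the paragraph above.

```python
def peclet(b, L, a):
    """Peclet number Pe = b*L/a (advection relative to diffusion)."""
    return b * L / a


def damkohler(c, L, a):
    """Damkohler number Da = c*L**2/a (reaction relative to diffusion)."""
    return c * L**2 / a


# Values from the question: a = 1.0, b = 0, c = 4.0, L = 1.
pe = peclet(0.0, 1.0, 1.0)
da = damkohler(4.0, 1.0, 1.0)
print(pe, da)  # 0.0 4.0
```

Lowering a to 0.1 with the other values unchanged raises Da by a factor of 10, which is one way to see how strongly a single parameter choice moves the problem in this parameter space.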

For example, if possible, try using a = 0.1. Because diffusion will take place at a much slower pace, the signal for c will be a bit better (you'll likely get a c closer to 4.0).
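
To see what changing a does to the training data, note that for b = 0 the exact solution in the question scales in time as exp((c - a*pi^2) t). A short sketch (the helper name `time_rate` is hypothetical) compares the two choices of a with c = 4:

```python
import math


def time_rate(a, c):
    """Exponent k in u(x, t) = sin(pi*x) * exp(k*t), with k = c - a*pi**2 (b = 0)."""
    return c - a * math.pi**2


for a in (1.0, 0.1):
    k = time_rate(a, 4.0)
    # Amplitude at t = 1 relative to the initial amplitude.
    print(f"a = {a}: k = {k:.3f}, u(t=1)/u(t=0) = {math.exp(k):.4g}")
```

With a = 1 the exponent is negative, so the solution decays quickly over t in [0, 1]; with a = 0.1 it is positive, so the solution grows instead. That may change how much of the sampled domain carries a usable signal.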

I'll keep exploring to see if we currently have anything in Neuromancer that can help in your specific problem, but unfortunately that's a difficult one!

Please let me know if that helps!

Best wishes,
Bruno

AliaNajwaMY · 2024-10-25T14:40:19Z

AliaNajwaMY
Oct 25, 2024
Author

Hello Bruno,

Oh, thank you for letting me know about this! I wonder if you or anyone on the Neuromancer team has a specific example of this problem occurring? The fix doesn't have to use the Neuromancer package, but I would love to see a concrete example of the fix if you have one!

Best,
Alia
