reading from ASE file slow error #16

Asif-Iqbal-Bhatti · 2023-03-15T10:37:59Z

Hello All,

While running the code I got this warning:

UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1658220910000/work/torch/csrc/utils/tensor_new.cpp:201.)

I am reading a POSCAR file using ASE and then using the TorchDFTD3Calculator object to find the D3-corrected energy. The warning says this is extremely slow. How can I avoid that?

Regards,
Asif
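
The warning quoted above comes from PyTorch and fires when a tensor is built from a Python list whose elements are separate numpy arrays. Below is a minimal sketch of the remedy the warning text itself suggests, using only numpy. The variable names and lattice values are illustrative and are not taken from the file that triggered the warning.

```python
import numpy as np

# Three separate row arrays held in a plain Python list: this is the kind of
# input the PyTorch warning complains about.
rows = [
    np.array([2.715, 2.715, 0.0]),
    np.array([0.0, 2.715, 2.715]),
    np.array([2.715, 0.0, 2.715]),
]

# Stack them into one (3, 3) ndarray first, as the warning recommends.
stacked = np.array(rows)

print(type(rows).__name__, "->", type(stacked).__name__, stacked.shape)
```

Passing `stacked` rather than `rows` to the tensor constructor is the conversion the warning message asks for.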

corochann · 2023-03-18T21:33:10Z

Hi Asif,

If you run this example script, is the warning message shown?

https://github.com/pfnet-research/torch-dftd/blob/master/examples/quick_start.py

Did you change only atoms = molecule("CH3CH2OCH3") -> atoms = read("XXX.POSCAR") ?

Asif-Iqbal-Bhatti · 2023-03-19T21:24:57Z

Hello,

Thank you for your reply. Yes, I did that, and then this warning appeared. I checked the timings and observed that performance is lower.

This is my script:
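import os
import time

from ase.io.vasp import read_vasp
from torch_dftd.torch_dftd3_calculator import TorchDFTD3Calculator

cwd = os.getcwd()  # assumed: the original script does not show how cwd is defined
t0 = time.time()
# Loop over every file directly inside POSThermo_data.
for g in next(os.walk('POSThermo_data'))[2]:
    f = read_vasp(os.path.join(cwd, 'POSThermo_data', g))
    # Passing atoms=f attaches the calculator to the structure.
    calc = TorchDFTD3Calculator(atoms=f, device="cpu", damping="bj", xc="pbe")
    d3 = f.get_potential_energy()
    print(f"{g:20s}: {d3} eV")
t1 = time.time()
print(t1 - t0)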

corochann · 2023-03-20T08:09:31Z

Is it correct that if you execute the example code as is, the warning message does not appear?

Can you check whether atoms.positions or atoms.cell contains a list of numpy arrays instead of a single numpy array?
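
One way to run the check asked for above is a small helper that reports whether a value is a single ndarray or a sequence of ndarrays. This is only a sketch: `describe` is a hypothetical helper name, and the two sample values stand in for attributes such as `atoms.positions` or `atoms.cell`.

```python
import numpy as np


def describe(name, value):
    """Report whether value is one ndarray or a list/tuple of ndarrays."""
    if isinstance(value, np.ndarray):
        return f"{name}: single ndarray, shape {value.shape}"
    if (
        isinstance(value, (list, tuple))
        and len(value) > 0
        and all(isinstance(v, np.ndarray) for v in value)
    ):
        return f"{name}: {type(value).__name__} of {len(value)} ndarrays"
    # Anything else (for example a wrapper object) is reported by type name.
    return f"{name}: {type(value).__name__}"


good = np.zeros((2, 3))            # one contiguous array
bad = [np.zeros(3), np.zeros(3)]   # list of separate row arrays

print(describe("good", good))
print(describe("bad", bad))
```

With a real structure one would call `describe("positions", atoms.positions)` and `describe("cell", atoms.cell)`. A wrapper object that is neither an ndarray nor a list falls through to the last branch, so only its type name is printed.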

Asif-Iqbal-Bhatti · 2023-03-20T09:13:06Z

If I run the example script, there is no warning. The types are: Atoms(symbols='OC3H8', pbc=False), <class 'numpy.ndarray'>, <class 'ase.cell.Cell'>. So positions is a numpy array and cell is an ase.cell.Cell object.

Now, using read_vasp(), I get:
Atoms(symbols='BaTe', pbc=True, cell=[[-3.544879, -3.544879, 0.0], [-3.544879, 0.0, -3.544879], [0.0, -3.544879, -3.544879]]) <class 'numpy.ndarray'>, <class 'ase.cell.Cell'>

I think I need to read the positions and cell first, convert them with numpy.array(), then create an Atoms object and feed that to dftd3.

Asif-Iqbal-Bhatti · 2023-03-20T20:53:55Z

Hello Coro,

I want to say that I got rid of this warning by changing the source code. In the torch_dftd3_calculator.py module
I changed _preprocess_atoms to:
def _preprocess_atoms(self, atoms: Atoms) -> Dict[str, Optional[Tensor]]:
    pos = torch.tensor(np.array(atoms.get_positions()), device=self.device, dtype=self.dtype)
    Z = torch.tensor(np.array(atoms.get_atomic_numbers()), device=self.device)
    if any(atoms.pbc):
        cell: Optional[Tensor] = torch.tensor(
            np.array(atoms.get_cell()), device=self.device, dtype=self.dtype
        )

With np.array() there was no warning. What do you think? I compared DFT-D3 values for Materials Project data computed with Grimme's code and with your code, and they were identical.

corochann · 2023-03-21T21:19:27Z

Yeah, I think your solution works.
I expected atoms.get_positions(), atoms.get_atomic_numbers() and atoms.get_cell() to always return a numpy array, but that may not be true when you read from a VASP file...

Can you check which of positions, atomic_numbers or cell was the list of numpy arrays?

Another tentative solution may be:

atoms.positions = np.asarray(atoms.positions)
atoms.numbers = np.asarray(atoms.numbers)  # Atoms exposes atomic numbers as .numbers
# atoms.cell = np.asarray(atoms.cell)
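
For readers weighing np.asarray against the np.array call used in the earlier fix, here is a short sketch of how the two differ. The arrays are made-up sample data, not values from the thread.

```python
import numpy as np

already_array = np.eye(3)
list_of_rows = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]

# For an existing ndarray, asarray hands back the same object (no copy)...
same = np.asarray(already_array)
# ...while array makes a copy by default.
copy = np.array(already_array)
# For a list of row arrays, asarray builds one new 2-D ndarray.
merged = np.asarray(list_of_rows)

print(same is already_array, copy is already_array, merged.shape)
```

Either call yields a single ndarray from a list of rows. The practical difference is that np.asarray avoids an extra copy when the input is already an ndarray.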

Asif-Iqbal-Bhatti · 2023-03-22T09:27:47Z

So I checked and it is atoms.get_cell() that has the list of numpy.ndarrays. For others it is fine.

corochann · 2023-12-08T02:02:53Z

Is it possible to share with us the POSCAR file that caused this problem?
We could not reproduce this issue, so we cannot yet test or understand which corner case we need to handle.

Sorry for my late response.

lan496 · 2023-12-08T02:07:09Z

On our side, I tried the following POSCAR ("Si.POSCAR"):

system Si
5.430
0.5 0.5 0.0
0.0 0.5 0.5
0.5 0.0 0.5
2
cart
0.00 0.00 0.00
0.25 0.25 0.25

With it, read_vasp returns a Cell object (ase==3.22.1).

from ase.io.vasp import read_vasp

atoms = read_vasp("Si.POSCAR")
print(type(atoms.get_cell()))  # -> ase.cell.Cell

Some specific notation in POSCAR may change the behavior of read_vasp, but I am not sure...
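
To make the structure of that test file concrete, the sketch below parses the same Si POSCAR text with plain numpy. It is not ASE's read_vasp: `parse_simple_poscar` is a hypothetical helper that assumes exactly this layout (comment line, scale factor, three lattice vectors, atom counts with no species line, coordinate mode, coordinates) and ignores everything else the format allows.

```python
import numpy as np

POSCAR_TEXT = """system Si
5.430
0.5 0.5 0.0
0.0 0.5 0.5
0.5 0.0 0.5
2
cart
0.00 0.00 0.00
0.25 0.25 0.25
"""


def parse_simple_poscar(text):
    """Parse the minimal POSCAR layout above into (cell, positions)."""
    lines = text.strip().splitlines()
    scale = float(lines[1])  # universal scaling factor
    # Rows of the cell are the scaled lattice vectors.
    cell = scale * np.array([[float(x) for x in ln.split()] for ln in lines[2:5]])
    n_atoms = sum(int(x) for x in lines[5].split())
    mode = lines[6].strip().lower()
    coords = np.array(
        [[float(x) for x in ln.split()[:3]] for ln in lines[7:7 + n_atoms]]
    )
    if mode.startswith(("c", "k")):
        positions = scale * coords   # Cartesian coordinates are scaled
    else:
        positions = coords @ cell    # fractional coordinates times lattice
    return cell, positions


cell, positions = parse_simple_poscar(POSCAR_TEXT)
print(type(cell).__name__, cell.shape)
print(positions)
```

Both return values here are single ndarrays, which is the form the tensor conversion handles without a warning.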

PythonFZ mentioned this issue Nov 21, 2023

use get_cell().array #22

Open
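
The title of the linked change points at `get_cell().array`, that is, handing over the underlying numpy array instead of the cell object. The sketch below uses a hypothetical stand-in class, `CellLike` (not ASE's `Cell`), to show the difference between iterating a wrapper row by row and reading its single backing array. Whether this is exactly what triggered the warning in the reporter's setup is an assumption; the thread does not confirm it.

```python
import numpy as np


class CellLike:
    """Hypothetical stand-in for a cell wrapper; NOT ase.cell.Cell."""

    def __init__(self, rows):
        self.array = np.asarray(rows, dtype=float)  # single backing ndarray

    def __len__(self):
        return len(self.array)

    def __getitem__(self, i):
        return self.array[i]  # each row comes back as its own ndarray


cell = CellLike([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# Treating the wrapper as a generic sequence yields a list of row arrays.
as_rows = list(cell)
# Reading the backing attribute yields one (3, 3) ndarray.
as_array = cell.array

print(type(as_rows).__name__, len(as_rows), as_array.shape)
```

Code that consumes the wrapper as a sequence sees `as_rows`, while code that reads `.array` sees one contiguous array.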
