I am trying to parse a LAMMPS dump file, containing all simulation snapshots, into a list of LammpsDump objects using the parse_lammps_dumps() function. This method works for some dump files, but not all, even though an identical LAMMPS input script was used to generate all the files.
I have linked two example files to recreate this error:
one that parses without error ('success.dump'),
one that results in the error ('fail.dump').
The error can be recreated by running:
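from pymatgen.io.lammps.outputs import parse_lammps_dumps
d = parse_lammps_dumps('fail.dump')  # generator yielding one LammpsDump per snapshot
dumps = [i for i in d]  # the error is raised while the generator is consumed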
Please note that these files are large, at around 1.5 GB each.
I checked the length of the split line where the error is raised (line 62647): it has 20 fields, matching the header, not 22 as the error claims. I also checked for special characters, stray whitespace, and newlines, and nothing seems to add extra fields in the fail.dump file. Does anyone else have any experience with this?
Here is the full error message:
ParserError                               Traceback (most recent call last)
/home/kimia.gh/blue2/B4C_ML_Potential/analysis_scripts/shock/pymatgen_dump_analysis.py in line 9
    172 num_bins = 25
    174 d = parse_lammps_dumps(filename)
--> 175 dumps = [i for i in d]

/home/kimia.gh/blue2/B4C_ML_Potential/analysis_scripts/shock/pymatgen_dump_analysis.py in line 9, in <listcomp>(.0)
    172 num_bins = 25
    174 d = parse_lammps_dumps(filename)
--> 175 dumps = [i for i in d]

File ~/.conda/envs/pmg/lib/python3.9/site-packages/pymatgen/io/lammps/outputs.py:123, in parse_lammps_dumps(file_pattern)
    121 if line.startswith("ITEM: TIMESTEP"):
    122     if len(dump_cache) > 0:
--> 123         yield LammpsDump.from_str("".join(dump_cache))
    124     dump_cache = [line]
    125 else:

File ~/.conda/envs/pmg/lib/python3.9/site-packages/pymatgen/io/lammps/outputs.py:71, in LammpsDump.from_str(cls, string)
     69 box = LammpsBox(bounds, tilt)
     70 data_head = lines[8].replace("ITEM: ATOMS", "").split()
---> 71 data = pd.read_csv(StringIO("\n".join(lines[9:])), names=data_head, delim_whitespace=True)
     72 return cls(timestep, n_atoms, box, data)
...
File ~/.conda/envs/pmg/lib/python3.9/site-packages/pandas/_libs/parsers.pyx:2029, in pandas._libs.parsers.raise_parser_error()

ParserError: Error tokenizing data. C error: Expected 20 fields in line 62647, saw 22
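One thing that may be worth ruling out: judging from the traceback, `pd.read_csv` is given only the atom rows of a single snapshot (`lines[9:]`), so the "line 62647" in the message is presumably counted within that snapshot's data block, not from the top of the dump file. Below is a minimal diagnostic sketch under that assumption. It does not use pymatgen; the helper name `find_mismatched_rows` is hypothetical. It scans the file and reports every atom row whose whitespace-separated field count differs from its `ITEM: ATOMS` header.

```python
def find_mismatched_rows(lines):
    """Yield (file_line, snapshot, row_in_snapshot, n_fields, n_expected)
    for each atom row whose field count differs from its ITEM: ATOMS header.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    expected = None  # column count declared by the current ITEM: ATOMS header
    snapshot = -1    # zero-based index of the current snapshot
    row = 0          # one-based row number inside the current atom block
    for file_line, line in enumerate(lines, start=1):
        if line.startswith("ITEM: TIMESTEP"):
            snapshot += 1
            expected = None  # header lines follow; stop checking rows
        elif line.startswith("ITEM: ATOMS"):
            expected = len(line.replace("ITEM: ATOMS", "").split())
            row = 0
        elif line.startswith("ITEM:"):
            expected = None  # some other header section
        elif expected is not None and line.strip():
            row += 1
            n_fields = len(line.split())
            if n_fields != expected:
                yield file_line, snapshot, row, n_fields, expected

# Hypothetical usage on the file from this post:
# with open("fail.dump") as f:
#     for hit in find_mismatched_rows(f):
#         print(hit)
```

If this reports a row with 22 fields, its `file_line` value gives the position in the full file, which can then be inspected directly. If it reports nothing, the extra fields are probably not coming from a malformed atom row in the file itself.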
Package versions: