Update pandas to 3.0.0 #619
Important: Review skipped. Draft detected.
📝 Walkthrough
Bumped pandas from 2.3.3 to 3.0.0 in multiple environment and packaging files; adjusted HTML parsing calls in two referencedata modules to wrap fetched HTML with an in-memory text buffer.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.ci_support/environment.yml:
- Line 9: Replace the incompatible pandas pin by changing the deps entry "pandas
=3.0.0" to a pandas 2.x compatible spec (e.g., "pandas =2.3.*" or a range
"<3.0") in the environment.yml so the solver works with the declared
requires-python in pyproject.toml (>=3.9,<3.15); alternatively, if you intend to
require pandas 3.0.0, update pyproject.toml's requires-python to ">=3.11" and
run CI/install to confirm no breakages from pandas 3.0 behavioral changes
(string dtype, copy-on-write, datetime resolution) before committing.
In `@pyproject.toml`:
- Around line 49-66: The pyproject optional dependency groups lammps,
lammps_phonons, and experimental currently pin pandas==3.0.0 which requires
Python 3.11+; update those dependency entries in the lammps, lammps_phonons, and
experimental lists to a pandas 2.x release (e.g., pandas==2.1.3) that is
compatible with Python 3.9–3.10 to restore declared interpreter compatibility.
   - mendeleev =1.1.0
   - numpy =2.3.5
-  - pandas =2.3.3
+  - pandas =3.0.0
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Explore repository structure
find . -name "environment.yml" -type f | head -20
Repository: pyiron/atomistics
Length of output: 115
🏁 Script executed:
#!/bin/bash
# Read the CI support environment file
if [ -f ".ci_support/environment.yml" ]; then
  cat -n ".ci_support/environment.yml"
fi
Repository: pyiron/atomistics
Length of output: 540
🏁 Script executed:
#!/bin/bash
# Check pandas version availability in conda
# Using Python to query PyPI for pandas version history
python3 << 'EOF'
import urllib.request
import json
try:
    with urllib.request.urlopen('https://pypi.org/pypi/pandas/json') as response:
        data = json.loads(response.read().decode())
    # Check if 3.0.0 exists
    versions = list(data['releases'].keys())
    versions_sorted = sorted(versions, reverse=True)
    print("Latest pandas versions:")
    for v in versions_sorted[:15]:
        print(f" {v}")
    if "3.0.0" in versions:
        print("\n✓ pandas 3.0.0 exists")
    else:
        print("\n✗ pandas 3.0.0 DOES NOT EXIST")
        # Find 3.x versions
        v3_versions = [v for v in versions if v.startswith('3.')]
        if v3_versions:
            print(f" Available 3.x versions: {sorted(v3_versions, reverse=True)[:5]}")
        else:
            print(" No 3.x versions available")
except Exception as e:
    print(f"Error querying PyPI: {e}")
EOF
Repository: pyiron/atomistics
Length of output: 209
🌐 Web query:
pandas 3.0.0 release
💡 Result:
pandas 3.0.0 was released on January 21, 2026. [1]
Key highlights / breaking changes in 3.0.0:
- Dedicated string dtype enabled by default (uses PyArrow-backed strings if PyArrow is installed; otherwise falls back). [1]
- Copy-on-Write behavior for more consistent copy/view semantics. [1]
- New `pd.col()` syntax for building expressions (e.g., in `DataFrame.assign`). [1]
- Several backward-incompatible changes, including datetime/timedelta resolution inference changes and `pd.offsets.Day` now always representing a calendar day (DST-safe). [1]
- Minimum Python version is now 3.11+; NumPy minimum is 1.26.0. [1]
- Upgrade advice: upgrade to pandas 2.3 first and resolve deprecation warnings before moving to 3.0. [1]
Release candidates were published earlier (e.g., 3.0.0rc0 on Dec 3, 2025) on GitHub and PyPI. [2] [3]
Sources: [1] [2] [3]
🏁 Script executed:
#!/bin/bash
# Try to find information about available pandas versions
# using conda search via web or checking known version info
python3 << 'EOF'
# Check using requests with a different approach
import subprocess
import json
try:
    # Try using conda search command via conda-forge
    result = subprocess.run(['conda', 'search', '--json', 'pandas'],
                            capture_output=True, text=True, timeout=30)
    if result.returncode == 0:
        data = json.loads(result.stdout)
        versions = [pkg['version'] for pkg in data.get('pandas', [])]
        versions_sorted = sorted(set(versions), reverse=True)
        print("Available pandas versions in conda-forge:")
        for v in versions_sorted[:20]:
            print(f" {v}")
        if "3.0.0" in versions_sorted:
            print("\n✓ pandas 3.0.0 is available")
        else:
            print("\n✗ pandas 3.0.0 is NOT available")
            # Show highest versions
            v3_versions = [v for v in versions_sorted if v.startswith('3.')]
            if v3_versions:
                print(f" Highest 3.x version available: {v3_versions[0]}")
            else:
                print(f" Highest version available: {versions_sorted[0]}")
    else:
        print("Error running conda search")
except Exception as e:
    print(f"Error: {e}")
EOF
Repository: pyiron/atomistics
Length of output: 112
🏁 Script executed:
#!/bin/bash
# Check for Python version constraints in the project
find . -maxdepth 2 -type f \( -name "pyproject.toml" -o -name "setup.py" -o -name "setup.cfg" \) | head -5 | xargs -I {} cat -n {}
Repository: pyiron/atomistics
Length of output: 4463
🏁 Script executed:
#!/bin/bash
# Check CI configuration files
find .github -name "*.yml" -o -name "*.yaml" 2>/dev/null | head -5 | xargs -I {} bash -c "echo '=== {} ===' && head -30 {}"
Repository: pyiron/atomistics
Length of output: 3230
Critical: pandas 3.0.0 requires Python 3.11+ but project supports Python 3.9+.
The environment.yml pins pandas 3.0.0, which was released January 21, 2026 and requires Python 3.11 as a minimum. However, pyproject.toml declares requires-python = ">=3.9, <3.15", supporting Python 3.9 and 3.10. The CI environment will fail to solve on Python 3.9/3.10. Additionally, pandas 3.0.0 introduces breaking changes (string dtype enabled by default, Copy-on-Write behavior by default, datetime/timedelta resolution changes). The pandas upgrade guidance recommends upgrading to pandas 2.3 first and resolving deprecation warnings before moving to 3.0.
Either narrow the Python version requirement to 3.11+, or reconsider using pandas 3.0.0 given its recent release and breaking changes.
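To make the mismatch concrete, here is a minimal sketch that enumerates the interpreter versions admitted by the declared `requires-python` range but rejected by pandas 3.0.0. The helper name `unsupported_minors` and the tuple encoding of versions are illustrative assumptions, not part of the project or of any packaging tool.

```python
def unsupported_minors(project_min, project_max_exclusive, dependency_min):
    """List (major, minor) versions the project declares but the dependency rejects.

    Hypothetical helper for illustration; it only handles a single major version.
    """
    major = project_min[0]
    return [
        (major, minor)
        for minor in range(project_min[1], project_max_exclusive[1])
        if (major, minor) < dependency_min
    ]


# Values taken from the review text: requires-python = ">=3.9, <3.15"
# and pandas 3.0.0 needing Python 3.11 or newer.
gap = unsupported_minors((3, 9), (3, 15), (3, 11))
print(gap)
```

Any non-empty result means the pin and the declared interpreter range cannot both hold, which is the conflict this comment flags.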
🤖 Prompt for AI Agents
In @.ci_support/environment.yml at line 9, Replace the incompatible pandas pin
by changing the deps entry "pandas =3.0.0" to a pandas 2.x compatible spec
(e.g., "pandas =2.3.*" or a range "<3.0") in the environment.yml so the solver
works with the declared requires-python in pyproject.toml (>=3.9,<3.15);
alternatively, if you intend to require pandas 3.0.0, update pyproject.toml's
requires-python to ">=3.11" and run CI/install to confirm no breakages from
pandas 3.0 behavioral changes (string dtype, copy-on-write, datetime resolution)
before committing.
 lammps = [
     "pylammpsmpi==0.3.9",
     "jinja2==3.1.6",
-    "pandas==2.3.3",
+    "pandas==3.0.0",
     "pyiron_lammps==0.5.3",
 ]
 lammps_phonons = [
     "pylammpsmpi==0.3.9",
     "jinja2==3.1.6",
-    "pandas==2.3.3",
+    "pandas==3.0.0",
     "dynaphopy==1.18.0",
     "pyiron_lammps==0.5.3",
 ]
 experimental = [
     "lxml==6.0.2",
     "mendeleev==1.1.0",
-    "pandas==2.3.3",
+    "pandas==3.0.0",
     "requests==2.32.5",
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check pandas 3.0.0 Python requirement from PyPI metadata
python3 <<'PY'
import json, urllib.request
try:
    data = json.load(urllib.request.urlopen("https://pypi.org/pypi/pandas/3.0.0/json"))
    print("Requires-Python:", data["info"]["requires_python"])
    print("Version:", data["info"]["version"])
    print("Release Date:", data["info"]["release_url"])
except Exception as e:
    print(f"Error: {e}")
PY
Repository: pyiron/atomistics
Length of output: 195
🏁 Script executed:
# Also check the pyproject.toml to confirm the Python range
cat pyproject.toml | grep -A 5 "python"
Repository: pyiron/atomistics
Length of output: 301
🏁 Script executed:
# Check if there are tests for optional dependencies
git ls-files | grep -E "test.*lammps|test.*pandas" | head -20
Repository: pyiron/atomistics
Length of output: 933
🌐 Web query:
pandas 3.0.0 python version requirements support
💡 Result:
pandas 3.0.0 supports (and effectively requires) Python 3.11 or newer (i.e., Python 3.10 and earlier are not supported). [1]
Source: pandas 3.0.0 release notes (“Increased minimum version for Python”). [1]
🏁 Script executed:
# Check if there are any existing compatibility or version constraint issues in the codebase
grep -r "pandas" pyproject.toml | head -10
Repository: pyiron/atomistics
Length of output: 123
🏁 Script executed:
# Check if there are any existing test markers or skip conditions for optional deps
grep -r "skipif\|mark.skip" tests/ --include="*.py" | grep -i "lammps\|pandas" | head -10
Repository: pyiron/atomistics
Length of output: 43
Downgrade pandas to a version compatible with Python 3.9-3.10.
pandas 3.0.0 requires Python 3.11+, which breaks compatibility with Python 3.9 and 3.10 that are within your project's declared support range (>=3.9,<3.15). This affects lines 52, 58, and 65 in the lammps, lammps_phonons, and experimental optional dependency groups. Use pandas 2.x instead.
🤖 Prompt for AI Agents
In `@pyproject.toml` around lines 49 - 66, The pyproject optional dependency
groups lammps, lammps_phonons, and experimental currently pin pandas==3.0.0
which requires Python 3.11+; update those dependency entries in the lammps,
lammps_phonons, and experimental lists to a pandas 2.x release (e.g.,
pandas==2.1.3) that is compatible with Python 3.9–3.10 to restore declared
interpreter compatibility.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/atomistics/referencedata/wikipedia.py`:
- Line 36: pandas.read_html is being called with a requests.Response object
(response) which is invalid; change the call to pass a text stream by wrapping
response.text in io.StringIO and then call pandas.read_html on that stream
(update reference to the pandas.read_html invocation that assigns df_lst), and
ensure io is imported (or use io.StringIO where appropriate) so runtime errors
are avoided.
In `@src/atomistics/referencedata/wolframalpha.py`:
- Line 20: Replace the direct use of requests.Response in the pandas.read_html
call: change the call to use requests.get with a sensible timeout (e.g.,
timeout=10) and pass response.text (or response.content.decode(...)) or a
file-like string to pandas.read_html instead of the Response object; ensure
requests is imported if not already. Specifically, update the line using
pandas.read_html and requests.get so it reads the URL with a timeout, assigns
the response to a variable (e.g., resp = requests.get(url, timeout=10)), and
then call pandas.read_html(resp.text) (or resp.content decoded) to satisfy
pandas.read_html's accepted input types.
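A minimal sketch of the pattern both prompts describe, assuming the HTML has already been fetched as a `str`. The helper names `as_text_buffer` and `read_tables` are hypothetical; only `io.StringIO` and `pandas.read_html` are existing APIs. pandas is imported lazily, so the buffer helper works without it.

```python
from io import StringIO


def as_text_buffer(html: str) -> StringIO:
    """Wrap an HTML string in a file-like text buffer."""
    return StringIO(html)


def read_tables(html: str):
    """Parse every <table> in `html` into a list of DataFrames.

    Requires pandas plus an HTML parser such as lxml; both are assumed
    to be installed wherever this function is actually called.
    """
    import pandas

    return pandas.read_html(as_text_buffer(html))
```

With a `requests` response, the call would be `read_tables(response.text)`: the text of the response goes into the buffer, never the `Response` object itself.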
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/atomistics/referencedata/wolframalpha.py`:
- Around line 279-357: The data_dict entries (e.g., keys "thermalcondictivity",
"atomicradius", "bulkmodulus", etc.) currently use "http://" periodictable.com
URLs which cause unnecessary redirects; update each "url" value in the data_dict
to use "https://" (for example change
"http://periodictable.com/Properties/A/ThermalConductivity.an.html" to
"https://periodictable.com/Properties/A/ThermalConductivity.an.html") for all
entries in this mapping (items using _select_function_split,
_select_function_scientific, _select_function_lattice, _select_function_mass,
_select_function_poisson, _select_function_density, etc.) so the code in
src/atomistics/referencedata/wolframalpha.py uses HTTPS directly.
♻️ Duplicate comments (1)
src/atomistics/referencedata/wolframalpha.py (1)
19-23: Add an explicit timeout for HTML fetches.
`pandas.read_html(url)` performs a network fetch with no timeout control, so a slow or stalled server can hang this path. Consider fetching with a standard-library timeout and passing the HTML to `read_html`.
🛠️ Proposed fix (stdlib, keeps requests removed)
+from io import StringIO
+from urllib.request import urlopen
 import os
 from typing import Callable, Optional, Union
@@
-    content = pandas.read_html(url)
+    with urlopen(url, timeout=30) as resp:
+        html = resp.read().decode("utf-8", errors="replace")
+    content = pandas.read_html(StringIO(html))
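The stdlib approach suggested here can be sketched as a small standalone function. The name `fetch_html` and the 30-second default are assumptions for illustration; `urllib.request.urlopen` and its `timeout` argument are standard library.

```python
from urllib.request import urlopen


def fetch_html(url: str, timeout: float = 30.0) -> str:
    """Fetch `url` and return its body as text, giving up after `timeout` seconds."""
    with urlopen(url, timeout=timeout) as resp:
        # Decode leniently so one bad byte does not abort the whole parse.
        return resp.read().decode("utf-8", errors="replace")
```

The returned string can then be wrapped in `io.StringIO` before being handed to `pandas.read_html`, as the proposed diff does.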
 data_dict = {
     "thermalcondictivity": {
-        "url": "https://periodictable.com/Properties/A/ThermalConductivity.an.html",
+        "url": "http://periodictable.com/Properties/A/ThermalConductivity.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "atomicradius": {
-        "url": "https://periodictable.com/Properties/A/AtomicRadius.an.html",
+        "url": "http://periodictable.com/Properties/A/AtomicRadius.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "bulkmodulus": {
-        "url": "https://periodictable.com/Properties/A/BulkModulus.an.html",
+        "url": "http://periodictable.com/Properties/A/BulkModulus.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "shearmodulus": {
-        "url": "https://periodictable.com/Properties/A/ShearModulus.an.html",
+        "url": "http://periodictable.com/Properties/A/ShearModulus.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "youngmodulus": {
-        "url": "https://periodictable.com/Properties/A/YoungModulus.an.html",
+        "url": "http://periodictable.com/Properties/A/YoungModulus.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "poissonratio": {
-        "url": "https://periodictable.com/Properties/A/PoissonRatio.an.html",
+        "url": "http://periodictable.com/Properties/A/PoissonRatio.an.html",
         "select_function": _select_function_poisson,
         "current_filter": _poisson_filter,
     },
     "density": {
-        "url": "https://periodictable.com/Properties/A/Density.an.html",
+        "url": "http://periodictable.com/Properties/A/Density.an.html",
         "select_function": _select_function_density,
         "current_filter": _default_filter,
     },
     "liquiddensity": {
-        "url": "https://periodictable.com/Properties/A/LiquidDensity.an.html",
+        "url": "http://periodictable.com/Properties/A/LiquidDensity.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "thermalexpansion": {
-        "url": "https://periodictable.com/Properties/A/ThermalExpansion.an.html",
+        "url": "http://periodictable.com/Properties/A/ThermalExpansion.an.html",
         "select_function": _select_function_scientific,
         "current_filter": _default_filter,
     },
     "meltingpoint": {
-        "url": "https://periodictable.com/Properties/A/AbsoluteMeltingPoint.an.html",
+        "url": "http://periodictable.com/Properties/A/AbsoluteMeltingPoint.an.html",
         "select_function": _select_function_scientific,
         "current_filter": _default_filter,
     },
     "vaporizationheat": {
-        "url": "https://periodictable.com/Properties/A/VaporizationHeat.an.html",
+        "url": "http://periodictable.com/Properties/A/VaporizationHeat.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "specificheat": {
-        "url": "https://periodictable.com/Properties/A/SpecificHeat.an.html",
+        "url": "http://periodictable.com/Properties/A/SpecificHeat.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "latticeconstant": {
-        "url": "https://periodictable.com/Properties/A/LatticeConstants.an.html",
+        "url": "http://periodictable.com/Properties/A/LatticeConstants.an.html",
         "select_function": _select_function_lattice,
         "current_filter": _default_filter,
     },
     "crystal": {
-        "url": "https://periodictable.com/Properties/A/CrystalStructure.an.html",
+        "url": "http://periodictable.com/Properties/A/CrystalStructure.an.html",
         "select_function": _select_function_poisson,
         "current_filter": _default_filter,
     },
     "volmolar": {
-        "url": "https://periodictable.com/Properties/A/MolarVolume.an.html",
+        "url": "http://periodictable.com/Properties/A/MolarVolume.an.html",
         "select_function": _select_function_scientific,
         "current_filter": _default_filter,
     },
     "mass": {
-        "url": "https://periodictable.com/Properties/A/AtomicMass.an.html",
+        "url": "http://periodictable.com/Properties/A/AtomicMass.an.html",
         "select_function": _select_function_mass,
🧩 Analysis chain
🏁 Script executed:
cd src/atomistics/referencedata && git log --oneline -20 wolframalpha.py
Repository: pyiron/atomistics
Length of output: 121
🏁 Script executed:
# Check if HTTPS versions of periodictable.com are accessible
curl -I -L https://periodictable.com/Properties/A/ThermalConductivity.an.html 2>&1 | head -20
Repository: pyiron/atomistics
Length of output: 727
🏁 Script executed:
# Also check the current HTTP version
curl -I -L http://periodictable.com/Properties/A/ThermalConductivity.an.html 2>&1 | head -20
Repository: pyiron/atomistics
Length of output: 1031
Use HTTPS URLs directly instead of relying on server redirects.
While periodictable.com automatically redirects HTTP requests to HTTPS with proper HSTS headers, it's better practice to use HTTPS URLs directly (https://periodictable.com/...) to avoid unnecessary redirects and make the intent explicit.
🤖 Prompt for AI Agents
In `@src/atomistics/referencedata/wolframalpha.py` around lines 279 - 357, The
data_dict entries (e.g., keys "thermalcondictivity", "atomicradius",
"bulkmodulus", etc.) currently use "http://" periodictable.com URLs which cause
unnecessary redirects; update each "url" value in the data_dict to use
"https://" (for example change
"http://periodictable.com/Properties/A/ThermalConductivity.an.html" to
"https://periodictable.com/Properties/A/ThermalConductivity.an.html") for all
entries in this mapping (items using _select_function_split,
_select_function_scientific, _select_function_lattice, _select_function_mass,
_select_function_poisson, _select_function_density, etc.) so the code in
src/atomistics/referencedata/wolframalpha.py uses HTTPS directly.
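The requested change amounts to editing the URL literals, but the same normalization can be sketched programmatically. The helper `force_https` is hypothetical, and the `data_dict` below is an abbreviated stand-in for the real mapping (only the `url` field is shown, and only two keys).

```python
def force_https(url: str) -> str:
    """Return `url` with a leading http:// scheme upgraded to https://."""
    prefix = "http://"
    if url.startswith(prefix):
        return "https://" + url[len(prefix):]
    return url


# Abbreviated stand-in for the real mapping; other keys and fields are omitted.
data_dict = {
    "bulkmodulus": {"url": "http://periodictable.com/Properties/A/BulkModulus.an.html"},
    "density": {"url": "https://periodictable.com/Properties/A/Density.an.html"},
}

# Rewrite every entry in place; URLs that are already HTTPS are left untouched.
for entry in data_dict.values():
    entry["url"] = force_https(entry["url"])
```

In the actual module, changing the literals directly is simpler and keeps the mapping greppable; a helper like this is mainly useful as a guard in a test.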
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/atomistics/referencedata/wikipedia.py (1)
34-38: Add timeout and status check to prevent indefinite hangs and silent parsing of error pages.
`requests.get()` without a timeout can block indefinitely, and missing `response.raise_for_status()` allows error pages to be silently parsed, potentially returning incorrect elastic properties. The `pandas.read_html()` call will parse whatever HTML exists in the response without validating that the request succeeded.
🔧 Proposed fix
 response = requests.get(
     "https://en.wikipedia.org/wiki/Elastic_properties_of_the_elements_(data_page)",
     headers={"User-Agent": "atomistics package"},
+    timeout=30,
 )
+response.raise_for_status()
 df_lst = pandas.read_html(StringIO(response.text))
♻️ Duplicate comments (1)
src/atomistics/referencedata/wolframalpha.py (1)
21-21: Add timeout + status check to avoid indefinite hangs.
This still calls `requests.get` without a timeout. Suggest:
🔧 Proposed fix
-    content = pandas.read_html(StringIO(requests.get(url).text))
+    response = requests.get(url, timeout=30)
+    response.raise_for_status()
+    content = pandas.read_html(StringIO(response.text))
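Both outside-diff comments ask for the same fetch pattern, which could be factored into one helper along these lines. This is a sketch under stated assumptions: `fetch_html_checked` is a hypothetical name, and `client` is anything exposing a `requests`-style `get(url, timeout=...)`, such as the `requests` module itself or a `requests.Session`.

```python
from io import StringIO


def fetch_html_checked(url: str, client, timeout: float = 30.0) -> StringIO:
    """Fetch `url`, fail loudly on HTTP errors, and return the body as a text buffer.

    `client` is assumed to follow the requests API: `get()` accepts a
    `timeout` keyword and the response has `raise_for_status()` and `text`.
    """
    response = client.get(url, timeout=timeout)
    # Raise instead of letting an error page reach the HTML parser.
    response.raise_for_status()
    return StringIO(response.text)
```

Usage would look like `pandas.read_html(fetch_html_checked(url, requests))`. Passing the client in also makes the function easy to exercise with a stub instead of a live network call.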
Wait for pyiron/pyiron_lammps#303
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #619      +/-   ##
==========================================
+ Coverage   86.81%   86.82%   +0.01%
==========================================
  Files          43       43
  Lines        2419     2421       +2
==========================================
+ Hits         2100     2102       +2
  Misses        319      319
☔ View full report in Codecov by Sentry.