Conversation

@jan-janssen
Member

@jan-janssen jan-janssen commented Jan 25, 2026

Summary by CodeRabbit

  • Chores
    • Updated pandas dependency to 3.0.0 across environment and optional dependency configurations.
  • Improvements
    • Adjusted HTML parsing for external reference data lookups; this may change parsing outcomes for some queries.

@coderabbitai
Contributor

coderabbitai bot commented Jan 25, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Bumped pandas from 2.3.3 to 3.0.0 in multiple environment and packaging files; adjusted HTML parsing calls in two referencedata modules to wrap fetched HTML with an in-memory text buffer (io.StringIO) before passing to pandas.read_html.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Environment configuration: `.ci_support/environment-docs.yml`, `.ci_support/environment.yml`, `binder/environment.yml` | Updated pandas version pin from 2.3.3 → 3.0.0. |
| Project optional dependencies: `pyproject.toml` | Updated pandas pin from 2.3.3 → 3.0.0 in optional groups: lammps, lammps_phonons, experimental. |
| Referencedata HTML parsing: `src/atomistics/referencedata/wikipedia.py`, `src/atomistics/referencedata/wolframalpha.py` | Added io.StringIO import and changed calls to pandas.read_html to pass StringIO(response.text) (wrapping the response HTML in an in-memory text buffer) instead of passing raw text directly. |
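
The StringIO wrapping summarized above can be sketched as follows. This is a minimal illustration, not the code from the PR: `wrap_html` and `parse_tables` are hypothetical helper names, the sample HTML is a made-up placeholder, and `parse_tables` assumes pandas and an HTML parser such as lxml are installed. The likely motivation is that pandas deprecated passing literal HTML strings to `read_html` during the 2.x series, so a file-like object is the safer input.

```python
from io import StringIO

# Placeholder markup for illustration only; not real reference data.
SAMPLE_HTML = (
    "<table>"
    "<tr><th>element</th><th>value</th></tr>"
    "<tr><td>X</td><td>1.0</td></tr>"
    "</table>"
)


def wrap_html(html_text: str) -> StringIO:
    """Wrap an HTML string in an in-memory text buffer (a file-like object)."""
    return StringIO(html_text)


def parse_tables(html_text: str):
    """Parse every <table> in an HTML string into a list of DataFrames."""
    import pandas  # imported lazily so wrap_html works without pandas installed

    # read_html accepts a path, a URL, or a file-like object; the buffer is file-like.
    return pandas.read_html(wrap_html(html_text))


buffer = wrap_html(SAMPLE_HTML)
round_trip = buffer.read()  # the buffer yields exactly the text that was wrapped
```

With pandas available, `parse_tables(response.text)` would stand in for a direct `pandas.read_html(response.text)` call.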

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 I hop through lines with gentle cheer,
Pandas leapt a version near.
HTML wrapped snug in string,
Tiny changes make me sing.
A nibble, a hop — code feels clear.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Update pandas to 3.0.0' directly matches the main objective of the pull request, which is to bump the pandas dependency version from 2.3.3 to 3.0.0 across multiple configuration and source files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.ci_support/environment.yml:
- Line 9: Replace the incompatible pandas pin by changing the deps entry "pandas
=3.0.0" to a pandas 2.x compatible spec (e.g., "pandas =2.3.*" or a range
"<3.0") in the environment.yml so the solver works with the declared
requires-python in pyproject.toml (>=3.9,<3.15); alternatively, if you intend to
require pandas 3.0.0, update pyproject.toml's requires-python to ">=3.11" and
run CI/install to confirm no breakages from pandas 3.0 behavioral changes
(string dtype, copy-on-write, datetime resolution) before committing.

In `@pyproject.toml`:
- Around line 49-66: The pyproject optional dependency groups lammps,
lammps_phonons, and experimental currently pin pandas==3.0.0 which requires
Python 3.11+; update those dependency entries in the lammps, lammps_phonons, and
experimental lists to a pandas 2.x release (e.g., the previous pin
pandas==2.3.3) that is compatible with Python 3.9–3.10 to restore declared
interpreter compatibility.

   - mendeleev =1.1.0
   - numpy =2.3.5
-  - pandas =2.3.3
+  - pandas =3.0.0
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Explore repository structure
find . -name "environment.yml" -type f | head -20

Repository: pyiron/atomistics

Length of output: 115


🏁 Script executed:

#!/bin/bash
# Read the CI support environment file
if [ -f ".ci_support/environment.yml" ]; then
  cat -n ".ci_support/environment.yml"
fi

Repository: pyiron/atomistics

Length of output: 540


🏁 Script executed:

#!/bin/bash
# Check pandas version availability in conda
# Using Python to query PyPI for pandas version history
python3 << 'EOF'
import urllib.request
import json

try:
    with urllib.request.urlopen('https://pypi.org/pypi/pandas/json') as response:
        data = json.loads(response.read().decode())
    
    # Check if 3.0.0 exists
    versions = list(data['releases'].keys())
    versions_sorted = sorted(versions, reverse=True)
    
    print("Latest pandas versions:")
    for v in versions_sorted[:15]:
        print(f"  {v}")
    
    if "3.0.0" in versions:
        print("\n✓ pandas 3.0.0 exists")
    else:
        print("\n✗ pandas 3.0.0 DOES NOT EXIST")
        # Find 3.x versions
        v3_versions = [v for v in versions if v.startswith('3.')]
        if v3_versions:
            print(f"  Available 3.x versions: {sorted(v3_versions, reverse=True)[:5]}")
        else:
            print("  No 3.x versions available")
except Exception as e:
    print(f"Error querying PyPI: {e}")
EOF

Repository: pyiron/atomistics

Length of output: 209


🌐 Web query:

pandas 3.0.0 release

💡 Result:

pandas 3.0.0 was released on January 21, 2026. [1]

Key highlights / breaking changes in 3.0.0:

  • Dedicated string dtype enabled by default (uses PyArrow-backed strings if PyArrow is installed; otherwise falls back). [1]
  • Copy-on-Write behavior for more consistent copy/view semantics. [1]
  • New pd.col() syntax for building expressions (e.g., in DataFrame.assign). [1]
  • Several backward-incompatible changes, including datetime/timedelta resolution inference changes and pd.offsets.Day now always representing a calendar day (DST-safe). [1]
  • Minimum Python version is now 3.11+; NumPy minimum is 1.26.0. [1]
  • Upgrade advice: upgrade to pandas 2.3 first and resolve deprecation warnings before moving to 3.0. [1]

Release candidates were published earlier (e.g., 3.0.0rc0 on Dec 3, 2025) on GitHub and PyPI. [2] [3]

Sources: [1] [2] [3]


🏁 Script executed:

#!/bin/bash
# Try to find information about available pandas versions
# using conda search via web or checking known version info
python3 << 'EOF'
# Check using requests with a different approach
import subprocess
import json

try:
    # Try using conda search command via conda-forge
    result = subprocess.run(['conda', 'search', '--json', 'pandas'], 
                          capture_output=True, text=True, timeout=30)
    if result.returncode == 0:
        data = json.loads(result.stdout)
        versions = [pkg['version'] for pkg in data.get('pandas', [])]
        versions_sorted = sorted(set(versions), reverse=True)
        print("Available pandas versions in conda-forge:")
        for v in versions_sorted[:20]:
            print(f"  {v}")
        
        if "3.0.0" in versions_sorted:
            print("\n✓ pandas 3.0.0 is available")
        else:
            print("\n✗ pandas 3.0.0 is NOT available")
            # Show highest versions
            v3_versions = [v for v in versions_sorted if v.startswith('3.')]
            if v3_versions:
                print(f"  Highest 3.x version available: {v3_versions[0]}")
            else:
                print(f"  Highest version available: {versions_sorted[0]}")
    else:
        print("Error running conda search")
except Exception as e:
    print(f"Error: {e}")
EOF

Repository: pyiron/atomistics

Length of output: 112


🏁 Script executed:

#!/bin/bash
# Check for Python version constraints in the project
find . -maxdepth 2 -type f \( -name "pyproject.toml" -o -name "setup.py" -o -name "setup.cfg" \) | head -5 | xargs -I {} cat -n {}

Repository: pyiron/atomistics

Length of output: 4463


🏁 Script executed:

#!/bin/bash
# Check CI configuration files
find .github -name "*.yml" -o -name "*.yaml" 2>/dev/null | head -5 | xargs -I {} bash -c "echo '=== {} ===' && head -30 {}"

Repository: pyiron/atomistics

Length of output: 3230


Critical: pandas 3.0.0 requires Python 3.11+ but project supports Python 3.9+.

The environment.yml pins pandas 3.0.0, which was released January 21, 2026 and requires Python 3.11 as a minimum. However, pyproject.toml declares requires-python = ">=3.9, <3.15", supporting Python 3.9 and 3.10. The CI environment will fail to solve on Python 3.9/3.10. Additionally, pandas 3.0.0 introduces breaking changes (string dtype enabled by default, Copy-on-Write behavior by default, datetime/timedelta resolution changes). The pandas upgrade guidance recommends upgrading to pandas 2.3 first and resolving deprecation warnings before moving to 3.0.

Either narrow the Python version requirement to 3.11+, or reconsider using pandas 3.0.0 given its recent release and breaking changes.
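
The mismatch described above can be made concrete with a small check. This is a sketch under stated assumptions, not part of the PR: `declared_minors` and `unsupported_by_dependency` are hypothetical helpers, versions are simplified to (major, minor) tuples within a single major version, and the bounds are the ones quoted in this review (requires-python ">=3.9, <3.15" and a pandas 3.0.0 floor of Python 3.11).

```python
from typing import List, Tuple

Version = Tuple[int, int]


def declared_minors(lower: Version, upper_exclusive: Version) -> List[Version]:
    """List (major, minor) versions in [lower, upper_exclusive).

    Simplifying assumption: both bounds share the same major version.
    """
    major = lower[0]
    return [(major, minor) for minor in range(lower[1], upper_exclusive[1])]


def unsupported_by_dependency(
    lower: Version, upper_exclusive: Version, dependency_floor: Version
) -> List[Version]:
    """Return declared interpreter versions below the dependency's minimum."""
    return [
        version
        for version in declared_minors(lower, upper_exclusive)
        if version < dependency_floor
    ]


# Project range and pandas 3.0.0 floor as quoted in the review comment.
gap = unsupported_by_dependency((3, 9), (3, 15), (3, 11))
print(gap)
```

A non-empty result means some declared interpreters cannot install the pinned dependency, which is the condition the comment flags.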

🤖 Prompt for AI Agents
In @.ci_support/environment.yml at line 9, Replace the incompatible pandas pin
by changing the deps entry "pandas =3.0.0" to a pandas 2.x compatible spec
(e.g., "pandas =2.3.*" or a range "<3.0") in the environment.yml so the solver
works with the declared requires-python in pyproject.toml (>=3.9,<3.15);
alternatively, if you intend to require pandas 3.0.0, update pyproject.toml's
requires-python to ">=3.11" and run CI/install to confirm no breakages from
pandas 3.0 behavioral changes (string dtype, copy-on-write, datetime resolution)
before committing.

Comment on lines 49 to 66
 lammps = [
     "pylammpsmpi==0.3.9",
     "jinja2==3.1.6",
-    "pandas==2.3.3",
+    "pandas==3.0.0",
     "pyiron_lammps==0.5.3",
 ]
 lammps_phonons = [
     "pylammpsmpi==0.3.9",
     "jinja2==3.1.6",
-    "pandas==2.3.3",
+    "pandas==3.0.0",
     "dynaphopy==1.18.0",
     "pyiron_lammps==0.5.3",
 ]
 experimental = [
     "lxml==6.0.2",
     "mendeleev==1.1.0",
-    "pandas==2.3.3",
+    "pandas==3.0.0",
     "requests==2.32.5",
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check pandas 3.0.0 Python requirement from PyPI metadata
python3 <<'PY'
import json, urllib.request
try:
    data = json.load(urllib.request.urlopen("https://pypi.org/pypi/pandas/3.0.0/json"))
    print("Requires-Python:", data["info"]["requires_python"])
    print("Version:", data["info"]["version"])
    print("Release URL:", data["info"]["release_url"])
except Exception as e:
    print(f"Error: {e}")
PY

Repository: pyiron/atomistics

Length of output: 195


🏁 Script executed:

# Also check the pyproject.toml to confirm the Python range
cat pyproject.toml | grep -A 5 "python"

Repository: pyiron/atomistics

Length of output: 301


🏁 Script executed:

# Check if there are tests for optional dependencies
git ls-files | grep -E "test.*lammps|test.*pandas" | head -20

Repository: pyiron/atomistics

Length of output: 933


🌐 Web query:

pandas 3.0.0 python version requirements support

💡 Result:

pandas 3.0.0 supports (and effectively requires) Python 3.11 or newer (i.e., Python 3.10 and earlier are not supported). [1]

Source: pandas 3.0.0 release notes (“Increased minimum version for Python”). [1]


🏁 Script executed:

# Check if there are any existing compatibility or version constraint issues in the codebase
grep -r "pandas" pyproject.toml | head -10

Repository: pyiron/atomistics

Length of output: 123


🏁 Script executed:

# Check if there are any existing test markers or skip conditions for optional deps
grep -r "skipif\|mark.skip" tests/ --include="*.py" | grep -i "lammps\|pandas" | head -10

Repository: pyiron/atomistics

Length of output: 43


Downgrade pandas to a version compatible with Python 3.9-3.10.

pandas 3.0.0 requires Python 3.11+, which breaks compatibility with Python 3.9 and 3.10 that are within your project's declared support range (>=3.9,<3.15). This affects lines 52, 58, and 65 in the lammps, lammps_phonons, and experimental optional dependency groups. Use pandas 2.x instead.

🤖 Prompt for AI Agents
In `@pyproject.toml` around lines 49 - 66, The pyproject optional dependency
groups lammps, lammps_phonons, and experimental currently pin pandas==3.0.0
which requires Python 3.11+; update those dependency entries in the lammps,
lammps_phonons, and experimental lists to a pandas 2.x release (e.g., the
previous pin pandas==2.3.3) that is compatible with Python 3.9–3.10 to restore
declared interpreter compatibility.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/atomistics/referencedata/wikipedia.py`:
- Line 36: pandas.read_html is being called with a requests.Response object
(response) which is invalid; change the call to pass a text stream by wrapping
response.text in io.StringIO and then call pandas.read_html on that stream
(update reference to the pandas.read_html invocation that assigns df_lst), and
ensure io is imported (or use io.StringIO where appropriate) so runtime errors
are avoided.

In `@src/atomistics/referencedata/wolframalpha.py`:
- Line 20: Replace the direct use of requests.Response in the pandas.read_html
call: change the call to use requests.get with a sensible timeout (e.g.,
timeout=10) and pass response.text (or response.content.decode(...)) or a
file-like string to pandas.read_html instead of the Response object; ensure
requests is imported if not already. Specifically, update the line using
pandas.read_html and requests.get so it reads the URL with a timeout, assigns
the response to a variable (e.g., resp = requests.get(url, timeout=10)), and
then call pandas.read_html(resp.text) (or resp.content decoded) to satisfy
pandas.read_html's accepted input types.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/atomistics/referencedata/wolframalpha.py`:
- Around line 279-357: The data_dict entries (e.g., keys "thermalcondictivity",
"atomicradius", "bulkmodulus", etc.) currently use "http://" periodictable.com
URLs which cause unnecessary redirects; update each "url" value in the data_dict
to use "https://" (for example change
"http://periodictable.com/Properties/A/ThermalConductivity.an.html" to
"https://periodictable.com/Properties/A/ThermalConductivity.an.html") for all
entries in this mapping (items using _select_function_split,
_select_function_scientific, _select_function_lattice, _select_function_mass,
_select_function_poisson, _select_function_density, etc.) so the code in
src/atomistics/referencedata/wolframalpha.py uses HTTPS directly.
♻️ Duplicate comments (1)
src/atomistics/referencedata/wolframalpha.py (1)

19-23: Add an explicit timeout for HTML fetches.
pandas.read_html(url) performs a network fetch with no timeout control, so a slow or stalled server can hang this code path. Consider fetching the page with a standard-library call that accepts a timeout, then passing the HTML to read_html.

🛠️ Proposed fix (stdlib, keeps requests removed)
+from io import StringIO
+from urllib.request import urlopen
 import os
 from typing import Callable, Optional, Union
@@
-    content = pandas.read_html(url)
+    with urlopen(url, timeout=30) as resp:
+        html = resp.read().decode("utf-8", errors="replace")
+    content = pandas.read_html(StringIO(html))

Comment on lines 279 to 359
 data_dict = {
     "thermalcondictivity": {
-        "url": "https://periodictable.com/Properties/A/ThermalConductivity.an.html",
+        "url": "http://periodictable.com/Properties/A/ThermalConductivity.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "atomicradius": {
-        "url": "https://periodictable.com/Properties/A/AtomicRadius.an.html",
+        "url": "http://periodictable.com/Properties/A/AtomicRadius.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "bulkmodulus": {
-        "url": "https://periodictable.com/Properties/A/BulkModulus.an.html",
+        "url": "http://periodictable.com/Properties/A/BulkModulus.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "shearmodulus": {
-        "url": "https://periodictable.com/Properties/A/ShearModulus.an.html",
+        "url": "http://periodictable.com/Properties/A/ShearModulus.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "youngmodulus": {
-        "url": "https://periodictable.com/Properties/A/YoungModulus.an.html",
+        "url": "http://periodictable.com/Properties/A/YoungModulus.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "poissonratio": {
-        "url": "https://periodictable.com/Properties/A/PoissonRatio.an.html",
+        "url": "http://periodictable.com/Properties/A/PoissonRatio.an.html",
         "select_function": _select_function_poisson,
         "current_filter": _poisson_filter,
     },
     "density": {
-        "url": "https://periodictable.com/Properties/A/Density.an.html",
+        "url": "http://periodictable.com/Properties/A/Density.an.html",
         "select_function": _select_function_density,
         "current_filter": _default_filter,
     },
     "liquiddensity": {
-        "url": "https://periodictable.com/Properties/A/LiquidDensity.an.html",
+        "url": "http://periodictable.com/Properties/A/LiquidDensity.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "thermalexpansion": {
-        "url": "https://periodictable.com/Properties/A/ThermalExpansion.an.html",
+        "url": "http://periodictable.com/Properties/A/ThermalExpansion.an.html",
         "select_function": _select_function_scientific,
         "current_filter": _default_filter,
     },
     "meltingpoint": {
-        "url": "https://periodictable.com/Properties/A/AbsoluteMeltingPoint.an.html",
+        "url": "http://periodictable.com/Properties/A/AbsoluteMeltingPoint.an.html",
         "select_function": _select_function_scientific,
         "current_filter": _default_filter,
     },
     "vaporizationheat": {
-        "url": "https://periodictable.com/Properties/A/VaporizationHeat.an.html",
+        "url": "http://periodictable.com/Properties/A/VaporizationHeat.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "specificheat": {
-        "url": "https://periodictable.com/Properties/A/SpecificHeat.an.html",
+        "url": "http://periodictable.com/Properties/A/SpecificHeat.an.html",
         "select_function": _select_function_split,
         "current_filter": _default_filter,
     },
     "latticeconstant": {
-        "url": "https://periodictable.com/Properties/A/LatticeConstants.an.html",
+        "url": "http://periodictable.com/Properties/A/LatticeConstants.an.html",
         "select_function": _select_function_lattice,
         "current_filter": _default_filter,
     },
     "crystal": {
-        "url": "https://periodictable.com/Properties/A/CrystalStructure.an.html",
+        "url": "http://periodictable.com/Properties/A/CrystalStructure.an.html",
         "select_function": _select_function_poisson,
         "current_filter": _default_filter,
     },
     "volmolar": {
-        "url": "https://periodictable.com/Properties/A/MolarVolume.an.html",
+        "url": "http://periodictable.com/Properties/A/MolarVolume.an.html",
         "select_function": _select_function_scientific,
         "current_filter": _default_filter,
     },
     "mass": {
-        "url": "https://periodictable.com/Properties/A/AtomicMass.an.html",
+        "url": "http://periodictable.com/Properties/A/AtomicMass.an.html",
         "select_function": _select_function_mass,
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cd src/atomistics/referencedata && git log --oneline -20 wolframalpha.py

Repository: pyiron/atomistics

Length of output: 121


🏁 Script executed:

# Check if HTTPS versions of periodictable.com are accessible
curl -I -L https://periodictable.com/Properties/A/ThermalConductivity.an.html 2>&1 | head -20

Repository: pyiron/atomistics

Length of output: 727


🏁 Script executed:

# Also check the current HTTP version
curl -I -L http://periodictable.com/Properties/A/ThermalConductivity.an.html 2>&1 | head -20

Repository: pyiron/atomistics

Length of output: 1031


Use HTTPS URLs directly instead of relying on server redirects.
While periodictable.com automatically redirects HTTP requests to HTTPS with proper HSTS headers, it's better practice to use HTTPS URLs directly (https://periodictable.com/...) to avoid unnecessary redirects and make the intent explicit.

🤖 Prompt for AI Agents
In `@src/atomistics/referencedata/wolframalpha.py` around lines 279 - 357, The
data_dict entries (e.g., keys "thermalcondictivity", "atomicradius",
"bulkmodulus", etc.) currently use "http://" periodictable.com URLs which cause
unnecessary redirects; update each "url" value in the data_dict to use
"https://" (for example change
"http://periodictable.com/Properties/A/ThermalConductivity.an.html" to
"https://periodictable.com/Properties/A/ThermalConductivity.an.html") for all
entries in this mapping (items using _select_function_split,
_select_function_scientific, _select_function_lattice, _select_function_mass,
_select_function_poisson, _select_function_density, etc.) so the code in
src/atomistics/referencedata/wolframalpha.py uses HTTPS directly.
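
If editing every entry by hand is undesirable, the scheme upgrade could also be expressed programmatically. The following is a minimal sketch, not the change made in the PR: `to_https` is a hypothetical helper and `example` is a one-entry stand-in for the real data_dict, using one of the URLs quoted above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url: str) -> str:
    """Return the URL with an http scheme upgraded to https; others unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


# One-entry stand-in for the mapping discussed in the comment above.
example = {
    "density": {"url": "http://periodictable.com/Properties/A/Density.an.html"},
}

# Build a new mapping so the original entries are left untouched.
upgraded = {
    key: {**entry, "url": to_https(entry["url"])} for key, entry in example.items()
}
```

Writing the https URLs directly in the source, as the comment suggests, is still the simpler option because it keeps the intent visible.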

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/atomistics/referencedata/wikipedia.py (1)

34-38: Add timeout and status check to prevent indefinite hangs and silent parsing of error pages.

requests.get() without a timeout can block indefinitely, and missing response.raise_for_status() allows error pages to be silently parsed, potentially returning incorrect elastic properties. The pandas.read_html() call will parse whatever HTML exists in the response without validating that the request succeeded.

🔧 Proposed fix
     response = requests.get(
         "https://en.wikipedia.org/wiki/Elastic_properties_of_the_elements_(data_page)",
         headers={"User-Agent": "atomistics package"},
+        timeout=30,
     )
+    response.raise_for_status()
     df_lst = pandas.read_html(StringIO(response.text))
♻️ Duplicate comments (1)
src/atomistics/referencedata/wolframalpha.py (1)

21-21: Add timeout + status check to avoid indefinite hangs.
This still calls requests.get without a timeout. Suggest:

🔧 Proposed fix
-    content = pandas.read_html(StringIO(requests.get(url).text))
+    response = requests.get(url, timeout=30)
+    response.raise_for_status()
+    content = pandas.read_html(StringIO(response.text))
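
Since both modules need the same fetch-with-timeout pattern, it could be factored into one shared helper. The sketch below is an assumption-laden alternative, not the PR's code: `decode_html` and `fetch_html` are hypothetical names, and it uses the standard library's urlopen instead of requests. urlopen raises HTTPError for 4xx/5xx responses, which plays the role of raise_for_status().

```python
from io import StringIO
from typing import Optional
from urllib.request import urlopen


def decode_html(raw: bytes, charset: Optional[str] = None) -> StringIO:
    """Decode response bytes and wrap the text in a buffer for an HTML parser."""
    # Fall back to UTF-8 when the server does not declare a charset.
    return StringIO(raw.decode(charset or "utf-8", errors="replace"))


def fetch_html(url: str, timeout: float = 30.0) -> StringIO:
    """Fetch a page with a timeout and return its HTML as a text buffer.

    Error responses raise urllib.error.HTTPError, so an error page is
    never handed to the parser silently.
    """
    with urlopen(url, timeout=timeout) as resp:
        return decode_html(resp.read(), resp.headers.get_content_charset())
```

A caller would then write `pandas.read_html(fetch_html(url))`; the 30-second default mirrors the value in the proposed fixes and is equally arbitrary.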

@jan-janssen
Member Author

Wait for pyiron/pyiron_lammps#303

@jan-janssen jan-janssen marked this pull request as draft January 25, 2026 18:27
@codecov

codecov bot commented Jan 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.82%. Comparing base (059d256) to head (8dc445e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #619      +/-   ##
==========================================
+ Coverage   86.81%   86.82%   +0.01%     
==========================================
  Files          43       43              
  Lines        2419     2421       +2     
==========================================
+ Hits         2100     2102       +2     
  Misses        319      319              
