CHANGELOG.md (107 additions & 0 deletions)
@@ -5,6 +5,113 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.1.10] - 2025-07-20

### Added
- Ray distributed processing support for parallel symbol table generation (addresses [#16](https://github.com/codellm-devkit/codeanalyzer-python/issues/16))
- `--ray/--no-ray` CLI flag to enable/disable Ray-based distributed analysis
- `--skip-tests/--include-tests` CLI flag to control whether test files are analyzed (improves analysis performance)
- `--file-name` CLI flag for single file analysis (addresses part of [#16](https://github.com/codellm-devkit/codeanalyzer-python/issues/16))
- Incremental caching system with SHA256-based file change detection
  - Automatic caching of analysis results to `analysis_cache.json`
  - File-level caching with content hash validation to avoid re-analyzing unchanged files
  - Significant performance improvements for subsequent analysis runs
  - Cache reuse statistics logging
- Custom exception classes for better error handling in symbol table building:
  - `SymbolTableBuilderException` (base exception)
  - `SymbolTableBuilderFileNotFoundError` (file not found errors)
- Enhanced PyModule schema with metadata fields for caching:
  - `last_modified` timestamp tracking
  - `content_hash` for precise change detection
- Progress bar support for both serial and parallel processing modes
- Enhanced test fixtures including xarray project for comprehensive testing
- Comprehensive `__init__.py` exports for syntactic analysis module
- Smart dependency installation with conditional logic:
  - Only installs requirements files when they exist (requirements.txt, requirements-dev.txt, dev-requirements.txt, test-requirements.txt)
  - Only performs editable installation when package definition files are present (pyproject.toml, setup.py, setup.cfg)
- Improved virtual environment setup with better dependency detection and installation logic

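The SHA256-based change detection listed above can be illustrated with a minimal sketch. It is not the project's implementation: the helper names `content_hash`, `file_unchanged`, and `record_file`, and the shape of the cache dictionary, are assumptions made for illustration. Only the `last_modified` and `content_hash` field names come from the changelog.

```python
import hashlib
from pathlib import Path
from typing import Dict


def content_hash(path: Path) -> str:
    """Return the SHA256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_unchanged(path: Path, cache: Dict[str, dict]) -> bool:
    """True when the cache has an entry for `path` whose hash still matches."""
    entry = cache.get(str(path))
    return entry is not None and entry.get("content_hash") == content_hash(path)


def record_file(path: Path, cache: Dict[str, dict]) -> None:
    """Store the metadata fields (hypothetical layout) after analyzing a file."""
    cache[str(path)] = {
        "last_modified": path.stat().st_mtime,
        "content_hash": content_hash(path),
    }
```

A file is re-analyzed only when `file_unchanged` returns `False`. Comparing content hashes rather than timestamps means a file that was touched but not edited still counts as unchanged.
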
### Changed
- **BREAKING CHANGE**: Updated Python version requirement from `>=3.10` to `>=3.9` for broader compatibility (closes [#17](https://github.com/codellm-devkit/codeanalyzer-python/issues/17))
- **BREAKING CHANGE**: Updated dependency versions with more conservative constraints for better stability:
  - `pydantic` downgraded from `>=2.11.7` to `>=1.8.0,<2.0.0` for stability
  - `pandas` constrained to `>=1.3.0,<2.0.0`
  - `numpy` constrained to `>=1.21.0,<1.24.0`
  - `rich` constrained to `>=12.6.0,<14.0.0`
  - `typer` constrained to `>=0.9.0,<1.0.0`
  - Other dependencies updated with conservative version ranges for better compatibility
- Major Architecture Enhancement: Complete rewrite of analysis caching system
  - `analyze()` method now implements intelligent caching with PyApplication serialization
  - Symbol table building redesigned to support incremental updates and cache reuse
  - File change detection using SHA256 content hashing for maximum accuracy
- Enhanced `Codeanalyzer` constructor signature to accept `file_name` parameter for single file analysis
- Refactored symbol table building from monolithic `build()` method to cache-aware file-level processing
- Enhanced `Codeanalyzer` constructor signature to accept `skip_tests` and `using_ray` parameters
- Improved error handling with proper context managers in core analyzer
- Updated CLI to use Pydantic v1 compatible JSON serialization methods
- Reorganized syntactic analysis module structure with proper exception handling and exports
- Enhanced virtual environment detection with better fallback mechanisms
- Symbol table builder now sets metadata fields (`last_modified`, `content_hash`) for all PyModule objects

### Fixed
- Fixed critical symbol table bug for nested functions (closes [#15](https://github.com/codellm-devkit/codeanalyzer-python/issues/15))
  - Corrected `_callables()` method recursion logic to properly capture both outer and inner functions
  - Previously, only inner/nested functions were being captured in the symbol table
  - Now correctly processes module-level functions, class methods, and all nested function definitions
- Fixed nested method/function signature generation in symbol table builder
  - Corrected `_callables()` method to properly build fully qualified signatures for nested structures
  - Fixed issue where nested functions and methods were getting incorrect signatures (e.g., `main.__init__` instead of `main.outer_function.NestedClass.__init__`)
  - Added `prefix` parameter to `_callables()` and `_add_class()` methods to maintain proper nesting context
  - Signatures now correctly reflect the full nested hierarchy (e.g., `main.outer_function.NestedClass.nested_class_method.method_nested_function`)
  - Updated class method processing to pass class signature as prefix to nested callable processing
  - Improved path relativization to project directory for cleaner signature generation
- Fixed Pydantic v2 compatibility issues by reverting to v1 API (`json()` instead of `model_dump_json()`)
- Fixed missing import statements and type annotations throughout the codebase
- Fixed symbol table builder to support individual file processing for distributed execution
- Improved error handling in virtual environment detection and Python interpreter resolution
- Fixed schema type annotations to use proper string keys for better serialization
- Enhanced import ordering and removed unnecessary blank lines in CLI module
- Improved virtual environment setup reliability:
  - Fixed unnecessary pip installs by adding conditional logic to only install when dependencies are available
  - Only attempts to install requirements files if they actually exist in the project
  - Only performs editable installation when package definition files are present
  - Prevents errors and warnings from attempting to install non-existent dependencies

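The conditional installation logic described in the last items above could look roughly like the sketch below. The function name `plan_pip_commands` and the decision to return commands rather than run them are assumptions for illustration. The file names are the ones listed in this changelog, and `-r` and `-e` are standard `pip install` options.

```python
import sys
from pathlib import Path
from typing import List

# File names taken from the changelog entry; the constants themselves are hypothetical.
REQUIREMENTS_FILES = (
    "requirements.txt",
    "requirements-dev.txt",
    "dev-requirements.txt",
    "test-requirements.txt",
)
PACKAGE_DEFINITION_FILES = ("pyproject.toml", "setup.py", "setup.cfg")


def plan_pip_commands(project_dir: Path, python: str = sys.executable) -> List[List[str]]:
    """Build pip commands only for dependency sources that actually exist."""
    commands: List[List[str]] = []
    for name in REQUIREMENTS_FILES:
        requirements = project_dir / name
        if requirements.is_file():
            commands.append([python, "-m", "pip", "install", "-r", str(requirements)])
    # Editable install only when the project declares itself as a package.
    if any((project_dir / name).is_file() for name in PACKAGE_DEFINITION_FILES):
        commands.append([python, "-m", "pip", "install", "-e", str(project_dir)])
    return commands
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)`. A project with neither requirements files nor package definition files yields an empty list, so no pip process is started at all.
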
### Technical Details
- Added Ray as a core dependency for distributed computing capabilities (addresses [#16](https://github.com/codellm-devkit/codeanalyzer-python/issues/16))
- Implemented `@ray.remote` decorator for parallel file processing
- Comprehensive caching system implementation:
  - `_load_pyapplication_from_cache()` and `_save_analysis_cache()` methods for PyApplication serialization
  - `_file_unchanged()` method with SHA256 content hash validation
  - Cache-aware symbol table building with selective file processing
  - Automatic cache statistics and performance reporting
- Enhanced progress tracking for both serial and parallel execution modes with Rich progress bars
- Updated schema to use `Dict[str, PyModule]` instead of `dict[Path, PyModule]` for better serialization
- Extended PyModule schema with optional `last_modified` and `content_hash` fields for caching metadata
- Added comprehensive exception hierarchy for better error classification and handling
- Refactored symbol table building into modular, file-level processing suitable for distribution
- Enhanced Python interpreter detection with support for multiple version managers (pyenv, conda, asdf)
- Added `hashlib` integration for file content hashing throughout the codebase
- Enhanced virtual environment setup logic:
  - Implemented conditional dependency installation with existence checks for requirements files and package definition files
- Modified `_add_class()` method to accept `prefix` parameter and pass class signature to method processing
- Updated `_callables()` method signature to include `prefix` parameter for nested context tracking
- Enhanced signature building logic to use prefix when available, falling back to Jedi resolution for top-level definitions
- Fixed recursive calls to pass current signature as prefix for proper nesting hierarchy

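The `prefix` mechanism described for `_add_class()` and `_callables()` above can be illustrated with a small sketch. It uses the standard-library `ast` module instead of Jedi, and `collect_signatures` is a hypothetical function, not the project's API. The point is only to show how passing the current signature down as a prefix yields fully qualified names and keeps the outer callables in the result.

```python
import ast
from typing import List

# Sample module mirroring the nesting used in the changelog's example signatures.
SOURCE = """
def outer_function():
    class NestedClass:
        def __init__(self):
            pass

        def nested_class_method(self):
            def method_nested_function():
                pass
"""


def collect_signatures(node: ast.AST, prefix: str) -> List[str]:
    """Collect qualified callable names under `node`, nesting via `prefix`."""
    signatures: List[str] = []
    for child in ast.iter_child_nodes(node):
        if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
            signature = f"{prefix}.{child.name}"
            signatures.append(signature)  # keep the outer callable itself
            # Recurse with the callable's own signature as the new prefix.
            signatures.extend(collect_signatures(child, signature))
        elif isinstance(child, ast.ClassDef):
            # Classes contribute to the prefix but are not callables here.
            signatures.extend(collect_signatures(child, f"{prefix}.{child.name}"))
        else:
            # Definitions may sit inside if/try/with blocks; keep the prefix.
            signatures.extend(collect_signatures(child, prefix))
    return signatures


signatures = collect_signatures(ast.parse(SOURCE), "main")
```

For the sample module, `signatures` contains `main.outer_function`, `main.outer_function.NestedClass.__init__`, `main.outer_function.NestedClass.nested_class_method`, and `main.outer_function.NestedClass.nested_class_method.method_nested_function`, in that order.
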
### Notes
- This release addresses most of the performance improvements requested in [#16](https://github.com/codellm-devkit/codeanalyzer-python/issues/16):
  - ✅ Ray parallelization implemented
  - ✅ Incremental caching with SHA256-based change detection implemented
  - ✅ `--file-name` option for single-file analysis implemented
  - ❌ `--nproc` option not yet included (Ray still uses all available cores)
- ✅ Critical bug fix for nested function detection ([#15](https://github.com/codellm-devkit/codeanalyzer-python/issues/15)) is included in this version
- Expected performance improvements: 2-10x faster on subsequent runs depending on code change frequency
- Enhanced symbol table accuracy ensures all function definitions are properly captured
- Virtual environment setup is now more robust and only installs dependencies when they are actually available