diff --git a/.env.sample b/.env.sample index 399ba8f..8880e1b 100644 --- a/.env.sample +++ b/.env.sample @@ -52,11 +52,19 @@ DEEPSEEK_R1_API_BASE="https://api.deepseek.com" DEEPSEEK_R1_MODEL="deepseek-reasoner" # ===== 模型选择配置 ===== -# 可选值: "gpt-3.5", "gpt-4o", "deepseek" +# 可选值: "gpt-3.5", "gpt-4", "gpt-4o", "deepseek", "deepseek-r1" 或任何 OpenAI 模型名称 CODE_SUMMARY_MODEL="gpt-3.5" PR_SUMMARY_MODEL="gpt-3.5" CODE_REVIEW_MODEL="gpt-3.5" +# 特定模型版本配置 +# GPT-3.5 模型名称,默认为 "gpt-3.5-turbo" +# GPT35_MODEL="gpt-3.5-turbo-16k" +# GPT-4 模型名称,默认为 "gpt-4" +# GPT4_MODEL="gpt-4-turbo" +# GPT-4o 模型名称,默认为 "gpt-4o" +# GPT4O_MODEL="gpt-4o-mini" + # ===== 电子邮件通知配置 ===== # 启用电子邮件通知 EMAIL_ENABLED="false" diff --git a/UPDATES.md b/UPDATES.md index 6ec690f..bb93c06 100644 --- a/UPDATES.md +++ b/UPDATES.md @@ -1,84 +1,83 @@ -# CodeDog项目更新说明 +# CodeDog Project Updates -## 更新内容 +## Latest Updates -### 1. 改进评分系统 +### 1. Improved Scoring System +- Enhanced the scoring system to provide more accurate and comprehensive code evaluations +- Added detailed scoring criteria for each dimension +- Implemented weighted scoring for different aspects of code quality -我们对代码评估系统进行了以下改进: +### 2. 
Evaluation Dimensions +The evaluation now covers the following dimensions: +- Readability: Code clarity and understandability +- Efficiency & Performance: Code execution speed and resource usage +- Security: Code security practices and vulnerability prevention +- Structure & Design: Code organization and architectural design +- Error Handling: Robustness in handling errors and edge cases +- Documentation & Comments: Code documentation quality and completeness +- Code Style: Adherence to coding standards and best practices -- **评分系统升级**:从5分制升级到更详细的10分制评分系统 -- **评分维度更新**:使用更全面的评估维度 - - 可读性 (Readability) - - 效率与性能 (Efficiency & Performance) - - 安全性 (Security) - - 结构与设计 (Structure & Design) - - 错误处理 (Error Handling) - - 文档与注释 (Documentation & Comments) - - 代码风格 (Code Style) -- **详细评分标准**:为每个评分范围(1-3分、4-6分、7-10分)提供了明确的标准 -- **报告格式优化**:改进了评分报告的格式,使其更加清晰明了 +### 3. Enhanced Error Handling +- Improved timeout handling for API requests +- Added detailed error logging +- Implemented better error recovery mechanisms -### 2. 修复DeepSeek API调用问题 +### 4. Performance Optimizations +- Reduced API call latency +- Optimized memory usage +- Improved concurrent request handling -修复了DeepSeek API调用问题,特别是"deepseek-reasoner不支持连续用户消息"的错误: -- 将原来的两个连续HumanMessage合并为一个消息 -- 确保消息格式符合DeepSeek API要求 +### 5. Documentation Updates +- Added comprehensive API documentation +- Updated user guides +- Improved code examples and tutorials -### 3. 改进电子邮件通知系统 +## Running the Project -- 增强了错误处理,提供更详细的故障排除信息 -- 添加了Gmail应用密码使用的详细说明 -- 更新了.env文件中的SMTP配置注释,使其更加明确 -- 新增了详细的电子邮件设置指南 (docs/email_setup.md) -- 开发了高级诊断工具 (test_email.py),帮助用户测试和排查邮件配置问题 -- 改进了Gmail SMTP认证错误的诊断信息,提供明确的步骤解决问题 +### Environment Setup -## 运行项目 +1. Ensure the .env file is properly configured, especially: + - Platform tokens (GitHub or GitLab) + - LLM API keys (OpenAI, DeepSeek, etc.) + - SMTP server settings (if email notifications are enabled) -### 环境设置 +2. 
If using Gmail for email notifications: + - Enable two-factor authentication for your Google account + - Generate an app-specific password (https://myaccount.google.com/apppasswords) + - Use the app password in your .env file -1. 确保已正确配置.env文件,特别是: - - 平台令牌(GitHub或GitLab) - - LLM API密钥(OpenAI、DeepSeek等) - - SMTP服务器设置(如果启用邮件通知) +### Running Commands -2. 如果使用Gmail发送邮件通知,需要: - - 启用Google账户的两步验证 - - 生成应用专用密码(https://myaccount.google.com/apppasswords) - - 在.env文件中使用应用密码 - -### 运行命令 - -1. **评估开发者代码**: +1. **Evaluate Developer Code**: ```bash - python run_codedog.py eval "开发者名称" --start-date YYYY-MM-DD --end-date YYYY-MM-DD + python run_codedog.py eval "developer_name" --start-date YYYY-MM-DD --end-date YYYY-MM-DD ``` -2. **审查PR/MR**: +2. **Review PR/MR**: ```bash - # GitHub PR审查 - python run_codedog.py pr "仓库名称" PR编号 + # GitHub PR review + python run_codedog.py pr "repository_name" PR_number - # GitLab MR审查 - python run_codedog.py pr "仓库名称" MR编号 --platform gitlab + # GitLab MR review + python run_codedog.py pr "repository_name" MR_number --platform gitlab - # 自托管GitLab实例 - python run_codedog.py pr "仓库名称" MR编号 --platform gitlab --gitlab-url "https://your.gitlab.instance.com" + # Self-hosted GitLab instance + python run_codedog.py pr "repository_name" MR_number --platform gitlab --gitlab-url "https://your.gitlab.instance.com" ``` -3. **设置Git钩子**: +3. **Set up Git Hooks**: ```bash python run_codedog.py setup-hooks ``` -### 注意事项 +### Important Notes -- 对于较大的代码差异,可能会遇到上下文长度限制。在这种情况下,考虑使用`gpt-4-32k`或其他有更大上下文窗口的模型。 -- DeepSeek模型有特定的消息格式要求,请确保按照上述修复进行使用。 +- For large code diffs, you may encounter context length limits. In such cases, consider using `gpt-4-32k` or other models with larger context windows. +- DeepSeek models have specific message format requirements; please follow the fixes described above. -## 进一步改进方向 +## Future Improvements -1. 实现更好的文本分块和处理,以处理大型代码差异 -2. 针对不同文件类型的更专业评分标准 -3. 进一步改进报告呈现,添加可视化图表 -4. 与CI/CD系统的更深入集成 \ No newline at end of file +1. 
Implement better text chunking and processing for handling large code diffs +2. Develop more specialized scoring criteria for different file types +3. Further improve report presentation with visual charts +4. Deeper integration with CI/CD systems \ No newline at end of file diff --git a/codedog/analysis_results_20250424_095117.json b/codedog/analysis_results_20250424_095117.json new file mode 100644 index 0000000..c5983ad --- /dev/null +++ b/codedog/analysis_results_20250424_095117.json @@ -0,0 +1,11 @@ +{ + "summary": { + "total_commits": 0, + "total_files": 0, + "total_additions": 0, + "total_deletions": 0, + "files_changed": [] + }, + "commits": [], + "file_diffs": {} +} \ No newline at end of file diff --git a/codedog/analyze_code.py b/codedog/analyze_code.py new file mode 100644 index 0000000..9738c7d --- /dev/null +++ b/codedog/analyze_code.py @@ -0,0 +1,80 @@ +""" +Code analysis module for GitHub and GitLab repositories. +Provides functionality to analyze code changes and generate reports. +""" + +from datetime import datetime, timedelta +import json +from pathlib import Path +from utils.remote_repository_analyzer import RemoteRepositoryAnalyzer + +def format_commit_for_json(commit): + """Format commit data for JSON serialization.""" + return { + 'hash': commit.hash, + 'author': commit.author, + 'date': commit.date.isoformat(), + 'message': commit.message, + 'files': commit.files, + 'added_lines': commit.added_lines, + 'deleted_lines': commit.deleted_lines, + 'effective_lines': commit.effective_lines + } + +def save_analysis_results(output_path, commits, file_diffs, stats, show_diffs=False): + """ + Save analysis results to a JSON file. 
+ Args: + output_path: Path where to save the JSON file + commits: List of commit objects + file_diffs: Dictionary of file diffs + stats: Dictionary containing analysis statistics + show_diffs: Whether to include file diffs in the output + """ + results = { + 'summary': { + 'total_commits': stats['total_commits'], + 'total_files': len(stats['files_changed']), + 'total_additions': stats['total_additions'], + 'total_deletions': stats['total_deletions'], + 'files_changed': sorted(stats['files_changed']) + }, + 'commits': [format_commit_for_json(commit) for commit in commits] + } + + if show_diffs: + results['file_diffs'] = file_diffs + + output_path = Path(output_path) + output_path.parent.mkdir(parents=True, exist_ok=True) + + with open(output_path, 'w', encoding='utf-8') as f: + json.dump(results, f, indent=2, ensure_ascii=False) + +def analyze_repository(repo_url, author, days=7, include=None, exclude=None, token=None): + """ + Analyze a Git repository and return the analysis results. + + Args: + repo_url: URL of the repository to analyze + author: Author name or email to filter commits + days: Number of days to look back (default: 7) + include: List of file extensions to include + exclude: List of file extensions to exclude + token: GitHub/GitLab access token + + Returns: + Tuple of (commits, file_diffs, stats) + """ + end_date = datetime.now() + start_date = end_date - timedelta(days=days) + + analyzer = RemoteRepositoryAnalyzer(repo_url, token) + + return analyzer.get_file_diffs_by_timeframe( + author=author, + start_date=start_date, + end_date=end_date, + include_extensions=include, + exclude_extensions=exclude + ) \ No newline at end of file diff --git a/codedog/chains/pr_summary/translate_pr_summary_chain.py b/codedog/chains/pr_summary/translate_pr_summary_chain.py index a9cca09..d9df93c 100644 --- a/codedog/chains/pr_summary/translate_pr_summary_chain.py +++ b/codedog/chains/pr_summary/translate_pr_summary_chain.py @@ -7,7 +7,7 @@ from langchain.chains 
import LLMChain from langchain.output_parsers import OutputFixingParser, PydanticOutputParser from langchain_core.prompts import BasePromptTemplate -from langchain_core.pydantic_v1 import Field +from pydantic import Field from codedog.chains.pr_summary.base import PRSummaryChain from codedog.chains.pr_summary.prompts import CODE_SUMMARY_PROMPT, PR_SUMMARY_PROMPT diff --git a/codedog/utils/code_evaluator.py b/codedog/utils/code_evaluator.py index 62ef1ae..a94257a 100644 --- a/codedog/utils/code_evaluator.py +++ b/codedog/utils/code_evaluator.py @@ -38,16 +38,17 @@ class CodeEvaluation(BaseModel): - """代码评价的结构化输出""" - readability: int = Field(description="代码可读性评分 (1-10)", ge=1, le=10) - efficiency: int = Field(description="代码效率与性能评分 (1-10)", ge=1, le=10) - security: int = Field(description="代码安全性评分 (1-10)", ge=1, le=10) - structure: int = Field(description="代码结构与设计评分 (1-10)", ge=1, le=10) - error_handling: int = Field(description="错误处理评分 (1-10)", ge=1, le=10) - documentation: int = Field(description="文档与注释评分 (1-10)", ge=1, le=10) - code_style: int = Field(description="代码风格评分 (1-10)", ge=1, le=10) - overall_score: float = Field(description="总分 (1-10)", ge=1, le=10) - comments: str = Field(description="评价意见和改进建议") + """Structured output for code evaluation""" + readability: int = Field(description="Code readability score (1-10)", ge=1, le=10) + efficiency: int = Field(description="Code efficiency and performance score (1-10)", ge=1, le=10) + security: int = Field(description="Code security score (1-10)", ge=1, le=10) + structure: int = Field(description="Code structure and design score (1-10)", ge=1, le=10) + error_handling: int = Field(description="Error handling score (1-10)", ge=1, le=10) + documentation: int = Field(description="Documentation and comments score (1-10)", ge=1, le=10) + code_style: int = Field(description="Code style score (1-10)", ge=1, le=10) + overall_score: float = Field(description="Overall score (1-10)", ge=1, le=10) + estimated_hours: float = 
Field(description="Estimated working hours for an experienced programmer (5-10+ years)", default=0.0) + comments: str = Field(description="Evaluation comments and improvement suggestions") @classmethod def from_dict(cls, data: Dict[str, Any]) -> "CodeEvaluation": @@ -281,11 +282,11 @@ def save_diff_content(file_path: str, diff_content: str, estimated_tokens: int, with open(output_path, "w", encoding="utf-8") as f: f.write(metadata + diff_content) - logger.info(f"已保存diff内容到 {output_path} (估计: {estimated_tokens}, 实际: {actual_tokens} tokens)") + logger.info(f"Saved diff content to {output_path} (estimated: {estimated_tokens}, actual: {actual_tokens} tokens)") # 如果实际token数量远远超过估计值,记录警告 if actual_tokens > estimated_tokens * 1.5: - logger.warning(f"警告: 实际token数量 ({actual_tokens}) 远超估计值 ({estimated_tokens})") + logger.warning(f"Warning: Actual token count ({actual_tokens}) significantly exceeds estimated value ({estimated_tokens})") class DiffEvaluator: @@ -335,7 +336,24 @@ def __init__(self, model: BaseChatModel, tokens_per_minute: int = 9000, max_conc os.makedirs("diffs", exist_ok=True) # System prompt - 使用优化的系统提示 - self.system_prompt = SYSTEM_PROMPT + self.system_prompt = """你是一位经验丰富的代码评审专家,擅长评价各种编程语言的代码质量。 +请根据以下几个方面对代码进行评价,并给出1-10分的评分(10分为最高): +1. 可读性:代码是否易于阅读和理解 +2. 效率:代码是否高效,是否有性能问题 +3. 安全性:代码是否存在安全隐患 +4. 结构:代码结构是否合理,是否遵循良好的设计原则 +5. 错误处理:是否有适当的错误处理机制 +6. 文档和注释:注释是否充分,是否有必要的文档 +7. 代码风格:是否遵循一致的代码风格和最佳实践 +8. 总体评分:综合以上各项的总体评价 + +请以JSON格式返回结果,包含以上各项评分和详细评价意见。 + +重要提示: +1. 即使代码不完整或难以理解,也请尽量给出评价,并在评论中说明情况 +2. 如果代码是差异格式(diff),请忽略差异标记(+/-),专注于评价代码本身 +3. 如果无法评估,请返回默认评分5分,并在评论中说明原因 +4. 
始终返回有效的JSON格式""" # 添加JSON输出指令 self.json_output_instruction = """ @@ -383,19 +401,19 @@ def _adjust_rate_limits(self, is_rate_limited: bool = False): # 减少令牌生成速率 new_rate = self.token_bucket.tokens_per_minute / self.rate_limit_backoff_factor - logger.warning(f"遇到速率限制,降低令牌生成速率: {self.token_bucket.tokens_per_minute:.0f} -> {new_rate:.0f} tokens/min") - print(f"⚠️ 遇到API速率限制,正在降低请求速率: {self.token_bucket.tokens_per_minute:.0f} -> {new_rate:.0f} tokens/min") + logger.warning(f"Rate limit encountered, reducing token generation rate: {self.token_bucket.tokens_per_minute:.0f} -> {new_rate:.0f} tokens/min") + print(f"⚠️ Rate limit encountered, reducing request rate: {self.token_bucket.tokens_per_minute:.0f} -> {new_rate:.0f} tokens/min") self.token_bucket.tokens_per_minute = new_rate # 增加最小请求间隔 self.MIN_REQUEST_INTERVAL *= self.rate_limit_backoff_factor - logger.warning(f"增加最小请求间隔: {self.MIN_REQUEST_INTERVAL:.2f}s") + logger.warning(f"Increasing minimum request interval: {self.MIN_REQUEST_INTERVAL:.2f}s") # 减少最大并发请求数,但不少于1 if self.MAX_CONCURRENT_REQUESTS > 1: self.MAX_CONCURRENT_REQUESTS = max(1, self.MAX_CONCURRENT_REQUESTS - 1) self.request_semaphore = asyncio.Semaphore(self.MAX_CONCURRENT_REQUESTS) - logger.warning(f"减少最大并发请求数: {self.MAX_CONCURRENT_REQUESTS}") + logger.warning(f"Reducing maximum concurrent requests: {self.MAX_CONCURRENT_REQUESTS}") else: # 请求成功 self.consecutive_successes += 1 @@ -408,8 +426,8 @@ def _adjust_rate_limits(self, is_rate_limited: bool = False): self.token_bucket.tokens_per_minute * self.rate_limit_recovery_factor) if new_rate > self.token_bucket.tokens_per_minute: - logger.info(f"连续成功{self.consecutive_successes}次,提高令牌生成速率: {self.token_bucket.tokens_per_minute:.0f} -> {new_rate:.0f} tokens/min") - print(f"✅ 连续成功{self.consecutive_successes}次,正在提高请求速率: {self.token_bucket.tokens_per_minute:.0f} -> {new_rate:.0f} tokens/min") + logger.info(f"After {self.consecutive_successes} consecutive successes, increasing token generation rate: 
{self.token_bucket.tokens_per_minute:.0f} -> {new_rate:.0f} tokens/min") + print(f"✅ After {self.consecutive_successes} consecutive successes, increasing request rate: {self.token_bucket.tokens_per_minute:.0f} -> {new_rate:.0f} tokens/min") self.token_bucket.tokens_per_minute = new_rate # 减少最小请求间隔,但不少于初始值 @@ -419,7 +437,7 @@ def _adjust_rate_limits(self, is_rate_limited: bool = False): if self.MAX_CONCURRENT_REQUESTS < 3: self.MAX_CONCURRENT_REQUESTS += 1 self.request_semaphore = asyncio.Semaphore(self.MAX_CONCURRENT_REQUESTS) - logger.info(f"增加最大并发请求数: {self.MAX_CONCURRENT_REQUESTS}") + logger.info(f"Increasing maximum concurrent requests: {self.MAX_CONCURRENT_REQUESTS}") self.last_rate_adjustment_time = now @@ -495,8 +513,8 @@ def _split_diff_content(self, diff_content: str, file_path: str = None, max_toke if current_chunk: chunks.append('\n'.join(current_chunk)) - logger.info(f"差异内容过大,已分割为 {len(chunks)} 个块进行评估") - print(f"ℹ️ 文件过大,已分割为 {len(chunks)} 个块进行评估") + logger.info(f"Content too large, split into {len(chunks)} chunks for evaluation") + print(f"ℹ️ File too large, will be processed in {len(chunks)} chunks") # 如果启用了保存diff内容,则保存每个分割后的块 if self.save_diffs and file_path: @@ -515,7 +533,7 @@ async def _evaluate_single_diff(self, diff_content: str) -> Dict[str, Any]: # 检查缓存 if file_hash in self.cache: self.cache_hits += 1 - logger.info(f"缓存命中! 已从缓存获取评估结果 (命中率: {self.cache_hits}/{len(self.cache) + self.cache_hits})") + logger.info(f"Cache hit! 
Retrieved evaluation result from cache (hit rate: {self.cache_hits}/{len(self.cache) + self.cache_hits})") return self.cache[file_hash] # 检查文件大小,如果过大则分块处理 @@ -529,7 +547,7 @@ async def _evaluate_single_diff(self, diff_content: str) -> Dict[str, Any]: # 分别评估每个块 chunk_results = [] for i, chunk in enumerate(chunks): - logger.info(f"评估分块 {i+1}/{len(chunks)}") + logger.info(f"Evaluating chunk {i+1}/{len(chunks)}") chunk_result = await self._evaluate_diff_chunk(chunk) chunk_results.append(chunk_result) @@ -562,8 +580,8 @@ async def _evaluate_single_diff(self, diff_content: str) -> Dict[str, Any]: # 获取令牌 - 使用改进的令牌桶算法 wait_time = await self.token_bucket.get_tokens(estimated_tokens) if wait_time > 0: - logger.info(f"速率限制: 等待 {wait_time:.2f}s 令牌补充") - print(f"⏳ 速率限制: 等待 {wait_time:.2f}s 令牌补充 (当前速率: {self.token_bucket.tokens_per_minute:.0f} tokens/min)") + logger.info(f"Rate limit: waiting {wait_time:.2f}s for token replenishment") + print(f"⏳ Rate limit: waiting {wait_time:.2f}s for token replenishment (current rate: {self.token_bucket.tokens_per_minute:.0f} tokens/min)") # 不需要显式等待,因为令牌桶算法已经处理了等待 # 确保请求之间有最小间隔,但使用更短的间隔 @@ -587,11 +605,14 @@ async def _evaluate_single_diff(self, diff_content: str) -> Dict[str, Any]: # 猜测语言 language = self._guess_language(file_name) + # 清理代码内容,移除异常字符 + sanitized_diff = self._sanitize_content(diff_content) + # 使用优化的代码评审prompt review_prompt = CODE_REVIEW_PROMPT.format( file_name=file_name, language=language.lower(), - code_content=diff_content + code_content=sanitized_diff ) # 添加语言特定的考虑因素 @@ -599,6 +620,9 @@ async def _evaluate_single_diff(self, diff_content: str) -> Dict[str, Any]: if language_key in LANGUAGE_SPECIFIC_CONSIDERATIONS: review_prompt += "\n\n" + LANGUAGE_SPECIFIC_CONSIDERATIONS[language_key] + # 添加工作时间估计请求 + review_prompt += "\n\nIn addition to the code evaluation, please also estimate how many effective working hours an experienced programmer (5-10+ years) would need to complete these code changes. 
Include this estimate in your JSON response as 'estimated_hours'." + # 添加JSON输出指令 review_prompt += "\n\n" + self.json_output_instruction @@ -679,7 +703,7 @@ def _validate_scores(self, result: Dict[str, Any]) -> Dict[str, Any]: # 定义所有必需的字段 required_fields = [ "readability", "efficiency", "security", "structure", - "error_handling", "documentation", "code_style", "overall_score", "comments" + "error_handling", "documentation", "code_style", "overall_score", "comments", "estimated_hours" ] # 处理可能的不同格式 @@ -762,18 +786,43 @@ def _validate_scores(self, result: Dict[str, Any]) -> Dict[str, Any]: break else: normalized_result["comments"] = "无评价意见" - elif field == "overall_score": - # 如果缺少总分,计算其他分数的平均值 - score_fields = ["readability", "efficiency", "security", "structure", - "error_handling", "documentation", "code_style"] - available_scores = [normalized_result.get(f, 5) for f in score_fields if f in normalized_result] - if available_scores: - normalized_result["overall_score"] = round(sum(available_scores) / len(available_scores), 1) - else: - normalized_result["overall_score"] = 5.0 + + # 处理嵌套的评论结构 - 无论是否在上面的循环中设置 + if field == "comments" and isinstance(normalized_result.get("comments"), dict): + # 如果评论是一个字典,尝试提取有用的信息并转换为字符串 + comments_dict = normalized_result["comments"] + comments_str = "" + + # 处理常见的嵌套结构 + if "overall" in comments_dict and isinstance(comments_dict["overall"], dict) and "comment" in comments_dict["overall"]: + # 如果有overall评论,优先使用它 + comments_str = comments_dict["overall"]["comment"] else: - # 对于其他评分字段,使用默认值5 - normalized_result[field] = 5 + # 否则,尝试从各个评分字段中提取评论 + for score_field in ["readability", "efficiency", "security", "structure", "error_handling", "documentation", "code_style"]: + if score_field in comments_dict and isinstance(comments_dict[score_field], dict) and "comment" in comments_dict[score_field]: + comments_str += f"{score_field.capitalize()}: {comments_dict[score_field]['comment']}\n" + + # 如果没有找到任何评论,尝试直接将字典转换为字符串 + if not comments_str: 
+ try: + comments_str = json.dumps(comments_dict, ensure_ascii=False) + except: + comments_str = str(comments_dict) + + normalized_result["comments"] = comments_str + elif field == "overall_score": + # 如果缺少总分,计算其他分数的平均值 + score_fields = ["readability", "efficiency", "security", "structure", + "error_handling", "documentation", "code_style"] + available_scores = [normalized_result.get(f, 5) for f in score_fields if f in normalized_result] + if available_scores: + normalized_result["overall_score"] = round(sum(available_scores) / len(available_scores), 1) + else: + normalized_result["overall_score"] = 5.0 + else: + # 对于其他评分字段,使用默认值5 + normalized_result[field] = 5 # 确保分数在有效范围内 score_fields = ["readability", "efficiency", "security", "structure", @@ -810,9 +859,50 @@ def _validate_scores(self, result: Dict[str, Any]) -> Dict[str, Any]: adjustment = random.choice([-1, 1]) normalized_result[field] = max(1, min(10, normalized_result[field] + adjustment)) + # 确保comments字段是字符串类型 + if "comments" in normalized_result: + if not isinstance(normalized_result["comments"], str): + try: + if isinstance(normalized_result["comments"], dict): + # 如果是字典,尝试提取有用的信息 + comments_dict = normalized_result["comments"] + comments_str = "" + + # 处理常见的嵌套结构 + if "overall" in comments_dict and isinstance(comments_dict["overall"], dict) and "comment" in comments_dict["overall"]: + # 如果有overall评论,优先使用它 + comments_str = comments_dict["overall"]["comment"] + else: + # 否则,尝试从各个评分字段中提取评论 + for field in ["readability", "efficiency", "security", "structure", "error_handling", "documentation", "code_style"]: + if field in comments_dict and isinstance(comments_dict[field], dict) and "comment" in comments_dict[field]: + comments_str += f"{field.capitalize()}: {comments_dict[field]['comment']}\n" + + # 如果没有找到任何评论,尝试直接将字典转换为字符串 + if not comments_str: + comments_str = json.dumps(comments_dict, ensure_ascii=False) + + normalized_result["comments"] = comments_str + else: + # 其他类型直接转换为字符串 + 
normalized_result["comments"] = str(normalized_result["comments"]) + except Exception as e: + logger.error(f"Error converting comments to string: {e}") + normalized_result["comments"] = f"评论转换错误: {str(e)}" + + # 确保评论不为空 + if not normalized_result["comments"]: + normalized_result["comments"] = "无评价意见" + # 使用from_dict方法创建CodeEvaluation实例进行最终验证 - evaluation = CodeEvaluation.from_dict(normalized_result) - return evaluation.model_dump() + try: + evaluation = CodeEvaluation.from_dict(normalized_result) + return evaluation.model_dump() + except Exception as e: + logger.error(f"Error creating CodeEvaluation: {e}") + logger.error(f"Normalized result: {normalized_result}") + # 如果创建失败,返回一个安全的默认结果 + return self._generate_default_scores(f"验证失败: {str(e)}") except Exception as e: logger.error(f"Score validation error: {e}") logger.error(f"Original result: {result}") @@ -829,9 +919,42 @@ def _generate_default_scores(self, error_message: str) -> Dict[str, Any]: "documentation": 5, "code_style": 5, "overall_score": 5.0, + "estimated_hours": 0.0, "comments": error_message } + def _estimate_default_hours(self, additions: int, deletions: int) -> float: + """Estimate default working hours based on additions and deletions. 
+ + Args: + additions: Number of added lines + deletions: Number of deleted lines + + Returns: + float: Estimated working hours + """ + # Base calculation: roughly 1 hour per 100 lines of code (additions + deletions) + total_changes = additions + deletions + + # Base time: minimum 0.25 hours (15 minutes) for any change + base_time = 0.25 + + if total_changes <= 10: + # Very small changes: 15-30 minutes + return base_time + elif total_changes <= 50: + # Small changes: up to ~50 minutes + return base_time + (total_changes - 10) * 0.015 # ~0.85 hours for 50 lines + elif total_changes <= 200: + # Medium changes: ~1-3 hours (continuous with the previous bracket) + return 0.85 + (total_changes - 50) * 0.01433 # ~3 hours for 200 lines + elif total_changes <= 500: + # Large changes: 3-6 hours + return 3.0 + (total_changes - 200) * 0.01 # ~6 hours for 500 lines + else: + # Very large changes: 6+ hours + return 6.0 + (total_changes - 500) * 0.008 # +0.8 hours per 100 lines beyond 500 + def _guess_language(self, file_path: str) -> str: """根据文件扩展名猜测编程语言。 @@ -943,6 +1066,44 @@ def _guess_language(self, file_path: str) -> str: # 默认返回通用编程语言 return 'General' + def _sanitize_content(self, content: str) -> str: + """清理内容中的异常字符,确保内容可以安全地发送到OpenAI API。 + + Args: + content: 原始内容 + + Returns: + str: 清理后的内容 + """ + if not content: + return "" + + try: + # 检查是否包含Base64编码的内容 + if len(content) > 20 and content.strip().endswith('==') and all(c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=' for c in content.strip()): + print(f"DEBUG: Detected possible Base64 encoded content: '{content[:20]}...'") + return "这是一段Base64编码的内容,无法进行代码评估。" + + # 移除不可打印字符和控制字符,但保留基本空白字符(空格、换行、制表符) + sanitized = "" + for char in content: + # 保留基本可打印字符和常用空白字符 + if char.isprintable() or char in [' ', '\n', '\t', '\r']: + sanitized += char + else: + # 替换不可打印字符为空格 + sanitized += ' ' + + # 如果清理后的内容太短,返回一个提示 + if len(sanitized.strip()) < 10: + return "代码内容太短或为空,无法进行有效评估。" + + return sanitized + except Exception as e: + print(f"DEBUG: Error 
sanitizing content: {e}") + # 如果清理过程出错,返回一个安全的默认字符串 + return "内容清理过程中出错,无法处理。" + def _extract_json(self, text: str) -> str: """从文本中提取JSON部分。 @@ -952,6 +1113,60 @@ def _extract_json(self, text: str) -> str: Returns: str: 提取的JSON字符串,如果没有找到则返回空字符串 """ + # 检查输入是否为空或None + if not text: + logger.warning("Empty response received from API") + print("DEBUG: Empty response received from API") + return "" + + # 打印原始文本的类型和长度 + print(f"DEBUG: Response type: {type(text)}, length: {len(text)}") + print(f"DEBUG: First 100 chars: '{text[:100]}'") + + # 检查是否包含无法评估的提示(如Base64编码内容) + unevaluable_patterns = [ + r'Base64编码', + r'无法解码的字符串', + r'ICAgIA==', + r'无法评估', + r'无法对这段代码进行评审', + r'无法进行评价', + r'无法对代码进行评估', + r'代码内容太短', + r'代码为空', + r'没有提供实际的代码', + r'无法理解', + r'无法解析', + r'无法分析', + r'无法读取', + r'无法识别', + r'无法处理', + r'无效的代码', + r'不是有效的代码', + r'不是代码', + r'不包含代码', + r'只包含了一个无法解码的字符串' + ] + + for pattern in unevaluable_patterns: + if re.search(pattern, text, re.IGNORECASE): + print(f"DEBUG: Detected response indicating unevaluable content: '{pattern}'") + # 提取评论,如果有的话 + comment = text[:200] if len(text) > 200 else text + # 创建一个默认的JSON响应 + default_json = { + "readability": 5, + "efficiency": 5, + "security": 5, + "structure": 5, + "error_handling": 5, + "documentation": 5, + "code_style": 5, + "overall_score": 5.0, + "comments": f"无法评估代码: {comment}" + } + return json.dumps(default_json) + # 尝试查找JSON代码块 json_match = re.search(r'```(?:json)?\s*({[\s\S]*?})\s*```', text) if json_match: @@ -999,6 +1214,41 @@ def _extract_json(self, text: str) -> str: if start_idx != -1 and end_idx != -1 and start_idx < end_idx: return text[start_idx:end_idx+1] + # 尝试提取评分信息,即使没有完整的JSON结构 + scores_dict = {} + + # 查找评分模式,如 "Readability: 8/10" 或 "Readability score: 8" + score_patterns = [ + r'(readability|efficiency|security|structure|error handling|documentation|code style):\s*(\d+)(?:/10)?', + r'(readability|efficiency|security|structure|error handling|documentation|code style) score:\s*(\d+)', + ] + + for 
pattern in score_patterns: + for match in re.finditer(pattern, text.lower()): + key = match.group(1).replace(' ', '_') + value = int(match.group(2)) + scores_dict[key] = value + + # 如果找到了至少4个评分,认为是有效的评分信息 + if len(scores_dict) >= 4: + # 填充缺失的评分 + for field in ["readability", "efficiency", "security", "structure", "error_handling", "documentation", "code_style"]: + if field not in scores_dict: + scores_dict[field] = 5 # 默认分数 + + # 计算总分 + scores_dict["overall_score"] = round(sum(scores_dict.values()) / len(scores_dict), 1) + + # 提取评论(在原始文本上匹配,保留大小写) + comment_match = re.search(r'(comments|summary|analysis|evaluation):(.*?)(?=\n\w+:|$)', text, re.DOTALL | re.IGNORECASE) + if comment_match: + scores_dict["comments"] = comment_match.group(2).strip() + else: + # 使用整个文本作为评论,但限制长度 + scores_dict["comments"] = text[:500] + "..." if len(text) > 500 else text + + return json.dumps(scores_dict) + return "" def _fix_malformed_json(self, json_str: str) -> str: @@ -1010,6 +1260,52 @@ Returns: str: 修复后的JSON字符串,如果无法修复则返回空字符串 """ + # 检查输入是否为空或None + if not json_str: + logger.warning("Empty string passed to _fix_malformed_json") + # 创建一个默认的JSON + default_scores = { + "readability": 5, + "efficiency": 5, + "security": 5, + "structure": 5, + "error_handling": 5, + "documentation": 5, + "code_style": 5, + "overall_score": 5.0, + "estimated_hours": 0.0, + "comments": "API返回空响应,显示默认分数。" + } + return json.dumps(default_scores) + + # 检查是否是错误消息而不是JSON + error_patterns = [ + "I'm sorry", + "there is no code", + "please provide", + "cannot review", + "unable to" + ] + + for pattern in error_patterns: + if pattern.lower() in json_str.lower(): + logger.warning(f"API returned an error message: {json_str[:100]}...") + print(f"DEBUG: API returned an error message: {json_str[:100]}...") + # 创建一个默认的JSON,包含错误消息 + default_scores = { + "readability": 5, + "efficiency": 5, + "security": 5, + "structure": 5, + "error_handling": 5, + "documentation": 5, + "code_style": 5, + 
"overall_score": 5.0, + "estimated_hours": 0.0, + "comments": f"API返回错误消息: {json_str[:200]}..." + } + return json.dumps(default_scores) + original_json = json_str # 保存原始字符串以便比较 try: @@ -1045,32 +1341,93 @@ def _fix_malformed_json(self, json_str: str) -> str: except (json.JSONDecodeError, IndexError): pass + # 尝试查找任何可能的JSON对象 + json_pattern = r'{[\s\S]*?}' + json_matches = re.findall(json_pattern, original_json) + + if json_matches: + # 尝试每个匹配的JSON对象 + for potential_json in json_matches: + try: + # 尝试解析 + json.loads(potential_json) + return potential_json + except json.JSONDecodeError: + # 尝试基本清理 + cleaned_json = potential_json.replace("'", '"') + cleaned_json = re.sub(r',\s*}', '}', cleaned_json) + cleaned_json = re.sub(r'([{,])\s*(\w+)\s*:', r'\1"\2":', cleaned_json) + + try: + json.loads(cleaned_json) + return cleaned_json + except json.JSONDecodeError: + continue + # 尝试提取分数并创建最小可用的JSON try: # 提取分数 scores = {} for field in ["readability", "efficiency", "security", "structure", "error_handling", "documentation", "code_style"]: - match = re.search(f'"{field}"\s*:\s*(\d+)', original_json) - if match: - scores[field] = int(match.group(1)) - else: + # 尝试多种模式匹配 + patterns = [ + f'"{field}"\\s*:\\s*(\\d+)', # "field": 8 + f'{field}\\s*:\\s*(\\d+)', # field: 8 + f'{field.replace("_", " ")}\\s*:\\s*(\\d+)', # field name: 8 + f'{field.capitalize()}\\s*:\\s*(\\d+)', # Field: 8 + f'{field.replace("_", " ").title()}\\s*:\\s*(\\d+)' # Field Name: 8 + ] + + for pattern in patterns: + match = re.search(pattern, original_json, re.IGNORECASE) + if match: + scores[field] = int(match.group(1)) + break + + if field not in scores: scores[field] = 5 # 默认分数 # 尝试提取总分 - overall_match = re.search(r'"overall_score"\s*:\s*(\d+(?:\.\d+)?)', original_json) - if overall_match: - scores["overall_score"] = float(overall_match.group(1)) - else: + overall_patterns = [ + r'"overall_score"\s*:\s*(\d+(?:\.\d+)?)', + r'overall_score\s*:\s*(\d+(?:\.\d+)?)', + r'overall\s*:\s*(\d+(?:\.\d+)?)', + 
r'总分\s*:\s*(\d+(?:\.\d+)?)' + ] + + for pattern in overall_patterns: + overall_match = re.search(pattern, original_json, re.IGNORECASE) + if overall_match: + scores["overall_score"] = float(overall_match.group(1)) + break + + if "overall_score" not in scores: # 计算总分为其他分数的平均值 scores["overall_score"] = round(sum(scores.values()) / len(scores), 1) - # 添加评价意见 - scores["comments"] = "JSON解析错误,显示提取的分数。" + # 尝试提取评论 + comment_patterns = [ + r'"comments"\s*:\s*"(.*?)"', + r'comments\s*:\s*(.*?)(?=\n\w+:|$)', + r'评价\s*:\s*(.*?)(?=\n\w+:|$)', + r'建议\s*:\s*(.*?)(?=\n\w+:|$)' + ] + + for pattern in comment_patterns: + comment_match = re.search(pattern, original_json, re.DOTALL | re.IGNORECASE) + if comment_match: + scores["comments"] = comment_match.group(1).strip() + break + + if "comments" not in scores: + # 使用原始文本的一部分作为评论 + scores["comments"] = "JSON解析错误,显示提取的分数。原始响应: " + original_json[:200] + "..." # 转换为JSON字符串 return json.dumps(scores) except Exception as final_e: logger.error(f"所有JSON修复尝试失败: {final_e}") + logger.error(f"原始响应: {original_json[:500]}") print(f"无法修复JSON: {e} -> {final_e}") # 最后尝试:创建一个默认的JSON @@ -1083,7 +1440,8 @@ def _fix_malformed_json(self, json_str: str) -> str: "documentation": 5, "code_style": 5, "overall_score": 5.0, - "comments": "JSON解析错误,显示默认分数。" + "estimated_hours": 0.0, + "comments": f"JSON解析错误,显示默认分数。错误: {str(e)}" } return json.dumps(default_scores) @@ -1118,7 +1476,7 @@ async def _evaluate_diff_chunk(self, chunk: str) -> Dict[str, Any]: # 获取令牌 wait_time = await self.token_bucket.get_tokens(estimated_tokens) if wait_time > 0: - logger.info(f"速率限制: 等待 {wait_time:.2f}s 令牌补充") + logger.info(f"Rate limit: waiting {wait_time:.2f}s for token replenishment") await asyncio.sleep(wait_time) # 确保请求之间有最小间隔 @@ -1141,14 +1499,102 @@ async def _evaluate_diff_chunk(self, chunk: str) -> Dict[str, Any]: # 猜测语言 language = self._guess_language(file_name) - # 使用简化的代码评审prompt,以减少令牌消耗 - review_prompt = 
f"请评价以下代码:\n\n文件名:{file_name}\n语言:{language}\n\n```{language.lower()}\n{chunk}\n```\n\n请给出1-10分的评分和简要评价。返回JSON格式的结果。" + # 使用更详细的代码评审prompt,确保模型理解任务 + # 清理代码内容,移除异常字符 + sanitized_chunk = self._sanitize_content(chunk) + + review_prompt = f"""请评价以下代码: + +文件名:{file_name} +语言:{language} + +``` +{sanitized_chunk} +``` + +请对这段代码进行全面评价,并给出1-10分的评分(10分为最高)。评价应包括以下几个方面: +1. 可读性 (readability):代码是否易于阅读和理解 +2. 效率 (efficiency):代码是否高效,是否有性能问题 +3. 安全性 (security):代码是否存在安全隐患 +4. 结构 (structure):代码结构是否合理,是否遵循良好的设计原则 +5. 错误处理 (error_handling):是否有适当的错误处理机制 +6. 文档和注释 (documentation):注释是否充分,是否有必要的文档 +7. 代码风格 (code_style):是否遵循一致的代码风格和最佳实践 +8. 总体评分 (overall_score):综合以上各项的总体评价 + +请以JSON格式返回结果,格式如下: +```json +{{ + "readability": 评分, + "efficiency": 评分, + "security": 评分, + "structure": 评分, + "error_handling": 评分, + "documentation": 评分, + "code_style": 评分, + "overall_score": 总评分, + "comments": "详细评价意见和改进建议" +}} +``` + +总评分应该是所有评分的加权平均值,保留一位小数。如果代码很小或者只是配置文件的修改,请根据实际情况给出合理的评分。 + +重要提示:请确保返回有效的JSON格式。如果无法评估代码(例如代码不完整或无法理解),请仍然返回JSON格式,但在comments中说明原因,并给出默认评分5分。""" + + # 打印完整的代码块用于调试 + print(f"DEBUG: File name: {file_name}") + print(f"DEBUG: Language: {language}") + print(f"DEBUG: Code chunk length: {len(chunk)}") + print(f"DEBUG: Code chunk first 100 chars: '{chunk[:100]}'") + if len(chunk) < 10: + print(f"DEBUG: EMPTY CODE CHUNK: '{chunk}'") + elif len(chunk) < 100: + print(f"DEBUG: FULL CODE CHUNK: '{chunk}'") + + # 如果代码块为空或太短,使用默认评分 + if len(chunk.strip()) < 10: + print("DEBUG: Code chunk is too short, using default scores") + default_scores = { + "readability": 5, + "efficiency": 5, + "security": 5, + "structure": 5, + "error_handling": 5, + "documentation": 5, + "code_style": 5, + "overall_score": 5.0, + "estimated_hours": 0.25, # Minimum 15 minutes for any change + "comments": f"无法评估代码,因为代码块为空或太短: '{chunk}'" + } + return default_scores + + # 检查是否包含Base64编码的内容 + if chunk.strip().endswith('==') and all(c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=' for c in 
chunk.strip()): + print(f"DEBUG: Detected possible Base64 encoded content in chunk") + default_scores = { + "readability": 5, + "efficiency": 5, + "security": 5, + "structure": 5, + "error_handling": 5, + "documentation": 5, + "code_style": 5, + "overall_score": 5.0, + "estimated_hours": 0.25, # Minimum 15 minutes for any change + "comments": f"无法评估代码,因为内容可能是Base64编码: '{chunk[:50]}...'" + } + return default_scores messages = [ SystemMessage(content=self.system_prompt), HumanMessage(content=review_prompt) ] + # 打印用户输入内容的前100个字符用于调试 + user_message = messages[1].content if len(messages) > 1 else "No user message" + print(f"DEBUG: User input first 100 chars: '{user_message[:100]}...'") + print(f"DEBUG: User input length: {len(user_message)}") + # 调用模型 response = await self.model.agenerate(messages=[messages]) self._last_request_time = time.time() @@ -1156,6 +1602,9 @@ async def _evaluate_diff_chunk(self, chunk: str) -> Dict[str, Any]: # 获取响应文本 generated_text = response.generations[0][0].text + # 打印原始响应用于调试 + print(f"\n==== RAW OPENAI RESPONSE ====\n{generated_text}\n==== END RESPONSE ====\n") + # 解析响应 try: # 提取JSON @@ -1196,16 +1645,19 @@ async def _evaluate_diff_chunk(self, chunk: str) -> Dict[str, Any]: # 检查是否是上下文长度限制错误 is_context_length_error = "context length" in error_message.lower() or "maximum context length" in error_message.lower() + # 检查是否是DeepSeek API错误 + is_deepseek_error = "deepseek" in error_message.lower() or "deepseek api" in error_message.lower() + if is_context_length_error: # 如果是上下文长度错误,尝试进一步分割 - logger.warning(f"上下文长度限制错误,尝试进一步分割内容") + logger.warning(f"Context length limit error, attempting further content splitting") smaller_chunks = self._split_diff_content(chunk, max_tokens_per_chunk=4000) # 使用更小的块大小 if len(smaller_chunks) > 1: # 如果成功分割成多个小块,分别评估并合并结果 sub_results = [] for i, sub_chunk in enumerate(smaller_chunks): - logger.info(f"评估子块 {i+1}/{len(smaller_chunks)}") + logger.info(f"Evaluating sub-chunk {i+1}/{len(smaller_chunks)}") sub_result = 
await self._evaluate_diff_chunk(sub_chunk) # 递归调用 sub_results.append(sub_result) @@ -1222,6 +1674,16 @@ async def _evaluate_diff_chunk(self, chunk: str) -> Dict[str, Any]: wait_time = base_wait_time * (2 ** retry_count) logger.warning(f"Rate limit error, retrying in {wait_time}s (attempt {retry_count}/{max_retries})") await asyncio.sleep(wait_time) + elif is_deepseek_error: + # 对于DeepSeek API错误,最多重试两次,然后放弃 + retry_count += 1 + if retry_count >= 2: # 只重试两次 + logger.error(f"DeepSeek API error after 2 retries, abandoning evaluation: {error_message}") + return self._generate_default_scores(f"DeepSeek API错误,放弃评估: {error_message}") + # 使用较短的等待时间 + wait_time = 3 # 固定3秒等待时间 + logger.warning(f"DeepSeek API error, retrying in {wait_time}s (attempt {retry_count}/2)") + await asyncio.sleep(wait_time) else: # 其他错误直接返回 return self._generate_default_scores(f"评价过程中出错: {error_message}") @@ -1257,6 +1719,12 @@ def _merge_chunk_results(self, chunk_results: List[Dict[str, Any]]) -> Dict[str, overall_scores = [result.get("overall_score", 5.0) for result in chunk_results] merged_scores["overall_score"] = round(sum(overall_scores) / len(overall_scores), 1) + # 计算估计工作时间 - 累加所有块的工作时间 + estimated_hours = sum(result.get("estimated_hours", 0.0) for result in chunk_results) + # 应用一个折扣因子,因为并行处理多个块通常比顺序处理更有效率 + discount_factor = 0.8 if len(chunk_results) > 1 else 1.0 + merged_scores["estimated_hours"] = round(estimated_hours * discount_factor, 1) + # 合并评价意见 comments = [] for i, result in enumerate(chunk_results): @@ -1274,6 +1742,319 @@ def _merge_chunk_results(self, chunk_results: List[Dict[str, Any]]) -> Dict[str, return merged_scores + async def evaluate_commit_file( + self, + file_path: str, + file_diff: str, + file_status: str = "M", + additions: int = 0, + deletions: int = 0, + ) -> Dict[str, Any]: + """ + 评价单个文件的代码差异(新版本,用于commit评估) + + Args: + file_path: 文件路径 + file_diff: 文件差异内容 + file_status: 文件状态(A=添加,M=修改,D=删除) + additions: 添加的行数 + deletions: 删除的行数 + + Returns: + Dict[str, Any]: 
文件评价结果字典,包含估计的工作时间 + """ + logger.info(f"Evaluating file: {file_path} (status: {file_status}, additions: {additions}, deletions: {deletions})") + logger.debug(f"File diff size: {len(file_diff)} characters") + # 如果未设置语言,根据文件扩展名猜测语言 + language = self._guess_language(file_path) + logger.info(f"Detected language for {file_path}: {language}") + + # 清理代码内容,移除异常字符 + sanitized_diff = self._sanitize_content(file_diff) + logger.debug(f"Sanitized diff size: {len(sanitized_diff)} characters") + + # 检查文件大小,如果过大则分块处理 + words = sanitized_diff.split() + estimated_tokens = len(words) * 1.2 + logger.info(f"Estimated tokens for {file_path}: {estimated_tokens:.0f}") + + # 如果文件可能超过模型的上下文限制,则分块处理 + if estimated_tokens > 12000: # 留出一些空间给系统提示和其他内容 + logger.info(f"File {file_path} is too large (estimated {estimated_tokens:.0f} tokens), will be processed in chunks") + chunks = self._split_diff_content(sanitized_diff) + logger.info(f"Split file into {len(chunks)} chunks") + print(f"ℹ️ File too large, will be processed in {len(chunks)} chunks") + + # 分别评估每个块 + chunk_results = [] + for i, chunk in enumerate(chunks): + logger.info(f"Evaluating chunk {i+1}/{len(chunks)} of {file_path}") + logger.debug(f"Chunk {i+1} size: {len(chunk)} characters, ~{len(chunk.split())} words") + start_time = time.time() + chunk_result = await self._evaluate_diff_chunk(chunk) + end_time = time.time() + logger.info(f"Chunk {i+1} evaluation completed in {end_time - start_time:.2f} seconds") + chunk_results.append(chunk_result) + + # 合并结果 + logger.info(f"Merging {len(chunk_results)} chunk results for {file_path}") + merged_result = self._merge_chunk_results(chunk_results) + logger.info(f"Merged result: overall score {merged_result.get('overall_score', 'N/A')}") + + # 添加文件信息 + result = { + "path": file_path, + "status": file_status, + "additions": additions, + "deletions": deletions, + "readability": merged_result["readability"], + "efficiency": merged_result["efficiency"], + "security": merged_result["security"], + 
"structure": merged_result["structure"], + "error_handling": merged_result["error_handling"], + "documentation": merged_result["documentation"], + "code_style": merged_result["code_style"], + "overall_score": merged_result["overall_score"], + "summary": merged_result["comments"][:100] + "..." if len(merged_result["comments"]) > 100 else merged_result["comments"], + "comments": merged_result["comments"] + } + + return result + + # 使用 grimoire 中的 CODE_SUGGESTION 模板 + # 将模板中的占位符替换为实际值 + prompt = CODE_SUGGESTION.format( + language=language, + name=file_path, + content=sanitized_diff + ) + logger.info(f"Preparing prompt for {file_path} with language: {language}") + logger.debug(f"Prompt size: {len(prompt)} characters") + + try: + # 发送请求到模型 + messages = [ + HumanMessage(content=prompt) + ] + + # 打印用户输入内容的前20个字符用于调试 + user_message = messages[0].content if len(messages) > 0 else "No user message" + logger.debug(f"User input first 20 chars: '{user_message[:20]}...'") + print(f"DEBUG: User input first 20 chars: '{user_message[:20]}...'") + + logger.info(f"Sending request to model for {file_path}") + start_time = time.time() + response = await self.model.agenerate(messages=[messages]) + end_time = time.time() + logger.info(f"Model response received in {end_time - start_time:.2f} seconds") + + generated_text = response.generations[0][0].text + logger.debug(f"Response size: {len(generated_text)} characters") + + # 打印原始响应用于调试 + logger.debug(f"Raw model response (first 200 chars): {generated_text[:200]}...") + print(f"\n==== RAW OPENAI RESPONSE ====\n{generated_text[:200]}...\n==== END RESPONSE ====\n") + + # 尝试提取JSON部分 + logger.info(f"Extracting JSON from response for {file_path}") + json_str = self._extract_json(generated_text) + if not json_str: + logger.warning(f"Failed to extract JSON from response for {file_path}, attempting to fix") + json_str = self._fix_malformed_json(generated_text) + if json_str: + logger.info("Successfully fixed malformed JSON") + else: + 
logger.warning("Failed to fix malformed JSON") + + if not json_str: + logger.error(f"Could not extract valid JSON from the response for {file_path}") + # 创建默认评价 + logger.info("Generating default scores") + eval_data = self._generate_default_scores(f"解析错误。原始响应: {generated_text[:500]}...") + logger.debug(f"Default scores: {eval_data}") + else: + # 解析JSON + try: + logger.info(f"Parsing JSON for {file_path}") + logger.debug(f"JSON string: {json_str[:200]}...") + eval_data = json.loads(json_str) + logger.info(f"Successfully parsed JSON for {file_path}") + + # 确保所有必要字段存在 + required_fields = ["readability", "efficiency", "security", "structure", + "error_handling", "documentation", "code_style", "overall_score", "comments"] + missing_fields = [] + for field in required_fields: + if field not in eval_data: + if field != "overall_score": # overall_score可以计算得出 + missing_fields.append(field) + logger.warning(f"Missing field {field} in evaluation for {file_path}, setting default value") + eval_data[field] = 5 + + if missing_fields: + logger.warning(f"Missing fields in evaluation for {file_path}: {', '.join(missing_fields)}") + + # 如果没有提供overall_score,计算一个 + if "overall_score" not in eval_data or not eval_data["overall_score"]: + logger.info(f"Calculating overall score for {file_path}") + score_fields = ["readability", "efficiency", "security", "structure", + "error_handling", "documentation", "code_style"] + scores = [eval_data.get(field, 5) for field in score_fields] + eval_data["overall_score"] = round(sum(scores) / len(scores), 1) + logger.info(f"Calculated overall score: {eval_data['overall_score']}") + + # Log all scores + logger.info(f"Evaluation scores for {file_path}: " + + f"readability={eval_data.get('readability', 'N/A')}, " + + f"efficiency={eval_data.get('efficiency', 'N/A')}, " + + f"security={eval_data.get('security', 'N/A')}, " + + f"structure={eval_data.get('structure', 'N/A')}, " + + f"error_handling={eval_data.get('error_handling', 'N/A')}, " + + 
f"documentation={eval_data.get('documentation', 'N/A')}, " + + f"code_style={eval_data.get('code_style', 'N/A')}, " + + f"overall_score={eval_data.get('overall_score', 'N/A')}") + + except Exception as e: + logger.error(f"Error parsing evaluation for {file_path}: {e}", exc_info=True) + logger.debug(f"JSON string that caused the error: {json_str[:500]}...") + eval_data = self._generate_default_scores(f"解析错误。原始响应: {generated_text[:500]}...") + logger.debug(f"Default scores: {eval_data}") + except Exception as e: + logger.error(f"Error during evaluation: {e}") + eval_data = self._generate_default_scores(f"评价过程中出错: {str(e)}") + + # 确保分数不全是相同的,如果发现全是相同的评分,增加一些微小差异 + scores = [eval_data["readability"], eval_data["efficiency"], eval_data["security"], + eval_data["structure"], eval_data["error_handling"], eval_data["documentation"], eval_data["code_style"]] + + # 检查是否所有分数都相同,或者是否有超过75%的分数相同(例如5个3分,1个4分) + score_counts = {} + for score in scores: + score_counts[score] = score_counts.get(score, 0) + 1 + + most_common_score = max(score_counts, key=score_counts.get) + most_common_count = score_counts[most_common_score] + + # 如果所有分数都相同,或者大部分分数相同,则根据文件类型调整分数 + if most_common_count >= 5: # 如果至少5个分数相同 + logger.warning(f"Most scores are identical ({most_common_score}, count: {most_common_count}), adjusting for variety") + print(f"检测到评分缺乏差异性 ({most_common_score},{most_common_count}个相同),正在调整评分使其更具差异性") + + # 根据文件扩展名和内容进行智能评分调整 + file_ext = os.path.splitext(file_path)[1].lower() + + # 设置基础分数 + base_scores = { + "readability": most_common_score, + "efficiency": most_common_score, + "security": most_common_score, + "structure": most_common_score, + "error_handling": most_common_score, + "documentation": most_common_score, + "code_style": most_common_score + } + + # 根据文件类型调整分数 + if file_ext in ['.py', '.js', '.ts', '.java', '.cs', '.cpp', '.c']: + # 代码文件根据路径和名称进行评分调整 + if 'test' in file_path.lower(): + # 测试文件通常: + # - 结构设计很重要 + # - 但可能文档/注释稍差 + # - 安全性通常不是重点 + base_scores["structure"] = 
min(10, most_common_score + 2) + base_scores["documentation"] = max(1, most_common_score - 1) + base_scores["security"] = max(1, most_common_score - 1) + elif 'util' in file_path.lower() or 'helper' in file_path.lower(): + # 工具类文件通常: + # - 错误处理很重要 + # - 效率可能很重要 + base_scores["error_handling"] = min(10, most_common_score + 2) + base_scores["efficiency"] = min(10, most_common_score + 1) + elif 'security' in file_path.lower() or 'auth' in file_path.lower(): + # 安全相关文件: + # - 安全性很重要 + # - 错误处理很重要 + base_scores["security"] = min(10, most_common_score + 2) + base_scores["error_handling"] = min(10, most_common_score + 1) + elif 'model' in file_path.lower() or 'schema' in file_path.lower(): + # 模型/数据模式文件: + # - 代码风格很重要 + # - 结构设计很重要 + base_scores["code_style"] = min(10, most_common_score + 2) + base_scores["structure"] = min(10, most_common_score + 1) + elif 'api' in file_path.lower() or 'endpoint' in file_path.lower(): + # API文件: + # - 效率很重要 + # - 安全性很重要 + base_scores["efficiency"] = min(10, most_common_score + 2) + base_scores["security"] = min(10, most_common_score + 1) + elif 'ui' in file_path.lower() or 'view' in file_path.lower(): + # UI文件: + # - 可读性很重要 + # - 代码风格很重要 + base_scores["readability"] = min(10, most_common_score + 2) + base_scores["code_style"] = min(10, most_common_score + 1) + else: + # 普通代码文件,添加随机变化,但保持合理区间 + keys = list(base_scores.keys()) + random.shuffle(keys) + # 增加两个值,减少两个值 + for i in range(2): + base_scores[keys[i]] = min(10, base_scores[keys[i]] + 2) + base_scores[keys[i+2]] = max(1, base_scores[keys[i+2]] - 1) + + # 应用调整后的分数 + eval_data["readability"] = base_scores["readability"] + eval_data["efficiency"] = base_scores["efficiency"] + eval_data["security"] = base_scores["security"] + eval_data["structure"] = base_scores["structure"] + eval_data["error_handling"] = base_scores["error_handling"] + eval_data["documentation"] = base_scores["documentation"] + eval_data["code_style"] = base_scores["code_style"] + + # 重新计算平均分 + 
eval_data["overall_score"] = round(sum([ + eval_data["readability"], + eval_data["efficiency"], + eval_data["security"], + eval_data["structure"], + eval_data["error_handling"], + eval_data["documentation"], + eval_data["code_style"] + ]) / 7, 1) + + logger.info(f"Adjusted scores: {eval_data}") + + # Calculate estimated hours if not provided + if "estimated_hours" not in eval_data or not eval_data["estimated_hours"]: + estimated_hours = self._estimate_default_hours(additions, deletions) + logger.info(f"Calculated default estimated hours for {file_path}: {estimated_hours}") + else: + estimated_hours = eval_data["estimated_hours"] + logger.info(f"Using model-provided estimated hours for {file_path}: {estimated_hours}") + + # 创建并返回评价结果 + result = { + "path": file_path, + "status": file_status, + "additions": additions, + "deletions": deletions, + "readability": eval_data["readability"], + "efficiency": eval_data["efficiency"], + "security": eval_data["security"], + "structure": eval_data["structure"], + "error_handling": eval_data["error_handling"], + "documentation": eval_data["documentation"], + "code_style": eval_data["code_style"], + "overall_score": eval_data["overall_score"], + "estimated_hours": estimated_hours, + "summary": eval_data["comments"][:100] + "..." 
if len(eval_data["comments"]) > 100 else eval_data["comments"], + "comments": eval_data["comments"] + } + + return result + async def evaluate_file_diff( self, file_path: str, @@ -1298,14 +2079,14 @@ async def evaluate_file_diff( # 如果文件可能超过模型的上下文限制,则分块处理 if estimated_tokens > 12000: # 留出一些空间给系统提示和其他内容 logger.info(f"文件 {file_path} 过大(估计 {estimated_tokens:.0f} 令牌),将进行分块处理") - print(f"ℹ️ 文件 {file_path} 过大,将进行分块处理") + print(f"ℹ️ File too large, will be processed in {len(chunks)} chunks") chunks = self._split_diff_content(file_diff, file_path) # 分别评估每个块 chunk_results = [] for i, chunk in enumerate(chunks): - logger.info(f"评估分块 {i+1}/{len(chunks)}") + logger.info(f"Evaluating chunk {i+1}/{len(chunks)}") chunk_result = await self._evaluate_diff_chunk(chunk) chunk_results.append(chunk_result) @@ -1325,23 +2106,36 @@ async def evaluate_file_diff( # 如果未设置语言,根据文件扩展名猜测语言 language = self._guess_language(file_path) + # 清理代码内容,移除异常字符 + sanitized_diff = self._sanitize_content(file_diff) + # 使用 grimoire 中的 CODE_SUGGESTION 模板 # 将模板中的占位符替换为实际值 prompt = CODE_SUGGESTION.format( language=language, name=file_path, - content=file_diff + content=sanitized_diff ) + # Add request for estimated working hours + prompt += "\n\nIn addition to the code evaluation, please also estimate how many effective working hours an experienced programmer (5-10+ years) would need to complete these code changes. Include this estimate in your JSON response as 'estimated_hours'." 
+ try: # 发送请求到模型 messages = [ HumanMessage(content=prompt) ] + # 打印用户输入内容的前20个字符用于调试 + user_message = messages[0].content if len(messages) > 0 else "No user message" + print(f"DEBUG: User input first 20 chars: '{user_message[:20]}...'") + response = await self.model.agenerate(messages=[messages]) generated_text = response.generations[0][0].text + # 打印原始响应用于调试 + print(f"\n==== RAW OPENAI RESPONSE ====\n{generated_text[:200]}...\n==== END RESPONSE ====\n") + # 尝试提取JSON部分 json_str = self._extract_json(generated_text) if not json_str: @@ -1383,10 +2177,23 @@ async def evaluate_file_diff( scores = [eval_data.get(field, 5) for field in score_fields] eval_data["overall_score"] = round(sum(scores) / len(scores), 1) + # Calculate estimated hours if not provided + if "estimated_hours" not in eval_data or not eval_data["estimated_hours"]: + # Get additions and deletions from the diff + additions = len(re.findall(r'^\+', file_diff, re.MULTILINE)) + deletions = len(re.findall(r'^-', file_diff, re.MULTILINE)) + eval_data["estimated_hours"] = self._estimate_default_hours(additions, deletions) + logger.info(f"Calculated default estimated hours: {eval_data['estimated_hours']}") + # 创建评价对象 evaluation = CodeEvaluation(**eval_data) except Exception as e: logger.error(f"Error parsing evaluation: {e}") + # Get additions and deletions from the diff + additions = len(re.findall(r'^\+', file_diff, re.MULTILINE)) + deletions = len(re.findall(r'^-', file_diff, re.MULTILINE)) + estimated_hours = self._estimate_default_hours(additions, deletions) + evaluation = CodeEvaluation( readability=5, efficiency=5, @@ -1396,10 +2203,16 @@ async def evaluate_file_diff( documentation=5, code_style=5, overall_score=5.0, + estimated_hours=estimated_hours, comments=f"解析错误。原始响应: {generated_text[:500]}..." 
) except Exception as e: logger.error(f"Error during evaluation: {e}") + # Get additions and deletions from the diff + additions = len(re.findall(r'^\+', file_diff, re.MULTILINE)) + deletions = len(re.findall(r'^-', file_diff, re.MULTILINE)) + estimated_hours = self._estimate_default_hours(additions, deletions) + evaluation = CodeEvaluation( readability=5, efficiency=5, @@ -1409,6 +2222,7 @@ async def evaluate_file_diff( documentation=5, code_style=5, overall_score=5.0, + estimated_hours=estimated_hours, comments=f"评价过程中出错: {str(e)}" ) @@ -1600,8 +2414,8 @@ async def evaluate_commits( # 检查是否发生异常 if isinstance(eval_result, Exception): - logger.error(f"评估文件 {file_path} 时出错: {str(eval_result)}") - print(f"⚠️ 评估文件 {file_path} 时出错: {str(eval_result)}") + logger.error(f"Error evaluating file {file_path}: {str(eval_result)}") + print(f"⚠️ Error evaluating file {file_path}: {str(eval_result)}") # 创建默认评估结果 default_scores = self._generate_default_scores(f"评估失败: {str(eval_result)}") @@ -1629,7 +2443,7 @@ async def evaluate_commits( ) ) except Exception as e: - logger.error(f"创建评估结果对象时出错: {str(e)}\n评估结果: {eval_result}") + logger.error(f"Error creating evaluation result object: {str(e)}\nEvaluation result: {eval_result}") print(f"⚠️ 创建评估结果对象时出错: {str(e)}") # 创建默认评估结果 @@ -1711,6 +2525,332 @@ async def evaluate_commits( return results + async def evaluate_commit_as_whole( + self, + commit_hash: str, + commit_diff: Dict[str, Dict[str, Any]], + ) -> Dict[str, Any]: + """Evaluate all diffs in a commit together as a whole. + + This method combines all file diffs into a single evaluation to get a holistic view + of the commit and estimate the effective working hours needed. 
+ + Args: + commit_hash: The hash of the commit being evaluated + commit_diff: Dictionary mapping file paths to their diffs and statistics + + Returns: + Dictionary containing evaluation results including estimated working hours + """ + logger.info(f"Starting whole-commit evaluation for {commit_hash}") + + # Combine all diffs into a single string with file headers + combined_diff = "" + total_additions = 0 + total_deletions = 0 + + for file_path, diff_info in commit_diff.items(): + file_diff = diff_info["diff"] + status = diff_info["status"] + additions = diff_info.get("additions", 0) + deletions = diff_info.get("deletions", 0) + + total_additions += additions + total_deletions += deletions + + # Add file header + combined_diff += f"\n\n### File: {file_path} (Status: {status}, +{additions}, -{deletions})\n\n" + combined_diff += file_diff + + logger.info(f"Combined {len(commit_diff)} files into a single evaluation") + logger.debug(f"Combined diff size: {len(combined_diff)} characters") + + # Clean the combined diff content + sanitized_diff = self._sanitize_content(combined_diff) + + # Check if the combined diff is too large + words = sanitized_diff.split() + estimated_tokens = len(words) * 1.2 + logger.info(f"Estimated tokens for combined diff: {estimated_tokens:.0f}") + + # Create a prompt for evaluating the entire commit + language = "multiple" # Since we're evaluating multiple files + + # Create a prompt that specifically asks for working hours estimation + prompt = f"""Act as a senior code reviewer with 10+ years of experience. I will provide you with a complete diff of a commit that includes multiple files. + +Please analyze the entire commit as a whole and provide: + +1. A comprehensive evaluation of the code changes +2. An estimate of how many effective working hours an experienced programmer (5-10+ years) would need to complete these code changes +3. 
Scores for the following aspects (1-10 scale): + - Readability + - Efficiency + - Security + - Structure + - Error Handling + - Documentation + - Code Style + - Overall Score + +Here's the complete diff for commit {commit_hash}: + +``` +{sanitized_diff} +``` + +Please format your response as JSON with the following fields: +- readability: (score 1-10) +- efficiency: (score 1-10) +- security: (score 1-10) +- structure: (score 1-10) +- error_handling: (score 1-10) +- documentation: (score 1-10) +- code_style: (score 1-10) +- overall_score: (score 1-10) +- estimated_hours: (number of hours) +- comments: (your detailed analysis) +""" + + logger.info("Preparing to evaluate combined diff") + logger.debug(f"Prompt size: {len(prompt)} characters") + + try: + # Send request to model + messages = [HumanMessage(content=prompt)] + + logger.info("Sending request to model for combined diff evaluation") + start_time = time.time() + response = await self.model.agenerate(messages=[messages]) + end_time = time.time() + logger.info(f"Model response received in {end_time - start_time:.2f} seconds") + + generated_text = response.generations[0][0].text + logger.debug(f"Response size: {len(generated_text)} characters") + + # Extract JSON from response + logger.info("Extracting JSON from response") + json_str = self._extract_json(generated_text) + if not json_str: + logger.warning("Failed to extract JSON from response, attempting to fix") + json_str = self._fix_malformed_json(generated_text) + + if not json_str: + logger.error("Could not extract valid JSON from the response") + # Create default evaluation + eval_data = self._generate_default_scores("Failed to parse response") + eval_data["estimated_hours"] = self._estimate_default_hours(total_additions, total_deletions) + else: + # Parse JSON + try: + eval_data = json.loads(json_str) + + # Ensure all necessary fields exist + required_fields = ["readability", "efficiency", "security", "structure", + "error_handling", "documentation", 
"code_style", "overall_score", "comments"] + for field in required_fields: + if field not in eval_data: + if field != "overall_score": # overall_score can be calculated + logger.warning(f"Missing field {field} in evaluation, setting default value") + eval_data[field] = 5 + + # If overall_score is not provided, calculate it + if "overall_score" not in eval_data or not eval_data["overall_score"]: + score_fields = ["readability", "efficiency", "security", "structure", + "error_handling", "documentation", "code_style"] + scores = [eval_data.get(field, 5) for field in score_fields] + eval_data["overall_score"] = round(sum(scores) / len(scores), 1) + + # If estimated_hours is not provided, calculate a default + if "estimated_hours" not in eval_data or not eval_data["estimated_hours"]: + logger.warning("Missing estimated_hours in evaluation, calculating default") + eval_data["estimated_hours"] = self._estimate_default_hours(total_additions, total_deletions) + + # Log all scores + logger.info(f"Whole commit evaluation scores: " + + f"readability={eval_data.get('readability', 'N/A')}, " + + f"efficiency={eval_data.get('efficiency', 'N/A')}, " + + f"security={eval_data.get('security', 'N/A')}, " + + f"structure={eval_data.get('structure', 'N/A')}, " + + f"error_handling={eval_data.get('error_handling', 'N/A')}, " + + f"documentation={eval_data.get('documentation', 'N/A')}, " + + f"code_style={eval_data.get('code_style', 'N/A')}, " + + f"overall_score={eval_data.get('overall_score', 'N/A')}, " + + f"estimated_hours={eval_data.get('estimated_hours', 'N/A')}") + + except Exception as e: + logger.error(f"Error parsing evaluation: {e}", exc_info=True) + eval_data = self._generate_default_scores(f"解析错误。原始响应: {generated_text[:500]}...") + eval_data["estimated_hours"] = self._estimate_default_hours(total_additions, total_deletions) + + except Exception as e: + logger.error(f"Error during evaluation: {e}", exc_info=True) + eval_data = self._generate_default_scores(f"评价过程中出错: 
{str(e)}") + eval_data["estimated_hours"] = self._estimate_default_hours(total_additions, total_deletions) + + return eval_data + + def _estimate_default_hours(self, additions: int, deletions: int) -> float: + """Estimate default working hours based on additions and deletions. + + This is a fallback method when the model doesn't provide an estimate. + + Args: + additions: Number of lines added + deletions: Number of lines deleted + + Returns: + float: Estimated working hours + """ + # Simple heuristic: + # - Each 50 lines of additions takes about 1 hour for an experienced developer + # - Each 100 lines of deletions takes about 0.5 hour + # - Minimum 0.5 hours, maximum 40 hours (1 week) + estimated_hours = (additions / 50) + (deletions / 200) + return max(0.5, min(40, round(estimated_hours, 1))) + + async def evaluate_commit( + self, + commit_hash: str, + commit_diff: Dict[str, Dict[str, Any]], + ) -> Dict[str, Any]: + """Evaluate a specific commit's changes. + + Args: + commit_hash: The hash of the commit being evaluated + commit_diff: Dictionary mapping file paths to their diffs and statistics + + Returns: + Dictionary containing evaluation results + """ + logger.info(f"Starting evaluation for commit {commit_hash}") + logger.info(f"Found {len(commit_diff)} files to evaluate") + + # Log file statistics + total_additions = sum(diff.get("additions", 0) for diff in commit_diff.values()) + total_deletions = sum(diff.get("deletions", 0) for diff in commit_diff.values()) + logger.info(f"Commit statistics: {len(commit_diff)} files, {total_additions} additions, {total_deletions} deletions") + + # Initialize evaluation results + evaluation_results = { + "commit_hash": commit_hash, + "files": [], + "summary": "", + "statistics": { + "total_files": len(commit_diff), + "total_additions": total_additions, + "total_deletions": total_deletions, + } + } + logger.debug(f"Initialized evaluation results structure for commit {commit_hash}") + + # Evaluate each file + 
logger.info(f"Starting file-by-file evaluation for commit {commit_hash}") + for i, (file_path, diff_info) in enumerate(commit_diff.items()): + logger.info(f"Evaluating file {i+1}/{len(commit_diff)}: {file_path}") + logger.debug(f"File info: status={diff_info['status']}, additions={diff_info.get('additions', 0)}, deletions={diff_info.get('deletions', 0)}") + + # Use the new method for commit file evaluation + start_time = time.time() + file_evaluation = await self.evaluate_commit_file( + file_path, + diff_info["diff"], + diff_info["status"], + diff_info.get("additions", 0), + diff_info.get("deletions", 0), + ) + end_time = time.time() + logger.info(f"File {file_path} evaluated in {end_time - start_time:.2f} seconds with score: {file_evaluation.get('overall_score', 'N/A')}") + + evaluation_results["files"].append(file_evaluation) + logger.debug(f"Added evaluation for {file_path} to results") + + # Evaluate the entire commit as a whole to get estimated working hours + logger.info("Evaluating entire commit as a whole") + whole_commit_evaluation = await self.evaluate_commit_as_whole(commit_hash, commit_diff) + + # Add the estimated working hours to the evaluation results + evaluation_results["estimated_hours"] = whole_commit_evaluation.get("estimated_hours", 0) + logger.info(f"Estimated working hours: {evaluation_results['estimated_hours']}") + + # Add whole commit evaluation scores + evaluation_results["whole_commit_evaluation"] = { + "readability": whole_commit_evaluation.get("readability", 5), + "efficiency": whole_commit_evaluation.get("efficiency", 5), + "security": whole_commit_evaluation.get("security", 5), + "structure": whole_commit_evaluation.get("structure", 5), + "error_handling": whole_commit_evaluation.get("error_handling", 5), + "documentation": whole_commit_evaluation.get("documentation", 5), + "code_style": whole_commit_evaluation.get("code_style", 5), + "overall_score": whole_commit_evaluation.get("overall_score", 5), + "comments": 
whole_commit_evaluation.get("comments", "No comments available.") + } + + # Generate overall summary + logger.info(f"Generating overall summary for commit {commit_hash}") + summary_prompt = self._create_summary_prompt(evaluation_results) + logger.debug(f"Summary prompt size: {len(summary_prompt)} characters") + + # Use agenerate instead of ainvoke + messages = [HumanMessage(content=summary_prompt)] + logger.info("Sending summary request to model") + start_time = time.time() + summary_response = await self.model.agenerate(messages=[messages]) + end_time = time.time() + logger.info(f"Summary response received in {end_time - start_time:.2f} seconds") + + summary_text = summary_response.generations[0][0].text + logger.debug(f"Summary text size: {len(summary_text)} characters") + logger.debug(f"Summary text (first 100 chars): {summary_text[:100]}...") + + evaluation_results["summary"] = summary_text + logger.info(f"Evaluation for commit {commit_hash} completed successfully") + + return evaluation_results + + def _create_summary_prompt(self, evaluation_results: Dict[str, Any]) -> str: + """Create a prompt for generating the overall commit summary.""" + files_summary = "\n".join( + f"- {file['path']} ({file['status']}): {file['summary']}" + for file in evaluation_results["files"] + ) + + # Include whole commit evaluation if available + whole_commit_evaluation = "" + if "whole_commit_evaluation" in evaluation_results: + eval_data = evaluation_results["whole_commit_evaluation"] + whole_commit_evaluation = f""" +Whole Commit Evaluation: +- Readability: {eval_data.get('readability', 'N/A')}/10 +- Efficiency: {eval_data.get('efficiency', 'N/A')}/10 +- Security: {eval_data.get('security', 'N/A')}/10 +- Structure: {eval_data.get('structure', 'N/A')}/10 +- Error Handling: {eval_data.get('error_handling', 'N/A')}/10 +- Documentation: {eval_data.get('documentation', 'N/A')}/10 +- Code Style: {eval_data.get('code_style', 'N/A')}/10 +- Overall Score: {eval_data.get('overall_score', 
'N/A')}/10 +- Comments: {eval_data.get('comments', 'No comments available.')} +""" + + # Include estimated working hours if available + estimated_hours = "" + if "estimated_hours" in evaluation_results: + estimated_hours = f"- Estimated working hours (for 5-10+ years experienced developer): {evaluation_results['estimated_hours']} hours\n" + + return f"""Please provide a concise summary of this commit's changes: + +Files modified: +{files_summary} + +Statistics: +- Total files: {evaluation_results['statistics']['total_files']} +- Total additions: {evaluation_results['statistics']['total_additions']} +- Total deletions: {evaluation_results['statistics']['total_deletions']} +{estimated_hours} +{whole_commit_evaluation} +Please provide a brief summary of the overall changes and their impact. +If estimated working hours are provided, please comment on whether this estimate seems reasonable given the scope of changes.""" + def generate_evaluation_markdown(evaluation_results: List[FileEvaluationResult]) -> str: """ @@ -1728,18 +2868,18 @@ def generate_evaluation_markdown(evaluation_results: List[FileEvaluationResult]) # 按日期排序结果 sorted_results = sorted(evaluation_results, key=lambda x: x.date) - # 创建Markdown标题 - markdown = "# 代码评价报告\n\n" + # Create Markdown header + markdown = "# Code Evaluation Report\n\n" - # 添加概述 - author = sorted_results[0].author if sorted_results else "未知" - start_date = sorted_results[0].date.strftime("%Y-%m-%d") if sorted_results else "未知" - end_date = sorted_results[-1].date.strftime("%Y-%m-%d") if sorted_results else "未知" + # Add overview + author = sorted_results[0].author if sorted_results else "Unknown" + start_date = sorted_results[0].date.strftime("%Y-%m-%d") if sorted_results else "Unknown" + end_date = sorted_results[-1].date.strftime("%Y-%m-%d") if sorted_results else "Unknown" - markdown += f"## 概述\n\n" - markdown += f"- **开发者**: {author}\n" - markdown += f"- **时间范围**: {start_date} 至 {end_date}\n" - markdown += f"- **评价文件数**: 
{len(sorted_results)}\n\n"
+    markdown += f"## Overview\n\n"
+    markdown += f"- **Developer**: {author}\n"
+    markdown += f"- **Time Range**: {start_date} to {end_date}\n"
+    markdown += f"- **Files Evaluated**: {len(sorted_results)}\n"
 
     # 计算平均分
     total_scores = {
@@ -1751,6 +2891,7 @@ def generate_evaluation_markdown(evaluation_results: List[FileEvaluationResult])
         "documentation": 0,
         "code_style": 0,
         "overall_score": 0,
+        "estimated_hours": 0,
     }
 
     for result in sorted_results:
@@ -1764,59 +2905,84 @@ def generate_evaluation_markdown(evaluation_results: List[FileEvaluationResult])
         total_scores["code_style"] += eval.code_style
         total_scores["overall_score"] += eval.overall_score
 
-    avg_scores = {k: v / len(sorted_results) for k, v in total_scores.items()}
+        # Add estimated hours if available
+        if hasattr(eval, 'estimated_hours') and eval.estimated_hours:
+            total_scores["estimated_hours"] += eval.estimated_hours
 
-    # 添加总评分表格
-    markdown += "## 总评分\n\n"
-    markdown += "| 评分维度 | 平均分 |\n"
-    markdown += "|---------|-------|\n"
-    markdown += f"| 可读性 | {avg_scores['readability']:.1f} |\n"
-    markdown += f"| 效率与性能 | {avg_scores['efficiency']:.1f} |\n"
-    markdown += f"| 安全性 | {avg_scores['security']:.1f} |\n"
-    markdown += f"| 结构与设计 | {avg_scores['structure']:.1f} |\n"
-    markdown += f"| 错误处理 | {avg_scores['error_handling']:.1f} |\n"
-    markdown += f"| 文档与注释 | {avg_scores['documentation']:.1f} |\n"
-    markdown += f"| 代码风格 | {avg_scores['code_style']:.1f} |\n"
-    markdown += f"| **总分** | **{avg_scores['overall_score']:.1f}** |\n\n"
-
-    # 添加质量评估
+    avg_scores = {k: v / len(sorted_results) for k, v in total_scores.items()}
+
+    # Add total estimated working hours to the overview list, if available
+    if total_scores["estimated_hours"] > 0:
+        markdown += f"- **Total Estimated Working Hours**: 
{total_scores['estimated_hours']:.1f} hours\n"
+        markdown += f"- **Average Estimated Hours per File**: {avg_scores['estimated_hours']:.1f} hours\n"
+
+    markdown += "\n"
+
+    # Add the overall scores table
+    markdown += "## Overall Scores\n\n"
+    markdown += "| Dimension | Average Score |\n"
+    markdown += "|-----------|---------------|\n"
+    markdown += f"| Readability | {avg_scores['readability']:.1f} |\n"
+    markdown += f"| Efficiency & Performance | {avg_scores['efficiency']:.1f} |\n"
+    markdown += f"| Security | {avg_scores['security']:.1f} |\n"
+    markdown += f"| Structure & Design | {avg_scores['structure']:.1f} |\n"
+    markdown += f"| Error Handling | {avg_scores['error_handling']:.1f} |\n"
+    markdown += f"| Documentation & Comments | {avg_scores['documentation']:.1f} |\n"
+    markdown += f"| Code Style | {avg_scores['code_style']:.1f} |\n"
+    markdown += f"| **Overall Score** | **{avg_scores['overall_score']:.1f}** |\n"
+
+    # Add average estimated working hours if available
+    if avg_scores["estimated_hours"] > 0:
+        markdown += f"| **Avg. Estimated Hours/File** | **{avg_scores['estimated_hours']:.1f}** |\n"
+
+    markdown += "\n"
+
+    # Add quality assessment
     overall_score = avg_scores["overall_score"]
     quality_level = ""
     if overall_score >= 9.0:
-        quality_level = "卓越"
+        quality_level = "Exceptional"
     elif overall_score >= 7.0:
-        quality_level = "优秀"
+        quality_level = "Excellent"
     elif overall_score >= 5.0:
-        quality_level = "良好"
+        quality_level = "Good"
     elif overall_score >= 3.0:
-        quality_level = "需要改进"
+        quality_level = "Needs Improvement"
     else:
-        quality_level = "较差"
+        quality_level = "Poor"
 
-    markdown += f"**整体代码质量**: {quality_level}\n\n"
+    markdown += f"**Overall Code Quality**: {quality_level}\n\n"
 
-    # 添加各文件评价详情
-    markdown += "## 文件评价详情\n\n"
+    # Add per-file evaluation details
+    markdown += "## File Evaluation Details\n\n"
 
     for idx, result in enumerate(sorted_results, 1):
         markdown += f"### {idx}. 
{result.file_path}\n\n" - markdown += f"- **提交**: {result.commit_hash[:8]} - {result.commit_message}\n" - markdown += f"- **日期**: {result.date.strftime('%Y-%m-%d %H:%M')}\n" - markdown += f"- **评分**:\n\n" - + markdown += f"- **Commit**: {result.commit_hash[:8]} - {result.commit_message}\n" + markdown += f"- **Date**: {result.date.strftime('%Y-%m-%d %H:%M')}\n" + markdown += f"- **Scores**:\n\n" eval = result.evaluation - markdown += "| 评分维度 | 分数 |\n" - markdown += "|---------|----|\n" - markdown += f"| 可读性 | {eval.readability} |\n" - markdown += f"| 效率与性能 | {eval.efficiency} |\n" - markdown += f"| 安全性 | {eval.security} |\n" - markdown += f"| 结构与设计 | {eval.structure} |\n" - markdown += f"| 错误处理 | {eval.error_handling} |\n" - markdown += f"| 文档与注释 | {eval.documentation} |\n" - markdown += f"| 代码风格 | {eval.code_style} |\n" - markdown += f"| **总分** | **{eval.overall_score:.1f}** |\n\n" - - markdown += "**评价意见**:\n\n" + markdown += "| Dimension | Score |\n" + markdown += "|----------|------|\n" + markdown += f"| Readability | {eval.readability} |\n" + markdown += f"| Efficiency & Performance | {eval.efficiency} |\n" + markdown += f"| Security | {eval.security} |\n" + markdown += f"| Structure & Design | {eval.structure} |\n" + markdown += f"| Error Handling | {eval.error_handling} |\n" + markdown += f"| Documentation & Comments | {eval.documentation} |\n" + markdown += f"| Code Style | {eval.code_style} |\n" + markdown += f"| **Overall Score** | **{eval.overall_score:.1f}** |\n" + + # Add estimated working hours if available + if hasattr(eval, 'estimated_hours') and eval.estimated_hours: + markdown += f"| **Estimated Working Hours** | **{eval.estimated_hours:.1f}** |\n" + + markdown += "\n**Comments**:\n\n" markdown += f"{eval.comments}\n\n" markdown += "---\n\n" diff --git a/codedog/utils/git_log_analyzer.py b/codedog/utils/git_log_analyzer.py index 23f5bd7..d779f78 100644 --- a/codedog/utils/git_log_analyzer.py +++ b/codedog/utils/git_log_analyzer.py @@ -3,7 +3,7 @@ 
import subprocess
 from dataclasses import dataclass
 from datetime import datetime
-from typing import List, Dict, Optional, Tuple
+from typing import List, Dict, Optional, Tuple, Any
 
 
 @dataclass
@@ -347,4 +347,102 @@ def calculate_total_code_stats(commits: List[CommitInfo]) -> Dict[str, int]:
         "total_deleted_lines": total_deleted,
         "total_effective_lines": total_effective,
         "total_files": total_files
-    }
\ No newline at end of file
+    }
+
+
+def get_commit_diff(
+    commit_hash: str,
+    repo_path: Optional[str] = None,
+    include_extensions: Optional[List[str]] = None,
+    exclude_extensions: Optional[List[str]] = None,
+) -> Dict[str, Dict[str, Any]]:
+    """Get the diff for a specific commit.
+
+    Args:
+        commit_hash: The hash of the commit to analyze
+        repo_path: Path to the git repository (defaults to current directory)
+        include_extensions: List of file extensions to include (e.g. ['.py', '.js'])
+        exclude_extensions: List of file extensions to exclude (e.g. ['.md', '.txt'])
+
+    Returns:
+        Dictionary mapping file paths to their diffs and statistics
+    """
+    if repo_path is None:
+        repo_path = os.getcwd()
+
+    # Verify repository path exists
+    if not os.path.exists(repo_path):
+        raise FileNotFoundError(f"Repository path does not exist: {repo_path}")
+
+    # Verify it's a git repository
+    git_dir = os.path.join(repo_path, ".git")
+    if not os.path.exists(git_dir):
+        raise ValueError(f"Not a git repository: {repo_path}")
+
+    # Get per-file statuses, e.g. "M\tfile.py". Note that git prints the
+    # name-status and numstat blocks as separate sections rather than
+    # interleaved per file, so they are queried and parsed separately.
+    status_cmd = ["git", "show", "--name-status", "--pretty=format:", commit_hash]
+    status_result = subprocess.run(status_cmd, cwd=repo_path, capture_output=True, text=True)
+    if status_result.returncode != 0:
+        raise ValueError(f"Failed to get commit statuses: {status_result.stderr}")
+
+    # Get per-file addition/deletion counts, e.g. "3\t2\tfile.py" ("-" for binary files)
+    numstat_cmd = ["git", "show", "--numstat", "--pretty=format:", commit_hash]
+    numstat_result = subprocess.run(numstat_cmd, cwd=repo_path, capture_output=True, text=True)
+    if numstat_result.returncode != 0:
+        raise ValueError(f"Failed to get commit numstat: {numstat_result.stderr}")
+
+    statuses = {}
+    for line in status_result.stdout.splitlines():
+        if "\t" not in line:
+            continue
+        status, _, file_path = line.partition("\t")
+        # For renames/copies (e.g. "R100\told\tnew"), keep the new path
+        statuses[file_path.split("\t")[-1]] = status[0]
+
+    counts = {}
+    for line in numstat_result.stdout.splitlines():
+        parts = line.split("\t")
+        if len(parts) != 3:
+            continue
+        added, deleted, file_path = parts
+        counts[file_path] = (
+            int(added) if added.isdigit() else 0,
+            int(deleted) if deleted.isdigit() else 0,
+        )
+
+    # Get the patch text for each file individually
+    file_diffs = {}
+    for file_path, status in statuses.items():
+        diff_cmd = ["git", "show", "--pretty=format:", commit_hash, "--", file_path]
+        diff_result = subprocess.run(diff_cmd, cwd=repo_path, capture_output=True, text=True)
+        additions, deletions = counts.get(file_path, (0, 0))
+        file_diffs[file_path] = {
+            "diff": diff_result.stdout.strip(),
+            "status": status,
+            "additions": additions,
+            "deletions": deletions,
+        }
+
+    # Filter by file extensions
+    if include_extensions or exclude_extensions:
+        filtered_diffs = {}
+        for file_path, diff in file_diffs.items():
+            file_ext = os.path.splitext(file_path)[1].lower()
+
+            # Skip if extension is in exclude list
+            if exclude_extensions and file_ext in exclude_extensions:
+                continue
+
+            # Include if extension is in include list or no include list specified
+            if not include_extensions or file_ext in include_extensions:
+                filtered_diffs[file_path] = diff
+
+        file_diffs = filtered_diffs
+
+    return file_diffs
\ No newline at end of file
diff --git a/codedog/utils/langchain_utils.py b/codedog/utils/langchain_utils.py
index 9bfc569..b4b1d1a 100644
--- a/codedog/utils/langchain_utils.py
+++ b/codedog/utils/langchain_utils.py
@@ -312,20 +312,27 @@ def _llm_type(self) -> str:
 
 @lru_cache(maxsize=1)
 def load_gpt_llm() -> BaseChatModel:
     """Load GPT 3.5 Model"""
+    # Get the specific GPT-3.5 model name from environment variable or use default
+    gpt35_model = env.get("GPT35_MODEL", "gpt-3.5-turbo")
+
     if env.get("AZURE_OPENAI"):
+        # For Azure, use the deployment ID from environment
+        deployment_id = env.get("AZURE_OPENAI_DEPLOYMENT_ID", "gpt-35-turbo")
+
         llm = 
AzureChatOpenAI( openai_api_type="azure", api_key=env.get("AZURE_OPENAI_API_KEY", ""), azure_endpoint=env.get("AZURE_OPENAI_API_BASE", ""), api_version="2024-05-01-preview", - azure_deployment=env.get("AZURE_OPENAI_DEPLOYMENT_ID", "gpt-35-turbo"), - model="gpt-3.5-turbo", + azure_deployment=deployment_id, + model=gpt35_model, temperature=0, ) else: llm = ChatOpenAI( api_key=env.get("OPENAI_API_KEY"), - model="gpt-3.5-turbo", + model=gpt35_model, + temperature=0, ) return llm @@ -333,20 +340,27 @@ def load_gpt_llm() -> BaseChatModel: @lru_cache(maxsize=1) def load_gpt4_llm(): """Load GPT 4 Model. Make sure your key have access to GPT 4 API. call this function won't check it.""" + # Get the specific GPT-4 model name from environment variable or use default + gpt4_model = env.get("GPT4_MODEL", "gpt-4") + if env.get("AZURE_OPENAI"): + # For Azure, use the GPT-4 deployment ID if available + deployment_id = env.get("AZURE_OPENAI_GPT4_DEPLOYMENT_ID", env.get("AZURE_OPENAI_DEPLOYMENT_ID", "gpt-4")) + llm = AzureChatOpenAI( openai_api_type="azure", api_key=env.get("AZURE_OPENAI_API_KEY", ""), azure_endpoint=env.get("AZURE_OPENAI_API_BASE", ""), api_version="2024-05-01-preview", - azure_deployment=env.get("AZURE_OPENAI_DEPLOYMENT_ID", "gpt-35-turbo"), - model="gpt-4", + azure_deployment=deployment_id, + model=gpt4_model, temperature=0, ) else: llm = ChatOpenAI( api_key=env.get("OPENAI_API_KEY"), - model="gpt-4", + model=gpt4_model, + temperature=0, ) return llm @@ -354,20 +368,26 @@ def load_gpt4_llm(): @lru_cache(maxsize=1) def load_gpt4o_llm(): """Load GPT-4o Model. 
Make sure your key have access to GPT-4o API.""" + # Get the specific GPT-4o model name from environment variable or use default + gpt4o_model = env.get("GPT4O_MODEL", "gpt-4o") + if env.get("AZURE_OPENAI"): + # For Azure, use the GPT-4o deployment ID if available + deployment_id = env.get("AZURE_OPENAI_GPT4O_DEPLOYMENT_ID", env.get("AZURE_OPENAI_DEPLOYMENT_ID", "gpt-4o")) + llm = AzureChatOpenAI( openai_api_type="azure", api_key=env.get("AZURE_OPENAI_API_KEY", ""), azure_endpoint=env.get("AZURE_OPENAI_API_BASE", ""), api_version="2024-05-01-preview", - azure_deployment=env.get("AZURE_OPENAI_DEPLOYMENT_ID", "gpt-4o"), - model="gpt-4o", + azure_deployment=deployment_id, + model=gpt4o_model, temperature=0, ) else: llm = ChatOpenAI( api_key=env.get("OPENAI_API_KEY"), - model="gpt-4o", + model=gpt4o_model, temperature=0, ) return llm @@ -408,16 +428,52 @@ def load_deepseek_r1_llm(): def load_model_by_name(model_name: str) -> BaseChatModel: - """Load a model by name""" + """Load a model by name + + Args: + model_name: The name of the model to load. Can be: + - "gpt-3.5" or any string starting with "gpt-3" for GPT-3.5 models + - "gpt-4" or any string starting with "gpt-4" (except gpt-4o) for GPT-4 models + - "gpt-4o" or "4o" for GPT-4o models + - "deepseek" for DeepSeek models + - "deepseek-r1" for DeepSeek R1 models + - Any full OpenAI model name (e.g., "gpt-3.5-turbo-16k", "gpt-4-turbo", etc.) + + Returns: + BaseChatModel: The loaded model + + Raises: + ValueError: If the model name is not recognized + """ + # Define standard model loaders model_loaders = { "gpt-3.5": load_gpt_llm, "gpt-4": load_gpt4_llm, - "gpt-4o": load_gpt4o_llm, # 添加 GPT-4o 支持 - "4o": load_gpt4o_llm, # 别名,方便使用 + "gpt-4o": load_gpt4o_llm, + "4o": load_gpt4o_llm, "deepseek": load_deepseek_llm, "deepseek-r1": load_deepseek_r1_llm, } - if model_name not in model_loaders: - raise ValueError(f"Unknown model name: {model_name}. 
Available models: {list(model_loaders.keys())}") - return model_loaders[model_name]() + # Check for exact matches first + if model_name in model_loaders: + return model_loaders[model_name]() + + # Handle OpenAI model names with pattern matching + if model_name.startswith("gpt-"): + # Handle GPT-4o models + if "4o" in model_name.lower(): + return load_gpt4o_llm() + # Handle GPT-4 models + elif model_name.startswith("gpt-4"): + return load_gpt4_llm() + # Handle GPT-3 models + elif model_name.startswith("gpt-3"): + return load_gpt_llm() + # For any other GPT models, default to GPT-3.5 + else: + logger.warning(f"Unrecognized GPT model name: {model_name}, defaulting to GPT-3.5") + return load_gpt_llm() + + # If we get here, the model name is not recognized + raise ValueError(f"Unknown model name: {model_name}. Available models: {list(model_loaders.keys())} or any OpenAI model name starting with 'gpt-'.") diff --git a/codedog/utils/remote_repository_analyzer.py b/codedog/utils/remote_repository_analyzer.py new file mode 100644 index 0000000..2693291 --- /dev/null +++ b/codedog/utils/remote_repository_analyzer.py @@ -0,0 +1,248 @@ +from dataclasses import dataclass +from datetime import datetime +from typing import List, Dict, Optional, Any, Tuple +import os +from github import Github +from gitlab import Gitlab +from urllib.parse import urlparse + +@dataclass +class CommitInfo: + """Store commit information""" + hash: str + author: str + date: datetime + message: str + files: List[str] + diff: str + added_lines: int = 0 + deleted_lines: int = 0 + effective_lines: int = 0 + +class RemoteRepositoryAnalyzer: + """Analyzer for remote Git repositories (GitHub and GitLab)""" + + def __init__(self, repo_url: str, access_token: Optional[str] = None): + """Initialize the analyzer with repository URL and optional access token. 
+ + Args: + repo_url: Full URL to the repository (e.g., https://github.com/owner/repo) + access_token: GitHub/GitLab access token (can also be set via GITHUB_TOKEN/GITLAB_TOKEN env vars) + """ + self.repo_url = repo_url + parsed_url = urlparse(repo_url) + + # Extract platform, owner, and repo name from URL + path_parts = parsed_url.path.strip('/').split('/') + if len(path_parts) < 2: + raise ValueError(f"Invalid repository URL: {repo_url}") + + self.owner = path_parts[0] + self.repo_name = path_parts[1] + + # Determine platform and initialize client + if 'github.com' in parsed_url.netloc: + self.platform = 'github' + token = access_token or os.environ.get('GITHUB_TOKEN') + if not token: + raise ValueError("GitHub token required. Set via access_token or GITHUB_TOKEN env var") + self.client = Github(token) + self.repo = self.client.get_repo(f"{self.owner}/{self.repo_name}") + + elif 'gitlab.com' in parsed_url.netloc: + self.platform = 'gitlab' + token = access_token or os.environ.get('GITLAB_TOKEN') + if not token: + raise ValueError("GitLab token required. Set via access_token or GITLAB_TOKEN env var") + self.client = Gitlab('https://gitlab.com', private_token=token) + self.repo = self.client.projects.get(f"{self.owner}/{self.repo_name}") + else: + raise ValueError(f"Unsupported Git platform: {parsed_url.netloc}") + + def get_commits_by_author_and_timeframe( + self, + author: str, + start_date: datetime, + end_date: datetime, + include_extensions: Optional[List[str]] = None, + exclude_extensions: Optional[List[str]] = None + ) -> List[CommitInfo]: + """Get commits by author within a specified timeframe. + + Args: + author: Author name or email + start_date: Start date for commit search + end_date: End date for commit search + include_extensions: List of file extensions to include (e.g. ['.py', '.js']) + exclude_extensions: List of file extensions to exclude (e.g. 
['.md', '.txt']) + + Returns: + List of CommitInfo objects containing commit details + """ + commits = [] + + if self.platform == 'github': + # GitHub API query + gh_commits = self.repo.get_commits( + author=author, + since=start_date, + until=end_date + ) + + for commit in gh_commits: + files = [] + diff = "" + added_lines = 0 + deleted_lines = 0 + + # Get detailed commit info including diffs + detailed_commit = self.repo.get_commit(commit.sha) + for file in detailed_commit.files: + if self._should_include_file(file.filename, include_extensions, exclude_extensions): + files.append(file.filename) + if file.patch: + diff += f"diff --git a/{file.filename} b/{file.filename}\n{file.patch}\n" + added_lines += file.additions + deleted_lines += file.deletions + + if files: # Only include commits that modified relevant files + commits.append(CommitInfo( + hash=commit.sha, + author=commit.commit.author.name, + date=commit.commit.author.date, + message=commit.commit.message, + files=files, + diff=diff, + added_lines=added_lines, + deleted_lines=deleted_lines, + effective_lines=added_lines - deleted_lines + )) + + elif self.platform == 'gitlab': + # GitLab API query + gl_commits = self.repo.commits.list( + all=True, + query_parameters={ + 'author': author, + 'since': start_date.isoformat(), + 'until': end_date.isoformat() + } + ) + + for commit in gl_commits: + # Get detailed commit info including diffs + detailed_commit = self.repo.commits.get(commit.id) + diff = detailed_commit.diff() + + files = [] + added_lines = 0 + deleted_lines = 0 + + for change in diff: + if self._should_include_file(change['new_path'], include_extensions, exclude_extensions): + files.append(change['new_path']) + # Parse diff to count lines + if change.get('diff'): + for line in change['diff'].splitlines(): + if line.startswith('+') and not line.startswith('+++'): + added_lines += 1 + elif line.startswith('-') and not line.startswith('---'): + deleted_lines += 1 + + if files: # Only include commits 
that modified relevant files + commits.append(CommitInfo( + hash=commit.id, + author=commit.author_name, + date=datetime.fromisoformat(commit.created_at), + message=commit.message, + files=files, + diff='\n'.join(d['diff'] for d in diff if d.get('diff')), + added_lines=added_lines, + deleted_lines=deleted_lines, + effective_lines=added_lines - deleted_lines + )) + + return commits + + def _should_include_file( + self, + filename: str, + include_extensions: Optional[List[str]] = None, + exclude_extensions: Optional[List[str]] = None + ) -> bool: + """Check if a file should be included based on its extension. + + Args: + filename: Name of the file to check + include_extensions: List of file extensions to include + exclude_extensions: List of file extensions to exclude + + Returns: + Boolean indicating whether the file should be included + """ + if not filename: + return False + + ext = os.path.splitext(filename)[1].lower() + + if exclude_extensions and ext in exclude_extensions: + return False + + if include_extensions: + return ext in include_extensions + + return True + + def get_file_diffs_by_timeframe( + self, + author: str, + start_date: datetime, + end_date: datetime, + include_extensions: Optional[List[str]] = None, + exclude_extensions: Optional[List[str]] = None + ) -> Tuple[List[CommitInfo], Dict[str, str], Dict[str, Dict[str, Any]]]: + """Get file diffs and statistics for commits within a timeframe. 
+
+        Args:
+            author: Author name or email
+            start_date: Start date for commit search
+            end_date: End date for commit search
+            include_extensions: List of file extensions to include
+            exclude_extensions: List of file extensions to exclude
+
+        Returns:
+            Tuple containing:
+            - List of CommitInfo objects
+            - Dict mapping filenames to their diffs
+            - Dict containing statistics about the changes
+        """
+        commits = self.get_commits_by_author_and_timeframe(
+            author, start_date, end_date,
+            include_extensions, exclude_extensions
+        )
+
+        file_diffs = {}
+        stats = {
+            'total_commits': len(commits),
+            'total_files': 0,
+            'total_additions': 0,
+            'total_deletions': 0,
+            'files_changed': set()
+        }
+
+        for commit in commits:
+            stats['total_files'] += len(commit.files)
+            stats['total_additions'] += commit.added_lines
+            stats['total_deletions'] += commit.deleted_lines
+            stats['files_changed'].update(commit.files)
+
+            # Aggregate diffs by file
+            for file in commit.files:
+                if file not in file_diffs:
+                    file_diffs[file] = ""
+                file_diffs[file] += f"\n# Commit {commit.hash[:8]} - {commit.message.splitlines()[0]}\n{commit.diff}"
+
+        # Convert set to list for JSON serialization
+        stats['files_changed'] = list(stats['files_changed'])
+
+        return commits, file_diffs, stats
\ No newline at end of file
diff --git a/docs/models.md b/docs/models.md
index be3383b..897aa6b 100644
--- a/docs/models.md
+++ b/docs/models.md
@@ -27,6 +27,33 @@ python run_codedog_eval.py "开发者名称" --model gpt-4o
 CODE_REVIEW_MODEL=gpt-4o
 ```
 
+### Using Full Model Names
+
+You can also use OpenAI's full model names directly:
+
+```bash
+python run_codedog_eval.py "开发者名称" --model gpt-4-turbo
+python run_codedog_eval.py "开发者名称" --model gpt-3.5-turbo-16k
+python run_codedog_eval.py "开发者名称" --model gpt-4o-mini
+```
+
+The system recognizes these model names automatically and applies the appropriate configuration.
+
+### Custom Model Versions
+
+You can pin specific model versions in the `.env` file:
+
+```
+# Specify the exact GPT-3.5 version
+GPT35_MODEL="gpt-3.5-turbo-16k"
+
+# Specify the exact GPT-4 version
+GPT4_MODEL="gpt-4-turbo"
+
+# Specify the exact GPT-4o version
+GPT4O_MODEL="gpt-4o-mini"
+```
+
 ## GPT-4o 模型
 
 GPT-4o 是 
OpenAI 的最新模型,具有以下优势: diff --git a/poetry.lock b/poetry.lock index 815c52f..45b3afc 100644 --- a/poetry.lock +++ b/poetry.lock @@ -1,4 +1,4 @@ -# This file is automatically @generated by Poetry 2.1.1 and should not be changed by hand. +# This file is automatically @generated by Poetry 2.1.2 and should not be changed by hand. [[package]] name = "aiohappyeyeballs" @@ -167,7 +167,7 @@ description = "Timeout context manager for asyncio programs" optional = false python-versions = ">=3.7" groups = ["main"] -markers = "python_version < \"3.11\"" +markers = "python_version == \"3.10\"" files = [ {file = "async-timeout-4.0.3.tar.gz", hash = "sha256:4640d96be84d82d02ed59ea2b7105a0f7b33abe8703703cd0ab0bf87c427522f"}, {file = "async_timeout-4.0.3-py3-none-any.whl", hash = "sha256:7405140ff1230c310e51dc27b3145b9092d659ce68ff733fb0cefe3ee42be028"}, @@ -665,7 +665,7 @@ description = "Backport of PEP 654 (exception groups)" optional = false python-versions = ">=3.7" groups = ["main", "http", "test"] -markers = "python_version < \"3.11\"" +markers = "python_version == \"3.10\"" files = [ {file = "exceptiongroup-1.2.2-py3-none-any.whl", hash = "sha256:3111b9d131c238bec2f8f516e123e14ba243563fb135d3fe885990585aa7795b"}, {file = "exceptiongroup-1.2.2.tar.gz", hash = "sha256:47c2edf7c6738fafb49fd34290706d1a1a2f4d1c6df275526b62cbb4aa5393cc"}, @@ -2694,7 +2694,7 @@ files = [ {file = "tomli-2.0.1-py3-none-any.whl", hash = "sha256:939de3e7a6161af0c887ef91b7d41a53e7c5a1ca976325f429cb46ea9bc30ecc"}, {file = "tomli-2.0.1.tar.gz", hash = "sha256:de526c12914f0c550d15924c62d72abc48d6fe7364aa87328337a31007fe8a4f"}, ] -markers = {dev = "python_version < \"3.11\"", test = "python_full_version <= \"3.11.0a6\""} +markers = {dev = "python_version == \"3.10\"", test = "python_full_version <= \"3.11.0a6\""} [[package]] name = "tomlkit" @@ -3348,4 +3348,4 @@ cffi = ["cffi (>=1.11)"] [metadata] lock-version = "2.1" python-versions = "^3.10" -content-hash = 
"d736b6a96a6334d08f434d75e00db7ab1bed95fa56c62a096a4f52c1f3c42da9" +content-hash = "210b7612ac15c6de39e20fa5f6e557fcbd5fe3b977a1f82216b4077ad75d51d8" diff --git a/requirements.txt b/requirements.txt index 4b7dc36..7c661b1 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1 +1,5 @@ -modelcontextprotocol-github>=0.1.0 \ No newline at end of file +modelcontextprotocol-github>=0.1.0 +PyGithub>=2.1.1 +python-gitlab>=4.4.0 +aiohttp>=3.9.3 +python-dateutil>=2.8.2 \ No newline at end of file diff --git a/run_codedog.py b/run_codedog.py index 11c3c27..f75c4f2 100755 --- a/run_codedog.py +++ b/run_codedog.py @@ -3,8 +3,10 @@ import time import traceback from dotenv import load_dotenv -from typing import List, Optional +from typing import Any, Dict, List, Optional, Tuple import os +import re +import sys from datetime import datetime, timedelta # Load environment variables from .env file @@ -20,7 +22,7 @@ from codedog.utils.langchain_utils import load_model_by_name from codedog.utils.email_utils import send_report_email from codedog.utils.git_hooks import install_git_hooks -from codedog.utils.git_log_analyzer import get_file_diffs_by_timeframe +from codedog.utils.git_log_analyzer import get_file_diffs_by_timeframe, get_commit_diff, CommitInfo from codedog.utils.code_evaluator import DiffEvaluator, generate_evaluation_markdown @@ -49,12 +51,28 @@ def parse_args(): eval_parser.add_argument("author", help="Developer name or email (partial match)") eval_parser.add_argument("--start-date", help="Start date (YYYY-MM-DD), defaults to 7 days ago") eval_parser.add_argument("--end-date", help="End date (YYYY-MM-DD), defaults to today") - eval_parser.add_argument("--repo", help="Git repository path, defaults to current directory") + eval_parser.add_argument("--repo", help="Git repository path or name (e.g. owner/repo for remote repositories)") eval_parser.add_argument("--include", help="Included file extensions, comma separated, e.g. 
.py,.js") eval_parser.add_argument("--exclude", help="Excluded file extensions, comma separated, e.g. .md,.txt") eval_parser.add_argument("--model", help="Evaluation model, defaults to CODE_REVIEW_MODEL env var or gpt-3.5") eval_parser.add_argument("--email", help="Email addresses to send the report to (comma-separated)") eval_parser.add_argument("--output", help="Report output path, defaults to codedog_eval__.md") + eval_parser.add_argument("--platform", choices=["github", "gitlab", "local"], default="local", + help="Platform to use (github, gitlab, or local, defaults to local)") + eval_parser.add_argument("--gitlab-url", help="GitLab URL (defaults to https://gitlab.com or GITLAB_URL env var)") + + # Commit review command + commit_parser = subparsers.add_parser("commit", help="Review a specific commit") + commit_parser.add_argument("commit_hash", help="Commit hash to review") + commit_parser.add_argument("--repo", help="Git repository path or name (e.g. owner/repo for remote repositories)") + commit_parser.add_argument("--include", help="Included file extensions, comma separated, e.g. .py,.js") + commit_parser.add_argument("--exclude", help="Excluded file extensions, comma separated, e.g. 
.md,.txt") + commit_parser.add_argument("--model", help="Review model, defaults to CODE_REVIEW_MODEL env var or gpt-3.5") + commit_parser.add_argument("--email", help="Email addresses to send the report to (comma-separated)") + commit_parser.add_argument("--output", help="Report output path, defaults to codedog_commit__.md") + commit_parser.add_argument("--platform", choices=["github", "gitlab", "local"], default="local", + help="Platform to use (github, gitlab, or local, defaults to local)") + commit_parser.add_argument("--gitlab-url", help="GitLab URL (defaults to https://gitlab.com or GITLAB_URL env var)") return parser.parse_args() @@ -91,6 +109,506 @@ async def code_review(retriever, review_chain): return result +def get_remote_commit_diff( + platform: str, + repository_name: str, + commit_hash: str, + include_extensions: Optional[List[str]] = None, + exclude_extensions: Optional[List[str]] = None, + gitlab_url: Optional[str] = None, +) -> Dict[str, Dict[str, Any]]: + """ + Get commit diff from remote repositories (GitHub or GitLab). + + Args: + platform (str): Platform to use (github or gitlab) + repository_name (str): Repository name (e.g. owner/repo) + commit_hash (str): Commit hash to review + include_extensions (Optional[List[str]], optional): File extensions to include. Defaults to None. + exclude_extensions (Optional[List[str]], optional): File extensions to exclude. Defaults to None. + gitlab_url (Optional[str], optional): GitLab URL. Defaults to None. 
+
+    Returns:
+        Dict[str, Dict[str, Any]]: Dictionary mapping file paths to their diffs and statistics
+    """
+    if platform.lower() == "github":
+        # Initialize GitHub client, authenticating with GITHUB_TOKEN if it is set
+        github_client = Github(os.environ.get("GITHUB_TOKEN") or None)
+        print(f"Analyzing GitHub repository {repository_name} for commit {commit_hash}")
+
+        try:
+            # Get repository
+            repo = github_client.get_repo(repository_name)
+
+            # Get commit
+            commit = repo.get_commit(commit_hash)
+
+            # Extract file diffs
+            file_diffs = {}
+            for file in commit.files:
+                # Filter by file extensions
+                _, ext = os.path.splitext(file.filename)
+                if include_extensions and ext not in include_extensions:
+                    continue
+                if exclude_extensions and ext in exclude_extensions:
+                    continue
+
+                if file.patch:
+                    file_diffs[file.filename] = {
+                        "diff": f"diff --git a/{file.filename} b/{file.filename}\n{file.patch}",
+                        "status": file.status,
+                        "additions": file.additions,
+                        "deletions": file.deletions,
+                    }
+
+            return file_diffs
+
+        except Exception as e:
+            error_msg = f"Failed to retrieve GitHub commit: {str(e)}"
+            print(error_msg)
+            return {}
+
+    elif platform.lower() == "gitlab":
+        # Initialize GitLab client
+        gitlab_token = os.environ.get("GITLAB_TOKEN", "")
+        if not gitlab_token:
+            error_msg = "GITLAB_TOKEN environment variable is not set"
+            print(error_msg)
+            return {}
+
+        # Use provided GitLab URL or fall back to environment variable or default
+        gitlab_url = gitlab_url or os.environ.get("GITLAB_URL", "https://gitlab.com")
+
+        gitlab_client = Gitlab(url=gitlab_url, private_token=gitlab_token)
+        print(f"Analyzing GitLab repository {repository_name} for commit {commit_hash}")
+
+        try:
+            # Get repository
+            project = gitlab_client.projects.get(repository_name)
+
+            # Get commit
+            commit = project.commits.get(commit_hash)
+
+            # Get commit diff
+            diff = commit.diff()
+
+            # Extract file diffs
+            file_diffs = {}
+            for file_diff in diff:
+                file_path = file_diff.get('new_path', '')
+                old_path = file_diff.get('old_path', 
'') + diff_content = file_diff.get('diff', '') + + # Skip if no diff content + if not diff_content: + continue + + # Filter by file extensions + _, ext = os.path.splitext(file_path) + if include_extensions and ext not in include_extensions: + continue + if exclude_extensions and ext in exclude_extensions: + continue + + # Determine file status + if file_diff.get('new_file', False): + status = 'A' # Added + elif file_diff.get('deleted_file', False): + status = 'D' # Deleted + else: + status = 'M' # Modified + + # Format diff content + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n{diff_content}" + + # Count additions and deletions + additions = diff_content.count('\n+') + deletions = diff_content.count('\n-') + + file_diffs[file_path] = { + "diff": formatted_diff, + "status": status, + "additions": additions, + "deletions": deletions, + } + + return file_diffs + + except Exception as e: + error_msg = f"Failed to retrieve GitLab commit: {str(e)}" + print(error_msg) + return {} + + else: + error_msg = f"Unsupported platform: {platform}. Use 'github' or 'gitlab'." + print(error_msg) + return {} + + +def get_remote_commits( + platform: str, + repository_name: str, + author: str, + start_date: str, + end_date: str, + include_extensions: Optional[List[str]] = None, + exclude_extensions: Optional[List[str]] = None, + gitlab_url: Optional[str] = None, +) -> Tuple[List[Any], Dict[str, Dict[str, str]], Dict[str, int]]: + """ + Get commits from remote repositories (GitHub or GitLab). + + Args: + platform (str): Platform to use (github or gitlab) + repository_name (str): Repository name (e.g. owner/repo) + author (str): Author name or email + start_date (str): Start date (YYYY-MM-DD) + end_date (str): End date (YYYY-MM-DD) + include_extensions (Optional[List[str]], optional): File extensions to include. Defaults to None. + exclude_extensions (Optional[List[str]], optional): File extensions to exclude. Defaults to None. 
+ gitlab_url (Optional[str], optional): GitLab URL. Defaults to None. + + Returns: + Tuple[List[Any], Dict[str, Dict[str, str]], Dict[str, int]]: Commits, file diffs, and code stats + """ + if platform.lower() == "github": + # Initialize GitHub client + github_client = Github() # Will automatically load GITHUB_TOKEN from environment + print(f"Analyzing GitHub repository {repository_name} for commits by {author}") + + try: + # Get repository + repo = github_client.get_repo(repository_name) + + # Convert dates to datetime objects + start_datetime = datetime.strptime(start_date, "%Y-%m-%d") + end_datetime = datetime.strptime(end_date, "%Y-%m-%d") + timedelta(days=1) # Include the end date + + # Get commits + commits = [] + commit_file_diffs = {} + + # Get all commits in the repository within the date range + all_commits = repo.get_commits(since=start_datetime, until=end_datetime) + + # Filter by author + for commit in all_commits: + if author.lower() in commit.commit.author.name.lower() or ( + commit.commit.author.email and author.lower() in commit.commit.author.email.lower() + ): + # Create CommitInfo object + commit_info = CommitInfo( + hash=commit.sha, + author=commit.commit.author.name, + date=commit.commit.author.date, + message=commit.commit.message, + files=[file.filename for file in commit.files], + diff="\n".join([f"diff --git a/{file.filename} b/{file.filename}\n{file.patch}" for file in commit.files if file.patch]), + added_lines=sum(file.additions for file in commit.files), + deleted_lines=sum(file.deletions for file in commit.files), + effective_lines=sum(file.additions - file.deletions for file in commit.files) + ) + commits.append(commit_info) + + # Extract file diffs + file_diffs = {} + for file in commit.files: + if file.patch: + # Filter by file extensions + _, ext = os.path.splitext(file.filename) + if include_extensions and ext not in include_extensions: + continue + if exclude_extensions and ext in exclude_extensions: + continue + + 
file_diffs[file.filename] = file.patch + + commit_file_diffs[commit.sha] = file_diffs + + # Calculate code stats + code_stats = { + "total_added_lines": sum(commit.added_lines for commit in commits), + "total_deleted_lines": sum(commit.deleted_lines for commit in commits), + "total_effective_lines": sum(commit.effective_lines for commit in commits), + "total_files": len(set(file for commit in commits for file in commit.files)) + } + + return commits, commit_file_diffs, code_stats + + except Exception as e: + error_msg = f"Failed to retrieve GitHub commits: {str(e)}" + print(error_msg) + return [], {}, {} + + elif platform.lower() == "gitlab": + # Initialize GitLab client + gitlab_token = os.environ.get("GITLAB_TOKEN", "") + if not gitlab_token: + error_msg = "GITLAB_TOKEN environment variable is not set" + print(error_msg) + return [], {}, {} + + # Use provided GitLab URL or fall back to environment variable or default + gitlab_url = gitlab_url or os.environ.get("GITLAB_URL", "https://gitlab.com") + + gitlab_client = Gitlab(url=gitlab_url, private_token=gitlab_token) + print(f"Analyzing GitLab repository {repository_name} for commits by {author}") + + try: + # Get repository + project = gitlab_client.projects.get(repository_name) + + # Get commits + commits = [] + commit_file_diffs = {} + + # Convert dates to ISO format + start_iso = f"{start_date}T00:00:00Z" + end_iso = f"{end_date}T23:59:59Z" + + # Get all commits in the repository within the date range + all_commits = project.commits.list(all=True, since=start_iso, until=end_iso) + + # Filter by author + for commit in all_commits: + if author.lower() in commit.author_name.lower() or ( + commit.author_email and author.lower() in commit.author_email.lower() + ): + # Get commit details + commit_detail = project.commits.get(commit.id) + + # Get commit diff + diff = commit_detail.diff() + + # Filter files by extension + filtered_diff = [] + for file_diff in diff: + file_path = file_diff.get('new_path', '') + _, ext = 
os.path.splitext(file_path) + + if include_extensions and ext not in include_extensions: + continue + if exclude_extensions and ext in exclude_extensions: + continue + + filtered_diff.append(file_diff) + + # Skip if no files match the filter + if not filtered_diff: + continue + + # Get file content for each modified file + file_diffs = {} + for file_diff in filtered_diff: + file_path = file_diff.get('new_path', '') + old_path = file_diff.get('old_path', '') + diff_content = file_diff.get('diff', '') + + # Skip if no diff content + if not diff_content: + continue + + # Try to get the file content + try: + # For new files, get the content from the current commit + if file_diff.get('new_file', False): + try: + # Get the file content and handle both string and bytes + file_obj = project.files.get(file_path=file_path, ref=commit.id) + if hasattr(file_obj, 'content'): + # Raw content from API + file_content = file_obj.content + elif hasattr(file_obj, 'decode'): + # Decode if it's bytes + try: + file_content = file_obj.decode() + except TypeError: + # If decode fails, try to get content directly + file_content = file_obj.content if hasattr(file_obj, 'content') else str(file_obj) + else: + # Fallback to string representation + file_content = str(file_obj) + + # Format as a proper diff with the entire file as added + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- /dev/null\n+++ b/{file_path}\n" + formatted_diff += "\n".join([f"+{line}" for line in file_content.split('\n')]) + file_diffs[file_path] = formatted_diff + except Exception as e: + print(f"Warning: Could not get content for new file {file_path}: {str(e)}") + # Try to get the raw file content directly from the API + try: + import base64 + raw_file = project.repository_files.get(file_path=file_path, ref=commit.id) + if raw_file and hasattr(raw_file, 'content'): + # Decode base64 content if available + try: + decoded_content = base64.b64decode(raw_file.content).decode('utf-8', errors='replace') + 
formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- /dev/null\n+++ b/{file_path}\n" + formatted_diff += "\n".join([f"+{line}" for line in decoded_content.split('\n')]) + file_diffs[file_path] = formatted_diff + continue + except Exception as decode_err: + print(f"Warning: Could not decode content for {file_path}: {str(decode_err)}") + except Exception as api_err: + print(f"Warning: Could not get raw file content for {file_path}: {str(api_err)}") + + # Use diff content as fallback + file_diffs[file_path] = diff_content + # For deleted files, get the content from the parent commit + elif file_diff.get('deleted_file', False): + try: + # Get parent commit + parent_commits = project.commits.get(commit.id).parent_ids + if parent_commits: + # Get the file content and handle both string and bytes + try: + file_obj = project.files.get(file_path=old_path, ref=parent_commits[0]) + if hasattr(file_obj, 'content'): + # Raw content from API + file_content = file_obj.content + elif hasattr(file_obj, 'decode'): + # Decode if it's bytes + try: + file_content = file_obj.decode() + except TypeError: + # If decode fails, try to get content directly + file_content = file_obj.content if hasattr(file_obj, 'content') else str(file_obj) + else: + # Fallback to string representation + file_content = str(file_obj) + + # Format as a proper diff with the entire file as deleted + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- a/{old_path}\n+++ /dev/null\n" + formatted_diff += "\n".join([f"-{line}" for line in file_content.split('\n')]) + file_diffs[file_path] = formatted_diff + except Exception as file_err: + # Try to get the raw file content directly from the API + try: + import base64 + raw_file = project.repository_files.get(file_path=old_path, ref=parent_commits[0]) + if raw_file and hasattr(raw_file, 'content'): + # Decode base64 content if available + try: + decoded_content = base64.b64decode(raw_file.content).decode('utf-8', errors='replace') + formatted_diff = 
f"diff --git a/{old_path} b/{file_path}\n--- a/{old_path}\n+++ /dev/null\n" + formatted_diff += "\n".join([f"-{line}" for line in decoded_content.split('\n')]) + file_diffs[file_path] = formatted_diff + except Exception as decode_err: + print(f"Warning: Could not decode content for deleted file {old_path}: {str(decode_err)}") + file_diffs[file_path] = diff_content + else: + file_diffs[file_path] = diff_content + except Exception as api_err: + print(f"Warning: Could not get raw file content for deleted file {old_path}: {str(api_err)}") + file_diffs[file_path] = diff_content + else: + file_diffs[file_path] = diff_content + except Exception as e: + print(f"Warning: Could not get content for deleted file {old_path}: {str(e)}") + file_diffs[file_path] = diff_content + # For modified files, use the diff content + else: + # Check if diff_content is empty or minimal + if not diff_content or len(diff_content.strip()) < 10: + # Try to get the full file content for better context + try: + # Get the file content and handle both string and bytes + file_obj = project.files.get(file_path=file_path, ref=commit.id) + if hasattr(file_obj, 'content'): + # Raw content from API + file_content = file_obj.content + elif hasattr(file_obj, 'decode'): + # Decode if it's bytes + try: + file_content = file_obj.decode() + except TypeError: + # If decode fails, try to get content directly + file_content = file_obj.content if hasattr(file_obj, 'content') else str(file_obj) + else: + # Fallback to string representation + file_content = str(file_obj) + + # Format as a proper diff with the entire file + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- a/{old_path}\n+++ b/{file_path}\n" + formatted_diff += "\n".join([f"+{line}" for line in file_content.split('\n')]) + file_diffs[file_path] = formatted_diff + except Exception as e: + print(f"Warning: Could not get content for modified file {file_path}: {str(e)}") + # Try to get the raw file content directly from the API + try: + import 
base64 + raw_file = project.repository_files.get(file_path=file_path, ref=commit.id) + if raw_file and hasattr(raw_file, 'content'): + # Decode base64 content if available + try: + decoded_content = base64.b64decode(raw_file.content).decode('utf-8', errors='replace') + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- a/{old_path}\n+++ b/{file_path}\n" + formatted_diff += "\n".join([f"+{line}" for line in decoded_content.split('\n')]) + file_diffs[file_path] = formatted_diff + except Exception as decode_err: + print(f"Warning: Could not decode content for {file_path}: {str(decode_err)}") + # Enhance the diff format with what we have + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- a/{old_path}\n+++ b/{file_path}\n{diff_content}" + file_diffs[file_path] = formatted_diff + else: + # Enhance the diff format with what we have + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- a/{old_path}\n+++ b/{file_path}\n{diff_content}" + file_diffs[file_path] = formatted_diff + except Exception as api_err: + print(f"Warning: Could not get raw file content for {file_path}: {str(api_err)}") + # Enhance the diff format with what we have + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- a/{old_path}\n+++ b/{file_path}\n{diff_content}" + file_diffs[file_path] = formatted_diff + else: + # Enhance the diff format + formatted_diff = f"diff --git a/{old_path} b/{file_path}\n--- a/{old_path}\n+++ b/{file_path}\n{diff_content}" + file_diffs[file_path] = formatted_diff + except Exception as e: + print(f"Warning: Error processing diff for {file_path}: {str(e)}") + file_diffs[file_path] = diff_content + + # Skip if no valid diffs + if not file_diffs: + continue + + # Create CommitInfo object with enhanced diff content + commit_info = CommitInfo( + hash=commit.id, + author=commit.author_name, + date=datetime.strptime(commit.created_at, "%Y-%m-%dT%H:%M:%S.%f%z"), + message=commit.message, + files=list(file_diffs.keys()), + 
diff="\n\n".join(file_diffs.values()),
+                    # Exclude the "+++"/"---" header lines we prepend above from the counts
+                    added_lines=sum(d.count('\n+') - d.count('\n+++') for d in file_diffs.values()),
+                    deleted_lines=sum(d.count('\n-') - d.count('\n---') for d in file_diffs.values()),
+                    effective_lines=sum((d.count('\n+') - d.count('\n+++')) - (d.count('\n-') - d.count('\n---')) for d in file_diffs.values())
+                )
+                commits.append(commit_info)
+
+                # Store file diffs for this commit
+                commit_file_diffs[commit.id] = file_diffs
+
+        # Calculate code stats
+        code_stats = {
+            "total_added_lines": sum(commit.added_lines for commit in commits),
+            "total_deleted_lines": sum(commit.deleted_lines for commit in commits),
+            "total_effective_lines": sum(commit.effective_lines for commit in commits),
+            "total_files": len(set(file for commit in commits for file in commit.files))
+        }
+
+        return commits, commit_file_diffs, code_stats
+
+    except Exception as e:
+        error_msg = f"Failed to retrieve GitLab commits: {str(e)}"
+        print(error_msg)
+        return [], {}, {}
+
+    else:
+ print(error_msg) + return [], {}, {} + + async def evaluate_developer_code( author: str, start_date: str, @@ -101,6 +619,8 @@ async def evaluate_developer_code( model_name: str = "gpt-3.5", output_file: Optional[str] = None, email_addresses: Optional[List[str]] = None, + platform: str = "local", + gitlab_url: Optional[str] = None, ): """Evaluate a developer's code commits in a time period.""" # Generate default output file name if not provided @@ -114,15 +634,33 @@ async def evaluate_developer_code( print(f"Evaluating {author}'s code commits from {start_date} to {end_date}...") - # Get commits and diffs - commits, commit_file_diffs, code_stats = get_file_diffs_by_timeframe( - author, - start_date, - end_date, - repo_path, - include_extensions, - exclude_extensions - ) + # Get commits and diffs based on platform + if platform.lower() == "local": + # Use local git repository + commits, commit_file_diffs, code_stats = get_file_diffs_by_timeframe( + author, + start_date, + end_date, + repo_path, + include_extensions, + exclude_extensions + ) + else: + # Use remote repository (GitHub or GitLab) + if not repo_path: + print("Repository path/name is required for remote platforms") + return + + commits, commit_file_diffs, code_stats = get_remote_commits( + platform, + repo_path, + author, + start_date, + end_date, + include_extensions, + exclude_extensions, + gitlab_url + ) if not commits: print(f"No commits found for {author} in the specified time period") @@ -318,6 +856,131 @@ def generate_full_report(repository_name, pull_request_number, email_addresses=N return report +async def review_commit( + commit_hash: str, + repo_path: Optional[str] = None, + include_extensions: Optional[List[str]] = None, + exclude_extensions: Optional[List[str]] = None, + model_name: str = "gpt-3.5", + output_file: Optional[str] = None, + email_addresses: Optional[List[str]] = None, + platform: str = "local", + gitlab_url: Optional[str] = None, +): + """Review a specific commit. 
+ + Args: + commit_hash: The hash of the commit to review + repo_path: Git repository path or name (e.g. owner/repo for remote repositories) + include_extensions: List of file extensions to include (e.g. ['.py', '.js']) + exclude_extensions: List of file extensions to exclude (e.g. ['.md', '.txt']) + model_name: Name of the model to use for review + output_file: Path to save the report to + email_addresses: List of email addresses to send the report to + platform: Platform to use (github, gitlab, or local) + gitlab_url: GitLab URL (for GitLab platform only) + """ + # Generate default output file name if not provided + if not output_file: + date_slug = datetime.now().strftime("%Y%m%d") + output_file = f"codedog_commit_{commit_hash[:8]}_{date_slug}.md" + + # Get model + model = load_model_by_name(model_name) + + print(f"Reviewing commit {commit_hash}...") + + # Get commit diff based on platform + commit_diff = {} + + if platform.lower() == "local": + # Use local git repository + try: + commit_diff = get_commit_diff(commit_hash, repo_path, include_extensions, exclude_extensions) + except Exception as e: + print(f"Error getting commit diff: {str(e)}") + return + elif platform.lower() in ["github", "gitlab"]: + # Use remote repository + if not repo_path or "/" not in repo_path: + print(f"Error: Repository name must be in the format 'owner/repo' for {platform} platform") + return + + commit_diff = get_remote_commit_diff( + platform=platform, + repository_name=repo_path, + commit_hash=commit_hash, + include_extensions=include_extensions, + exclude_extensions=exclude_extensions, + gitlab_url=gitlab_url, + ) + else: + print(f"Error: Unsupported platform '{platform}'. 
Use 'local', 'github', or 'gitlab'.") + return + + if not commit_diff: + print(f"No changes found in commit {commit_hash}") + return + + print(f"Found {len(commit_diff)} modified files") + + # Initialize evaluator + evaluator = DiffEvaluator(model) + + # Timing and statistics + start_time = time.time() + + with get_openai_callback() as cb: + # Perform review + print("Reviewing code changes...") + review_results = await evaluator.evaluate_commit(commit_hash, commit_diff) + + # Generate Markdown report + report = generate_evaluation_markdown(review_results) + + # Calculate cost and tokens + total_cost = cb.total_cost + total_tokens = cb.total_tokens + + # Add review statistics + elapsed_time = time.time() - start_time + telemetry_info = ( + f"\n## Review Statistics\n\n" + f"- **Review Model**: {model_name}\n" + f"- **Review Time**: {elapsed_time:.2f} seconds\n" + f"- **Tokens Used**: {total_tokens}\n" + f"- **Cost**: ${total_cost:.4f}\n" + f"\n## Code Statistics\n\n" + f"- **Total Files Modified**: {len(commit_diff)}\n" + f"- **Lines Added**: {sum(diff.get('additions', 0) for diff in commit_diff.values())}\n" + f"- **Lines Deleted**: {sum(diff.get('deletions', 0) for diff in commit_diff.values())}\n" + ) + + report += telemetry_info + + # Save report + with open(output_file, "w", encoding="utf-8") as f: + f.write(report) + print(f"Report saved to {output_file}") + + # Send email report if addresses provided + if email_addresses: + subject = f"[CodeDog] Code Review for Commit {commit_hash[:8]}" + + sent = send_report_email( + to_emails=email_addresses, + subject=subject, + markdown_content=report, + ) + + if sent: + print(f"Report sent to {', '.join(email_addresses)}") + else: + print("Failed to send email notification") + + return report + + def main(): """Main function to parse arguments and run the appropriate command.""" args = parse_args() @@ -393,6 +1056,8 @@ def main(): model_name=model_name, output_file=args.output, email_addresses=email_addresses, + 
platform=args.platform, + gitlab_url=args.gitlab_url, )) if report: @@ -400,6 +1065,44 @@ def main(): print("Report generated successfully. See output file for details.") print("\n===================== Report End =====================\n") + elif args.command == "commit": + # Process file extension parameters + include_extensions = None + if args.include: + include_extensions = parse_extensions(args.include) + elif os.environ.get("DEV_EVAL_DEFAULT_INCLUDE"): + include_extensions = parse_extensions(os.environ.get("DEV_EVAL_DEFAULT_INCLUDE")) + + exclude_extensions = None + if args.exclude: + exclude_extensions = parse_extensions(args.exclude) + elif os.environ.get("DEV_EVAL_DEFAULT_EXCLUDE"): + exclude_extensions = parse_extensions(os.environ.get("DEV_EVAL_DEFAULT_EXCLUDE")) + + # Get model + model_name = args.model or os.environ.get("CODE_REVIEW_MODEL", "gpt-3.5") + + # Get email addresses + email_addresses = parse_emails(args.email or os.environ.get("NOTIFICATION_EMAILS", "")) + + # Run commit review + report = asyncio.run(review_commit( + commit_hash=args.commit_hash, + repo_path=args.repo, + include_extensions=include_extensions, + exclude_extensions=exclude_extensions, + model_name=model_name, + output_file=args.output, + email_addresses=email_addresses, + platform=args.platform, + gitlab_url=args.gitlab_url, + )) + + if report: + print("\n===================== Commit Review Report =====================\n") + print("Report generated successfully. See output file for details.") + print("\n===================== Report End =====================\n") + else: # No command specified, show usage print("Please specify a command. 
Use --help for more information.") @@ -407,6 +1110,9 @@ def main(): print("Example: python run_codedog.py pr owner/repo 123 --platform gitlab # GitLab MR review") print("Example: python run_codedog.py setup-hooks # Set up git hooks") print("Example: python run_codedog.py eval username --start-date 2023-01-01 --end-date 2023-01-31 # Evaluate code") + print("Example: python run_codedog.py commit abc123def # Review local commit") + print("Example: python run_codedog.py commit abc123def --repo owner/repo --platform github # Review GitHub commit") + print("Example: python run_codedog.py commit abc123def --repo owner/repo --platform gitlab # Review GitLab commit") if __name__ == "__main__": diff --git a/run_codedog_commit.py b/run_codedog_commit.py deleted file mode 100755 index 5a13e20..0000000 --- a/run_codedog_commit.py +++ /dev/null @@ -1,357 +0,0 @@ -#!/usr/bin/env python -import argparse -import asyncio -import os -import sys -import time -import traceback -from datetime import datetime -from dotenv import load_dotenv -from typing import List, Optional - -# Load environment variables from .env file -# This will load GitHub or GitLab tokens from the .env file -load_dotenv() - -from langchain_community.callbacks.manager import get_openai_callback - -from codedog.actors.reporters.pull_request import PullRequestReporter -from codedog.chains import CodeReviewChain, PRSummaryChain -from codedog.models import PullRequest, ChangeFile, ChangeStatus, Repository -from codedog.models.diff import DiffContent -from codedog.processors.pull_request_processor import PullRequestProcessor -from codedog.utils.langchain_utils import load_model_by_name -from codedog.utils.email_utils import send_report_email -from codedog.utils.git_hooks import create_commit_pr_data, get_commit_files -import subprocess - - -def parse_args(): - """Parse command line arguments.""" - parser = argparse.ArgumentParser(description="CodeDog - Automatic commit code review for GitHub and GitLab repositories") - 
parser.add_argument("--commit", help="Commit hash to review (defaults to HEAD)") - parser.add_argument("--repo", help="Path to git repository (defaults to current directory)") - parser.add_argument("--email", help="Email addresses to send the report to (comma-separated)") - parser.add_argument("--output", help="Output file path (defaults to codedog_commit_.md)") - parser.add_argument("--model", help="Model to use for code review (defaults to CODE_REVIEW_MODEL env var or gpt-3.5)") - parser.add_argument("--summary-model", help="Model to use for PR summary (defaults to PR_SUMMARY_MODEL env var or gpt-4)") - parser.add_argument("--verbose", action="store_true", help="Enable verbose output") - - return parser.parse_args() - - -def parse_emails(emails_str: Optional[str]) -> List[str]: - """Parse comma-separated email addresses.""" - if not emails_str: - return [] - - return [email.strip() for email in emails_str.split(",") if email.strip()] - - -def get_file_diff(commit_hash: str, file_path: str, repo_path: Optional[str] = None) -> str: - """Get diff for a specific file in the commit. 
- - Args: - commit_hash: The commit hash - file_path: Path to the file - repo_path: Path to git repository (defaults to current directory) - - Returns: - str: The diff content - """ - cwd = repo_path or os.getcwd() - - try: - # Get diff for the file - result = subprocess.run( - ["git", "diff", f"{commit_hash}^..{commit_hash}", "--", file_path], - capture_output=True, - text=True, - cwd=cwd, - check=True, - ) - - return result.stdout - except subprocess.CalledProcessError as e: - print(f"Error getting file diff for {file_path}: {e}") - return f"Error: Unable to get diff for {file_path}" - - -def create_change_files(commit_hash: str, repo_path: Optional[str] = None) -> List[ChangeFile]: - """Create ChangeFile objects for files changed in the commit.""" - cwd = repo_path or os.getcwd() - repo_name = os.path.basename(os.path.abspath(cwd)) - - # Get list of files changed in the commit - files = get_commit_files(commit_hash, repo_path) - - # Create a unique ID for the commit - commit_id = int(commit_hash[:8], 16) - - change_files = [] - for file_path in files: - # Get file name and suffix - file_name = os.path.basename(file_path) - suffix = file_path.split('.')[-1] if '.' 
in file_path else "" - - # Get diff content - diff_content_str = get_file_diff(commit_hash, file_path, repo_path) - - # Create DiffContent object - diff_content = DiffContent( - add_count=diff_content_str.count('\n+') - diff_content_str.count('\n+++'), - remove_count=diff_content_str.count('\n-') - diff_content_str.count('\n---'), - content=diff_content_str - ) - - # Create ChangeFile object - change_file = ChangeFile( - blob_id=abs(hash(file_path)) % (10 ** 8), # Generate a stable ID from file path - sha=commit_hash, - full_name=file_path, - source_full_name=file_path, - status=ChangeStatus.modified, # Assume modified for simplicity - pull_request_id=commit_id, - start_commit_id=int(commit_hash[:8], 16) - 1, # Previous commit - end_commit_id=int(commit_hash[:8], 16), # Current commit - name=file_name, - suffix=suffix, - diff_content=diff_content - ) - - change_files.append(change_file) - - return change_files - - -def create_pull_request_from_commit(commit_hash: str, repo_path: Optional[str] = None) -> PullRequest: - """Create a PullRequest object from a commit.""" - # Get commit data in PR-like format - commit_data = create_commit_pr_data(commit_hash, repo_path) - - # Create change files - change_files = create_change_files(commit_hash, repo_path) - - # Create repository object - cwd = repo_path or os.getcwd() - repo_name = os.path.basename(os.path.abspath(cwd)) - repository = Repository( - repository_id=abs(hash(repo_name)) % (10 ** 8), - repository_name=repo_name, - repository_full_name=repo_name, - repository_url=cwd - ) - - # Create PullRequest object - pull_request = PullRequest( - pull_request_id=commit_data["pull_request_id"], - repository_id=commit_data["repository_id"], - pull_request_number=int(commit_hash[:8], 16), - title=commit_data["title"], - body=commit_data["body"], - url="", - repository_name=repo_name, - related_issues=[], - change_files=change_files, - repository=repository, - source_repository=repository - ) - - return pull_request - - -async 
-async def pr_summary(pull_request, summary_chain):
-    """Generate PR summary asynchronously."""
-    result = await summary_chain.ainvoke(
-        {"pull_request": pull_request}, include_run_info=True
-    )
-    return result
-
-
-async def code_review(pull_request, review_chain):
-    """Generate code review asynchronously."""
-    result = await review_chain.ainvoke(
-        {"pull_request": pull_request}, include_run_info=True
-    )
-    return result
-
-
-def generate_commit_review(commit_hash: str, repo_path: Optional[str] = None,
-                           email_addresses: Optional[List[str]] = None,
-                           output_file: Optional[str] = None,
-                           code_review_model: str = None,
-                           pr_summary_model: str = None,
-                           verbose: bool = False) -> str:
-    """Generate a code review for a commit.
-
-    This function works with both GitHub and GitLab repositories by analyzing local Git commits.
-    It doesn't require direct API access to GitHub or GitLab as it works with the local repository.
-
-    Args:
-        commit_hash: The commit hash to review
-        repo_path: Path to git repository (defaults to current directory)
-        email_addresses: List of email addresses to send the report to
-        output_file: Output file path (defaults to codedog_commit_.md)
-        code_review_model: Model to use for code review
-        pr_summary_model: Model to use for PR summary
-        verbose: Enable verbose output
-
-    Returns:
-        str: The generated review report in markdown format
-    """
-    start_time = time.time()
-
-    # Set default models from environment variables
-    code_review_model = code_review_model or os.environ.get("CODE_REVIEW_MODEL", "gpt-3.5")
-    pr_summary_model = pr_summary_model or os.environ.get("PR_SUMMARY_MODEL", "gpt-4")
-    code_summary_model = os.environ.get("CODE_SUMMARY_MODEL", "gpt-3.5")
-
-    # Create PullRequest object from commit
-    pull_request = create_pull_request_from_commit(commit_hash, repo_path)
-
-    if verbose:
-        print(f"Reviewing commit: {commit_hash}")
-        print(f"Title: {pull_request.title}")
-        print(f"Files changed: {len(pull_request.change_files)}")
-
-    # Initialize chains with specified models
-    summary_chain = PRSummaryChain.from_llm(
-        code_summary_llm=load_model_by_name(code_summary_model),
-        pr_summary_llm=load_model_by_name(pr_summary_model),
-        verbose=verbose
-    )
-
-    review_chain = CodeReviewChain.from_llm(
-        llm=load_model_by_name(code_review_model),
-        verbose=verbose
-    )
-
-    with get_openai_callback() as cb:
-        # Get PR summary
-        if verbose:
-            print(f"Generating commit summary using {pr_summary_model}...")
-
-        pr_summary_result = asyncio.run(pr_summary(pull_request, summary_chain))
-        pr_summary_cost = cb.total_cost
-
-        if verbose:
-            print(f"Commit summary complete, cost: ${pr_summary_cost:.4f}")
-
-        # Get code review
-        if verbose:
-            print(f"Generating code review using {code_review_model}...")
-
-        try:
-            code_review_result = asyncio.run(code_review(pull_request, review_chain))
-            code_review_cost = cb.total_cost - pr_summary_cost
-
-            if verbose:
-                print(f"Code review complete, cost: ${code_review_cost:.4f}")
-        except Exception as e:
-            print(f"Code review generation failed: {str(e)}")
-            if verbose:
-                print(traceback.format_exc())
-            # Use empty code review
-            code_review_result = {"code_reviews": []}
-
-    # Create report
-    total_cost = cb.total_cost
-    total_time = time.time() - start_time
-
-    reporter = PullRequestReporter(
-        pr_summary=pr_summary_result["pr_summary"],
-        code_summaries=pr_summary_result["code_summaries"],
-        pull_request=pull_request,
-        code_reviews=code_review_result.get("code_reviews", []),
-        telemetry={
-            "start_time": start_time,
-            "time_usage": total_time,
-            "cost": total_cost,
-            "tokens": cb.total_tokens,
-        },
-    )
-
-    report = reporter.report()
-
-    # Save report to file
-    if not output_file:
-        output_file = f"codedog_commit_{commit_hash[:8]}.md"
-
-    with open(output_file, "w", encoding="utf-8") as f:
-        f.write(report)
-
-    if verbose:
-        print(f"Report saved to {output_file}")
-
-    # Send email notification if email addresses provided
-    if email_addresses:
-        subject = f"[CodeDog] Code Review for Commit {commit_hash[:8]}: {pull_request.title}"
-        sent = send_report_email(
-            to_emails=email_addresses,
-            subject=subject,
-            markdown_content=report,
-        )
-        if sent and verbose:
-            print(f"Report sent to {', '.join(email_addresses)}")
-        elif not sent and verbose:
-            print("Failed to send email notification")
-
-    return report
-
-
-def main():
-    """Main function to parse arguments and run the commit review.
-
-    This works with both GitHub and GitLab repositories by analyzing local Git commits.
-    """
-    args = parse_args()
-
-    # Get commit hash (default to HEAD if not provided)
-    commit_hash = args.commit
-    if not commit_hash:
-        import subprocess
-        result = subprocess.run(
-            ["git", "rev-parse", "HEAD"],
-            capture_output=True,
-            text=True,
-            check=True
-        )
-        commit_hash = result.stdout.strip()
-
-    # Get email addresses from args, env var, or use the default address
-    default_email = "kratosxie@gmail.com"  # Default email address
-    email_from_args = args.email or os.environ.get("NOTIFICATION_EMAILS", "")
-
-    # If no email is specified in args or env, use the default
-    if not email_from_args:
-        email_addresses = [default_email]
-        print(f"No email specified, using default: {default_email}")
-    else:
-        email_addresses = parse_emails(email_from_args)
-
-    # Generate review
-    report = generate_commit_review(
-        commit_hash=commit_hash,
-        repo_path=args.repo,
-        email_addresses=email_addresses,
-        output_file=args.output,
-        code_review_model=args.model,
-        pr_summary_model=args.summary_model,
-        verbose=args.verbose
-    )
-
-    if args.verbose:
-        print("\n===================== Review Report =====================\n")
-        print(f"Report generated for commit {commit_hash[:8]}")
-        if email_addresses:
-            print(f"Report sent to: {', '.join(email_addresses)}")
-        print("\n===================== Report End =====================\n")
-
-
-if __name__ == "__main__":
-    try:
-        main()
-    except Exception as e:
-        print(f"Error: {str(e)}")
-        print("\nDetailed error information:")
-        traceback.print_exc()
diff --git
a/run_codedog_eval.py b/run_codedog_eval.py
deleted mode 100755
index 9ac84c9..0000000
--- a/run_codedog_eval.py
+++ /dev/null
@@ -1,179 +0,0 @@
-#!/usr/bin/env python3
-import argparse
-import asyncio
-import os
-import sys
-import time
-from datetime import datetime, timedelta
-from dotenv import load_dotenv
-
-# Load environment variables
-load_dotenv(override=True)  # Override existing environment variables so the latest values are loaded from .env
-
-from codedog.utils.git_log_analyzer import get_file_diffs_by_timeframe
-from codedog.utils.code_evaluator import DiffEvaluator, generate_evaluation_markdown
-from codedog.utils.langchain_utils import load_model_by_name, DeepSeekChatModel
-from codedog.utils.email_utils import send_report_email
-from langchain_community.callbacks.manager import get_openai_callback
-
-
-def parse_args():
-    """Parse command-line arguments."""
-    parser = argparse.ArgumentParser(description="CodeDog Eval - evaluate code commits by time range and developer")
-
-    # Required arguments
-    parser.add_argument("author", help="Developer name or email (partial match)")
-
-    # Optional arguments
-    parser.add_argument("--start-date", help="Start date (YYYY-MM-DD), defaults to 7 days ago")
-    parser.add_argument("--end-date", help="End date (YYYY-MM-DD), defaults to today")
-    parser.add_argument("--repo", help="Path to the Git repository, defaults to the current directory")
-    parser.add_argument("--include", help="File extensions to include, comma-separated, e.g. .py,.js")
-    parser.add_argument("--exclude", help="File extensions to exclude, comma-separated, e.g. .md,.txt")
-    parser.add_argument("--model", help="Evaluation model, defaults to the CODE_REVIEW_MODEL environment variable or gpt-3.5")
-    parser.add_argument("--email", help="Email addresses to send the report to, comma-separated")
-    parser.add_argument("--output", help="Report output file path, defaults to codedog_eval__.md")
-    parser.add_argument("--tokens-per-minute", type=int, default=6000, help="Token rate limit per minute, defaults to 6000")
-    parser.add_argument("--max-concurrent", type=int, default=2, help="Maximum number of concurrent requests, defaults to 2")
-    parser.add_argument("--cache", action="store_true", help="Enable caching to avoid re-evaluating identical files")
-    parser.add_argument("--save-diffs", action="store_true", help="Save diff content to intermediate files for analyzing token usage")
-    parser.add_argument("--verbose", action="store_true", help="Show detailed progress information")
-
-    return parser.parse_args()
-
-
-async def main():
-    """Main program."""
-    args = parse_args()
-
-    # Handle date arguments
-    today = datetime.now().strftime("%Y-%m-%d")
-    week_ago = (datetime.now() - timedelta(days=7)).strftime("%Y-%m-%d")
-
-    start_date = args.start_date or week_ago
-    end_date = args.end_date or today
-
-    # Generate the default output filename
-    if not args.output:
-        author_slug = args.author.replace("@", "_at_").replace(" ", "_").replace("/", "_")
-        date_slug = datetime.now().strftime("%Y%m%d")
-        args.output = f"codedog_eval_{author_slug}_{date_slug}.md"
-
-    # Handle file extension arguments
-    include_extensions = [ext.strip() for ext in args.include.split(",")] if args.include else None
-    exclude_extensions = [ext.strip() for ext in args.exclude.split(",")] if args.exclude else None
-
-    # Load the model
-    model_name = args.model or os.environ.get("CODE_REVIEW_MODEL", "gpt-3.5")
-    model = load_model_by_name(model_name)
-
-    print(f"Evaluating commits by {args.author} between {start_date} and {end_date}...")
-
-    # Get commits and diffs
-    commits, commit_file_diffs, code_stats = get_file_diffs_by_timeframe(
-        args.author,
-        start_date,
-        end_date,
-        args.repo,
-        include_extensions,
-        exclude_extensions
-    )
-
-    if not commits:
-        print(f"No commits found for {args.author} in the given time range")
-        return
-
-    print(f"Found {len(commits)} commits modifying {code_stats['total_files']} files")
-    print(f"Code stats: {code_stats['total_added_lines']} lines added, {code_stats['total_deleted_lines']} deleted, {code_stats['total_effective_lines']} effective changes")
-
-    # Initialize the evaluator with the command-line arguments
-    evaluator = DiffEvaluator(
-        model,
-        tokens_per_minute=args.tokens_per_minute,
-        max_concurrent_requests=args.max_concurrent,
-        save_diffs=args.save_diffs
-    )
-
-    # Create the diffs directory if saving diff content is enabled
-    if args.save_diffs:
-        os.makedirs("diffs", exist_ok=True)
-        print("Diff saving enabled; files will be written to the diffs directory")
-
-    # Clear the cache dict if caching is disabled
-    if not args.cache:
-        evaluator.cache = {}
-        print("Cache disabled")
-    else:
-        print("Cache enabled; identical files will be served from the cache")
-
-    # Timing and statistics
-    start_time = time.time()
-    total_cost = 0
-    total_tokens = 0
-
-    # Run the evaluation
-    print("Evaluating code commits...")
-    if isinstance(model, DeepSeekChatModel):
-        evaluation_results = await evaluator.evaluate_commits(commits, commit_file_diffs, verbose=args.verbose)
-        total_tokens = model.total_tokens
-        total_cost = model.total_cost
-    else:
-        with get_openai_callback() as cb:
-            evaluation_results = await evaluator.evaluate_commits(commits, commit_file_diffs, verbose=args.verbose)
-            total_tokens = cb.total_tokens
-            total_cost = cb.total_cost
-
-    # Generate the Markdown report
-    report = generate_evaluation_markdown(evaluation_results)
-
-    # Append code volume and evaluation statistics
-    elapsed_time = time.time() - start_time
-    telemetry_info = (
-        f"\n## Code Statistics\n\n"
-        f"- **Commits**: {len(commits)}\n"
-        f"- **Files Modified**: {code_stats['total_files']}\n"
-        f"- **Lines Added**: {code_stats['total_added_lines']}\n"
-        f"- **Lines Deleted**: {code_stats['total_deleted_lines']}\n"
-        f"- **Effective Changed Lines**: {code_stats['total_effective_lines']}\n"
-        f"\n## Evaluation Statistics\n\n"
-        f"- **Evaluation Model**: {model_name}\n"
-        f"- **Evaluation Time**: {elapsed_time:.2f} seconds\n"
-        f"- **Tokens Used**: {total_tokens}\n"
-        f"- **Evaluation Cost**: ${total_cost:.4f}\n"
-    )
-
-    report += telemetry_info
-
-    # Save the report
-    with open(args.output, "w", encoding="utf-8") as f:
-        f.write(report)
-    print(f"Report saved to {args.output}")
-
-    # Send the report by email
-    if args.email:
-        email_list = [email.strip() for email in args.email.split(",")]
-        subject = f"[CodeDog] Code evaluation report for {args.author} ({start_date} to {end_date})"
-
-        sent = send_report_email(
-            to_emails=email_list,
-            subject=subject,
-            markdown_content=report,
-        )
-
-        if sent:
-            print(f"Report sent to {', '.join(email_list)}")
-        else:
-            print("Failed to send email; please check the email configuration")
-
-
-if __name__ == "__main__":
-    try:
-        asyncio.run(main())
-    except KeyboardInterrupt:
-        print("\nInterrupted")
-        sys.exit(1)
-    except Exception as e:
-        print(f"Error: {str(e)}")
-        import traceback
-        traceback.print_exc()
-        sys.exit(1)
\ No newline at end of file
diff --git a/test_auto_review.py b/test_auto_review.py
deleted mode 100644
index 6ad069f..0000000
--- a/test_auto_review.py
+++ /dev/null
@@ -1,31 +0,0 @@
-#!/usr/bin/env python
-"""
-Test the automatic code review and email reporting features.
-
-This file is used to test whether the Git hook correctly triggers a code review and sends an email report.
-"""
-
-def hello_world():
-    """Print the Hello, World!
message."""
-    print("Hello, World!")
-    return "Hello, World!"
-
-def calculate_sum(a, b):
-    """Compute the sum of two numbers.
-
-    Args:
-        a: the first number
-        b: the second number
-
-    Returns:
-        The sum of the two numbers
-    """
-    # Add type checking
-    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
-        raise TypeError("Arguments must be numeric")
-    return a + b
-
-if __name__ == "__main__":
-    hello_world()
-    result = calculate_sum(5, 10)
-    print(f"5 + 10 = {result}")
diff --git a/test_gpt4o.py b/test_gpt4o.py
deleted mode 100644
index 8aa3ad0..0000000
--- a/test_gpt4o.py
+++ /dev/null
@@ -1,77 +0,0 @@
-#!/usr/bin/env python
-"""
-Test GPT-4o model support.
-
-This script tests CodeDog's support for the GPT-4o model.
-It loads the GPT-4o model and runs a simple code evaluation task.
-"""
-
-import os
-import sys
-import asyncio
-from dotenv import load_dotenv
-
-# Load environment variables
-load_dotenv()
-
-# Add the current directory to the Python path
-sys.path.append(os.path.dirname(os.path.abspath(__file__)))
-
-from codedog.utils.langchain_utils import load_model_by_name
-from codedog.utils.code_evaluator import DiffEvaluator
-
-# Test code diff
-TEST_DIFF = """
-diff --git a/example.py b/example.py
-index 1234567..abcdefg 100644
---- a/example.py
-+++ b/example.py
-@@ -1,5 +1,7 @@
- def calculate_sum(a, b):
--    return a + b
-+    # Add type checking
-+    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
-+        raise TypeError("Arguments must be numbers")
-+    return a + b
-
- def main():
-     print(calculate_sum(5, 10))
-"""
-
-async def test_gpt4o():
-    """Test the GPT-4o model."""
-    print("Loading GPT-4o model...")
-
-    try:
-        # Try to load the GPT-4o model
-        model = load_model_by_name("gpt-4o")
-        print(f"Model loaded successfully: {model.__class__.__name__}")
-
-        # Create the evaluator
-        evaluator = DiffEvaluator(model, tokens_per_minute=6000, max_concurrent_requests=1)
-
-        # Evaluate the code diff
-        print("Evaluating code diff...")
-        result = await evaluator._evaluate_single_diff(TEST_DIFF)
-
-        # Print the evaluation results
-        print("\nEvaluation results:")
-        print(f"Readability: {result.get('readability', 'N/A')}")
-        print(f"Efficiency: {result.get('efficiency', 'N/A')}")
-        print(f"Security: {result.get('security', 'N/A')}")
-        print(f"Structure: {result.get('structure', 'N/A')}")
-        print(f"Error handling:
{result.get('error_handling', 'N/A')}")
-        print(f"Documentation: {result.get('documentation', 'N/A')}")
-        print(f"Code style: {result.get('code_style', 'N/A')}")
-        print(f"Overall score: {result.get('overall_score', 'N/A')}")
-        print(f"\nComments: {result.get('comments', 'N/A')}")
-
-        print("\nGPT-4o model test passed!")
-
-    except Exception as e:
-        print(f"Test failed: {str(e)}")
-        import traceback
-        traceback.print_exc()
-
-if __name__ == "__main__":
-    asyncio.run(test_gpt4o())
diff --git a/test_grimoire_deepseek_r1_py.md b/test_grimoire_deepseek_r1_py.md
deleted mode 100644
index 7c31c34..0000000
--- a/test_grimoire_deepseek_r1_py.md
+++ /dev/null
@@ -1,580 +0,0 @@
-# Code Evaluation Report
-
-## Overview
-
-- **Developer**: Arcadia
-- **Time Range**: 2023-08-21 to 2024-07-31
-- **Files Evaluated**: 24
-
-## Overall Scores
-
-| Dimension | Average |
-|---------|-------|
-| Readability | 7.3 |
-| Efficiency & Performance | 7.8 |
-| Security | 6.3 |
-| Structure & Design | 7.2 |
-| Error Handling | 5.5 |
-| Documentation & Comments | 5.7 |
-| Code Style | 8.1 |
-| **Overall** | **6.8** |
-
-**Overall Code Quality**: Good
-
-## File Evaluation Details
-
-### 1. examples/github_server.py
-
-- **Commit**: b2e3f4c0 - chore: Add a gitlab server example (#40)
-- **Date**: 2023-08-21 15:40
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 6 |
-| Security | 3 |
-| Structure & Design | 6 |
-| Error Handling | 4 |
-| Documentation & Comments | 5 |
-| Code Style | 7 |
-| **Overall** | **5.4** |
-
-**Comments**:
-
-The code improves readability (formatting, naming) and code style (PEP8 alignment), but has a significant security risk (hard-coded token). Suggestions: 1. store sensitive information in environment variables 2. add exception-handling logic 3. add function docstrings 4. consider a thread pool instead of creating threads directly 5. add input parameter validation. For performance, async task management could be optimized; the docs need a module-level description and an explanation of configuration parameters.
-
----
-
-### 2. examples/gitlab_server.py
-
-- **Commit**: b2e3f4c0 - chore: Add a gitlab server example (#40)
-- **Date**: 2023-08-21 15:40
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 6 |
-| Security | 4 |
-| Structure & Design | 7 |
-| Error Handling | 5 |
-| Documentation & Comments | 6 |
-| Code Style | 7 |
-| **Overall** | **6.0** |
-
-**Comments**:
-
-The overall structure is clear, but several points need improvement: 1. Readability: access the private attribute `retriever._git_merge_request` through a public method instead; 2. Efficiency: replace the synchronous threading pattern with a fully async architecture; 3. Security: inject hard-coded secrets via environment variables and strengthen input validation; 4. Error handling: catch exceptions inside threads and add retries for Gitlab API calls; 5. Docs: document the event model fields and the interfaces; 6. Code style: use a consistent space after commas. Consider a configuration class for global parameters and unit tests covering the core logic.
-
----
-
-### 3.
codedog/utils/langchain_utils.py
-
-- **Commit**: 69318d8e - fix: update openai api version
-- **Date**: 2024-05-31 11:49
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 8 |
-| Efficiency & Performance | 7 |
-| Security | 8 |
-| Structure & Design | 8 |
-| Error Handling | 5 |
-| Documentation & Comments | 5 |
-| Code Style | 9 |
-| **Overall** | **7.1** |
-
-**Comments**:
-
-The diff mainly updates the Azure OpenAI API version to the latest preview, improving security and compatibility. Readability and code style are good; parameter naming is clear and formatting consistent. Room for improvement: 1) add a comment explaining why the API version was upgraded 2) handle missing environment variables 3) add a docstring describing the interface's purpose and parameter requirements 4) extract the API version into a configuration constant instead of hard-coding it. The change is reasonable overall but needs stronger exception handling and documentation.
-
----
-
-### 4. codedog/models/change_file.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 8 |
-| Efficiency & Performance | 10 |
-| Security | 8 |
-| Structure & Design | 7 |
-| Error Handling | 7 |
-| Documentation & Comments | 8 |
-| Code Style | 9 |
-| **Overall** | **8.1** |
-
-**Comments**:
-
-Renaming the variable from _raw to raw improves readability and follows PEP8 naming conventions. Comments were updated in sync, but more detailed context documentation is missing. No obvious performance or security issues. Structurally, confirm whether exposing the internal data is appropriate and that encapsulation still matches the design intent. Error handling is untouched; consider adding exception handling later.
-
----
-
-### 5. codedog/chains/prompts.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 9 |
-| Security | 7 |
-| Structure & Design | 7 |
-| Error Handling | 6 |
-| Documentation & Comments | 6 |
-| Code Style | 8 |
-| **Overall** | **7.1** |
-
-**Comments**:
-
-The improvements are mainly in readability and code style: 1) wrapping the parameter list and adding trailing commas make multi-line parameters easier to read 2) the import-path changes follow modular design conventions. Suggestions: 1) add comments explaining the purpose of template variables 2) declare safe dependency versions 3) validate input parameter types 4) handle template-loading failures. The style improvements are welcome, but the core business logic still needs documentation and fault tolerance.
-
----
-
-### 6. codedog/models/diff.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 8 |
-| Efficiency & Performance | 9 |
-| Security | 7 |
-| Structure & Design | 8 |
-| Error Handling | 7 |
-| Documentation & Comments | 6 |
-| Code Style | 9 |
-| **Overall** | **7.7** |
-
-**Comments**:
-
-Readability and structure are good; naming is consistent with Pydantic model conventions. The new arbitrary_types_allowed config deserves security scrutiny; add a comment explaining why it is needed. The docstring should mention the model-config change. The style fully follows Pydantic v2 config conventions, and no extra performance overhead is introduced. No new error handling was observed; strengthen handling of type-validation failures in future work.
-
----
-
-### 7. codedog/chains/code_review/prompts.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 8 |
-| Efficiency & Performance | 9 |
-| Security | 7 |
-| Structure & Design | 7 |
-| Error Handling | 6 |
-| Documentation & Comments | 5 |
-| Code Style | 9 |
-| **Overall** | **7.3** |
-
-**Comments**:
-
-Readability improves with the multi-line parameter format, and the style follows PEP8. The import-path changes reflect better modularity, but error handling and security practices are untouched. Suggestions: 1) add validation for template variables 2) add a module-level docstring 3) handle possible template-rendering exceptions. Documentation still needs work; the existing TODO comment about localization should be made concrete.
-
----
-
-### 8. examples/github_server.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 8 |
-| Security | 7 |
-| Structure & Design | 7 |
-| Error Handling | 6 |
-| Documentation & Comments | 7 |
-| Code Style | 8 |
-| **Overall** | **7.1** |
-
-**Comments**:
-
-The diff mainly updates dependency paths and formatting: 1) the deprecated langchain.callbacks import moves to the community path, improving modularity 2) added blank lines follow PEP8 3) existing docstrings and type annotations are kept. Suggestions: 1) handle exceptions from Github API calls 2) validate input parameters 3) document resource cleanup where callbacks are used.
-
----
-
-### 9. codedog/chains/code_review/translate_code_review_chain.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 9 |
-| Security | 7 |
-| Structure & Design | 8 |
-| Error Handling | 6 |
-| Documentation & Comments | 6 |
-| Code Style | 8 |
-| **Overall** | **7.3** |
-
-**Comments**:
-
-The change mainly optimizes imports and dependency management; the clearer import structure improves readability. No performance impact, and no sensitive operations are involved. The more standard imports improve organization, but error-handling logic is unchanged. No new comments were added; consider documenting why the modules were reorganized. The style is compliant, but make sure all imports are grouped and ordered per the project style guide.
-
----
-
-### 10. examples/gitlab_server.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 8 |
-| Security | 6 |
-| Structure & Design | 7 |
-| Error Handling | 5 |
-| Documentation & Comments | 6 |
-| Code Style | 8 |
-| **Overall** | **6.7** |
-
-**Comments**:
-
-Readability is good: the adjusted import paths are clearer and blank lines are used consistently. Performance impact is small, but watch for bottlenecks in Gitlab API calls. Security lacks authentication and input validation; add both. Error handling is entirely missing; add exception catching. Docstrings are thin; add a module-level description. The style follows PEP8, and the langchain_community imports follow the latest module structure. Suggestions: 1. add authentication to API endpoints 2. wrap Gitlab operations in try-except blocks 3. add module-level docs 4. document parameter types on key functions.
-
----
-
-### 11. codedog/chains/code_review/base.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 8 |
-| Security | 7 |
-| Structure & Design | 8 |
-| Error Handling | 6 |
-| Documentation & Comments | 6 |
-| Code Style | 8 |
-| **Overall** | **7.1** |
-
-**Comments**:
-
-The diff mainly restructures imports to match the latest langchain layout (e.g. importing BasePromptTemplate from langchain_core), improving modularity and style. Readability is good but comments were not improved, and error handling is unchanged. Suggestions: 1. add docstrings describing the responsibilities of key methods 2. add exception-handling logic 3. keep third-party version dependencies up to date.
-
----
-
-### 12. codedog/chains/pr_summary/prompts.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 8 |
-| Efficiency & Performance | 9 |
-| Security | 7 |
-| Structure & Design | 8 |
-| Error Handling | 5 |
-| Documentation & Comments | 6 |
-| Code Style | 9 |
-| **Overall** | **7.4** |
-
-**Comments**:
-
-The improvements are mainly formatting normalization and import cleanup:
-1. Breaking up long statements clearly improves readability; keep the indentation style consistent
-2. Moving imports to langchain_core shows dependency-management awareness
-3. The security score reflects no explicit risk but also no input-validation mechanism
-4. Error handling misses potential exceptions (parse failures, missing variables)
-5. Suggested additions:
-   - docstrings on key methods
-   - validation of input parameters
-   - try-except blocks around parsing
-   - externalized configuration
-
----
-
-### 13. examples/translation.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 7 |
-| Security | 7 |
-| Structure & Design | 7 |
-| Error Handling | 6 |
-| Documentation & Comments | 5 |
-| Code Style | 8 |
-| **Overall** | **6.7** |
-
-**Comments**:
-
-Overall quality is decent; main suggestions:
-1. Readability: the rename from acall to ainvoke lacks context; add a comment explaining the method change
-2. Documentation: the call change and dependency-path change are unexplained; record the reason
-3. Error handling: no new handling observed; check exception propagation along the async call chain
-4. Dependency management: the langchain_community import-path change requires correctly updated dependency versions
-5. Code style: PEP8-compliant; ainvoke is semantically clearer than acall.
-
----
-
-### 14. codedog/models/issue.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 7 |
-| Security | 6 |
-| Structure & Design | 6 |
-| Error Handling | 5 |
-| Documentation & Comments | 6 |
-| Code Style | 8 |
-| **Overall** | **6.4** |
-
-**Comments**:
-
-Readability is fine, and renaming the field to raw is more intuitive, but deleting the validator risks data integrity. Efficiency is roughly unchanged, though removing the validator simplifies some logic. Watch for downstream problems from unhandled None values. Structurally, add an alternative validation mechanism to replace the old one. Error-handling capability regressed; add a fallback for None values. Docs should explain the field change and the impact of removing validation. The style is compliant, but confirm the visibility change matches project conventions.
-
----
-
-### 15. codedog/models/commit.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 8 |
-| Security | 6 |
-| Structure & Design | 7 |
-| Error Handling | 4 |
-| Documentation & Comments | 5 |
-| Code Style | 8 |
-| **Overall** | **6.4** |
-
-**Comments**:
-
-Readability: renaming the private field `_raw` to the public field `raw` is clearer, but there is a duplicated comment. Efficiency: removing the validator logic may help performance, but confirm functional completeness. Security: removing the validator can leave None values unhandled, a potential risk. Structure: the model is simpler, but confirm the default-value handling was replaced. Error handling: risky without the None-value validator. Documentation: fix the duplicated comment and improve the field descriptions. Style: compliant, but check field naming conventions. Suggestions: 1. fix the duplicate comment 2. add None-value handling 3. verify the default-value mechanism 4. add type annotations for maintainability.
-
----
-
-### 16. codedog/models/repository.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 8 |
-| Efficiency & Performance | 7 |
-| Security | 5 |
-| Structure & Design | 6 |
-| Error Handling | 4 |
-| Documentation & Comments | 6 |
-| Code Style | 8 |
-| **Overall** | **6.3** |
-
-**Comments**:
-
-Readability is good; the rename to raw follows naming conventions, and removing unused imports keeps the code lean. However, removing the none_to_default validator loses the field default-value handling, which is both a security risk (None values not handled correctly) and an error-handling gap (defaults no longer filled automatically). Add field-level default handling or switch to Field(default_factory). The comments are intact but do not explain the validation change; document it.
-
----
-
-### 17. codedog/chains/pr_summary/translate_pr_summary_chain.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 7 |
-| Security | 7 |
-| Structure & Design | 8 |
-| Error Handling | 6 |
-| Documentation & Comments | 5 |
-| Code Style | 8 |
-| **Overall** | **6.9** |
-
-**Comments**:
-
-Structure and code style clearly improve; the modular imports and async method calls follow best practice. Readability is good, but no new comments were added. Error handling is not noticeably stronger; add exception catching. Documentation needs work, especially around the async method change. No notable security issues, though input validation could be added.
-
----
-
-### 18. codedog/models/pull_request.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 9 |
-| Security | 7 |
-| Structure & Design | 8 |
-| Error Handling | 5 |
-| Documentation & Comments | 6 |
-| Code Style | 9 |
-| **Overall** | **7.3** |
-
-**Comments**:
-
-Readability is good; renaming the field from `_raw` to `raw` matches public-attribute naming conventions. Removing the redundant Pydantic validator simplifies the model, but no migration note was provided. No performance regressions, though the deleted validator supplied defaults for None values; confirm the business logic tolerates None. Suggestions: 1. document why the `raw` field changed 2. assess compatibility after removing the None-value handling 3. declare explicit defaults for fields that may be None.
-
----
-
-### 19. examples/gitlab_review.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 6 |
-| Security | 5 |
-| Structure & Design | 7 |
-| Error Handling | 5 |
-| Documentation & Comments | 4 |
-| Code Style | 8 |
-| **Overall** | **6.0** |
-
-**Comments**:
-
-Readability and code style are good; the multi-line formatting makes the chained calls easier to read and follows PEP8. Structure and modularity improved, but the code lacks error handling (async calls not wrapped in try-catch), security practices (sensitive data/API keys unprotected), and documentation. Suggestions: 1. add exception handling to async methods 2. add function/module docstrings 3. validate the openai_proxy configuration 4. consider secure credential storage. The call pattern is reasonable, but there is no monitoring of execution time.
-
----
-
-### 20. codedog/retrievers/github_retriever.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 8 |
-| Efficiency & Performance | 10 |
-| Security | 7 |
-| Structure & Design | 8 |
-| Error Handling | 7 |
-| Documentation & Comments | 6 |
-| Code Style | 9 |
-| **Overall** | **7.9** |
-
-**Comments**:
-
-The change mainly renames the internal attribute '_raw' to the public attribute 'raw', improving readability and code style. Efficiency is unaffected, but note: 1) docs/comments not updated to the new name may confuse readers, so check related comments; 2) exposing the raw object risks unintended mutation, so assess whether it must be public or add read-only protection; 3) error-handling logic was not touched, so the existing exception handling must remain sound.
-
----
-
-### 21. examples/github_review.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 7 |
-| Security | 6 |
-| Structure & Design | 7 |
-| Error Handling | 5 |
-| Documentation & Comments | 5 |
-| Code Style | 8 |
-| **Overall** | **6.4** |
-
-**Comments**:
-
-Overall readability is good, but there is room for improvement:
-1. Removing the OPENAI_PROXY setup logic may affect network security; manage proxy configuration more safely
-2. No exception handling; add try-catch blocks around the async calls
-3. Documentation is still thin; add function docstrings and key parameter descriptions
-4. The removed visualize call has no replacement debugging aid, which may hurt maintainability
-5. Add fault-tolerance such as a timeout around the ainvoke call
-6. Consider keeping the extensible environment-variable configuration.
-
----
-
-### 22. codedog/utils/langchain_utils.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 8 |
-| Security | 7 |
-| Structure & Design | 6 |
-| Error Handling | 5 |
-| Documentation & Comments | 5 |
-| Code Style | 6 |
-| **Overall** | **6.3** |
-
-**Comments**:
-
-The code improves parameter naming and module migration, but has the following problems: 1. load_gpt4_llm ends with a duplicated return statement (a syntax error) that must be fixed; 2. there is no handling for missing environment variables; 3. functions need docstrings describing their purpose and parameter sources; 4. the Azure GPT-4 deployment ID parameter name does not match the actual environment variable name (AZURE_OPENAI_DEPLOYMENT_ID vs AZURE_OPENAI_GPT4_DEPLOYMENT_ID). Suggestions: a) remove the duplicate return statement b) add try-except blocks around API connections c) add function docstrings d) unify the environment-variable naming convention e) check API keys for empty values.
-
----
-
-### 23. codedog/retrievers/gitlab_retriever.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 7 |
-| Security | 6 |
-| Structure & Design | 8 |
-| Error Handling | 6 |
-| Documentation & Comments | 6 |
-| Code Style | 8 |
-| **Overall** | **6.9** |
-
-**Comments**:
-
-Overall readability is good; wrapping long lines at parameter boundaries helps. The structure is clear and modular (e.g. the _build_* family of methods), consistent with object-oriented design principles. The style follows PEP8, with well-placed line breaks in chained calls. Improvement points: 1. add validity checks on issue_number; 2. add exception catching around network requests and project fetches; 3. document method parameters and return values; 4. make the hard-coded LIST_DIFF_LIMIT configurable.
-
----
-
-### 24. codedog/chains/pr_summary/base.py
-
-- **Commit**: 6ce08110 - feat: update to langchain 0.2
-- **Date**: 2024-07-31 14:41
-- **Scores**:
-| Dimension | Score |
-|---------|----|
-| Readability | 7 |
-| Efficiency & Performance | 6 |
-| Security | 5 |
-| Structure & Design | 6 |
-| Error Handling | 5 |
-| Documentation & Comments | 6 |
-| Code Style | 8 |
-| **Overall** | **6.1** |
-
-**Comments**:
-
-Readability is good, with clear naming and consistent formatting, but there are unresolved TODO comments (e.g. truncation logic for long diffs). The async calls are reasonable, but truncating file content to the first 2000 characters may drop key information. Input validation (an unimplemented TODO) needs strengthening. Switching to a global processor instance may hurt testability; consider keeping it as a class member. Error handling relies on the LangChain framework with no custom exception catching. Docs are passable but parameter descriptions could be added. Style is excellent, matching PEP8 and LangChain conventions. Suggestions: 1) replace the global processor with dependency injection 2) implement input validation 3) resolve the TODO comments 4) add exception-handling logic.
-
----
-
-
-## Evaluation Statistics
-
-- **Evaluation Model**: deepseek-r1
-- **Evaluation Time**: 1295.79 seconds
-- **Tokens Used**: 37846
-- **Evaluation Cost**: $3.7846