Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 9 additions & 1 deletion .env.sample
Original file line number Diff line number Diff line change
Expand Up @@ -52,11 +52,19 @@ DEEPSEEK_R1_API_BASE="https://api.deepseek.com"
DEEPSEEK_R1_MODEL="deepseek-reasoner"

# ===== 模型选择配置 =====
# 可选值: "gpt-3.5", "gpt-4o", "deepseek"
# 可选值: "gpt-3.5", "gpt-4", "gpt-4o", "deepseek", "deepseek-r1" 或任何 OpenAI 模型名称
CODE_SUMMARY_MODEL="gpt-3.5"
PR_SUMMARY_MODEL="gpt-3.5"
CODE_REVIEW_MODEL="gpt-3.5"

# 特定模型版本配置
# GPT-3.5 模型名称,默认为 "gpt-3.5-turbo"
# GPT35_MODEL="gpt-3.5-turbo-16k"
# GPT-4 模型名称,默认为 "gpt-4"
# GPT4_MODEL="gpt-4-turbo"
# GPT-4o 模型名称,默认为 "gpt-4o"
# GPT4O_MODEL="gpt-4o-mini"

# ===== 电子邮件通知配置 =====
# 启用电子邮件通知
EMAIL_ENABLED="false"
Expand Down
113 changes: 56 additions & 57 deletions UPDATES.md
Original file line number Diff line number Diff line change
@@ -1,84 +1,83 @@
# CodeDog项目更新说明
# CodeDog Project Updates

## 更新内容
## Latest Updates

### 1. 改进评分系统
### 1. Improved Scoring System
- Enhanced the scoring system to provide more accurate and comprehensive code evaluations
- Added detailed scoring criteria for each dimension
- Implemented weighted scoring for different aspects of code quality

我们对代码评估系统进行了以下改进:
### 2. Evaluation Dimensions
The evaluation now covers the following dimensions:
- Readability: Code clarity and understandability
- Efficiency & Performance: Code execution speed and resource usage
- Security: Code security practices and vulnerability prevention
- Structure & Design: Code organization and architectural design
- Error Handling: Robustness in handling errors and edge cases
- Documentation & Comments: Code documentation quality and completeness
- Code Style: Adherence to coding standards and best practices

- **评分系统升级**:从5分制升级到更详细的10分制评分系统
- **评分维度更新**:使用更全面的评估维度
- 可读性 (Readability)
- 效率与性能 (Efficiency & Performance)
- 安全性 (Security)
- 结构与设计 (Structure & Design)
- 错误处理 (Error Handling)
- 文档与注释 (Documentation & Comments)
- 代码风格 (Code Style)
- **详细评分标准**:为每个评分范围(1-3分、4-6分、7-10分)提供了明确的标准
- **报告格式优化**:改进了评分报告的格式,使其更加清晰明了
### 3. Enhanced Error Handling
- Improved timeout handling for API requests
- Added detailed error logging
- Implemented better error recovery mechanisms

### 2. 修复DeepSeek API调用问题
### 4. Performance Optimizations
- Reduced API call latency
- Optimized memory usage
- Improved concurrent request handling

修复了DeepSeek API调用问题,特别是"deepseek-reasoner不支持连续用户消息"的错误:
- 将原来的两个连续HumanMessage合并为一个消息
- 确保消息格式符合DeepSeek API要求
### 5. Documentation Updates
- Added comprehensive API documentation
- Updated user guides
- Improved code examples and tutorials

### 3. 改进电子邮件通知系统
## Running the Project

- 增强了错误处理,提供更详细的故障排除信息
- 添加了Gmail应用密码使用的详细说明
- 更新了.env文件中的SMTP配置注释,使其更加明确
- 新增了详细的电子邮件设置指南 (docs/email_setup.md)
- 开发了高级诊断工具 (test_email.py),帮助用户测试和排查邮件配置问题
- 改进了Gmail SMTP认证错误的诊断信息,提供明确的步骤解决问题
### Environment Setup

## 运行项目
1. Ensure the .env file is properly configured, especially:
- Platform tokens (GitHub or GitLab)
- LLM API keys (OpenAI, DeepSeek, etc.)
- SMTP server settings (if email notifications are enabled)

### 环境设置
2. If using Gmail for email notifications:
- Enable two-factor authentication for your Google account
- Generate an app-specific password (https://myaccount.google.com/apppasswords)
- Use the app password in your .env file

1. 确保已正确配置.env文件,特别是:
- 平台令牌(GitHub或GitLab)
- LLM API密钥(OpenAI、DeepSeek等)
- SMTP服务器设置(如果启用邮件通知)
### Running Commands

2. 如果使用Gmail发送邮件通知,需要:
- 启用Google账户的两步验证
- 生成应用专用密码(https://myaccount.google.com/apppasswords)
- 在.env文件中使用应用密码

### 运行命令

1. **评估开发者代码**:
1. **Evaluate Developer Code**:
```bash
python run_codedog.py eval "开发者名称" --start-date YYYY-MM-DD --end-date YYYY-MM-DD
python run_codedog.py eval "developer_name" --start-date YYYY-MM-DD --end-date YYYY-MM-DD
```

2. **审查PR/MR**
2. **Review PR/MR**:
```bash
# GitHub PR审查
python run_codedog.py pr "仓库名称" PR编号
# GitHub PR review
python run_codedog.py pr "repository_name" PR_number

# GitLab MR审查
python run_codedog.py pr "仓库名称" MR编号 --platform gitlab
# GitLab MR review
python run_codedog.py pr "repository_name" MR_number --platform gitlab

# 自托管GitLab实例
python run_codedog.py pr "仓库名称" MR编号 --platform gitlab --gitlab-url "https://your.gitlab.instance.com"
# Self-hosted GitLab instance
python run_codedog.py pr "repository_name" MR_number --platform gitlab --gitlab-url "https://your.gitlab.instance.com"
```

3. **设置Git钩子**:
3. **Set up Git Hooks**:
```bash
python run_codedog.py setup-hooks
```

### 注意事项
### Important Notes

- 对于较大的代码差异,可能会遇到上下文长度限制。在这种情况下,考虑使用`gpt-4-32k`或其他有更大上下文窗口的模型。
- DeepSeek模型有特定的消息格式要求,请确保按照上述修复进行使用。
- For large code diffs, you may encounter context length limits. In such cases, consider using `gpt-4-32k` or other models with larger context windows.
- DeepSeek models have specific message format requirements, please ensure to follow the fixes mentioned above.

## 进一步改进方向
## Future Improvements

1. 实现更好的文本分块和处理,以处理大型代码差异
2. 针对不同文件类型的更专业评分标准
3. 进一步改进报告呈现,添加可视化图表
4. 与CI/CD系统的更深入集成
1. Implement better text chunking and processing for handling large code diffs
2. Develop more specialized scoring criteria for different file types
3. Further improve report presentation with visual charts
4. Deeper integration with CI/CD systems
11 changes: 11 additions & 0 deletions codedog/analysis_results_20250424_095117.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
{
"summary": {
"total_commits": 0,
"total_files": 0,
"total_additions": 0,
"total_deletions": 0,
"files_changed": []
},
"commits": [],
"file_diffs": {}
}
80 changes: 80 additions & 0 deletions codedog/analyze_code.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
"""
Code analysis module for GitHub and GitLab repositories.
Provides functionality to analyze code changes and generate reports.
"""

from datetime import datetime, timedelta
import json
from pathlib import Path
from utils.remote_repository_analyzer import RemoteRepositoryAnalyzer

def format_commit_for_json(commit):
"""Format commit data for JSON serialization."""
return {
'hash': commit.hash,
'author': commit.author,
'date': commit.date.isoformat(),
'message': commit.message,
'files': commit.files,
'added_lines': commit.added_lines,
'deleted_lines': commit.deleted_lines,
'effective_lines': commit.effective_lines
}

def save_analysis_results(output_path, commits, file_diffs, stats, show_diffs=False):
"""
Save analysis results to a JSON file.
Args:
output_path: Path where to save the JSON file
commits: List of commit objects
file_diffs: Dictionary of file diffs
stats: Dictionary containing analysis statistics
show_diffs: Whether to include file diffs in the output
"""
results = {
'summary': {
'total_commits': stats['total_commits'],
'total_files': len(stats['files_changed']),
'total_additions': stats['total_additions'],
'total_deletions': stats['total_deletions'],
'files_changed': sorted(stats['files_changed'])
},
'commits': [format_commit_for_json(commit) for commit in commits]
}

if show_diffs:
results['file_diffs'] = file_diffs

output_path = Path(output_path)
output_path.parent.mkdir(parents=True, exist_ok=True)

with open(output_path, 'w', encoding='utf-8') as f:
json.dump(results, f, indent=2, ensure_ascii=False)

def analyze_repository(repo_url, author, days=7, include=None, exclude=None, token=None):
"""
Analyze a Git repository and return the analysis results.

Args:
repo_url: URL of the repository to analyze
author: Author name or email to filter commits
days: Number of days to look back (default: 7)
include: List of file extensions to include
exclude: List of file extensions to exclude
token: GitHub/GitLab access token

Returns:
Tuple of (commits, file_diffs, stats)
"""
end_date = datetime.now()
start_date = end_date - timedelta(days=days)

analyzer = RemoteRepositoryAnalyzer(repo_url, token)

return analyzer.get_file_diffs_by_timeframe(
author=author,
start_date=start_date,
end_date=end_date,
include_extensions=include,
exclude_extensions=exclude
)
2 changes: 1 addition & 1 deletion codedog/chains/pr_summary/translate_pr_summary_chain.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
from langchain.chains import LLMChain
from langchain.output_parsers import OutputFixingParser, PydanticOutputParser
from langchain_core.prompts import BasePromptTemplate
from langchain_core.pydantic_v1 import Field
from pydantic import Field

from codedog.chains.pr_summary.base import PRSummaryChain
from codedog.chains.pr_summary.prompts import CODE_SUMMARY_PROMPT, PR_SUMMARY_PROMPT
Expand Down
Loading
Loading