Codedog is designed as a modular system to retrieve pull request (PR) / merge request (MR) information from Git platforms (GitHub, GitLab), process the changes using Large Language Models (LLMs) via the LangChain framework, and generate structured reports (summaries, code reviews).
The core workflow involves:
- Retrieval: Fetching PR/MR metadata, changed files, diffs, and related issues using platform-specific APIs.
- Processing: Preparing the retrieved data (diff content, metadata) into suitable formats for LLM prompts.
- LLM Interaction (Chains): Sending processed data to LLMs via predefined LangChain chains to generate summaries and reviews.
- Reporting: Formatting the LLM outputs into a user-friendly Markdown report.
The architecture emphasizes separation of concerns, making it easier to integrate additional platforms, LLMs, or reporting formats.
Pydantic BaseModels are used extensively to define the structure of data passed between different components. This ensures data consistency and leverages Pydantic's validation capabilities.
Key models include:
- `Repository`: Represents a Git repository (source or target).
- `Commit`: Represents a Git commit.
- `Issue`: Represents a linked issue.
- `Blob`: Represents file content at a specific commit.
- `DiffSegment` / `DiffContent`: Represent parsed diff information, using `unidiff` objects internally. Store added/removed line counts and content.
- `ChangeFile`: Represents a single file changed within the PR/MR. Includes metadata such as name, path, status (`ChangeStatus` enum: addition, modified, deletion, renaming, etc.), SHAs, URLs, and, crucially, the `DiffContent`.
- `PullRequest`: The central model representing the PR/MR. It aggregates information such as title, body, and URLs, and crucially contains lists of `ChangeFile` and related `Issue` objects, along with references to source/target `Repository` objects.
- `ChangeSummary`: A simple model holding the summary generated by an LLM for a specific `ChangeFile`.
- `PRSummary`: Holds the LLM-generated overall summary of the PR, including an overview text, a categorized `PRType` (feature, fix, etc.), and a list of `major_files` identified by the LLM.
- `CodeReview`: Represents the LLM-generated review/suggestions for a specific `ChangeFile`.
These models provide a platform-agnostic representation of the core Git concepts needed for the review process.
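A minimal sketch of a few of these models may help make the shape concrete. The real code uses Pydantic `BaseModel`s with validation; this sketch uses stdlib dataclasses for brevity, and the exact field names are assumptions:

```python
from dataclasses import dataclass, field
from enum import Enum


class ChangeStatus(Enum):
    """File-level change status, mirroring the ChangeStatus enum described above."""
    addition = "addition"
    modified = "modified"
    deletion = "deletion"
    renaming = "renaming"


@dataclass
class DiffContent:
    """Parsed diff information for one file (the real model wraps unidiff objects)."""
    add_count: int
    remove_count: int
    content: str


@dataclass
class ChangeFile:
    """A single file changed in the PR/MR."""
    full_name: str
    status: ChangeStatus
    diff_content: DiffContent


@dataclass
class PullRequest:
    """Central model aggregating PR metadata and its changed files."""
    title: str
    body: str
    change_files: list[ChangeFile] = field(default_factory=list)


pr = PullRequest(
    title="Add retry logic",
    body="Fixes #12",
    change_files=[
        ChangeFile(
            full_name="codedog/utils/http.py",  # illustrative path
            status=ChangeStatus.modified,
            diff_content=DiffContent(add_count=8, remove_count=2, content="@@ ..."),
        )
    ],
)
print(pr.change_files[0].status.value)  # modified
```

Because every downstream component consumes these models rather than raw API payloads, swapping GitHub for GitLab only changes how the models get populated.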
- Purpose: Abstract away the specifics of interacting with different Git hosting platforms (GitHub, GitLab). They fetch raw data and transform it into the project's internal Pydantic models.
- Design:
  - `Retriever` (ABC): Defines the common interface (`retriever_type`, `pull_request`, `repository`, `source_repository`, `changed_files`, `get_blob`, `get_commit`).
  - `GithubRetriever`: Implements `Retriever` using the `PyGithub` library.
    - Initializes with a `Github` client, repo name/ID, and PR number.
    - Maps `github.PullRequest`, `github.Repository`, `github.File`, `github.Issue`, etc., to `codedog` models (`_build_repository`, `_build_pull_request`, `_build_change_file`, `_build_issue`).
    - Parses diff content (`_parse_and_build_diff_content`) using `unidiff` via `codedog.utils.diff_utils`.
    - Extracts related issue numbers from the PR title/body (`_parse_issue_numbers`).
  - `GitlabRetriever`: Implements `Retriever` using the `python-gitlab` library.
    - Initializes with a `Gitlab` client, project name/ID, and MR IID.
    - Maps `gitlab.v4.objects.ProjectMergeRequest`, `gitlab.v4.objects.Project`, etc., to `codedog` models.
    - Handles differences in API responses (e.g., fetching diffs via `mr.diffs.list()` and then getting full diffs).
    - Uses similar logic for parsing diffs and issues.
- Interaction: Instantiated at the start of the workflow with platform credentials and target PR details. Its primary output is the populated `PullRequest` model object.
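The issue-linking step can be illustrated with a small regex sketch. The pattern and function shape below are assumptions for illustration; codedog's `_parse_issue_numbers` may differ in detail:

```python
import re

# Match GitHub/GitLab-style issue references such as "#123" in a PR/MR
# title or body. This pattern is an illustrative guess, not codedog's exact one.
ISSUE_REF = re.compile(r"#(\d+)")


def parse_issue_numbers(title: str, body: str) -> list[int]:
    """Collect unique issue numbers referenced in the PR title and body."""
    seen: set[int] = set()
    numbers: list[int] = []
    for text in (title, body or ""):
        for match in ISSUE_REF.finditer(text):
            n = int(match.group(1))
            if n not in seen:
                seen.add(n)
                numbers.append(n)
    return numbers


print(parse_issue_numbers("Fix crash (#42)", "Closes #42, relates to #7"))  # [42, 7]
```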
- Purpose: Process and prepare data, primarily the `PullRequest` object and its contents, for consumption by the LLM chains and reporters.
- Design:
  - `PullRequestProcessor`: The main processor.
    - `is_code_file` / `get_diff_code_files`: Filter `ChangeFile` objects to find relevant code files based on suffix and status (e.g., ignoring deleted files for review). Use `SUPPORT_CODE_FILE_SUFFIX` and `SUFFIX_LANGUAGE_MAPPING`.
    - `gen_material_*` methods (`gen_material_change_files`, `gen_material_code_summaries`, `gen_material_pr_metadata`): Format lists of `ChangeFile`s, `ChangeSummary`s, and PR metadata into structured text strings suitable for inclusion in LLM prompts, using templates from `codedog/templates`.
    - `build_change_summaries`: Maps the inputs and outputs of the code summary LLM chain back into `ChangeSummary` model objects.
  - Uses the `Localization` mixin to access language-specific templates.
- Interaction: Takes the `PullRequest` object from the Retriever and lists of `ChangeSummary` or `CodeReview` objects from the Chains. Produces formatted strings for LLM inputs and structured data for Reporters.
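The suffix/status filtering can be sketched as follows. The suffix set, status values, and use of plain dicts are illustrative stand-ins, not codedog's actual constants or models:

```python
# Illustrative stand-ins for SUPPORT_CODE_FILE_SUFFIX and the status check.
SUPPORT_CODE_FILE_SUFFIX = {"py", "js", "ts", "go", "java", "rs"}
REVIEWABLE_STATUSES = {"addition", "modified", "renaming"}  # deleted files are skipped


def is_code_file(full_name: str) -> bool:
    """True when the file extension is in the supported set."""
    suffix = full_name.rsplit(".", 1)[-1] if "." in full_name else ""
    return suffix in SUPPORT_CODE_FILE_SUFFIX


def get_diff_code_files(files: list[dict]) -> list[dict]:
    """Keep only reviewable code files, mirroring the filtering described above."""
    return [
        f for f in files
        if is_code_file(f["full_name"]) and f["status"] in REVIEWABLE_STATUSES
    ]


files = [
    {"full_name": "codedog/chain.py", "status": "modified"},
    {"full_name": "README.md", "status": "modified"},      # not a code suffix
    {"full_name": "old_module.py", "status": "deletion"},  # deleted: nothing to review
]
print([f["full_name"] for f in get_diff_code_files(files)])  # ['codedog/chain.py']
```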
- Purpose: Encapsulate the logic for interacting with LLMs using LangChain. Define prompts, LLM calls, and parsing of LLM outputs.
- Design:
  - Follows a pattern of subclassing `langchain.chains.base.Chain` (though migrating to LCEL is a future possibility). Uses `LLMChain` internally to combine prompts and LLMs.
  - `PRSummaryChain` (`chains/pr_summary/base.py`):
    - Orchestrates two `LLMChain` calls:
      - `code_summary_chain`: Summarizes individual code file diffs (using `CODE_SUMMARY_PROMPT`). Takes processed diff content as input. Uses `.apply` for batch processing.
      - `pr_summary_chain`: Summarizes the entire PR (using `PR_SUMMARY_PROMPT`). Takes processed PR metadata, file lists, and the results of the code summary chain as input.
    - Uses `PydanticOutputParser` (wrapped in `OutputFixingParser`) to parse the PR summary LLM output directly into the `PRSummary` Pydantic model. Relies on format instructions injected into the prompt.
    - `_process_*_input` methods prepare the dictionaries needed for `LLMChain.apply` or `LLMChain.__call__`.
    - `_process_result`: Packages the final `PRSummary` object and the list of `ChangeSummary` objects.
  - `CodeReviewChain` (`chains/code_review/base.py`):
    - Uses a single `LLMChain` (`code_review_chain`) with `CODE_REVIEW_PROMPT`.
    - Takes processed diff content for each relevant file as input. Uses `.apply` for batch processing.
    - `_process_result`: Maps LLM text outputs back to `CodeReview` objects, associating them with the original `ChangeFile`.
  - `Translate*Chain` variants (`chains/code_review/translate_*.py`, `chains/pr_summary/translate_*.py`):
    - Inherit from the base chains (`CodeReviewChain`, `PRSummaryChain`).
    - Add an additional `translate_chain` (an `LLMChain` with `TRANSLATE_PROMPT`).
    - Override `_process_result` (and `_aprocess_result`) to call the base method first and then pass the generated summaries/reviews through the `translate_chain` using `.apply` or `.aapply`.
  - Prompts (`chains/.../prompts.py`): Define `PromptTemplate` objects, often importing base templates from `codedog/templates/grimoire_en.py` and sometimes injecting parser format instructions.
- Interaction: Takes processed data from the `PullRequestProcessor`. Invokes LLMs via `langchain-openai` (or potentially others). Outputs structured data (`PRSummary`, `list[ChangeSummary]`, `list[CodeReview]`).
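The role of the output parser can be illustrated with a stdlib-only sketch: the LLM is instructed (via format instructions in the prompt) to emit JSON matching the `PRSummary` schema, and the parser turns that text into a typed object. The field names follow the model description above but the exact schema is an assumption; the real chain uses LangChain's `PydanticOutputParser` wrapped in an `OutputFixingParser` that can re-ask the LLM on malformed output:

```python
import json
from dataclasses import dataclass


@dataclass
class PRSummary:
    """Simplified stand-in for codedog's PRSummary model."""
    overview: str
    pr_type: str
    major_files: list[str]


def parse_pr_summary(llm_output: str) -> PRSummary:
    """Parse the LLM's JSON answer into a typed PRSummary object."""
    data = json.loads(llm_output)
    return PRSummary(
        overview=data["overview"],
        pr_type=data["pr_type"],
        major_files=list(data.get("major_files", [])),
    )


raw = (
    '{"overview": "Adds retry logic to the HTTP client.",'
    ' "pr_type": "feature",'
    ' "major_files": ["codedog/utils/http.py"]}'
)
summary = parse_pr_summary(raw)
print(summary.pr_type)  # feature
```

Parsing into a model (rather than passing raw text downstream) is what lets the reporters render typed fields like `pr_type` and `major_files` without re-interpreting LLM prose.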
- Purpose: Centralize all user-facing text (report formats) and LLM prompt instructions. Support multiple languages.
- Design:
- `grimoire_*.py`: Contain the core LLM prompt templates (e.g., `PR_SUMMARY`, `CODE_SUMMARY`, `CODE_SUGGESTION`, `TRANSLATE_PR_REVIEW`). These define the instructions given to the LLM.
- `template_*.py`: Contain f-string templates for formatting the final Markdown report (e.g., `REPORT_PR_REVIEW`, `REPORT_PR_SUMMARY`, `REPORT_CODE_REVIEW_SEGMENT`). Also include mappings such as `REPORT_PR_TYPE_DESC_MAPPING` and `MATERIAL_STATUS_HEADER_MAPPING`.
- `Localization` class: A simple class used as a mixin. It holds dictionaries mapping language codes ("en", "cn") to the corresponding template and grimoire modules, and provides `.template` and `.grimoire` properties to access the correct language resources based on the instance's `language`.
- Interaction:
  - Grimoires are used by `chains/.../prompts.py` to create `PromptTemplate`s.
  - Templates are used by `PullRequestProcessor` (for `gen_material_*`) and `actors/reporters` (for final report generation).
  - The `Localization` mixin is used by Processors and Reporters to get language-specific text.
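The mixin pattern can be sketched with stand-in modules. The module contents and constant names below are invented for illustration; only the en/cn lookup structure mirrors the description above:

```python
from types import SimpleNamespace

# Stand-ins for the template_*/grimoire_* modules (real ones are Python modules).
template_en = SimpleNamespace(REPORT_HEADER="# PR Review")
template_cn = SimpleNamespace(REPORT_HEADER="# PR 审查")
grimoire_en = SimpleNamespace(PR_SUMMARY="Summarize this pull request: {diff}")
grimoire_cn = SimpleNamespace(PR_SUMMARY="总结这个合并请求: {diff}")


class Localization:
    """Mixin resolving a language code to the matching template/grimoire module."""
    _templates = {"en": template_en, "cn": template_cn}
    _grimoires = {"en": grimoire_en, "cn": grimoire_cn}

    def __init__(self, language: str = "en"):
        self.language = language

    @property
    def template(self):
        return self._templates[self.language]

    @property
    def grimoire(self):
        return self._grimoires[self.language]


class Reporter(Localization):
    def report(self) -> str:
        # Any subclass picks up the right language's text transparently.
        return self.template.REPORT_HEADER


print(Reporter("en").report())  # # PR Review
```

Because the lookup lives in one mixin, adding a language means adding two modules and two dictionary entries rather than touching every processor and reporter.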
- Purpose: Take the final processed data (LLM outputs packaged in models) and format it into the desired output format (currently Markdown).
- Design:
- `Reporter` (ABC): Defines the `report()` method interface.
- `CodeReviewMarkdownReporter`: Takes a list of `CodeReview` objects. Iterates through them, formatting each using `template.REPORT_CODE_REVIEW_SEGMENT`, and wraps the result in `template.REPORT_CODE_REVIEW`.
- `PRSummaryMarkdownReporter`: Takes `PRSummary`, `list[ChangeSummary]`, and `PullRequest`. Uses helper methods (`_generate_pr_overview`, `_generate_change_overivew`, `_generate_file_changes`) and templates (`template.REPORT_PR_SUMMARY`, `template.REPORT_PR_SUMMARY_OVERVIEW`, etc.) to build the summary part of the report. Leverages `PullRequestProcessor` for some formatting.
- `PullRequestReporter`: The main reporter; it orchestrates the other two:
  - Takes all final data: `PRSummary`, `list[ChangeSummary]`, `PullRequest`, `list[CodeReview]`, and optional telemetry data.
  - Instantiates `PRSummaryMarkdownReporter` and `CodeReviewMarkdownReporter` internally and calls their respective `report()` methods.
  - Combines their outputs into the final overall report using `template.REPORT_PR_REVIEW`, adding headers, footers, and telemetry information.
- Interaction: Consumes the output models from the Chains (`PRSummary`, `CodeReview`, etc.) and the original `PullRequest` data. Uses `templates` for formatting. Produces the final string output.
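The reporter layering can be sketched as nested string templates. The template strings, function names, and dict-based inputs below are invented for illustration; the real reporters use the constants from `template_*.py` and the typed models:

```python
# Invented stand-ins for template_*.py constants.
REPORT_CODE_REVIEW_SEGMENT = "### {file_name}\n{review}\n"
REPORT_PR_REVIEW = "# Pull Request Review\n\n{summary}\n\n## Code Review\n{code_review}"


def render_code_reviews(reviews: list[dict]) -> str:
    """Roughly what CodeReviewMarkdownReporter does: one segment per CodeReview."""
    return "".join(
        REPORT_CODE_REVIEW_SEGMENT.format(file_name=r["file"], review=r["text"])
        for r in reviews
    )


def render_report(summary_md: str, reviews: list[dict]) -> str:
    """Roughly what PullRequestReporter does: combine sub-reports into one document."""
    return REPORT_PR_REVIEW.format(
        summary=summary_md,
        code_review=render_code_reviews(reviews),
    )


report = render_report(
    "This PR adds retry logic.",
    [{"file": "codedog/utils/http.py", "text": "Consider exponential backoff."}],
)
print(report.splitlines()[0])  # # Pull Request Review
```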
- Purpose: Provide common helper functions used across different modules.
- Design:
- `langchain_utils.py`: `load_gpt_llm()`, `load_gpt4_llm()`: Centralized functions to instantiate LangChain LLM objects (`ChatOpenAI` or `AzureChatOpenAI`). They read configuration from environment variables (`OPENAI_API_KEY`, `AZURE_OPENAI`, etc.) and use `@lru_cache` to avoid re-initializing models unnecessarily.
- `diff_utils.py`: `parse_diff()`, `parse_patch_file()`: Wrapper functions around the `unidiff` library that parse raw diff/patch strings into `unidiff.PatchSet` objects, simplifying usage in the retrievers.
- Interaction: Used by Retrievers (diff parsing) and the main application logic/Quickstart (LLM loading).
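The cached, env-driven loader pattern looks roughly like this. The stub `ChatOpenAI` class replaces the real `langchain_openai` import so the sketch is self-contained, and the model name and Azure-selection logic of the real functions are omitted:

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class ChatOpenAI:
    """Stub for langchain_openai.ChatOpenAI, just to keep the sketch runnable."""
    api_key: str
    model: str


@lru_cache(maxsize=1)
def load_gpt_llm() -> ChatOpenAI:
    """Build the LLM client once; subsequent calls return the cached instance."""
    return ChatOpenAI(
        api_key=os.environ.get("OPENAI_API_KEY", ""),
        model="gpt-3.5-turbo",  # illustrative default
    )


os.environ["OPENAI_API_KEY"] = "sk-example"  # illustrative value only
assert load_gpt_llm() is load_gpt_llm()  # cached: same object both times
```

The `@lru_cache` matters because chains call the loader repeatedly; without it, every `.apply` batch would pay client construction cost again.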
A typical run (based on the Quickstart) follows these steps:
- Initialization:
  - Load environment variables (API keys, etc.).
  - Instantiate a platform client (e.g., `github.Github`).
  - Instantiate the appropriate `Retriever` (e.g., `GithubRetriever`) with the client, repo, and PR number. The Retriever fetches initial data during init.
- LLM & Chain Setup:
  - Load required LLMs using `codedog.utils.langchain_utils` (e.g., `load_gpt_llm`, `load_gpt4_llm`).
  - Instantiate the required `Chain` objects (e.g., `PRSummaryChain.from_llm(...)`, `CodeReviewChain.from_llm(...)`), passing in the loaded LLMs.
- Execute Chains:
  - Call the summary chain (e.g., `summary_chain({"pull_request": retriever.pull_request}, ...)`). This triggers the internal processing, LLM calls for code summaries, the main PR summary, and parsing. The result includes `pr_summary` (a `PRSummary` object) and `code_summaries` (a `list[ChangeSummary]`).
  - Call the review chain (e.g., `review_chain({"pull_request": retriever.pull_request}, ...)`). This triggers LLM calls for each code file diff. The result includes `code_reviews` (a `list[CodeReview]`).
- Generate Report:
  - Instantiate the main `PullRequestReporter` with the results from the chains (`pr_summary`, `code_summaries`, `code_reviews`) and the original `retriever.pull_request` object. Optionally pass telemetry data. Specify the language if not default.
  - Call `reporter.report()` to get the final formatted Markdown string.
- Output: Print or save the generated report string.
```mermaid
sequenceDiagram
    participant User/Script
    participant Retriever
    participant LLM Utils
    participant Chains
    participant Processor
    participant Reporter
    participant Templates
    User/Script->>Retriever: Instantiate (client, repo, pr_num)
    Retriever-->>User/Script: retriever (with PullRequest model)
    User/Script->>LLM Utils: load_gpt_llm(), load_gpt4_llm()
    LLM Utils-->>User/Script: llm35, llm4
    User/Script->>Chains: Instantiate PRSummaryChain(llms)
    User/Script->>Chains: Instantiate CodeReviewChain(llm)
    Chains-->>User/Script: summary_chain, review_chain
    User/Script->>Chains: summary_chain(pull_request)
    Chains->>Processor: get_diff_code_files(pr)
    Processor-->>Chains: code_files
    Chains->>Processor: gen_material_*(...) for code summary inputs
    Processor->>Templates: Get formatting
    Templates-->>Processor: Formatting
    Processor-->>Chains: Formatted inputs
    Chains->>LLM Utils: Run code_summary_chain.apply(inputs)
    LLM Utils-->>Chains: Code summary outputs (text)
    Chains->>Processor: build_change_summaries(inputs, outputs)
    Processor-->>Chains: code_summaries (List[ChangeSummary])
    Chains->>Processor: gen_material_*(...) for PR summary inputs
    Processor->>Templates: Get formatting
    Templates-->>Processor: Formatting
    Processor-->>Chains: Formatted inputs
    Chains->>Templates: Get PR_SUMMARY prompt + format instructions
    Templates-->>Chains: Prompt
    Chains->>LLM Utils: Run pr_summary_chain(inputs)
    LLM Utils-->>Chains: PR summary output (text)
    Chains->>Chains: Parse output into PRSummary model
    Chains-->>User/Script: {'pr_summary': PRSummary, 'code_summaries': List[ChangeSummary]}
    User/Script->>Chains: review_chain(pull_request)
    Chains->>Processor: get_diff_code_files(pr)
    Processor-->>Chains: code_files
    Chains->>Processor: gen_material_*(...) for code review inputs
    Processor->>Templates: Get formatting
    Templates-->>Processor: Formatting
    Processor-->>Chains: Formatted inputs
    Chains->>Templates: Get CODE_SUGGESTION prompt
    Templates-->>Chains: Prompt
    Chains->>LLM Utils: Run chain.apply(inputs)
    LLM Utils-->>Chains: Code review outputs (text)
    Chains->>Chains: Map outputs to CodeReview models
    Chains-->>User/Script: {'code_reviews': List[CodeReview]}
    User/Script->>Reporter: Instantiate PullRequestReporter(results, pr)
    Reporter->>Reporter: Instantiate internal reporters
    Reporter->>Templates: Get report templates
    Templates-->>Reporter: Templates
    Reporter->>Processor: Use processor for some formatting
    Processor-->>Reporter: Formatted parts
    Reporter-->>User/Script: Final Markdown Report (string)
```
- Configuration is primarily handled via environment variables, loaded directly using `os.environ` (mainly in `codedog/utils/langchain_utils.py` for LLM keys/endpoints).
- Platform tokens (GitHub/GitLab) are expected to be passed during client initialization, typically sourced from the environment by the calling script.
- Modularity: Separating retrieval, processing, LLM interaction, and reporting allows for easier extension or modification (e.g., adding Bitbucket support would primarily involve creating a new Retriever).
- Platform Abstraction: The Pydantic models provide a common language internally, isolating most of the code from platform-specific details handled by the Retrievers.
- LangChain: Leverages LangChain for abstracting LLM interactions, prompt management, output parsing, and chain composition. Using `LLMChain` provides a structured way to handle prompts and models.
- Pydantic: Used for data validation and structure, and also leveraged by LangChain's `PydanticOutputParser` for reliable structured output from LLMs.
- Localization: Built-in support for different languages via separate template files and the `Localization` mixin.
- Error Handling: Currently somewhat basic; relies mainly on exceptions raised by underlying libraries (PyGithub, python-gitlab, LangChain). More robust handling could be added.
- Dependency Management: Uses Poetry for clear dependency specification and environment management.
- LCEL Migration: Update chains to use LangChain Expression Language (LCEL) instead of explicit `Chain` subclassing.
- Long Diff Handling: Implement strategies (chunking, map-reduce) to handle very large file diffs that exceed LLM context limits.
- Enhanced Error Handling: Add specific `try...except` blocks in retrievers and chains for better diagnostics.
- Configuration Flexibility: Potentially add support for configuration files in addition to environment variables; make the Azure API version configurable.
- Extensibility: Refine interfaces (e.g., `Retriever`, `Reporter`) to make adding new platforms or output formats even smoother.
- Testing: Expand test coverage, potentially adding more integration tests.
- Resolve Pydantic v1 Shim Warning: Investigate the lingering `LangChainDeprecationWarning` related to the `pydantic_v1` shim import path.