refactor: gha #33

Merged: 14 commits, Jul 15, 2025
a-klos commented on Jul 3, 2025

This pull request introduces a major restructuring of the repository to improve organization and streamline workflows. Key changes include migrating core libraries and services into separate libs and services directories, updating references across configuration files, and enhancing CI/CD workflows to support the new structure.

Repository Restructuring:

  • Removed submodules for rag-infrastructure and rag-core-library from .gitmodules, consolidating their contents into libs and services directories.
  • Updated paths in .vscode/launch.json to reflect the new directory structure for libs and services.
  • Adjusted Python analysis paths in .vscode/settings.json to point to the new libs directory.

Workflow Enhancements:

  • Replaced the update-submodules job with build-and-lint-services and build-and-lint-libs, introducing matrix strategies for linting and testing individual services and libraries.

Documentation Updates:

  • Updated README.md to reflect the new directory structure, replacing references to submodules with local paths and updating setup instructions.

Build and Makefile Updates:

  • Updated Makefile to use the new libs and services directories for linting, testing, and Docker builds.

a-klos added 10 commits July 3, 2025 11:46
- Remove unnecessary submodule update job (no longer needed in monorepo)
- Update service matrix to include all services (admin-backend, document-extractor, rag-backend, mcp-server)
- Add separate job for library testing (rag-core-lib, rag-core-api, admin-api-lib, extractor-api-lib)
- Fix Docker build paths to use services/ prefix for service Dockerfiles
- Update library builds to use libs/Dockerfile with proper DIRECTORY and TEST args
- Replace deprecated ::set-output with modern GITHUB_OUTPUT
- Remove submodule-related checkout options
- Ensure CI properly tests both services and libraries in the monorepo
- Fix library build context to use 'libs' instead of root directory
- Use TEST=0 for library linting (matching Tiltfile pattern)
- Use separate images for library lint vs test (TEST=0 vs TEST=1 with DIRECTORY)
- Remove unnecessary --entrypoint overrides (Dockerfile already sets correct entrypoint)
- Ensure CI builds match exactly how Tilt builds work locally
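
The switch from `::set-output` to `GITHUB_OUTPUT` noted in the commit list above can be illustrated with a minimal sketch. It assumes a step that sets a single-line output from Python; the helper name `set_output` and the output name `image_tag` are hypothetical and not taken from this repository's workflows.

```python
import os
import tempfile


def set_output(name: str, value: str) -> None:
    """Append a single-line step output to the file named by GITHUB_OUTPUT."""
    # The deprecated form printed a workflow command to stdout instead:
    #   print(f"::set-output name={name}::{value}")
    # The current form appends "name=value" to the file the runner provides.
    with open(os.environ["GITHUB_OUTPUT"], "a", encoding="utf-8") as handle:
        handle.write(f"{name}={value}\n")


if __name__ == "__main__":
    # Outside a runner the variable is unset, so point it at a temp file
    # to keep the sketch runnable locally.
    if "GITHUB_OUTPUT" not in os.environ:
        os.environ["GITHUB_OUTPUT"] = tempfile.NamedTemporaryFile(delete=False).name
    set_output("image_tag", "example")  # hypothetical output name and value
```

Multi-line values need the delimiter syntax described in the GitHub Actions documentation; this sketch only covers single-line values.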
This pull request introduces a new `.env.template` file to streamline
environment variable setup and updates the `README.md` to reflect these
changes. The updates clarify required and optional configurations for
running the application and improve documentation consistency.

### Environment Variable Setup:
*
[`.env.template`](diffhunk://#diff-749e06f64632f62a0c0dfbf4c4f3850e27e94ac109aa121fabd5c29469ae88deR1-R63):
Added a comprehensive template for environment variables, including
sections for S3 storage, authentication, Langfuse observability, LLM
provider API keys, and optional Confluence integration. This file
provides detailed descriptions and placeholders for required and
optional values.
*
[`README.md`](diffhunk://#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L176-R200):
Updated instructions to guide users on copying and editing the
`.env.template` file, highlighting key variables and their importance
for application functionality.
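
As a rough illustration of the required versus optional distinction described above, the sketch below parses a `.env`-style file and reports required keys that are missing or still empty. The parser, the helper names, and the `EXAMPLE_*` variable names are assumptions made for illustration; the actual keys are the ones listed in `.env.template`.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(values: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or left empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        parsed = parse_env(env_file.read_text(encoding="utf-8"))
        # Hypothetical key name; substitute the required keys from .env.template.
        print(missing_required(parsed, ["EXAMPLE_REQUIRED_KEY"]))
```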

### Documentation Improvements:
*
[`README.md`](diffhunk://#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L347-R345):
Renumbered and reorganized sections for clarity, including updates to
the environment variable setup and access via ingress.
…es (#47)

This pull request updates the `services/frontend` project to adopt ES
modules and TypeScript conventions more consistently. Key changes
include switching from CommonJS to ES modules, updating file extensions
in module paths, and removing unnecessary `.ts` extensions in
import/export statements.

### Migration to ES modules:

*
[`services/frontend/libs/shared/utils/package.json`](diffhunk://#diff-d64f34af5787aca8eb7a66b583933473526c15edc5cf6338d2c09f38ff7b2b7aL7-R8):
Changed the `type` field from `commonjs` to `module` and updated the
`main` field to point to `index.ts` instead of `index.js`.
*
[`services/frontend/libs/shared/utils/tsconfig.json`](diffhunk://#diff-b4f2b54a18d6e55f43202d452b71ea8895bbff0cd789330f84ce5ff4a6eb6237L4-R4):
Updated the `module` compiler option from `commonjs` to `esnext`.

### TypeScript file extension updates:

*
[`services/frontend/libs/shared/utils/src/index.ts`](diffhunk://#diff-c292470fa6a226ffe1c183d3e1d9909a9e8b950746a66eb65027153975ca7a49L1-R6):
Removed `.ts` extensions from import/export paths for better
compatibility with ES modules.
*
[`services/frontend/tsconfig.base.json`](diffhunk://#diff-ecb84aeedd2fa486fc966df39b040d031e6576ef6343bd30d2f4f31bcb76173aL26-R31):
Updated paths to include `.ts` extensions for module resolution
consistency.
a-klos added 4 commits July 15, 2025 07:24
This pull request includes changes to improve the organization of
backend services, enhance debugging capabilities, and refactor
asynchronous handling in the `admin-api-lib`. The most significant
updates involve restructuring file paths for services, adding debugging
arguments, and transitioning from threads to asyncio tasks for better
concurrency management.

### Backend Service Path Updates:
*
[`.vscode/launch.json`](diffhunk://#diff-bd5430ee7c51dc892a67b3f2829d1f5b6d223f0fd48b82322cfd45baf9f5e945L33-R34):
Updated `localRoot` and `remoteRoot` paths to reflect the new
organization under the `services` directory for `rag-backend`,
`document-extractor`, and `admin-backend`.
[[1]](diffhunk://#diff-bd5430ee7c51dc892a67b3f2829d1f5b6d223f0fd48b82322cfd45baf9f5e945L33-R34)
[[2]](diffhunk://#diff-bd5430ee7c51dc892a67b3f2829d1f5b6d223f0fd48b82322cfd45baf9f5e945L64-R65)
[[3]](diffhunk://#diff-bd5430ee7c51dc892a67b3f2829d1f5b6d223f0fd48b82322cfd45baf9f5e945L95-R100)
*
[`Tiltfile`](diffhunk://#diff-c2ee8653e1d6b85f0aadf87cd438a9250806c052877248442be4d434cbc52425L161-R161):
Adjusted `sync` paths in `live_update` for backend services to align
with the new directory structure.
[[1]](diffhunk://#diff-c2ee8653e1d6b85f0aadf87cd438a9250806c052877248442be4d434cbc52425L161-R161)
[[2]](diffhunk://#diff-c2ee8653e1d6b85f0aadf87cd438a9250806c052877248442be4d434cbc52425L202-R202)
[[3]](diffhunk://#diff-c2ee8653e1d6b85f0aadf87cd438a9250806c052877248442be4d434cbc52425L234-R234)
[[4]](diffhunk://#diff-c2ee8653e1d6b85f0aadf87cd438a9250806c052877248442be4d434cbc52425L278-R278)
*
[`infrastructure/rag/values.yaml`](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3L72-R73):
Changed `--reload-dir` paths for backend services to match the updated
directory structure.
[[1]](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3L72-R73)
[[2]](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3L242-R244)
[[3]](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3L320-R323)

### Debugging Enhancements:
*
[`infrastructure/rag/values.yaml`](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3R58):
Added `-Xfrozen_modules=off` to `debugArgs` for `backend`,
`adminBackend`, and `extractor` to improve Python debugging
capabilities.
[[1]](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3R58)
[[2]](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3R229)
[[3]](diffhunk://#diff-673dd2d3d4e66a8fd4e45f9c1c9900711313f946bf8b6a89e96c954988fc14f3R308)

### Async Refactoring:
*
[`libs/admin-api-lib/src/admin_api_lib/impl/api_endpoints/default_file_uploader.py`](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL73-R72):
Refactored from threads to asyncio tasks for background processing,
including changes to `_prune_background_tasks` and usage of
`asyncio.to_thread` for blocking calls.
[[1]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL73-R72)
[[2]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL104-R107)
[[3]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL147-R154)
[[4]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL159-R167)
[[5]](diffhunk://#diff-2f472a7af7eccd952a8fc1daa5e849c9cd0649fae490380ff2c421bb57e16feeL172-R181)
*
[`libs/admin-api-lib/src/admin_api_lib/impl/api_endpoints/default_source_uploader.py`](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L165-R167):
Applied similar refactoring with `asyncio.to_thread` for blocking calls
in `_handle_source_upload`.
[[1]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L165-R167)
[[2]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L179-R182)
[[3]](diffhunk://#diff-3c2003f1db64f5b9610d691e451bb2ea5f65a92f1a81b5b210588931a4ecb455L193-R197)
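
The thread-to-task pattern described above could look roughly like the following sketch. It is not the code from `default_file_uploader.py`: the class `BackgroundUploader`, its methods other than `_prune_background_tasks`, and the simulated blocking call are hypothetical stand-ins that only show how `asyncio.create_task`, task pruning, and `asyncio.to_thread` fit together.

```python
import asyncio
import time


class BackgroundUploader:
    """Hypothetical uploader that tracks background work as asyncio tasks."""

    def __init__(self) -> None:
        self._background_tasks: list[asyncio.Task] = []

    def _prune_background_tasks(self) -> None:
        # Keep only tasks that are still running so the list does not grow forever.
        self._background_tasks = [t for t in self._background_tasks if not t.done()]

    def _blocking_work(self, name: str) -> str:
        # Stand-in for a blocking call (for example a synchronous HTTP request).
        time.sleep(0.01)
        return f"processed {name}"

    async def _handle_upload(self, name: str) -> str:
        # Run the blocking call in a worker thread without blocking the event loop.
        return await asyncio.to_thread(self._blocking_work, name)

    async def upload(self, name: str) -> asyncio.Task:
        self._prune_background_tasks()
        task = asyncio.create_task(self._handle_upload(name))
        self._background_tasks.append(task)  # hold a reference until the task is done
        return task


async def main() -> list[str]:
    uploader = BackgroundUploader()
    tasks = [await uploader.upload(name) for name in ("a.pdf", "b.pdf")]
    return list(await asyncio.gather(*tasks))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keeping a reference to each task matters because the event loop holds only weak references to tasks, so an unreferenced task can be garbage-collected before it finishes.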

### Test Updates:
*
[`libs/admin-api-lib/tests/default_file_uploader_test.py`](diffhunk://#diff-0228f4f920d182dd8dae216a018d88c345d3aa732337348ad1c0030060450f55L110-R110):
Updated tests to verify the creation and handling of asyncio background
tasks instead of threads.
[[1]](diffhunk://#diff-0228f4f920d182dd8dae216a018d88c345d3aa732337348ad1c0030060450f55L110-R110)
[[2]](diffhunk://#diff-0228f4f920d182dd8dae216a018d88c345d3aa732337348ad1c0030060450f55R130-R138)
This pull request removes outdated and unused files from the repository,
including documentation, licensing, configuration, and scripts. It also
updates semantic-release configurations for better compatibility and
consistency. Below is a breakdown of the most important changes.

### Removal of outdated files:

*
[`libs/CODE_OF_CONDUCT.md`](diffhunk://#diff-941504b808088f0db84affbb072240264079b3365d5679c15a2cbe69e557805fL1-L133):
Removed the Contributor Covenant Code of Conduct file from `libs`.
*
[`libs/CONTRIBUTING.md`](diffhunk://#diff-53ac97f0a1162dcb92d47c6e1647c58a5c1949e6762a3128ddaa8adcdda88a95L1-L21):
Deleted the contributing guidelines from `libs`.
*
[`libs/LICENSE`](diffhunk://#diff-4b3a7ac52ca41be7125ddbd451d9e4310a7438c779d38184e80a68a24314b8ebL1-L201):
Removed the Apache License file from `libs`.

### Removal of unused scripts:

*
[`libs/api-generator.sh`](diffhunk://#diff-b3b27103dd9315710da31113a6f791074735b27d3bdff3e99195579c89555250L1-L49):
Deleted an unused script for generating API clients.

### Updates to semantic-release configurations:

*
[`libs/package.json`](diffhunk://#diff-17bbb976b777dae36c47f83a12ceb90ee85cb16f71ed831084b659d3340646d3L1-L8):
Removed semantic-release-related dependencies (`@semantic-release/git`,
`@semantic-release/github`, `semantic-release`) and added
`conventional-changelog-conventionalcommits`. This aligns the
configuration with conventional commit standards.
[[1]](diffhunk://#diff-17bbb976b777dae36c47f83a12ceb90ee85cb16f71ed831084b659d3340646d3L1-L8)
[[2]](diffhunk://#diff-7ae45ad102eab3b6d7e7896acd08c427a9b25b346470d7bc6507b6481575d519R5)
*
[`libs/release.config.js`](diffhunk://#diff-ab495aae7c2cc8ed72e4ae0cc673c298a0cd42b939439ecfe59dc1198c16e30dL1-L15):
Updated plugin configurations to use conventional commit presets for
commit analysis and release notes generation.
[[1]](diffhunk://#diff-ab495aae7c2cc8ed72e4ae0cc673c298a0cd42b939439ecfe59dc1198c16e30dL1-L15)
[[2]](diffhunk://#diff-ac88fd8e5da48b4c325de01dc00bc9609325e9bdb489c5a1abc781d538a1579eL4-R9)

### Adjustments to API generation scripts:

*
[`services/mcp-server/api-generator.sh`](diffhunk://#diff-7a7cf95b1a229f31982e8d5980913dd92e3c482e23e354d1ab8d5460727ca860L4-R7):
Updated paths in the script to reflect changes in directory structure,
ensuring compatibility with the new project layout.
This pull request refactors the handling of language model providers and
enhances the robustness of AI-generated outputs across multiple files.
Key changes include replacing the `llm_provider` function with the new
`chat_model_provider`, improving error handling for AI responses, and
adding support for output parsers in specific chains.

### Refactoring and Provider Updates:
*
[`libs/admin-api-lib/src/admin_api_lib/dependency_container.py`](diffhunk://#diff-8b7c1816cb3e0a40b7965721c550eefdc184c5d914ec023e36527255613381e7L64-R63):
Replaced `llm_provider` with `chat_model_provider` for `ollama` and
`stackit` models, and updated the Singleton initialization to use
provider-specific strings ("ollama" and "openai").
[[1]](diffhunk://#diff-8b7c1816cb3e0a40b7965721c550eefdc184c5d914ec023e36527255613381e7L64-R63)
[[2]](diffhunk://#diff-8b7c1816cb3e0a40b7965721c550eefdc184c5d914ec023e36527255613381e7L111-R111)
*
[`libs/rag-core-api/src/rag_core_api/dependency_container.py`](diffhunk://#diff-483b37f4ebbc24c973c3b170542171d90c65f3c6b68f1a6d598ce8964a94be7bL66-R64):
Similar updates to replace `llm_provider` with `chat_model_provider` and
update Singleton initialization for `ollama`, `stackit`, and `fake`
models. Also fixed a key mapping issue by changing `api_key` to
`openai_api_key`.
[[1]](diffhunk://#diff-483b37f4ebbc24c973c3b170542171d90c65f3c6b68f1a6d598ce8964a94be7bL66-R64)
[[2]](diffhunk://#diff-483b37f4ebbc24c973c3b170542171d90c65f3c6b68f1a6d598ce8964a94be7bL186-R186)
[[3]](diffhunk://#diff-483b37f4ebbc24c973c3b170542171d90c65f3c6b68f1a6d598ce8964a94be7bL255-R253)

### AI Output Handling:
*
[`libs/admin-api-lib/src/admin_api_lib/impl/summarizer/langchain_summarizer.py`](diffhunk://#diff-9793b1081628436dd7d5a0e37abc9d79ee5e25af3f5e784f99379249809ed8dbL80-R90):
Enhanced error handling to ensure AIMessage content is extracted
properly and converted to a string if necessary.
*
[`libs/rag-core-api/src/rag_core_api/impl/graph/chat_graph.py`](diffhunk://#diff-eeb8dc30c9a5ef343841996872be6cce0c890210aa765d0e32c248b4e120ede1R209-R221):
Added logic to validate and convert AI-generated responses
(`rephrased_question` and `answer_text`) into strings.
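
A minimal sketch of this kind of normalization is shown below. It is not the code from `langchain_summarizer.py` or `chat_graph.py`: `FakeMessage` is a hypothetical stand-in for a message object with a `content` attribute, and the handling of list-shaped content (plain strings and dicts with a `text` key) is an assumption about what such content may look like.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat message object with a `content` field."""

    content: Any


def to_text(response: Any) -> str:
    """Return the textual content of a model response as a plain string."""
    # Unwrap message-like objects; plain values pass through unchanged.
    content = getattr(response, "content", response)
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts: list[str] = []
        for block in content:
            if isinstance(block, str):
                parts.append(block)
            elif isinstance(block, dict) and "text" in block:
                parts.append(str(block["text"]))
            # Other block types (for example images) carry no text and are skipped.
        return "".join(parts)
    # Fall back to the string form for anything else.
    return str(content)


if __name__ == "__main__":
    print(to_text(FakeMessage(["Hello, ", {"type": "text", "text": "world"}])))
```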

### Output Parser Integration:
*
[`libs/rag-core-api/src/rag_core_api/impl/answer_generation_chains/answer_generation_chain.py`](diffhunk://#diff-383d1b454a8f86bcd6e83561f9404bf94b3fe9877d1a045bcb3586105b0fa5cdR7):
Integrated `StrOutputParser` into the chain creation process for better
output parsing.
[[1]](diffhunk://#diff-383d1b454a8f86bcd6e83561f9404bf94b3fe9877d1a045bcb3586105b0fa5cdR7)
[[2]](diffhunk://#diff-383d1b454a8f86bcd6e83561f9404bf94b3fe9877d1a045bcb3586105b0fa5cdR66)
*
[`libs/rag-core-api/src/rag_core_api/impl/answer_generation_chains/rephrasing_chain.py`](diffhunk://#diff-8133279f7c0d324fdfc28838bb487748c091179dc11fc23c79c4f3f489bc3827R5):
Added `StrOutputParser` to the chain creation for consistent output
handling.
[[1]](diffhunk://#diff-8133279f7c0d324fdfc28838bb487748c091179dc11fc23c79c4f3f489bc3827R5)
[[2]](diffhunk://#diff-8133279f7c0d324fdfc28838bb487748c091179dc11fc23c79c4f3f489bc3827L59-R63)

### Dependency and Test Updates:
*
[`libs/rag-core-lib/pyproject.toml`](diffhunk://#diff-b19ab043535569caf9345971969d115d6515ae951a21b00a278145a28230fba1L24-R25):
Updated `langchain-core` to version `0.3.68` and added
`langchain-openai` as a dependency. Adjusted per-file ignores for
linting rules.
[[1]](diffhunk://#diff-b19ab043535569caf9345971969d115d6515ae951a21b00a278145a28230fba1L24-R25)
[[2]](diffhunk://#diff-b19ab043535569caf9345971969d115d6515ae951a21b00a278145a28230fba1L70-R75)
*
[`libs/rag-core-lib/src/rag_core_lib/impl/llms/llm_factory.py`](diffhunk://#diff-eb6012372d14a0c91bfa090c1e785605696f2f620e615f9e8602a00708a7ebe5L1-R88):
Replaced `llm_provider` with `chat_model_provider`, introduced
`_PROVIDER_KEY_MAP` for unified key mapping, and refactored the logic to
initialize chat models using `init_chat_model`.
*
[`libs/rag-core-lib/tests/chat_model_provider_test.py`](diffhunk://#diff-8633e65b39b350a66e34674e3f82877e240fa9a3e3877039cf2de7113d54c785R1-R59):
Added a new test script to validate the functionality of
`chat_model_provider` with the updated approach.
*
[`libs/rag-core-lib/tests/dummy6_test.py`](diffhunk://#diff-56619d1732b9dfdb782f77495f216dde9b297e7b9244a045f9b0d93c113368c5L1-L7):
Removed the dummy test file as part of cleanup.
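
The key-mapping idea behind `_PROVIDER_KEY_MAP` can be sketched as follows. Only the `api_key` to `openai_api_key` rename comes from the description above; the rest of the map, the function name `build_model_kwargs`, and the settings shape are assumptions. The sketch stops before the model is created; in the refactored factory the resulting keyword arguments would presumably be handed to `init_chat_model`.

```python
from typing import Any

# Hypothetical contents: generic setting name -> provider-specific keyword name.
_PROVIDER_KEY_MAP: dict[str, dict[str, str]] = {
    "openai": {"api_key": "openai_api_key"},
    "ollama": {},
}


def build_model_kwargs(provider: str, settings: dict[str, Any]) -> dict[str, Any]:
    """Rename generic setting keys to the names a given provider expects."""
    key_map = _PROVIDER_KEY_MAP.get(provider, {})
    # Unset (None) values are dropped so provider defaults stay in effect.
    return {
        key_map.get(key, key): value
        for key, value in settings.items()
        if value is not None
    }


if __name__ == "__main__":
    generic = {"api_key": "example-key", "model": "example-model", "temperature": None}
    print(build_model_kwargs("openai", generic))
```

Centralizing the renames in one table keeps provider quirks out of the dependency containers, which then only need to pass a provider string such as "ollama" or "openai".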
a-klos merged commit 4d29fb0 into fix/migration-issues on Jul 15, 2025
2 checks passed
a-klos deleted the refactor/gha branch on Jul 15, 2025 10:20