SDKTECHNO-259 Rewrite the README files for app technologies + the principal README
zannelo committed Mar 25, 2024
1 parent 28c2267 commit 66623db
Showing 66 changed files with 1,033 additions and 949 deletions.
81 changes: 46 additions & 35 deletions README.md
@@ -1,6 +1,5 @@
# Saagie Technologies


[![GitHub release](https://img.shields.io/github/release/saagie/technologies?style=for-the-badge)][releases]
[![GitHub release date](https://img.shields.io/github/release-date/saagie/technologies?style=for-the-badge&color=blue)][releases]

@@ -19,68 +18,80 @@
[build_promote]: https://github.com/saagie/technologies/actions?query=workflow%3APROMOTE
[build_modified]: https://github.com/saagie/technologies/actions?query=workflow%3A%22BUILD+ONLY+MODIFIED%22

This repository contains all certified technologies used in Saagie.
It also contains some experimental technologies before being certified.
This repository contains all the certified technologies used in Saagie.
It also includes some technologies that are still in the experimental phase.

For information about using these technologies with your Saagie platform, refer to [Saagie's official documentation](https://docs.saagie.io/product/latest/sdk/index.html).
For more information on how to use these technologies with your Saagie platform, see the <a href="https://docs.saagie.io/user/latest/developer/sdk/" target="_blank">official Saagie documentation</a>.

## CONTRIBUTING
## How to contribute to this repository?

All contributions are made via Github Pull Requests.
All contributions are made with GitHub pull requests.

### Build
## About the build

The build is using [Gradle](https://gradle.org/)
The build uses <a href="https://gradle.org/" target="_blank">Gradle</a>.

Launch the Gradle task `localBuildModifiedJobs` or `localBuildModifiedApps` to build locally the technology modified locally:
Launch the Gradle task `localBuildModifiedJobs` or `localBuildModifiedApps` to build the modified technology locally:

./gradlew localBuildModifiedJobs
./gradlew localBuildModifiedApps

### CI
### Continuous Integration (CI)

The build is running using a Github Action workflow (build only modified). It builds only technologies modified and generate a pre release containing assets. The name of the pre release = current version + name of the branch.
The build is executed using a GitHub Actions workflow called <a href="https://github.com/saagie/technologies/blob/a32ff207775fb3c53bb3874e384b7b90323e78e5/.github/workflows/buildOnlyModified.yml" target="_blank">`BUILD ONLY MODIFIED`</a>. It builds only the modified technologies and generates a pre-release containing the assets. The name of the pre-release is `the current version + the branch name`.

### Structure

The build is using the [Gradle plugin for Saagie technologies repository](https://github.com/saagie/technologies-plugin). It expects a specific directory structure:
- `technologies`: the required base directory. It contains sub directories for each type of technology metadata:
-- `app`: contains folders of each app technology
-- `job`: contains folders of each job technology
-- `connectiontype`: contains folders of each connection type
-- `scripts` is a sub directory dedicated to external technologies which share the same javascript files
The build is based on our <a href="https://github.com/saagie/technologies-plugin" target="_blank">Gradle plugin for the Saagie technology repository</a>. It expects to see a specific directory structure, which includes:

- The `technologies` directory: This is the base directory required. It contains subdirectories for each type of technology metadata.
- The `app` subdirectory: It contains folders for each app technology.
- The `job` subdirectory: It contains folders for each job technology.
- The `connectiontype` subdirectory: It contains folders for each connection type.
- The `scripts` subdirectory: It is dedicated to external technologies that share the same JavaScript files.
- A `metadata.yaml` file for each metadata folder. For `job` and `app` technology metadata, the build automatically generates them from the `technology.yaml` file and from each `context.yaml` found in the subfolders.
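
The layout described above can be sketched with a few shell commands. This is only an illustration: the technology names (`my-app`, `my-job`, `my-connection`), the context folder names, and the temporary root are hypothetical, and the sketch creates only the hand-written inputs (`technology.yaml`, `context.yaml`), since the `metadata.yaml` files for `job` and `app` technologies are generated by the build.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch location and technology names, for illustration only.
root="$(mktemp -d)"

# Base directory plus one subdirectory per type of technology metadata.
mkdir -p "$root/technologies/app/my-app/my-app-1.0" \
         "$root/technologies/job/my-job/my-job-1.0" \
         "$root/technologies/connectiontype/my-connection" \
         "$root/technologies/scripts"

# Hand-written inputs: one technology.yaml per technology,
# one context.yaml per context subfolder.
touch "$root/technologies/app/my-app/technology.yaml" \
      "$root/technologies/app/my-app/my-app-1.0/context.yaml" \
      "$root/technologies/job/my-job/technology.yaml" \
      "$root/technologies/job/my-job/my-job-1.0/context.yaml"

# Print the resulting tree.
(cd "$root" && find technologies | sort)
```
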

Each metadata folder must then contain a `metadata.yaml` file. For `job` and `app` technology metadata, the build is automatically generating them from the `technology.yaml` file and each `context.yaml` found in the sub folders.
> [!NOTE]
> Generated files, such as `metadata.yaml`, `.js` files, and others, must be checked in Git, as the build tries to avoid rebuilding technologies that have not been modified.
NOTE: Generated files (metadata.yaml, js files, etc..) need to be checked in git, since the build is trying to avoid rebuilding technologies which didn't change.
#### Docker-based technologies

#### Docker based technologies
> [!NOTE]
> These technologies are not external. They are based on a Docker image and embedded in Saagie.
For each context of a job (non external, based on an Docker image) or of an app technology, the folder dedicated to the context must then contain:
- `build.gradle.kts` which apply the `SaagieTechnologiesGradlePlugin` and the `DockerRemoteApiPlugin`.
- `Dockerfile` which declares how to build the Docker image
- `image_test.yml`: if present, the generated Docker image will be tested with the [GoogleContainerTools/container-structure-test](https://github.com/GoogleContainerTools/container-structure-test)
For each job or app technology context, there must be a context-specific folder that includes:
- A `build.gradle.kts` file, which is mandatory. It applies the `SaagieTechnologiesGradlePlugin` and the `DockerRemoteApiPlugin` Gradle plugins.
- A `Dockerfile`, which is mandatory. It declares how to build the Docker image.
- An `image_test.yml` file, which is optional. If present, the generated Docker image will be tested with the <a href="https://github.com/GoogleContainerTools/container-structure-test" target="_blank">GoogleContainerTools/container-structure-test</a> project.
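
As a rough pre-flight check before opening a pull request, the requirements above can be verified with a small shell function. `check_docker_context` is a hypothetical helper written for this example. It is not part of the Saagie Gradle plugin, and it only checks that the files exist, not that their content is valid.

```bash
#!/usr/bin/env bash

# check_docker_context DIR
# Hypothetical helper: returns 0 when DIR holds the mandatory files of a
# Docker-based context, 1 otherwise.
check_docker_context() {
  local dir="$1" f missing=0

  # Mandatory files.
  for f in build.gradle.kts Dockerfile; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing mandatory file: $dir/$f" >&2
      missing=1
    fi
  done

  # Optional file: only reported, never an error.
  if [ -f "$dir/image_test.yml" ]; then
    echo "image_test.yml found in $dir"
  else
    echo "image_test.yml not found in $dir (optional)"
  fi

  return "$missing"
}
```

A call such as `check_docker_context technologies/app/airbyte/<version>` would then report any missing mandatory file on standard error.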

#### External technologies

For contexts of an external job technology, the `context.yaml` file will reference javascript files, relatively to its location. The referenced javascript files should be generated by a dedicated javascript build.
> [!NOTE]
> These technologies are external. They are not part of Saagie.
The build of such javascript files can be setup anywhere in the directory structure, as long as the relative path is proprely filled in the `context.yaml` file.
Each context of an external job technology has its dedicated `context.yaml` file, in which JavaScript files are referenced. The JavaScript files can be placed anywhere in the directory structure, as long as the relative path is properly filled in the `context.yaml` file. These referenced `.js` files must be generated by a dedicated JavaScript build.

The Saagie Gradle plugin can be used to trigger the build of the javascript file in the CI. It needs:
- a `build.gradle.kts` which apply the `SaagieTechnologiesGradlePlugin` and the `NodePlugin`.
- a `package.json` with a `build` and `test` scripts.
Saagie’s Gradle plugin can be used to trigger the build of these JavaScript files in the CI. For that, your external technology context folder must include:
- A `build.gradle.kts` file, which is mandatory. It applies the `SaagieTechnologiesGradlePlugin` and the `NodePlugin` Gradle plugins.
- A `package.json` file, which is mandatory. This file includes a `build` script and a `test` script.

The Saagie Gradle plugin will launch [Yarn](https://yarnpkg.com/) to install depednencies and run the scripts to build and test the javascript files.
Saagie’s Gradle plugin will launch <a href="https://yarnpkg.com/" target="_blank">Yarn</a> to install dependencies and run the scripts to build and test JavaScript files.

### Developping an External Job Technology
#### Build an external job technology

It is highly recommended to use the [Saagie SDK](https://github.com/saagie/sdk) when developping an external job technology.
> [!IMPORTANT]
> We strongly recommend that you use the <a href="https://github.com/saagie/sdk" target="_blank">Saagie SDK</a> repository to develop your external job technologies. The Saagie SDK will help you to create your own technologies and integrate them into your Saagie platform. For more information, see our documentation on <a href="https://docs.saagie.io/user/latest/developer/sdk/" target="_blank"> Saagie SDK</a>.
In technologies which doesn't share scripts with other technologies, it should be already setup. For instance, in the technology folder of `dataiku`, just run `yarn dev` and the build of the javascript files will be running with the watch mode, and the SDK webapp will start.
For technologies that do not share the same scripts with other technologies, it should already be set up.
<br>For example, in the <a href="https://github.com/saagie/technologies/tree/master/technologies/job/dataiku" target="_blank">`dataiku` folder</a>, run `yarn dev` to build the JavaScript files with the watch mode and start the SDK webapp.
> [!NOTE]
> What is _watch mode_?
> <br>After the task is completed, the tool enters a loop where it watches the file system for changes to your source files. Whenever a change is detected, the task runs again to update its output.
For technologies which share javascript files, like `aws`, `gcp` or `azure`, the setup is split in two parts. In the folder dedicated to the shared scripts, run `yarn dev` to run the build with the watch mode. Then in the folder dedicated to the technology to develop, like `aws-lambda`, run `npx @saagie/sdk start` to start the SDK webapp.
For technologies that share `.js` scripts with other technologies, such as `aws`, `gcp`, or `azure`, the setup is split into two parts.
1. In the folder dedicated to the shared scripts, run `yarn dev` to execute the build with the watch mode.
2. In the folder dedicated to the technology to be developed, such as `aws-lambda`, run `npx @saagie/sdk start` to start the SDK webapp.

### Promotion

When the pull-request is merged in master, another Github action (running a gradle task) starts. It will retag docker images with branch name into a "production" name and generate a real release (and delete the pre release)
When the pull-request is merged into `master`, another GitHub action starts. This action will run a Gradle task to re-tag the Docker images with the branch name to a production name. It will then generate a real release, deleting the pre-release.
62 changes: 36 additions & 26 deletions technologies/app/airbyte/README.md
@@ -1,35 +1,45 @@
# Airbyte

![Docker Image Size (tag)](https://img.shields.io/docker/image-size/saagie/airbyte/1.0)

## Description
> [!NOTE]
> A project can create its own Airbyte connections without being seen by other projects, whereas there is only one user on the Airbyte Open-Source (OSS) version. For more details on the limitation of Airbyte OSS, see the <a href="https://docs.airbyte.com/" target="_blank">Airbyte Open Source documentation</a>.
## How to launch Airbyte?

To make Airbyte work on your platform, you must meet the following requirements.

1. Ask Saagie for a VM containing Airbyte. To do so, open a ticket at the <a href="https://support.saagie.com/hc/en-us" target="_blank">Saagie Help Center</a>.
2. Once your VM has been created, you will receive the information required to configure Airbyte in Saagie, that is, your credentials and URL of the VM you have requested. Remember them for the next step.
3. On your Saagie platform, create the following <a href="https://docs.saagie.io/user/latest/data-team/projects-module/projects/managing-environment-variables#creating-environment-variables" target="_blank">environment variables</a>:

This folder contains the image of the redirection to the VM containing the OSS version of Airbyte allowing the creation of data flows.
| Name | Value |
|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `AIRBYTE_URL` | This is the URL of the VM containing Airbyte. |
| `AIRBYTE_LOGIN` | This is the login of the VM containing Airbyte. |
| `AIRBYTE_PASSWORD` | This is the password of the VM containing Airbyte. |
| `SAAGIE_URL` | This is the URL of your Saagie platform. <br/>For example, `https://saagie-workspace.prod.saagie.io`. |
| `SAAGIE_PLATFORM_ID` | This is the ID of your Saagie platform. The default value is `1`. |
| `SAAGIE_LOGIN`           | This is the Saagie platform user login.<br/> **<u>IMPORTANT</u>**: Make sure this user has editor rights to the corresponding project, that is, the project you will be referencing for `SAAGIE_PROJECT_NAME`. |
| `SAAGIE_PASSWORD`        | This is the Saagie platform user password.<br/> **<u>NOTE</u>**: The user credentials referenced here and in the `SAAGIE_LOGIN` environment variable are required for project management. Any other user who has access to the project will be able to use the Airbyte app as well. |
| `SAAGIE_PROJECT_NAME` | This is the name of your Saagie project. |
| `AIRBYTE_WORKSPACE_NAME` | This is the Airbyte workspace name. If not configured, it will retrieve the value of the `SAAGIE_PROJECT_NAME` environment variable. |
4. You can now access Airbyte in the following ways:
- <a href="https://docs.saagie.io/user/latest/how-to/airbyte/airbyte-use-as-app-in-saagie" target="_blank">as a Saagie app</a>.
- <a href="https://docs.saagie.io/user/latest/how-to/airbyte/airbyte-use-as-api" target="_blank">as an API</a>.
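
The variable list above can be turned into a quick sanity check. The sketch below is an assumption-laden illustration rather than anything shipped with the Airbyte app: `check_airbyte_env` is a hypothetical helper, it presumes the variables are visible in the current shell environment, and it treats every variable except `AIRBYTE_WORKSPACE_NAME` as required.

```bash
#!/usr/bin/env bash

# Variable names taken from the table above. Treating all of them as
# required is an assumption made for this example.
required_vars="AIRBYTE_URL AIRBYTE_LOGIN AIRBYTE_PASSWORD SAAGIE_URL SAAGIE_PLATFORM_ID SAAGIE_LOGIN SAAGIE_PASSWORD SAAGIE_PROJECT_NAME"

# Hypothetical helper: returns 0 when every required variable is non-empty.
check_airbyte_env() {
  local name missing=0

  for name in $required_vars; do
    # ${!name} is bash indirect expansion: the value of the variable
    # whose name is stored in $name.
    if [ -z "${!name:-}" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done

  # Optional variable: fall back to the project name when unset or empty,
  # mirroring the behavior described in the table.
  : "${AIRBYTE_WORKSPACE_NAME:=${SAAGIE_PROJECT_NAME:-}}"

  return "$missing"
}
```
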

A project can create his own Airbyte connections without being seen by other projects while there is only one single user on the OSS version of Airbyte.
For more details about the limitation of OSS version of Airbyte, click [here](https://airbyte.com/airbyte-open-source)
***
> _For more information on Airbyte, see the <a href="https://docs.airbyte.com/?_gl=1*12vfeg5*_gcl_au*MTMzNzE3NDY2Mi4xNzExMTE4ODQ4" target="_blank">official documentation</a>._
## How to build in local

Inside the `airbyte` folder corresponding to your version, run :
```
docker build -t saagie/airbyte:<version> .
docker push saagie/airbyte:<version>
```
<!-- ## How to build the image in local?
## How to launch it
### Using Docker Commands
To deploy Airbyte on your platform, first, you have to request a VM containing Airbyte to Saagie,
[click here to create your request](https://saagie.zendesk.com/hc/en-us).
Then you need to create a user with editor rights on the project that you want to install
airbyte, and then set the following environment variables in Saagie :
To build the image locally with Docker commands, follow the steps below.
- AIRBYTE_URL : URL of the VM containing Airbyte
- AIRBYTE_LOGIN : Login of the VM containing Airbyte
- AIRBYTE_PASSWORD : Password of the VM containing Airbyte
- SAAGIE_LOGIN: Login of Saagie platform user (please make sure that this user have editor rights on `SAAGIE_PROJECT_NAME`)
- SAAGIE_PASSWORD: Password of Saagie platform user
- SAAGIE_URL: URL of the Saagie platform (i.e. : `https://saagie-workspace.prod.saagie.io`)
- SAAGIE_PLATFORM_ID : ID of your plateform (Default value : `1`)
- SAAGIE_PROJECT_NAME: Project name of Saagie
- AIRBYTE_WORKSPACE_NAME (optional): Workspace's name of Airbyte, if not set, it will be the same as `SAAGIE_PROJECT_NAME`
1. Navigate to the `airbyte-x.y` folder corresponding to your version, `technologies/app/airbyte/<version>`. Use the `cd` command.
2. Run the following command lines:
```bash
docker build -t saagie/airbyte:<version> .
docker push saagie/airbyte:<version>
```
Where `<version>` must be replaced with the version number. -->
4 changes: 2 additions & 2 deletions technologies/app/airbyte/metadata.yaml
@@ -2,8 +2,8 @@ version: v1
type: APP
id: airbyte
label: Airbyte
baseline: "Airbyte is a cloud-native, open-source data integration platform"
description: " Airbyte is a powerful data integration tool that can help organizations streamline their data workflows and ensure the accuracy and reliability of their data."
baseline: "Airbyte is a cloud-native and open-source data integration platform."
description: "Airbyte's features enable you to easily connect data sources and destinations, ensuring that data is moved efficiently and accurately between systems."
available: true
icon: airbyte
defaultResources:
4 changes: 2 additions & 2 deletions technologies/app/airbyte/technology.yaml
@@ -2,8 +2,8 @@ version: v1
type: APP
id: airbyte
label: Airbyte
baseline: "Airbyte is a cloud-native, open-source data integration platform"
description: " Airbyte is a powerful data integration tool that can help organizations streamline their data workflows and ensure the accuracy and reliability of their data."
baseline: "Airbyte is a cloud-native and open-source data integration platform."
description: "Airbyte's features enable you to easily connect data sources and destinations, ensuring that data is moved efficiently and accurately between systems."
available: true
icon: airbyte
defaultResources: