Merge pull request #27 from rog-golang-buddies/feature/parse_yml
Parsing of the open API to API Spec document.
ldmi3i committed Aug 23, 2022
2 parents 61627f5 + d842773 commit 36a91a5
Showing 33 changed files with 2,152 additions and 185 deletions.
84 changes: 74 additions & 10 deletions README.md
@@ -1,22 +1,86 @@
# Data Scraping Service

[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/pre-commit/pre-commit)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/rog-golang-buddies/api-hub_data-scraping-service/main.svg)](https://results.pre-commit.ci/latest/github/rog-golang-buddies/api-hub_data-scraping-service/main)

## Description
The service asynchronously processes user requests to add a new Open API.
In other words, this service processes the content of an Open API file, transforms it into the ASD (API Specification Document) model, and sends it on to the storage and update service.

### Main functions (To Do)
1. Listen to queue events (links to open API yaml/json files)
2. Check link availability
3. Retrieve file content
4. Validate content
5. Parse content into an ASD model
6. Put ASD model with metadata to the storage and update service queue
The service asynchronously processes user requests to add a new Open API.
In other words, this service processes the content of an Open API file, transforms it into the ASD (API
Specification Document) model, and sends it on to the storage and update service.

### Starting service

The easiest way to start the application is with Docker.
If you have Docker installed, run the following command from the project root:
`docker-compose -f ./docker/docker-compose-dev.yml up -d --build`.
To stop it, run `docker-compose -f ./docker/docker-compose-dev.yml down`.
You can observe queues, and send and retrieve messages from them, using the web interface available at http://localhost:15672.
You can observe queues, and send and retrieve messages from them, via the web interface available at
http://localhost:15672.
Login/password: guest/guest.

### MVP version

1. Listen for events containing static links to Open API specification files.
2. Download and parse the Open API specification into a common API Specification Document (ASD), a view for the UI part.
3. Send a notification to the API gateway if required (depends on the flag; see the 'How it works' section).
4. Post the ASD to the result queue.

#### Communication model

The service consumes requests containing a file URL and a notification flag.
Default listen queue name: data-scraping-asd
Request:

```json5
{
  "file_url": "https://developer.atlassian.com/cloud/trello/swagger.v3.json",
  "is_notify_user": true
}
```

If "is_notify_user" is true, this service must post a notification to a separate queue. A notification contains one
field with an error model: if an error occurred it contains the error, otherwise it is nil.
Default notification queue name: gateway-scrape-notifications
Example:

```json5
{
  "error": {
    "cause": "file exceed the limit: 5242880",
    "message": "error while processing url"
  }
}
```
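
The two messages above could be modelled in Go roughly as follows. This is only a sketch: the type and function names (`ScrapeRequest`, `ErrorModel`, `Notification`, `decodeRequest`, `newNotification`) are hypothetical and not taken from the project; only the JSON field names come from the examples above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ScrapeRequest mirrors the request JSON shown above (hypothetical type name).
type ScrapeRequest struct {
	FileURL      string `json:"file_url"`
	IsNotifyUser bool   `json:"is_notify_user"`
}

// ErrorModel mirrors the "error" object of the notification (hypothetical type name).
type ErrorModel struct {
	Cause   string `json:"cause"`
	Message string `json:"message"`
}

// Notification carries a nil Error when processing succeeded.
type Notification struct {
	Error *ErrorModel `json:"error"`
}

// decodeRequest parses a queue message body into a ScrapeRequest.
func decodeRequest(body []byte) (ScrapeRequest, error) {
	var req ScrapeRequest
	err := json.Unmarshal(body, &req)
	return req, err
}

// newNotification wraps a processing error; a nil error yields an empty notification.
func newNotification(procErr error) Notification {
	if procErr == nil {
		return Notification{}
	}
	return Notification{Error: &ErrorModel{
		Cause:   procErr.Error(),
		Message: "error while processing url",
	}}
}

func main() {
	req, err := decodeRequest([]byte(`{"file_url": "https://example.com/openapi.json", "is_notify_user": true}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if req.IsNotifyUser {
		// Simulate a failed download to show the error notification shape.
		out, _ := json.Marshal(newNotification(fmt.Errorf("file exceed the limit: %d", 5242880)))
		fmt.Println(string(out))
	}
}
```

Using a pointer for the error field lets a successful run serialize as a null error, which matches the "otherwise nil" behaviour described above.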

If parsing completes successfully, the result is posted to the result queue and delivered to
the 'storage and update service'.
Default result queue name: storage-update-asd
The model is too large to describe here; see the code for details.

#### How to check functionality manually using the RabbitMQ management page

1. Start the service as described in the 'Starting service' section.
2. Go to http://localhost:15672 and log in as guest/guest.
3. Go to the Queues tab.
4. Check that the data-scraping-asd queue is already present there.
5. Expand the 'Add a new queue' section under the 'Overview' and add 2 queues: 'gateway-scrape-notifications' and
'storage-update-asd'.
6. Go into the data-scraping-asd queue and expand the 'Publish message' section under the charts.
7. Add the request body and publish a message.
8. Check the service logs with `docker logs dss`, then return to the Queues tab and check the result messages in the queues
using the "Get messages" section.

### Known current limitations (TO DO)

1. Only Swagger (Open API) version 3.0 is supported.
2. Field constraints (max length, etc.) are ignored.

### Main functions

1. Listen to queue events (links to open API yaml/json files)
2. Check link availability
3. Retrieve file content (the file size is limited; the default limit is 5 MiB, i.e. 5242880 bytes)
4. Validate content
5. Parse content into an ASD model
6. Put ASD model with metadata to the storage and update service queue
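
Step 3 above caps the size of the retrieved file. A minimal sketch of one way to enforce such a cap with the standard library is shown below; `readLimited` and `readLimitedString` are hypothetical helpers, not the project's loader, and the error text simply mirrors the example notification.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readLimited reads from r and fails if more than limit bytes are available.
func readLimited(r io.Reader, limit int64) ([]byte, error) {
	// Read one extra byte so an oversized body can be told apart
	// from a body of exactly limit bytes.
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, fmt.Errorf("file exceed the limit: %d", limit)
	}
	return data, nil
}

// readLimitedString is a small convenience wrapper used for the demo below.
func readLimitedString(content string, limit int64) ([]byte, error) {
	return readLimited(strings.NewReader(content), limit)
}

func main() {
	const limit = 8 // tiny limit, for demonstration only
	for _, body := range []string{"openapi", "openapi: 3.0.0"} {
		data, err := readLimitedString(body, limit)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted %d bytes\n", len(data))
	}
}
```

In a real loader the reader would typically be an HTTP response body; wrapping it in `io.LimitReader` keeps memory use bounded even when the server sends far more than the limit.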
9 changes: 9 additions & 0 deletions docker/docker-compose-dev.yml
@@ -1,9 +1,18 @@
version: '3.9'

volumes:
  rabbit-data:
    driver: local

services:
  rabbit:
    image: rabbitmq:3-management # you may open the management UI via http://localhost:15672/#/ login&password == guest
    container_name: rabbit
    # hostname is required here to work with the volume on persistent queues.
    # Rabbit saves data in folders whose names are generated from the host name. To have data restored on container restart we need to fix the hostname.
    hostname: rabbit
    volumes:
      - rabbit-data:/var/lib/rabbitmq
    ports:
      - "5672:5672"
      - "15672:15672"
6 changes: 6 additions & 0 deletions go.mod
@@ -3,6 +3,7 @@ module github.com/rog-golang-buddies/api-hub_data-scraping-service
go 1.18

require (
github.com/getkin/kin-openapi v0.98.0
github.com/golang/mock v1.6.0
github.com/kelseyhightower/envconfig v1.4.0
github.com/rabbitmq/amqp091-go v1.4.0
@@ -13,8 +14,13 @@ require (

require (
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/go-openapi/jsonpointer v0.19.5 // indirect
github.com/go-openapi/swag v0.19.5 // indirect
github.com/invopop/yaml v0.1.0 // indirect
github.com/mailru/easyjson v0.0.0-20190626092158-b2ccc519800e // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
go.uber.org/atomic v1.7.0 // indirect
go.uber.org/multierr v1.6.0 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)
17 changes: 17 additions & 0 deletions go.sum
@@ -2,15 +2,27 @@ github.com/benbjohnson/clock v1.1.0 h1:Q92kusRqC1XV2MjkWETPvjJVqKetz1OzxZB7mHJLj
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/getkin/kin-openapi v0.98.0 h1:lIACvCG9cxmFsEywz+LCoVhcZHFLUy+Nv5QSkb43eAE=
github.com/getkin/kin-openapi v0.98.0/go.mod h1:w4lRPHiyOdwGbOkLIyk+P0qCwlu7TXPCHD/64nSXzgE=
github.com/go-openapi/jsonpointer v0.19.5 h1:gZr+CIYByUqjcgeLXnQu2gHYQC9o73G2XUeOFYEICuY=
github.com/go-openapi/jsonpointer v0.19.5/go.mod h1:Pl9vOtqEWErmShwVjC8pYs9cog34VGT37dQOVbmoatg=
github.com/go-openapi/swag v0.19.5 h1:lTz6Ys4CmqqCQmZPBlbQENR1/GucA2bzYTE12Pw4tFY=
github.com/go-openapi/swag v0.19.5/go.mod h1:POnQmlKehdgb5mhVOsnJFsivZCEZ/vjK9gh66Z9tfKk=
github.com/golang/mock v1.6.0 h1:ErTB+efbowRARo13NNdxyJji2egdxLGQhRaY+DUumQc=
github.com/golang/mock v1.6.0/go.mod h1:p6yTPP+5HYm5mzsMV8JkE6ZKdX+/wYM6Hr+LicevLPs=
github.com/gorilla/mux v1.8.0/go.mod h1:DVbg23sWSpFRCP0SfiEN6jmj59UnW/n46BH5rLB71So=
github.com/invopop/yaml v0.1.0 h1:YW3WGUoJEXYfzWBjn00zIlrw7brGVD0fUKRYDPAPhrc=
github.com/invopop/yaml v0.1.0/go.mod h1:2XuRLgs/ouIrW3XNzuNj7J3Nvu/Dig5MXvbCEdiBN3Q=
github.com/kelseyhightower/envconfig v1.4.0 h1:Im6hONhd3pLkfDFsbRgu68RDNkGF1r3dvMUtDTo2cv8=
github.com/kelseyhightower/envconfig v1.4.0/go.mod h1:cccZRl6mQpaq41TPp5QxidR+Sa3axMbJDNb//FQX6Gg=
github.com/kr/pretty v0.1.0 h1:L/CwN0zerZDmRFUapSPitk6f+Q3+0za1rQkzVuMiMFI=
github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE=
github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
github.com/mailru/easyjson v0.0.0-20190626092158-b2ccc519800e h1:hB2xlXdHp/pmPZq0y3QnmWAArdw9PqbmotexnWx/FU8=
github.com/mailru/easyjson v0.0.0-20190626092158-b2ccc519800e/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
github.com/pkg/errors v0.8.1 h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
@@ -19,6 +31,7 @@ github.com/rabbitmq/amqp091-go v1.4.0/go.mod h1:JsV0ofX5f1nwOGafb8L5rBItt9GyhfQf
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA=
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.8.0 h1:pSgiaMZlXftHpm5L7V1+rVB+AZJydKsMxsQBIJw4PKk=
@@ -63,6 +76,10 @@ golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8T
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 h1:qIbj1fsPNlZgppZ+VLlY7N33q108Sa+fhmuc+sWQYwY=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.0/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
13 changes: 7 additions & 6 deletions internal/app.go
@@ -7,6 +7,7 @@ import (
"github.com/rog-golang-buddies/api-hub_data-scraping-service/internal/load"
"github.com/rog-golang-buddies/api-hub_data-scraping-service/internal/logger"
"github.com/rog-golang-buddies/api-hub_data-scraping-service/internal/parse"
"github.com/rog-golang-buddies/api-hub_data-scraping-service/internal/parse/openapi"
"github.com/rog-golang-buddies/api-hub_data-scraping-service/internal/process"
"github.com/rog-golang-buddies/api-hub_data-scraping-service/internal/queue"
"github.com/rog-golang-buddies/api-hub_data-scraping-service/internal/queue/handler"
@@ -29,7 +30,7 @@ func Start() int {
return 1
}

proc, err := createDefaultProcessor()
proc, err := createDefaultProcessor(log, conf)
if err != nil {
log.Error("error while creating processor: ", err)
return 1
@@ -65,11 +66,11 @@ func Start() int {
return 0
}

func createDefaultProcessor() (process.UrlProcessor, error) {
	recognizer := recognize.NewRecognizer()
	parsers := []parse.Parser{parse.NewJsonOpenApiParser(), parse.NewYamlOpenApiParser()}
	converter := parse.NewConverter(parsers)
	loader := load.NewContentLoader()
func createDefaultProcessor(log logger.Logger, config *config.ApplicationConfig) (process.UrlProcessor, error) {
	recognizer := recognize.NewRecognizer(log)
	parsers := []parse.Parser{openapi.NewOpenApi(log)}
	converter := parse.NewConverter(log, parsers)
	loader := load.NewContentLoader(log, &config.Web)

	return process.NewProcessor(recognizer, converter, loader)
}
9 changes: 5 additions & 4 deletions internal/config/application.go
@@ -6,12 +6,13 @@ import (
)

type ApplicationConfig struct {
Env Environment `default:"dev"`
Logger LoggerConfig
Queue QueueConfig
Env Environment `default:"dev"`
Logger LoggerConfig
Queue QueueConfig
Web Web
}

//ReadConfig reads configuration from the environment and populates the structure with it
// ReadConfig reads configuration from the environment and populates the structure with it
func ReadConfig() (*ApplicationConfig, error) {
var conf ApplicationConfig
if err := envconfig.Process("", &conf); err != nil {
7 changes: 7 additions & 0 deletions internal/config/web.go
@@ -0,0 +1,7 @@
package config

// Web is a web-related properties configuration
type Web struct {
	// RespLimBytes represents the maximum file size (in bytes) to download.
	RespLimBytes int64 `default:"5242880"`
}
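
As a rough, stdlib-only illustration of how a default like this could be resolved at startup: the project itself reads its configuration with envconfig, so the code below is not its implementation. The environment variable name `WEB_RESPLIMBYTES` and the helper `respLimBytes` are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultRespLimBytes is the default from the struct tag above: 5 MiB.
const defaultRespLimBytes int64 = 5242880

// respLimBytes returns the configured limit, falling back to the default
// when the variable is not set. The variable name is an assumption.
func respLimBytes(lookup func(string) (string, bool)) (int64, error) {
	raw, ok := lookup("WEB_RESPLIMBYTES")
	if !ok {
		return defaultRespLimBytes, nil
	}
	return strconv.ParseInt(raw, 10, 64)
}

func main() {
	limit, err := respLimBytes(os.LookupEnv)
	if err != nil {
		fmt.Println("invalid limit:", err)
		return
	}
	fmt.Println("response limit in bytes:", limit)
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without touching the real process environment.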