Cleanup and improving bad mx host check (#17)
Reporting on incorrect MX hosts (introducing `misconfigured_mx` in the suggest response). Also: documentation and CI improvements
Dynom committed Aug 29, 2022
1 parent a24e708 commit 1e32009
Showing 25 changed files with 255 additions and 157 deletions.
18 changes: 11 additions & 7 deletions .circleci/config.yml
@@ -7,23 +7,25 @@ jobs:
working_directory: /home/circleci/eri
docker:
# specify the version
- image: circleci/golang:1.17
- image: cimg/go:1.19

environment:
BINARY_NAME: "eri-linux-amd64"
TEST_RESULTS: "/tmp/test-results"
GOFLAGS: "-buildvcs=false -trimpath"

steps:
- checkout
- run: mkdir -p ${TEST_RESULTS}
- run: go install github.com/jstemmer/go-junit-report@latest
- run: curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.43.0
- run: go install github.com/mattn/goveralls@latest
- run: curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.49.0

- run:
name: Build
command: |
TAG=${CIRCLE_TAG:-dev}
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o "${BINARY_NAME}" -a -ldflags="-w -s -X main.Version=${TAG}" ./cmd/web
GOFLAGS="-buildvcs=false -trimpath" CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o "${BINARY_NAME}" -a -ldflags="-w -s -X main.Version=${TAG}" ./cmd/web
- run:
# Check if we have updates to minor/patch level packages we're explicitly referencing
@@ -40,8 +42,7 @@ jobs:
name: Test
command: |
go test -v ./... | go-junit-report > ${TEST_RESULTS}/report.xml
go test -cover -coverprofile=${TEST_RESULTS}/coverage.txt -covermode=atomic ./...
go test -race ./...
go test -cover -race -covermode=atomic -coverprofile=${TEST_RESULTS}/coverage.txt ./...
go tool cover -html=${TEST_RESULTS}/coverage.txt -o ${TEST_RESULTS}/coverage.html
- store_test_results:
@@ -51,16 +52,19 @@ jobs:
path: "/tmp/test-results"

- run:
name: Codecov upload
name: Coveralls upload
command: |
bash <(curl -s https://codecov.io/bash) -f ${TEST_RESULTS}/coverage.txt
goveralls -coverprofile=${TEST_RESULTS}/coverage.txt -service=circle-ci -repotoken=${COVERALLS_TOKEN}
workflows:
version: 2
build-test:
jobs:
- lint-and-test:
context:
- org-global
- "Public repos"
filters:
branches:
only: /.*/
15 changes: 11 additions & 4 deletions README.md
@@ -5,7 +5,7 @@


# ERI
Email Recipient Inspector is a project for preventing email typos. It's a self-learning service, a library or a command line utility. The services can help your uses to prevent mistakes when entering their email address. The library allows you to incorporate the features in your own business layer and the cli can be used as a convenient way to test domains or e-mail addresses.
Email Recipient Inspector is a project for preventing email typos. It's a self-learning service, a library or a command line utility. The service can help your users prevent mistakes when entering their email address. The library allows you to incorporate the features in your own business layer and the CLI can be used as a convenient way to test domains or email addresses.

# ERI as command line utility
## Installation
@@ -93,9 +93,16 @@ The local part (left of the `@`) remains completely untouched. It's simply echoe
"alternatives": [
"[email protected]"
],
"malformed_syntax": false
"malformed_syntax": false,
"misconfigured_mx": false
}
```
##### The advisory fields
Please note: these fields are advisory. Email delivery is still possible (even though unlikely) when one of these advisory fields is `true`. For example, the recipient "root" on a local system is considered invalid. For web use, however, the verdict will mostly be correct.

- `malformed_syntax` (bool) indicates whether the syntax is malformed. The check is fairly liberal. If `true`, chances are pretty good the email will never work. _Note: this is permanent_.
- `misconfigured_mx` (bool) indicates a misconfigured MX. If `true`, it's unlikely that the host can accept email. _Note: this can be temporary!_
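
As a minimal sketch of how a client might act on the advisory fields, the following decodes a suggest response and maps it to a coarse verdict. The `Advice` helper, its return values and the sample address are hypothetical and not part of ERI; only the JSON field names come from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SuggestResponse mirrors the JSON fields of the suggest response shown above.
type SuggestResponse struct {
	Alternatives    []string `json:"alternatives"`
	MalformedSyntax bool     `json:"malformed_syntax"`
	MisconfiguredMX bool     `json:"misconfigured_mx"`
}

// Advice is a hypothetical helper (not part of ERI) that turns the advisory
// fields into a coarse verdict for a signup form.
func Advice(body []byte) (string, error) {
	var r SuggestResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", fmt.Errorf("decoding suggest response: %w", err)
	}

	switch {
	case r.MalformedSyntax:
		// Permanent: the address is very unlikely to ever work.
		return "reject", nil
	case r.MisconfiguredMX:
		// Possibly temporary: warn instead of blocking the user.
		return "warn", nil
	default:
		return "accept", nil
	}
}

func main() {
	// Made-up payload, for illustration only.
	body := []byte(`{"alternatives":["john@example.org"],"malformed_syntax":false,"misconfigured_mx":true}`)

	advice, err := Advice(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(advice)
}
```

Because `misconfigured_mx` can be temporary, a warning is probably a friendlier default than a hard rejection; that choice is an assumption of this sketch, not a recommendation made by ERI.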


### /autocomplete
The autocomplete endpoint returns a list of domains matching the prefix. To prevent leaking sensitive information, ERI is configured with a threshold to limit exposure of rarely used domains.
@@ -171,7 +178,7 @@ For more help, see the package: https://github.com/Dynom/ERI-js

# Integration
## Data scrubbing
When integrating ERI in your application, the initial results might be poor. When you change the validation mechanism (to include ERI) your data might still be too "dirty" to work with. After feeding your existing e-mail addresses into ERI you might want to cleanup the data first. The autocomplete endpoint might give odd results (e.g.: hotmail.com.com). Scrubbing this data from ERI's hitlist table and with the new mechanisms in place should prevent those addresses to end up into your backend in the future, but without the scrubbing you'll stay in a less-than-ideal situation.
When integrating ERI in your application, the initial results might be poor. When you change the validation mechanism (to include ERI) your data might still be too "dirty" to work with. After feeding your existing email addresses into ERI you might want to clean up the data first. The autocomplete endpoint might give odd results (e.g.: hotmail.com.com). Scrubbing this data from ERI's hitlist table, with the new mechanisms in place, should prevent those addresses from ending up in your backend in the future, but without the scrubbing you'll stay in a less-than-ideal situation.

## To proxy or to expose directly
While ERI is designed to be exposed publicly, you might have different ideas about how to protect your backend services. Adding a proxy is a good alternative, and it allows you to fine-tune the rate-limiter to that specific use-case.
@@ -201,7 +208,7 @@ Mailcheck works completely in JavaScript, with the option to use only known TLDs


# Email delivery nuances
Ever since the first e-mail got sent in 1971 a lot has happened with electronic mail. In modern days email is seen as "the" way to identify and communicate with people online. Because of this, many people will easily give away their email addresses and people receive many, many emails. It's hard to read it all, not even counting the spam. Looking specifically at my own behaviour, I don't even open email unless I think it's important, just by scanning the sender and the subject of the email.
Ever since the first email got sent in 1971 a lot has happened with electronic mail. In modern days email is seen as "the" way to identify and communicate with people online. Because of this, many people will easily give away their email addresses and people receive many, many emails. It's hard to read it all, not even counting the spam. Looking specifically at my own behaviour, I don't even open email unless I think it's important, just by scanning the sender and the subject of the email.

With this in mind, even with a perfect validator, and a brilliantly composed and relevant email, it's still possible your email won't be read. ERI is designed to help out the user willing to trust you with their email address. ERI is not designed as a marketing tool to help optimise email delivery.

5 changes: 5 additions & 0 deletions cmd/eri-cli/README.md
@@ -94,3 +94,8 @@ bzcat emails.bz2 | \
jq .email | \
xargs ./updateStatus.sh
```

Using Shell process substitution
```bash
eri-cli check --input-is-email < <( echo "[email protected]" ) | jq .valid
```
4 changes: 2 additions & 2 deletions cmd/web/config/config.go
@@ -4,7 +4,7 @@ import (
"encoding"
"errors"
"fmt"
"io/ioutil"
"os"
"strings"
"time"

@@ -26,7 +26,7 @@ func NewConfig(fileName string) (Config, error) {
// Not reading a config file on startup, might not show any feedback
c.Log.Level = logrus.TraceLevel.String()

b, err := ioutil.ReadFile(fileName)
b, err := os.ReadFile(fileName)
if err != nil {
return c, fmt.Errorf("unable to open %q, reason: %w", fileName, err)
}
10 changes: 5 additions & 5 deletions cmd/web/erihttp/handlers/compression_test.go
@@ -1,7 +1,7 @@
package handlers

import (
"io/ioutil"
"io"
"net/http"
"net/http/httptest"
"strings"
@@ -44,9 +44,9 @@ func TestWithGzipHandler(t *testing.T) {

mux := http.NewServeMux()
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
b, err := ioutil.ReadAll(r.Body)
b, err := io.ReadAll(r.Body)
if err != nil {
t.Errorf("ioutil.ReadAll(r.Body) Setting up the test failed %s", err)
t.Errorf("io.ReadAll(r.Body) Setting up the test failed %s", err)
t.FailNow()
}

@@ -88,9 +88,9 @@ func TestWithGzipHandler(t *testing.T) {
}

defer res.Body.Close()
b, err := ioutil.ReadAll(res.Body)
b, err := io.ReadAll(res.Body)
if err != nil {
t.Errorf("ioutil.ReadAll(res.Body) Setting up the test failed %s", err)
t.Errorf("io.ReadAll(res.Body) Setting up the test failed %s", err)
t.FailNow()
}

1 change: 1 addition & 0 deletions cmd/web/erihttp/types.go
@@ -33,6 +33,7 @@ func (r *AutoCompleteResponse) PrepareResponse() {
type SuggestResponse struct {
Alternatives []string `json:"alternatives"`
MalformedSyntax bool `json:"malformed_syntax"`
MisconfiguredMX bool `json:"misconfigured_mx"`
Error string `json:"error,omitempty"`
}

3 changes: 1 addition & 2 deletions cmd/web/erihttp/util.go
@@ -3,7 +3,6 @@ package erihttp
import (
"fmt"
"io"
"io/ioutil"
"net/http"
)

@@ -37,7 +36,7 @@ func GetBodyFromHTTPRequest(r *http.Request, maxBodySize int64) ([]byte, error)
return empty, fmt.Errorf("%w %q", ErrUnsupportedContentType, ct)
}

b, err := ioutil.ReadAll(io.LimitReader(r.Body, maxBodySize+1))
b, err := io.ReadAll(io.LimitReader(r.Body, maxBodySize+1))
if err != nil {
return empty, ErrInvalidRequest
}
3 changes: 2 additions & 1 deletion cmd/web/graphql.go
@@ -69,7 +69,8 @@ func NewGraphQLSchema(conf config.Config, suggestSvc *services.SuggestSvc, autoc

return erihttp.SuggestResponse{
Alternatives: result.Alternatives,
MalformedSyntax: sugErr == validator.ErrEmailAddressSyntax,
MalformedSyntax: errors.Is(sugErr, validator.ErrEmailAddressSyntax),
MisconfiguredMX: !result.HasValidMX,
}, err
},
Description: "Get suggestions",
30 changes: 18 additions & 12 deletions cmd/web/handlers.go
@@ -23,7 +23,12 @@ const (
failedResponseError = "Generating response failed."
)

func NewAutoCompleteHandler(logger logrus.FieldLogger, svc *services.AutocompleteSvc, maxSuggestions uint64, maxBodySize uint64) http.HandlerFunc {
type marshalFn func(v interface{}) ([]byte, error)

func NewAutoCompleteHandler(logger logrus.FieldLogger, svc *services.AutocompleteSvc, maxSuggestions uint64, maxBodySize uint64, jsonMarshaller marshalFn) http.HandlerFunc {
if jsonMarshaller == nil {
jsonMarshaller = json.Marshal
}

logger = logger.WithField("handler", "auto complete")
return func(w http.ResponseWriter, r *http.Request) {
@@ -81,7 +86,7 @@ func NewAutoCompleteHandler(logger logrus.FieldLogger, svc *services.Autocomplet
return
}

response, err := json.Marshal(erihttp.AutoCompleteResponse{
response, err := jsonMarshaller(erihttp.AutoCompleteResponse{
Suggestions: result.Suggestions,
})

@@ -107,8 +112,12 @@ func NewAutoCompleteHandler(logger logrus.FieldLogger, svc *services.Autocomplet
}
}

// NewSuggestHandler constructs a HTTP handler that deals with suggestion requests
func NewSuggestHandler(logger logrus.FieldLogger, svc *services.SuggestSvc, maxBodySize uint64) http.HandlerFunc {
// NewSuggestHandler constructs an HTTP handler that deals with suggestion requests
func NewSuggestHandler(logger logrus.FieldLogger, svc *services.SuggestSvc, maxBodySize uint64, jsonMarshaller marshalFn) http.HandlerFunc {
if jsonMarshaller == nil {
jsonMarshaller = json.Marshal
}

log := logger.WithField("handler", "suggest")
return func(w http.ResponseWriter, r *http.Request) {
var err error
@@ -136,18 +145,15 @@ func NewSuggestHandler(logger logrus.FieldLogger, svc *services.SuggestSvc, maxB
}

var alts = []string{req.Email}
var sugErr error
{
var result services.SuggestResult
result, sugErr = svc.Suggest(r.Context(), req.Email)
if len(result.Alternatives) > 0 {
alts = append(alts[0:0], result.Alternatives...)
}
result, sugErr := svc.Suggest(r.Context(), req.Email)
if len(result.Alternatives) > 0 {
alts = append(alts[0:0], result.Alternatives...)
}

sr := erihttp.SuggestResponse{
Alternatives: alts,
MalformedSyntax: errors.Is(sugErr, validator.ErrEmailAddressSyntax),
MisconfiguredMX: !result.HasValidMX,
}

if sugErr != nil {
@@ -159,7 +165,7 @@ func NewSuggestHandler(logger logrus.FieldLogger, svc *services.SuggestSvc, maxB
sr.Error = sugErr.Error()
}

response, err := json.Marshal(sr)
response, err := jsonMarshaller(sr)

if err != nil {
log.WithFields(logrus.Fields{