This application scrapes specified sections of the Muusikoiden.net tori, generates a report of new listings and sends it to you via email. Ideally it is invoked on a schedule, e.g. by a cron job. Now there's no reason to procrastinate by manually going through the sections every now and again!
Hacked together with Haskell (+ various libraries) and AWS.
First clone the repository
git clone https://github.com/atarv/mnet-paivystaja
cd mnet-paivystaja
Then install Stack and run
stack build
...or use Docker, but only after you have written the configuration file:
docker build -t mnet-aggregator .
Building might take a while the first time.
To get a zip archive suitable for running on AWS Lambda, run make at the repository root:
make
To create the Lambda function, see the AWS documentation. You can easily update the function using the AWS CLI:
aws lambda update-function-code --function-name example-function-name --zip-file fileb://build/output/function.zip
Configuration is done by creating a config.dhall file in the repository root. For more specific documentation, take a look at Configs.hs.
Configuration example:
-- config.dhall
{ mailConfig =
    { smtpHostname = "smtp.mailprovider.com"
    , senderEmail = "[email protected]"
    , senderName = "Listings aggregator"
    , smtpPassword = "smtppassword"
    , smtpPort = 25
    , smtpUsername = "smtpuser"
    }
, serverPort = 8080
, dynamoDBTableName = "example-table"
}
For setting up a DynamoDB table see AWS documentation.
See lambda/Main.hs for which environment variables must be set.
Use PK as partition key and SK as sort key (both String type).
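The key layout described above could be created from the command line roughly as follows. This is only a sketch using the AWS CLI's `dynamodb create-table` command: the table name `example-table` comes from the configuration example, while on-demand billing (`PAY_PER_REQUEST`) and the `APPLY` switch are assumptions made for this example, not something the project prescribes.

```shell
#!/bin/sh
set -eu

# Must match dynamoDBTableName in config.dhall.
TABLE_NAME="${TABLE_NAME:-example-table}"

# PK is the partition (HASH) key and SK the sort (RANGE) key; both are strings (S).
# PAY_PER_REQUEST billing is an assumption made for this sketch.
set -- dynamodb create-table \
    --table-name "$TABLE_NAME" \
    --attribute-definitions \
        AttributeName=PK,AttributeType=S \
        AttributeName=SK,AttributeType=S \
    --key-schema \
        AttributeName=PK,KeyType=HASH \
        AttributeName=SK,KeyType=RANGE \
    --billing-mode PAY_PER_REQUEST

# Keep a printable copy of the command for review.
preview="aws $*"

# Hypothetical APPLY switch: print the command by default, run it with APPLY=1.
if [ "${APPLY:-0}" = "1" ]; then
    aws "$@"
else
    echo "$preview"
fi
```

Running the script without `APPLY=1` only prints the command, which makes it easy to check the table name and keys before anything is created in your AWS account.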
If you built the app with stack:
stack run
Docker:
docker run mnet-aggregator
Remember to set AWS credentials and AWS_REGION environment variables when running locally.
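A small pre-flight check can catch a missing variable before the app starts. The sketch below assumes the credentials are supplied through the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables; if you provide credentials some other way (for example a shared credentials file), adjust the list accordingly. The function name `check_aws_env` is made up for this example.

```shell
#!/bin/sh

# Returns 0 when every listed variable is set and non-empty;
# otherwise prints the missing names to stderr and returns 1.
check_aws_env() {
    missing=""
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        [ -n "$value" ] || missing="$missing $name"
    done
    if [ -n "$missing" ]; then
        echo "Missing environment variables:$missing" >&2
        return 1
    fi
}

if check_aws_env; then
    echo "AWS environment looks complete"
fi
```

With the function loaded in your shell, `check_aws_env && stack run` starts the app only when all three variables are present.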
The aggregator runs when an HTTP POST request is made to the /generatereport path. Example with curl:
curl --request POST \
  --url http://localhost:8080/generatereport \
  --header 'content-type: application/json' \
  --data '{
    "recipientEmail": "[email protected]",
    "recipientName": "Your name",
    "sections": [
      {
        "sectionTitle": "Guitars",
        "sectionUrl": "https://muusikoiden.net/tori/?category=8"
      },
      {
        "sectionTitle": "Amps",
        "sectionUrl": "https://muusikoiden.net/tori/?category=42"
      }
    ]
  }'
You'll probably want to run this with a cron job. If you are looking for specific gear, you can also use URLs from Haku as section URLs.
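One way to wire this into cron is a small wrapper script. The sketch below is hypothetical: the script name `send-report.sh`, the `ENDPOINT` and `REQUEST_FILE` variables, and the file `mnet-request.json` (holding the JSON body from the curl example) are all assumptions for illustration. It also assumes the server is reachable on localhost port 8080, as in the configuration example.

```shell
#!/bin/sh
# send-report.sh: hypothetical wrapper that posts a saved request body to the aggregator.
set -eu

# Assumptions: the server listens on localhost:8080 (serverPort in config.dhall)
# and the JSON request body is stored in REQUEST_FILE.
ENDPOINT="${ENDPOINT:-http://localhost:8080/generatereport}"
REQUEST_FILE="${REQUEST_FILE:-${HOME:-.}/mnet-request.json}"

post_report() {
    # --fail makes curl exit non-zero on HTTP error responses,
    # and --show-error keeps the error message visible despite --silent.
    curl --fail --silent --show-error \
        --request POST \
        --url "$ENDPOINT" \
        --header 'content-type: application/json' \
        --data "@$REQUEST_FILE"
}

# Only send when invoked as "send-report.sh run".
if [ "${1:-}" = "run" ]; then
    post_report
fi

# Example crontab entry (every day at 08:00):
# 0 8 * * * /path/to/send-report.sh run
```

Keeping the request body in a separate file means the sections can be edited without touching the crontab entry.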