
Question - Twitter support #12

Open
FuchsiaSoft opened this issue May 10, 2016 · 8 comments
@FuchsiaSoft

I've just stumbled across this and love it... but I'd like to add support for tweeting out broken links automatically. Similar to the slack option currently just another platform for it I guess.

I've got experience working with Twitter's API and associated .Net libs for it so can't see it being particularly tricky, just wanted to see if it would be a welcome addition from your point of view before I go off and fork etc.

If the addition would be welcome let me know and I'll provide an outline of how I'd plan on doing it. 😃

@hmol
Owner

hmol commented May 10, 2016

Yeah, I guess if you see it as a useful feature then go right ahead and fork me :neckbeard:
One thing: if the crawler finds 1000 broken links, it will tweet 1000 tweets at the same time. Do you think this could be a problem?

@FuchsiaSoft
Author

Yes, it definitely would be a problem... Twitter's rate limit is 15 updates per window, which conveniently is 15 minutes long, and the API key can also get revoked for posting duplicate messages in quick succession. So there would need to be a message queue or similar.

The options I see are:

  1. Message queue that posts them in line with Twitter's rate limits. This is doable since the Twitter API returns a header in each response saying how many requests are "left" for the current key. I'd also need some sort of persistence for relatively recent tweets to make sure duplicates aren't posted in sequence, maybe a local DB to make it resilient across restarts etc.
  2. Aggregate results into a report and have the tweet just reference a link to that. This could then be linked to a twitter account which acts as a bot that people can tweet to and request a crawl of a website, with the bot answering them back on twitter directly when complete.
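Option 1 above is concrete enough to sketch. The following is a minimal illustration in Python (the project itself is C#, and `post_fn` is a placeholder for a real Twitter client call) of a queue that respects a per-window rate limit and skips recently seen duplicates:

```python
import time
from collections import deque

class TweetQueue:
    """Queue tweets, respecting a per-window rate limit and skipping
    recent duplicates. Sketch only: post_fn stands in for a real
    Twitter API call, and dedup state is in memory rather than a DB."""

    def __init__(self, post_fn, limit=15, window_seconds=15 * 60, dedup_size=100):
        self.post_fn = post_fn
        self.limit = limit                       # max posts per window
        self.window = window_seconds             # window length in seconds
        self.sent_times = deque()                # timestamps of recent posts
        self.recent = deque(maxlen=dedup_size)   # recently posted messages
        self.pending = deque()                   # messages waiting to go out

    def enqueue(self, message):
        # Skip anything already posted recently or already queued.
        if message not in self.recent and message not in self.pending:
            self.pending.append(message)

    def drain(self, now=None):
        """Post as many pending tweets as the rate limit allows right now."""
        now = time.time() if now is None else now
        # Forget timestamps that have fallen out of the current window.
        while self.sent_times and now - self.sent_times[0] >= self.window:
            self.sent_times.popleft()
        posted = []
        while self.pending and len(self.sent_times) < self.limit:
            msg = self.pending.popleft()
            self.post_fn(msg)
            self.sent_times.append(now)
            self.recent.append(msg)
            posted.append(msg)
        return posted
```

A scheduler would call `drain()` periodically; anything over the limit simply waits for the next window. Persisting `recent` and `pending` to a local DB would make it survive restarts, as described above.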

I'll have a detailed look at the setup of your current code base and see how I can go about fitting it in with minimal disruption.

Also, the Twitter thing I think would mainly be fun to do, but I'm also considering this for a real-world intranet, having it send the results via SMTP from an internal server. So I'll probably put that in too.

Thanks for being so welcoming by the way!... social coding for the win! 👍

@hmol
Owner

hmol commented May 10, 2016

Alternative 1 sounds difficult. Alternative 2 is a great idea. But where will you host the generated report?

Btw: SMTP as an output sounds very useful. You could gather all the broken links into a report and just send one mail (not one mail per broken link).
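The one-mail-per-report idea is straightforward. A small sketch in Python (the project is C#, and the addresses here are made up) that builds a single message from a list of broken links:

```python
from email.mime.text import MIMEText

def build_report_email(broken_links, sender, recipient):
    """Build one email summarising all broken links found in a crawl.
    broken_links is a list of (url, status) pairs; addresses are
    placeholders for illustration."""
    lines = [f"{url} -> {status}" for url, status in broken_links]
    body = "Broken links found:\n\n" + "\n".join(lines)
    msg = MIMEText(body)
    msg["Subject"] = f"LinkCrawler report: {len(broken_links)} broken links"
    msg["From"] = sender
    msg["To"] = recipient
    return msg

# Sending is then a single SMTP call at the end of the crawl, e.g.:
# with smtplib.SMTP("smtp.example.internal") as s:
#     s.send_message(msg)
```

The point being that the crawler aggregates first and talks to the mail server exactly once, instead of once per broken link.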

@FuchsiaSoft
Author

Agreed re: alternatives 1 and 2... actually, as I was typing it I realised the same.

For hosting the report I'd probably go with Pastebin, which has a handy API for just such use cases... or maybe GitHub gists (I've never even checked if they're publicly accessible without a GitHub account, but I assume they are).
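For the Pastebin route, creating a paste is a single POST. A sketch in Python that just builds the request body without sending it (parameter names are as I recall them from Pastebin's API docs; `api_key` is a placeholder you'd get from your Pastebin account):

```python
from urllib.parse import urlencode

# Pastebin's paste-creation endpoint, per their API documentation.
PASTEBIN_API = "https://pastebin.com/api/api_post.php"

def build_pastebin_request(report_text, api_key, title="LinkCrawler report"):
    """Build the URL and form-encoded POST body for creating a paste.
    Sketch only: api_key is a placeholder, and the actual HTTP POST
    is left to whatever client the caller prefers."""
    params = {
        "api_dev_key": api_key,        # your Pastebin developer key
        "api_option": "paste",         # operation: create a new paste
        "api_paste_code": report_text, # the report body itself
        "api_paste_name": title,
        "api_paste_private": "1",      # 1 = unlisted, viewable via link only
    }
    return PASTEBIN_API, urlencode(params)
```

On success Pastebin returns the paste URL in the response body, which is exactly what the bot would tweet back.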

And SMTP yes definitely just email the aggregate report, sorry if I didn't explain that point.

coolio... I'll crack on ASAP :)

p.s. I have a separate issue around proxy support, but I can raise that in a separate issue for tracking/clarity

@hmol
Owner

hmol commented May 10, 2016

I was not aware that you could use Pastebin for this; really great to avoid having to host the reports ourselves :bowtie:.
If I understand this correctly, there needs to be an instance of LinkCrawler running on a server someplace, receiving events from Twitter, finding broken links on the requested website, and then tweeting a response. Do you think I should continue to use an Azure WebJob for this, or did you have something else in mind? Should there be a limit on the number of crawled links?

PS: If you want to implement SMTP support, maybe you could create a separate issue for that as well?

@FuchsiaSoft
Author

Yes, it would need to run in an endless loop, essentially polling Twitter for mentions. There is a convenient endpoint for that exact purpose, so yeah, it's pretty much as simple as you describe.
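The polling loop itself is simple: remember the highest mention id seen and ask only for newer mentions on each pass. A sketch in Python (again, the project is C#; `fetch_mentions` stands in for the real mentions endpoint and is assumed to return mentions newest-first, as Twitter's timelines do):

```python
import time

def poll_mentions(fetch_mentions, handle_mention, interval=60, iterations=None):
    """Poll for new mentions, tracking since_id so each mention is
    handled exactly once. fetch_mentions(since_id) is a placeholder
    for the real API call; iterations=None means poll forever."""
    since_id = 0
    passes = 0
    while iterations is None or passes < iterations:
        mentions = fetch_mentions(since_id)   # only mentions newer than since_id
        for mention in reversed(mentions):    # handle oldest first
            handle_mention(mention)
            since_id = max(since_id, mention["id"])
        passes += 1
        if iterations is None or passes < iterations:
            time.sleep(interval)              # be polite to the rate limit
    return since_id
```

`handle_mention` is where the crawl gets kicked off and the report link tweeted back once it completes.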

For Twitter, the WebJob wouldn't be the way to go, but since we're going with the aggregated report option I'd suggest putting the Twitter side of things into its own application and moving the logic of LinkCrawler into a class library. The console program would still function as normal, but we could use LinkCrawler in other places too.

And I'd plan on running the Twitter side on a Raspberry Pi using Mono... but I'd need to make sure the code ports to Mono OK first. Would be a cool thing to have running though 😄

@hmol
Owner

hmol commented May 10, 2016

Cool! 👍 Let me know if you need anything, looking forward to your pull request 😄

@hmol
Owner

hmol commented May 12, 2016

Maybe it would be relevant to take a look at User streams https://dev.twitter.com/streaming/userstreams ?
