Compensate for bug in Phishing.Database
Due to a bug in Phishing.Database we are not able to do a full search in the active files. For that reason we now import `ALL-phishing-links.txt` and strip it down to a domain-only list in `data/phishing_database/`.
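
As an illustration of the "strip it down to a domain-only list" step, here is a minimal shell sketch. It is not the code from this commit: the function name `links_to_domains` is hypothetical, and it leaves out the `Domain::PublicSuffix` check that `import.sh` performs. It assumes one URL per line on stdin and does not handle IPv6 literals.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not part of the commit): reduce the
# URLs read from stdin to a sorted list of unique host names.
links_to_domains() {
    sed -E \
        -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||' \
        -e 's|[/?#].*$||' \
        -e 's|^.*@||' \
        -e 's|:[0-9]+$||' \
        -e 's|\.$||' |
        tr '[:upper:]' '[:lower:]' |
        grep -Ev '^$|^([0-9]{1,3}\.){3}[0-9]{1,3}$' |
        LC_ALL=C sort -u
}
# The sed expressions, in order: drop the scheme, drop path/query/fragment,
# drop userinfo, drop the port, drop a trailing dot. grep then removes empty
# lines and bare IPv4 addresses before the list is sorted and de-duplicated.
```

Usage, assuming the list was already downloaded to `/tmp/ALL-phishing-links.txt`: `links_to_domains </tmp/ALL-phishing-links.txt >data/phishing_database/ALL-phishing-links.txt`. Sorting with `LC_ALL=C` keeps the output order stable across machines, which helps keep later diffs of the list small.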

Related issues:
- https://github.com/mitchellkrogza/Phishing.Database/issues/840
- https://github.com/mitchellkrogza/Phishing.Database/issues/881
- Phishing-Database/phishing#381 (comment)
- Phishing-Database/phishing#396
- Phishing-Database/phishing#407
- https://github.com/mitchellkrogza/phishing/issues/395
- mypdns/matrix#624
- blocklistproject/Lists#1252
- Phishing-Database/Phishing.Database#722

Trying `@main` for the PHP installer (`shivammathur/setup-php`) and PHP version 8.4.

Added `libdomain-publicsuffix-perl` to the `dependencies.sh` script, as the Perl code in `import.sh` requires it. It turns out Perl just annoyingly does it again... 😏
spirillen committed Jul 2, 2024
1 parent f7fefe0 commit 4f6c877
Showing 6 changed files with 28 additions and 10 deletions.
7 changes: 4 additions & 3 deletions .github/workflows/master.yml
@@ -36,9 +36,10 @@ jobs:
 token: '${{ secrets.GITHUB_TOKEN }}'

 - name: Setup PHP
-uses: shivammathur/setup-php@v2
+uses: shivammathur/setup-php@main
 with:
-php-version: '8.1'
+php-version: '8.4'
+extensions: mysql, imagick

 - name: Install requirements
 run: |
@@ -54,7 +55,7 @@ jobs:
 - name: Download and convert dumb hosts to plain data
 run: php -f "${{ github.workspace }}/scripts/converter.php"

-- name: Update active list
+- name: Update active source list
 run: bash "${{ github.workspace }}/scripts/update_active_lists.sh"

 - name: Commit changes
7 changes: 5 additions & 2 deletions CHANGELOG
@@ -1,8 +1,11 @@
-21 June 2020
+02. July 2024
+- Added Phishing.Database/ALL-phishing-links.txt
+- Added and started using perl
+
+21. June 2020
 - Updated the HpHost location
 - Adding .dtq zones
 - Updated the code for blocklist.site to use array

 15. January 2020
 - Replaces MobileAdTracker with adaway.github.io

6 changes: 4 additions & 2 deletions scripts/converter.php
@@ -1,5 +1,7 @@
 <?php
-//script source https://raw.githubusercontent.com/r-a-y/mobile-hosts/master/converter.php
+// script source https://raw.githubusercontent.com/r-a-y/mobile-hosts/master/converter.php
+// https://github.com/r-a-y/mobile-hosts/blob/master/
+// License: GPL-3.0 https://github.com/r-a-y/mobile-hosts/blob/master/LICENSE
 // Add our lists.
 $lists = array(
 'adAway' => 'https://raw.githubusercontent.com/AdAway/adaway.github.io/master/hosts.txt',
@@ -46,7 +48,7 @@
 'phishingArmy' => 'https://phishing.army/download/phishing_army_blocklist_extended.txt',
 'Phishing.Database' => 'https://raw.githubusercontent.com/mitchellkrogza/Phishing.Database/master/phishing-domains-ACTIVE.txt',
 'Phishing.DatabaseAll' => 'https://raw.githubusercontent.com/mitchellkrogza/Phishing.Database/master/ALL-phishing-domains.txt',
-'Phishing.DatabaseAllLinks' => 'https://raw.githubusercontent.com/mitchellkrogza/Phishing.Database/master/ALL-phishing-links.txt',
+// 'Phishing.DatabaseAllLinks' => 'https://raw.githubusercontent.com/mitchellkrogza/Phishing.Database/master/ALL-phishing-links.txt',
 'QuidsupMixed' => 'https://quidsup.net/notrack/blocklist.php?download=trackersdomains',
 'ShadowWhispererAds' => 'https://raw.githubusercontent.com/ShadowWhisperer/BlockLists/master/Lists/Ads',
 'ShadowWhispererApple' => 'https://raw.githubusercontent.com/ShadowWhisperer/BlockLists/master/Lists/Apple',
5 changes: 3 additions & 2 deletions scripts/dependencies.sh
@@ -3,7 +3,7 @@
 # The perpose of this script is to import various eternal hosts files into lists
 # that contail only domain.tld for easier working with the lists to our RPZ files

-# Exit on any erros
+# Exit on any errors

 set -e

@@ -27,4 +27,5 @@ bash -c "$(curl -sL https://raw.githubusercontent.com/ilikenwf/apt-fast/master/q

 apt-fast update -yqq
 #apt-fastdist-upgrade -yqq
-apt-fast install -yqq openssh-client curl wget dos2unix ldnsutils
+apt-fast install -yqq openssh-client curl wget dos2unix ldnsutils \
+libdomain-publicsuffix-perl
10 changes: 10 additions & 0 deletions scripts/import.sh
@@ -184,6 +184,15 @@ echo "Imported openfish.com"
 # START @mitchellkrogza's many lists
 # echo "START importing @mitchellkrogza's many lists"

+# Perlscript as by https://unix.stackexchange.com/a/745455
+
+mkdir -p "${git_dir}/data/phishing_database/"
+c "https://raw.githubusercontent.com/mitchellkrogza/Phishing.Database/master/ALL-phishing-links.txt" >/tmp/ALL-phishing-links.txt
+perl -MDomain::PublicSuffix -lne '
+BEGIN{$s = Domain::PublicSuffix->new}
+print if $_ eq $s->get_root_domain($_)' </tmp/ALL-phishing-links.txt |
+sed -r 's/^(https?|ftp)\:\/\///g;s/\/.*//g;s/.*@//g;s/\.$//g;/((1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])$/d;/\^.$/d' | uniq | python3.11 ~/Projects/github/mypdns/matrix/tools/domain-sort.py >"data/phishing_database/ALL-phishing-links.txt"
+
 # mkdir -p "${git_dir}/data/mitchellkrogza/badd_boyz_hosts/"
 # echo ""
 # echo "Badd-Boyz-Hosts"
@@ -270,6 +279,7 @@ echo "Imported openfish.com"
 echo ""
 echo ""
 echo "The script ${0}"
+# shellcheck disable=SC2320
 echo -e "Exited with error code ${?}\n\n"

 # git add .
3 changes: 2 additions & 1 deletion scripts/update_active_lists.sh
@@ -10,7 +10,8 @@ truncate -s 0 "${git_dir}/sources.list"

 # shellcheck disable=SC2044
 for lists in $(find data/ -type f -name domain.list); do
-printf "$github.workspace/-/raw/master/$lists\n" | sort -u -f >>"${git_dir}/sources.list"
+printf "$github.workspace/-/raw/master/$lists\n" |
+sort -u -f >>"${git_dir}/sources.list"
 done

 echo -e "\n\nThe script ${0}\nExited with error code ${?}\n\n"
