List spring cleaning #91
Comments
good idea! even if it's not strictly necessary, it's always good to clean up things every now and then
for #92, I got (most of) the domains out of the rules file with:

    cat void-gr-filters.txt | grep -Po '.*?(//|\|\||@@\|\||@@|\~)\K.*?(?=/|#)' | sort | uniq > domain-list.txt
    cat void-gr-filters.txt | grep -Po '^[0-9a-zA-Z].*?(?=/|#)' | sort | uniq >> domain-list.txt
    sort domain-list.txt | uniq > domain-list-final.txt

and then ran this:

    #!/bin/bash
    DOMAIN_LIST="domain-list-final.txt"
    #DOMAIN_LIST="testme.txt"
    RESOLVER="1.1.1.1"
    BAD_DOMAINS="bad_domains"
    SUB_NO_RECORD="no_record"
    WWW_EXISTS="www_exists"
    rm -f "${BAD_DOMAINS}" "${SUB_NO_RECORD}" "${WWW_EXISTS}"
    while read -r line; do
        # cleanups
        myline=$(echo "${line}" | awk -F':' '{ print $1 }')
        line=$(echo "${myline}" | grep -Ev '/|\|' | grep -Ev '^[0-9]')
        if [ "x${line}" = "x" ]; then
            continue
        fi
        echo "Working on: ${line}"
        # Check if the subdomain exists
        if [ "$(dig "${line}" @${RESOLVER} +short)" = "" ]; then
            # Check if the subdomain with www prepended exists
            if [ "$(dig "www.${line}" @${RESOLVER} +short)" = "" ]; then
                domain=$(echo "${line}" | awk -F. '{ print $(NF-1) "." $NF }')
                # if the domain doesn't have NS records, the domain does not exist any more
                if [ "$(dig NS "${domain}" @${RESOLVER} +short)" = "" ]; then
                    echo "${domain}" | tee -a "${BAD_DOMAINS}"
                # if the entry is a subdomain we already know it doesn't have A record
                elif [ "$(echo "${line}" | grep -o '\.' | wc -l)" -gt "1" ]; then
                    echo "${line}" | tee -a "${SUB_NO_RECORD}"
                fi
            else
                echo "${line}" | tee -a "${WWW_EXISTS}"
            fi
        fi
    done < "${DOMAIN_LIST}"

double checked all "bad_domains" manually
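
One thing to watch in the script above: the NS check takes the last two labels of each entry, so a host such as `ads.example.com.gr` is checked as `com.gr` rather than `example.com.gr`. A minimal sketch of one way around that follows. The function name `registrable_domain` and the `TWO_LEVEL_SUFFIXES` list are hypothetical and not part of the original script, and the suffix list is illustrative rather than complete.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# TWO_LEVEL_SUFFIXES is an illustrative, incomplete list (assumption).
TWO_LEVEL_SUFFIXES="com.gr org.gr net.gr edu.gr gov.gr co.uk"

registrable_domain() {
    local host="$1" last_two suffix
    # Same extraction as the original script: the last two labels.
    last_two=$(echo "${host}" | awk -F. '{ print $(NF-1) "." $NF }')
    for suffix in ${TWO_LEVEL_SUFFIXES}; do
        if [ "${last_two}" = "${suffix}" ]; then
            # The last two labels are a known suffix: keep one more label
            # when there is one, otherwise return the input unchanged.
            echo "${host}" | awk -F. '{
                if (NF >= 3) print $(NF-2) "." $(NF-1) "." $NF
                else print $0
            }'
            return
        fi
    done
    echo "${last_two}"
}
```

In the loop, `domain=$(registrable_domain "${line}")` could stand in for the `awk` one-liner. A full solution would probably consult the Public Suffix List instead of a hand-written list.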
@kargig Good stuff!
Cosmetic filters have network start characters: 044bc9f (made in 2016), greek-adblockplus-filter/void-gr-filters.txt, lines 432 to 433 in 72bccd0
AdGuard disabled use in 2018: AdguardTeam/FiltersRegistry@a452d4d#diff-6472c0fcd53f81660278097de5b81a5a1cd70c38b8a5068d02039207a61d5726R93-R95
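
A rough way to look for other rules of this kind, assuming "network start characters" means a leading `||` on an element-hiding rule (`##` or `#@#`). The referenced lines are not quoted in this thread, so that reading is a guess, and the function name `find_prefixed_cosmetic_rules` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: list cosmetic rules that start with the network
# anchor "||". Assumes such rules look like "||example.gr##.selector".
find_prefixed_cosmetic_rules() {
    # -n prints line numbers; the pattern matches a leading "||", then any
    # run of characters other than '#', then "##" or "#@#".
    grep -nE '^\|\|[^#]*#@?#' "$1"
}
```

Running `find_prefixed_cosmetic_rules void-gr-filters.txt` would print each matching rule with its line number. Other cosmetic separators (for example `#?#`) are not covered by this pattern.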
Hello,
I was taking a brief look through the filter list. Considering that this is a project ongoing for more than 10 years, it might be worth performing some sort of spring cleaning, to identify and remove:
I can't think of a good methodology for this, other than manually checking.
Any thoughts and ideas? Is this even necessary?
I was thinking that reducing the rules might help with performance of adblockers? IDK.