Identifying and tracking Minecraft servers on Mojang's blocklist

sudofox/mojang-blocklist

I figured I'd try to get a more comprehensive list of the domains blocked by Mojang, so this is my stab at it.

useful bash snippets

Get a list of TLDs (this source may not be fully up to date)

curl -s https://raw.githubusercontent.com/umpirsky/tld-list/master/data/en/tld.txt|grep -Po "\(\K.+?(?=\))" > tld.txt

Get the middle segment (the label right before the TLD) of all identified entries, skipping anything that matches ddns (e.g. ddns.net), and write one bare segment per line

awk -F= '{print $2}' data/identified.txt|grep -v ddns|awk -F. '{print $(NF-1)}'|sort -u > middle_segments.txt
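To make the extraction concrete, here is a small sketch of the same pipeline wrapped in a function and fed made-up input. The `hash=hostname` layout of data/identified.txt is inferred from the `awk -F=` calls in this README, the function name `middle_segments` and the sample lines are hypothetical, and the `NF >= 2` guard is an addition so that empty or dot-less lines are skipped instead of tripping awk.

```shell
#!/usr/bin/env bash
# Sketch only: same idea as the one-liner above, reading hash=hostname lines on stdin.
middle_segments() {
  awk -F= '{print $2}' \
    | grep -v ddns \
    | awk -F. 'NF >= 2 {print $(NF-1)}' \
    | sort -u
}

# Made-up sample entries (the "hashes" are placeholders, not real SHA-1s).
printf '%s\n' \
  'aaaa=*.example.com' \
  'bbbb=play.example.com' \
  'cccc=*.something.ddns.net' \
  'dddd=*.another.org' | middle_segments
# prints:
# another
# example
```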

For every TLD in tld.txt, try *.segment.tld (also worth trying: no subdomain, play., mc., etc.)

while read -r tld; do awk -v tld="$tld" '{print "*."$1"."tld}' middle_segments.txt; done < tld.txt | pv -l | xargs -P3 node try_url.js
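The prefix variants mentioned above (wildcard, no subdomain, play., mc.) can be generated in one go. This is a rough sketch, not what the repo scripts do: `gen_candidates` is a hypothetical helper, and the prefix list is just the four variants named in the note.

```shell
#!/usr/bin/env bash
# Sketch only: print every prefix variant for one segment/TLD pair.
gen_candidates() {
  # $1 = middle segment, $2 = TLD
  local segment=$1 tld=$2 prefix
  for prefix in '*.' '' 'play.' 'mc.'; do
    printf '%s%s.%s\n' "$prefix" "$segment" "$tld"
  done
}

gen_candidates example com
# prints:
# *.example.com
# example.com
# play.example.com
# mc.example.com
```

The output is one candidate per line, so it can be piped into xargs the same way as the other snippets.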

Get a list of hashes which have not yet been identified

comm -23 <(sort -u data/current.txt) <(awk -F= '{print $1}' data/identified.txt |sort -u) > todo.txt
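For context on what "identified" means: the sketch below assumes each blocklist entry is the hex SHA-1 of a lowercase hostname pattern (such as `*.example.com`), one per line, so checking a candidate is a hash plus a lookup. I am not claiming this is how try_url.js works internally. `sha1_of` and `is_blocked` are hypothetical helpers, and `sha1sum` is the GNU coreutils tool (on macOS, `shasum` would be the nearest equivalent).

```shell
#!/usr/bin/env bash
# Sketch only: hash a candidate and look it up in a list of hex SHA-1s.
sha1_of() {
  # Hash the exact bytes of the argument (printf avoids a trailing newline).
  printf '%s' "$1" | sha1sum | awk '{print $1}'
}

is_blocked() {
  # $1 = candidate pattern, $2 = file with one hex SHA-1 per line
  grep -qxF "$(sha1_of "$1")" "$2"
}

# Demo against a throwaway list so the real data files are not touched.
list=$(mktemp)
sha1_of '*.example.com' > "$list"
is_blocked '*.example.com' "$list" && echo 'match'
is_blocked 'example.org' "$list" || echo 'no match'
rm -f "$list"
```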

For big lists of Minecraft server URLs:

Remove the first subdomain and replace it with *. (this also strips port numbers and normalizes casing)

sed 's/:.*$//' minecraftservers_org_scrape.txt | grep -Po '^[^.]+\K\..*' | tr '[:upper:]' '[:lower:]' | awk '{print "*"$1}' | xargs node try_url.js
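Here is the same normalization as a function with sample input, in case the one-liner is hard to read. It is a sketch using only sed, tr, and awk, so it does not need `grep -P`. `to_wildcard` is a hypothetical name and the hostnames are made up.

```shell
#!/usr/bin/env bash
# Sketch only: host[:port] lines in, lowercased *.rest-of-host lines out.
to_wildcard() {
  tr '[:upper:]' '[:lower:]' \
    | sed 's/:.*$//' \
    | awk -F. 'NF >= 2 { sub(/^[^.]*/, "*"); print }'
}

printf '%s\n' 'Play.Example.COM:25565' 'mc.hypothetical.net' 'localhost:25565' | to_wildcard
# prints:
# *.example.com
# *.hypothetical.net
```

Lines without a dot are dropped, and lines without a port pass through the port-stripping step unchanged. A two-label host such as example.com becomes *.com, so it may be worth filtering those out before feeding the result to try_url.js.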

Do SRV lookups for a list of domains (ports are stripped and bare IP addresses are skipped)

sed 's/:.*$//' domains.txt | tr '[:upper:]' '[:lower:]' | grep '[[:alpha:]]' | xargs -I{} -P10 timeout 5 dig srv _minecraft._tcp.{} +short | tee -a domains_srv_resolved.txt

Given a list of raw dig output for many srv lookups, filter for domains only and strip the trailing dot:

tr ' ' '\n' | grep '[[:alpha:]]' | sort -u | grep -Po '.+?(?=\.$)'
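For reference, `dig +short` prints SRV answers as `priority weight port target.`, so splitting on spaces and keeping only tokens that contain a letter leaves just the targets. A small sketch with made-up records (`srv_targets` is a hypothetical name, and sed is used here instead of `grep -P` to strip the trailing dot):

```shell
#!/usr/bin/env bash
# Sketch only: raw `dig +short srv` lines in, unique target hostnames out.
srv_targets() {
  tr ' ' '\n' | grep '[[:alpha:]]' | sort -u | sed 's/\.$//'
}

# Made-up SRV answers in dig's short format.
printf '%s\n' \
  '5 0 25565 mc.example.com.' \
  '0 5 25565 play.example.net.' \
  '5 0 25565 mc.example.com.' | srv_targets
# prints:
# mc.example.com
# play.example.net
```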

Try *.mc (or *.play) subdomains of already-identified wildcard domains; swap mc for play in the awk step to try the other prefix

awk -F= '{print $NF}' data/identified.txt | grep '[[:alpha:]]' | grep -Po '\*\.\K.*' | awk '{print "*.mc."$1}' | xargs node try_url.js

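Find bypassers via SRV: re-resolve the SRV records of identified domains, then try the targets they point to (bare, *., *.mc., *.play.)

awk -F= '{print $2}' data/identified.txt | sed 's/^\*\.//' | awk '{print "_minecraft._tcp."$1}' | xargs -L1 -P10 dig +short srv | tee srv_re_resolve.txt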
cat srv_re_resolve.txt |awk '{print $NF}'|sed 's/\.$//'|xargs node try_url.js
cat srv_re_resolve.txt |awk '{print $NF}'|sed 's/\.$//'|awk '{print "*."$1}'|xargs node try_url.js
cat srv_re_resolve.txt |awk '{print $NF}'|sed 's/\.$//'|awk '{print "*.mc."$1}'|xargs node try_url.js
cat srv_re_resolve.txt |awk '{print $NF}'|sed 's/\.$//'|awk '{print "*.play."$1}'|xargs node try_url.js