“Exploring the Tor dataset with Metabase”: My version
About a week ago, Gus had a simple question: “is there an easy way to find the percentage of offline bridges with distribution mechanism = none?” He asked if I'd maybe be interested in working on it, which of course I was.
He already knew how to do it, and the actual work turned out to be a bit more than that (see this issue if you are interested). What I needed was something called Metabase, plus a database and a script to glue things together, but those matter less: the real work happens inside Metabase. I love it, it's as simple as that. It lets me pull so much information about the network, with nice visuals, filtered however I want and, where it helps, summarized by whatever field I choose (e.g. the sum of exit probability).
Gus showed me Dustri's blog post introducing Metabase: https://dustri.org/b/exploring-the-tor-dataset-with-metabase.html. That's where you should start if you are interested.
I just expanded the script Dustri made:
#!/bin/bash
#
# torplanet - v0.2
#
#
# I don't have mongod enabled
# you can remove these two lines if you do
sudo systemctl start mongod
sleep 5
# Drop "world-old", copy the current bridges and relays "world" collections into "world-old", then drop "world"
mongosh < mongo-old.script
mongodump --db=bridges --archive | mongorestore --nsFrom=bridges.world --nsTo=bridges.world-old --archive
mongodump --db=relays --archive | mongorestore --nsFrom=relays.world --nsTo=relays.world-old --archive
mongosh < mongo.script
# Create dir
mkdir -p world
cd world
wget -O world.json https://onionoo.torproject.org/details
echo "Formatting json"
for f in $(jq -r 'keys[]' world.json); do jq ".$f" world.json > "$f.json"; done
# Import to mongodb
mongoimport --jsonArray --db relays --collection world --file relays.json
mongoimport --jsonArray --db bridges --collection world --file bridges.json
echo "Imported all relays and bridges in databases, collection world"
firefox &
java -jar /home/user/metabase/metabase.jar
mongo-old.script is just this:
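// "world-old" contains a hyphen, so db.world-old would be parsed as a subtraction
use bridges;
db.getCollection("world-old").drop();
use relays;
db.getCollection("world-old").drop();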
and mongo.script:
use bridges;
db.world.drop();
use relays;
db.world.drop();
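As a quick sanity check on what Metabase reports, Gus's original question can also be roughly answered from the command line against the `bridges.json` file the script writes. This is only a sketch under assumptions: I am assuming each bridge object carries a boolean `running` field and a `bridgedb_distributor` string (check your own dump for the exact field names), and I read the question as "of the bridges whose distributor is none, what share is offline". The function name `offline_none_pct` is made up for this example.

```shell
#!/bin/bash
# Sketch: percentage of bridges with distributor "none" that are offline.
# Assumed field names (verify against your own bridges.json):
#   .running               boolean, false when the bridge is offline
#   .bridgedb_distributor  string, "none" when the bridge is not distributed
# Usage: offline_none_pct path/to/bridges.json
offline_none_pct() {
  jq -r '
    # keep only bridges whose distributor is "none"
    map(select(.bridgedb_distributor == "none")) as $none
    | ($none | length) as $total
    # of those, count the ones that are not running
    | ($none | map(select(.running == false)) | length) as $offline
    # avoid dividing by zero when no bridge matches
    | if $total == 0 then "n/a" else ($offline * 100 / $total) end
  ' "$1"
}
```

After running the main script you could call it as `offline_none_pct world/bridges.json`. For the other reading of the question (of all offline bridges, what share has distributor none), swap the two `select` filters.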