A local educational institution requested assistance with its Tor web crawler project. The objective was to build a database of malicious files discovered by crawling websites on The Onion Router (Tor) network. The project used Apache Nutch to crawl, Elasticsearch to index, and Kibana to visualize the results. The central problem was that Nutch would not send crawled data to its specified destination, so the team had to find a way to index that information properly. The team also had to modify Nutch's configuration so that files with specific extensions would be fetched and indexed. The end result would be a dedicated server continuously crawling Tor pages to build a large search engine.
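Both tasks come down to Nutch configuration. The sketch below is a minimal illustration, not the team's actual settings: it assumes a Nutch 1.x install with the indexer-elastic plugin enabled and an Elasticsearch node on localhost, and the host, port, cluster, and index values are all placeholders (newer Nutch releases, 1.15 and later, move index-writer settings into conf/index-writers.xml instead).

    <!-- conf/nutch-site.xml: minimal sketch; all values are placeholders -->
    <configuration>
      <!-- Enable the Elasticsearch index writer alongside the usual plugins -->
      <property>
        <name>plugin.includes</name>
        <value>protocol-httpclient|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-elastic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
      </property>
      <!-- Where Nutch should send indexed documents (assumed local node) -->
      <property>
        <name>elastic.host</name>
        <value>localhost</value>
      </property>
      <property>
        <name>elastic.port</name>
        <value>9300</value> <!-- transport port used by the older Nutch writer -->
      </property>
      <property>
        <name>elastic.cluster</name>
        <value>elasticsearch</value>
      </property>
      <property>
        <name>elastic.index</name>
        <value>nutch</value> <!-- target index; Kibana visualizes this index -->
      </property>
    </configuration>

Note that the older TransportClient-based indexer-elastic writer talks to Elasticsearch's transport port (9300), not the REST port (9200); pointing it at the wrong port is one common reason crawled data never reaches the index. Which file extensions Nutch fetches at all is governed by conf/regex-urlfilter.txt, whose default rule skips binary suffixes. Deleting the extensions of interest from that rule lets the crawler collect candidate malicious files; for example, with exe and zip removed from the skip list:

    # conf/regex-urlfilter.txt: the default rule skips binary suffixes;
    # exe and zip have been removed here so those files are crawled
    -\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|mpg|MPG|mov|MOV|bmp|BMP)$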

Project Students: John Bovey
