DescripTor is a Java API that fetches Tor descriptors from a variety of sources like cached descriptors and directory authorities/mirrors. The DescripTor API is useful to support statistical analysis of the Tor network data and for building services and applications.
The descriptor types supported by DescripTor include relay and bridge descriptors which are part of Tor's directory protocol as well as Torperf data files and TorDNSEL's exit lists. Access to these descriptors is unified to facilitate access to publicly available data about the Tor network.
This API is designed for Java programs that process Tor descriptors in batches. A Java program using this API first sets up a descriptor source by defining where to find descriptors and which descriptors it considers relevant. The descriptor source then makes the descriptors available in a descriptor store. The program can then query the descriptor store for the contained descriptors. Changes to the descriptor sources after descriptors are made available in the descriptor store will not be noticed. This simple programming model was designed for periodically running, batch-processing applications and not for continuously running applications that rely on learning about changes to an underlying descriptor source.
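The snapshot behavior described above can be illustrated with a minimal sketch that uses only the Java standard library. `SnapshotStore` is a hypothetical stand-in and not part of the DescripTor API; it only shows why files that appear after the store was filled are not noticed until a new store is created.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical stand-in for a descriptor store; not part of the DescripTor API. */
class SnapshotStore {

  private final List<Path> files;

  private SnapshotStore(List<Path> files) {
    this.files = Collections.unmodifiableList(files);
  }

  /** Lists the regular files in the directory once and remembers the result. */
  static SnapshotStore load(Path directory) throws IOException {
    List<Path> found = new ArrayList<>();
    try (Stream<Path> entries = Files.list(directory)) {
      entries.filter(Files::isRegularFile).sorted().forEach(found::add);
    }
    return new SnapshotStore(found);
  }

  /** Returns the files that existed when the store was loaded. */
  List<Path> files() {
    return files;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("descriptors");
    Files.write(dir.resolve("consensus-1"), new byte[0]);
    SnapshotStore store = SnapshotStore.load(dir);
    // A file added after loading is invisible to the existing store.
    Files.write(dir.resolve("consensus-2"), new byte[0]);
    System.out.println("Existing store: " + store.files().size());
    System.out.println("Fresh store: " + SnapshotStore.load(dir).files().size());
  }
}
```

A periodically running application would simply build a fresh store on each run, which is the usage pattern this model targets.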
The executable jar, source jar, and javadoc jar can be found here [LINK]. Before using them please verify the release (see below for instructions).
Releases can be cryptographically verified to get some more confidence that they were put together by a Tor developer. The following steps explain the verification process by example.
Download the release tarball and the separate signature file:
wget https://dist.torproject.org/descriptor/1.0.0/descriptor-1.0.0.tar.gz
wget https://dist.torproject.org/descriptor/1.0.0/descriptor-1.0.0.tar.gz.asc
Attempt to verify the signature on the tarball:
gpg --verify descriptor-1.0.0.tar.gz.asc
If the signature cannot be verified due to the public key of the signer not being locally available, download that public key from one of the key servers and retry:
gpg --keyserver pgp.mit.edu --recv-key 0x4EFD4FDC3F46D41E
gpg --verify descriptor-1.0.0.tar.gz.asc
If the signature still cannot be verified, something is wrong!
But note that even if it can be verified, you now only know that the signature was made by the person claiming to own this key, which could be anyone. You'll need a trust path to the owner of this key in order to trust this signature, but that's clearly out of scope here. In short, your best chance is to meet a Tor developer in real life and enter the web of trust.
If you want to go one step further in the verification game, you can verify the signature on the .jar files.
Print and then import the provided X.509 certificate:
keytool -printcert -file CERT
keytool -importcert -alias karsten -file CERT
Verify the signatures on the contained .jar files using Java's jarsigner tool:
jarsigner -verify descriptor-1.0.0.jar
jarsigner -verify descriptor-1.0.0-sources.jar
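For applications that want to repeat a similar check programmatically, the following is a rough sketch based on `java.util.jar.JarFile` from the standard library. The class name `JarSignatureCheck` is made up for this example, and skipping everything under `META-INF/` is a simplification. The sketch only reports entries without any code signer; it does not decide whether a signer's certificate is trustworthy, so it does not replace `jarsigner -verify`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper; reports jar entries that carry no code signer. */
class JarSignatureCheck {

  static List<String> unsignedEntries(Path jar) throws IOException {
    List<String> unsigned = new ArrayList<>();
    // verify=true makes JarFile check digests and signatures while reading.
    try (JarFile jarFile = new JarFile(jar.toFile(), true)) {
      byte[] buffer = new byte[8192];
      Enumeration<JarEntry> entries = jarFile.entries();
      while (entries.hasMoreElements()) {
        JarEntry entry = entries.nextElement();
        // Simplification: skip directories and META-INF/, which holds the
        // manifest and the signature files themselves.
        if (entry.isDirectory() || entry.getName().startsWith("META-INF/")) {
          continue;
        }
        // Signers are only known after the entry was read completely;
        // a modified entry makes read() throw a SecurityException.
        try (InputStream in = jarFile.getInputStream(entry)) {
          while (in.read(buffer) != -1) {
            // Discard the content; reading is only needed for verification.
          }
        }
        if (entry.getCodeSigners() == null) {
          unsigned.add(entry.getName());
        }
      }
    }
    return unsigned;
  }

  public static void main(String[] args) throws IOException {
    List<String> unsigned = unsignedEntries(Paths.get(args[0]));
    System.out.println(unsigned.isEmpty()
        ? "All entries carry a signer."
        : "Unsigned entries: " + unsigned);
  }
}
```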
import org.torproject.descriptor.*;

import java.io.File;

public class DownloadConsensuses {
  public static void main(String[] args) {
    // Download consensuses published in the last 72 hours, which will
    // take up to five minutes and require several hundred MB on the local disk.
    DescriptorCollector descriptorCollector =
        DescriptorSourceFactory.createDescriptorCollector();
    descriptorCollector.collectDescriptors(
        // Download from Tor's main CollecTor instance,
        "https://collector.torproject.org",
        // include only network status consensuses
        new String[] { "/recent/relay-descriptors/consensuses/" },
        // regardless of last-modified time,
        0L,
        // write to the local directory called descriptors/,
        new File("descriptors"),
        // and don't delete extraneous files that do not exist remotely anymore.
        false);
  }
}
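To make the roles of the last-modified threshold and the delete flag in the call above more concrete, here is a small sketch of how such a synchronization could be planned. It is not the library's implementation: `SyncPlan`, its fields, and the rule "fetch what is recent enough and missing locally" are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical planning helper; not how DescriptorCollector is implemented. */
class SyncPlan {

  final List<String> toFetch = new ArrayList<>();
  final List<String> toDelete = new ArrayList<>();

  /**
   * remoteFiles maps each remote path to its last-modified time in
   * milliseconds; localFiles contains the paths that already exist locally.
   */
  static SyncPlan plan(Map<String, Long> remoteFiles, Set<String> localFiles,
      long minLastModified, boolean deleteExtraneous) {
    SyncPlan plan = new SyncPlan();
    for (Map.Entry<String, Long> remote : new TreeMap<>(remoteFiles).entrySet()) {
      boolean recentEnough = remote.getValue() >= minLastModified;
      if (recentEnough && !localFiles.contains(remote.getKey())) {
        plan.toFetch.add(remote.getKey());
      }
    }
    if (deleteExtraneous) {
      // Extraneous files exist locally but no longer exist remotely.
      for (String local : new TreeSet<>(localFiles)) {
        if (!remoteFiles.containsKey(local)) {
          plan.toDelete.add(local);
        }
      }
    }
    return plan;
  }

  public static void main(String[] args) {
    Map<String, Long> remote = new TreeMap<>();
    remote.put("consensus-a", 10L);
    remote.put("consensus-b", 20L);
    Set<String> local = new TreeSet<>();
    local.add("consensus-b");
    local.add("consensus-c");
    SyncPlan plan = plan(remote, local, 0L, false);
    System.out.println("fetch=" + plan.toFetch + " delete=" + plan.toDelete);
  }
}
```

With a threshold of `0L` every remote file qualifies, and with the flag set to `false` nothing is scheduled for deletion, which mirrors the arguments used in the example above.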
# Changes in version 1.6.0 - 2017-02-17

 * Major changes
   - Deprecate DescriptorDownloader in favor of the much more widely used DescriptorCollector.

 * Medium changes
   - Add two methods for loading and saving a parse history file in the descriptor reader to avoid situations where applications fail after all descriptors are read but before they are all processed.
   - Unify the build process by adding git-submodule metrics-base in src/build and removing all centralized parts of the build process.
   - Avoid deleting extraneous local descriptor files when collecting descriptors from CollecTor.
   - Turn the descriptor reader thread into a daemon thread, so that the application can decide at any time to stop consuming descriptors without having to worry about the reader thread not being done.
   - Parse "proto" lines in server descriptors, "pr" lines in status entries, and "(recommended|required)-(client|relay)-protocols" lines in consensuses and votes.
   - Parse "shared-rand-.*" lines in consensuses and votes.
   - Deprecate DescriptorCollectorImpl now that DescriptorIndexCollector is the default.

# Changes in version 1.5.0 - 2016-10-19

 * Major changes
   - Make the DescriptorCollector implementation that uses CollecTor's index.json file to determine which descriptor files to fetch the new default. Applications must provide gson-2.2.4.jar or higher as dependency.

 * Minor changes
   - Avoid running into an IOException and logging a warning for it.

# Changes in version 1.4.0 - 2016-08-31

 * Major changes
   - Add the Simple Logging Facade for Java (slf4j) for logging support rather than printing warnings to stderr. Applications must provide slf4j-api-1.7.7.jar or higher as dependency and can optionally provide a compatible logging framework of their choice (java.util.logging, logback, log4j).

 * Medium changes
   - Add an alpha version of a DescriptorCollector implementation that is not enabled by default and that uses CollecTor's index.json file to determine which descriptor files to fetch. Applications can enable this implementation by providing gson-2.2.4.jar or higher as dependency and setting property descriptor.collector to org.torproject.descriptor.index.DescriptorIndexCollector.

 * Minor changes
   - Include resource files in src/*/resources/ in the release tarball.
   - Move executable, source, and javadoc jar to generated/dist/.

# Changes in version 1.3.1 - 2016-08-01

 * Medium changes
   - Adapt to CollecTor's new date format to make DescriptorCollector work again.

# Changes in version 1.3.0 - 2016-07-06

 * Medium changes
   - Parse "package" lines in consensuses and votes.
   - Support more than one "directory-signature" line in a vote, which may become relevant when authorities start signing votes using more than one algorithm.
   - Provide directory signatures in consensuses and votes in a list rather than a map to support multiple signatures made using the same identity key digest but different algorithms.
   - Be more lenient about digest lengths in directory signatures which may be longer or shorter than 20 bytes.
   - Parse "tunnelled-dir-server" lines in server descriptors.

 * Minor changes
   - Stop reporting "-----END .*-----" lines in v2 network statuses as unrecognized.

# Changes in version 1.2.0 - 2016-05-31

 * Medium changes
   - Include the hostname in directory source entries of consensuses and votes.
   - Also accept \r\n as newline in Torperf results files.
   - Make unrecognized keys of Torperf results available together with the corresponding values, rather than just the whole line.
   - In Torperf results, recognize all percentiles of expected bytes read for 0 <= x <= 100 rather than just x = { 10, 20, ..., 90 }.
   - Rename properties for overriding default descriptor source implementation classes.
   - Actually return the signing key digest in network status votes.
   - Parse crypto parts in network status votes.
   - Document all public parts in org.torproject.descriptor and add an Ant target to generate Javadocs.

 * Minor changes
   - Include a Torperf results line with more than one unrecognized key only once in the unrecognized lines.
   - Make "consensus-methods" line optional in network statuses votes, which would mean that only method 1 is supported.
   - Stop reporting "-----END .*-----" lines in directory key certificates as unrecognized.
   - Add code used for benchmarking.

# Changes in version 1.1.0 - 2015-12-28

 * Medium changes
   - Parse flag thresholds in bridge network statuses, and parse the "ignoring-advertised-bws" flag threshold in relay network status votes.
   - Support parsing of .xz-compressed tarballs using Apache Commons Compress and XZ for Java. Applications only need to add XZ for Java as dependency if they want to parse .xz-compressed tarballs.
   - Introduce a new ExitList.Entry type for exit list entries instead of the ExitListEntry type which is now deprecated. The main difference between the two is that ExitList.Entry can hold more than one exit address and scan time which were previously parsed as multiple ExitListEntry instances.
   - Introduce four new types to distinguish between relay and bridge descriptors: RelayServerDescriptor, RelayExtraInfoDescriptor, BridgeServerDescriptor, and BridgeExtraInfoDescriptor. The existing types, ServerDescriptor and ExtraInfoDescriptor, are still usable and will not be deprecated, because applications may not care whether a relay or a bridge published a descriptor.
   - Support Ed25519 certificates, Ed25519 master keys, SHA-256 digests, and Ed25519 signatures thereof in server descriptors and extra-info descriptors, and support Ed25519 master keys in votes.
   - Include RSA-1024 signatures of SHA-1 digests of extra-info descriptors, which were parsed and discarded before.
   - Support hidden-service statistics in extra-info descriptors.
   - Support onion-key and ntor-onion-key cross certificates in server descriptors.

 * Minor changes
   - Start using Java 7 features like the diamond operator and switch on String, and use StringBuilder correctly in many places.

# Changes in version 1.0.0 - 2015-12-05

 * Major changes
   - This is the initial release after four years of development. Happy 4th birthday!
Dear contributor to metrics-lib, this text is an attempt to tell you how we do development for this fine library. We highly encourage you to read it when making contributions to metrics-lib to make it easier for us to accept them. But we also invite you to question these guidelines and make suggestions if you see room for improvement.
Before we go into the details of writing code, let's briefly talk about the purpose of metrics-lib. Back in 2011, the reason for creating this library was to avoid rewriting the same code over and over that would handle data gathered in the Tor network. metrics-lib is now being used in the major Java-based tools in the Tor metrics space, and it's being used by researchers to do one-off analyses of Tor network data.
metrics-lib is not that big, so it shouldn't be difficult to go through the interfaces and classes to see what they are doing. But to give you a general overview, here are some highlights:
We tried to keep the number of dependencies as small as possible, and we tried to avoid adding any dependencies that wouldn't be available in common operating system distributions like Debian stable. That doesn't mean that we're opposed to adding further dependencies, but we need to keep in mind that any user of our library will have to add those dependencies, too.
metrics-lib currently has the following dependencies to compile:
We're using a code style that is not really formally defined but that roughly follows these rules:
There's probably more to say about code style, but please take a look at the existing code and try to write new code as similar as possible.
metrics-lib is still rather light on unit tests, but that shouldn't prevent us from writing tests for new code. Test classes go into a separate source directory and use the same package structure as the class they're supposed to test.
We have to assume that applications don't update their metrics-lib version very often. This is related to the lack of a release process until recently. If we want to remove a feature we'll have to deprecate it and basically keep it working for at least another year.
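As a rough illustration of this deprecate-then-remove approach, a deprecated method can stay functional by delegating to its replacement. The class and method names below are invented for this sketch and are not taken from metrics-lib.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical example class; the names are not taken from metrics-lib. */
class ExampleExitEntry {

  private final List<String> exitAddresses;

  ExampleExitEntry(String... exitAddresses) {
    this.exitAddresses = Arrays.asList(exitAddresses);
  }

  /** Replacement method that can return more than one address. */
  List<String> getExitAddresses() {
    return exitAddresses;
  }

  /**
   * Returns the first exit address, or null if there is none.
   *
   * @deprecated Use {@link #getExitAddresses()} instead. This method keeps
   *     working so that applications on older releases have time to migrate.
   */
  @Deprecated
  String getExitAddress() {
    return exitAddresses.isEmpty() ? null : exitAddresses.get(0);
  }

  public static void main(String[] args) {
    ExampleExitEntry entry = new ExampleExitEntry("192.0.2.1", "192.0.2.2");
    System.out.println(entry.getExitAddress());
    System.out.println(entry.getExitAddresses());
  }
}
```

Callers of the old method get a compiler warning but no behavior change, and the method can be removed in a later backwards-incompatible release.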
We're keeping a change log since we started putting out releases. Here we're going to describe what deserves a change log entry and whether those changes are major, medium, or minor:
As a rule of thumb, we should put out a new release of metrics-lib soon after making a major change as listed under "Change log" above. If we're planning to make more changes soon after, let's wait for them and make a release with everything. But we shouldn't let a major change sit in an unreleased metrics-lib for more than, say, two weeks. In contrast to that, medium changes can stay unreleased for longer, though they don't have to if we want to use them in an application sooner. Minor changes can be collected and usually will be released with changes on higher stages, but when necessary they can be released earlier.
Regarding version numbers, we started with 1.0.0 and bumped to 1.1.0, 1.2.0, etc., all of which contained only backwards-compatible changes. Whenever we remove a previously deprecated feature, which is a backwards-incompatible change, we'll bump to 2.0.0. For minor changes, we'd bump to x.x.1.
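The numbering scheme can be written down as a short sketch. `VersionBump` and the names of its `Change` constants are hypothetical and only encode the three cases described above.

```java
/** Hypothetical helper that encodes the version numbering rules above. */
class VersionBump {

  enum Change { REMOVES_DEPRECATED_FEATURE, BACKWARDS_COMPATIBLE, MINOR }

  /** Returns the next version for a "major.minor.patch" version string. */
  static String next(String version, Change change) {
    String[] parts = version.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("Expected x.y.z but got " + version);
    }
    int major = Integer.parseInt(parts[0]);
    int minor = Integer.parseInt(parts[1]);
    int patch = Integer.parseInt(parts[2]);
    switch (change) {
      case REMOVES_DEPRECATED_FEATURE:
        // Backwards-incompatible: bump the first number and reset the rest.
        return (major + 1) + ".0.0";
      case BACKWARDS_COMPATIBLE:
        return major + "." + (minor + 1) + ".0";
      default:
        // MINOR: only the last number changes.
        return major + "." + minor + "." + (patch + 1);
    }
  }

  public static void main(String[] args) {
    System.out.println(next("1.6.0", Change.BACKWARDS_COMPATIBLE));
    System.out.println(next("1.6.0", Change.MINOR));
    System.out.println(next("1.6.0", Change.REMOVES_DEPRECATED_FEATURE));
  }
}
```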
Releases are cryptographically signed on multiple levels: the Git tag created for the release is signed using GnuPG, the produced .jar files are signed using Java's jarsigner tool, and the produced tarball is again signed using GnuPG. We'll assume that you're familiar with GnuPG and how to manage keys for it, but we're including some sample commands for managing keys for jarsigner using the less commonly known Java keytool below.
First, generate a new key pair and create a certificate that expires after 90 days using reasonable (yet compatible) cryptographic algorithms and parameters:
keytool -genkeypair -alias karsten -keyalg RSA -keysize 2048 \
-sigalg SHA256withRSA -validity 90 \
-dname "CN=Karsten Loesing, O=The Tor Project\, Inc, L=Seattle, ST=WA, C=US"
Extend the certificate for the existing key pair when it expires:
keytool -selfcert -alias karsten
Export the certificate to a local file called CERT:
keytool -exportcert -alias karsten -rfc -file CERT
Also, in order to sign releases using Ant, you'll have to create a file `build.properties` with content similar to the content below (note that without such a file, the Ant targets starting at `signjar` won't work):
jarsigner.alias=karsten
jarsigner.storepass=password
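Before starting a release, it may help to check that the file contains both entries. The following sketch uses `java.util.Properties` from the standard library; the class name `BuildPropertiesCheck` is made up, and the assumption that these two keys are all the signing targets need is taken only from the sample above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-release check for the build.properties file. */
class BuildPropertiesCheck {

  // Assumption: these are the keys the signing targets read.
  static final String[] REQUIRED = { "jarsigner.alias", "jarsigner.storepass" };

  /** Returns the required keys that are absent or have an empty value. */
  static List<String> missingKeys(Reader source) throws IOException {
    Properties properties = new Properties();
    properties.load(source);
    List<String> missing = new ArrayList<>();
    for (String key : REQUIRED) {
      String value = properties.getProperty(key);
      if (value == null || value.trim().isEmpty()) {
        missing.add(key);
      }
    }
    return missing;
  }

  public static void main(String[] args) throws IOException {
    // In practice, pass a FileReader for build.properties instead.
    String sample = "jarsigner.alias=karsten\n";
    System.out.println("Missing keys: " + missingKeys(new StringReader(sample)));
  }
}
```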
Putting out a new release requires a series of steps:
Edit `build.xml` and raise the `release.version` property to the desired new release.
Edit `CHANGELOG.md` and make sure it contains the correct date for the release.
Commit these changes.
Clean up the src/ and lib/ directories from any files that may have been used locally for developing and that shouldn't be included in the tarball.
Use Ant to first clean up and then create a tarball containing all sources and signed .jar files:
ant clean compile test jar signjar tar
Sign the produced tarball using GnuPG:
gpg --detach-sign --armor --local-user 0x4EFD4FDC3F46D41E \
descriptor-1.0.0.tar.gz
Verify the signed tarball, ideally on a different system, as described in `README.md`.
Create a signed Git tag for the new release:
git tag -s descriptor-1.0.0 -m "DescripTor 1.0.0"
Push the branch. Ideally, verify the tag signature by cloning the repository on another system and running the following command:
git verify-tag descriptor-1.0.0
Upload the tarball and signature file and announce the new version.
Edit `build.xml` again and raise `release.version` to the current release plus `-dev`, e.g., `1.0.0-dev`.
If you want to start working on metrics-lib, you can clone the repo:
git clone --recursive https://git.torproject.org/metrics-lib.git
In case you forgot to add '--recursive', just run the bootstrap script for the submodule:
./src/main/resources/bootstrap-development.sh
or the contained git command:
git submodule update --init --remote
There are no metrics-lib packages yet, but we should aim for providing packages for at least Debian stable, either official or unofficial.
Dear contributor, now that you made it to the end of this guide, please be reminded that these are just guidelines that shall make it easier for us to work on metrics-lib. But we're making these rules ourselves, and that "we" includes you. Please suggest any changes to this guide and help us make it better. Thanks!
© 2017 The Tor Project
Data on this site is freely available under a CC0 no copyright declaration: To the extent possible under law, the Tor Project has waived all copyright and related or neighboring rights in the data. "Tor" and the "Onion Logo" are registered trademarks of The Tor Project, Inc.