Analyse Namecoin data using Apache Spark
This is a Spark application demonstrating some of the capabilities of the hadoopcryptoledger library. It takes as input a set of files on HDFS containing Namecoin blockchain data and returns as output the total number of transactions found in that data. It has been tested successfully with the Cloudera Quickstart VM 5.5 and the HDP Sandbox 2.5, but other Hadoop distributions should work equally well. Spark 1.5 was used for testing.
Namecoin describes itself as a distributed, blockchain-based domain name and identity system. Namecoin data uses the same data structures as Bitcoin data, with two additions: 1) special output scripts for name operations and 2) merged mining (AuxPOW), which gives Bitcoin miners an incentive to mine Namecoin as well. Both introduce additional data structures. The first is addressed by additional methods (cf. Useful Utility functions). The second is addressed by support for reading AuxPOW information, which must be activated so that Namecoin blockchain data can be processed properly (see here). Finally, you need to configure the Namecoin network magic instead of the Bitcoin one (cf. Support for Altcoins based on Bitcoin).
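To make the idea of a network magic more concrete, the following sketch checks whether a raw block file starts with the Namecoin magic before you copy it to HDFS. It rests on two assumptions that the text above does not state: that raw block files begin with the 4-byte network magic (as Bitcoin's blk*.dat files do), and that the Namecoin magic is F9BEB4FE (Bitcoin mainnet uses F9BEB4D9). Please verify the value against the "Support for Altcoins based on Bitcoin" page. The function names are illustrative and not part of hadoopcryptoledger.

```shell
#!/bin/sh
# Print the first four bytes of a file as lowercase hex, e.g. "f9beb4fe".
block_magic() {
    od -An -tx1 -N4 "$1" | tr -d ' \t\n'
}

# ASSUMPTION: Namecoin network magic is F9BEB4FE; check the project docs.
NAMECOIN_MAGIC="f9beb4fe"

# Succeed (exit status 0) if the file starts with the assumed Namecoin magic.
is_namecoin_blockfile() {
    [ "$(block_magic "$1")" = "$NAMECOIN_MAGIC" ]
}
```

A possible use, with a hypothetical file name: `is_namecoin_blockfile blk00000.dat && echo "magic matches"`. A mismatch usually means the file belongs to a different network, in which case a reader configured with the Namecoin magic would not recognize its blocks.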
See here how to fetch Namecoin blockchain data.
After the data has been copied to HDFS, you are ready to use the example.
Execute
git clone https://github.com/ZuInnoTe/hadoopcryptoledger.git hadoopcryptoledger
You can build the application by changing to the directory hadoopcryptoledger/examples/spark-scala-namecoinblock and using the following command:
sbt clean assembly test it:test
This also executes the integration tests.
You will find the jar "example-hcl-spark-scala-namecoinblock.jar" in ./target/scala-2.10.
Make sure that the output directory is clean:
hadoop fs -rm -R /user/namecoin/output
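On a first run the output directory does not exist yet, and the plain remove command then reports an error. A minimal sketch of a guarded cleanup, assuming the standard `hadoop fs -test -d` check is available in your distribution (the function name `clean_output_dir` is illustrative):

```shell
# Remove an HDFS directory only if it exists, so that a first run
# does not fail with "No such file or directory".
clean_output_dir() {
    dir="$1"
    if hadoop fs -test -d "$dir"; then
        hadoop fs -rm -R "$dir"
    else
        echo "nothing to remove: $dir"
    fi
}
```

For this example you would call it as `clean_output_dir /user/namecoin/output`.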
Execute the following command to run the job with a local master (here with 8 worker threads):
spark-submit --class org.zuinnote.spark.namecoin.example.SparkScalaNamecoinBlockCounter --master local[8] ./target/scala-2.10/example-hcl-spark-scala-namecoinblock.jar /user/namecoin/input /user/namecoin/output
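If you want to vary the number of local worker threads or the HDFS paths, the command can be wrapped in a small function. This is only a sketch: the function name, the argument order, and the `SPARK_SUBMIT` override variable are illustrative assumptions, while the class name and jar path are taken from the command above.

```shell
# Submit the example job with a configurable number of local worker threads.
# SPARK_SUBMIT is a hypothetical override for the spark-submit executable.
submit_namecoin_counter() {
    threads="$1"
    input="$2"
    output="$3"
    "${SPARK_SUBMIT:-spark-submit}" \
        --class org.zuinnote.spark.namecoin.example.SparkScalaNamecoinBlockCounter \
        --master "local[${threads}]" \
        ./target/scala-2.10/example-hcl-spark-scala-namecoinblock.jar \
        "$input" "$output"
}
```

For example, `submit_namecoin_counter 8 /user/namecoin/input /user/namecoin/output` corresponds to the command shown above. The master URL is quoted because an unquoted `local[8]` is a glob pattern in most shells and could be expanded, or rejected, before Spark ever sees it.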
After the Spark job has completed, you will find the result in /user/namecoin/output. You can display it using the following command:
hadoop fs -cat /user/namecoin/output/part-00000
Blog about Namecoin analytics: https://snippetessay.wordpress.com/2017/10/10/big-data-analytics-on-bitcoins-first-altcoin-namecoin/
Understanding the structure of Bitcoin data (Litecoin is very similar):
Blocks: https://en.bitcoin.it/wiki/Block
Transactions: https://en.bitcoin.it/wiki/Transactions
Generic information about Namecoin: https://en.wikipedia.org/wiki/Namecoin
Namecoin Webpage: https://namecoin.org