Log Generator Driver
Log Generator Driver command line interface
```
bin/generator log --help

Log Generator
Usage: generator [options]

  -e <value> | --eventsPerSec <value>
        number of log events to generate per sec, use this to throttle the generator
  -o <value> | --outputFormat <value>
        format of the string to write to the file, defaults to: 'tsv'
        where,
          text - string formatted by tabs in between columns
          avro - string formatted using avro serialization
  -d <value> | --destination <value>
        destination where the generator writes data to, defaults to: 'file'
        where,
          file - outputs directly to flat files
          kafka - outputs to the specified kafka topic
          kinesis - outputs to the specified kinesis stream
  -r <value> | --fileRollSize <value>
        size of the file to roll in bytes, defaults to: Int.MaxValue (don't roll files)
  -p <value> | --filePath <value>
        path of the file where the data should be generated, defaults to: '/tmp'
  -t <value> | --totalEvents <value>
        total number of events to generate, defaults to: 1000
  -b <value> | --flushBatch <value>
        number of events to flush to the file at a single time, defaults to: 10000
  --kafkaBrokerList <value>
        list of kafka brokers to write to, defaults to: 'localhost:9092'
  --kafkaTopicName <value>
        name of the kafka topic to write data to, defaults to: 'logs'
  --kinesisStreamName <value>
        name of the kinesis stream to write data to, defaults to: 'logevents'
  --kinesisShardCount <value>
        number of kinesis shards to create, defaults to: '1'
  --ipSessionCount <value>
        number of times an IP can appear in a session, defaults to: '25'
  --ipSessionLength <value>
        size of the session, defaults to: '50'
  --threadsCount <value>
        number of threads to use for write and read operations, defaults to: 1
  --threadPoolSize <value>
        size of the thread pool, defaults to: 10
  --awsAccessKey <value>
        AWS access key (required for kinesis)
  --awsSecretKey <value>
        AWS secret key (required for kinesis)
  --awsEndPoint <value>
        AWS service end point to connect to (required for kinesis)
  --loggingLevel <value>
        Logging level to set, defaults to: INFO
  --help
        prints this usage text
```
Examples:
- To generate 100000 events to `/tmp`:

  ```
  bin/generator log --totalEvents 100000 --filePath /tmp
  ```
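
  The documented `--eventsPerSec` flag can throttle the same run; a minimal sketch of a rate-limited variant (the 1000 events/sec value is an arbitrary illustration, not a recommendation):

  ```
  # same as above, but capped at roughly 1000 events per second
  bin/generator log --totalEvents 100000 --filePath /tmp --eventsPerSec 1000
  ```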
- To generate 100M events to `/tmp` and roll the file every 64 MB:

  ```
  bin/generator log --totalEvents 100000000 --filePath /tmp --fileRollSize 67108864
  ```
- To generate 100M events to `/tmp` concurrently using 5 threads:

  ```
  bin/generator log --totalEvents 100000000 --filePath /tmp --fileRollSize 67108864 --threadsCount 5
  ```
- To generate 100M events to `/tmp` concurrently using 5 threads in 'avro' format:

  ```
  bin/generator log --totalEvents 100000000 --filePath /tmp --fileRollSize 67108864 --threadsCount 5 --outputFormat avro
  ```
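
  If the generated output is a standard Avro container file, it can be spot-checked with Apache Avro's `avro-tools` jar; a hedged sketch, where the jar version and the `<generated-file>` name are placeholders for whatever exists on your machine and in `/tmp`:

  ```
  # dump the Avro records as JSON for a quick sanity check
  java -jar avro-tools-1.7.7.jar tojson /tmp/<generated-file>.avro | head
  ```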
- Writing directly to Kafka:

  ```
  bin/generator log --totalEvents 1000 --destination kafka --kafkaBrokerList "localhost:9092" --kafkaTopicName logs --threadsCount 5
  ```
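
  To verify the events landed on the topic, a console consumer can tail it; a minimal sketch, assuming Kafka's scripts are on the PATH and a 0.8-era consumer that reads via ZooKeeper on `localhost:2181`:

  ```
  # print the 'logs' topic from the beginning
  kafka-console-consumer.sh --zookeeper localhost:2181 --topic logs --from-beginning
  ```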
- Writing to Kinesis:

  ```
  bin/generator log --totalEvents 1000 --eventsPerSec 100 --flushBatch 500 --destination kinesis --kinesisStreamName logevents --kinesisShardCount 2 --awsAccessKey [ACCESS_KEY] --awsSecretKey [SECRET_KEY] --awsEndPoint [ENDPOINT_URL_KINESIS]
  ```
- Checking the Kinesis stream using the AWS command line tool:
  - Describe the stream and get the shard-id:

    ```
    aws kinesis describe-stream --stream-name logevents
    ```
  - Get the shard-iterator:

    ```
    aws kinesis get-shard-iterator --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON --stream-name logevents
    ```
  - Get the records:

    ```
    SHARD_ITERATOR=$(aws kinesis get-shard-iterator --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON --stream-name logevents --query 'ShardIterator' --output text)
    aws kinesis get-records --shard-iterator $SHARD_ITERATOR --debug
    ```
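
    Each record's `Data` field comes back base64-encoded; one way to peek at the first record's payload (assuming a GNU `base64`; on macOS use `base64 -D`):

    ```
    # fetch a batch and decode only the first record's payload
    aws kinesis get-records --shard-iterator $SHARD_ITERATOR --query 'Records[0].Data' --output text | base64 --decode
    ```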