Ashrith Mekala edited this page Feb 11, 2014 · 1 revision

Mocks log events generated by an Apache web server. The generated log events have the following format:

95.22.50.11 - - [09/Sep/2013:16:36:44 -0700] "GET /test.php HTTP/1.1" 200 1832 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1"

Schema:

| column_header | type | logger_mapping | description |
| --- | --- | --- | --- |
| originatingIp | string | %h | source of the request (client IP) |
| clientIdentity | string | %l (-) | RFC 1413 identity of the client, determined by identd on the client's machine |
| userId | string | %u (-) | user ID of the person requesting the document, as determined by HTTP authentication |
| timeStamp | string | %t | time at which the request was received, format: [day/month/year:hour:minute:second zone] |
| requestType | string | "%r" | HTTP method used by the client (GET in the example) |
| requestPage | string | "%r" | resource requested by the client (/test.php in the example) |
| httpProtocolVersion | string | "%r" | protocol used by the client (HTTP/1.1 in the example) |
| responseCode | int | %>s | status code that the server sends back to the client |
| responseSize | int | %b | size of the object returned to the client, not including the response headers |
| referrer | string | "%{Referer}i" ("-") | site that the client reports having been referred from |
| userAgent | string | "%{User-agent}i" | User-Agent HTTP request header, information that the client browser reports about itself |
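
To illustrate how the schema maps onto a log line, here is a minimal Python sketch that splits one event into the columns listed above. It is not part of this project: the names `LOG_PATTERN`, `parse_log_event`, and `parse_timestamp` are hypothetical, the regular expression assumes the three parts of "%r" are separated by single spaces, and it stores timeStamp without the surrounding brackets.

```python
import re
from datetime import datetime

# Hypothetical pattern for lines shaped like:
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
# Group names follow the column_header values of the schema.
LOG_PATTERN = re.compile(
    r'^(?P<originatingIp>\S+) (?P<clientIdentity>\S+) (?P<userId>\S+) '
    r'\[(?P<timeStamp>[^\]]+)\] '
    r'"(?P<requestType>\S+) (?P<requestPage>\S+) (?P<httpProtocolVersion>[^"]+)" '
    r'(?P<responseCode>\d{3}) (?P<responseSize>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<userAgent>[^"]*)"$'
)


def parse_log_event(line):
    """Return a dict keyed by schema column, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # responseCode and responseSize are typed int in the schema.
    event["responseCode"] = int(event["responseCode"])
    # Apache's %b writes "-" when no bytes are sent; mapping that to 0
    # is an assumption of this sketch.
    size = event["responseSize"]
    event["responseSize"] = 0 if size == "-" else int(size)
    return event


def parse_timestamp(value):
    """Convert a bracket-free timeStamp such as '09/Sep/2013:16:36:44 -0700'."""
    # %b expects English month abbreviations under the default C locale.
    return datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")
```

Applied to the sample event shown earlier, `parse_log_event` yields `requestType` "GET", `requestPage` "/test.php", `responseCode` 200 and `responseSize` 1832. The timeStamp column stays a string, as in the schema; `parse_timestamp` is only a convenience for callers that need a timezone-aware `datetime`.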

If the data format is Avro, the Avro schemas are available in src/main/resources/avro_schemas
