-
Notifications
You must be signed in to change notification settings - Fork 0
jnlin/mod_uid
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> <html> <head> <title>mod_uid.c, version 1.0</title> </head> <body bgcolor="#ffffff"> <h1 align=center >mod_uid.c version 1.0</h1> <center>a module issuing the "correct" cookies for counting the site visitors<p></center> <h3>Contents</h3> <ol> <li><a href="#1">Copyright</a> <li><a href="#2">Purpose</a> <li><a href="#3">Installation (Apache 1.x)</a> <li><a href="#3a">Installation (Apache 2.0.x)</a> <li><a href="#4">Configuration</a> <li><a href="#format">Cookie format</a> <li><a href="#5">What can be written to the log</a> <li><a href="#6">Why not mod_usertrack</a> <li><a href="#7">TODO</a> </ol> <a name="1"></a> <h3>Copyright</h3> Copyright (C) 2000-2002 Alex Tutubalin, [email protected] <p> May be distributed and used in derived products under the conditions analogous to the <a href="http://www.apache.org/LICENSE.txt">Apache License</a>: the author's copyright and the reference to <a href="http://www.lexa.ru/lexa/">http://www.lexa.ru/lexa</a> must be preserved, and the derived product should not be called <b>mod_uid</b>.<p> A prototype of this module was written by the author when he was working at <a href="http://www.rambler.ru">Rambler Co.</a>; the present version has been significantly modified.<p> The author is grateful to <b>Dmitry Khrustalev</b> for valuable advice.<p> <a name="2"></a> <h3>Description</h3> The standard distribution of Apache does not provide adequate means for user tracking (for problems associated with <b>mod_usertrack</b>, see <a href="#6">below</a>), and this module provides them. <p> What it actually does: <ul> <li> if the user has provided the cookie header with the correct cookie-name, the module writes this cookie in notes with the name <tt>uid_got</tt> (accordingly, then it may be written to the log); <li> if the user has arrived without the required cookie, the module issues the SetCookie header for him/her and writes the cookie thus issued in notes with the name <tt>uid_set</tt> (and this may also be written to the log); <li> if built-in P3P support is included, the P3P header is also issued as the Set-Cookie header is issued. </ul> <p> Advantages: <ul> <li>the cookie contains the date it is issued and the "service number" (that is, the number specified during configuring); thus, it helps one understand when the user first arrived at our site and where exactly he/she arrived; <li>multiserver work is supported: under accurate configuring (or its total absence ;), it is guaranteed that the cookie issued to the user will be unique; <li>the cookie issued to the user and the one received from him/her are not mingled in the log file; <li>the cookies are 128 bit long, and one may work with them in the log analyzer (quick search etc.) using ready source code intended for working with IPv6 (for example, <TT>libpatricia</TT>); <li>support of P3P (minimal) is provided. </ul> <a name="3"></a> <h3>Installation (Apache 1.x)</h3> While configuring Apache, add the following to ./configure parameters: --add-module=/path/to/mod_uid.c: <pre> tar xzvf apache_1.3xxx tar xzvf mod_uid-1.0.xx.tar.gz cd apache_1.3xx ./configure --prefix=/usr/local/apache \ ... --add-module=../mod_uid_1.0.xx/mod_uid.c other-params make make install </pre> <a name="3a"></a> <h3>Installation (Apache 2.0)</h3> You should use mod_uid2.c with Apache 2.0.x<br> Use the <b>apxs</b> program for installation: <pre> tar xzvf mod_uid-1.xx.tar.gz cd mod_uid-1.xx /usr/local/apach/bin/apxs -i -c -a mod_uid2.c </pre> This command will compile (-c), install (-i) and activate (-a) mod_uid2 module. <a name="4"></a> <h3>Configuration Directives</h3> All the configuration directives may be specified wherever desired: Server/VirtualServer/Location/... To specify them in <tt>.htaccess</tt>, one should allow AllowOverride FileInfo (or All).<p> <dl> <dt><b>UIDActive</b> On/Off</dt> <dd>Cookie issue turned on/off.<br> If set to "off", the cookies received from the client are decoded all the same and may be written to the log.<br> <i>Default:</i> On </dd> <dt><p></dt> <dt><b>UIDCookieName</b> string</dt> <dd>Cookie name (<i>default</i> - uid).<br> The name of the cookie issued to the client. Should not match any other name(s) used at the site. </dd> <dt><p></dt> <dt><b>UIDService</b> number</dt> <dd>The "service number" is a strictly positive (nonzero) unique number identifying the given server in the cluster or the given document or document set.<br> This number is used for two purposes: <ol> <li>If several servers are used within one domain (with the same cookie parameter <tt>domain=</tt>) or with one hostname, then the use of different <b>UIDService</b> numbers guarantees that the cookies issued by different servers will be unique. <li>The use of different <b>UIDService</b> numbers for different parts of the server makes it possible to reveal (by log analysis) which of the parts was first visited by the client. </ol> <i>Default:</i> server IP address. </dd> <dt><p></dt> <dt><b>UIDDomain</b> .domain.name</dt> <dd> Name of the domain for which the cookie is issued<br> In multiserver configurations, this directive makes it possible to have a common cookie namespace for all the servers (for example, mail.rambler.ru, www.rambler.ru, and info.rambler.ru use the .rambler.ru domain)<br> If <tt>domain=</tt> has to be set to "off" for a certain document set but stay "on" for the server as a whole, one should use <code>UIDDomain none</code> in the corresponding config section (Location/Directory/...).<br> <i>Default:</i> no domain; that is, the user's browser will return the cookie only to the originating server. </dd> <dt><p></dt> <dt><b>UIDPath</b> string</dt> <dd>The path for which the cookie is issued (parameter <tt>path=</tt> in Set-Cookie:)<br> <i>Default:</i> / </dd> <dt><p></dt> <dt><b>UIDExpires</b> number</dt> <dd> Sets the expiration date for the cookie.<br> <code>UIDExpires number</code> - <tt>number</tt> of seconds to be added to the current time.<br> <code>UIDExpires plus 3 year 4 month 2 day 1 hour 15 minutes</code> - the same expressed in normal human language. <br> <i>Default:</i> current date plus 10 years. </dd> <dt><p></dt> <dt><b>UIDP3P</b> On/Off/Always</dt> <dd>Controls if the P3P header is issued together with the cookie.<br> Variants: <ul> <li> Off - P3P header is not issued; <li> On - issued only if the <tt>domain</tt> parameter is issued for the cookie; <li> Always - always issued (i.e. even without <tt>domain</tt>). </ul> <i>Default:</i> Off.<br> This directive is required for satisfying MS IE6+ in the multiserver configuration and, for example, for including the "counter" code from another server in the page. In case the cookie is issued without <tt>domain=</tt> or <tt>domain</tt> includes the current server name for the main document, MS IE6+ with default settings will be satisfied all the same; however, the cookies may be suppressed for compound documents collected from different servers.<br> <b>mod_uid</b> issues only the P3P header (by default, only with compact policy); support of /w3c/p3p.xml and the like is up to the owner of the server.<br> The P3P header is issued only if <b>mod_uid</b> issues the Set-Cookie header; that is, if you have to issue other cookies as well and also need P3P for them, the problem of P3P issuing should be solved separately and independently. </dd> <dt><p></dt> <dt><b>UIDP3PString</b> string</dt> <dd>Text of the P3P header sent to the client.<br> <i>Default:</i> CP="NOI PSA OUR BUS UNI" </dd> </dl> <a name="format"></a> <h3>Cookie Format</h3> The cookie format in the binary form is <tt>unsigned int cookie[4]</tt>, where<br> <ul> cookie[0] is the "service number" (specified via <b>UIDService</b>);<br> cookie[1] is the issue time (unix time);<br> cookie[2] is the <b>pid</b> of the process that issued the cookie;<br> cookie[3] contains a unique sequencer within the limits of the process (upper 24 bits, starting value 0x030303) and<br> the cookie version number (lower 8 bits, now equal to 2).<br> </ul> These 128 bits are converted with respect for the network byte order, encoded (base64) and sent to the client. (In ver. 1, everything was sent in the host order, and support of server clusters with different architectures was thus complicated.)<br> <h4>Uniqueness</h4> Evidently, only insurance can fully guarantee anything. ;) And if more than 2^128 cookies are issued within a single domain, some of them will be duplicate. However, the cookie format was developed in such a way that the cookies must be unique if their number is reasonable. <ol> <li> If the "service number" is unique (each server has its own) within the given domain, different servers will surely issue different cookies. <li> Inclusion of the issue time and <b>pid</b> in the cookie implies that <b>pid</b>s of different processes are not duplicated during one second. This is true for all UNIX systems I know: <b>pid</b>s monotonically increase up to a certain maximum (2^16 or higher). That is, cookie[1]/cookie[2] may be duplicated within one server if more than 2^16 <b>fork()</b> is done per second, which is hardly possible in the present state of matters. <li> The sequencer (the upper 24 bits in cookie[3]) enables one to verify the uniqueness of the cookie within one process during one second. The capacity of the sequencer makes it possible to issue up to 1.0E+07 cookies per second by one process. </ol> <a name="5"></a> <h3>What can be written to the log</h3> <b>mod_uid</b> writes one of the following two values to "notes": <ol> <li>if a cookie was received from the client, it is placed in note <tt>uid_got</tt>; <li>if a cookie was sent to the client, it is placed in note <tt>uid_set</tt>. </ol> Cookies are logged as four 32-bit hexadecimal numbers in the host order (in ver. 2, a network-host conversion is performed; in ver. 1, everything is saved "as is" under the assumption that the server architecture did not change since the cookie had been issued). In <b>LogFormat</b>, these notes may be used in the form of \"%{uid_got}n\" and \"%{uid_set}n\", respectively.<br> Using LogFormat of the type <pre> LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{uid_got}n\" \"%{uid_set}n combined_cookie </pre> we'll have approximately this kind of log entries: <pre> Cookie sent to the client: 62.104.212.93 - - [05/Jan/2002:00:02:06 +0300] "GET / HTTP/1.0" 200 13487 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)" "-" "ruid=000000013C36184E00009A2100002901" Cookie received from the client: 216.136.145.172 - - [05/Jan/2002:00:14:59 +0300] "GET /buttons/but-support-e.gif HTTP/1.0" 200 252 "http://apache.lexa.ru/english/meta-http-eng.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" "ruid=000000013C361B5000009A0100009501" "-" </pre> Such a format is easily understood by widespread log analyzers, including <b>Webtrends</b>, which nicely counts visitors according to such a log. <a name="6"></a> <h3> Why not <b>mod_usertrack</b> from the Apache distribution? </h3> Because it has several drawbacks: <ul> <li> it does not strictly guarantee that the same cookie will not be issued to two users, although, of course, the probability of such an event is minimized due to consideration of <tt>getpid()</tt>, <tt>remote_ip</tt>, and time up to milliseconds; <li> it does not support multiserver work, and the probability of issuing identical cookies increases in this case; <li> one might wish to see the cookie sent to the user also in the log, and see it separately, whereas <b>mod_usertrack</b> mingles them; <li> one might wish to see the "service number" (see above) in order to understand which of our services was visited by the user during his first visit. </ul> <a name="7"></a> <h3>TODO</h3> <ol> <li> Support of various formats (Netscape/Cookie/Cookie2, as in <b>mod_usertrack</b>), but only if it becomes really necessary - and so far I haven't noticed any such necessity. <li> There is a vague suspicion that the sequencer increment should be surrounded with mutexes at multithread-apache and multiprocessor computers. </ol> <!-- $Id: README.html,v 1.6 2002/04/20 16:56:19 lexa Exp $ --> </body> </html>
About
Apache module issuing the "correct" cookies for counting the site visitors
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published