Logs are a very broad concept in computer systems: almost any program can emit them, from operating system kernels to application servers. Their content, size, and purpose vary so widely that it is hard to generalize.
The logs discussed in this article are Web logs only. There is no precise definition; the term may include, but is not limited to, the user access logs produced by front-end Web servers such as Apache, lighttpd, and Tomcat, as well as the logs that Web applications write themselves.
In a Web log, each entry typically represents one access by a user. For example, here is a typical Apache log entry:
18.104.22.168 - - [18/Mar/2005:12:21:42 +0800] "GET / HTTP/1.1" 200 899 "http://s.baidu.com/" "Mozilla/4.0 (compatible; MSIE 6; Windows NT 5.1; Maxthon)"
From this entry we can extract a lot of useful information: the visitor's IP address, the access time, the requested page, the referring address, and the client's User-Agent string. Information beyond that has to be obtained by other means. For example, to get the user's screen resolution you generally need JS code that sends a separate request; and to get application-specific details, such as the headline of the news article a user read, the Web application has to output them from its own code.
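As a rough illustration of how those fields can be pulled out of an entry, here is a minimal Python sketch. It assumes the entry follows Apache's "combined" log format; the sample line is a cleaned-up version of the entry above, and the regular expression, the field names, and the `parse_line` helper are illustrative choices, not part of any standard tool.

```python
import re
from datetime import datetime

# Assumption: Apache "combined" format, i.e.
# ip ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Month abbreviations such as "Mar" are parsed with the default (English) locale.
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    # Apache writes "-" when no response body was sent.
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec


sample = (
    '18.104.22.168 - - [18/Mar/2005:12:21:42 +0800] "GET / HTTP/1.1" 200 899 '
    '"http://s.baidu.com/" "Mozilla/4.0 (compatible; MSIE 6; Windows NT 5.1; Maxthon)"'
)
record = parse_line(sample)
```

Lines that do not match the assumed format come back as `None`, so a caller can count or skip them rather than crash on a malformed entry.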
Why analyze logs
There is no doubt that Web logs contain a great deal of information that people, mainly product analysts, are interested in. At the simplest level, we can obtain the PV (PageView, the number of page visits) of each kind of page on the site and the number of unique IPs (the count of IP addresses after deduplication). Slightly more complex, we can compute rankings of the keywords users search for, or the pages on which users stay longest. More complex still, we can build models of ad click-through or analyze the characteristics of user behavior.
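The two simplest statistics, PV per page and the unique IP count, can be sketched in a few lines of Python. This assumes each log entry has already been reduced to an (IP, path) pair; the function name and the sample data are hypothetical.

```python
from collections import Counter


def page_views_and_unique_ips(hits):
    """Count PV per path and the number of distinct IPs.

    `hits` is an iterable of (ip, path) pairs, one pair per log entry.
    """
    pv = Counter()
    ips = set()
    for ip, path in hits:
        pv[path] += 1   # every entry counts as one page view
        ips.add(ip)     # the set deduplicates repeated IPs
    return pv, len(ips)


# Hypothetical data, for illustration only.
hits = [
    ("10.0.0.1", "/"),
    ("10.0.0.2", "/"),
    ("10.0.0.1", "/news/1.html"),
    ("10.0.0.1", "/"),
]
pv, unique_ips = page_views_and_unique_ips(hits)
```

Holding every IP in an in-memory set is fine for a small site; for very large logs the deduplication step is usually where a different approach becomes necessary.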
Because this data is so useful, there are of course countless ready-made tools for analyzing it. AWStats and Webalizer, for example, are free programs dedicated to statistical analysis of Web server logs.
There is another class of products that do not analyze logs directly. Instead, they have users embed JS code in their pages and collect the statistics themselves; put another way, the log is in effect written straight to their servers. Typical representative product: big >