dan lasota's masters in education portfolio for online innovation and design


What Does an Apache Web Log Look Like?

3 July 2012 ed431 onid

A typical Apache web server log looks like many, many lines of text. Each line contains information about a request and the result of the web server trying to fulfill that request. I took this from my web server and asked for the last 50 requests. The first entry is the IP address of the machine or device making the request. 137.229.whatever are University of Alaska addresses. Then there is the date/time stamp, followed by the actual HTTP request, usually a GET but sometimes a POST. In most cases the request includes the path to an image, article, or other resource on the web server. The last two numbers are the result code (200 is OK, 404 is not found, etc.) and the size of the resource in bytes.
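To make the field layout concrete, here is a minimal Python sketch that splits one log line into the pieces described above. It assumes the server writes Apache's Common Log Format; the sample line, its IP address (from a range reserved for documentation), and the path are made up for illustration and are not taken from my server.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
#   host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Example time stamp: 03/Jul/2012:14:05:32 -0800
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # Apache writes "-" when no bytes were sent in the response body.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Hypothetical sample line, not a real request.
sample = (
    '192.0.2.10 - - [03/Jul/2012:14:05:32 -0800] '
    '"GET /images/logo.png HTTP/1.1" 200 5120'
)
entry = parse_line(sample)
print(entry["host"], entry["request"], entry["status"], entry["size"])
```

Notice that nothing in the parsed entry identifies a person: there is an address, a time, a request, and two numbers.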

I posted this because of the discussion on digital footprints in ED 431. This is what my server, which runs Apache, tracks. The majority of web servers in the world track similar information. Most web servers, mine included, do not keep these logs forever. Mine is set to delete logs older than 30 days. Some servers keep them around for longer.
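The 30-day limit can be pictured with a short Python sketch. This is not how my server actually does it (log rotation is normally handled by a system tool); the function name, the directory, and the use of file modification time are all assumptions made for illustration. It only lists the files that would be expired and deletes nothing.

```python
import os
import time

SECONDS_PER_DAY = 86400


def find_expired_logs(log_dir, max_age_days=30, now=None):
    """List files in log_dir last modified more than max_age_days ago.

    Hypothetical helper: it reports candidates and leaves deletion to the caller.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    expired = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        # Skip subdirectories; compare each file's modification time to the cutoff.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            expired.append(path)
    return expired


# Example use (the directory name is an assumption):
#   for path in find_expired_logs("/var/log/apache2"):
#       os.remove(path)
```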

This is by no means the only part of one's digital footprint. The reason I am calling attention to it, however, is that it's an example of an ephemeral piece of data. Two things to mention about it: One, there is no personal information associated with it. My server only knows the IP address, the Internet address, of the machine asking for the page. Two, this information is only kept on my server for about 30 days. By today's standards this is relatively low tech. Most websites are able to capture a lot more information, which generally serves the purpose of letting the people running the web site know if they are serving visitors well. Web site authors and administrators can determine the kind of browser used, the size of the visitor's display window in pixels, the user's operating system, and a few more tidbits. This is all done with JavaScript, if the user allows it. A lot of web admin people use Google Analytics, which helps determine who is visiting a web site and what they do once they are there.

Some web sites do track a lot more. But not every web site tracks or is even interested in who you are. On many, only the most basic information is kept, and only for a short while. This would fall into Allison's description of footprints in the sand rather than those cast in concrete. But Allison does make a good point about the growing lack of privacy and the blurred lines between what some companies think is user experience and what becomes profit motive. We must pay attention to who tracks what. By understanding it, we can choose to opt in or out as the case may be, or abandon or refuse to participate in some social networks.

2 thoughts on “What Does an Apache Web Log Look Like?”

  1. Skip Via says:

    Very interesting. I haven’t looked at an Apache log in years.

    As you mentioned, the extent of what is gathered and how long it’s kept is a function of the intention of the web managers behind a site. Unless you leave information such as your name or phone number, it obviously can’t be retained. However, through data mining techniques, it’s sometimes possible to associate those data with a log that didn’t record them in the first place. You’re right, though: even Google doesn’t attach names to its data. They’re only interested in broad patterns and such.

    If you are using a Google domain (private or not) you can use Google Analytics to track users on your site. The blog that we maintain for Larry Mitchell’s fifth graders uses analytics and it’s a source of great pride for the students when they get hits from Switzerland or Australia.

    • dan.lasota says:

      Google Analytics is a very powerful tool, indeed. I used it once to determine the optimum width for web pages. After looking at the human visitors to the site (not search engines), I determined that 98% of the people had a 1024 pixel wide screen. I was able to tailor the width of the pages to the majority of the visitors. I think most web admins and designers are after information like this.

      Another big reason to use Google Analytics is to see how visitors arrive at your site. You might see that one or two web sites act as pointers to your site, or that many people arrive at your doorstep searching for something in particular. It’s good to have a page that exactly addresses the content they are looking for.

      If folks are interested I can set up some visitor accounts into one of my analytics websites so you can see what it’s like.
