Getting raw log files and deciphering them

I will be pulling log files from a remote server and I need some help in deciphering them. Basically I will use PHP to parse the file and extract bandwidth used by a client.

The log files come off of Limelight and I can tell who my clients are based on the folder structure.

So let's say I want to determine the bandwidth used for a particular file for a particular client. I know I could isolate the client and the file based on the GET portion of the log, but how do I know how much bandwidth has been used? Can I get this information from the log files?

Here is a small sample of a couple of items.

Code:
202.108.250.253 - - [12/Apr/2006:04:00:11 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 413 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
202.108.250.253 - - [12/Apr/2006:04:14:42 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 284 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
202.108.250.253 - - [12/Apr/2006:04:28:20 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 8469 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
220.181.18.7 - - [12/Apr/2006:12:32:48 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 8469 "http://220.181.27.54/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
201.129.85.211 - - [12/Apr/2006:07:18:44 -0700] "GET http://podcastpub.dl.llnw.net/8/teen_options_1.mp3 HTTP/1.0" 200 565904 "-" "iTunes/6.0.2 (Macintosh; N; PPC)"

I think I would find the answer here: "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 413 but I am not sure.
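For reference, the sample lines look like the Apache "combined" log format. In that format the number after the request string is the HTTP status (200 for a full response, 206 for a partial one) and the next number is the size of the response in bytes. Whether Limelight counts bytes exactly the way Apache does is an assumption worth checking against their documentation. A minimal sketch that splits one line into fields (the function name `parseLogLine` is only illustrative):

```php
<?php
// Sketch: split one log line into fields, ASSUMING the Apache "combined" layout.
function parseLogLine(string $line): ?array
{
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"/';
    if (!preg_match($pattern, $line, $m)) {
        return null; // line does not match the assumed layout
    }
    return [
        'ip'      => $m[1],
        'time'    => $m[2],
        'method'  => $m[3],
        'url'     => $m[4],
        'status'  => (int) $m[5],
        // "-" means no body was sent; count it as zero bytes
        'bytes'   => $m[6] === '-' ? 0 : (int) $m[6],
        'referer' => $m[7],
        'agent'   => $m[8],
    ];
}

// First sample line from the post above
$sample = '202.108.250.253 - - [12/Apr/2006:04:00:11 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 413 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"';
$entry = parseLogLine($sample);
echo $entry['status'], ' ', $entry['bytes'], "\n"; // prints "206 413"
```

A 206 response carries only the requested byte range, so adding up the byte field across all matching lines should approximate the bandwidth used for that file.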

I will continue doing research, but I would be grateful for any pointers.

Since I have to pull them from their FTP server, would a cron job be the best way to do this? Can a cron job handle the download, and can it call web pages? I have never used cron jobs, as you can tell.
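On the cron question: cron can run any command on a schedule, so it can start a PHP script from the command line (or request a web page through a tool such as curl or wget). A rough sketch of a fetch script using PHP's FTP extension follows. The host, credentials, directories, the crontab paths and the names `fetchLogs` and `isLogArchive` are all placeholders, not anything Limelight specifies.

```php
<?php
// Hypothetical crontab entry (paths are placeholders), runs daily at 05:15:
//   15 5 * * * /usr/bin/php /path/to/fetch_logs.php

// Keep only names that look like gzip-compressed logs.
function isLogArchive(string $name): bool
{
    return substr($name, -3) === '.gz';
}

// Download every .gz file in $remoteDir that is not yet in $localDir.
// Returns the list of local paths that were fetched.
function fetchLogs(string $host, string $user, string $pass, string $remoteDir, string $localDir): array
{
    $ftp = ftp_connect($host);
    if ($ftp === false || !ftp_login($ftp, $user, $pass)) {
        throw new RuntimeException("FTP connection or login failed for $host");
    }
    ftp_pasv($ftp, true); // passive mode tends to work better behind firewalls
    if (!ftp_chdir($ftp, $remoteDir)) {
        ftp_close($ftp);
        throw new RuntimeException("Cannot change to $remoteDir");
    }

    $fetched = [];
    $names = ftp_nlist($ftp, '.') ?: [];
    foreach ($names as $name) {
        $name  = basename($name); // some servers return "./file"
        $local = $localDir . '/' . $name;
        if (!isLogArchive($name) || file_exists($local)) {
            continue; // skip non-logs and files already downloaded
        }
        if (ftp_get($ftp, $local, $name, FTP_BINARY)) {
            $fetched[] = $local;
        }
    }
    ftp_close($ftp);
    return $fetched;
}

// Example call (placeholder values, left commented out):
// fetchLogs('ftp.example.com', 'user', 'secret', '/logs', '/var/data/logs');
```

Skipping files that already exist locally keeps the job safe to re-run if one night's download fails part-way.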

Also, the log files are in compressed .gz format, so I will have to uncompress them before I can even read them.
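If PHP has the zlib extension enabled, the .gz files can be read directly without unpacking them to disk first. A small sketch, where `readGzLines` is just an illustrative name and the demo file is created on the fly:

```php
<?php
// Yield the lines of a .gz file one at a time, decompressing as it reads.
function readGzLines(string $path): Generator
{
    $handle = gzopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = gzgets($handle)) !== false) {
            yield rtrim($line, "\r\n"); // drop the trailing newline
        }
    } finally {
        gzclose($handle);
    }
}

// Demo with a throwaway compressed file
$tmp = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($tmp, gzencode("first line\nsecond line\n"));
foreach (readGzLines($tmp) as $line) {
    echo $line, "\n";
}
unlink($tmp);
```

Reading line by line keeps memory use flat even when a day's log is large.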

Thanks again!
 
What about using something like an AWStats customer install to do the work for you, instead of re-creating something in PHP?
 
Stephen said:
What about using something like an AWStats customer install to do the work for you, instead of re-creating something in PHP?

These stats would be based on the logs downloaded from Limelight, and those logs have all customers intermingled. I will have to go through them and pull out each client and each audio file. I believe it will need to be custom, and it must also fit within the control panel I am building for clients to examine their statistics.
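As a rough illustration of pulling each client and file out of intermingled lines, the sketch below totals the byte field per folder and file name. It assumes the first path segment of the URL (the "10" or "8" in the samples) identifies the client, and that the number after the status code is the bytes sent. Both are assumptions based on the sample lines, and `bandwidthByClient` is a made-up name.

```php
<?php
// Sum bytes per client folder and file from intermingled log lines.
// ASSUMES: first URL path segment = client, number after status = bytes sent.
function bandwidthByClient(iterable $lines): array
{
    $totals = [];
    foreach ($lines as $line) {
        // Capture the requested URL and the byte count that follows the status code.
        if (!preg_match('/"\S+ (\S+) [^"]*" \d{3} (\d+)/', $line, $m)) {
            continue; // unrecognised line or no byte count
        }
        $path  = (string) parse_url($m[1], PHP_URL_PATH);
        $parts = explode('/', ltrim($path, '/'), 2);
        if (count($parts) < 2) {
            continue; // no client folder in the path
        }
        [$client, $file] = $parts;
        $totals[$client][$file] = ($totals[$client][$file] ?? 0) + (int) $m[2];
    }
    return $totals;
}

// Three of the sample lines from the first post
$lines = [
    '202.108.250.253 - - [12/Apr/2006:04:00:11 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 413 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"',
    '202.108.250.253 - - [12/Apr/2006:04:14:42 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 284 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"',
    '201.129.85.211 - - [12/Apr/2006:07:18:44 -0700] "GET http://podcastpub.dl.llnw.net/8/teen_options_1.mp3 HTTP/1.0" 200 565904 "-" "iTunes/6.0.2 (Macintosh; N; PPC)"',
];
$totals = bandwidthByClient($lines);
echo $totals['10']['blog_kits_1.mp3'], "\n"; // prints 697 (413 + 284)
echo $totals['8']['teen_options_1.mp3'], "\n"; // prints 565904
```

The nested array (client, then file, then bytes) could be written to a database table for the control panel to query. The function accepts any array or iterator of lines, so it does not care how the log was read.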

Thanks
 