Reading Large Files

Imagine you are examining a server log to find the number of times a particular page is accessed. You could do this by counting the number of times the page URL appears in the log:

$data = file_get_contents(WEB_SERVER_LOG);

$num_of_page_loads = substr_count($data, 'https://www.mysite.com/mypage');

If your PHP application is running on a host with restricted resources, file_get_contents(WEB_SERVER_LOG) can cause your process to run out of memory and crash, because it reads the entire file into a single string. A better way is to process the file line by line.

fopen and fgets

fopen and fgets are thin wrappers over functions from C, the language PHP itself is written in; they are low-level, but lightweight and easy to use.

fopen opens a file and returns a file pointer resource, a “handle” that other functions can use to refer to the file. The first parameter is the file path; the second is the type of access you need. 'r' is read only, starting from the beginning of the file.
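fopen returns false rather than a handle when the file cannot be opened, so it is worth checking the result before reading. Here is a minimal sketch of one way to do that; the helper name open_for_reading and the example path are hypothetical, not part of PHP or of the code in this post.

```php
<?php
// Hypothetical helper: opens a file for reading, or throws if that fails.
function open_for_reading(string $path)
{
    // fopen returns false and emits a warning on failure; the @ operator
    // suppresses the warning so the failure is reported via the exception.
    $handle = @fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path for reading");
    }
    return $handle;
}

// Usage: this path is assumed not to exist, so the exception is caught.
try {
    $handle = open_for_reading(__DIR__ . '/no-such-file.log');
    fclose($handle);
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

Throwing an exception is only one option; returning early or logging the failure would work just as well, depending on how your application handles errors.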

fgets reads the next line from a “handle” and returns it, including its trailing newline, or false if there are no more lines.

fclose tidies up by closing the “handle”.

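$file_handle = fopen(WEB_SERVER_LOG, 'r');

$num_of_page_loads = 0;
// Compare against false explicitly: a final line containing only "0" is falsy
// and would otherwise end the loop before the end of the file.
while (($line = fgets($file_handle)) !== false) {
    // Count each line that mentions the page URL.
    if (strpos($line, PAGE_URL) !== false) {
        $num_of_page_loads++;
    }
}

fclose($file_handle);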

This code holds only one line of the file in memory at a time. Using these low-level functions is an excellent way to ensure your code handles large inputs without using too much memory.
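If you need the same line-by-line pattern in more than one place, one possible approach is to wrap it in a generator. The sketch below assumes PHP 7 or later; the function names read_lines and count_matching_lines are hypothetical, and the temporary file simply stands in for a real server log.

```php
<?php
// Hypothetical helper: yields one line at a time, so only the current
// line is held in memory.
function read_lines(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield $line;
        }
    } finally {
        // Runs when the loop finishes or the generator is destroyed early.
        fclose($handle);
    }
}

// Hypothetical helper: counts the lines that contain $needle.
function count_matching_lines(string $path, string $needle): int
{
    $count = 0;
    foreach (read_lines($path) as $line) {
        if (strpos($line, $needle) !== false) {
            $count++;
        }
    }
    return $count;
}

// Usage with a small temporary file standing in for a real log.
$tmp = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($tmp, "GET /mypage\nGET /other\nGET /mypage\n");
echo count_matching_lines($tmp, '/mypage'), "\n"; // prints 2
unlink($tmp);
```

The generator keeps the fopen, fgets and fclose calls in one place, while the calling code stays a plain foreach loop.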
