[ACCEPTED]-Reading large files from end-file-io

Accepted answer
Score: 18

You can use fopen and fseek to navigate backwards through the file from the end. For example:

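$fp = fopen($file, "r");
$pos = -2; // skip the final byte, which is normally the trailing "\n"
// Step back one byte at a time until we hit the newline that ends the previous line.
while (($ok = (fseek($fp, $pos, SEEK_END) === 0)) && fgetc($fp) !== "\n") {
    $pos = $pos - 1;
}
if (!$ok) {
    rewind($fp); // reached the start of the file: it holds a single line
}
$lastline = fgets($fp);
fclose($fp);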
Score: 7

It depends how you interpret "can".

If you wonder whether you can do this directly (with a PHP function) without reading all the preceding lines, then the answer is: No, you cannot.

A line ending is an interpretation of the data, and you can only know where the line endings are if you actually read the data.

If it is a really big file, I'd not do that though. It would be better to scan the file starting from the end, gradually reading blocks from the end toward the start of the file.

Update

Here's a PHP-only way to read the last n lines of a file without reading through all of it:

function last_lines($path, $line_count, $block_size = 512){
    $lines = array();

    // we will always have a fragment of a non-complete line
    // keep this in here till we have our next entire line.
    $leftover = "";

    $fh = fopen($path, 'r');
    // go to the end of the file
    fseek($fh, 0, SEEK_END);
    do{
        // need to know whether we can actually go back
        // $block_size bytes
        $can_read = $block_size;
        if(ftell($fh) < $block_size){
            $can_read = ftell($fh);
        }

        // go back as many bytes as we can
        // read them to $data and then move the file pointer
        // back to where we were.
        fseek($fh, -$can_read, SEEK_CUR);
        $data = fread($fh, $can_read);
        $data .= $leftover;
        fseek($fh, -$can_read, SEEK_CUR);

        // split lines by \n. Then reverse them,
        // now the last line is most likely not a complete
        // line which is why we do not directly add it, but
        // append it to the data read the next time.
        $split_data = array_reverse(explode("\n", $data));
        $new_lines = array_slice($split_data, 0, -1);
        $lines = array_merge($lines, $new_lines);
        $leftover = $split_data[count($split_data) - 1];
    }
    while(count($lines) < $line_count && ftell($fh) != 0);
    if(ftell($fh) == 0){
        $lines[] = $leftover;
    }
    fclose($fh);
    // Usually, we will read too many lines, correct that here.
    return array_slice($lines, 0, $line_count);
}
Score: 7

It's not pure PHP, but the common solution is to use the tac command, which is the reverse of cat and outputs the file's lines in reverse order. Use exec() or passthru() to run it on the server and then read the results. Example usage:

<?php
$myfile = 'myfile.txt';
// escapeshellarg() guards against spaces and shell metacharacters in the file name
$command = "tac " . escapeshellarg($myfile) . " > /tmp/myfilereversed.txt";
exec($command);
$currentRow = 0;
$numRows = 20;  // stops after this number of rows
$handle = fopen("/tmp/myfilereversed.txt", "r");
while ($currentRow < $numRows && ($buffer = fgets($handle, 4096)) !== false) {
   $currentRow++;
   echo $buffer."<br>";
}
fclose($handle);
?>
Score: 6

The following snippet worked for me.

$file = popen("tac " . escapeshellarg($filename), 'r');

while (($line = fgets($file)) !== false) {
   echo $line;
}

pclose($file);

Reference: http://laughingmeme.org/2008/02/28/reading-a-file-backwards-in-php/

Score: 3

If your code is not working and is reporting an error, you should include the error in your posts!

The reason you are getting an error is that you are trying to store the entire contents of the file in PHP's memory space.

The most efficient way to solve the problem would be, as Greenisha suggests, to seek to the end of the file and then go back a bit. But Greenisha's mechanism for going back a bit is not very efficient.

Consider instead the method for getting the last few lines from a stream (i.e. where you can't seek):

$i1 = 0;
$content = array();
while (($buffer = fgets($handle, 4096)) !== false) {
    $i1++;
    $content[$i1] = $buffer;
    // drop the line that has just fallen out of the window
    unset($content[$i1 - $lines_to_keep]);
}

So if you know that your max line length is 4096, then you would:

if (4096 * $lines_to_keep < filesize($input_file)) {
   fseek($handle, -4096 * $lines_to_keep, SEEK_END);
}

Then apply the loop I described previously.
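
Putting the two pieces together, a minimal sketch might look like the following. The function name tail_by_max_line_length and its parameters are made up for illustration, and it assumes no line (including its newline) is longer than $maxLineLength bytes; a longer line near the seek point could come back truncated.

```php
<?php
// Hypothetical helper; the name and parameters are illustrative only.
function tail_by_max_line_length(string $path, int $linesToKeep, int $maxLineLength = 4096): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return [];
    }

    // Seek back only when the file is larger than the window we need.
    $window = $maxLineLength * $linesToKeep;
    if ($window < filesize($path)) {
        fseek($handle, -$window, SEEK_END);
    }

    // Rolling buffer: keep at most $linesToKeep of the most recent lines.
    $content = [];
    $i = 0;
    while (($buffer = fgets($handle)) !== false) {
        $i++;
        $content[$i] = $buffer;
        unset($content[$i - $linesToKeep]); // no-op until the buffer is full
    }
    fclose($handle);

    // Re-index from 0; each line keeps its trailing newline.
    return array_values($content);
}
```

The first line read after the seek is usually a fragment, but the rolling buffer discards it as long as the window really does contain at least $linesToKeep complete lines.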

Since C has some more efficient methods for dealing with byte streams, the fastest solution (on a POSIX/Unix/Linux/BSD system) would be simply:

// tail prints the last N lines; shell_exec() returns the whole output as a string
$last_lines = shell_exec("tail -n " . $lines_to_keep . " filename");
Score: 3

For Linux you can do

$linesToRead = 10;
exec("tail -n{$linesToRead} {$myFileName}" , $content); 

You will get an array of lines in the $content variable.

Pure PHP solution

$f = fopen($myFileName, 'r');

$maxLineLength = 1000;  // Real maximum length of your records
$linesToRead = 10;
// Move the cursor back from the end of the file. fseek() returns -1 when the
// file is shorter than this offset; reading then starts from the beginning.
fseek($f, -$maxLineLength * $linesToRead, SEEK_END);
$res = array();
while (($buffer = fgets($f, $maxLineLength)) !== false) {
    $res[] = $buffer;
}
fclose($f);

$content = array_slice($res, -$linesToRead);
Score: 3

If you know about how long the lines are, you can avoid a lot of the black magic and just grab a chunk of the end of the file.

I needed the last 15 lines from a very large log file, and altogether they were about 3000 characters. So I just grab the last 8000 bytes to be safe, then read the file as normal and take what I need from the end.

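    $fh = fopen($file, "r");
    fseek($fh, -8192, SEEK_END);
    $lines = array();
    // compare against false so the failed final read is not stored as a line
    while (($line = fgets($fh)) !== false) {
        $lines[] = $line;
    }
    fclose($fh);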

This is possibly even more efficient than the highest rated answer, which reads the file character by character, compares each character, and splits based on newline characters.
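
A slightly fuller sketch of the same idea, for the case where you want exactly the last few lines back as an array. The helper name last_lines_from_chunk and the default chunk size are assumptions for illustration; it only works if the chunk is big enough to hold the lines you ask for.

```php
<?php
// Hypothetical helper; name and default chunk size are illustrative only.
function last_lines_from_chunk(string $file, int $count, int $chunkSize = 8192): array
{
    if ($count <= 0) {
        return [];
    }
    $fh = fopen($file, 'r');
    if ($fh === false) {
        return [];
    }

    // Never seek back further than the file is long.
    $size = filesize($file);
    $offset = min($size, $chunkSize);
    fseek($fh, -$offset, SEEK_END);
    $chunk = stream_get_contents($fh); // everything from here to EOF
    fclose($fh);

    if ($chunk === false || $chunk === '') {
        return [];
    }

    // Remove one trailing newline so explode() does not add an empty last entry.
    if (substr($chunk, -1) === "\n") {
        $chunk = substr($chunk, 0, -1);
    }
    $lines = explode("\n", $chunk);

    // If we started mid-file, the first entry is probably a partial line.
    if ($offset < $size) {
        array_shift($lines);
    }

    return array_slice($lines, -$count);
}
```

Dropping the first entry is deliberately conservative: when the chunk happens to start exactly on a line boundary, one complete line is thrown away, which is harmless as long as the chunk is comfortably larger than what you need.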

Score: 2

Here is another solution. It doesn't have line length control in fgets(), but you can add it.

/* Read file from end line by line */
$fp = fopen( dirname(__FILE__) . '\\some_file.txt', 'r');
$lines_read = 0;
$lines_to_read = 1000;
fseek($fp, 0, SEEK_END); //goto EOF
$eol_size = 2; // for windows is 2, rest is 1
$eol_char = "\r\n"; // mac=\r, unix=\n
while ($lines_read < $lines_to_read) {
    if (ftell($fp)==0) break; //break on BOF (beginning...)
    do {
        fseek($fp, -1, SEEK_CUR); //seek 1 by 1 char from EOF
        $eol = fgetc($fp) . fgetc($fp); //search for EOL (remove 1 fgetc if needed)
        fseek($fp, -$eol_size, SEEK_CUR); //go back for EOL
    } while ($eol != $eol_char && ftell($fp)>0 ); //check EOL and BOF

    $position = ftell($fp); //save current position
    if ($position != 0) fseek($fp, $eol_size, SEEK_CUR); //move for EOL
    echo fgets($fp); //read LINE or do whatever is needed
    fseek($fp, $position, SEEK_SET); //set current position
    $lines_read++;
}
fclose($fp);
Score: 1

While searching for the same thing, I came across the following and thought it might be useful to others as well, so I'm sharing it here:

/* Read file from end line by line */

function tail_custom($filepath, $lines = 1, $adaptive = true) {
        // Open file
        $f = @fopen($filepath, "rb");
        if ($f === false) return false;

        // Sets buffer size, according to the number of lines to retrieve.
        // This gives a performance boost when reading a few lines from the file.
        if (!$adaptive) $buffer = 4096;
        else $buffer = ($lines < 2 ? 64 : ($lines < 10 ? 512 : 4096));

        // Jump to last character
        fseek($f, -1, SEEK_END);

        // Read it and adjust line number if necessary
        // (Otherwise the result would be wrong if file doesn't end with a blank line)
        if (fread($f, 1) != "\n") $lines -= 1;

        // Start reading
        $output = '';
        $chunk = '';

        // While we would like more
        while (ftell($f) > 0 && $lines >= 0) {

            // Figure out how far back we should jump
            $seek = min(ftell($f), $buffer);

            // Do the jump (backwards, relative to where we are)
            fseek($f, -$seek, SEEK_CUR);

            // Read a chunk and prepend it to our output
            $output = ($chunk = fread($f, $seek)) . $output;

            // Jump back to where we started reading
            fseek($f, -mb_strlen($chunk, '8bit'), SEEK_CUR);

            // Decrease our line counter
            $lines -= substr_count($chunk, "\n");

        }

        // While we have too many lines
        // (Because of buffer size we might have read too many)
        while ($lines++ < 0) {
            // Find first newline and remove all text before that
            $output = substr($output, strpos($output, "\n") + 1);
        }

        // Close file and return
        fclose($f);     
        return trim($output);

    }
Score: 0

As Einstein said, everything should be made as simple as possible, but no simpler. At this point you need a data structure: a LIFO data structure or, simply put, a stack.
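
One way to read the stack suggestion is sketched below using PHP's SplStack. The function name lines_in_reverse is made up for illustration. Note that this pushes every line onto the stack, so the whole file ends up in memory; for a really large file one of the seek-based approaches in the other answers is likely the better fit.

```php
<?php
// Hypothetical helper illustrating the LIFO idea; not taken from the answer.
function lines_in_reverse(string $path): array
{
    $fh = fopen($path, 'r');
    if ($fh === false) {
        return [];
    }

    // Push lines in file order...
    $stack = new SplStack();
    while (($line = fgets($fh)) !== false) {
        $stack->push(rtrim($line, "\r\n"));
    }
    fclose($fh);

    // ...and pop them back off, which yields last line first.
    $reversed = [];
    while (!$stack->isEmpty()) {
        $reversed[] = $stack->pop();
    }
    return $reversed;
}
```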

Score: 0

A more complete example of the "tail" suggestion above is provided here. This seems to be a simple and efficient method; thank you. Very large files should not be an issue and a temporary file is not required.

$buf = array();
$ret = null;

// capture the last 30 lines of the log file into a buffer
exec('tail -n 30 ' . escapeshellarg($weatherLog), $buf, $ret);

if ( $ret == 0 ) {

  // process the captured lines one at a time
  foreach ($buf as $line) {
    // sscanf() returns the number of values assigned; require both fields
    $n = sscanf($line, "%s temperature %f", $dt, $t);
    if ( $n == 2 ) $temperature = $t;
    $n = sscanf($line, "%s humidity %f", $dt, $h);
    if ( $n == 2 ) $humidity = $h;
  }
  printf("<tr><th>Temperature</th><td>%0.1f</td></tr>\n", 
          $temperature);
  printf("<tr><th>Humidity</th><td>%0.1f</td></tr>\n", $humidity);
}
else { /* something bad happened */ }

In the above example, the code reads 30 lines of text output and displays the last temperature and humidity readings in the file (that's why the printf's are outside of the loop, in case you were wondering). The file is filled by an ESP32, which adds to the file every few minutes even when the sensor reports only nan. So thirty lines gets plenty of readings and it should never fail. Each reading includes the date and time, so in the final version the output will include the time the reading was taken.
