php - Almost the same code, but different output, why?

Solution:
You are reading the file one byte at a time.
For example, the character б encodes as the bytes 0xD0 0xB1 in UTF-8. The tab character is 0x09.
So without the tab character, you first write 0xD0, then 0xB1, resulting in 0xD0 0xB1, which is valid UTF-8.
With the tab character, you write 0x09 between the bytes, making it 0xD0 0x09 0xB1. 0xD0 followed by 0x09 is not valid UTF-8, so the browser renders the replacement character to deal with it.
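The byte-level effect described above can be checked directly. This is only a minimal sketch: `isValidUtf8` is a hypothetical helper name, and it relies on `preg_match()` with the `/u` modifier returning `false` for a malformed UTF-8 subject.

```php
<?php
// Hypothetical helper: true when $s is well-formed UTF-8.
function isValidUtf8(string $s): bool
{
    // With the /u modifier, preg_match() returns false for malformed UTF-8 subjects.
    return preg_match('//u', $s) === 1;
}

$char = "\xD0\xB1";                      // the two bytes that encode "б"
$withTab = $char[0] . "\t" . $char[1];   // a tab (0x09) written between the two bytes

echo bin2hex($char), ' valid: ', var_export(isValidUtf8($char), true), "\n";
echo bin2hex($withTab), ' valid: ', var_export(isValidUtf8($withTab), true), "\n";
```

The first line reports the intact sequence as valid; the second shows that the same bytes with a tab between them no longer form valid UTF-8.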
You need to be more sophisticated about it; this should work:
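    $file = fopen("t1.txt", "r"); // read-only access is enough here
    // fgetc() returns false at end of file, so test its result instead of feof()
    while (($c = fgetc($file)) !== false)
    {
        $val = ord($c);
        // UTF-8 lead byte: the high bits say how many continuation bytes follow
        if ($val & 0x80) {
            $continuationByteCount = 0;
            if (($val & 0xF8) == 0xF0) $continuationByteCount = 3;
            else if (($val & 0xF0) == 0xE0) $continuationByteCount = 2;
            else if (($val & 0xE0) == 0xC0) $continuationByteCount = 1;
            echo $c;
            while ($continuationByteCount-- > 0) {
                $next = fgetc($file);
                if ($next === false) break; // truncated sequence at end of file
                echo $next;
            }
        }
        else { // Single-byte UTF-8 unit, i.e. ASCII
            echo $c;
        }
        echo "\t";
    }
    fclose($file);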
Alternatively, read the whole file at once and split it into an array where each item is one character (1-4 bytes):
    $chars = preg_split( '//u', file_get_contents("t1.txt"), -1, PREG_SPLIT_NO_EMPTY );
    foreach( $chars as $char ) {
        echo $char;
        echo "\t";
    }
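To see what the split produces, here is a small sketch that runs the same `preg_split` call on an in-memory string instead of the file. The sample string is an assumption standing in for the contents of t1.txt, and the `false` check is an added safeguard, not part of the original answer.

```php
<?php
// Hypothetical sample input: "a", "б" and "€" written as raw UTF-8 bytes
// (1-byte, 2-byte and 3-byte characters respectively).
$text = "a\xD0\xB1\xE2\x82\xAC";

// The empty pattern with /u matches between code points,
// so every array item is one whole character.
$chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

// preg_split() returns false when the input is not valid UTF-8.
if ($chars === false) {
    $chars = [];
}

foreach ($chars as $char) {
    echo $char, ' = ', strlen($char), " byte(s)\n";
}
```

Each item keeps all of its bytes together, so writing a tab after every item never lands inside a multi-byte sequence.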
Answer
Solution:
I think this might be a problem with how the browser detects the encoding. You can try sending the charset explicitly:
<?php
header('Content-type: text/html; charset=utf-8');
?>
Or set the meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />