Using PHP mySQL with CSV data containing BOM
Get the solution ↓↓↓I have a database that holds stock levels for certain items that are supplied by different suppliers. Each supplier sends me a daily CSV file with their current stock levels. I am trying to update the stock levels into my database.
The problem I am having is that when I extract the data from the CSV and send it through queries, it is not being working properly.
I have echoed the queries prior to sending them and the output is fine. Using phpMyAdmin, if I just paste the code as it is echoed, it works fine. This has led me to believe that it is an encoding problem.
Viewing the CSV file in cPanel File Manager I see there is an odd character at the beginning of the file. (I believe this is caleld a BOM). If I delete this characted and save the CSV file then my code works perfectly and the databse updates as expected.
Editing the file in cPanel File Manager, the Encoding opens as ansi_x3.110-1983. While manually deleting the character will fix the issue, it is not an option as I want this to be a fully automated daily process.
My code to open the file and extract the data from CSV:
// Open File
$csvData = fopen($file, "r");
if($csvData !== FALSE)
{
while(!feof($csvData))
{
$csvRow[] = fgetcsv($csvData, 100);
}
}
// Close file
fclose($csvData);
My code to build a simple search query
foreach($csvRow as $row)
{
$searchQuery = "SELECT * FROM supplier WHERE supplierItemCode = '".$row[0]."'";
$result = $conn->query($searchQuery);
echo "<br>".$searchQuery;
if($result->num_rows > 0)
{
// CODE NEVER REACHES HERE
}
As mentioned, if I simply paste the echo of$searchQuery
into phpMyAdmin and run the query it works fine.
I have tried usingfseek($csvData, 2)
which successfully removes the BOM characters from the first row of data, but that is having no effect.
As suggested, I have tried using
$csvData = fopen($file, "r");
$BOM = null;
if($csvData !== FALSE)
{
$BOM = fread($csvData, 3);
if($BOM !== FALSE)
{
if($BOM != "\xef\xbb\xbf")
{
echo "<h5>BOM: ".$BOM; // This code is executed every time
fseek($csvData, 0);
}
}
//fseek($csvData, 2); // This was my earlier attempts without the above BOM filter
while(!feof($csvData))
{
$csvRow[] = fgetcsv($csvData, 100);
}
}
Using the BOM filter method produces this output.
As a further note, you'll notice that in my Update query output, there is a blank space in the SET quantity column. This space is not visible in the csv file.
This query is built with
$updateQuery = "UPDATE supplier SET ".$supplier." = '".$row[2]."' WHERE supplierItemCode = '".$row[0]."'";
Any suggestions on what exactly is causing this issue and how I can get around it.
Thanks in advance.
Answer
Solution:
Try the following modification to the code that opens and reads the CSV file. It checks for the presence of the BOM and bypasses it if present:
$cvsRow = [];
// Open File
$csvData = fopen($file, "r");
if($csvData !== FALSE)
{
$BOM = fread($csvData, 4); // read potential BOM sequences to see if one is present or not
if ($BOM !== FALSE)
{
if (strlen($BOM) >= 3 && substr_compare($BOM, "\xef\xbb\xbf", 0, 3) == 0)
{
fseek($csvData, 3); // found UTF-8 encoded BOM
}
elseif (strlen($BOM) >= 2 && (substr_compare($BOM, "\xfe\xff", 0, 2) == 0 || substr_compare($BOM, "\xff\xfe", 0, 2) == 0))
{
fseek($csvData, 2); // found UTF-16 encoded BOM
}
elseif ($BOM != "\00\00\xfe\xff" && $BOM != "\xff\xfe\00\00")
{
fseek($csvData, 0); // did not find UTF-32 encoded BOM
}
while(!feof($csvData))
{
$csvRow[] = fgetcsv($csvData, 100);
}
}
// Close file (only if it has been successfully opened)
fclose($csvData);
}
Answer
Solution:
I finally got a solution to work. After doing a lot of investigating, I believed it was encoded in UTF-16, despite what the BOM characters may have been saying.
I just wrote a simple function to convert each CSV value I was passing to the SQL.
function Convert($str)
{
return mb_convert_encoding($str, "UTF-8", "UTF-16BE");
}
........
$updateQuery = "UPDATE supplier SET ".$supplier." = '".Convert($row[2])."' WHERE supplierItemCode = '".Convert($row[0])."'";
I'm not sure why the BOM was causing such issue and why removing it entirely was not working. Thanks for everyone's help that lead me to discover the encoding problem.
Share solution ↓
Additional Information:
Link To Answer People are also looking for solutions of the problem: closed without sending a request; it was probably just an unused speculative preconnection
Didn't find the answer?
Our community is visited by hundreds of web development professionals every day. Ask your question and get a quick answer for free.
Similar questions
Find the answer in similar questions on our website.
Write quick answer
Do you know the answer to this question? Write a quick response to it. With your help, we will make our community stronger.