Problem with character encoding

Totoleheros
Posts: 7
Joined: Sat Apr 04, 2009 4:33 pm

Post by Totoleheros » Tue Apr 13, 2010 8:08 pm

Hi,

I have already reported a related issue (http://forums.obdev.at/viewtopic.php?f=6&t=2614&sid=67852807ae197c4cc6f97e75e292e809): I want to display in my sidebar the titles of seminars that are entered on a WebYep page (in a loop). The 'global trick' does not work as expected (because of the loop?). To work around this, I have written a PHP script that scans the data directory, reads the contents of the files, and extracts the 'Titles' of the seminars from the WebYep page. Here is the code:

<?php
// directory path can be either absolute or relative
$dirPath = '../../webyep-system2/data/';

// open the specified directory and check if it's opened successfully
if ($handle = opendir($dirPath))
{

   // keep reading the directory entries 'til the end
   while (false !== ($file = readdir($handle)))
   {

      // just skip the reference to current and parent directory
      if ($file != "." && $file != "..")
      {
         // split the file name on '-' and look for the 'TitleSeminar' part
         $array_explode=(explode('-', $file, 10));
         if (in_array("TitleSeminar", $array_explode))
         {
            $filetoopen=$dirPath.$file;
            $fh = fopen($filetoopen, 'r');
            $theData = fread($fh, filesize($filetoopen));
            fclose($fh);
            // drop the first 43 characters, then the quotes, braces and semicolons
            $cleanedstr1 = substr_replace($theData, "", 0, 43);
            $cleanedstr2 = str_replace("\"", "", $cleanedstr1);
            $cleanedstr3 = str_replace("}", "", $cleanedstr2);
            $cleanedstr4 = str_replace(";", "", $cleanedstr3);
            //mb_convert_encoding($cleanedstr4, "utf-8", "auto") does not work
            echo "$cleanedstr4<br>";
         }
      }
   }
   // ALWAYS remember to close what you opened
   closedir($handle);
}
?>


My problem is that this solution yields this:

Thérapie génique : protocoles en cours, résultats cliniques et développements futurs
Leptine et absorption des sucres. Conséquences physiologiques et physiopathologiques
Hypertension artérielle pulmonaire : physiopathologie, génétique et innovations thérapeutiques


instead of:

Thérapie génique : protocoles en cours, résultats cliniques et développements futurs
Leptine et absorption des sucres. Conséquences physiologiques et physiopathologiques
Hypertension artérielle pulmonaire : physiopathologie, génétique et innovations thérapeutiques


As you can see, I have a character encoding problem even though everything has been set to 'UTF-8' (including the WebYep config file). Could someone help? I have the feeling that I am close to the solution...
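
For readers who run into the same symptom: output such as "Ã©" in place of "é" is what UTF-8 bytes look like when the page displaying them is decoded as ISO-8859-1 (Latin-1). The sketch below reproduces the effect on a short string. It assumes the mbstring extension is loaded, and the variable names are illustrative only.

```php
<?php
// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "Th\xC3\xA9rapie";

// The bytes themselves are valid UTF-8...
var_dump(mb_check_encoding($utf8, 'UTF-8')); // bool(true)

// ...but a reader that assumes ISO-8859-1 treats each byte as one character.
// Re-encoding under that wrong assumption reproduces the garbled text.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo $misread, "\n"; // ThÃ©rapie (when viewed as UTF-8)
```

If the stored data already passes the UTF-8 check, converting it again is unlikely to help; the place to look is the charset declared by the page that displays it.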

johannes
Objective Development
Posts: 815
Joined: Fri Nov 10, 2006 4:39 pm

Re: Problem with character encoding

Post by johannes » Wed Apr 21, 2010 6:36 pm

The text (currently) is stored in the encoding used when submitting the text (entering it into the WebYep text editor window) - so if the WebYep config was set to UTF-8 when entering the text, it should be stored in UTF-8 in the datafile.

But I would need to see the page in action - I need to see the page's charset meta tag and the text in source code.
Can you post the URL?

PS: While this is an innovative solution to an otherwise unsolvable problem, please note that accessing the WebYep data files directly is very unstable - a small change to the data structure and this PHP code will fail...
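
One way to soften the dependence on a fixed 43-character offset is to pull out the last quoted value with a regular expression. This is only a sketch: it assumes, based on the characters the script strips, that the value of interest is the final double-quoted string in the file and is followed only by ";" and "}". That layout is not confirmed by WebYep, the function name is hypothetical, and the sample string is made up for illustration. It requires PHP 7.1 or later for the nullable return type.

```php
<?php
/**
 * Return the last double-quoted value in $raw, or null if none is found.
 * Assumption (not confirmed by WebYep): the wanted value is the final
 * quoted string and is followed only by ";", "}" or whitespace.
 */
function extractLastQuoted(string $raw): ?string
{
    if (preg_match('/"([^"]*)"[;}\s]*$/', $raw, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Hypothetical sample content, for illustration only.
$sample = 'a:1:{s:7:"content";s:7:"Leptine";}';
var_dump(extractLastQuoted($sample));
```

Like the original cleaning steps, this breaks if the value itself contains a double quote, and it remains just as exposed to changes in the data format.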

Totoleheros
Posts: 7
Joined: Sat Apr 04, 2009 4:33 pm

Re: Problem with character encoding

Post by Totoleheros » Thu Apr 22, 2010 10:16 am

Johannes,

Thank you for your reply, but I have finally found my mistake. The sidebar of my site is actually a frame, and I forgot to set the encoding of the sidebar page to UTF-8. Everything is working as I want now. I know that this technique is unstable, but it does work. Here is the final PHP code:

<?php
// directory path can be either absolute or relative
$dirPath = '../../webyep-system2/data/';

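//Set a variable used later to find the earliest seminar date
$previousdateseminar=1000000000000000;
//Initialise the output variables so the final test works when no seminar is found
$Date="";
$Conferencier="";
$Title="";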

// open the specified directory and check if it's opened successfully
if ($handle = opendir($dirPath))
{
   // keep reading the directory entries 'til the end
   while (false !== ($file = readdir($handle)))
   {
      // just skip the reference to current and parent directory
      if ($file != "." && $file != "..")
        {
         $array_explode=(explode('-', $file, 10));
         // Explode filename of the files generated by WebYep and containing the required information to collect: here 'TitleSeminar'
         if (in_array("TitleSeminar", $array_explode))
         {
            $filetoopen=$dirPath.$file;
            $fh = fopen($filetoopen, 'r');            
            $theData = fread($fh, filesize($filetoopen));
            // clean the content required: here you should open the file generated by WebYep to guess where is the string you want to collect and remove the unwanted text
            $cleanedTitle1 = substr_replace($theData, "", 0, 43);
            $cleanedTitle2 = str_replace("\"", "", $cleanedTitle1);
            $cleanedTitle3 = str_replace("}", "", $cleanedTitle2);
            $cleanedTitle4 = str_replace(";", "", $cleanedTitle3);
            //Close the title file before opening the next one
            fclose($fh);
            //Reconstitute the filename to get an associated webyep file: here, 'DateSeminar'
            $filenamedate=implode("-", array_slice($array_explode, 0, 3))."-DateSeminar";
            $filetoopen=$dirPath.$filenamedate;
            $fh = fopen($filetoopen, 'r');
            $theData = fread($fh, filesize($filetoopen));
            //Cleaning of the Date
            $cleanedDate1 = substr_replace($theData, "", 0, 43);
            $cleanedDate2 = str_replace("\"", "", $cleanedDate1);
            $cleanedDate3 = str_replace("}", "", $cleanedDate2);
            $cleanedDate4 = str_replace(";", "", $cleanedDate3);
            //Close the date file before opening the next one
            fclose($fh);
            //Another file to get: 'Conferencier'
            $filenameconferencier=implode("-", array_slice($array_explode, 0, 3))."-Conferencier";
            $filetoopen=$dirPath.$filenameconferencier;
            $fh = fopen($filetoopen, 'r');
            $theData = fread($fh, filesize($filetoopen));
            //Cleaning again
            $cleanedConferencier1 = substr_replace($theData, "", 0, 43);
            $cleanedConferencier2 = str_replace("\"", "", $cleanedConferencier1);
            $cleanedConferencier3 = str_replace("}", "", $cleanedConferencier2);
            $cleanedConferencier4 = str_replace(";", "", $cleanedConferencier3);
                     
            fclose($fh);
            
            //Keep only the seminar with the earliest date
            if (strtotime($cleanedDate4)<$previousdateseminar)
            {
               $Conferencier=$cleanedConferencier4;
               $Date=$cleanedDate4;
               $Title=$cleanedTitle4;
               $previousdateseminar=strtotime($cleanedDate4);
            }
         }
         }
   }
   // print the result only if at least one seminar was found
   if (!empty($Date))
   {
      echo "<strong>$Date :</strong> $Conferencier \"";
      echo "$Title\"<br>";
   }
   
   // close what you opened    
   closedir($handle);
}
?>
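
The script above repeats the same read-and-clean steps for the title, the date and the speaker. A possible way to fold them into one helper is sketched below. The function name and the default of 43 are illustrative, and the cleaning is the same as in the script (drop a fixed-length prefix, then remove every double quote, closing brace and semicolon), so it is no more robust against format changes.

```php
<?php
/**
 * Read one data file and strip the wrapper around its value:
 * drop the first $prefixLength bytes, then remove ", } and ;.
 * Returns null when the file is missing or cannot be read.
 */
function readCleanedField(string $path, int $prefixLength = 43): ?string
{
    if (!is_file($path)) {
        return null;
    }
    $raw = file_get_contents($path);
    if ($raw === false) {
        return null;
    }
    // The cast guards against older PHP versions, where substr() returns
    // false when the file is shorter than the prefix.
    $value = (string) substr($raw, $prefixLength);
    return str_replace(['"', '}', ';'], '', $value);
}
```

Inside the loop, the three passages could then become calls such as `readCleanedField($dirPath.$file)` and `readCleanedField($dirPath.$filenamedate)`, with a null check before the values are used.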


This may help someone one day...
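
For anyone landing here with the same symptom: the actual fix was declaring UTF-8 on the framed sidebar page itself, not changing the script. A minimal sketch of such a declaration follows, assuming the sidebar is a PHP page and that this code runs before any other output; the variable names are illustrative.

```php
<?php
// Declare the charset in the HTTP header. header() only takes effect
// if nothing has been sent to the browser yet.
$contentType = 'text/html; charset=utf-8';
if (!headers_sent()) {
    header('Content-Type: ' . $contentType);
}

// Matching meta tag, to be placed inside the <head> of the same page.
$metaTag = '<meta http-equiv="Content-Type" content="' . $contentType . '">';
echo $metaTag . "\n";
```

Each page loaded in a frame is a separate document, so it needs its own charset declaration even when the surrounding page already declares UTF-8.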

Best regards,

AN

johannes
Objective Development
Posts: 815
Joined: Fri Nov 10, 2006 4:39 pm

Re: Problem with character encoding

Post by johannes » Mon Apr 26, 2010 10:07 pm

Thanks for sharing!
