1 20th May 12:05
php-bugs
External User
 
Posts: 1
#36415 : transformToXML will NOT output UTF-8


From: memoimyself at yahoo dot com dot br
Operating system: Windows XP
PHP version: 5.1.2
PHP Bug Type: XSLT related
Bug description: transformToXML will NOT output UTF-8

Description:
------------
An instance of the XSLTProcessor class, via its transformToXML method, is
used to transform XML documents using an XSL stylesheet.

The XSL document is in a file that is encoded in UTF-8.

The PHP script is in a file that is also encoded in UTF-8.

The XML documents are created at run time from XML strings stored in a
MySQL 5 database whose character set is UTF-8 and whose tables all have
UTF-8 as their character set as well.

All the XML strings stored in the database are duly encoded in UTF-8.

Prior to data retrieval, a 'SET NAMES "utf8"' query is run to ensure that
all i/o operations use the UTF-8 character set.

Upon transformation, the results are output to the client preceded by
"header('Content-Type: text/html; charset=UTF-8')" to ensure that the
browser uses the correct character set.

The XSL file has the following top-level element (a child node of the
document element, as it should be):

<xsl:output encoding="utf-8" method="html"/>

When this code is run on a Windows server (Win XP, Apache 2.0.55, PHP
5.1.2), the transformation NEVER outputs UTF-8 text (it seems to output
iso-8859-1), even if the 'method' attribute in the above element is
changed to 'xml', and even if a 'media-type' attribute is also used.

When run on a Linux server (also running PHP 5.1.2), the transformation
runs as expected and outputs proper UTF-8 text to the browser.

Reproduce code:
---------------
PHP code:

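// BD_DSN, BD_USERNAME and BD_PWD are constants that are not shown in the report.
$dbo = new PDO(BD_DSN, BD_USERNAME, BD_PWD);
// Make the MySQL connection exchange data as UTF-8.
$dbo->query('SET NAMES "utf8"');
// strip_gpc_slashes() is the reporter's own helper; it is not shown in the report.
$sql = 'SELECT Report FROM reports WHERE Id = '
     . $dbo->quote(strip_gpc_slashes($_GET['rid']))
     . ' AND Author = ' . $dbo->quote($_SESSION['user']);
$result = $dbo->query($sql);
$row = $result->fetch(PDO::FETCH_OBJ);
// Build the source document from the XML string stored in the database.
$xml = new DOMDocument('1.0', 'UTF-8');
$xml->loadXML($row->Report);
// Load the stylesheet from disk.
$xsl = new DOMDocument('1.0', 'UTF-8');
$xsl->load('/path/to/xsl/file.xsl');
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
$output = $proc->transformToXML($xml);
header('Content-Type: text/html; charset=utf-8');
print $output;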

Start of XSL document:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/11/schema-for-xslt20.xsd">
<xsl:output encoding="utf-8" method="html"/>
<xsl:template match="/">...

Expected result:
----------------
All text output to the browser should be proper UTF-8. If the browser's
character encoding is set to UTF-8 (which it should be, given the
"content-type" header above), all accented characters should be displayed
correctly.

Actual result:
--------------
When the code is run on a Windows XP server, the text output to the
browser is NOT proper UTF-8 and all accented characters are replaced by
weird symbols.

When the code is run on a Linux server (also equipped with PHP 5.1.2),
everything works as expected and the output is proper UTF-8.
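
One way to tell which encoding actually reached the browser is to look at the raw bytes of a known accented character instead of the rendered page. The sketch below is not from the report: the helper name classify_e_acute is made up for illustration, and it only makes sense for test output that is known to contain "é" and no other non-ASCII characters.

```php
<?php
// Hypothetical helper, not part of the original report.
// Reports how the character "é" (U+00E9) is encoded in $output.
// Only meaningful when "é" is the sole non-ASCII character in the test data.
function classify_e_acute($output)
{
    // UTF-8 that was misread as ISO-8859-1 and then encoded to UTF-8 again.
    if (strpos($output, "\xC3\x83\xC2\xA9") !== false) {
        return 'double-encoded UTF-8';
    }
    // Correct UTF-8: two bytes, C3 A9.
    if (strpos($output, "\xC3\xA9") !== false) {
        return 'UTF-8';
    }
    // ISO-8859-1: a single byte, E9.
    if (strpos($output, "\xE9") !== false) {
        return 'ISO-8859-1';
    }
    // Escaped as a character reference, which is independent of the encoding.
    if (strpos($output, '&eacute;') !== false || strpos($output, '&#233;') !== false) {
        return 'character reference';
    }
    return 'not found';
}

echo classify_e_acute("<p>caf\xC3\xA9</p>"), "\n"; // UTF-8
echo classify_e_acute("<p>caf\xE9</p>"), "\n";     // ISO-8859-1
```

Running the $output of the reproduce code through a check like this on both servers would show whether the Windows result really is iso-8859-1, as suspected above, or something else such as double-encoded text.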

--
Edit bug report at http://bugs.php.net/?id=36415&edit=1
2 20th May 12:05
External User
 
Posts: 1


ID: 36415
Updated by: chregu@php.net
Reported By: memoimyself at yahoo dot com dot br
-Status: Open
+Status: Feedback
Bug Type: XSLT related
Operating System: Windows XP
PHP Version: 5.1.2
New Comment:

Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves.

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external
resources such as databases, etc.

If possible, make the script source available online and provide
a URL to it here. Try to avoid embedding huge scripts into the report.

This is not a reproducible script.
We need something we can copy, paste and run.

And please do not open 2 reports for the same problem.
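
For illustration, a script of the kind requested here might look like the sketch below. It is not the reporter's code: the sample XML, the trimmed-down stylesheet and the variable names $xmlSource and $xslSource are assumptions, chosen so that nothing depends on a database or on external files. It needs the dom and xsl extensions.

```php
<?php
// Self-contained sketch: no database, no external files.
// "\xC3\xA9" is the UTF-8 byte sequence for "é" (U+00E9).
$xmlSource = '<?xml version="1.0" encoding="UTF-8"?>'
           . "<report><title>caf\xC3\xA9</title></report>";

// Minimal stylesheet with the same xsl:output element as in the report.
$xslSource = '<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output encoding="utf-8" method="html"/>
  <xsl:template match="/">
    <p><xsl:value-of select="/report/title"/></p>
  </xsl:template>
</xsl:stylesheet>';

$xml = new DOMDocument('1.0', 'UTF-8');
$xml->loadXML($xmlSource);

$xsl = new DOMDocument('1.0', 'UTF-8');
$xsl->loadXML($xslSource);

$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
$output = $proc->transformToXML($xml);

// Print the bytes as hex so the result can be compared across servers
// without a browser or terminal re-interpreting the characters.
echo bin2hex($output), "\n";
```

If the hex dump contains c3a9 the transformation produced UTF-8 for the accented character; a lone e9 would point to iso-8859-1.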


Previous Comments:
------------------------------------------------------------------------

[2006-02-16 12:59:42] memoimyself at yahoo dot com dot br


------------------------------------------------------------------------


--
Edit this bug report at http://bugs.php.net/?id=36415&edit=1
3 26th May 10:47
php-bugs
External User
 
Posts: 1


ID: 36415
Comment by: vodka_carambar_lovely_spam at yahoo dot com
Reported By: memoimyself at yahoo dot com dot br
Status: No Feedback
Bug Type: XSLT related
Operating System: Windows XP
PHP Version: 5.1.2
New Comment:

I had the same problem with a similar setup. The only difference was
the OS (Ubuntu); the PHP and Apache versions were the same. Double-check
that the content going into transformToXML really is valid UTF-8.
The slightest encoding slip-up anywhere in the code will screw
everything up.

In my case, I had to open a file that was supposed to be UTF-8, copy and
paste its contents into another document, and "save as" with UTF-8
encoding.
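
A quick way to do that double-check in code is sketched below. The helper name is_valid_utf8 is made up for this example; it relies on PCRE's /u modifier, which makes preg_match() fail on subjects that are not valid UTF-8.

```php
<?php
// Hypothetical helper, not part of the original report.
function is_valid_utf8($bytes)
{
    // With the /u modifier PCRE validates the subject as UTF-8 first;
    // preg_match() returns 1 only if the (empty) pattern matched.
    return preg_match('//u', $bytes) === 1;
}

var_dump(is_valid_utf8("caf\xC3\xA9")); // bool(true)
var_dump(is_valid_utf8("caf\xE9"));     // bool(false)
```

In the reproduce code, the same check could be applied to $row->Report and to the contents of the stylesheet file before either is loaded. Passing the check only shows that the bytes are well-formed UTF-8; it does not prove that the text was not already mangled earlier on.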


Previous Comments:
------------------------------------------------------------------------

[2006-02-24 01:00:03] php-bugs at lists dot php dot net

No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

------------------------------------------------------------------------

[2006-02-16 13:17:21] chregu@php.net

------------------------------------------------------------------------


--
Edit this bug report at http://bugs.php.net/?id=36415&edit=1