Ticket Attachment Encoding

Moderator: crythias

Post Reply
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Ticket Attachment Encoding

Post by Xypher »

Hello,
I've just recently upgraded our company's OTRS to 3.1.7 and we're experiencing an issue with binary attachments sent through tickets. It appears the files' encoding is converted because the attachment will have many ?'s in it when viewed from a hex editor.

Previously we've had this same issue but I was able to fix it in the SysConfig: by changing default charset from utf8 to iso~ (3.0.10).

I've scoured the internet trying to find a solution to no avail. It does not appear to be a common issue for others?

Please someone help :(
CSL
Znuny expert
Posts: 159
Joined: 11 Nov 2011, 19:27
Znuny Version: 3.0.11

Re: Ticket Attachment Encoding

Post by CSL »

What database software are you using?

OTRS versions 3.0 onwards are only meant to work with character encoding UTF8. This applies to all database tables, default database charset, webserver charset, etc. If you're using an ISO variant this could work but I can't imagine it being good, as you may end up with mixed charsets inside the database (and that can be unpleasant to have to deal with).

When you say you are viewing binary data in a hex editor, do you mean the binary form of the articles? Or the article attachments?
Backend: OTRS 3.0.11 RedHat Enterprise Linux 6.2, Apache, MySQL with replication
Frontend: OTRS 3.0.11 RedHat Enterprise Linux 6.2 with SELinux, Apache SSL
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

CSL,

We're using MySQL 5.1.49-3 on Debian 6.0. Just this morning I converted our db from latin1 to utf8 (hoping that would fix this). Although the 'file dbdump.sql' reported the dump was utf8 anyhow... (the database was not created as utf8 though)

With the hex editor I was viewing the downloaded attachment to compare to the original.
CSL
Znuny expert
Posts: 159
Joined: 11 Nov 2011, 19:27
Znuny Version: 3.0.11

Re: Ticket Attachment Encoding

Post by CSL »

You will need to force MySQL to use UTF8 as it's default charset for everything to be sure that it is the only encoding used. Enter the following into the MySQL command line to see what charsets MySQL is using for its defaults:

Code: Select all

show variables like 'char%';
It should look like this:

Code: Select all

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
If it doesn't, and there are 'latin1' or 'iso' charsets in there, then add the following to your my.cnf file:

Code: Select all

character-set-server=utf8
collation-server=utf8_general_ci
skip-character-set-client-handshake
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET collation_database = utf8_general_ci'
init_connect='SET NAMES utf8'
Then restart MySQL. That will make sure that MySQL only deals with UTF8. Also make sure OTRS is set to use UTF8 in sysconfig.

Use this command to find out which of your tables are set to use something other than UTF8:

Code: Select all

show table status where collation != 'utf8_general_ci' \G
If you have any listed here then you'll need to dump and convert them in a specific way. I found this site very helpful when I had the same issue on an older 2.4 install:
http://www.bluebox.net/news/2009/07/mysql_encoding/
Backend: OTRS 3.0.11 RedHat Enterprise Linux 6.2, Apache, MySQL with replication
Frontend: OTRS 3.0.11 RedHat Enterprise Linux 6.2 with SELinux, Apache SSL
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

Ah son of a...

Yeah "show variables like 'char%';" was tainted by latin1 and every table status showed latin1

I missed one command in between the mysqldump and import. Thought since 'file backup.sql' came back as UTF8 there was no need.

Currently in process of re-dumping the db and will do the perl regex substitute to switch the table encoding.


Thank you, I'll update how it goes! :)
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

Ok so yesterday was horrible, our ticketing system was down for a large portion of the day due to dumping & recreating the db.

This morning I was able to convert all the tables to utf8 (yesterday I tried using a perl regex substitute to convert them but ended up getting key max size 1000 errors on import) but now attachments are getting cut short. It appears as though a file will be cut right at the point where the first char is found to have greater than 220~+ value.

Now what's going on? :(

All the character_set's are utf8 (minus the filesystem; which is binary), I added the lines to my.cnf and all the tables show as utf8_general_ci collation
CSL
Znuny expert
Posts: 159
Joined: 11 Nov 2011, 19:27
Znuny Version: 3.0.11

Re: Ticket Attachment Encoding

Post by CSL »

You backed up your OTRS database before this, or have a scheduled backup running on it, yes? It would be very preferable / advisable to copy your OTRS install (application folder and database) to another machine and attempt all of this there before touching production.

It sounds like you could be hitting a max_allowed_packet size limit when dumping the database:

https://dev.mysql.com/doc/refman/5.1/en ... wed_packet

mysqldump doesn't respect your MySQL server's max_allowed_packet value, and you will need to specify it manually in the command when dumping the database, like this:

mysqldump -u root -p --max_allowed_packet=128M dbname > filename.mysql

Those key 1000 errors sound like the indexes in some of the tables aren't being re-created properly due to the increased size of UTF8 characters (3 bytes instead of 1 byte). You should be able to fix this with mysqlcheck. First of all, run mysqlcheck in its basic form (mysqlcheck -u root -p otrs) to check the tables. If it reports that any need fixing, backup your database again (as fixing can result in data loss) and run mysqlcheck -r -u root -p otrs to attempt a repair.
Backend: OTRS 3.0.11 RedHat Enterprise Linux 6.2, Apache, MySQL with replication
Frontend: OTRS 3.0.11 RedHat Enterprise Linux 6.2 with SELinux, Apache SSL
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

When making a dump of the database I was using max_allowed_packet=512M

mysqlcheck came back OK on all tables.
CSL
Znuny expert
Posts: 159
Joined: 11 Nov 2011, 19:27
Znuny Version: 3.0.11

Re: Ticket Attachment Encoding

Post by CSL »

That's good, so the only problem remaining now is the attachments. These are all stored in the article_attachment table as longblob. As I'm typing this I'm actually remembering something about blobs being excluded from conversion operations, because they are not stored in any particular charset (they are raw binary, you can view them using a specific charset, but they are not stored conforming to any charset): http://dev.mysql.com/doc/refman/5.0/en/blob.html

So if you have ran a charset conversation operation on them such as iconv, then you may have rendered them unreadable. I think the solution was to do the reverse conversion on them to restore them to their original format, so in this case from UTF8 to iso~ or latin1.

You could try to run an ALTER TABLE statement to see if this does the trick, or failing that, dump the article_attachment table on its own, convert it back, and then re-import it. There is also a longblob column in the article_plain table that you may have to do this with as well.

If the data itself has been truncated by the conversion process, where a too-large value couldn't be inserted, then you may have to restore the article_attachment table from a backup of the database instead.
Backend: OTRS 3.0.11 RedHat Enterprise Linux 6.2, Apache, MySQL with replication
Frontend: OTRS 3.0.11 RedHat Enterprise Linux 6.2 with SELinux, Apache SSL
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

I saw the iconv command but based on the status of the conversion it was not needed. All attachments that exist and that are not corrupt are still the same today.

The truncation happens at submission of the attachment. It will even show the file as short in size when viewed from the ticket (AgentTicketZoom). So no, the conversion process did not truncate anything. It is post-conversion where it is occurring.

I'm just baffled more people aren't complaining about this. Or unless I missed something in the upgrade process.
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

So I find out our company has a basic subscription with OTRS and we have a ticket open with them on this issue now. I'll update with the resolution once one is found.
CSL
Znuny expert
Posts: 159
Joined: 11 Nov 2011, 19:27
Znuny Version: 3.0.11

Re: Ticket Attachment Encoding

Post by CSL »

Ok - it sounded from your post like the issue was with attachments being truncated in the DB after the conversion. If that's not happened then that's excellent and it looks like you've successfully converted it.

This instead sounds like a webserver issue. What charset is your webserver set to use? You can check in Firefox by going to the 'Web Developer' menu - 'Character encoding' and then seeing which one is checked when your browse OTRS. Just a last thought before you carry on with the official support channel.
Backend: OTRS 3.0.11 RedHat Enterprise Linux 6.2, Apache, MySQL with replication
Frontend: OTRS 3.0.11 RedHat Enterprise Linux 6.2 with SELinux, Apache SSL
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

CSL wrote:.. What charset is your webserver set to use? You can check in Firefox by going to the 'Web Developer' menu - 'Character encoding' ..
I don't see the option of Character Encoding but when I view the Page Info after visiting our OTRS site it shows "Content-Type = text/html;charset=utf-8".

(I did however see under the View menu a Character Encoding submenu which had UTF-8 selected... not sure if selected by the site content or by default though)
CSL
Znuny expert
Posts: 159
Joined: 11 Nov 2011, 19:27
Znuny Version: 3.0.11

Re: Ticket Attachment Encoding

Post by CSL »

Ok that sounds right - the web developer menu is on newer versions of firefox that have the firefox button on the top left, so not important.

It may be worth switching your attachment method from DB to filesystem to see if this resolves the issue, and would let you point the finger at the database if it does. Beyond that I'm not sure what else to suggest, so carrying on with official support is probably the way to go.
Backend: OTRS 3.0.11 RedHat Enterprise Linux 6.2, Apache, MySQL with replication
Frontend: OTRS 3.0.11 RedHat Enterprise Linux 6.2 with SELinux, Apache SSL
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

Ok will do. Thank you for all your help CSL.
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

OK so OTRS wasn't much help. Their response was:
It seems to me that part of the schema must have gotten lost during the utf-8 conversion. The help I can offer for database conversion is limited since this is an issue on the MYSQL end. Generally speaking I would recommend checking the datatype for 'articleattachement' in order to see what it is set to.
Any thoughts?
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

So if one of you could, please provide an outline or link to an official site showing step by step what to do to convert a MySQL db to utf8 (with OTRS in mind.)
crythias
Moderator
Posts: 10169
Joined: 04 May 2010, 18:38
Znuny Version: 5.0.x
Location: SouthWest Florida, USA
Contact:

Re: Ticket Attachment Encoding

Post by crythias »

note that I haven't tried this, but this person seems to have done something: http://blogs.law.harvard.edu/djcp/2010/ ... right-way/
YMMV.
OTRS 6.0.x (private/testing/public) on Linux with MySQL database.
Please edit your signature to include your OTRS version, Operating System, and database type.
Click Subscribe Topic below to get notifications. Consider amending your topic title to include [SOLVED] if it is so.
Need help? Before you ask
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

crythias wrote:note that I haven't tried this, but this person seems to have done something: http://blogs.law.harvard.edu/djcp/2010/ ... right-way/
YMMV.
Seems a tad bit easy... even a couple of their comment'ers had a thing or two to say about the method provided. I'll have to get our test server up and try it there.

Thank you crythias
Xypher
Znuny newbie
Posts: 48
Joined: 31 Jul 2012, 15:13
Znuny Version: 3.1.7
Real Name: Jonathan
Company: Systems Integration, Inc.

Re: Ticket Attachment Encoding

Post by Xypher »

Should any of the OTRS tables be collation of 'utf8_bin' (Say perhaps the article_attachment table) or does this not matter?
CSL
Znuny expert
Posts: 159
Joined: 11 Nov 2011, 19:27
Znuny Version: 3.0.11

Re: Ticket Attachment Encoding

Post by CSL »

It doesn't matter, they should all be utf8_general_ci from what I understand (that's how my own system is set up).
Backend: OTRS 3.0.11 RedHat Enterprise Linux 6.2, Apache, MySQL with replication
Frontend: OTRS 3.0.11 RedHat Enterprise Linux 6.2 with SELinux, Apache SSL
mishelpdesk
Znuny newbie
Posts: 4
Joined: 29 Dec 2012, 00:27
Znuny Version: 3.2.9
Real Name: Anne McGee
Company: Insitu

Re: Ticket Attachment Encoding

Post by mishelpdesk »

Here's what worked for me:

Check that character set was latin1 instead of utf8 as suggested:
SHOW VARIABLES LIKE 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

Check my tables and found 8 that were wonky
show table status where collation != 'utf8_general_ci' \G

Fixed the otrs database:
ALTER DATABASE otrs DEFAULT CHARACTER SET utf8 COLLATE=utf8_general_ci;

Then fixed the tables. Luckily for me there were only 8 so I did them like this:

ALTER TABLE `sessions` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `pm_activity` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `pm_activity_dialog` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `pm_entity` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `pm_entity_sync` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `pm_process` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `pm_transition` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `pm_transition_action` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

It fixed everything except character_set_server was still latin1 when I did SHOW VARIABLES command again.
So I adding this line in my.cnf:

[mysqld]
character-set-server = utf8

Everything seems to work ok now ... cross your fingers!
OTRS Version 3.1.12
ITSM 2.0 Add-on
Linux / Ubuntu 12.04.1
Post Reply