
A Custom Text Encoding Generator For Silverlight

March 30th, 2010

Unlike the .NET platform, Silverlight only provides two text encodings out of the box: UTF-8 (UTF8Encoding class) and UTF-16 (UnicodeEncoding class).

Accordingly, if you find yourself in a situation where you need to encode or decode data with another encoding (e.g. iso-8859-1), you’ll have to write your own Encoding class (or delegate the work to a server-side service).

I found myself in this exact situation yesterday, and came up with a little tool that automates the process. The Encoding Generator is a WPF application that takes the name or code page of a well-known encoding, and generates source code for a custom Encoding class that compiles under Silverlight.

Get Source Code

Get Compiled Executable

Current version: 1.0.0, 2010.03.31, requires .NET 3.5 SP1 or higher

(You can subscribe to the RSS feed or follow me on Twitter in order to get notified about updates and bug fixes)

[Image: screenshot of the Encoding Generator]

How Does It Work?

Specifying the Encoding

To specify the encoding you want to use, enter either the name or the numeric code page of a well-known encoding. As soon as you enter a valid value, information about the encoding is displayed in the panel on the right-hand side (see the screenshot).

For example, any of the following values tells the tool to generate an iso-8859-1 encoder (see screenshot):

  • iso-8859-1 (name)
  • latin1 (name)
  • 28591 (code page)

A list of encodings can be found here.
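As a rough illustration of how a name or code page can be resolved on the full .NET Framework, here is a minimal sketch built on Encoding.GetEncoding. The EncodingLookup class and its Resolve method are hypothetical names used only for this example; this is not necessarily how the tool itself does the lookup.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class EncodingLookup
{
    // Accepts either a numeric code page ("28591") or a name ("iso-8859-1").
    // Encoding.GetEncoding throws if the value is not a known encoding.
    public static Encoding Resolve(string nameOrCodePage)
    {
        int codePage;
        if (int.TryParse(nameOrCodePage, out codePage))
        {
            return Encoding.GetEncoding(codePage);
        }
        return Encoding.GetEncoding(nameOrCodePage);
    }
}
```

Calling EncodingLookup.Resolve with any of the three values listed above should return the same encoding, whose WebName and CodePage properties can then be displayed.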

Fallback Character

The tool gives you the option to specify a fallback character, which is used as a default whenever a character or byte value that has no mapping is processed during encoding or decoding. If you don’t specify a fallback character, the encoding class will fail at runtime should it receive data that cannot be properly encoded or decoded.

Single-Byte Encoding Limitation

The generated class only works if a single byte can be translated into a single character and vice versa. Accordingly, if you try to generate code for an encoding that uses several bytes per character (e.g. utf-8), the generator shows an error message.
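On the full .NET Framework, a check like this can be expressed with the Encoding.IsSingleByte property. The sketch below assumes that is an acceptable test; the SingleByteCheck class and its method are hypothetical names, and the tool may detect multi-byte encodings differently.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class SingleByteCheck
{
    // Throws if the encoding needs more than one byte for some characters.
    public static void EnsureSingleByte(Encoding encoding)
    {
        if (!encoding.IsSingleByte)
        {
            throw new NotSupportedException(
                encoding.WebName + " is not a single-byte encoding.");
        }
    }
}
```

With this helper, Encoding.GetEncoding(28591) passes the check, while Encoding.UTF8 is rejected.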

Byte Range

You need to specify the byte range of the encoding. For example, ASCII supports only 128 characters and therefore has a byte range of 128. Most other single-byte encodings have a byte range of 256, though. 256 is the maximum that can be specified, as a single byte cannot represent more values (the byte data type covers the numeric range 0 to 255).
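To make the role of the byte range concrete, here is a minimal sketch of how a decoding table could be derived from an existing encoding on the full .NET Framework. It assumes the table is built by decoding each byte value individually; DecodingTableBuilder and Build are hypothetical names, and the generator's actual approach may differ.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class DecodingTableBuilder
{
    // Returns a table where index = byte value and element = decoded character.
    // byteRange limits how many byte values are covered (128 for ASCII, 256 at most).
    public static char[] Build(Encoding source, int byteRange)
    {
        if (byteRange < 1 || byteRange > 256)
        {
            throw new ArgumentOutOfRangeException("byteRange");
        }

        char[] table = new char[byteRange];
        byte[] buffer = new byte[1];
        for (int value = 0; value < byteRange; value++)
        {
            buffer[0] = (byte)value;
            // A single-byte encoding yields exactly one character per byte.
            table[value] = source.GetChars(buffer)[0];
        }
        return table;
    }
}
```

For ASCII with a byte range of 128, the table has 128 entries and index 65 holds 'A'. For iso-8859-1 with a byte range of 256, index 0xE9 holds 'é'.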

Testing

The generator also creates an NUnit test class that compares the results of the generated class against the original encoding. Accordingly, this test class is meant to run in a regular .NET environment, not in Silverlight (if the original encoding used in the test were available in Silverlight, you wouldn’t have to generate a custom encoding class in the first place…).
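The idea behind such a comparison can be sketched without NUnit. The helper below is an assumption about what a test of this kind might check, not the generated test class itself: it decodes every byte value in the range with both encodings and compares the results. EncodingComparison and DecodesIdentically are hypothetical names.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only; runs on the full .NET Framework.
public static class EncodingComparison
{
    // True if both encodings decode every byte value below byteRange
    // to the same character.
    public static bool DecodesIdentically(Encoding original, Encoding candidate, int byteRange)
    {
        byte[] allBytes = new byte[byteRange];
        for (int i = 0; i < byteRange; i++)
        {
            allBytes[i] = (byte)i;
        }

        string expected = original.GetString(allBytes, 0, allBytes.Length);
        string actual = candidate.GetString(allBytes, 0, allBytes.Length);
        return expected == actual;
    }
}
```

In a real test, the candidate would be an instance of the generated class. As a stand-in, comparing Encoding.ASCII with iso-8859-1 matches for a byte range of 128 but not for 256, because the two encodings disagree on byte values above 127.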

Internals

At runtime, the generated class uses mapping tables to translate characters to bytes and vice versa. For every request, it simply looks up each character or byte value in these translation tables.

The generator creates these translation tables on the fly in the form of a static array and dictionary.
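To illustrate the lookup approach, here is a minimal sketch of a table-based single-byte encoding. The class name Latin1SketchEncoding and the '?' fallback are assumptions made for this example. The tables are filled by a loop, which works because iso-8859-1 maps byte n to code point U+00nn. The code the tool actually generates will look different.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical class, for illustration only.
public sealed class Latin1SketchEncoding : Encoding
{
    // Assumed fallback for characters that have no byte mapping.
    private const char FallbackChar = '?';

    // Decoding table: index = byte value, element = character.
    private static readonly char[] ByteToChar = new char[256];

    // Encoding table: key = character, value = byte.
    private static readonly Dictionary<char, byte> CharToByte =
        new Dictionary<char, byte>();

    static Latin1SketchEncoding()
    {
        for (int i = 0; i < 256; i++)
        {
            ByteToChar[i] = (char)i;
            CharToByte[(char)i] = (byte)i;
        }
    }

    // One byte per character, so all counts are identical.
    public override int GetByteCount(char[] chars, int index, int count) { return count; }
    public override int GetCharCount(byte[] bytes, int index, int count) { return count; }
    public override int GetMaxByteCount(int charCount) { return charCount; }
    public override int GetMaxCharCount(int byteCount) { return byteCount; }

    public override int GetBytes(char[] chars, int charIndex, int charCount,
                                 byte[] bytes, int byteIndex)
    {
        for (int i = 0; i < charCount; i++)
        {
            byte value;
            if (!CharToByte.TryGetValue(chars[charIndex + i], out value))
            {
                // Unmapped character: substitute the fallback.
                value = CharToByte[FallbackChar];
            }
            bytes[byteIndex + i] = value;
        }
        return charCount;
    }

    public override int GetChars(byte[] bytes, int byteIndex, int byteCount,
                                 char[] chars, int charIndex)
    {
        for (int i = 0; i < byteCount; i++)
        {
            chars[charIndex + i] = ByteToChar[bytes[byteIndex + i]];
        }
        return byteCount;
    }
}
```

An instance can be used like any other Encoding, for example new Latin1SketchEncoding().GetBytes("café"). A character outside the table, such as '€', is encoded as the fallback character.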

Performance

The library doesn’t contain any performance tweaks and is considerably slower than the built-in encodings, which rely on all sorts of black magic. However, as long as you don’t have to encode or decode huge amounts of data, this shouldn’t be noticeable.

Here are the results from my machine for 10,000 iterations:

  • Encoding the whole character table to a byte array (256 characters)
    • 17 milliseconds with the built-in encoding
    • 94 milliseconds with the generated encoding
  • Decoding the bytes back into a string
    • 2 milliseconds with the built-in encoding
    • 46 milliseconds with the custom encoding
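A measurement of this kind can be reproduced with a simple Stopwatch loop. The sketch below is an assumption about the setup, not the harness used for the numbers above: it encodes the 256 characters U+0000 to U+00FF repeatedly and returns the elapsed time. EncodingBenchmark and MeasureEncodeMilliseconds are hypothetical names.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical helper, for illustration only.
public static class EncodingBenchmark
{
    // Encodes a 256-character table the given number of times
    // and returns the elapsed time in milliseconds.
    public static long MeasureEncodeMilliseconds(Encoding encoding, int iterations)
    {
        char[] chars = new char[256];
        for (int i = 0; i < chars.Length; i++)
        {
            chars[i] = (char)i;
        }

        Stopwatch watch = Stopwatch.StartNew();
        for (int n = 0; n < iterations; n++)
        {
            encoding.GetBytes(chars);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

Running it once with the built-in encoding and once with a generated class, with the same iteration count, gives comparable figures. Results will vary by machine and runtime.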

  1. Daniel
    March 31st, 2010 at 18:11 | #1

    “Silverlight only provides two text encodings out of the box: UTF-8 and Unicode.”

    UTF-8 *is* Unicode. Do you mean UTF-16 ?

  2. March 31st, 2010 at 18:14 | #2

    Daniel,

    I am aware of that. The encoding classes that come out of the box in Silverlight are called UTF8Encoding and UnicodeEncoding so I decided to stick to the terminology. But you’re right, I could have been more clear – I updated the posting accordingly. Thanks for the feedback.

  3. April 7th, 2010 at 20:58 | #3

    You save me! Thanks!

  4. February 17th, 2011 at 06:51 | #4

    I am really impressed. This is a great piece of work. Thanks for sharing it!

  5. July 25th, 2011 at 07:02 | #5

    Hi,

    Your encoding generator has been a blessing for me. I have been writing a keyboard app for WP7 and this has been vital for me.

    I am looking into supporting Nordic languages but i can’t get encoding generator to take the latin6 / iso-8859-10 codes.

    I’ll look into the code eventually. Thought i’d ask you first as i have my hands full building decent frequency word list

  6. July 26th, 2011 at 12:44 | #6

    Don’t worry about it. I think I have figured out how to use utf-8 finally :) will know for sure tomorrow

  7. Alex Burtsev
    August 2nd, 2011 at 20:04 | #7

    Saved me 2 hours of work. Great idea to autogenerate it.

  8. Josias Fontoura
    August 31st, 2011 at 15:40 | #8

    Very good solution! It definitely solved my problem. Thanks for sharing!

  9. Rui Marinho
    September 9th, 2011 at 00:47 | #9

    Nice job, i m trying to add ASCII support , but when i call my generated class i get the following error:

    “An item with the same key has already been added.”

    any ideas?

    thanks and nice work.

  10. nery
    January 19th, 2012 at 21:34 | #10

    Hi, I am trying to load a Windows-1252 XML file, and I am getting an error in Silverlight. I saw the class “A custom encoding class that provides encoding capabilities for the ‘Central European (Windows)’ encoding under Silverlight”

    But i dont know how to use it in silverlight, i mean how to give the xml to it .
    Please any help is appreciated.

  11. Laguni
    March 4th, 2012 at 08:10 | #11

    I’m impressed and grateful

  12. Manfred
    March 4th, 2012 at 20:38 | #12

    Hi,

    thanks a lot, saved me a lot of time.

    Great work ;-)

  13. Ambious
    March 14th, 2012 at 08:42 | #13

    This is a brilliant idea and brilliant execution, thank you!

  14. Pedro
    March 20th, 2012 at 02:21 | #14

    Thank you man, you’re a crack!!

  15. March 25th, 2012 at 10:18 | #15

    Thanks for sharing, it’s a very useful tool, not to mention the auto-generated unit tests!

  16. April 13th, 2012 at 21:02 | #16

    Thanks a lot! Saved me a lot of time to decode a web page from ISO-8859-15. Works perfectly!

  17. Onni Hakala
    April 23rd, 2012 at 11:59 | #17

    Thank you! This helped me alot!

  18. Todd
    April 24th, 2012 at 16:42 | #18

    Hi Philipp,

    Can you please make one that can generate an encoding for BIG5 (two bytes)? We desperately need that. Thank you!

    Todd

  19. Martin
    July 11th, 2012 at 23:17 | #19

    Thank you for publishing this, wonderful work, you saved me too ;)

  20. Sashka
    October 29th, 2012 at 17:50 | #20

    You made my day, thank you a lot! Very useful and nice application!

  21. Marta
    April 28th, 2013 at 16:47 | #21

    You are my hero! It works perfectly and saved me a lot of time, thank you for sharing your work :)

  22. Robert
    June 7th, 2013 at 19:39 | #22

    Thanks! I spent like 2-3h trying to get 1252 to utf8 to work.
    This helped me a bit: http://msdn.microsoft.com/en-us/library/kdcak6ye.aspx

    But in the end, what solved my issue was this: when reading the text files with a StreamReader, all I had to do was specify the generated class. Like this:

    Stream s = Application.GetResourceStream(new Uri("TestData/file.xml", UriKind.Relative)).Stream;
    var reader = new StreamReader(s, new Windows1252Encoding());

    Where the Windows1252Encoding class is what the Silverlight-application generated for me.

    BIG THANKS to the author!

  23. Ciro
    October 31st, 2013 at 00:03 | #23

    Hi Philipp, I need an example of how to implement a generated class from your tool.
    Is it possible for you to post a small example?

    Thanks in advance.

  24. jony
    December 30th, 2013 at 14:56 | #24

    i am speechless, i was about to throw away 3 days worth of work because of this problem, and then i found this.

    i really cant thank you enough!!

    best tool ever!!!
    thank you , thank you, thank you!

  25. Algem Mojedo
    May 9th, 2014 at 09:50 | #25

    Thank you for publishing this, Great idea to auto generate it, you saved me from Headache and lot of time…

  26. Chris
    August 23rd, 2014 at 09:30 | #26

    This utility is godsend, really. Pretty good use in Lightswitch. I cannot be grateful enough for your sharing. Thanks again!
