Web Connection User Discussions
Arabic and English in XML and WWWC
  Michael B
  All
  Apr 18, 2018 @ 09:37am

Rick,

You have answered similar questions for me in the past, but I wondered if you have any suggestions for something we are having trouble with. Our WWWC app is essentially a developer-friendly CMS. We have a feature that allows end users to change a labels.xml file. The templates in our WWWC app render the text of the labels at run time. This works fine for Latin-based character sets. We have our first Saudi Arabian client and they have translated the labels.xml file. It looks fine when I open it in various editors (except VFP [big shock-not!]).

I guess I already know part of my answer as I write this, but I was hoping for a suggestion as to how I might handle this without having to rewrite the entire templating engine in Angular (I reviewed this - https://weblog.west-wind.com/posts/2015/May/23/Right-To-Left-RTL-Text-Display-in-Angular-and-ASPNET). I suspect that since the final HTML output goes through VFP, my solution has to happen inside of VFP? Maybe there is a third-party FLL that does this sort of thing?

Do we need to set up a WWWC server on a Windows box and set the default character set to Arabic? Will that help?

Thanks in advance.

re: Arabic and English in XML and WWWC
  Rick Strahl
  Michael B
  Apr 18, 2018 @ 02:58pm

This setup will cause you unending pain in FoxPro... 😃

Mixing multiple code pages at the same time doesn't work worth a shit in FoxPro, and the only way this can even remotely work is by using UTF-8 to hold text from unsupported code pages, even in memory.

If you're outputting to HTML or XML you may be able to capture everything in UTF-8 format and then dump the raw UTF-8 directly into XML or HTML and that does work, but causes other problems in that you now have to manually encode everything you write to the document explicitly (you can't do a UTF-8 conversion on the whole document anymore because some things are already encoded).

If you have a choice, use some other technology that supports Unicode to handle reading and writing the mixed-language data, and then return the final result to FoxPro over COM using SYS(3101), which sets the code page used for COM interaction. One of the supported code pages is UTF-8 (65001).

I do this in the Markdown Parser here on the message board site for example:

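* Switch the COM code page to UTF-8 (65001) so string data crosses COM as UTF-8
IF llUtf8
   lnOldCodePage = SYS(3101)
   SYS(3101, 65001)
   lcMarkdown = STRCONV(lcMarkdown, 9)   && current code page text -> UTF-8
ENDIF

lcHtml = this.oBridge.InvokeStaticMethod("Markdig.Markdown", "ToHtml", lcMarkdown, loPipeline)

* Restore the previous COM code page
IF llUtf8
   SYS(3101, lnOldCodePage)
ENDIF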

I pull the HTML from the Markdown parser in .NET (via COM) and tell the COM system to return the data as UTF-8. The raw UTF-8 response is then embedded directly into the output.

But because I do this, I can't encode the whole page, so all other output that can have extended characters etc. has to be explicitly embedded like this:

<%: STRCONV(poMSg.Subject,9) :%>

for which the STRCONV() normally isn't necessary. Ugly but it works.
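To make the "encode each piece exactly once" idea concrete, here is a minimal sketch in Python (used instead of FoxPro only because it handles Unicode natively and is easy to run). The sample strings and the `ansi_to_utf8` helper are hypothetical; the helper only stands in for the role `STRCONV(value, 9)` plays above, and code page 1252 is an assumed system code page.

```python
# Stand-ins for the two kinds of content on the page (both are assumptions):
# a fragment that already arrived as UTF-8 bytes, and a value held as
# code page 1252 bytes, as ANSI text would be on a Western European system.
already_utf8 = "<p>مرحبا</p>".encode("utf-8")
ansi_subject = "Café menu".encode("cp1252")


def ansi_to_utf8(raw: bytes, codepage: str = "cp1252") -> bytes:
    """Convert ANSI bytes to UTF-8 (hypothetical helper for illustration)."""
    return raw.decode(codepage).encode("utf-8")


# Encode only the ANSI piece; the UTF-8 piece is passed through untouched.
page = already_utf8 + b"<h1>" + ansi_to_utf8(ansi_subject) + b"</h1>"

# The assembled page is valid UTF-8 as a whole.
print(page.decode("utf-8"))
```

The point of the sketch is that no byte sequence goes through the conversion twice, which is why the page-level conversion has to be switched off once any raw UTF-8 is embedded.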

+++ Rick ---

re: Arabic and English in XML and WWWC
  Tuvia Vinitsky
  Michael B
  Apr 19, 2018 @ 10:31am

You have found what IMO is the biggest VFP issue - code pages and no Unicode.

It is possible to set a textbox to a designated code page and display the VFP data in that charset (it will look like garbage in the table), but you have to know the code page, etc. You cannot just add some Arabic in Unicode and display it via VFP. It simply cannot be done; VFP will not recognize Unicode.
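Why the code page has to be known can be shown with a short Python sketch (Python is used here only because it can hold Unicode directly). It assumes the Arabic text is stored as Windows-1256 bytes and that the viewer interprets them as Windows-1252; the sample word is made up.

```python
arabic = "مرحبا"

# Windows-1256 is the Arabic ANSI code page; Windows-1252 is Western European.
stored = arabic.encode("cp1256")

# Same bytes, wrong code page: each byte maps to an unrelated Latin character.
wrong = stored.decode("cp1252")

# Same bytes, right code page: the original text comes back.
right = stored.decode("cp1256")

print(len(stored), repr(wrong), right)
```

The bytes themselves are unchanged in both cases; only the code page used to interpret them differs, which is why the stored data looks like garbage when viewed under a different one.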

The only approach I see is to customize the Arabic version of the app.

re: Arabic and English in XML and WWWC
  Michael B
  Michael B
  Apr 21, 2018 @ 07:35pm

Rick - I just found an article you wrote about this a million years ago - https://www.west-wind.com/presentations/foxunicode/foxunicode.asp - and decided to blow my Saturday night playing around with it. I changed the default region to Saudi Arabia and rebooted. I fired up VFP, changed the region to 'system', and then restarted VFP. I recompiled my app; not sure if I needed to, but I did.

Then I hit a page that uses my XML-based translation function. Check out the screenshot. I guess you were right about the pain and suffering that would ensue.

I thought I would mention that when I was debugging, the XML loaded with filetostr(), which has both English and Arabic in it, actually looks 'ok' (not perfect but viewable)... I don't know what caused VFP to bomb with the BS out of memory message, but...

re: Arabic and English in XML and WWWC
  Michael B
  Michael B
  Apr 21, 2018 @ 08:13pm

Rick - check it out!

I opened the WWWC template that renders the page in Notepad++, saved it as 'UTF-8 without BOM', and then copied and pasted some Arabic into it. That appears to be the solution. The external XML solution that we use for Latin character sets won't cut it for the cool stuff!
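For anyone who wants to verify that a template really was saved without a BOM, here is a minimal Python sketch. The template text is made up and `strip_utf8_bom` is a hypothetical helper; the only fixed fact it relies on is that a UTF-8 BOM is the three bytes EF BB BF at the start of the file.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical template content, once saved with a BOM and once without.
text = '<h1 dir="rtl">مرحبا</h1>'
with_bom = text.encode("utf-8-sig")
without_bom = text.encode("utf-8")

print(with_bom[:3] == codecs.BOM_UTF8)
print(strip_utf8_bom(with_bom) == without_bom)
```

If the template bytes are passed through unchanged, a BOM would presumably end up in the output as well, which is one reason to prefer the 'without BOM' option; that consequence is an assumption, not something confirmed in this thread.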

re: Arabic and English in XML and WWWC
  Rick Strahl
  Michael B
  Apr 23, 2018 @ 12:44am

Sure, static text isn't a problem because the template can be saved as UTF-8. FoxPro can just pass through the UTF-8 text, and that'll work if you don't encode the page as Web Connection does by default. FoxPro just passes the already encoded binary data and it just renders. That works fine for static text...

But that doesn't address the fact that you may need to get data expressions from your database into that page. Since that data has to come from FoxPro and be stored in FoxPro variables to get embedded, you can't display it this way.

There are ways around this if the data comes from:

  • A COM object, with SYS(3101,65001) set so it can return raw data as UTF-8
  • An OleDb driver, with SYS(3101,65001) set so it can return raw data as UTF-8
  • The ODBC SQL Server driver with CodePage set to UTF-8

However, this means the data has to come straight from COM to be accessed this way, which may make it very difficult to query or manipulate the data after it comes into FoxPro, because the value you're working on may be encoded.

In addition, you now have a page in UTF-8, and that page can no longer be UTF-8 encoded as a whole. If you do, it double-encodes and your nice Arabic text will be gobbledygook. Instead, you now have to UTF-8 encode all content you embed into the template explicitly.
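The double-encoding effect can be reproduced in a few lines of Python (again used only as a Unicode-aware stand-in for illustration). The sketch assumes the whole-page conversion treats its input as single-byte text; Latin-1 is used as that code page because it maps all 256 byte values, so a real Windows code page would behave similarly but not identically.

```python
arabic = "مرحبا"
utf8_once = arabic.encode("utf-8")

# Simulate a whole-page conversion that assumes its input is single-byte text:
# every UTF-8 byte is treated as one character and encoded to UTF-8 again.
utf8_twice = utf8_once.decode("latin-1").encode("utf-8")

# What a browser shows when it decodes the double-encoded bytes as UTF-8.
garbled = utf8_twice.decode("utf-8")

print(len(utf8_once), len(utf8_twice))
print(garbled == arabic)

# In this Latin-1 simulation the damage can be undone by reversing one step.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == arabic)
```

The byte count grows on the second pass because every byte of the Arabic text is above 0x7F and therefore gets expanded again, which is what turns the text into gobbledygook in the browser.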

There are no easy solutions to this with FoxPro that I know of.

+++ Rick ---

re: Arabic and English in XML and WWWC
  Michael B
  Rick Strahl
  Apr 23, 2018 @ 11:24am

Rick - if our CMS were db-centric instead of file-centric, I'd be hiring you to help me deal with this. I wonder if it's reasonable for you to build multilingual support into the framework. Your grasp of how to achieve this would save lots of time for folks like me. While you have done an excellent job of describing how YOU would solve this issue, I can't put all the pieces together in my world and in my app. We rely on you to lead the way. I had to create my own labels.xml solution because there was no overt solution in WWWC.

Food for thought.

re: Arabic and English in XML and WWWC
  Rick Strahl
  Michael B
  Apr 23, 2018 @ 01:50pm

I don't think there's anything wrong with your understanding 😃 You can't pull it all together because there is no way to pull it all together!

If you need to create an app that displays multiple code pages at the same time, it's not really possible to do this with FoxPro unless you jump through some excruciating hoops (as mentioned in the last message). And even that leaves holes in what you can do.

There's no simple generic solution to this, and it comes down to FoxPro's lack of Unicode support. FoxPro simply can't deal with non-codepage language text: as soon as FoxPro touches it, it destroys the data by converting it.
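The kind of loss described here can be illustrated with a small Python sketch. It only approximates what a conversion to a single ANSI code page does: code page 1252 is an assumed system code page, and Python's `errors="replace"` is a stand-in for whatever substitution the real conversion applies.

```python
mixed = "Welcome مرحبا"

# Forcing the text into a Western European code page: characters with no
# slot in that code page are replaced, and the originals cannot be recovered.
lossy = mixed.encode("cp1252", errors="replace")
print(lossy)

# UTF-8 can represent every character, so the round trip is exact.
round_trip = mixed.encode("utf-8").decode("utf-8")
print(round_trip == mixed)
```

Once the substitution has happened there is nothing left to convert back, which is why the text has to stay in UTF-8 (or another Unicode form) from the moment it is read.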

If you are building a multi-codepage-language CMS then FoxPro is not the right solution for that. Anything that supports Unicode can easily do this but FoxPro - not so much.
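As a rough idea of what this looks like in a Unicode-aware environment, here is a minimal Python sketch that reads mixed English and Arabic labels and produces UTF-8 output. The XML layout (the `label` element and its `key` and `lang` attributes) and both function names are assumptions made for illustration; they are not the actual labels.xml format discussed in this thread.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical labels file: the element and attribute names are assumptions.
LABELS_XML = """<?xml version="1.0" encoding="utf-8"?>
<labels>
  <label key="welcome" lang="en">Welcome</label>
  <label key="welcome" lang="ar">مرحبا</label>
</labels>"""


def load_labels(xml_text: str) -> dict:
    """Return {(key, lang): text} from the hypothetical labels document."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {(el.get("key"), el.get("lang")): el.text for el in root.iter("label")}


def render_utf8(labels: dict, key: str, lang: str) -> bytes:
    """Build an HTML fragment and encode it to UTF-8 exactly once."""
    direction = "rtl" if lang == "ar" else "ltr"
    text = escape(labels[(key, lang)])
    html = f'<span dir="{direction}" lang="{lang}">{text}</span>'
    return html.encode("utf-8")


labels = load_labels(LABELS_XML)
fragment = render_utf8(labels, "welcome", "ar")
print(fragment.decode("utf-8"))
```

Because the strings are Unicode throughout, both languages live in the same dictionary and the only encoding step is the final conversion to UTF-8 bytes.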

+++ Rick ---
