Character encoding problem while using Word Search and Crossword pages in LO

Character encoding problem while using Word Search and Crossword pages in LO 3 years 8 months ago #8375

chandra
Topic Author

Hi, one of my responsibilities at the non-profit where I work is to create language-learning material for my education team. The material is in an Indian language and script (language: Kannada, script: Kannada). Things have been going wonderfully well so far, and I hope to share a showcase of our work on this forum when we have our first official release of LOs.
Yesterday I hit an issue while using the Word Search and Crossword pages. Kannada characters have multi-byte representations, unlike the basic Latin set in ISO 8859-1 (for example, A to Z), which are single-byte. Both the Word Search and the Crossword activities seem to assume single-byte representations for the words used in the activities and, as a result, cut the individual Kannada characters into multiple byte-wise units that cannot be read.

To illustrate what I mean, I have attached a screenshot of a simple 2-word crossword. Here, the two Kannada words are actually only 2 and 3 characters long (but 3 and 5 bytes long), but Xerte is splitting them up as bytes. This obviously results in an incorrect visual representation (although technically valid). I am wondering if anyone has come across this issue and if there is a workaround. Or, am I doing something wrong?

I would love for children to be able to use the Crossword and Word Search activities! So, any input is much appreciated. Thank you.

Word Search poses an additional problem even if the above is resolved. By default, the activity fills the empty cells of the grid with the letters A-Z. This obviously does not work well with the non-Latin characters we would like the grid to be filled with. Is there a way to change this default behavior? Thank you!
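As an illustration of the filler problem only: the sketch below is not Xerte code, and the names `filler_alphabet` and `fill_empty_cells` are hypothetical. It assumes each puzzle word has already been split into the units that should occupy one grid cell, and it draws filler cells from the units the words themselves use instead of from A-Z.

```python
import random
from typing import List, Optional

# A grid cell holds one displayable unit, or None if nothing is placed yet.
Grid = List[List[Optional[str]]]


def filler_alphabet(words: List[List[str]]) -> List[str]:
    """Collect the distinct cell units used by the puzzle words (first-seen order)."""
    seen: List[str] = []
    for cells in words:
        for cell in cells:
            if cell not in seen:
                seen.append(cell)
    return seen


def fill_empty_cells(grid: Grid, alphabet: List[str], rng: random.Random) -> List[List[str]]:
    """Return a copy of the grid with every empty cell replaced by a random unit."""
    return [
        [cell if cell is not None else rng.choice(alphabet) for cell in row]
        for row in grid
    ]


# Hypothetical example: one word of two cells, placed vertically in a 2x2 grid.
word = ["\u0c87", "\u0cb2\u0cbf"]  # the two display units of the word ಇಲಿ
grid: Grid = [[word[0], None], [word[1], None]]
filled = fill_empty_cells(grid, filler_alphabet([word]), random.Random(0))
```

Drawing the filler from the words' own units keeps the grid in one script; a real puzzle would probably want a larger, hand-picked set of Kannada units so the hidden words do not stand out.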


Character encoding problem while using Word Search and Crossword pages in LO 3 years 7 months ago #8380

simonbarne

Just guessing, but is Xerte treating the vowel diacritics as separate characters, because that's how they are in Unicode? (en.wikipedia.org/wiki/Kannada_(Unicode_block)) In your example, the word written downwards seems to be ಇಲಿ, ending with the vowel diacritic ಿ which is normally attached to the preceding consonant (ಲ) but is here shown as a separate character. But how to get round this, I have no idea!
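To make the distinction concrete, here is a minimal Python sketch (not Xerte code; `split_clusters` is a hypothetical helper). It assumes the word is ಇಲಿ, encoded as U+0C87, U+0CB2, U+0CBF, and it uses a simplified rule: a combining mark joins the preceding character, and a consonant following a virama joins the conjunct. Full grapheme segmentation, as defined in Unicode Standard Annex #29, covers more cases than this.

```python
import unicodedata
from typing import List

VIRAMA = "\u0ccd"  # KANNADA SIGN VIRAMA, which joins consonants into a conjunct


def split_clusters(word: str) -> List[str]:
    """Group code points into display units (simplified, not full UAX #29)."""
    clusters: List[str] = []
    for ch in word:
        is_mark = unicodedata.category(ch).startswith("M")  # Mn, Mc or Me
        after_virama = bool(clusters) and clusters[-1].endswith(VIRAMA)
        if clusters and (is_mark or after_virama):
            clusters[-1] += ch  # attach to the previous unit
        else:
            clusters.append(ch)  # start a new unit
    return clusters


word = "\u0c87\u0cb2\u0cbf"  # ಇ + ಲ + vowel sign ಿ
utf8_bytes = len(word.encode("utf-8"))  # 9: each Kannada code point is 3 bytes in UTF-8
code_points = len(word)                 # 3: what a per-character split produces
cells = split_clusters(word)            # 2 units: the vowel sign stays with ಲ
```

A split by code point gives three cells for this word, while a split by display unit gives two, which appears consistent with the counts reported in the first post.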


Character encoding problem while using Word Search and Crossword pages in LO 3 years 7 months ago #8381

chandra
Topic Author

I agree with your analysis; I thought so too. I can see that the 'main' character and the diacritic/modifier are stored as separate, hex-encoded characters in the Xerte backend. At the moment I don't know how to get around this. I will have to experiment.