Using //TRANSLIT does not work (might be iconv problem) #7

Open
GoogleCodeExporter opened this issue Jan 28, 2016 · 2 comments

@GoogleCodeExporter

Summary:
When using iconv to translate UTF-8 to ASCII//TRANSLIT, the 'é' is not 
translated to 'e' as expected. Instead, a question mark '?' is returned.
I have tried to debug this and I can see that cScratch already contains the 
'?' right after C.iconv is called. When not using //TRANSLIT, the 'é' is 
translated to 'xx', where 'x' is the invalid char I used. I guess it is 
emitted twice because 'é' is a multibyte (2 bytes) UTF-8 character.

Used code (note that the invalid char is 'x', NOT '?'):
package main

import (
    "encoding/hex"
    "fmt"
    "log"

    "code.google.com/p/go-charset/charset/iconv"
)

func main() {
    input := "AéA"
    // Translate from UTF-8 to ASCII with transliteration; 'x' is the invalid char.
    t, err := iconv.Translator("ASCII//TRANSLIT", "UTF-8", 'x')
    if err != nil {
        log.Fatalf("Could not get charset translator from UTF-8 to ASCII. Got error: %s\n", err)
    }
    // Dump the raw input bytes.
    fmt.Print(hex.Dump([]byte(input)))
    n, cdata, err := t.Translate([]byte(input), true)
    if err != nil {
        log.Fatalf("Could not translate string '%s' to ASCII. Got error: %s\n", input, err)
    }
    // Dump the translated bytes.
    fmt.Print(hex.Dump(cdata))
    output := string(cdata)
    log.Printf("Translated %d characters from UTF-8 ('%s') to ASCII ('%s')\n", n, input, output)
}


Result when running:
geertjohan@VirtKubuntu:~$ iconvtesting 
00000000  41 c3 a9 41                                       |A..A|
00000000  41 3f 41                                          |A?A|
2012/06/20 09:47:29 Translated 4 characters from UTF-8 ('AéA') to ASCII ('A?A')

Original issue reported on code.google.com by [email protected] on 20 Jun 2012 at 7:52

@GoogleCodeExporter

Original comment by [email protected] on 20 Jun 2012 at 9:16

  • Changed state: Accepted

@Karry

Karry commented Jul 23, 2020

In C++ code, it is necessary to initialize the locale before using iconv with "ASCII//TRANSLIT":

  try {
    std::locale::global(std::locale(""));
    std::cout << "Current locale activated" << std::endl;
  } catch (const std::exception& e) {
    std::cerr << "ERROR: Cannot set locale: " << e.what() << std::endl;
  }

In C, the equivalent should be setlocale(LC_CTYPE, ""). Not sure if this is the same issue...
