Is there a way to tell what encoding is used for the name and content of a file?

Question

Is there a way to tell what encoding is used for the name and content of a file? Both GUI and terminal solutions are fine, though terminal is preferred. Thanks and regards!

enzotib · Accepted Answer · 2011-06-10T22:35:32.363

4

You could try

chardet <<<filename

The chardet program tries to guess the encoding of the stream on stdin, and <<< is the means by which bash uses a string as stdin (a here-string), the same as

echo filename | chardet

For the names of all the entries in a directory, you can use

ls dir | chardet
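
Note that the pipe above hands chardet all the names as one stream, so it gives a single guess for the lot. To check each name on its own, a loop along these lines might help. This is only a sketch: it assumes chardet reads stdin as shown above, and the function name guess_name_encodings and the DETECTOR variable are made up for illustration, not part of chardet.

```shell
#!/usr/bin/env bash
# Guess the encoding of each file name in a directory, one name at a time.
# DETECTOR is a hypothetical override for illustration; it defaults to chardet.
guess_name_encodings() {
    local dir="$1" path name
    for path in "$dir"/*; do
        [ -e "$path" ] || continue      # skip the literal pattern if dir is empty
        name=${path##*/}                # strip the directory part
        printf '%s: ' "$name"
        # feed the raw bytes of the name, without a trailing newline, to the detector
        printf '%s' "$name" | "${DETECTOR:-chardet}"
    done
}
```

Call it as guess_name_encodings dir. Very short names give chardet little to work with, so treat the guesses with some suspicion.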

EDIT

I forgot about the content, but it is almost the same:

chardet <filename

or

cat filename | chardet

or, for all the files in dir (their contents are concatenated, so chardet makes one guess for the whole stream)

cat dir/* | chardet
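
If the files in dir may not all share one encoding, it is probably more useful to ask about each file separately. A minimal sketch, using the same stdin redirection as chardet <filename above; the function name guess_content_encodings and the DETECTOR variable are hypothetical names chosen for illustration:

```shell
#!/usr/bin/env bash
# Guess the content encoding of each regular file in a directory separately.
# DETECTOR is a hypothetical override for illustration; it defaults to chardet.
guess_content_encodings() {
    local dir="$1" path
    for path in "$dir"/*; do
        [ -f "$path" ] || continue          # skip subdirectories and unmatched globs
        printf '%s: ' "$path"
        "${DETECTOR:-chardet}" <"$path"     # same stdin form as: chardet <filename
    done
}
```

Call it as guess_content_encodings dir; each file then gets its own line of output.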

edited Jun 10 '11 at 22:35

answered Jun 10 '11 at 22:11

enzotib

Thanks! Nice to know about chardet. I am trying to figure out the encodings of the names of the compressed files in a zip archive, but the output of chardet does not seem correct. Please see my post here: http://askubuntu.com/questions/48158/chinese-encoding-in-names-of-compressed-files-in-zip. Thanks! – Tim Jun 10 '11 at 22:57

score 2 · Answer 2 · answered Jun 10 '11 at 21:12

If you mean mime-encoding you could try file --mime-encoding filename for the content of the file.
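
To run that over every file in a directory, a loop like the following might do. It is a sketch that assumes the file utility is installed; the function name report_mime_encodings is made up for illustration.

```shell
#!/usr/bin/env bash
# Report the content encoding of every regular file in a directory using file(1).
report_mime_encodings() {
    local dir="$1" path
    for path in "$dir"/*; do
        [ -f "$path" ] || continue      # skip subdirectories and unmatched globs
        # -b (brief) keeps file from printing the file name itself
        printf '%s: %s\n' "$path" "$(file -b --mime-encoding -- "$path")"
    done
}
```

Call it as report_mime_encodings dir. Like chardet, file is guessing from the bytes it sees, so the answer is a hint rather than a guarantee.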

answered Jun 10 '11 at 21:12

Marcel

Thanks! How about the file name? – Tim Jun 10 '11 at 21:22
I never put special chars in filenames. I remember there was some command which tries to detect the encoding from a string, then you can just pipe the filename through this... – Marcel Jun 10 '11 at 21:26
Thanks! I was wondering what the differences are between "mime-encoding" and "character encoding"? – Tim Jul 09 '11 at 15:33
