FORMAT

        cstocs [options] src_encoding dst_encoding [files ...]

SYNOPSIS

        cstocs il2 ascii < file | less
        cstocs -i utf8 il2 file1 file2 file3
        cstocs --help

DESCRIPTION

Cstocs is a simple conversion utility that changes the charset encoding of a text. It reads either the specified files or (if none are specified) the standard input, assumes that the input is encoded in src_encoding, and tries to reencode it into dst_encoding. The result is written to the standard output.
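The basic idea of reencoding can be sketched in a few lines of Python. This is only an analogue, not how cstocs is implemented: it relies on Python's built-in codecs rather than the cstocs encoding files, the function name reencode is made up for illustration, and Python's codec names (iso-8859-2, utf-8) differ from the short names cstocs accepts (il2, utf8).

```python
def reencode(data: bytes, src_encoding: str, dst_encoding: str) -> bytes:
    """Interpret data as src_encoding and return the same text in dst_encoding."""
    return data.decode(src_encoding).encode(dst_encoding)


# 0xEC is "e with caron" in ISO-8859-2; UTF-8 needs two bytes for it.
latin2_bytes = b"m\xecsto"
utf8_bytes = reencode(latin2_bytes, "iso-8859-2", "utf-8")
print(utf8_bytes)
```

Unlike this sketch, which raises an error on a character the destination cannot represent, cstocs applies the substitution rules described below.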

Run cstocs without parameters to get a short help and a list of available encodings.

Characters that are not defined in src_encoding are passed to the output unchanged.

If the source text contains a character that is defined in src_encoding but not in dst_encoding, it can be handled in several ways. For example, the characters "e with caron" (symbol ecaron) and "d with caron" (symbol dcaron) are included in the iso-8859-2 encoding but not in iso-8859-1. When reencoding 8859-2 text to 8859-1, you may want to do one of the following:

1. Keep the character unchanged, option --nofillstring.

2. Produce no output in place of the "ecaron" symbol, option --null.

3. Substitute some string (possibly a space) for both ecaron and dcaron, option --fillstring.

4. Substitute the letter "d" for dcaron and "e" for ecaron. It is even possible to substitute a string for a symbol, so you can replace the Latin "AE" ligature character with the string "AE" (letter "A" followed by letter "E"), or a "plusminus sign" with the string "+/-". These substitutions are described in the accent file.
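The four actions above can be illustrated with a toy Python model. Everything in it is an assumption made for illustration: the function convert, the LATIN1 set, and the two-entry ACCENT table stand in for the real encoding and accent files, and the model works on Unicode characters rather than on the bytes cstocs actually processes. In this model accent=None plays the role of --noaccent, fillstring=None plays the role of --nofillstring, and fillstring="" plays the role of --null.

```python
def convert(text, dst_chars, accent=None, fillstring=" "):
    """Toy model: rewrite characters that dst_chars cannot represent."""
    out = []
    for ch in text:
        if ch in dst_chars:
            out.append(ch)              # representable: copy as is
        elif accent is not None and ch in accent:
            out.append(accent[ch])      # action 4: accent table substitution
        elif fillstring is None:
            out.append(ch)              # action 1: keep unchanged
        else:
            out.append(fillstring)      # action 3, or action 2 when ""
    return "".join(out)


LATIN1 = {chr(i) for i in range(256)}      # stand-in for iso-8859-1
ACCENT = {"\u011b": "e", "\u010f": "d"}    # ecaron -> e, dcaron -> d

text = "\u011ba\u010f"                     # ecaron, "a", dcaron
kept = convert(text, LATIN1, accent=None, fillstring=None)
dropped = convert(text, LATIN1, accent=None, fillstring="")
filled = convert(text, LATIN1, accent=None, fillstring="?")
accented = convert(text, LATIN1, accent=ACCENT)
print(ascii(kept), ascii(dropped), ascii(filled), ascii(accented))
```

The first three calls switch the accent table off so that the effect of the fill string is visible on its own; with the table present, ecaron and dcaron are rewritten before the fill string is ever consulted.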

OPTIONS

-i, -i.ext, --inplace.ext

The specified files are converted in place, using Perl's -i facility. Optionally, an extension for backup copies may be specified after the dot. If specified, this parameter has to be the first one.
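What in-place conversion with a backup copy amounts to can be sketched as follows. This is a Python illustration under stated assumptions, not the Perl -i mechanism itself: the helper convert_in_place and its backup_ext parameter are hypothetical, Python's built-in codecs replace the cstocs tables, and the backup is assumed to be named by appending the extension to the original file name.

```python
import os
import shutil
import tempfile


def convert_in_place(path, src_encoding, dst_encoding, backup_ext=None):
    """Rewrite path in dst_encoding, optionally keeping a backup copy first."""
    with open(path, "rb") as f:
        data = f.read()
    if backup_ext:
        shutil.copyfile(path, path + backup_ext)   # e.g. file1 -> file1.bak
    with open(path, "wb") as f:
        f.write(data.decode(src_encoding).encode(dst_encoding))


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file1")
    with open(path, "wb") as f:
        f.write(b"m\xecsto")                       # ISO-8859-2 input
    convert_in_place(path, "iso-8859-2", "utf-8", backup_ext=".bak")
    with open(path, "rb") as f:
        converted = f.read()
    with open(path + ".bak", "rb") as f:
        backup = f.read()
print(converted, backup)
```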

--dir directory

Encoding files are taken from directory instead of the default, which is Cz/Cstocs/enc in the Perl lib tree. The location of encoding files can also be changed using the CSTOCSDIR environment variable, but the --dir option has the highest priority.
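The lookup order described here (--dir first, then CSTOCSDIR, then the default) can be written down as a small Python sketch. The function resolve_enc_dir is hypothetical, and DEFAULT_ENC_DIR is only a placeholder for the real location inside the Perl lib tree.

```python
import os

# Placeholder: the real default lives under the Perl lib tree.
DEFAULT_ENC_DIR = "Cz/Cstocs/enc"


def resolve_enc_dir(dir_option=None, environ=None):
    """Return the encoding directory: --dir, then CSTOCSDIR, then the default."""
    environ = os.environ if environ is None else environ
    if dir_option:
        return dir_option                    # --dir has the highest priority
    return environ.get("CSTOCSDIR") or DEFAULT_ENC_DIR


print(resolve_enc_dir("/opt/enc", {"CSTOCSDIR": "/home/user/enc"}))
print(resolve_enc_dir(None, {"CSTOCSDIR": "/home/user/enc"}))
print(resolve_enc_dir(None, {}))
```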

--fillstring string

If the source text contains a character that is defined in src_encoding but neither in dst_encoding nor in the accent file (or the accent file is not used), it is replaced by string. The default is a single space.

--nofillstring

Do not change characters that would otherwise have the fillstring applied. This differs from --null, which removes such characters from the output.

--null

Completely equivalent to --fillstring "".

--nochange or --noaccent

Do not use the accent file at all.

--onebyone

Use only those rules from the accent file that rewrite one character to one character. With this option, the character "ecaron" will be rewritten to "e", but the "AE" character will not be rewritten to the "AE" string.

--onebymore

Use all rules from the accent file. This is the default.
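The difference between --onebyone and --onebymore comes down to which accent rules are kept. A minimal Python sketch, assuming a made-up three-entry ACCENT table and a hypothetical helper select_rules (the real accent file format is not modeled here):

```python
# Illustrative rules: ecaron -> e, AE ligature -> "AE", plusminus -> "+/-".
ACCENT = {"\u011b": "e", "\u00c6": "AE", "\u00b1": "+/-"}


def select_rules(accent, onebyone=False):
    """Keep all rules, or only one-character replacements when onebyone is set."""
    if not onebyone:
        return dict(accent)                  # models --onebymore, the default
    return {src: dst for src, dst in accent.items() if len(dst) == 1}


all_rules = select_rules(ACCENT)
single_rules = select_rules(ACCENT, onebyone=True)
print(len(all_rules), len(single_rules))
```

With onebyone set, only the ecaron rule survives, so the length of the text never changes during conversion.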

RELATED TO cstocs…

Cz::Cstocs(3).

AUTHOR

Jan "Yenya" Kasprzak wrote the original Un*x implementation.

Jan Pazdziora, [email protected], created the Perl module version.