kmail
EncodingDetector Class Reference
Provides encoding detection capabilities.
#include <encodingdetector.h>
Public Types
enum EncodingChoiceSource { DefaultEncoding, AutoDetectedEncoding, BOM, EncodingFromXMLHeader, EncodingFromMetaTag, EncodingFromHTTPHeader, UserChosenEncoding }
enum AutoDetectScript { None, SemiautomaticDetection, Arabic, Baltic, CentralEuropean, ChineseSimplified, ChineseTraditional, Cyrillic, Greek, Hebrew, Japanese, Korean, NorthernSaami, SouthEasternEurope, Thai, Turkish, Unicode, WesternEuropean }
Public Member Functions
EncodingDetector ()
EncodingDetector (TQTextCodec *codec, EncodingChoiceSource source, AutoDetectScript script=None)
bool setEncoding (const char *encoding, EncodingChoiceSource type)
const char * encoding () const
bool visuallyOrdered () const
void setAutoDetectLanguage (AutoDetectScript)
AutoDetectScript autoDetectLanguage () const
EncodingChoiceSource encodingChoiceSource () const
bool analyze (const char *data, int len)
bool analyze (const TQByteArray &data)
Static Public Member Functions
static AutoDetectScript scriptForName (const TQString &lang)
static TQString nameForScript (AutoDetectScript)
static AutoDetectScript scriptForLanguageCode (const TQString &lang)
static bool hasAutoDetectionForScript (AutoDetectScript)
Protected Member Functions
bool errorsIfUtf8 (const char *data, int length)
TQTextDecoder * decoder ()
Detailed Description
Provides encoding detection capabilities.
Searches the raw data for an encoding declaration (meta and XML tags). If it cannot find one, it falls back to heuristics for the specified language.
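The declaration search can be illustrated with a minimal, self-contained sketch. It is not the kmail implementation: findDeclaredEncoding is a hypothetical helper, and the 1024-byte scan window and the set of characters accepted in an encoding name are assumptions made for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of EncodingDetector: returns the lowercased
// value that follows `key` (e.g. "charset=" or "encoding=") near the start of
// `data`, or an empty string if no such declaration is found.
std::string findDeclaredEncoding(const std::string &data, const std::string &key)
{
    // Declarations sit near the top of a document, so only scan a prefix
    // (the 1024-byte window is an assumption of this sketch).
    std::string head = data.substr(0, 1024);

    // Lowercase the prefix so the search is case-insensitive.
    std::transform(head.begin(), head.end(), head.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    std::string::size_type pos = head.find(key);
    if (pos == std::string::npos)
        return std::string();
    pos += key.size();

    // Skip an optional opening quote.
    if (pos < head.size() && (head[pos] == '"' || head[pos] == '\''))
        ++pos;

    // The name ends at the first character that cannot be part of it.
    std::string::size_type end = pos;
    while (end < head.size() &&
           (std::isalnum(static_cast<unsigned char>(head[end])) ||
            head[end] == '-' || head[end] == '_' ||
            head[end] == '.' || head[end] == ':'))
        ++end;
    return head.substr(pos, end - pos);
}
```

Called with the key "charset=" this picks up an HTML meta declaration, and with "encoding=" an XML header. An empty result corresponds to the case where the detector would have to fall back to language heuristics.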
If it finds a Unicode BOM, it changes the encoding regardless of what the user has specified.
Intended lifetime of the object: one instance per document.
Typical use:
    TQByteArray data;
    ...
    EncodingDetector detector;
    detector.setAutoDetectLanguage(EncodingDetector::Cyrillic);
    TQString out=detector.decode(data);
Do not mix decode() with decodeWithBuffering().
Guess encoding of char array
Definition at line 57 of file encodingdetector.h.
Constructor & Destructor Documentation
EncodingDetector::EncodingDetector ( )
The default codec is Latin-1 (as the HTML specification says), the EncodingChoiceSource is DefaultEncoding, and the AutoDetectScript is SemiautomaticDetection.
Definition at line 796 of file encodingdetector.cpp.
EncodingDetector::EncodingDetector ( TQTextCodec * codec, EncodingChoiceSource source, AutoDetectScript script = None )
Sets the default codec, the EncodingChoiceSource, and the AutoDetectScript.
Definition at line 800 of file encodingdetector.cpp.
Member Function Documentation
bool EncodingDetector::analyze ( const char * data, int len )
Analyze text data.
Returns: true if there was enough data for accurate detection
Definition at line 906 of file encodingdetector.cpp.
bool EncodingDetector::analyze ( const TQByteArray & data )
Analyze text data.
Returns: true if there was enough data for accurate detection
Definition at line 901 of file encodingdetector.cpp.
TQTextDecoder * EncodingDetector::decoder ( ) [protected]
Returns: the TQTextDecoder for the detected encoding
Definition at line 841 of file encodingdetector.cpp.
const char * EncodingDetector::encoding ( ) const
Convenience method.
Returns: the MIME name of the detected encoding
Definition at line 824 of file encodingdetector.cpp.
bool EncodingDetector::errorsIfUtf8 ( const char * data, int length ) [protected]
Checks whether the data is really UTF-8.
Taken from Kate.
Returns: true if the current encoding is UTF-8 and the text cannot be in this encoding
Please somebody read http://de.wikipedia.org/wiki/UTF-8 and check this code...
Definition at line 732 of file encodingdetector.cpp.
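For orientation, a simplified well-formedness check of the kind errorsIfUtf8 performs might look like the sketch below. It is not the code from encodingdetector.cpp or from Kate: hasUtf8Errors is a hypothetical name, the function does not consult the current encoding, it validates only lead bytes and continuation bytes (overlong three- and four-byte forms and surrogates are not rejected), and treating a sequence cut off by the end of the buffer as "no error" is an assumption chosen so that partial data can be checked.

```cpp
// Hypothetical helper, not the kmail implementation: returns true if the
// buffer contains a byte sequence that cannot be well-formed UTF-8.
bool hasUtf8Errors(const char *data, int length)
{
    int i = 0;
    while (i < length) {
        const unsigned char c = static_cast<unsigned char>(data[i]);

        int trailing; // number of continuation bytes the lead byte announces
        if (c < 0x80)
            trailing = 0;               // ASCII
        else if (c >= 0xC2 && c <= 0xDF)
            trailing = 1;               // two-byte sequence
        else if (c >= 0xE0 && c <= 0xEF)
            trailing = 2;               // three-byte sequence
        else if (c >= 0xF0 && c <= 0xF4)
            trailing = 3;               // four-byte sequence
        else
            return true;                // stray continuation byte, 0xC0/0xC1, or above 0xF4
        ++i;

        for (int k = 0; k < trailing; ++k, ++i) {
            if (i >= length)
                return false;           // cut off by end of buffer: not counted as an error
            const unsigned char t = static_cast<unsigned char>(data[i]);
            if ((t & 0xC0) != 0x80)
                return true;            // continuation bytes must look like 10xxxxxx
        }
    }
    return false;
}
```

Text in a single-byte encoding such as Latin-1 usually fails this check as soon as an accented character is followed by an ASCII byte, which is what makes the test useful for ruling out UTF-8.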
EncodingDetector::AutoDetectScript EncodingDetector::scriptForName ( const TQString & lang ) [static]
Takes the language name _after_ it has been passed through i18n().
Definition at line 1166 of file encodingdetector.cpp.
bool EncodingDetector::setEncoding ( const char * encoding, EncodingChoiceSource type )
Returns: true if the specified encoding was recognized
Definition at line 846 of file encodingdetector.cpp.
The documentation for this class was generated from the following files: