E-MailRelay
G::Convert Class Reference

A static class which provides string encoding conversion functions between UTF-8 and wchar_t.

#include <gconvert.h>

Public Types

using unicode_type = std::uint_least32_t
 
using ParseFn = std::function< bool(unicode_type, std::size_t, std::size_t)>
 

Static Public Member Functions

static std::wstring widen (std::string_view)
 Widens from UTF-8 to UTF-16/UCS-4 wstring.
 
static bool valid (std::string_view) noexcept
 Returns true if the string is valid UTF-8.
 
static std::string narrow (const std::wstring &)
 Narrows from UTF-16/UCS-4 wstring to UTF-8.
 
static std::string narrow (const wchar_t *)
 Pointer overload.
 
static std::string narrow (const wchar_t *, std::size_t n)
 String-view style overload (pointer plus length).
 
static bool invalid (const std::wstring &)
 Returns true if the string contains L'\xFFFD'.
 
static bool invalid (const std::string &)
 Returns true if the string contains u8"\uFFFD".
 
static std::size_t u8out (unicode_type, char *&) noexcept
 Puts a Unicode character value into a character buffer with UTF-8 encoding.
 
static std::pair< unicode_type, std::size_t > u8in (const unsigned char *, std::size_t n) noexcept
 Reads a Unicode character from a UTF-8 buffer together with the number of bytes consumed.
 
static void u8parse (std::string_view, ParseFn)
 Calls a function for each Unicode value in the given UTF-8 string.
 
static bool utf16 (bool)
 Forces UTF-16 even if wchar_t is 4 bytes. Used in testing.
 

Static Public Attributes

static constexpr unicode_type unicode_error = ~(unicode_type)0
 

Detailed Description

A static class which provides string encoding conversion functions between UTF-8 and wchar_t.

On Unix wchar_t strings are unencoded UCS-4; on Windows wchar_t strings are UTF-16.

Definition at line 43 of file gconvert.h.
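
The difference between the two wchar_t encodings only shows up for code points above U+FFFF, which need two UTF-16 code units but a single UCS-4 unit. The following is a minimal sketch of that mapping and is not taken from gconvert.cpp; the names to_utf16, wide_is_ucs4 and wide_units are hypothetical and do not exist in G::Convert.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of G::Convert: encodes one Unicode
// code point as UTF-16 code units (one unit, or a surrogate pair).
// Assumes the caller passes a valid, non-surrogate code point.
std::vector<std::uint16_t> to_utf16(std::uint_least32_t u)
{
    assert(u <= 0x10FFFFU); // precondition of this sketch
    if (u < 0x10000U)
        return { static_cast<std::uint16_t>(u) };
    u -= 0x10000U;
    return {
        static_cast<std::uint16_t>(0xD800U | (u >> 10)),    // high surrogate
        static_cast<std::uint16_t>(0xDC00U | (u & 0x3FFU))  // low surrogate
    };
}

// True if this platform's wchar_t can hold any code point in one unit
// (typical on Unix); false where wchar_t is 16 bits (typical on Windows).
constexpr bool wide_is_ucs4 = sizeof(wchar_t) >= 4;

// Number of wchar_t units one code point would occupy on this platform.
std::size_t wide_units(std::uint_least32_t u)
{
    return wide_is_ucs4 ? 1U : to_utf16(u).size();
}
```

For example, U+1F600 becomes the pair 0xD83D 0xDE00 in UTF-16, whereas U+20AC fits in a single unit under either encoding.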

Member Typedef Documentation

◆ ParseFn

using G::Convert::ParseFn = std::function<bool(unicode_type,std::size_t,std::size_t)>

Definition at line 51 of file gconvert.h.

◆ unicode_type

using G::Convert::unicode_type = std::uint_least32_t

Definition at line 48 of file gconvert.h.

Member Function Documentation

◆ invalid() [1/2]

bool G::Convert::invalid ( const std::string &  s)
static

Returns true if the string contains u8"\uFFFD".

Definition at line 81 of file gconvert.cpp.
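
As an illustration of what the two invalid() overloads test for, the sketch below searches for the replacement character U+FFFD, which is the three bytes EF BF BD in UTF-8. It is a stand-in written for this page, not the code from gconvert.cpp, and the name contains_replacement is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for invalid(const std::string&): true if the
// UTF-8 string contains U+FFFD, encoded as the bytes EF BF BD.
bool contains_replacement(const std::string& s)
{
    return s.find("\xEF\xBF\xBD") != std::string::npos;
}

// Hypothetical stand-in for invalid(const std::wstring&): true if the
// wide string contains the single unit L'\xFFFD'.
bool contains_replacement(const std::wstring& s)
{
    return s.find(L'\xFFFD') != std::wstring::npos;
}

// Usage sketch: a string carrying a substituted character is flagged.
void self_check()
{
    assert(contains_replacement(std::string("ok \xEF\xBF\xBD")));
    assert(!contains_replacement(std::string("ok")));
}
```

Since widen() and narrow() substitute U+FFFD for invalid input, a check of this kind on their output indicates that a substitution took place, or that the input already contained U+FFFD.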

◆ invalid() [2/2]

bool G::Convert::invalid ( const std::wstring &  s)
static

Returns true if the string contains L'\xFFFD'.

Definition at line 74 of file gconvert.cpp.

◆ narrow() [1/3]

std::string G::Convert::narrow ( const std::wstring &  s)
static

Narrows from UTF-16/UCS-4 wstring to UTF-8.

Invalid input characters are substituted with u8"\uFFFD", i.e. "\xEF\xBF\xBD".

Definition at line 53 of file gconvert.cpp.

◆ narrow() [2/3]

std::string G::Convert::narrow ( const wchar_t *  p)
static

Pointer overload.

Definition at line 60 of file gconvert.cpp.

◆ narrow() [3/3]

std::string G::Convert::narrow ( const wchar_t *  p,
std::size_t  n 
)
static

String-view style overload (pointer plus length).

Definition at line 67 of file gconvert.cpp.

◆ u8in()

std::pair< G::Convert::unicode_type, std::size_t > G::Convert::u8in ( const unsigned char *  p,
std::size_t  n 
)
static noexcept

Reads a Unicode character from a UTF-8 buffer together with the number of bytes consumed.

Returns [unicode_error,1] on error.

Definition at line 153 of file gconvert.cpp.
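
To make the u8in() return convention concrete, here is a self-contained decoder written against the documented contract: it yields the code point and the bytes consumed, or [unicode_error,1] on error. It is a sketch under assumptions, not the implementation in gconvert.cpp. The names sketch_u8in and count_valid are hypothetical, and the exact set of inputs the real function rejects (overlong forms, surrogates, empty buffers) is assumed here rather than documented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

using unicode_type = std::uint_least32_t;
constexpr unicode_type unicode_error = ~static_cast<unicode_type>(0);

// Hypothetical decoder mirroring the documented contract of u8in():
// returns {code point, bytes consumed}, or {unicode_error, 1} on error.
std::pair<unicode_type, std::size_t> sketch_u8in(const unsigned char* p, std::size_t n) noexcept
{
    if (p == nullptr || n == 0U) return { unicode_error, 1U }; // assumed behaviour
    const unsigned char b0 = p[0];
    std::size_t len = 0U;
    unicode_type u = 0U;
    if (b0 < 0x80U)                 { len = 1U; u = b0; }
    else if ((b0 & 0xE0U) == 0xC0U) { len = 2U; u = b0 & 0x1FU; }
    else if ((b0 & 0xF0U) == 0xE0U) { len = 3U; u = b0 & 0x0FU; }
    else if ((b0 & 0xF8U) == 0xF0U) { len = 4U; u = b0 & 0x07U; }
    else return { unicode_error, 1U }; // stray continuation or invalid lead byte
    if (len > n) return { unicode_error, 1U }; // truncated sequence
    for (std::size_t i = 1U; i < len; i++)
    {
        if ((p[i] & 0xC0U) != 0x80U) return { unicode_error, 1U };
        u = (u << 6) | (p[i] & 0x3FU);
    }
    // Assumed checks: overlong forms, UTF-16 surrogates, values above U+10FFFF.
    constexpr unicode_type min_for_len[] = { 0U, 0U, 0x80U, 0x800U, 0x10000U };
    if (u < min_for_len[len] || (u >= 0xD800U && u <= 0xDFFFU) || u > 0x10FFFFU)
        return { unicode_error, 1U };
    return { u, len };
}

// Usage: walk a buffer. The consumed count of 1 on error lets the
// caller skip one byte and resynchronise on the next.
std::size_t count_valid(const unsigned char* p, std::size_t n) noexcept
{
    std::size_t count = 0U;
    while (n > 0U)
    {
        const auto r = sketch_u8in(p, n);
        assert(r.second >= 1U && r.second <= n);
        if (r.first != unicode_error) count++;
        p += r.second;
        n -= r.second;
    }
    return count;
}
```

Returning a consumed count of 1 alongside the error value means a calling loop always makes progress, even on malformed input.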

◆ u8out()

std::size_t G::Convert::u8out ( unicode_type  u,
char *&  p_out 
)
static noexcept

Puts a Unicode character value into a character buffer with UTF-8 encoding.

Advances the pointer by reference and returns the number of bytes (1..4). Returns zero on error, without advancing the pointer.

Definition at line 284 of file gconvert.cpp.
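
The pointer-by-reference convention of u8out() can be sketched as follows. This is an illustrative encoder written to the documented contract (returns 1..4 and advances the pointer, or returns zero and leaves it alone), not the code in gconvert.cpp. The names sketch_u8out and encode are hypothetical, and treating surrogates and values above U+10FFFF as errors is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using unicode_type = std::uint_least32_t;

// Hypothetical encoder mirroring the documented contract of u8out().
// The caller must supply room for up to four bytes.
std::size_t sketch_u8out(unicode_type u, char*& p_out) noexcept
{
    if (u > 0x10FFFFU || (u >= 0xD800U && u <= 0xDFFFU))
        return 0U; // assumed error cases; pointer left untouched
    if (u < 0x80U)
    {
        *p_out++ = static_cast<char>(u);
        return 1U;
    }
    if (u < 0x800U)
    {
        *p_out++ = static_cast<char>(0xC0U | (u >> 6));
        *p_out++ = static_cast<char>(0x80U | (u & 0x3FU));
        return 2U;
    }
    if (u < 0x10000U)
    {
        *p_out++ = static_cast<char>(0xE0U | (u >> 12));
        *p_out++ = static_cast<char>(0x80U | ((u >> 6) & 0x3FU));
        *p_out++ = static_cast<char>(0x80U | (u & 0x3FU));
        return 3U;
    }
    *p_out++ = static_cast<char>(0xF0U | (u >> 18));
    *p_out++ = static_cast<char>(0x80U | ((u >> 12) & 0x3FU));
    *p_out++ = static_cast<char>(0x80U | ((u >> 6) & 0x3FU));
    *p_out++ = static_cast<char>(0x80U | (u & 0x3FU));
    return 4U;
}

// Usage: encode one code point into a std::string (empty on error).
std::string encode(unicode_type u)
{
    char buf[4];
    char* p = buf;
    const std::size_t n = sketch_u8out(u, p);
    assert(p == buf + n); // pointer advanced by exactly the byte count
    return std::string(buf, n);
}
```

Because the pointer is advanced in place, successive calls can append to one buffer without the caller tracking an offset.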

◆ u8parse()

void G::Convert::u8parse ( std::string_view  s,
ParseFn  fn 
)
static

Calls a function for each Unicode value in the given UTF-8 string.

Stops if the callback returns false. The callback parameters are: Unicode value (0xFFFD on error), UTF-8 bytes consumed, and UTF-8 byte offset.

Definition at line 216 of file gconvert.cpp.
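
The callback protocol of u8parse() is easiest to see with a small driver. The sketch below uses the same ParseFn shape (Unicode value, bytes consumed, byte offset) and stops when the callback returns false. It is a simplified stand-in, not the code in gconvert.cpp: sketch_u8parse and code_points are hypothetical names, and the decoder only checks lead and continuation bytes, so it does not reject overlong forms or surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string_view>
#include <vector>

using unicode_type = std::uint_least32_t;
using ParseFn = std::function<bool(unicode_type, std::size_t, std::size_t)>;

// Hypothetical driver with the same callback shape as u8parse().
void sketch_u8parse(std::string_view s, ParseFn fn)
{
    std::size_t pos = 0U;
    while (pos < s.size())
    {
        const auto b0 = static_cast<unsigned char>(s[pos]);
        // Sequence length from the lead byte; 0 marks an invalid lead byte.
        std::size_t len = b0 < 0x80U ? 1U : (b0 & 0xE0U) == 0xC0U ? 2U :
            (b0 & 0xF0U) == 0xE0U ? 3U : (b0 & 0xF8U) == 0xF0U ? 4U : 0U;
        unicode_type u = b0;
        if (len >= 2U) u = b0 & (0xFFU >> (len + 1U)); // payload bits of the lead byte
        bool ok = len != 0U && pos + len <= s.size();
        for (std::size_t i = 1U; ok && i < len; i++)
        {
            const auto b = static_cast<unsigned char>(s[pos + i]);
            ok = (b & 0xC0U) == 0x80U;
            u = (u << 6) | (b & 0x3FU);
        }
        if (!ok) { u = 0xFFFDU; len = 1U; } // report the error and skip one byte
        if (!fn(u, len, pos)) return;       // callback asked to stop
        pos += len;
    }
}

// Usage: collect every code point, with 0xFFFD standing in for errors.
std::vector<unicode_type> code_points(std::string_view s)
{
    std::vector<unicode_type> out;
    sketch_u8parse(s, [&out](unicode_type u, std::size_t, std::size_t) {
        out.push_back(u);
        return true; // keep going
    });
    return out;
}
```

The byte offset passed to the callback lets a caller locate the first error in the original string, for instance by returning false as soon as the value is 0xFFFD.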

◆ utf16()

bool G::Convert::utf16 ( bool  b)
static

Forces UTF-16 even if wchar_t is 4 bytes. Used in testing.

Definition at line 30 of file gconvert.cpp.

◆ valid()

bool G::Convert::valid ( std::string_view  sv)
static noexcept

Returns true if the string is valid UTF-8.

Definition at line 44 of file gconvert.cpp.

◆ widen()

std::wstring G::Convert::widen ( std::string_view  sv)
static

Widens from UTF-8 to UTF-16/UCS-4 wstring.

Invalid input characters are substituted with L'\xFFFD'.

Definition at line 38 of file gconvert.cpp.

Member Data Documentation

◆ unicode_error

constexpr unicode_type G::Convert::unicode_error = ~(unicode_type)0
static constexpr

Definition at line 50 of file gconvert.h.


The documentation for this class was generated from the following files: