pub struct Wtf8Buf { /* private fields */ }
An owned, growable string of well-formed WTF-8 data.
Similar to String, but can additionally contain surrogate code points if they’re not in a surrogate pair.
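To make "can additionally contain surrogate code points" concrete, here is a minimal std-only sketch of how a single code point could be encoded as WTF-8 bytes. The helper name `encode_wtf8_sketch` is hypothetical and is not part of this type's API. It assumes WTF-8 uses the same byte layout as UTF-8 but also accepts U+D800 to U+DFFF in the three-byte form.

```rust
/// Encode one code point (U+0000..=U+10FFFF, surrogates included) as WTF-8 bytes.
/// Hypothetical helper for illustration only; not part of the documented API.
fn encode_wtf8_sketch(code_point: u32) -> Vec<u8> {
    assert!(code_point <= 0x10FFFF, "not a Unicode code point");
    match code_point {
        // One byte: ASCII.
        0x0000..=0x007F => vec![code_point as u8],
        // Two bytes.
        0x0080..=0x07FF => vec![
            0xC0 | (code_point >> 6) as u8,
            0x80 | (code_point & 0x3F) as u8,
        ],
        // Three bytes. This range includes the surrogates U+D800..=U+DFFF,
        // which strict UTF-8 rejects but this sketch encodes like any other value.
        0x0800..=0xFFFF => vec![
            0xE0 | (code_point >> 12) as u8,
            0x80 | ((code_point >> 6) & 0x3F) as u8,
            0x80 | (code_point & 0x3F) as u8,
        ],
        // Four bytes: supplementary code points.
        _ => vec![
            0xF0 | (code_point >> 18) as u8,
            0x80 | ((code_point >> 12) & 0x3F) as u8,
            0x80 | ((code_point >> 6) & 0x3F) as u8,
            0x80 | (code_point & 0x3F) as u8,
        ],
    }
}
```

Under these assumptions, a lone U+D800 encodes to the bytes ED A0 80, a sequence that a UTF-8 String cannot hold.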
Implementations§
impl Wtf8Buf
pub fn with_capacity(n: usize) -> Wtf8Buf
Create a new, empty WTF-8 string with pre-allocated capacity for n bytes.
pub fn from_string(string: String) -> Wtf8Buf
Create a WTF-8 string from a UTF-8 String.
This takes ownership of the String and does not copy.
Since WTF-8 is a superset of UTF-8, this always succeeds.
pub fn from_str(s: &str) -> Wtf8Buf
Create a WTF-8 string from a UTF-8 &str slice.
This copies the content of the slice.
Since WTF-8 is a superset of UTF-8, this always succeeds.
pub fn from_ill_formed_utf16(v: &[u16]) -> Wtf8Buf
Create a WTF-8 string from a potentially ill-formed UTF-16 slice of 16-bit code units.
This is lossless: calling .to_ill_formed_utf16() on the resulting string will always return the original code units.
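The lossless decoding described above can be sketched with the standard library alone. The function name `decode_ill_formed_utf16_sketch` is hypothetical, and the sketch returns code points as plain `u32` values rather than building a Wtf8Buf: well-formed pairs become one supplementary code point, and every unpaired surrogate is kept as-is instead of being rejected or replaced.

```rust
use std::char::decode_utf16;

/// Decode possibly ill-formed UTF-16 into code points, keeping lone surrogates.
/// Hypothetical helper for illustration only; not part of the documented API.
fn decode_ill_formed_utf16_sketch(units: &[u16]) -> Vec<u32> {
    decode_utf16(units.iter().copied())
        .map(|result| match result {
            // A BMP code unit or a well-formed surrogate pair.
            Ok(ch) => ch as u32,
            // An unpaired surrogate: keep its value so nothing is lost.
            Err(err) => err.unpaired_surrogate() as u32,
        })
        .collect()
}
```

Because no code unit is discarded, re-encoding the resulting code points to UTF-16 yields the original input, which is the property the method's documentation promises.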
pub fn reserve(&mut self, additional: usize)
Reserves capacity for at least additional more bytes to be inserted in the given Wtf8Buf.
The collection may reserve more space to avoid frequent reallocations.
§Panics
Panics if the new capacity overflows usize.
pub fn capacity(&self) -> usize
Returns the number of bytes that this string buffer can hold without reallocating.
pub fn push_wtf8(&mut self, other: &Wtf8)
Append a WTF-8 slice at the end of the string.
This replaces newly paired surrogates at the boundary with a supplementary code point, like concatenating ill-formed UTF-16 strings effectively would.
pub fn push(&mut self, code_point: CodePoint)
Append a code point at the end of the string.
This replaces newly paired surrogates at the boundary with a supplementary code point, like concatenating ill-formed UTF-16 strings effectively would.
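The boundary rule shared by push_wtf8 and push can be illustrated with a small std-only model. This sketch stores code points in a `Vec<u32>` rather than in WTF-8 bytes, and the name `push_code_point_sketch` is hypothetical. It shows only the pairing arithmetic: a lead surrogate already at the end of the buffer, followed by a newly pushed trail surrogate, becomes one supplementary code point.

```rust
/// Append a code point, merging a lead surrogate at the end of `buf` with a
/// newly pushed trail surrogate into one supplementary code point.
/// Hypothetical helper for illustration only; not part of the documented API.
fn push_code_point_sketch(buf: &mut Vec<u32>, code_point: u32) {
    let last = buf.last().copied();
    if let Some(last) = last {
        let is_lead = (0xD800..=0xDBFF).contains(&last);
        let is_trail = (0xDC00..=0xDFFF).contains(&code_point);
        if is_lead && is_trail {
            // Standard UTF-16 surrogate pair decoding.
            let combined = 0x10000 + ((last - 0xD800) << 10) + (code_point - 0xDC00);
            buf.pop();
            buf.push(combined);
            return;
        }
    }
    // No new pair was formed: append unchanged.
    buf.push(code_point);
}
```

For example, pushing U+DCA9 onto a buffer that ends in U+D83D leaves a single U+1F4A9, while pushing the same two surrogates in the opposite order leaves both unpaired.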
pub fn truncate(&mut self, new_len: usize)
Shortens a string to the specified length.
§Failure
Fails if new_len > current length, or if new_len is not a code point boundary.
pub fn into_string(self) -> Result<String, Wtf8Buf>
Consume the WTF-8 string and try to convert it to UTF-8.
This does not copy the data.
If the contents are not well-formed UTF-8 (that is, if the string contains surrogates), the original WTF-8 string is returned instead.
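The "give the buffer back on failure" shape of this signature has a close analogue in the standard library. The sketch below works on raw bytes instead of a Wtf8Buf, and the name `into_string_sketch` is hypothetical. It relies on the fact that the WTF-8 encoding of a surrogate (for example ED A0 80 for U+D800) is not valid UTF-8, so `String::from_utf8` rejects it and the original bytes can be recovered from the error without copying.

```rust
/// Try to reinterpret owned bytes as a UTF-8 String; on failure, hand the
/// original buffer back to the caller.
/// Hypothetical helper for illustration only; not part of the documented API.
fn into_string_sketch(wtf8_bytes: Vec<u8>) -> Result<String, Vec<u8>> {
    // `FromUtf8Error::into_bytes` returns the buffer that failed validation.
    String::from_utf8(wtf8_bytes).map_err(|err| err.into_bytes())
}
```

A caller can then match on the result and fall back to a lossy conversion only in the `Err` case.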
pub fn into_string_lossy(self) -> String
Consume the WTF-8 string and convert it lossily to UTF-8.
This does not copy the data (but may overwrite parts of it in place).
Surrogates are replaced with "\u{FFFD}" (the replacement character “�”).
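The replacement rule can be modelled on UTF-16 input with the standard library. This is a sketch of the observable result only, not of the in-place rewrite the method performs, and the name `utf16_to_string_lossy_sketch` is hypothetical. Each unpaired surrogate becomes exactly one U+FFFD, while well-formed pairs are decoded normally.

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};

/// Decode possibly ill-formed UTF-16, replacing each unpaired surrogate with U+FFFD.
/// Hypothetical helper for illustration only; not part of the documented API.
fn utf16_to_string_lossy_sketch(units: &[u16]) -> String {
    decode_utf16(units.iter().copied())
        .map(|result| result.unwrap_or(REPLACEMENT_CHARACTER))
        .collect()
}
```

Unlike the lossless conversions, this one cannot be reversed: after replacement there is no way to tell which surrogate originally occupied a given position.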
Methods from Deref<Target = Wtf8>§
pub fn slice(&self, begin: usize, end: usize) -> &Wtf8
Return a slice of the given string for the byte range [begin..end).
§Failure
Fails when begin or end does not point to a code point boundary, or points beyond the end of the string.
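The "code point boundary" condition used by the slicing methods can be sketched on raw bytes. Both function names below are hypothetical. The sketch assumes the usual UTF-8 style layout in which continuation bytes have the form 10xxxxxx, so an index is a boundary when it equals the length or points at a byte that is not a continuation byte.

```rust
/// Return true if `index` is the end of `bytes` or points at the first byte
/// of an encoded code point (anything that is not a 10xxxxxx continuation byte).
/// Hypothetical helper for illustration only; not part of the documented API.
fn is_code_point_boundary_sketch(bytes: &[u8], index: usize) -> bool {
    if index == bytes.len() {
        return true;
    }
    match bytes.get(index) {
        None => false, // beyond the end
        Some(&byte) => (byte & 0xC0) != 0x80,
    }
}

/// Slice `bytes` to [begin..end), panicking on a non-boundary or out-of-range index.
fn slice_sketch(bytes: &[u8], begin: usize, end: usize) -> &[u8] {
    assert!(begin <= end, "begin must not exceed end");
    assert!(is_code_point_boundary_sketch(bytes, begin), "begin is not a code point boundary");
    assert!(is_code_point_boundary_sketch(bytes, end), "end is not a code point boundary");
    &bytes[begin..end]
}
```

Checking boundaries before slicing guarantees that a slice never starts or ends in the middle of a multi-byte sequence, so the result is still well-formed.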
pub fn slice_from(&self, begin: usize) -> &Wtf8
Return a slice of the given string from byte begin to its end.
§Failure
Fails when begin is not at a code point boundary, or is beyond the end of the string.
pub fn slice_to(&self, end: usize) -> &Wtf8
Return a slice of the given string from its beginning to byte end.
§Failure
Fails when end is not at a code point boundary, or is beyond the end of the string.
pub fn ascii_byte_at(&self, position: usize) -> u8
Return the code point at position if it is in the ASCII range, or b'\xFF' otherwise.
§Failure
Fails if position is beyond the end of the string.
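The sentinel behaviour can be sketched on a raw byte slice. The name `ascii_byte_at_sketch` is hypothetical. Note that 0xFF never appears in well-formed UTF-8 or WTF-8, which is presumably why it serves as the "not ASCII" marker.

```rust
/// Return the byte at `position` if it is ASCII (0x00..=0x7F), or 0xFF otherwise.
/// Panics if `position` is out of bounds, mirroring the documented failure case.
/// Hypothetical helper for illustration only; not part of the documented API.
fn ascii_byte_at_sketch(bytes: &[u8], position: usize) -> u8 {
    match bytes[position] {
        ascii @ 0x00..=0x7F => ascii,
        _ => 0xFF,
    }
}
```

This lets a parser compare against ASCII delimiters byte by byte without first decoding a full code point.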
pub fn code_points(&self) -> Wtf8CodePoints<'_>
Return an iterator for the string’s code points.
pub fn as_str(&self) -> Option<&str>
Try to convert the string to UTF-8 and return a &str slice.
Return None if the string contains surrogates.
This does not copy the data.
pub fn to_string_lossy(&self) -> Cow<'_, str>
Lossily convert the string to UTF-8.
Return a UTF-8 &str slice if the contents are well-formed in UTF-8.
Surrogates are replaced with "\u{FFFD}" (the replacement character “�”).
This only copies the data if necessary (if it contains any surrogate).
pub fn to_ill_formed_utf16(&self) -> IllFormedUtf16CodeUnits<'_>
Convert the WTF-8 string to potentially ill-formed UTF-16 and return an iterator of 16-bit code units.
This is lossless: calling Wtf8Buf::from_ill_formed_utf16 on the resulting code units would always return the original WTF-8 string.
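The encoding direction can be sketched as follows. The function takes code points as `u32` values rather than iterating over WTF-8 bytes, and the name `encode_ill_formed_utf16_sketch` is hypothetical. Code points below U+10000, lone surrogates included, map to a single code unit, and supplementary code points are split into a surrogate pair.

```rust
/// Encode code points (surrogates allowed) as potentially ill-formed UTF-16.
/// Hypothetical helper for illustration only; not part of the documented API.
fn encode_ill_formed_utf16_sketch(code_points: &[u32]) -> Vec<u16> {
    let mut units = Vec::with_capacity(code_points.len());
    for &cp in code_points {
        assert!(cp <= 0x10FFFF, "not a Unicode code point");
        if cp < 0x10000 {
            // BMP code point or lone surrogate: one code unit, unchanged.
            units.push(cp as u16);
        } else {
            // Supplementary code point: split into a lead/trail surrogate pair.
            let offset = cp - 0x10000;
            units.push(0xD800 + (offset >> 10) as u16);
            units.push(0xDC00 + (offset & 0x3FF) as u16);
        }
    }
    units
}
```

The round trip stays lossless only because well-formed WTF-8 never stores a lead surrogate directly followed by a trail surrogate; such a pair is always stored as one supplementary code point, so decoding the output cannot merge two units that were originally separate.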
Trait Implementations§
impl Debug for Wtf8Buf
Format the string with double quotes, and surrogates as \u followed by four hexadecimal digits.
Example: "a\u{D800}" for a string with code points [U+0061, U+D800].
impl Extend<CodePoint> for Wtf8Buf
Append code points from an iterator to the string.
This replaces surrogate code point pairs with supplementary code points, like concatenating ill-formed UTF-16 strings effectively would.
fn extend<T: IntoIterator<Item = CodePoint>>(&mut self, iterable: T)
fn extend_one(&mut self, item: A)
fn extend_reserve(&mut self, additional: usize)
impl FromIterator<CodePoint> for Wtf8Buf
Create a new WTF-8 string from an iterator of code points.
This replaces surrogate code point pairs with supplementary code points, like concatenating ill-formed UTF-16 strings effectively would.