pub enum WordSeparator {
    AsciiSpace,
    UnicodeBreakProperties,
    Custom(fn(line: &str) -> Box<dyn Iterator<Item = Word<'_>> + '_>),
}
Describes where words occur in a line of text.
The simplest approach is to say that words are separated by one or
more ASCII spaces (' '). This works for Western languages
without emojis. A more complex approach is to use the Unicode line
breaking algorithm, which finds break points in non-ASCII text.
Line breaks occur between words. See WordSplitter for options on
how to handle hyphenation of individual words.
§Examples
use textwrap::core::Word;
use textwrap::WordSeparator::AsciiSpace;
let words = AsciiSpace.find_words("Hello World!").collect::<Vec<_>>();
assert_eq!(words, vec![Word::from("Hello "), Word::from("World!")]);
Variants§
AsciiSpace
Find words by splitting on runs of ' ' characters.
§Examples
use textwrap::core::Word;
use textwrap::WordSeparator::AsciiSpace;
let words = AsciiSpace.find_words("Hello World!").collect::<Vec<_>>();
assert_eq!(words, vec![Word::from("Hello "),
                       Word::from("World!")]);
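To illustrate what splitting on runs of ' ' means, here is a minimal standard-library sketch. It is not textwrap's actual implementation: the helper name split_on_ascii_spaces and its (word, trailing spaces) return type are assumptions made for this example, whereas the real find_words yields Word values.

```rust
/// Hypothetical helper (not part of textwrap): splits `line` into
/// (word, trailing_spaces) pairs, treating each run of ASCII spaces
/// as belonging to the word that precedes it.
fn split_on_ascii_spaces(line: &str) -> Vec<(&str, &str)> {
    let mut pieces = Vec::new();
    let mut rest = line;
    while !rest.is_empty() {
        // The word extends to the first space, or to the end of the line.
        let word_end = rest.find(' ').unwrap_or(rest.len());
        // The run of spaces extends to the next non-space character.
        let space_end = rest[word_end..]
            .find(|c: char| c != ' ')
            .map_or(rest.len(), |i| word_end + i);
        pieces.push((&rest[..word_end], &rest[word_end..space_end]));
        rest = &rest[space_end..];
    }
    pieces
}
```

Under this sketch, an input with three spaces between "Hello" and "World!" produces two pieces, and the whole run of spaces stays attached to "Hello". Only ' ' counts as a separator here; tabs and other whitespace are treated as part of a word.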
UnicodeBreakProperties
Split line into words using Unicode break properties.
This word separator uses the Unicode line breaking algorithm
described in Unicode Standard Annex #14 to find legal places
to break lines. There is a small difference in that the U+002D
(Hyphen-Minus) and U+00AD (Soft Hyphen) don’t create a line break:
to allow a line break at a hyphen, use WordSplitter::HyphenSplitter.
Soft hyphens are not currently supported.
§Examples
Unlike WordSeparator::AsciiSpace, the Unicode line
breaking algorithm will find line break opportunities between
some characters with no intervening whitespace:
#[cfg(feature = "unicode-linebreak")] {
    use textwrap::core::Word;
    use textwrap::WordSeparator::UnicodeBreakProperties;

    assert_eq!(UnicodeBreakProperties.find_words("Emojis: 😂😍").collect::<Vec<_>>(),
               vec![Word::from("Emojis: "),
                    Word::from("😂"),
                    Word::from("😍")]);

    assert_eq!(UnicodeBreakProperties.find_words("CJK: 你好").collect::<Vec<_>>(),
               vec![Word::from("CJK: "),
                    Word::from("你"),
                    Word::from("好")]);
}
A U+2060 (Word Joiner) character can be inserted if you want to manually override the defaults and keep the characters together:
#[cfg(feature = "unicode-linebreak")] {
    use textwrap::core::Word;
    use textwrap::WordSeparator::UnicodeBreakProperties;

    assert_eq!(UnicodeBreakProperties.find_words("Emojis: 😂\u{2060}😍").collect::<Vec<_>>(),
               vec![Word::from("Emojis: "),
                    Word::from("😂\u{2060}😍")]);
}
The Unicode line breaking algorithm will also automatically suppress line breaks around certain punctuation characters:
#[cfg(feature = "unicode-linebreak")] {
    use textwrap::core::Word;
    use textwrap::WordSeparator::UnicodeBreakProperties;

    assert_eq!(UnicodeBreakProperties.find_words("[ foo ] bar !").collect::<Vec<_>>(),
               vec![Word::from("[ foo ] "),
                    Word::from("bar !")]);
}
Custom(fn(line: &str) -> Box<dyn Iterator<Item = Word<'_>> + '_>)
Find words using a custom word separator.
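The Custom variant has no example of its own, so the following sketch shows the shape of a function that fits its signature. To keep the block compilable with the standard library alone, Piece is a stand-in for textwrap::core::Word and Separator is a stand-in for the function-pointer type held by Custom. Both names, and the split_on_pipes function with its '|' delimiter, are assumptions made for illustration.

```rust
/// Stand-in for `textwrap::core::Word`, so this sketch compiles on its own.
#[derive(Debug, PartialEq)]
struct Piece<'a>(&'a str);

/// Stand-in for the function-pointer type stored in `Custom`.
/// The elided lifetimes tie the returned iterator to the input line.
type Separator = fn(line: &str) -> Box<dyn Iterator<Item = Piece<'_>> + '_>;

/// Hypothetical separator: every '|'-terminated field becomes one piece,
/// with the delimiter kept at the end of the piece it terminates.
fn split_on_pipes(line: &str) -> Box<dyn Iterator<Item = Piece<'_>> + '_> {
    Box::new(line.split_inclusive('|').map(Piece))
}
```

A plain function item such as split_on_pipes coerces to the Separator pointer type, which is why Custom can hold it without any generics. With textwrap itself, a function of this shape that returns Word values would presumably be passed as WordSeparator::Custom(split_on_pipes). Closures that capture state do not coerce to a fn pointer, so any configuration has to be baked into the function.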
Implementations§
impl WordSeparator
Trait Implementations§
impl Clone for WordSeparator
fn clone(&self) -> WordSeparator
fn clone_from(&mut self, source: &Self)
impl Debug for WordSeparator
impl Copy for WordSeparator
Auto Trait Implementations§
impl Freeze for WordSeparator
impl RefUnwindSafe for WordSeparator
impl Send for WordSeparator
impl Sync for WordSeparator
impl Unpin for WordSeparator
impl UnwindSafe for WordSeparator
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
unsafe fn clone_to_uninit(&self, dst: *mut T)