Boost.Locale provides a collator class, derived from std::collate, that adds support for primary, secondary, tertiary, quaternary and identical comparison levels. They can be approximately defined as:
- Primary – ignore accents and character case, comparing base letters only. For example, "facade" and "Façade" are the same.
- Secondary – ignore character case but consider accents. "facade" and "façade" are different, but "Façade" and "façade" are the same.
- Tertiary – consider both case and accents, but ignore punctuation. "Façade" and "façade" are different.
- Quaternary – consider case, accents, and punctuation. The words must be identical in terms of Unicode representation.
- Identical – as quaternary, but compare code points as well.
There are two ways of using the collator facet: directly, by calling its member functions compare, transform and hash, or indirectly by using the comparator template class in STL algorithms.
For example:
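// Assumes `loc` is a std::locale that carries the Boost.Locale collator facet, and that namespaces std and boost::locale are in scope.
wstring a = L"Façade", b = L"facade";
// Secondary level: case is ignored, accents are not.
bool eq = 0 == use_facet<collator<wchar_t> >(loc).compare(collator_base::secondary, a, b);
wcout << a << L" and " << b << L" are " << (eq ? L"identical" : L"different") << endl;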
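The example above calls only compare. Because collator derives from std::collate, the facet can also be reached through the standard std::collate interface, which offers level-less forms of compare, transform and hash. The following is a minimal sketch of those three calls using only the standard library. The helper names collate_compare, collate_key and collate_hash are hypothetical, and the level-aware Boost.Locale overloads are not shown here.

```cpp
#include <locale>
#include <string>

// Three-way comparison through the std::collate interface.
// Returns a negative value, zero, or a positive value.
int collate_compare(const std::locale& loc, const std::string& a, const std::string& b)
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}

// Sort key: comparing two keys with operator< gives the same
// ordering as calling compare() on the original strings.
std::string collate_key(const std::locale& loc, const std::string& s)
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.transform(s.data(), s.data() + s.size());
}

// Hash value that is equal for strings the facet treats as equal.
long collate_hash(const std::locale& loc, const std::string& s)
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.hash(s.data(), s.data() + s.size());
}
```

Sort keys are mainly worthwhile when the same strings are compared many times, for instance when sorting a large list: each key is computed once and then compared cheaply. The hash form is what a hash-based container would need in order to stay consistent with the collation rules.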
std::locale is designed to be useful as a comparison class in STL collections and algorithms. To get similar functionality with comparison levels, you must use the comparator class:
std::map<std::string,std::string,comparator<char,collator_base::secondary> > strings;
You can also set a specific locale or level when creating and using the comparator class:
comparator<char> comp(some_locale,some_level);
std::map<std::string,std::string,comparator<char> > strings(comp);
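For contrast with the comparator examples above, the following sketch shows the plain std::locale form that comparator extends with levels. It uses only the standard library. The function name keys_in_locale_order and the sample keys are hypothetical, chosen for illustration.

```cpp
#include <locale>
#include <map>
#include <string>
#include <vector>

// Returns the keys of a small map, ordered by the collation rules of `loc`.
std::vector<std::string> keys_in_locale_order(const std::locale& loc)
{
    // std::locale::operator() compares two strings with the locale's
    // collate facet, so a locale object can serve as the map's ordering.
    std::map<std::string, int, std::locale> counts(loc);
    counts["banana"] = 2;
    counts["apple"] = 1;
    counts["cherry"] = 3;

    std::vector<std::string> keys;
    for (const auto& entry : counts)
        keys.push_back(entry.first);   // iteration follows the locale's order
    return keys;
}
```

Called with std::locale::classic(), the keys come back in plain character-code order. A std::locale used this way always applies the facet's default comparison, which is why a fixed level such as secondary requires the comparator class instead.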