本章很长,但是算法绝对是C++极富魅力的功能!
10 GENERIC ALGORITHMS
10.1 OVERVIEWgeneric algorithms: “algorithms” because they implement common classical algorithms such as sorting and searching, and “generic” because they operate on elements of differing type and across multiple container types—not only library types such as vector or list, but also the built-in array type—and, as we shall see, over other kinds of sequences as well.
In general, the algorithms do not work directly on a container. Instead, they operate by traversing a range of elements bounded by two iterators.Typically, as the algorithm traverses the range, it does something with each element。The generic algorithms do not themselves execute container operations. They operate solely in terms of iterators and iterator operations.The fact that the algorithms operate in terms of iterators and not container operations has a perhaps surprising but essential implication: Algorithms never change the size of the underlying container.There is a special class of iterator, the inserters. An inserter is an iterator adaptor that takes a container and yields an iterator that adds elements to the specified container. When we assign a value through an insert iterator, the iterator calls a container operation to add an element at a specified position in the given container.When an algorithm operates on one of these iterators, the iterator may have the effect of adding elements to the container.The algorithm itself, however, never does so.
int val = 42; // value we‘ll look for
// result will denote the element we want if it‘s in vec, or vec.cend() if not
auto result = find(vec.cbegin(), vec.cend(), val);
string val = "a value"; // value we‘ll look for
// this call to find looks through string elements in a list
auto result = find(1st.cbegin(), 1st.cend(), val);
//because pointers act like iterators on built-in arrays, we can use find to look in an array:
int ia[] = {27, 210, 12, 47, 109, 83};
int val = 83;
int* result = find(begin(ia), end(ia), val);
10.2 A FIRST LOOK AT THE ALGORITHMS
10.2.1 READ-ONLY ALGORITHMSFind(),count(),accumulate(),equal()
Algorithms that take a single iterator denoting a second sequence assume that the second sequence is at least as large at the first.
equal(roster1.cbegin(), roster1.cend(), roster2.cbegin());
equal makes one critically important assumption: It assumes that the second sequence is at least as big as the first. This algorithm potentially looks at every element in the first sequence. It assumes that there is a corresponding element for each of those
elements in the second sequence.
10.2.2 ALGORITHMS THAT WRITE CONTAINER ELEMENTS
Algorithms that write to a destination iterator assume the destination is large enough to hold the number of elements being written.
vector<int>vec; // empty vector// disaster: attempts to write to ten (nonexistent) elements in vec
fill_n(vec.begin(), 10, 0);
This call to fill_n is a disaster.
One way to ensure that an algorithm has enough elements to hold the output is to use an insert iterator.
back_inserter takes a reference to a container and returns an insert iterator bound to that container. When we assign through that iterator, the assignment calls push_back to add an element with the given value to the container:
vector<int> vec; // empty vector
auto it = back_inserter(vec); // assigning through it adds elements to vec
*it = 42; // vec now has one element with value 42
vector<int> vec; // empty vector
// ok: back_inserter creates an insert iterator that adds elements to vec
fill_n(back_inserter(vec), 10, 0); // appends ten elements to vec
10.2.3 ALGORITHMS THAT REORDER CONTAINER ELEMENTS
Some algorithms rearrange the order of elements within a container. An obvious example of such an algorithm is sort.
10.3.1 PREDICATE
A predicate is an expression that can be called and that returns a value that can be used as a condition.The predicates used by library algorithms are either unary predicates(meaning they have a single parameter) or binary predicates(meaning they have two parameters). The algorithms that take predicates call the given predicate on the elements in the input range. As a result, it must be possible to convert the element type to the parameter type of the predicate.
// comparison function to be used to sort by word length
bool isShorter(const string &s1, const string &s2)
{
return s1.size() < s2.size();
}
// sort on word length, shortest to longest
sort(words.begin(), words.end(), isShorter);
To keep the words of the same length in alphabetical order we can use the stable_sort algorithm. A stable sort maintains the original order among equal elements.
elimDups(words);// put words in alphabetical order and remove duplicates
// resort by length, maintaining alphabetical order among words of the same length
stable_sort(words.begin(), words.end(), isShorter);
for (const auto &s : words) // no need to copy the strings
cout << s << " "; // print each element separated by a space
cout << endl;
The predicates we pass to an algorithm must have exactly one or two parameters.Sometimes we want to do processing that requires more arguments than the algorithm’s predicate allows.
We can pass any kind of callable object to an algorithm. An object or expression is callable if we can apply the call operator to it. That is, if e is a callable expression, we can write e(args) where args is a comma-separated list of zero or more arguments.
There are four kinds of callables: functions, function pointers, classes that overload the function-call operator and lambda expressions.
A lambda expression represents a callable unit of code. It can be thought of as an unnamed, inline function. Like any function, a lambda has a return type, a parameter list, and a function body. Unlike a function, lambdas may be defined inside a function. A lamba expression has the form:
[capture list] (parameter list) -> return type { function body}
where capture list is an (often empty) list of local variables defined in the enclosing function; return type, parameter list, and function body are the same as in any ordinary function. However, unlike ordinary functions, a lambda must use a trailing return to specify its return type.
We can omit either or both of the parameter list and return type but must always include the capture list and function body:
auto f = [] { return 42; };
cout<< f() << endl; // prints 42
stable_sort(words.begin(), words.end(),[](const string &a, const string &b)
{ return a.size() < b.size();});
[sz](const string &a)
{ return a.size() >= sz; };
// get an iterator to the first element whose size() is >= sz
auto wc = find_if(words.begin(), words.end(),
[sz](const string &a){ return a.size() >= sz; });
for_each algorithm takes a callable object and calls that object on each element in the input range:
// print words of the given size or longer, each one followed by a spacefor_each(wc, words.end(),[](const string &s){cout << s << " ";});
cout << endl;
The capture list is used for local nonstaticvariables only; lambdas can use local statics and variables declared outside the function directly.
When we define a lambda, the compiler generates a new (unnamed) class type that corresponds to that lambda.
For now, what’s useful to understand is that when we pass a lambda to a function, we are defining both a new type and an object of that type: The argument is an unnamed object of this compiler-generated class type.
By default, the class generated from a lambda contains a data member corresponding to the variables captured by the lambda.
Rather than explicitly listing the variables we want to use from the enclosing function, we can let the compiler infer which variables we use from the code in the lambda’s body.
The & tells the compiler to capture by reference, and the =says the values are captured by value.
// sz implicitly captured by value
wc = find_if(words.begin(), words.end(),[=](const string &s){ return s.size() >= sz; });
If we want to capture some variables by value and others by reference, we can mix implicit and explicit captures:
// os implicitly captured by reference; c explicitly captured by value
for_each(words.begin(), words.end(),[&, c](const string &s) { os << s << c; });
By default, a lambda may not change the value of a variable that it copies by value.If we want to be able to change the value of a captured variable, we must follow the parameter list with the keyword mutable. Lambdas that are mutable may not omit the parameter list:
void fcn3()
{
size_t v1 = 42; // local variable
// f can change the value of the variables it captures
auto f = [v1] () mutable { return ++v1; };
v1 = 0;
auto j = f(); // j is 43
}
void fcn4()
{
size_t v1 = 42; // local variable
// v1 is a reference to a non const variable
// we can change that variable through the reference inside f2
auto f2 = [&v1] { return ++v1; };
v1 = 0;
auto j = f2(); // j is 1
}
Specifying the Lambda Return Type
transform(vi.begin(),vi.end(), vi.begin(),[](int i) { return i < 0 ? -i : i; });
The transform function takes three iterators and a callable. The first two iterators denote an input sequence and the third is a destination. The algorithm calls the given callable on each element in the input sequence and writes the result to the destination.Under C++ 11,when we need to define a return type for a lambda, we must use a trailing return type:
transform(vi.begin(),vi.end(), vi.begin(),[](int i) -> int{ if (i < 0) return -i; else return i; });
10.3.4 BINDING ARGUMENTS
It is not so easy to write a function to replace a lambda that captures local variables. The key reason is that the lambda can use its capture list to store local variables, thus it can only have one or two arguments to fit into algrithm’s callable funtions requirement.In place of that lambda, we have to figure out how to pass an argument. We can solve the problem of passing a size argument to check_size by using a new library function named bind, which is defined in the functional header.The bind function can be thought of as a general-purpose function adaptor.It takes a callable object and generates a new callable that “adapts” the parameter list of the original object.
The general form of a call to bind is:
auto newCallable= bind(callable, arg_list);
bool check_size(const string &s, string::size_type sz)
{
return s.size() >= sz;
}
// check6 is a callable object that takes one argument of type string
// and calls check_size on its given string and the value 6
auto check6 = bind(check_size, _1, 6);
This call to bind has only one placeholder, which means that check6 takes a single argument. The placeholder appears first in arg_list, which means that the parameter in check6 corresponds to the first parameter of check_size. That parameter is a const string&, which means that the parameter in check6 is also a const string&. Thus, a call to check6 must pass an argument of type string, which check6 will pass as the first argument to check_size.
The second argument in arg_list(i.e., the third argument to bind) is the value 6.That value is bound to the second parameter of check_size. Whenever we call check6, it will pass 6 as the second argument to check_size.
auto wc = find_if(words.begin(), words.end(),[sz](const string &a)
with a version that uses check_size:
auto wc = find_if(words.begin(), words.end(),bind(check_size, _1, sz));
This call to bind generates a callable object that binds the second argument of check_size to the value of sz.
The _n names are defined in a namespace named placeholders. That namespace is itself defined inside the std namespace:using namespace std::placeholders; makes all the names defined by placeholders usable.
Like the bind function, the placeholders namespace is defined in the functional header.
Arguments to bind
// g is a callable object that takes two arguments
auto g = bind(f, a, b, _2, c, _1);
generates a new callable that takes two arguments, represented by the placeholders _2 and _1.
In effect, this call to bind maps g(_1,_2) to f(a,b, _2, c, _1)
For example, calling g(X, Y) calls f(a,b, Y, c, X)
(Woo,酷!)We can use bind to invert the meaning of isShorter by writing:
// sort on word length, shortest to longest
sort(words.begin(), words.end(), isShorter);
// sort on word length, longest to shortest
sort(words.begin(), words.end(), bind(isShorter, _2, _1));
Because bind copies its arguments and we cannot copy an ostream. If we want to pass an object to bind without copying it, we must use the libraryref function:The ref function returns an object that contains the given reference and that is itself copyable.
There is also a cref function that generates a class that holds a reference to const.
In addition to the iterators that are defined for each of the containers, the library defines several additional kinds of iterators in the iterator header. These iterators include:
? Insert iterators: These iterators are bound to a container and can be used to insert elements into the container.
? Stream iterators: These iterators are bound to input or output streams and can be used to iterate through the associated IO stream.
? Reverse iterators: These iterators move backward, rather than forward. The library containers, other than forward_list, have reverse iterators.
? Move iterators: These special-purpose iterators move rather than copy their elements. We’ll cover move iterators in § 13.6.2(p. 543).
An inserter is an iterator adaptor that takes a container and yields an iterator that adds elements to the specified container. When we assign a value through an insert iterator, the iterator calls a container operation to add an element at a specified position in the given container.
? back_inserter: creates an iterator that uses push_back.
? front_inserter: creates an iterator that uses push_front.
? inserter: creates an iterator that uses insert. This function takes a second argument, which must be an iterator into the given container. Elements are insertedahead of the element denoted by the given iterator.
If it is an iterator generated by inserter, then an assignment such as
*it = va1;behaves as
it= c.insert(it, val); // it points to the newly added element
++it; // increment it so that it denotes the same element as before
Even if the position we pass to inserter initially denotes the first element, as soon as we insert an element in front of that element, that element is no longer the one at the beginning of the container:
list<int> lst = {1,2,3,4};list<int> lst2, lst3; // empty lists
// after copy completes, 1st2 contains 4 3 2 1
copy(lst.cbegin(),lst.cend(), front_inserter(lst2));
// after copy completes, 1st3 contains 1 2 3 4
copy(lst.cbegin(),lst.cend(), inserter(lst3, lst3.begin()));
When we call front_inserter(c), we get an insert iterator that successively calls push_front. As each element is inserted, it becomes the new first element in c.Therefore, front_inserter yields an iterator that reverses the order of the sequence that it
inserts; inserter and back_inserter don’t.
Even though the iostream types are not containers, there are iterators that can be used with objects of the IO types. An istream_iterator reads an input stream, and an ostream_iterator writes an output stream. These iterators treat their corresponding stream as a sequence of elements of a specified type.
Table10.3. istream_iterator Operations
Table10.4. stream_iterator Operations
An istream_iterator uses >> to read a stream:
istream_iterator<int> in_iter(cin); // read ints from cin
istream_iterator<int> eof; // istream ‘‘end‘‘ iterator
while (in_iter != eof) // while there‘s valid input to read
// postfix increment reads the stream and returns the old value of the iterator
// we dereference that iterator to get the previous value read from the stream
vec.push_back(*in_iter++);
An iterator bound to a stream is equal to the end iterator once its associated stream hits end-of-file or encounters an IO error.
What is more useful is that we can rewrite this program as:
istream_iterator<int> in_iter(cin), eof; // read ints from cin
vector<int> vec(in_iter, eof); // construct vec from an iterator range
istream_iterator<int> in(cin), eof;
cout << accumulate(in, eof, 0) << endl;
// The second argument specifies a character string to print following each element.That string must be a C-style character string (i.e., a string literal or a pointer to a null-terminated array).
ostream_iterator<int>out_iter(cout, " ");
for (auto e : vec)
*out_iter++ = e; // the assignment writes this element to cout
cout << endl;
for(auto e : vec)
out_iter = e; // the assignment writes this element to cout
cout << endl;
The * and ++operators do nothing on an ostream_iterator, so omitting them has no effect on our program. However, we prefer to write the loop as first presented.That loop uses the iterator consistently with how we use other iterator types. We can easily change this loop to execute on another iterator type. Moreover, the behavior of this loop will be clearer to readers of our code.
copy(vec.begin(),vec.end(), out_iter);
cout << endl;
We can create an istream_iterator for any type that has an input operator (>>). Similarly, we can define an ostream_iterator so long as the type has an output operator (<<).
A reverse iterator is an iterator that traverses a container backward, from the last element toward the first.A reverse iterator inverts the meaning of increment (and decrement). Incrementing (++it) a reverse iterator moves the iterator to the previous element; derementing (--it) moves the iterator to the next element.
Although it may seem confusing to have the meaning of the increment and decrement operators reversed, doing so lets us use the algorithms transparently to process a container forward or backward.For example, we can sort our vector in descending order by passing sorta pair of reverse iterators:
sort(vec.begin(),vec.end()); // sorts vec in ‘‘normal‘‘ order
// sorts in reverse: puts the smallest element at the end of vec
sort(vec.rbegin(), vec.rend());
It is not possible to create a reverse iterator from a forward_list or a stream iterator because it is not possible to move backward through a forward_list or a stream.
Relationship between Reverse Iterators and Other Iterators:
#include <algorithm>
#include <string>
#include <iostream>
using namespace std;
int main()
{
string line("ABC,CDE");
// find the first element in a comma-separated list
auto comma = find(line.cbegin(), line.cend(), ‘,‘);
//输出:ABC
cout << string(line.cbegin(), comma) << endl;
// find the last element in a comma-separated list
auto rcomma = find(line.crbegin(), line.crend(), ‘,‘);
// WRONG: will generate the word in reverse order
//输出:EDC
cout << string(line.crbegin(), rcomma) << endl;
//we can’t use rcomma directly. That iterator is a reverse iterator,
//which means that it goes backward toward the beginning of the string.
//What we need to do is transform rcommaback into an ordinary iterator that will go forward through line. We can do so by calling thereverse_iterator’s base member,
//which gives us its corresponding ordinary iterator:
// ok: get a forward iterator and read to the end of line
//输出:CDE
cout << string(rcomma.base(), line.cend()) << endl;
return 0;
}
rcomma and rcomma.base() refer to different elements, as do line.crbegin()and line.cend(). These differences are needed to ensure that the range of elements, whether processed forward or backward, is the same.
Technically speaking, the relationship between normal and reverse iterators accommodates the properties of a left-inclusive range. The point is that [line.crbegin(), rcomma)and [rcomma.base(), line.cend()) refer to the same elements in line. In order for that to happen, rcomma and rcomma.base() must yield adjacent positions, rather than the same position, as must crbegin()and cend().
The most fundamental property of any algorithm is the list of operations it requires from its iterator(s).
The iterator operations required by the algorithms are grouped into five iterator categories listed in Table 10.5.
A second way is to classify the algorithms (as we did in the beginning of this chapter) is by whether they read, write, or reorder the elements in the sequence. Appendix A covers all the algorithms according to this classification.
The standard specifies the minimum category for each iterator parameter of the generic and numeric algorithms. For example, find—which implements a one-pass, read-only traversal over a sequence—minimally requires an input iterator.
Many compilers will not complain when we pass the wrong category of iterator to an algorithm(抓狂!).
Input iterators: can read elements in a sequence. An input iterator must provide
?Equality and inequality operators (==, !=) to compare two iterators?Prefix and postfix increment (++) to advance the iterator
?Dereference operator (*) to read an element; dereference may appear only on the right-hand side of an assignment
?The arrow operator (->) as a synonym for (* it).member—that is, dereference the iterator and fetch a member from the underlying object
We are guaranteed that *it++is valid, but incrementing an input iterator may invalidate all other iterators into the stream. As a result, there is no guarantee that we can save the state of an input iterator and examine an element through that saved iterator. Input iterators, therefore, may be used only for single-pass algorithms. istream_iterators are input iterators.
Output iterators: they write rather than read elements. Output iterators must provide
?Prefix and postfix increment (++) to advance the iterator
?Dereference (*), which may appear only as the left-hand side of an assignment(Assigning to a dereferenced output iterator writes to the underlying element.)
Like input iterators, output iterators may be used only for single-pass algorithms. Iterators used as a destination are typically output iterators. The ostream_iteratortype is an output iterator.
Forward iterators: can read and write a given sequence. They move in only one direction through the sequence. Forward iterators support all the operations of both input iterators and output iterators. Moreover, they can read or write the same element multiple times. Therefore, we can use the saved state of a forward iterator.Hence, algorithms that use forward iterators may make multiple passes through the sequence. The replace algorithm requires a forward iterator; iterators on forward_list are forward iterators.
Bidirectional iterators: can read and write a sequence forward or backward. In addition to supporting all the operations of a forward iterator, a bidirectional iterator also supports the prefix and postfix decrement (--) operators. The reverse algorithm requires bidirectional iterators, and aside from forward_list, the library containers supply iterators that meet the requirements for a bidirectional iterator.
?The relational operators (<, <=, >, and >=) to compare the relative positions of two iterators.
?Addition and subtraction operators (+, +=, -, and -=) on an iterator and an integral value. The result is the iterator advanced (or retreated) the integral number of elements within the sequence.
?The subtraction operator (-) when applied to two iterators, which yields the distance between two iterators.
?The subscript operator (iter[n]) as a synonym for * (iter + n).
Iterators for array, deque, string, and vector are random-access iterators, as are pointers when used to access elements of a built-in array.
Most of the algorithms have one of the following four forms:
alg(beg, end, other args);
alg(beg, end, dest, other args);
alg(beg, end, beg2, other args);
alg(beg, end, beg2, end2, other args);
A dest parameter is an iterator that denotes a destination in which the algorithm can write its output. If dest is an iterator that refers directly to a container, then the algorithm writes its output to existing elements within the container. More commonly, dest is bound to an insert iterator or an ostream_iterator.
Algorithms with a Second Input Sequence:
Algorithms that take only beg2(and not end2) treat beg2 as the first element in a second input range. The end of this range is not specified. Instead, these algorithms assume that the range starting at beg2 is at least as large as the one denoted by beg, end.
10.5.3 ALGORITHM NAMING CONVENTIONS
Algorithms that take a predicate to use in place of the < or ==operator, and that do not take other arguments, typically are overloaded. One version of the function uses the element type’s operator to compare elements; the second takes an extra parameter that is a predicate to use in place of <or ==:
unique(beg,end); // uses the == operator to compare the elements
unique(beg, end, comp); // uses comp to compare the elements
Algorithms that take an element value typically have a second named (not overloaded) version that takes a predicate in place of the value. The algorithms that take a predicate have the suffix _if appended:
find(beg,end, val); // find the first instance of val in the input range
find_if(beg, end, pred); // find the first instance for which pred is true
These algorithms both find the first instance of a specific element in the input range.The find algorithm looks for a specific value; the find_if algorithm looks for a value for which pred returns a nonzero value.
These algorithms provide a named version rather than an overloaded one because both versions of the algorithm take the same number of arguments. To avoid any possible ambiguities, the library provides separate named versions for these algorithms.
reverse(beg,end); // reverse the elements in the input range
reverse_copy(beg, end, dest);// copy elements in reverse order into dest
// removes the odd elements from v1
remove_if(v1.begin(), v1.end(),[](int i) { return i % 2; });
// copies only the non-odd (aka even) elements from v1 into v2; v1 is unchanged
remove_copy_if(v1.begin(), v1.end(), back_inserter(v2),[](int i) { return i % 2; });
Unlike the other containers, list and forward_list define several algorithms as members. In particular, the list types define their own versions of sort, merge, remove, reverse, and unique. The generic version of sort requires random access iterators. As a result, sort cannot be used with list and forward_list because these types offer bidirectional and forward iterators, respectively.
The generic versions of the other algorithms that the list types define can be used with lists, but at a cost in performance. These algorithms swap elements in the input sequence. A list can “swap” its elements by changing the links among its elements rather than swapping the values of those elements. As a result, the list-specific versions of these algorithms can achieve much better performance than the corresponding generic versions.
These list-specific operations are described in Table 10.6. Generic algorithms not listed in the table that take appropriate iterators execute equally efficiently on lists and forward_lists as on other containers.
The splice Members
The list types also define a splice algorithm, which is described in Table 10.7. This algorithm is particular to list data structures. Hence a generic version of this algorithm is not needed