The most fundamental property of any algorithm is the list of operations it requires from its iterator(s). Some algorithms, such as find
, require only the ability to access an element through the iterator, to increment the iterator, and to compare two iterators for equality. Others, such as sort
, require the ability to read, write, and randomly access elements. The iterator operations required by the algorithms are grouped into five iterator categories listed in Table 10.5. Each algorithm specifies what kind of iterator must be supplied for each of its iterator parameters.
Table 10.5. Iterator Categories
A second way is to classify the algorithms (as we did in the beginning of this chapter) is by whether they read, write, or reorder the elements in the sequence. Appendix A covers all the algorithms according to this classification.
The algorithms also share a set of parameter-passing conventions and a set of naming conventions, which we shall cover after looking at iterator categories.
Like the containers, iterators define a common set of operations. Some operations are provided by all iterators; other operations are supported by only specific kinds of iterators. For example, ostream_iterator
s have only increment, dereference, and assignment. Iterators on vector
, string
s, and deque
s support these operations and the decrement, relational, and arithmetic operators.
Iterators are categorized by the operations they provide and the categories form a sort of hierarchy. With the exception of output iterators, an iterator of a higher category provides all the operations of the iterators of a lower categories.
The standard specifies the minimum category for each iterator parameter of the generic and numeric algorithms. For example, find
—which implements a one-pass, read-only traversal over a sequence—minimally requires an input iterator. The replace
function requires a pair of iterators that are at least forward iterators. Similarly, replace_copy
requires forward iterators for its first two iterators. Its third iterator, which represents a destination, must be at least an output iterator, and so on. For each parameter, the iterator must be at least as powerful as the stipulated minimum. Passing an iterator of a lesser power is an error.
Many compilers will not complain when we pass the wrong category of iterator to an algorithm.
Input iterators: can read elements in a sequence. An input iterator must provide
• Equality and inequality operators (
==
,!=
) to compare two iterators
• Prefix and postfix increment (
++
) to advance the iterator
• Dereference operator (
*
) to read an element; dereference may appear only on the right-hand side of an assignment
• The arrow operator (
->
) as a synonym for(* it).member
—that is, dereference the iterator and fetch a member from the underlying object
Input iterators may be used only sequentially. We are guaranteed that *it++
is valid, but incrementing an input iterator may invalidate all other iterators into the stream. As a result, there is no guarantee that we can save the state of an input iterator and examine an element through that saved iterator. Input iterators, therefore, may be used only for single-pass algorithms. The find
and accumulate
algorithms require input iterators; istream_iterator
s are input iterators.
Output iterators: can be thought of as having complementary functionality to input iterators; they write rather than read elements. Output iterators must provide
• Prefix and postfix increment (
++
) to advance the iterator
• Dereference (
*
), which may appear only as the left-hand side of an assignment (Assigning to a dereferenced output iterator writes to the underlying element.)
We may assign to a given value of an output iterator only once. Like input iterators, output iterators may be used only for single-pass algorithms. Iterators used as a destination are typically output iterators. For example, the third parameter to copy
is an output iterator. The ostream_iterator
type is an output iterator.
Forward iterators: can read and write a given sequence. They move in only one direction through the sequence. Forward iterators support all the operations of both input iterators and output iterators. Moreover, they can read or write the same element multiple times. Therefore, we can use the saved state of a forward iterator. Hence, algorithms that use forward iterators may make multiple passes through the sequence. The replace
algorithm requires a forward iterator; iterators on forward_list
are forward iterators.
Bidirectional iterators: can read and write a sequence forward or backward. In addition to supporting all the operations of a forward iterator, a bidirectional iterator also supports the prefix and postfix decrement (--
) operators. The reverse
algorithm requires bidirectional iterators, and aside from forward_list
, the library containers supply iterators that meet the requirements for a bidirectional iterator.
Random-access iterators: provide constant-time access to any position in the sequence. These iterators support all the functionality of bidirectional iterators. In addition, random-access iterators support the operations from Table 3.7 (p. 111):
• The relational operators (
<
,<=
,>
, and>=
) to compare the relative positions of two iterators.
• Addition and subtraction operators (
+
,+=
,-
, and-=
) on an iterator and an integral value. The result is the iterator advanced (or retreated) the integral number of elements within the sequence.
• The subtraction operator (
-
) when applied to two iterators, which yields the distance between two iterators.
• The subscript operator (
iter[n]
) as a synonym for* (iter + n)
.
The sort
algorithms require random-access iterators. Iterators for array, deque, string
, and vector
are random-access iterators, as are pointers when used to access elements of a built-in array.
Exercises Section 10.5.1
Exercise 10.38: List the five iterator categories and the operations that each supports.
Exercise 10.39: What kind of iterator does a
list
have? What about avector
?Exercise 10.40: What kinds of iterators do you think
copy
requires? What aboutreverse
orunique
?
Superimposed on any other classification of the algorithms is a set of parameter conventions. Understanding these parameter conventions can aid in learning new algorithms—by knowing what the parameters mean, you can concentrate on understanding the operation the algorithm performs. Most of the algorithms have one of the following four forms:
alg(beg, end, other args);
alg(beg, end, dest, other args);
alg(beg, end, beg2, other args);
alg(beg, end, beg2, end2, other args);
where alg is the name of the algorithm, and beg
and end
denote the input range on which the algorithm operates. Although nearly all algorithms take an input range, the presence of the other parameters depends on the work being performed. The common ones listed here—dest
, beg2
, and end2
—are all iterators. When used, these iterators fill similar roles. In addition to these iterator parameters, some algorithms take additional, noniterator parameters that are algorithm specific.
A dest
parameter is an iterator that denotes a destination in which the algorithm can write its output. Algorithms assume that it is safe to write as many elements as needed.
Algorithms that write to an output iterator assume the destination is large enough to hold the output.
If dest
is an iterator that refers directly to a container, then the algorithm writes its output to existing elements within the container. More commonly, dest
is bound to an insert iterator (§ 10.4.1, p. 401) or an ostream_iterator
(§ 10.4.2, p. 403). An insert iterator adds new elements to the container, thereby ensuring that there is enough space. An ostream_iterator
writes to an output stream, again presenting no problem regardless of how many elements are written.
Algorithms that take either beg2
alone or beg2
and end2
use those iterators to denote a second input range. These algorithms typically use the elements from the second range in combination with the input range to perform a computation.
When an algorithm takes both beg2
and end2
, these iterators denote a second range. Such algorithms take two completely specified ranges: the input range denoted by [beg, end)
, and a second input range denoted by [beg2, end2)
.
Algorithms that take only beg2
(and not end2)
treat beg2
as the first element in a second input range. The end of this range is not specified. Instead, these algorithms assume that the range starting at beg2
is at least as large as the one denoted by beg, end
.
Algorithms that take
beg2
alone assume that the sequence beginning atbeg2
is as large as the range denoted bybeg
andend
.
Separate from the parameter conventions, the algorithms also conform to a set of naming and overload conventions. These conventions deal with how we supply an operation to use in place of the default <
or ==
operator and with whether the algorithm writes to its input sequence or to a separate destination.
Algorithms that take a predicate to use in place of the <
or ==
operator, and that do not take other arguments, typically are overloaded. One version of the function uses the element type’s operator to compare elements; the second takes an extra parameter that is a predicate to use in place of <
or ==
:
unique(beg, end); // uses the == operator to compare the elements
unique(beg, end, comp); // uses comp to compare the elements
Both calls reorder the given sequence by removing adjacent duplicated elements. The first uses the element type’s ==
operator to check for duplicates; the second calls comp
to decide whether two elements are equal. Because the two versions of the function differ as to the number of arguments, there is no possible ambiguity (§ 6.4, p. 233) as to which function is being called.
_if
VersionsAlgorithms that take an element value typically have a second named (not overloaded) version that takes a predicate (§ 10.3.1, p. 386) in place of the value. The algorithms that take a predicate have the suffix _if
appended:
find(beg, end, val); // find the first instance of val in the input range
find_if(beg, end, pred); // find the first instance for which pred is true
These algorithms both find the first instance of a specific element in the input range. The find
algorithm looks for a specific value; the find_if
algorithm looks for a value for which pred
returns a nonzero value.
These algorithms provide a named version rather than an overloaded one because both versions of the algorithm take the same number of arguments. Overloading ambiguities would therefore be possible, albeit rare. To avoid any possible ambiguities, the library provides separate named versions for these algorithms.
By default, algorithms that rearrange elements write the rearranged elements back into the given input range. These algorithms provide a second version that writes to a specified output destination. As we’ve seen, algorithms that write to a destination append _copy
to their names (§ 10.2.2, p. 383):
reverse(beg, end); // reverse the elements in the input range
reverse_copy(beg, end, dest);// copy elements in reverse order into dest
Some algorithms provide both _copy
and _if
versions. These versions take a destination iterator and a predicate:
// removes the odd elements from v1
remove_if(v1.begin(), v1.end(),
[](int i) { return i % 2; });
// copies only the even elements from v1 into v2; v1 is unchanged
remove_copy_if(v1.begin(), v1.end(), back_inserter(v2),
[](int i) { return i % 2; });
Both calls use a lambda (§ 10.3.2, p. 388) to determine whether an element is odd. In the first case, we remove the odd elements from the input sequence itself. In the second, we copy the non-odd (aka even) elements from the input range into v2
.
Exercises Section 10.5.3
Exercise 10.41: Based only on the algorithm and argument names, describe the operation that the each of the following library algorithms performs:
replace(beg, end, old_val, new_val);
replace_if(beg, end, pred, new_val);
replace_copy(beg, end, dest, old_val, new_val);
replace_copy_if(beg, end, dest, pred, new_val);