Strings and Character Operations in C/C++

2022-08-22 22:39:19

This post summarizes how to work with strings in C/C++. I am still learning, so it is updated from time to time.

Reading strings from input

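1. A single word can be read directly with std::cin >> s. The input operator reads and discards all leading whitespace (spaces, newlines, tabs), then reads characters until it meets whitespace again, at which point the read stops. So one >> reads only one word; several words can of course be read by applying >> several times.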

2. To read a whole line containing several words, use std::getline(std::cin, s), as in the following code:

#include <iostream>
#include <string>
int main()
{
    std::string line; // empty string
    // read one line at a time until end-of-file
    while (std::getline(std::cin, line))
    {
        std::cout << line << std::endl; // write line to the output
    }
    return 0;
}

Name: getline

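This function takes two arguments: an input stream object and a string object. getline reads the next line from the input stream and stores what it read, not including the newline, in the string. Unlike the input operator, getline does not skip a newline at the start of a line. As soon as getline meets a newline, even if it is the first character of the input, it stops reading and returns. If the first character is a newline, the string argument is set to the empty string.

Because getline discards the newline when it returns, the newline is never stored in the string object.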

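The C library also has a function named getline; it is a separate function from std::getline described above:

Prototype: ssize_t getline (char **lineptr, size_t *n, FILE *stream)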
Description:
This function reads an entire line from stream, storing the text (including the newline and a terminating null character) in a buffer and storing the buffer address in *lineptr.
Before calling getline, you should place in *lineptr the address of a buffer *n bytes long, allocated with malloc. If this buffer is long enough to hold the line, getline stores the line in this buffer. Otherwise, getline makes the buffer bigger using realloc, storing the new buffer address back in *lineptr and the increased size back in *n.
If you set *lineptr to a null pointer, and *n to zero, before the call, then getline allocates the initial buffer for you by calling malloc.
In either case, when getline returns, *lineptr is a char * which points to the text of the line.
When getline is successful, it returns the number of characters read (including the newline, but not including the terminating null). This value enables you to distinguish null characters that are part of the line from the null character inserted as a terminator.
This function is a GNU extension (it has since been standardized in POSIX.1-2008), but it is the recommended way to read lines from a stream. The alternative standard functions are unreliable.
If an error occurs or end of file is reached without any bytes read, getline returns -1.
Header files: stdio.h

Operations on string objects

s.empty()

Returns true if s is empty; otherwise returns false.

s.size()

Returns the number of characters in s.

s[n]

Returns the character at position n in s; positions start at 0.

[Note: 1. Subscripting outside the valid range is undefined behavior, and no error is reported! 2. The index has the unsigned type string::size_type.]

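#include <iostream>
#include <string>

int main()
{
    std::string s = "hello world";
    std::cout << s << std::endl;
    // overwrite every character; size_type is the index type
    for (std::string::size_type ix = 0; ix != s.size(); ++ix)
        s[ix] = '*';
    std::cout << "Now s is:" << s << std::endl;
    // s[100] is deliberately out of range: undefined behavior, yet no error is reported
    std::cout << "s's len is:" << s.size() << ", s[100]=" << s[100] << std::endl;
    return 0;
}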

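Note: the loop declares std::string::size_type ix = 0; use string's own size_type for indexing. An int may be too small to hold the length of a string, so the library defines size_type (an unsigned integer type) to stay portable across machines and avoid overflow (which is not the same thing as running off the end with a subscript). Any variable that stores the result of size() should have type string::size_type. In particular, do not assign the return value of size to an int variable.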

s1 + s2

Returns a new string equal to the concatenation of s1 and s2.

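[Note: + can be chained, much as in Python: string s3 = s1 + ", " + s2 + "\n";. When string objects and string literals are mixed, at least one operand of each + must be a string (think about how the chain is evaluated from left to right and this makes sense). In other words: 1. at least one of the first two operands in a chain must be a string. 2. Two string literals cannot be added directly. A string literal and a string are different types: a literal is a const char array ending in '\0', whereas a string's size() does not count a terminating '\0'. (Updated 2014.06.24)]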

s1 = s2

Replaces the contents of s1 with a copy of s2.

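[Note: conceptually, the memory held by s1 is released, s1 is then given enough memory to hold a copy of s2, and finally all the characters of s2 are copied into that memory. An implementation may reuse s1's existing storage when it is already large enough.]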

v1 == v2

Compares the contents of v1 and v2; returns true if they are equal, false otherwise.

!=, <, <=, >, and >=

Have their normal meanings.

cctype Functions

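We often need to process the individual characters of a string, for example to find out whether a particular character is whitespace, a letter, or a digit. The character functions listed below work on the characters of a string (or on any other char value). They are all defined in the cctype header.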

isalnum(c)

true if c is a letter or a digit.

isalpha(c)

true if c is a letter.

iscntrl(c)

true if c is a control character.

isdigit(c)

true if c is a digit.

isgraph(c)

true if c is not a space but is printable.

islower(c)

true if c is a lowercase letter.

isprint(c)

true if c is a printable character.

[Note: printable characters are those that occupy a printing position, including the space character.]

ispunct(c)

true if c is a punctuation character.

[Note: a punctuation character is any printable character that is not a digit, a letter, or (printable) whitespace such as the space.]

isspace(c)

true if c is whitespace.

[Note: whitespace is any one of space, tab, vertical tab, carriage return, newline, and form feed.]

isupper(c)

true if c is an uppercase letter.

isxdigit(c)

true if c is a hexadecimal digit.

tolower(c)

If c is an uppercase letter, returns its lowercase equivalent; otherwise returns c unchanged.

toupper(c)

If c is a lowercase letter, returns its uppercase equivalent; otherwise returns c unchanged.

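[Note: ctype.h is a header defined by the C standard library, and cctype simply makes use of those C library functions. C standard library headers are named name.h, while the C++ versions are named cname: the .h suffix is dropped and a c is put in front of the name to show that the header comes from the C library. So cctype has the same contents as ctype.h, only in a form better suited to C++ programs. In particular, the names declared in the cname headers are defined inside namespace std, whereas those in the .h versions are not. C++ programs should normally use the cname version rather than the name.h version, so that standard library names are consistently found in namespace std. Using the .h versions puts a burden on programmers, who then have to remember which library names were inherited from C and which are unique to C++.]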

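String utilities

I plan to write a set of string helper functions with the same interface as Python's, and will start on it when my next project begins. If such a library already exists, please leave a comment to let me know. Thanks. To be continued...

Example

[This section was updated on 2014.06.24]

The original exercise is in another post: Using the vector container in C/C++ http://blog.csdn.net/zhanh1218/article/details/33323111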

#include <cctype>
#include <iostream>
#include <string>
#include <vector>
using std::cin; using std::cout; using std::endl; using std::string; using std::vector;

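string deal_word(string word)
{
    // uses C++11 auto and the range-based for statement
    string result;
    for (auto c : word)
    {
        // Erasing from word inside a range-based for invalidates the loop's
        // iterators, so build a new string instead.
        if (not std::ispunct(static_cast<unsigned char>(c)))
        {
            result += static_cast<char>(std::toupper(static_cast<unsigned char>(c))); // append the non-punctuation character
        }
    }
    return result;
}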

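string deal_word2(string word)
{
    // uses subscripts and C++11 decltype
    for (decltype(word.size()) index = 0; index != word.size(); ++index)
    {
        if (not std::ispunct(static_cast<unsigned char>(word[index])))
        {
            word[index] = static_cast<char>(std::toupper(static_cast<unsigned char>(word[index])));
        }
        else
        {
            word.erase(index, 1); // erase the single character at this position (a punctuation mark)
            // Step back so the character that moved into this slot is not skipped. Important!
            // At index 0 the unsigned value wraps around and ++index brings it back to 0.
            index -= 1;
        }
    }
    return word;
}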

int main()
{
    string word; // holds the word just read
    vector<string> text; // empty vector
    cout << "Please input the text:" << endl; // prompt for input
    // INPUTOVER marks the end of input; end-of-file (Ctrl+Z on Windows, Ctrl+D on Unix) also stops it
    while (cin >> word and word != "INPUTOVER")
    {
        word = deal_word(word); // process the word
        text.push_back(word); // append word to text
    }
    for (vector<string>::size_type ix = 0, j = 0; ix != text.size(); ++ix, ++j)
    {
        if (j == 8) // eight words per line
        {
            cout << endl; // start a new line
            j = 0; // restart the count
        }
        cout << text[ix] << " "; // separate words with a space
    }
    return 0;
}

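The word processing is written in two ways, both using new C++11 features. The second, subscript-based version removes punctuation at any position in the word and is the more generally applicable of the two.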

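Other

The str.erase() method (from Baidu):

1. erase(pos, n); removes n characters starting at pos; for example, erase(0,1) removes the first character
2. erase(position); removes the single character at position (position is a string iterator)
3. erase(first, last); removes the characters in the range from first up to, but not including, last (first and last are both iterators)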

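This article is an original work by @The_Third_Wave (blog: http://blog.csdn.net/zhanh1218). It is updated from time to time; corrections are welcome.

If this post looks incomplete when you read it, that is because I first publish only half of it to deter crawlers; please see the original author's blog.

If this post helped you, then for the sake of a healthy web, please bookmark it rather than repost it. If you must repost it, include this footer and the address of this article.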
