[C++ Notes] C++ and C-style strings

C-style string

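Let's start with the ancestor, C.

A C-style string is just an array of characters, with a null character (\0) at the end to mark where the string stops.

For example:

char a[] = "Hello"; // same as char a[] = {'H', 'e', 'l', 'l', 'o', '\0'};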
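
To make the hidden terminator visible, here is a small sketch that compares sizeof (the size of the array, terminator included) with strlen() (the characters before the terminator). The helper names hello_array_size and hello_length are made up for this note.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: size in bytes of the array that holds "Hello".
std::size_t hello_array_size() {
    char a[] = "Hello";
    return sizeof(a);        // 6: five letters plus the terminating '\0'
}

// Hypothetical helper: length as seen by the C string functions.
std::size_t hello_length() {
    char a[] = "Hello";
    return std::strlen(a);   // 5: strlen stops at '\0' and does not count it
}
```

The one-byte difference between the two results is the \0 that the compiler appends to the string literal.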

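To work with these character arrays in C++, include the <cstring> header, which brings in the C string library.

Commonly used functions:

Function	Description
`strlen(s)`	Returns the length of the string (not counting `\0`)
`strcpy(dst, src)`	Copies a string
`strcmp(s1, s2)`	Compares two strings (returns 0 if equal, nonzero otherwise)
`strcat(dst, src)`	Concatenates strings

dst is short for destination; src is short for source.

strcpy(dst, src) copies src into dst.

strcat(dst, src) appends src to the end of dst.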

Example:

#include <iostream>
#include <cstring>
using namespace std;

int main() {

    char str1[50];
    char str2[] = "World";

    // 1. strcpy(): copy a string
    strcpy(str1, "Hello");
    cout << "str1 after strcpy: " << str1 << endl;

    // 2. strcat(): concatenate strings
    strcat(str1, " ");        // add a space
    strcat(str1, str2);       // append str2 to the end of str1
    cout << "str1 after strcat: " << str1 << endl;

    // 3. strlen(): measure the string length (not counting \0)
    cout << "Length of str1: " << strlen(str1) << endl;

    // 4. strcmp(): compare two strings
    if (strcmp(str1, "Hello World") == 0) {
        cout << "str1 is the same as \"Hello World\"" << endl;
    } else {
        cout << "str1 is different from \"Hello World\"" << endl;
    }

    return 0;
}

Output：

str1 after strcpy: Hello
str1 after strcat: Hello World
Length of str1: 11
str1 is the same as "Hello World"

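Strengths and weaknesses

Strengths:

Fast, with no extra layer of abstraction.
Fits in seamlessly with low-level memory operations.
Common in embedded systems and C interfaces.

Weaknesses:

Unsafe: careless memory management easily leads to buffer overflows.
Tedious to use compared with C++ style strings.
No dynamic growth: the capacity of the array is fixed once it is declared.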
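
One way to soften the buffer overflow problem, sketched here under the assumption that cutting off an over-long string is acceptable, is to always pass the destination size along. copy_bounded is a hypothetical helper, not something from <cstring>; it relies on std::snprintf, which never writes more bytes than the size it is given.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: copy src into a fixed-size buffer without overflowing it.
// Returns true if the whole string fit, false if it had to be truncated.
bool copy_bounded(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return false;
    // snprintf writes at most dst_size - 1 characters plus the '\0',
    // and returns the length the full string would have needed.
    int needed = std::snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < dst_size;
}
```

With a 6-byte buffer, copy_bounded(buf, sizeof(buf), "Hello World") stores "Hello" and returns false instead of writing past the end of buf.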

C++ style string

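A C++ style string is an object of the std::string class. Include the <string> header before using it.

To create a string:

#include <string>

std::string name = "LukeTseng";

You can access a single character with square brackets. Indices start at 0 as usual, so name[0] in the example above gives the character 'L'.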

#include <iostream>
#include <string>

using namespace std;

int main(){
    string name = "LukeTseng";
    cout << name[0];
    return 0;
}

Output：

L

A C++ style string can also be reassigned with =, which a C-style character array does not allow (there you would need strcpy() instead):

#include <iostream>
#include <string>

using namespace std;

int main(){
    string name = "LukeTseng";
    name = "王奕翔醜男"; // assign a completely new value (not about anyone in particular)
    cout << name;
    return 0;
}

Output：

王奕翔醜男

You can also change a single character:

#include <iostream>
#include <string>

using namespace std;

int main(){
    string name = "LukeTseng";
    name[1] = 'i';
    cout << name;
    return 0;
}

Output：

LikeTseng

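Related functions

Function	Description
`length()`	Returns the length of the string.
`swap(a, b)`	Swaps the contents of two strings.
`size()`	Returns the number of characters in the string (same as `length()`).
`resize()`	Resizes the string to the given number of characters.
`find()`	Searches for the string or character passed as argument.
`push_back(c)`	Appends the character c to the end of the string.
`pop_back()`	Removes the last character of the string.
`clear()`	Empties the string.
`strncmp(const char* str1, const char* str2, size_t count)`	Compares at most the first count characters of two C-style strings (from `<cstring>`).
`strncpy(char* dest, const char* src, size_t n)`	Like `strcpy()`, but copies at most n characters of src (from `<cstring>`).
`strrchr(char* str, int chr)`	Locates the last occurrence of a character in a C-style string (from `<cstring>`).
`strcat(dest, src)`	Appends a copy of the source string src to the end of the destination string dest (from `<cstring>`).
`replace()`	Replaces part of the string with new content.
`substr()`	Creates a substring of the given string.
`compare()`	Compares two strings and returns the result as an integer.
`erase()`	Removes part of the string.
`rfind()`	Finds the last occurrence of a substring.

Table source: https://www.geeksforgeeks.org/strings-in-cpp/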
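
The members at the top of the table do not get their own example in this note, so here is a minimal sketch that chains a few of them. edit_demo is a made-up name, and the values in the comments assume the calls run in exactly this order.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: applies several std::string members in sequence
// and packs the final state into one string for easy inspection.
std::string edit_demo() {
    std::string s = "Luke";
    s.push_back('!');          // "Luke!"
    s.pop_back();              // "Luke"  (pop_back takes no argument)
    s.resize(2);               // "Lu"    (shrinks to 2 characters)
    s.resize(4, '-');          // "Lu--"  (grows, padding with '-')
    std::string t = "Tseng";
    std::swap(s, t);           // s == "Tseng", t == "Lu--"
    t.clear();                 // t == "" and t.size() == 0
    return s + ":" + std::to_string(s.length()) + ":" + std::to_string(t.size());
}
```

Calling edit_demo() gives "Tseng:5:0": the swapped-in string, its length(), and the size() of the cleared string.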

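find() comes in several forms:

s.find(sub, pos);      // search for the substring sub, starting at pos
s.find(sub, pos, n);   // search for the first n characters of sub
s.find(c, pos);        // search for the character c

In the second form, sub must be a C-style string (const char*); it does not accept a std::string.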
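
As a rough illustration of the pos and n parameters, the sketch below wraps the first two forms in two helpers. find_from and find_prefix are hypothetical names chosen for this note.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: first occurrence of sub at or after index pos.
std::size_t find_from(const std::string& s, const std::string& sub, std::size_t pos) {
    return s.find(sub, pos);
}

// Hypothetical helper: searches for only the first n characters of the
// C-style string sub, starting at index pos.
std::size_t find_prefix(const std::string& s, const char* sub,
                        std::size_t pos, std::size_t n) {
    return s.find(sub, pos, n);
}
```

With s = "one two three two one", find_from(s, "two", 0) returns 4, while find_from(s, "two", 5) skips the first match and returns 14. find_prefix(s, "three!!", 0, 5) looks only for "three" and returns 8.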

find() example:

#include <iostream>
#include <string>

using namespace std;

int main() {
    string str = "The quick brown fox jumps over the lazy dog.";

    // search for the substring "fox"
    size_t pos = str.find("fox");

    if (pos != string::npos) {
        cout << "\"fox\" found at position: " << pos << endl;
    } else {
        cout << "\"fox\" not found." << endl;
    }

    // search for the character 'z'
    pos = str.find('z');
    if (pos != string::npos) {
        cout << "'z' found at position: " << pos << endl;
    } else {
        cout << "'z' not found." << endl;
    }

    // search for a substring that does not exist
    pos = str.find("cat");
    if (pos != string::npos) {
        cout << "\"cat\" found at position: " << pos << endl;
    } else {
        cout << "\"cat\" not found." << endl;
    }

    return 0;
}

Output：

"fox" found at position: 16
'z' found at position: 37
"cat" not found.

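string::npos is the value find() returns when nothing is found. Its type is size_t and its value is the largest that type can hold: 4294967295 when size_t is 32 bits, 18446744073709551615 when it is 64 bits.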
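
Since the numeric value of npos depends on the platform, it is safer to compare against the name than against a literal number. A small sketch, with npos_is_max and contains as made-up helper names:

```cpp
#include <cstddef>
#include <limits>
#include <string>

// npos is the largest value size_type can hold, whatever that is here.
bool npos_is_max() {
    return std::string::npos ==
           std::numeric_limits<std::string::size_type>::max();
}

// Hypothetical helper: true when sub occurs somewhere in s.
bool contains(const std::string& s, const std::string& sub) {
    // compare against npos, never against a hard-coded number
    return s.find(sub) != std::string::npos;
}
```

contains("lazy dog", "dog") is true and contains("lazy dog", "cat") is false, on 32-bit and 64-bit builds alike.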

Next, examples for strncmp(), strncpy(), strrchr() and strcat():

strncmp()

strncmp() return value: 0 (equal), <0 (str1 < str2), >0 (str1 > str2)

#include <iostream>
#include <cstring> // strncmp()
using namespace std;

int main() {
    const char* str1 = "abcdef";
    const char* str2 = "abcxyz";

    // compare the first 3 characters
    int result = strncmp(str1, str2, 3);

    if (result == 0) {
        cout << "The first 3 characters are the same" << endl;
    } else if (result < 0) {
        cout << "str1 is less than str2" << endl;
    } else {
        cout << "str1 is greater than str2" << endl;
    }

    return 0;
}

Output：

The first 3 characters are the same

strncpy(dest, src, n)

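If src is shorter than n, the rest of dest is filled with \0 until n characters have been written.

If src is n characters or longer, no \0 is appended, so you have to terminate dest yourself.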

#include <iostream>
#include <cstring>  // strncpy()
using namespace std;

int main() {
    const char* src = "Hello";
    char dest[10];

    // copy at most 10 characters; src has only 5, so the rest of dest is filled with \0
    strncpy(dest, src, 10);

    cout << "dest after copy: " << dest << endl;

    return 0;
}

Output：

dest after copy: Hello
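
The example above never hits the case where src is too long. A minimal sketch of the manual termination, assuming truncation is acceptable and with copy_truncated as a hypothetical helper name:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: strncpy that always leaves dest null-terminated.
// dest_size is the full size of the destination array and must be > 0.
void copy_truncated(char* dest, std::size_t dest_size, const char* src) {
    std::strncpy(dest, src, dest_size - 1);  // copy at most dest_size - 1 characters
    dest[dest_size - 1] = '\0';              // strncpy adds no '\0' when src is too long
}
```

With char d[4], copy_truncated(d, sizeof(d), "Hello") leaves "Hel" in d. Without the second line, d would hold 'H', 'e', 'l', 'l' and no terminator, and printing it would read past the array.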

strrchr(str, ch)

strrchr(str, ch) returns a pointer to the last occurrence of ch, or nullptr if ch is not found.
Pointer arithmetic then gives you the index of that position.

#include <iostream>
#include <cstring>  // strrchr()
using namespace std;

int main() {
    const char* str = "This is a sample sentence.";

    // find the last occurrence of the character 's'
    const char* result = strrchr(str, 's');

    if (result != nullptr) {
        cout << "Last 's' found at position: " << (result - str) << endl;
        cout << "String starting from there: " << result << endl;
    } else {
        cout << "Character 's' not found" << endl;
    }

    return 0;
}

Output：

Last 's' found at position: 17
String starting from there: sentence.

strcat(dest, src)

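strcat(dest, src): appends the string src to the end of dest.
dest must have enough room for the combined result; otherwise the behavior is undefined, and in practice this often shows up as a segmentation fault or silently corrupted memory.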

#include <iostream>
#include <cstring>  // strcat()
using namespace std;

int main() {
    char dest[50] = "Hello, ";
    const char* src = "world!";

    strcat(dest, src);

    cout << "Concatenated string: " << dest << endl;

    return 0;
}

Output：

Concatenated string: Hello, world!
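
Since strcat() itself never checks the size of dest, one option is to do the arithmetic before calling it. The sketch below assumes the caller knows the total size of the destination array; append_checked is a made-up helper name.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: append src to dest only if the result, including the
// '\0', fits in dest_size bytes. Returns false and leaves dest untouched otherwise.
bool append_checked(char* dest, std::size_t dest_size, const char* src) {
    std::size_t used = std::strlen(dest);
    std::size_t extra = std::strlen(src);
    if (used + extra + 1 > dest_size) return false;  // not enough room
    std::strcat(dest, src);
    return true;
}
```

With char dest[14] = "Hello, ", appending "world!" succeeds because the 13 characters plus the terminator fit exactly. A further append_checked(dest, sizeof(dest), "!") returns false rather than overflowing.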

Below are examples for replace(), substr(), compare(), erase() and rfind():

replace(pos, len, new_str)

str.replace(pos, len, new_str): replaces the len characters starting at position pos with new_str.

#include <iostream>
#include <string> // note: <string>, not <cstring>

using namespace std;

int main() {
    string str = "I love apples";
    
    // starting at position 7, replace 6 characters with "oranges"
    str.replace(7, 6, "oranges");

    cout << "String after replace: " << str << endl;
    return 0;
}

Output：

String after replace: I love oranges

substr(pos, len)

substr(pos, len): extracts a substring of length len starting at pos.
If len is omitted, the substring runs to the end of the string.

#include <iostream>
#include <string>
using namespace std;

int main() {
    string str = "Hello, world!";

    // extract 5 characters starting at position 7
    string sub = str.substr(7, 5);

    cout << "Extracted substring: " << sub << endl;
    return 0;
}

Output：

Extracted substring: world
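
The case where len is left out is not covered by the example above, so here is a short sketch of it. tail_from and take are hypothetical helper names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: everything from index pos to the end of s.
std::string tail_from(const std::string& s, std::size_t pos) {
    return s.substr(pos);        // len omitted: defaults to npos, "to the end"
}

// Hypothetical helper: up to len characters starting at pos.
std::string take(const std::string& s, std::size_t pos, std::size_t len) {
    return s.substr(pos, len);   // a len that runs past the end is clamped
}
```

Both tail_from("Hello, world!", 7) and take("Hello, world!", 7, 100) give "world!". A pos greater than the string's size, on the other hand, makes substr() throw std::out_of_range.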

a.compare(b)

a.compare(b):
- returns 0: a and b are equal
- returns <0: a is less than b (dictionary order)
- returns >0: a is greater than b
Useful for implementing custom sorting or searching.

#include <iostream>
#include <string>
using namespace std;

int main() {
    string a = "apple";
    string b = "banana";

    int result = a.compare(b);

    if (result == 0) {
        cout << "Strings are equal" << endl;
    } else if (result < 0) {
        cout << "a < b" << endl;
    } else {
        cout << "a > b" << endl;
    }

    return 0;
}

Output：

a < b
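
As one possible take on the custom sorting mentioned above, this sketch uses compare() inside a std::sort comparator to sort in reverse dictionary order. sort_descending is a made-up name, and the choice of descending order is only for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts words in descending dictionary order.
std::vector<std::string> sort_descending(std::vector<std::string> words) {
    std::sort(words.begin(), words.end(),
              [](const std::string& a, const std::string& b) {
                  return a.compare(b) > 0;   // a goes first when it is "greater"
              });
    return words;
}
```

Passing {"banana", "apple", "cherry"} returns {"cherry", "banana", "apple"}.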

erase(pos, len)

erase(pos, len): removes len characters starting at pos.
There are also iterator forms: erase(iterator) and erase(iterator_first, iterator_last).

#include <iostream>
#include <string>
using namespace std;

int main() {
    string str = "0123456789";

    // starting at position 3, remove 4 characters
    str.erase(3, 4);

    cout << "String after erase: " << str << endl;
    return 0;
}

Output：

String after erase: 012789
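
The two iterator forms could look like the sketch below. drop_first and strip_spaces are hypothetical helpers; the second one pairs erase(first, last) with std::remove from <algorithm>, a combination often called the erase-remove idiom.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: removes the first character via erase(iterator).
std::string drop_first(std::string s) {
    if (!s.empty()) s.erase(s.begin());
    return s;
}

// Hypothetical helper: removes every space from s.
std::string strip_spaces(std::string s) {
    // std::remove moves the kept characters to the front and returns the new
    // logical end; erase(first, last) then cuts off the leftover tail.
    s.erase(std::remove(s.begin(), s.end(), ' '), s.end());
    return s;
}
```

drop_first("0123") gives "123", and strip_spaces("a b  c") gives "abc".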

rfind(substring)

rfind(substring): returns the position of the last occurrence of the substring.
If it is not found, returns string::npos.

#include <iostream>
#include <string>
using namespace std;

int main() {
    string str = "one two three two one";

    // search for the last occurrence of "two"
    size_t pos = str.rfind("two");

    if (pos != string::npos) {
        cout << "Last occurrence of \"two\" is at position: " << pos << endl;
    } else {
        cout << "Substring not found" << endl;
    }

    return 0;
}

Output：

Last occurrence of "two" is at position: 14

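Summary

C-style string

The C-style string is the string representation carried over from C: a character array whose end is marked by the special terminator '\0' (the null character).

char a[] = "Hello"; // equivalent to {'H', 'e', 'l', 'l', 'o', '\0'}

Common functions

In C++, include them through the <cstring> header.

Function	Description
`strlen(s)`	Returns the length of the string (not counting `\0`)
`strcpy(dst, src)`	Copies the string src into dst
`strcmp(s1, s2)`	Compares two strings (returns 0 if equal)
`strcat(dst, src)`	Appends src to the end of dst

dst (destination): the target string.
src (source): the source string.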

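Strengths:

Fast, with no extra layer of abstraction.
Fits in seamlessly with low-level memory operations.
Common in embedded systems and C interfaces.

Weaknesses:

Unsafe: careless memory management easily leads to buffer overflows.
Tedious to use compared with C++ style strings.
No dynamic growth: the capacity of the array is fixed once it is declared.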

C++ style string

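The C++ style string is built around the std::string class and requires the <string> header.

It is more convenient than a C-style string and a better fit for modern C++.

Example:

#include <iostream>
#include <string>
using namespace std;

int main() {
    string name = "LukeTseng";
    cout << name[0]; // Output: L
    return 0;
}

You can also reassign the whole string: name = "abc";

And change a single character: name[1] = 'i';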

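Summary of `strncmp()`, `strncpy()`, `strrchr()`, `strcat()`

Function	Description
`strncmp()`	Compares the first n characters of two strings
`strncpy()`	Copies the first n characters of a string into another
`strrchr()`	Finds the last occurrence of a character
`strcat()`	Appends one string to the end of another

Summary of `replace()`, `substr()`, `compare()`, `erase()`, `rfind()`

Function	Description
`replace()`	Replaces part of a string
`substr()`	Extracts a substring
`compare()`	Compares two strings in dictionary order
`erase()`	Removes part of a string
`rfind()`	Finds the last occurrence of a substring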