C++ Best Practices | Code Style

2023-03-30 00:00:00 Tags: functions, code, initialization, naming, overloading

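This series is the Chinese edition of the open-source book C++ Best Practices^[1]. The book covers best practices for modern C++ projects from the perspectives of tooling, code style, safety, maintainability, portability, threading, performance, and correctness. This article is the second in the series.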

C++ Best Practices:

1. Tools

2. Code style (this article)

3. Safety

4. Maintainability

5. Portability and threading

6. Performance

7. Correctness and scripting

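Code Style

The most important thing about code style is consistency. The second most important is following a style that C++ programmers are used to reading.

C++ allows identifier names of arbitrary length, so there is no need to be terse when naming things. Use descriptive names, and be consistent in style.

CamelCase
snake_case

Both are very common naming conventions. snake_case has the advantage that it can also work with spell checkers when needed.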

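Establishing a Style Guideline

Whatever style guidelines you establish, be sure to implement a .clang-format file that specifies the style you expect. This does not help with naming, but maintaining a consistent style is particularly important for an open source project.

Many IDEs and editors support clang-format either built in or through an easily installed add-in.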

VSCode: Microsoft C/C++ extension for VS Code^[2]
CLion: ClangFormat as alternative formatter
VisualStudio: ClangFormat^[3]
Resharper++: Using Clang-Format^[4]
Vim

Format your C family code^[5]
vim-autoformat^[6]

XCode: ClangFormat-Xcode^[7]

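Common C++ Naming Conventions

Classes start with an upper-case letter: MyClass.
Functions and variables start with a lower-case letter: myMethod.
Constants are all upper case: const double PI=3.14159265358979323.

The C++ Standard Library (and other C++ libraries such as Boost^[8]) uses these guidelines:

Macro names use upper case with underscores: INT_MAX.
Template parameter names use camel case: InputIterator.
All other names use snake case: unordered_map.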

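Distinguish Private Object Data

Name private data with an m_ prefix to distinguish it from public data. m_ stands for "member" data.

Distinguish Function Parameters

The most important thing is consistency within your codebase; this is one way to help achieve it.

Name function parameters with a t_ prefix. t_ can be thought of as "the", but the meaning is arbitrary. The point is to distinguish function parameters from other variables in scope while following a consistent naming strategy.

Any prefix or postfix can be chosen for your team. The following example uses one such convention. The suggestion is contentious; for the discussion, see issue #11^[9].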

struct Size
{
  int width;
  int height;

  Size(int t_width, int t_height) : width(t_width), height(t_height) {}
};

// This version might make sense for thread safety or something,
// but more to the point, sometimes we need to hide data, sometimes we don't.
class PrivateSize
{
  public:
    int width() const { return m_width; }
    int height() const { return m_height; }
    PrivateSize(int t_width, int t_height) : m_width(t_width), m_height(t_height) {}

  private:
    int m_width;
    int m_height;
};

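Don't Name Anything Starting With an Underscore (_)

Names that start with _ risk colliding with names reserved for the compiler and the standard library: What are the rules about using an underscore in a C++ identifier?^[10]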

Well-Formed Example

class MyClass
{
public:
  MyClass(int t_data)
    : m_data(t_data)
  {
  }

  int getData() const
  {
    return m_data;
  }

private:
  int m_data;
};

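Enable Out-of-Source-Directory Builds

Make sure files generated by the build are stored in an output folder that is separate from the source folder.

Use `nullptr`

C++11 introduced nullptr, a special value denoting a null pointer. It should be used instead of 0 or NULL to indicate a null pointer.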
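
One place where the difference shows up is overload resolution. The following is a minimal sketch, assuming a hypothetical pair of overloads named describe (not part of the original text): the literal 0 is an int first, so it selects the int overload, while nullptr can only convert to a pointer type.

```cpp
#include <string>

// Hypothetical overload set, introduced only for this illustration.
inline std::string describe(int) { return "int"; }
inline std::string describe(const char *) { return "pointer"; }

// describe(0) picks the int overload, even if a null pointer was meant.
inline std::string describeZero() { return describe(0); }

// describe(nullptr) can only pick the pointer overload.
inline std::string describeNull() { return describe(nullptr); }
```

With NULL the result depends on how the implementation defines the macro, which is one more reason to prefer nullptr.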

Comments

Comment blocks should use //, not /* */. Using // makes it much easier to comment out a block of code while debugging.

// this function does something
int myFunc()
{
}

To comment out this function block during debugging, we might do:

/*
// this function does something
int myFunc()
{
}
*/

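This would not work if the function header comment used /* */, because /* */ comments do not nest.

Never Use `using namespace` in a Header File

This forces the namespace you are using to be pulled into the namespace of every file that includes the header. It pollutes the namespace and may lead to name collisions. Using a namespace in an implementation file is sufficient.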
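
As an illustration, a header can simply spell out qualified names. This is a sketch only; the header name text_utils.hpp, the namespace my_project, and the function joinWords are hypothetical and chosen for the example.

```cpp
// text_utils.hpp (hypothetical header, shown inline for illustration)
#include <string>
#include <vector>

namespace my_project {

// Names from std are written out in full, so including this header
// adds nothing to the includer's scope except my_project::joinWords.
inline std::string joinWords(const std::vector<std::string> &t_words)
{
  std::string result;
  for (const std::string &word : t_words) {
    if (!result.empty()) {
      result += ' ';
    }
    result += word;
  }
  return result;
}

} // namespace my_project
```

An implementation file that includes this header remains free to add its own using declarations without affecting anyone else.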

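Include Guards

Header files must contain a distinctly named include guard to avoid problems with including the same header multiple times and to prevent conflicts with headers from other projects.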

#ifndef MYPROJECT_MYCLASS_HPP
#define MYPROJECT_MYCLASS_HPP

namespace MyProject {
  class MyClass {
  };
}

#endif

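You may also consider using the #pragma once directive instead. It is a quasi-standard supported by many compilers, it is short, and it makes the intent clear.

`{}` Are Required for Blocks

Leaving them off can lead to semantic errors in the code.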

// Bad Idea
// This compiles and does what you want, but can lead to confusing
// errors if modifications are made in the future and close attention
// is not paid.
for (int i = 0; i < 15; ++i)
  std::cout << i << std::endl;

// Bad Idea
// The cout is not part of the loop in this case even though it appears to be.
int sum = 0;
for (int i = 0; i < 15; ++i)
  ++sum;
  std::cout << i << std::endl;


// Good Idea
// It's clear which statements are part of the loop (or if block, or whatever).
int sum = 0;
for (int i = 0; i < 15; ++i) {
  ++sum;
  std::cout << i << std::endl;
}

Keep Lines a Reasonable Length

// Bad Idea
// hard to follow
if (x && y && myFunctionThatReturnsBool() && caseNumber3 && (15 > 12 || 2 < 3)) {
}

// Good Idea
// Logical grouping, easier to read
if (x && y && myFunctionThatReturnsBool()
    && caseNumber3
    && (15 > 12 || 2 < 3)) {
}

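Many projects and coding standards have a soft guideline that lines should be shorter than 80 or 100 characters. Such code is generally easier to read, and it also makes it possible to show two files side by side on one screen without using a tiny font.

Use `""` for Including Local Files

...<> is reserved for system includes^[11].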

// Bad Idea. Requires extra -I directives to the compiler
// and goes against standards.
#include <string>
#include <includes/MyHeader.hpp>

// Worse Idea
// Requires potentially even more specific -I directives and
// makes code more difficult to package and distribute.
#include <string>
#include <MyHeader.hpp>


// Good Idea
// Requires no extra params and notifies the user that the file
// is a local file.
#include <string>
#include "MyHeader.hpp"

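Initialize Member Variables

...with the member initializer list.

For POD types, the performance of an initializer list is the same as manual initialization, but for other types there is a clear performance gain, as the examples below show.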

// Bad Idea
class MyClass
{
public:
  MyClass(int t_value)
  {
    m_value = t_value;
  }

private:
  int m_value;
};

// Bad Idea
// This leads to an additional constructor call for m_myOtherClass
// before the assignment.
class MyClass
{
public:
  MyClass(MyOtherClass t_myOtherClass)
  {
    m_myOtherClass = t_myOtherClass;
  }

private:
  MyOtherClass m_myOtherClass;
};

// Good Idea
// There is no performance gain here but the code is cleaner.
class MyClass
{
public:
  MyClass(int t_value)
    : m_value(t_value)
  {
  }

private:
  int m_value;
};

// Good Idea
// The default constructor for m_myOtherClass is never called here, so 
// there is a performance gain if MyOtherClass is not is_trivially_default_constructible. 
class MyClass
{
public:
  MyClass(MyOtherClass t_myOtherClass)
    : m_myOtherClass(t_myOtherClass)
  {
  }

private:
  MyOtherClass m_myOtherClass;
};

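In C++11 you can assign a default value to each member (using = or using {}).

Assigning Default Values With `=`

// ... //
private:
  int m_value = 0; // allowed
  unsigned m_value_2 = -1; // narrowing from signed to unsigned allowed
// ... //

This ensures that no constructor ever "forgets" to initialize a member object.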
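
To make the benefit concrete, here is a small sketch with a hypothetical class RetryPolicy (the class and its member names are assumptions made for this example). The second constructor sets only one member, yet the other member still ends up initialized because of its default value.

```cpp
class RetryPolicy
{
public:
  RetryPolicy() = default;

  // This constructor only sets the attempt count; m_delayMs still
  // receives its default value from the member declaration.
  explicit RetryPolicy(int t_maxAttempts)
    : m_maxAttempts(t_maxAttempts)
  {
  }

  int maxAttempts() const { return m_maxAttempts; }
  int delayMs() const { return m_delayMs; }

private:
  int m_maxAttempts = 3;
  int m_delayMs = 100;
};
```

A value given in the member initializer list takes precedence over the default in the declaration, so the defaults act as a safety net and not as a constraint.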

Assigning Default Values With Braces

Brace initialization does not allow narrowing conversions; they are rejected at compile time.

// Best Idea

// ... //
private:
  int m_value{ 0 }; // allowed
  unsigned m_value_2 { -1 }; // narrowing from signed to unsigned not allowed, leads to a compile time error
// ... //

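Prefer {} initialization over = unless you have a strong reason not to.

Forgetting to initialize a member is a source of undefined behavior bugs, which are often extremely hard to find.

If a member variable is not expected to change after initialization, mark it const.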

class MyClass
{
public:
  MyClass(int t_value)
    : m_value{t_value}
  {
  }

private:
  const int m_value{0};
};

Since a const member variable cannot be assigned a new value, a copy assignment operator may not make sense for such a class.
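
The effect can be checked at compile time with the standard type traits. The class FixedValue below is a hypothetical stand-in for the example above, written only to illustrate the point.

```cpp
#include <type_traits>

class FixedValue
{
public:
  explicit FixedValue(int t_value)
    : m_value{t_value}
  {
  }

  int value() const { return m_value; }

private:
  const int m_value{0};
};

// Copy construction still works, because it initializes a new object.
static_assert(std::is_copy_constructible<FixedValue>::value,
              "FixedValue can be copy constructed");

// Copy assignment is implicitly deleted, because m_value cannot be reassigned.
static_assert(!std::is_copy_assignable<FixedValue>::value,
              "FixedValue cannot be copy assigned");
```

If assignment is needed for such a type, one option is to drop const from the member and expose it only through a const accessor.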

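Always Use Namespaces

There is almost never a reason to declare an identifier in the global namespace. Instead, functions and classes should exist in an appropriately named namespace or in a class inside a namespace. Identifiers placed in the global namespace risk conflicting with identifiers from other libraries (mostly C libraries, which do not have namespaces).

Use the Correct Integer Type for Standard Library Features

The standard library generally uses std::size_t for anything related to size. The size of size_t is implementation defined.

In general, using auto will avoid most of these issues.

Make sure you use the correct integer types and stay consistent with the C++ standard library. Otherwise the code might not warn on the platform you are currently using, but it may well warn when you switch platforms.

Note that certain operations on unsigned values can cause integer underflow (wraparound). For example: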

std::vector<int> v1{2,3,4,5,6,7,8,9};
std::vector<int> v2{9,8,7,6,5,4,3,2,1};
const auto s1 = v1.size();
const auto s2 = v2.size();
const auto diff = s1 - s2; // diff underflows to a very large number
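
One way to avoid the wraparound is to convert both sizes to a signed type before subtracting. The helper below is a sketch; the name sizeDifference is hypothetical, and it assumes both sizes fit in std::ptrdiff_t.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: difference in element counts as a signed value.
// Assumes both sizes are small enough to be represented by std::ptrdiff_t.
inline std::ptrdiff_t sizeDifference(const std::vector<int> &t_lhs,
                                     const std::vector<int> &t_rhs)
{
  return static_cast<std::ptrdiff_t>(t_lhs.size())
         - static_cast<std::ptrdiff_t>(t_rhs.size());
}
```

For the two vectors above, this returns a negative difference where the unsigned subtraction would wrap around to a very large number.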

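Use `.hpp` and `.cpp` for Your File Extensions

Ultimately this is a matter of preference, but .hpp and .cpp are widely recognized by various editors and tools, so the choice is a pragmatic one. Specifically, Visual Studio only automatically recognizes .cpp and .cxx as C++ files, and Vim does not necessarily recognize .cc as a C++ file.

One particularly large project (OpenStudio^[12]) uses .hpp and .cpp for user-generated files and .hxx and .cxx for tool-generated files. Both are well recognized, and having the distinction is helpful.

Never Mix Tabs and Spaces

Some editors like to indent with a mixture of tabs and spaces by default. This makes the code hard to read for anyone not using exactly the same tab indentation settings. Configure your editor so this does not happen.

Never Put Code With Side Effects Inside an assert()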

assert(registerSomeThing()); // make sure that registerSomeThing() returns true

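The above code runs successfully in a debug build, but the call is removed in a release build, so the debug and release builds behave differently. This is because assert() is a macro that expands to nothing when NDEBUG is defined, as is typical in release mode.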
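
A common way around this is to perform the call unconditionally and assert only on its result. The sketch below uses a hypothetical stand-in for registerSomeThing() that counts its calls, plus a hypothetical setUp() function, purely to make the side effect observable.

```cpp
#include <cassert>

// Hypothetical call counter, used only to make the side effect visible.
inline int &registrationCount()
{
  static int count = 0;
  return count;
}

// Hypothetical stand-in for the registerSomeThing() call above.
inline bool registerSomeThing()
{
  ++registrationCount();
  return true;
}

inline void setUp()
{
  // The call with the side effect runs in every build configuration.
  const bool registered = registerSomeThing();
  // Only the check disappears when NDEBUG is defined.
  assert(registered);
  (void)registered; // avoids an unused-variable warning when assert is compiled out
}
```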

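Don't Be Afraid of Templates

Templates can help you stick to the DRY principle^[13]. Macros have issues such as not obeying namespaces, so prefer templates to macros wherever a template will do.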
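
The contrast can be sketched with a "maximum of two values" helper. Both names below, MY_PROJECT_MAX and my_project::maxOf, are hypothetical and exist only for this comparison.

```cpp
// Macro version: plain text substitution, so an argument can be
// evaluated twice, and the name ignores namespaces entirely.
#define MY_PROJECT_MAX(a, b) ((a) > (b) ? (a) : (b))

namespace my_project {

// Template version: each argument is evaluated exactly once, the name
// lives in a namespace, and both arguments must have the same type.
template <typename T>
constexpr T maxOf(T t_lhs, T t_rhs)
{
  return (t_lhs < t_rhs) ? t_rhs : t_lhs;
}

} // namespace my_project
```

Passing an expression with a side effect, such as i++, shows the difference: the macro increments i twice when the first argument is the larger one, while the function template increments it once.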

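Use Operator Overloads Judiciously

Operator overloading exists to enable expressive syntax, for example so that adding two big integers looks like a + b and not a.add(b). Another common example is std::string, where two strings are usually concatenated with string1 + string2.

However, it is easy to write unreadable expressions with too many or the wrong operator overloads. When overloading operators, follow the three basic rules described in this Stack Overflow article^[14].

Specifically, keep these things in mind:

Overloading operator=() when handling resources is a must. See the Rule of Zero section below.
For all other operators, only overload them when they are used in a context that is commonly connected to those operators. Typical scenarios are concatenating things with +, negating expressions that can be considered "true" or "false", and so on.
Always be aware of operator precedence^[15], and try to avoid unintuitive constructs.
Do not overload exotic operators such as ~ or % unless you are implementing a numeric type or following a well-recognized syntax in a specific domain.
Never overload `operator,()`^[16] (the comma operator).
Use non-member functions operator>>() and operator<<() when dealing with streams. For example, you can overload operator<<(std::ostream &, MyClass const &) to enable "writing" your class into a stream such as std::cout, std::fstream, or std::stringstream. The latter is often used to create a string representation of a value.
More common operators to overload are described in this article: What are the basic rules and idioms for operator overloading?^[17].

More tips on the implementation details of custom operators can be found here: C++ Operator Overloading Guidelines^[18].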
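
As a small sketch of these guidelines, the hypothetical Point type below (not from the original text) overloads + with its conventional meaning and provides a non-member operator<< for streams.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace my_project {

struct Point
{
  int x;
  int y;
};

// + has an obvious meaning for points: component-wise addition.
inline Point operator+(const Point &t_lhs, const Point &t_rhs)
{
  return Point{t_lhs.x + t_rhs.x, t_lhs.y + t_rhs.y};
}

// Non-member stream insertion, so any std::ostream can receive a Point.
inline std::ostream &operator<<(std::ostream &t_os, const Point &t_point)
{
  return t_os << '(' << t_point.x << ", " << t_point.y << ')';
}

// Builds a string representation through std::ostringstream.
inline std::string toString(const Point &t_point)
{
  std::ostringstream stream;
  stream << t_point;
  return stream.str();
}

} // namespace my_project
```

Because the operators are declared in the same namespace as Point, argument-dependent lookup finds them without any using directive at the call site.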

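Avoid Implicit Conversions

Single Parameter Constructors

Single parameter constructors can be applied at compile time to automatically convert between types. This is handy for things like std::string(const char *), but it should generally be avoided because it can add accidental runtime overhead.

Instead, mark single parameter constructors as explicit, which requires them to be called explicitly.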
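
A minimal sketch of the effect, using a hypothetical Meters wrapper and a hypothetical doubled() function: with an explicit constructor, a bare double is no longer accepted where Meters is expected.

```cpp
#include <type_traits>

class Meters
{
public:
  explicit Meters(double t_value)
    : m_value{t_value}
  {
  }

  double value() const { return m_value; }

private:
  double m_value;
};

inline double doubled(Meters t_length)
{
  return t_length.value() * 2.0;
}

// doubled(2.0) would not compile: there is no implicit double -> Meters conversion.
static_assert(!std::is_convertible<double, Meters>::value,
              "a double does not silently become Meters");
```

The caller has to write doubled(Meters{2.0}), which makes the conversion visible at the call site.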

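Conversion Operators

As with single parameter constructors, conversion operators can be called implicitly by the compiler and introduce unexpected overhead. They should also be marked explicit.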

//bad idea
struct S {
  operator int() {
    return 2;
  }
};

//good idea
struct S {
  explicit operator int() {
    return 2;
  }
};
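
With an explicit conversion operator, the caller has to request the conversion. The sketch below uses a hypothetical Ratio type and a hypothetical truncated() helper to show what the call site looks like.

```cpp
#include <type_traits>

// Hypothetical type following the "good idea" pattern above.
struct Ratio
{
  int numerator;
  int denominator;

  // Truncating conversion; the caller has to ask for it.
  explicit operator int() const { return numerator / denominator; }
};

// Copy-initializing an int from a Ratio would not compile,
// because the operator is explicit.
static_assert(!std::is_convertible<Ratio, int>::value,
              "no implicit conversion to int");

inline int truncated(const Ratio &t_ratio)
{
  return static_cast<int>(t_ratio); // the conversion is visible at the call site
}
```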

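Consider the Rule of Zero

The Rule of Zero states that you do not provide any of the functions that the compiler can provide (copy constructor, copy assignment operator, move constructor, move assignment operator, destructor) unless the class you are constructing implements some novel form of ownership.

The goal is to let the compiler provide the best versions, which are automatically maintained when more member variables are added.

This article provides the background and explains implementation techniques that cover almost all cases: C++'s Rule of Zero^[19].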
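
As a rough illustration, the hypothetical Document class below owns a string and a vector of strings. Because each member already manages its own resources, the class declares no destructor, copy operations, or move operations, and the compiler-generated ones do the right thing.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Every member manages its own resources, so the class declares
// no destructor, copy operations, or move operations of its own.
class Document
{
public:
  explicit Document(std::string t_title)
    : m_title{std::move(t_title)}
  {
  }

  void addLine(std::string t_line) { m_lines.push_back(std::move(t_line)); }
  std::size_t lineCount() const { return m_lines.size(); }
  const std::string &title() const { return m_title; }

private:
  std::string m_title;
  std::vector<std::string> m_lines;
};
```

Copying a Document makes an independent copy of the lines, and moving one transfers them, without any hand-written special member functions. Adding another member later requires no changes to copy or move logic.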

WeChat official account: DeepNoMind
