c++ - Setting extra bits in a bool makes it true and false at the same time

2024-03-06 22:56:56

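If I take a bool variable and set its second bit to 1, the variable evaluates to true and false at the same time. Compile the following code with gcc 6.3 and the -g option (gcc-v6.3.0/Linux/RHEL6.0-2016-x86_64/bin/g++ -g main.cpp -o mytest_d) and run the executable. You get the following output.

How can T be equal to true and false at the same time?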

       value   bits 
       -----   ---- 
    T:   1     0001
after bit change
    T:   3     0011
T is true
T is false

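This can happen when you call a function written in a different language (say Fortran) where true and false are defined differently than in C++. For Fortran, the value is true if any bit is non-zero and false if all bits are zero.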

#include <iostream>
#include <bitset>

using namespace std;

void set_bits_to_1(void* val){
  char *x = static_cast<char *>(val);

  for (int i = 0; i<2; i++ ){
    *x |= (1UL << i);
  }
}

int main(int argc,char *argv[])
{

  bool T = 3;

  cout <<"       value   bits " <<endl;
  cout <<"       -----   ---- " <<endl;
  cout <<"    T:   "<< T <<"     "<< bitset<4>(T)<<endl;

  set_bits_to_1(&T);


  bitset<4> bit_T = bitset<4>(T);
  cout <<"after bit change"<<endl;
  cout <<"    T:   "<< T <<"     "<< bit_T<<endl;

  if (T ){
    cout <<"T is true" <<endl;
  }

  if ( T == false){
    cout <<"T is false" <<endl;
  }


}

///////////////////////////////////
// Fortran function that is not compatible with C++ when compiled with ifort.

       logical*1 function return_true()
         implicit none

         return_true = 1;

       end function return_true

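Solution:

In C++ the bit representation (and even the size) of a bool is implementation-defined; generally it is implemented as a char-sized type taking 1 or 0 as possible values.

If you set its value to anything other than the allowed ones (in this specific case by aliasing a bool through a char and modifying its bit representation), you are breaking the rules of the language, so anything can happen. In particular, the standard explicitly says that a "broken" bool may behave as both true and false (or as neither true nor false):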

Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false

(C++11, [basic.fundamental], note 47)

In this particular case you can see how it ended up in this bizarre situation: the first if gets compiled to

    movzx   eax, BYTE PTR [rbp-33]
    test    al, al
    je      .L22

which loads T into eax (with zero extension) and skips the print if it is all zero; the next if instead is

    movzx   eax, BYTE PTR [rbp-33]
    xor     eax, 1
    test    al, al
    je      .L23

The test if (T == false) is transformed into if (T ^ 1), which flips only the low bit. That would be fine for a valid bool, but for your "broken" one it doesn't cut it.
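To make the two branches concrete without invoking undefined behavior, here is a minimal sketch that replays both tests on a plain byte instead of a bool. The function names are invented for illustration, and the logic only mirrors the unoptimized assembly quoted above; other compilers or optimization levels may emit something different.

```cpp
#include <cstdint>

// Mirrors `test al, al; je`: the body of `if (T)` runs when the byte is non-zero.
constexpr bool taken_if_true(std::uint8_t raw) {
    return raw != 0;
}

// Mirrors `xor eax, 1; test al, al; je`: the body of `if (T == false)` runs
// when flipping the low bit leaves a non-zero byte.
constexpr bool taken_if_equals_false(std::uint8_t raw) {
    return static_cast<std::uint8_t>(raw ^ 1u) != 0;
}
```

With a raw byte of 0 or 1 exactly one of the two functions returns true. With a raw byte of 3 both return true, which matches the "T is true" / "T is false" output in the question.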

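Note that this weird sequence is generated only at low optimization levels; at higher levels this generally boils down to a zero/non-zero check, and a sequence like yours is likely to become a single test/conditional branch. You will get bizarre behavior in other contexts anyway, for example when summing bool values with other integers: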

int foo(bool b, int i) {
    return i + b;
}

becomes

foo(bool, int):
        movzx   edi, dil
        lea     eax, [rdi+rsi]
        ret

where dil is "trusted" to be 0/1.
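As a rough illustration of what that trust means, the sketch below contrasts adding the raw byte as-is (what the generated code effectively does) with adding a normalized 0/1. Both helper names are hypothetical, and a plain byte stands in for the bool so that the example stays within defined behavior.

```cpp
#include <cstdint>

// What the generated code effectively computes: the raw byte is
// zero-extended and added unchanged.
constexpr int add_trusting(std::uint8_t raw, int i) {
    return i + raw;
}

// What the sum would be if the byte were first reduced to 0 or 1.
constexpr int add_normalized(std::uint8_t raw, int i) {
    return i + (raw != 0 ? 1 : 0);
}
```

For a valid representation (0 or 1) the two agree; for a byte of 3 the trusting version is off by two.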

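If your program is all C++, the solution is simple: don't break bool values this way, avoid messing with their bit representation, and everything will go well. In particular, even if you assign from an integer to a bool, the compiler emits the code needed to make sure the resulting value is a valid bool, so your bool T = 3 is indeed safe, and T ends up holding true in its guts.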
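A small sketch of that guarantee: converting an integer to bool and back always yields 0 or 1, whatever the original integer was. The helper names are made up for this example.

```cpp
// Integer -> bool: zero becomes false, anything else becomes true.
constexpr bool to_bool(int value) {
    return static_cast<bool>(value);
}

// bool -> int: false becomes 0 and true becomes 1, so the round trip
// can only ever produce 0 or 1.
constexpr int round_trip(int value) {
    return static_cast<int>(to_bool(value));
}
```

So round_trip(3) is 1, just like the first line of output in the question, where T prints as 1 right after bool T = 3.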

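If instead you need to interoperate with code written in other languages that may not share the same idea of what a bool is, just avoid bool in the "boundary" code and marshal the value as an appropriately sized integer. It will work in conditionals & co. just as well.

Update on the Fortran/interoperability side of the issue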
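One possible shape for such boundary code is sketched below. The helper names are invented, and the "non-zero means true" rule is an assumption about the foreign side; if the other compiler uses a different convention, the test in from_foreign_logical has to change accordingly.

```cpp
// Hypothetical boundary helpers. The foreign function itself would be
// declared as taking/returning a plain integer type, never bool.

// Incoming value: decide truth explicitly instead of reinterpreting the byte.
constexpr bool from_foreign_logical(signed char raw) {
    return raw != 0;  // assumption: the foreign side treats non-zero as true
}

// Outgoing value: always hand over exactly 1 or 0.
constexpr signed char to_foreign_logical(bool value) {
    return value ? 1 : 0;
}
```

Inside the C++ program everything past these helpers is an ordinary, valid bool, so the byte 3 from the question would simply become true.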

Disclaimer: all I know of Fortran is what I read this morning in standard documents, and that I have some punched cards with Fortran listings that I use as bookmarks, so go easy on me.

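First of all, this kind of language interoperability isn't part of the language standards but of the platform ABI. As we are talking about Linux x86-64, the relevant document is the System V x86-64 ABI.

Nowhere does it specify that the C _Bool type (which is defined to be the same as C++ bool in 3.1.2 note †) has any kind of compatibility with Fortran LOGICAL; in particular, 9.2.2 table 9.2 specifies that "plain" LOGICAL is mapped to signed int. About TYPE*N types it says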

The “TYPE*N” notation specifies that variables or aggregate members of type TYPE shall occupy N bytes of storage.

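(ibid.)

There is no equivalent type explicitly specified for LOGICAL*1, and that's understandable: it isn't even standard. Indeed, if you try to compile a Fortran program containing a LOGICAL*1 in Fortran 95 compliant mode you get warnings about it, both from ifort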

./example.f90(2): warning #6916: Fortran 95 does not allow this length specification.   [1]

    logical*1, intent(in) :: x

------------^

and from gfort

./example.f90:2:13:
     logical*1, intent(in) :: x
             1
Error: GNU Extension: Nonstandard type declaration LOGICAL*1 at (1)

so the waters are already muddied; combining the two rules above, I'd go for signed char to be safe.

However, the ABI also specifies:

The values for type LOGICAL are .TRUE. implemented as 1 and .FALSE.
implemented as 0.

So, if you have a program that stores anything besides 1 and 0 in a LOGICAL value, you are already out of spec on the Fortran side! You say:

A fortran logical*1 has same representation as bool, but in fortran if bits are 00000011 it is true, in C++ it is undefined.

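This last statement is not true: the Fortran standard is representation-agnostic, and the ABI explicitly says the contrary. Indeed, you can easily see this in action by checking the output of gfort for LOGICAL comparison: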

integer function logical_compare(x, y)
    logical, intent(in) :: x
    logical, intent(in) :: y
    if (x .eqv. y) then
        logical_compare = 12
    else
        logical_compare = 24
    end if
end function logical_compare

becomes

logical_compare_:
        mov     eax, DWORD PTR [rsi]
        mov     edx, 24
        cmp     DWORD PTR [rdi], eax
        mov     eax, 12
        cmovne  eax, edx
        ret

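You will notice a straight cmp between the two values, with no normalization first (unlike ifort, which is more conservative in this regard).

Even more interesting: regardless of what the ABI says, ifort by default uses a nonstandard representation for LOGICAL. This is explained in the documentation of the -fpscomp logicals switch, which also specifies some interesting details about LOGICAL and cross-language compatibility: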
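The difference between a direct comparison and a normalizing one can be sketched in C++ as follows. Both function names are hypothetical; the first mirrors the gfort output above, while the second is only an illustration of what "normalizing first" could mean, using a "non-zero is true" rule as an assumption. It is not taken from actual ifort output.

```cpp
#include <cstdint>

// Like the gfort code above: compare the two stored words directly.
constexpr bool eqv_raw(std::int32_t x, std::int32_t y) {
    return x == y;
}

// A more conservative variant: reduce both sides to a truth value first.
// Assumption for illustration: non-zero means true.
constexpr bool eqv_normalized(std::int32_t x, std::int32_t y) {
    return (x != 0) == (y != 0);
}
```

With the ABI-conforming values 0 and 1 the two agree. With stored values 1 and 3, eqv_raw reports them as different even though both would count as true under the non-zero rule.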

Specifies that integers with a non-zero value are treated as true, integers with a zero value are treated as false. The literal constant .TRUE. has an integer value of 1, and the literal constant .FALSE. has an integer value of 0. This representation is used by Intel Fortran releases before Version 8.0 and by Fortran PowerStation.

The default is fpscomp nologicals, which specifies that odd integer values (low bit one) are treated as true and even integer values (low bit zero) are treated as false.

The literal constant .TRUE. has an integer value of -1, and the literal constant .FALSE. has an integer value of 0. This representation is used by Compaq Visual Fortran. The internal representation of LOGICAL values is not specified by the Fortran standard. Programs which use integer values in LOGICAL contexts, or which pass LOGICAL values to procedures written in other languages, are non-portable and may not execute correctly. Intel recommends that you avoid coding practices that depend on the internal representation of LOGICAL values.

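(emphasis added)

Now, the internal representation of a LOGICAL normally shouldn't be a problem since, from what I gather, you won't notice it if you play by the rules and don't cross language boundaries. For a standard-compliant program there is no "straight conversion" between INTEGER and LOGICAL; the only ways I see to shove an INTEGER into a LOGICAL seem to be TRANSFER, which is intrinsically non-portable and gives no real guarantees, or the non-standard INTEGER <-> LOGICAL conversion on assignment.

The latter is documented by gfort to always result in non-zero -> .TRUE., zero -> .FALSE., and you can see that in all cases code is generated to make this happen (even though it's convoluted code in the case of ifort with the legacy representation), so it seems you cannot shove an arbitrary integer into a LOGICAL this way.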
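The two truth rules described in the quoted documentation can be written out as a short C++ sketch. The function names are invented here; only the rules themselves come from the quoted text.

```cpp
// "fpscomp logicals": any non-zero integer is treated as true.
constexpr bool is_true_nonzero(int raw) {
    return raw != 0;
}

// "fpscomp nologicals" (the documented default): odd values, i.e. those
// with the low bit set, are treated as true.
constexpr bool is_true_low_bit(int raw) {
    return (raw & 1) != 0;
}
```

The rules agree on 0, 1, and -1 but disagree on even non-zero values such as 2, which is true under the first rule and false under the second. That is one more reason not to rely on any particular bit pattern across the language boundary.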

logical*1 function integer_to_logical(x)
    integer, intent(in) :: x
    integer_to_logical = x
    return
end function integer_to_logical

integer_to_logical_:
        mov     eax, DWORD PTR [rdi]
        test    eax, eax
        setne   al
        ret

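The reverse conversion for a LOGICAL*1 is a straight integer zero-extension (gfort), so, to honor the contract in the documentation linked above, it clearly expects the LOGICAL value to be 0 or 1.

But in general, the situation for these conversions is a bit of a mess, so I'd just stay away from them.

So, long story short: avoid putting INTEGER data into LOGICAL values, as it is bad even in Fortran, and make sure to use the correct compiler flag to get the ABI-compliant representation for booleans; interoperability with C/C++ should then be fine. To be extra safe, though, I'd just use plain char on the C++ side.

Finally, from what I gather from the documentation, ifort has some built-in support for interoperability with C, including booleans; you may try to leverage it.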
