5.Primitive, Reference, and Value Types

2022-11-28 22:04:45

1.Programming Language Primitive Types

　　Primitive types: any data types that the compiler supports directly. Primitive types map directly to types existing in the Framework Class Library (FCL); in C#, for example, int maps to System.Int32.

　　　　Recommendation: use the FCL type names (such as Int32) and completely avoid the primitive type names (such as int).

　　The C# compiler supports patterns related to casting, literals, and operators:

　　1. Casting

　　C# allows implicit casts if the conversion is “safe” (no loss of data is possible), but requires explicit casts if the conversion is potentially “unsafe” (it could lose precision or magnitude).

　　Different compilers can generate different code to handle these cast operations.

　　2. Literals. A literal is considered to be an instance of the type itself, so instance methods can be called on it.

　　If an expression consists only of literals, the compiler can evaluate it at compile time, improving the application’s performance.

　　3. Operators. The compiler automatically knows how and in what order to interpret operators (such as +, -, *, /, %, &, ^, |, ==, !=, >, <, >=, <=, <<, >>, ~, !, ++, --, and so on).
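　　The casting and literal rules above can be sketched as follows. This is a minimal illustration, not code from the original notes; the class and method names (CastDemo, Widen, Narrow, Truncate, LiteralAsInstance) are hypothetical.

```csharp
using System;

public static class CastDemo
{
    // Int32 -> Int64 is "safe", so no cast syntax is required.
    public static Int64 Widen(Int32 i)
    {
        Int64 l = i;   // implicit cast
        return l;
    }

    // Int32 -> Byte can lose magnitude, so an explicit cast is required.
    // unchecked is spelled out so the result does not depend on compiler switches.
    public static Byte Narrow(Int32 i)
    {
        return unchecked((Byte)i);
    }

    // C# truncates when converting a floating-point value to an integer.
    public static Int32 Truncate(Double d)
    {
        return (Int32)d;
    }

    // A literal is an instance of its type, so instance methods can be called on it.
    public static String LiteralAsInstance()
    {
        return 123.ToString() + 456.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Widen(5));            // 5
        Console.WriteLine(Narrow(300));         // 44 (300 modulo 256)
        Console.WriteLine(Truncate(6.8));       // 6
        Console.WriteLine(LiteralAsInstance()); // 123456
    }
}
```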

2.Checked and Unchecked Primitive Type Operations

　　Many arithmetic operations on primitives could result in an overflow.

　　Overflow is usually undesirable and, if not detected, causes the application to behave in strange and unusual ways. Sometimes, however, overflow is not only acceptable but desired (for example, when calculating a hash value or a checksum).

　　Different languages handle overflows in different ways.

　　The CLR offers IL instructions that allow the compiler to choose the desired behavior: addition (add/add.ovf), subtraction (sub/sub.ovf), multiplication (mul/mul.ovf), and data conversion (conv/conv.ovf). The .ovf versions throw a System.OverflowException on overflow.

　　C# allows the programmer to decide how overflows should be handled. By default, overflow checking is turned off, which means that the compiler generates IL code by using the versions of the add, subtract, multiply, and conversion instructions that don’t include overflow checking. The code runs faster, but developers must be assured that overflows won’t occur or that their code is designed to anticipate these overflows.

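　　1. One way to get the C# compiler to control overflows is to use the /checked+ compiler switch. The compiler then emits the overflow-checking versions of the IL instructions, so an OverflowException is thrown when an overflow occurs.

　　2. Programmers can control overflow checking in specific regions of their code by using the checked and unchecked operators.

　　If the explicit cast is inside the checked operator, an overflowing conversion throws an OverflowException; if the cast is outside the checked operator, the conversion is not checked and the exception doesn’t occur.

　　3. C# also offers checked and unchecked statements. The statements cause all expressions within a block to be checked or unchecked,

　　which is equivalent to applying the corresponding operator to each expression in the block.

　　Calling a method within a checked operator or statement has no impact on that method.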
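　　The difference between placing the cast inside or outside the checked operator, and the statement forms, can be sketched as below. The class and method names are hypothetical, and the second method assumes the default compiler setting (/checked-).

```csharp
using System;

public static class CheckedDemo
{
    // Cast inside checked: the narrowing conversion to Byte is checked.
    public static Boolean CastInsideCheckedThrows()
    {
        Byte b = 100;
        try
        {
            b = checked((Byte)(b + 200));   // 300 does not fit in a Byte
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // Cast outside checked: only the Int32 addition is checked (and it does
    // not overflow). Assuming the default /checked- setting, the cast to Byte
    // silently truncates 300 to 44.
    public static Byte CastOutsideChecked()
    {
        Byte b = 100;
        b = (Byte)checked(b + 200);
        return b;
    }

    // checked statement: every expression in the block is checked.
    public static Boolean CheckedStatementThrows()
    {
        Int32 i = Int32.MaxValue;
        try
        {
            checked { i += 1; }
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // unchecked statement: overflow wraps around, as desired for checksums.
    public static Int32 WrappingSum(Int32 a, Int32 b)
    {
        unchecked { return a + b; }
    }

    public static void Main()
    {
        Console.WriteLine(CastInsideCheckedThrows());       // True
        Console.WriteLine(CastOutsideChecked());            // 44
        Console.WriteLine(CheckedStatementThrows());        // True
        Console.WriteLine(WrappingSum(Int32.MaxValue, 1));  // -2147483648
    }
}
```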

　　Recommendations for programmers:

　　1. Use signed data types (such as Int32 and Int64) instead of unsigned numeric types (such as UInt32 and UInt64) wherever possible.

　　This allows the compiler to detect more overflow/underflow errors.

　　Various parts of the class library (such as Array’s and String’s Length properties) are hard-coded to return signed values, so less casting is required as you move these values around in your code. Fewer casts make source code cleaner and easier to maintain.

　　In addition, unsigned numeric types are not CLS-compliant.

　　2. Explicitly use checked around blocks where an unwanted overflow might occur due to invalid input data,

　　such as when processing a request with data supplied by an end user or a client machine. You might want to catch OverflowException as well, so that your application can gracefully recover from these failures.

　　3. Explicitly use unchecked around blocks where an overflow is OK, such as when calculating a checksum.

　　4. For any code that doesn’t use checked or unchecked, the assumption is that you do want an exception to occur on overflow,

　　for example, when calculating something (such as prime numbers) where the inputs are known and overflows are bugs.

　　5. For debug builds of your application, turn on the compiler’s /checked+ switch; for release builds, use the compiler’s /checked- switch.

　　/checked+ switch: the application runs more slowly because the system checks for overflows in any code that you didn’t explicitly mark as checked or unchecked. If an exception occurs, you’ll easily detect it and be able to fix the bug in your code.

　　/checked- switch: the code runs faster and overflow exceptions won’t be generated.

　　If your application can tolerate the slight performance hit of always doing checked operations, then I recommend that you compile with the /checked+ command-line option even for a release build. This prevents your application from continuing to run with corrupted data and possible security holes (for example, after a calculation has overflowed).

　　The System.Decimal type is a very special type. Its operations always throw an OverflowException if they can’t be performed safely.

　　Although many programming languages (C# and Visual Basic included) consider Decimal a primitive type, the CLR does not.

　　The CLR doesn’t have IL instructions that know how to manipulate a Decimal value, so manipulating Decimal values is slower than manipulating CLR primitive values.

　　Because there are no IL instructions for manipulating Decimal values, the checked and unchecked operators, statements, and compiler switches have no effect on them.

　　The System.Numerics.BigInteger type is also special. Its operations never result in an OverflowException.

　　It internally uses an array of UInt32s to represent an arbitrarily large integer whose value has no upper or lower bound.

　　However, a BigInteger operation may throw an OutOfMemoryException if the value gets too large and there is insufficient available memory to resize the array.
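　　A small sketch of the two special types described above; the class and method names are hypothetical. It shows that unchecked has no effect on Decimal and that BigInteger simply grows past the Int64 range.

```csharp
using System;
using System.Numerics;

public static class SpecialNumericTypes
{
    // Decimal arithmetic is implemented by methods, not IL instructions,
    // so it throws OverflowException even inside an unchecked block.
    public static Boolean DecimalOverflowThrowsEvenWhenUnchecked()
    {
        Decimal d = Decimal.MaxValue;
        try
        {
            unchecked { d = d + 1m; }
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // BigInteger has no upper bound, so squaring Int64.MaxValue succeeds.
    public static BigInteger SquareOfInt64Max()
    {
        BigInteger big = Int64.MaxValue;
        return big * big;
    }

    public static void Main()
    {
        Console.WriteLine(DecimalOverflowThrowsEvenWhenUnchecked());  // True
        Console.WriteLine(SquareOfInt64Max() > Int64.MaxValue);       // True
    }
}
```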

3.Reference Types and Value Types

The CLR supports two kinds of types: reference types and value types.

Reference types:

　　Reference types are always allocated from the managed heap, and the C# new operator returns the memory address of the object; the memory address refers to the object’s bits. Keep the following performance considerations in mind:

　　1. The memory must be allocated from the managed heap.

　　2. Each object allocated on the heap has some additional overhead members associated with it that must be initialized.

　　3. The other bytes in the object (for the fields) are always set to zero.

　　4. Allocating an object from the managed heap could force a garbage collection to occur.

Value types:

　　Value type instances are usually allocated on a thread’s stack (although they can also be embedded as a field in a reference type object).

　　1. The variable representing the instance doesn’t contain a pointer to an instance; the variable contains the fields of the instance itself.

　　2. Value type instances don’t come under the control of the garbage collector, so their use reduces pressure on the managed heap and reduces the number of collections an application requires over its lifetime.

　　3. All value types must be derived from System.ValueType.

　　4. When looking up a type in the documentation, any type called a class is a reference type; any type called a structure or an enumeration is a value type.

　　The .NET Framework SDK documentation clearly indicates which types are reference types and which are value types.

　　All structures are immediately derived from the System.ValueType abstract type. System.ValueType is itself immediately derived from the System.Object type.

　　All enumerations are derived from the System.Enum abstract type, which is itself derived from System.ValueType. The CLR and all programming languages give enumerations special treatment.

　　5. You can’t choose a base type when defining your own value type, but a value type can implement one or more interfaces.

　　All value types are sealed, which prevents a value type from being used as a base type for any other reference type or value type.

　　6. The behavior of reference types and value types differs quite a bit: assigning a value type variable copies its fields, whereas assigning a reference type variable copies only the pointer.

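　　7. C# “thinks” that the instance is initialized if you use the new operator. Both of the following lines allocate the instance on the stack, but only the first is considered initialized:

　　SomeVal v1 = new SomeVal(); // Allocated on stack; fields are zeroed and considered initialized

　　SomeVal v1; // Allocated on stack; fields must be assigned before they are read

　　8. Consider how instances of your type are passed and returned:

　　By default, arguments are passed by value, which causes the fields in value type instances to be copied, hurting performance.

　　A method that returns a value type causes the fields in the instance to be copied into the memory allocated by the caller when the method returns, hurting performance.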

Declare a type as a value type only if all of the following statements are true:

　　1. The type acts as a primitive type: it is a fairly simple type that has no members that modify any of its instance fields, that is, the type is immutable. It is recommended that value types mark all their fields as readonly.

　　2. The type doesn’t need to inherit from any other type.

　　3. The type won’t have any other types derived from it.

In addition, because of the copying cost, one of the following statements must also be true:

　　4. Instances of the type are small (approximately 16 bytes or less).

　　5. Instances of the type are large (greater than 16 bytes) and are not passed as method parameters or returned from methods.

How value types and reference types differ:

　　1. Value type objects have two representations: an unboxed form and a boxed form (discussed in the next section).

　　Reference types are always in a boxed form.

　　2. Value types are derived from System.ValueType. This type offers the same methods as defined by System.Object.

　　However, System.ValueType overrides the Equals and GetHashCode methods so that they take the values of the instance’s fields into account. Due to performance issues with this default implementation, when defining your own value types, you should override and provide explicit implementations for the Equals and GetHashCode methods.

　　3. You can’t define a new value type or a new reference type by using a value type as a base class.

　　For that reason, you shouldn’t introduce any new virtual methods into a value type. No methods can be abstract, and all methods are implicitly sealed (can’t be overridden).

　　4. Reference type variables contain the memory address of objects in the heap. When a reference type variable is created, it is initialized to null, indicating that the variable doesn’t currently point to a valid object. Attempting to use a null reference type variable causes a NullReferenceException to be thrown.

　　Value type variables always contain a value of the underlying type, and all members of the value type are initialized to 0. The CLR does offer a special feature that adds the notion of nullability to a value type; this feature is called nullable types.

　　5. When you assign a variable to another variable: for value type variables, a field-by-field copy is made; for reference type variables, only the memory address is copied.

　　6. Two or more reference type variables can refer to a single object in the heap, so operations on one variable can affect the object referenced by the other.

　　Value type variables are distinct objects, so operations on one can’t affect another.

　　7. Because unboxed value types aren’t allocated on the heap, the storage allocated for them is freed as soon as the method that defines an instance of the type is no longer active, as opposed to waiting for a garbage collection.
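　　The copy behavior in points 5 and 6 can be sketched as follows. SomeVal appears in the notes above; SomeRef, the public x fields, and the CopySemantics class are assumptions made for this illustration.

```csharp
using System;

public class SomeRef { public Int32 x; }    // reference type
public struct SomeVal { public Int32 x; }   // value type

public static class CopySemantics
{
    public static Int32 CopyReference()
    {
        SomeRef r1 = new SomeRef();   // allocated in the heap
        r1.x = 5;
        SomeRef r2 = r1;              // copies only the reference
        r1.x = 8;                     // changes the single shared object
        return r2.x;                  // 8
    }

    public static Int32 CopyValue()
    {
        SomeVal v1 = new SomeVal();   // fields live in the variable itself
        v1.x = 5;
        SomeVal v2 = v1;              // field-by-field copy
        v1.x = 9;                     // changes v1 only
        return v2.x;                  // 5
    }

    public static void Main()
    {
        Console.WriteLine(CopyReference());   // 8
        Console.WriteLine(CopyValue());       // 5
    }
}
```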

4.Boxing and Unboxing Value Types

Value types are lighter weight than reference types because they are not allocated as objects in the managed heap, not garbage collected, and not referred to by pointers.

However, in many cases, you must get a reference to an instance of a value type.

Boxing: converting a value type to a reference type. Internally, the following happens:

　　1. Memory is allocated from the managed heap.

　　2. The value type’s fields are copied to the newly allocated heap memory.

　　3. The address of the object is returned.

The FCL now includes a new set of generic collection classes that make the non-generic collection classes obsolete.

　　For example, you should use the System.Collections.Generic.List<T> class instead of the System.Collections.ArrayList class.

　　1. The API has been cleaned up and improved.

　　2. The performance of the collection classes has been greatly improved as well.

　　3. One of the biggest improvements is that the generic collection classes allow you to work with collections of value types without requiring that items in the collection be boxed/unboxed.

　　Far fewer objects are created on the managed heap, which reduces the number of garbage collections required and improves performance.

　　4. You get compile-time type safety, and your source code is cleaner due to fewer casts.

Unboxing: converting a reference to a boxed instance back to a value type.

　　Point p = (Point) a[0];

　　1. Here, the reference in element 0 of a is unboxed, and all of the fields contained in the boxed Point object are then copied into the value type variable p, which is on the thread’s stack (the values of these fields are copied from the heap to the stack-based value type instance).

　　2. Unboxing is not the exact opposite of boxing.

　　The unboxing operation is much less costly than boxing.

　　Unboxing is really just the operation of obtaining a pointer to the raw value type (data fields) contained within an object. In effect, the pointer refers to the unboxed portion in the boxed instance.

　　Unlike boxing, unboxing itself doesn’t involve copying any bytes in memory; however, an unboxing operation is typically followed by a copy of the fields, as in the line above.

When a boxed value type instance is unboxed, the following checks are made:

　　1. If the variable containing the reference to the boxed value type instance is null, a NullReferenceException is thrown.

　　2. If the reference doesn’t refer to an object that is a boxed instance of the desired value type, an InvalidCastException is thrown.

Boxing and unboxing/copy operations hurt your application’s performance in terms of both speed and memory.
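　　The unboxing rules above can be sketched as below; the class and method names are hypothetical. Note that a boxed Int32 can only be unboxed to Int32, so converting it to another numeric type takes an unbox followed by a separate cast.

```csharp
using System;

public static class UnboxingDemo
{
    // Boxing copies the value; later changes to v do not affect the box.
    public static Int32 RoundTrip()
    {
        Int32 v = 5;
        Object o = v;       // box: heap allocation plus field copy
        v = 123;
        return (Int32)o;    // unbox, then copy the value back: 5
    }

    public static Boolean UnboxNullThrows()
    {
        Object o = null;
        try
        {
            _ = (Int32)o;   // null reference: nothing to unbox
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static Boolean UnboxWrongTypeThrows()
    {
        Object o = 5;       // a boxed Int32
        try
        {
            _ = (Int16)o;   // not a boxed Int16
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unbox to the exact type first, then convert.
    public static Int16 UnboxThenConvert()
    {
        Object o = 5;
        return (Int16)(Int32)o;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip());              // 5
        Console.WriteLine(UnboxNullThrows());        // True
        Console.WriteLine(UnboxWrongTypeThrows());   // True
        Console.WriteLine(UnboxThenConvert());       // 5
    }
}
```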

　　1. Boxing

　　Many compilers implicitly emit code to box objects, so it is not obvious when you write code that boxing is occurring.

　　If you are concerned about the performance of a particular algorithm, you can use a tool such as ILDasm.exe to view the IL code and see where the box IL instructions are.

　　The important thing is that you’ve done the best you could and have eliminated the boxing from your own code.

　　Many FCL methods (such as Console.WriteLine, listed below) offer overloaded versions for the sole purpose of reducing the number of boxing operations for the common value types.

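　　Example: concatenating an Int32 v, a string, and a boxed Int32 o that is explicitly cast back to Int32 boxes three times,

　　because the expression calls String’s static Concat method: public static String Concat(Object arg0, Object arg1, Object arg2);

　　Optimization: Console.WriteLine(v + ", " + o); // Displays "123, 5"

　　This version boxes twice.

　　It is more efficient: removing the cast saves two operations (an unbox and a box) and also saves memory (one fewer object allocated from the managed heap that must be garbage collected in the future).

　　A further rewrite boxes only once.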

　　public static void WriteLine(Boolean);

　　public static void WriteLine(Char);
　　public static void WriteLine(Char[]);
　　public static void WriteLine(Int32);
　　public static void WriteLine(UInt32);
　　public static void WriteLine(Int64);
　　public static void WriteLine(UInt64);
　　public static void WriteLine(Single);
　　public static void WriteLine(Double);
　　public static void WriteLine(Decimal);
　　public static void WriteLine(Object);
　　public static void WriteLine(String);

　　If you define your own value type, these FCL classes will not have overloads of these methods that accept your value type; there are also many value types already defined in the FCL for which overloads of these methods do not exist.

　　　　1. You should not call or define an overload that takes an Object: passing a value type instance as an Object causes boxing to occur, which adversely affects performance.

　　　　2. You should call or define generic methods instead (possibly constraining the type parameters to be value types). Generics give you a way to define a method that can take any kind of value type without having to box it.

　　If you know that your code will box the same value type instance repeatedly, box it manually once and reuse the reference: the code is smaller and faster.
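　　A minimal sketch of manual boxing, with hypothetical class and method names. The first method only shows that every conversion to Object produces a separate heap object; the second boxes once and passes the same reference three times.

```csharp
using System;

public static class ManualBoxingDemo
{
    // Each conversion of v to Object allocates its own box.
    public static Boolean EachBoxIsADistinctObject()
    {
        Int32 v = 5;
        Object first = v;    // box #1
        Object second = v;   // box #2
        return !Object.ReferenceEquals(first, second);
    }

    // Passing v three times to a method that takes Object parameters would
    // box three times; boxing manually and reusing o boxes once.
    public static String FormatWithOneBox()
    {
        Int32 v = 5;
        Object o = v;        // the only box
        return String.Format("{0}, {1}, {2}", o, o, o);
    }

    public static void Main()
    {
        Console.WriteLine(EachBoxIsADistinctObject());   // True
        Console.WriteLine(FormatWithOneBox());           // 5, 5, 5
    }
}
```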

　　2. Unboxed value types

　　Boxing usually happens because you have a value type instance and you want to pass it to a method that requires a reference type.

　　Unboxed value types are lighter weight than boxed instances for two reasons:

　　　　1. They are not allocated on the managed heap (they stay in their original location, such as the stack).

　　　　2. They don’t have the additional overhead members that every object on the heap has: a type object pointer and a sync block index.

　　Even though unboxed value types don’t have a type object pointer, you can still call virtual methods (such as Equals, GetHashCode, or ToString) inherited or overridden by the type.

　　If your value type overrides one of these virtual methods, then the CLR can invoke the method nonvirtually because value types are implicitly sealed and cannot have any types derived from them.

　　In addition, the value type instance being used to invoke the virtual method is not boxed. However, if your override of the virtual method calls into the base type's implementation of the method, then the value type instance does get boxed when calling the base type's implementation, so that a reference to a heap object gets passed to the this pointer into the base method.

　　Calling a nonvirtual inherited method (such as GetType or MemberwiseClone) always requires the value type to be boxed because these methods are defined by System.Object, so the methods expect the this argument to be a pointer that refers to an object on the heap.

　　Casting an unboxed instance of a value type to one of the type’s interfaces requires the instance to be boxed, because interface variables must always contain a reference to an object on the heap.

　　Consider a Point value type with a Change method that sets its m_x and m_y fields, and an Object variable o that refers to a boxed Point. Casting o to a Point unboxes o and copies the fields in the boxed Point to a temporary Point on the thread’s stack! The m_x and m_y fields of this temporary Point are changed to 3 and 3, but the boxed Point isn’t affected by this call to Change. Many developers do not expect this.

　　Some languages, such as C++/CLI, let you change the fields in a boxed value type, but C# does not. However, you can fool C# into allowing this by using an interface.

　　When an unboxed Point, p, is cast to an IChangeBoxedPoint, the cast causes the value in p to be boxed. Change is called on the boxed value, which does change its m_x and m_y fields to 4 and 4, but after Change returns, the boxed object is immediately ready to be garbage collected. So the fifth call to WriteLine displays (2, 2).

　　When o is cast to an IChangeBoxedPoint, no boxing is necessary because o is already a boxed Point. Then Change is called, which does change the boxed Point’s m_x and m_y fields. The interface method Change has allowed me to change the fields in a boxed Point object! Now, when WriteLine is called, it displays (5, 5) as expected.

　　In C#, this isn’t possible without using an interface method.
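　　The sequence described above can be reconstructed roughly as follows. This is a sketch built from the names in the text (Point, IChangeBoxedPoint, Change, m_x, m_y); the BoxedMutationDemo class and the results array are assumptions that stand in for the six WriteLine calls.

```csharp
using System;

public interface IChangeBoxedPoint
{
    void Change(Int32 x, Int32 y);
}

public struct Point : IChangeBoxedPoint
{
    private Int32 m_x, m_y;

    public Point(Int32 x, Int32 y) { m_x = x; m_y = y; }

    public void Change(Int32 x, Int32 y) { m_x = x; m_y = y; }

    public override String ToString()
    {
        return "(" + m_x.ToString() + ", " + m_y.ToString() + ")";
    }
}

public static class BoxedMutationDemo
{
    public static String[] Run()
    {
        String[] results = new String[6];

        Point p = new Point(1, 1);
        results[0] = p.ToString();             // (1, 1)

        p.Change(2, 2);                        // changes the unboxed p
        results[1] = p.ToString();             // (2, 2)

        Object o = p;                          // boxes a copy of p
        results[2] = o.ToString();             // (2, 2)

        ((Point)o).Change(3, 3);               // changes a temporary copy only
        results[3] = o.ToString();             // (2, 2)

        ((IChangeBoxedPoint)p).Change(4, 4);   // boxes p, changes the box, discards it
        results[4] = p.ToString();             // (2, 2)

        ((IChangeBoxedPoint)o).Change(5, 5);   // changes the existing boxed Point
        results[5] = o.ToString();             // (5, 5)

        return results;
    }

    public static void Main()
    {
        foreach (String line in Run()) Console.WriteLine(line);
    }
}
```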

　　Value types should be immutable: that is, they should not define any members that modify any of the type’s instance fields. In fact, I recommend that value types have their fields marked as readonly so that the compiler will issue errors should you accidentally write a method that attempts to modify a field.

　　Why should value types be immutable?

　　The unexpected behaviors shown in the previous example all occur when attempting to call a method that modifies the value type’s instance fields. If, after constructing a value type, you do call methods that modify its state, you will get confused when all of the boxing and unboxing/field copying occurs. If the value type is immutable, you will end up just copying the same state around, and you will not be surprised by any of the behaviors you see.

5.Object Equality and Identity

　　The System.Object type offers a virtual method named Equals, whose purpose is to return true if two objects contain the same value. In Object’s implementation, if the arguments refer to the same object, true is returned; if the arguments refer to different objects, Equals can’t be certain whether the objects contain the same values, and therefore false is returned.

　　In other words, the default implementation of Object’s Equals method really implements identity, not value equality.

　　Problem: because a type can override Object’s Equals method, Equals can no longer be called to test for identity. (When a type overrides Equals, the override should call its base class’s implementation of Equals unless it would be calling Object’s implementation.)

　　Solution: Object offers a static ReferenceEquals method. You should always call ReferenceEquals if you want to check for identity (whether two references point to the same object). You shouldn’t use the C# == operator (unless you cast both operands to Object first) because one of the operands’ types could overload the == operator, giving it semantics other than identity.

　　ValueType’s Equals method does not call Object’s implementation (base.Equals).

　　System.ValueType (the base class of all value types) does override Object’s Equals method and is correctly implemented to perform a value equality check (not an identity check).

　　Problem: ValueType’s Equals method uses reflection to accomplish this (it compares each field in the this object with the corresponding field in the obj object by calling the field’s Equals method).

　　Solution: because the CLR’s reflection mechanism is slow, when defining your own value type, you should override Equals and provide your own implementation to improve the performance of value equality comparisons that use instances of your type.

　　When overriding the Equals method, you should also:

　　1. Implement the System.IEquatable<T> interface’s Equals method. This generic interface allows you to define a type-safe Equals method. Usually, you implement the Equals method that takes an Object parameter to internally call the type-safe Equals method.

　　2. Overload the == and != operator methods. Usually, you implement these operator methods to internally call the type-safe Equals method.

　　If you think that instances of your type will be compared for the purposes of sorting, you’ll want your type to also implement System.IComparable’s CompareTo method and System.IComparable<T>’s type-safe CompareTo method. If you implement these methods, you’ll also want to overload the various comparison operator methods (<, <=, >, >=) and implement these methods internally to call the type-safe CompareTo method.
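　　The pattern above might look like the following for a small immutable value type. Money, its m_cents field, and EqualityDemo are hypothetical names chosen for this sketch.

```csharp
using System;

public struct Money : IEquatable<Money>
{
    private readonly Int64 m_cents;

    public Money(Int64 cents) { m_cents = cents; }

    // Type-safe Equals: no boxing and no reflection.
    public Boolean Equals(Money other)
    {
        return m_cents == other.m_cents;
    }

    // The Object overload forwards to the type-safe version.
    public override Boolean Equals(Object obj)
    {
        return obj is Money && Equals((Money)obj);
    }

    // Equal values must produce equal hash codes.
    public override Int32 GetHashCode()
    {
        return m_cents.GetHashCode();
    }

    // The operators also forward to the type-safe version.
    public static Boolean operator ==(Money a, Money b) { return a.Equals(b); }
    public static Boolean operator !=(Money a, Money b) { return !a.Equals(b); }
}

public static class EqualityDemo
{
    // Two separately boxed instances are equal by value but not identical.
    public static Boolean EqualButNotIdentical()
    {
        Object a = new Money(500);
        Object b = new Money(500);
        return a.Equals(b) && !Object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(new Money(500) == new Money(500));   // True
        Console.WriteLine(new Money(500) != new Money(600));   // True
        Console.WriteLine(EqualButNotIdentical());             // True
    }
}
```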

6.Object Hash Codes

　　Problem: if you define a type and override the Equals method, you should also override the GetHashCode method; otherwise, Microsoft’s C# compiler emits a warning.

　　Reason: a type that defines Equals must also define GetHashCode because the implementations of the System.Collections.Hashtable type, the System.Collections.Generic.Dictionary type, and some other collections require that any two objects that are equal must have the same hash code value.

　　So if you override Equals, you should override GetHashCode to ensure that the algorithm you use for calculating equality corresponds to the algorithm you use for calculating the object’s hash code.

　　Basically, when you add a key/value pair to a collection, a hash code for the key object is obtained first. The hash code determines where the pair is stored, and a later lookup uses the key’s hash code to find it again.

　　Using this algorithm of storing and looking up keys means that if you change a key object that is in a collection, the collection will no longer be able to find the object.

　　If you intend to change a key object in a hash table, you should remove the original key/value pair, modify the key object, and then add the new key/value pair back into the hash table.

　　System.Object’s implementation of the GetHashCode method doesn’t know anything about its derived type and any fields that are in the type. For this reason, Object’s GetHashCode method returns a number that is guaranteed not to change for the lifetime of the object.

　　You should never, ever persist hash code values, because:

　　1. Hash code values are subject to change.

　　2. A future version of a type might use a different algorithm for calculating the object’s hash code.
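　　The warning about changing a key while it is in a collection can be sketched as below. MutableKey and HashCodeDemo are hypothetical names, and the hash code is simply the Id field so that the effect is easy to follow.

```csharp
using System;
using System.Collections.Generic;

public sealed class MutableKey
{
    public Int32 Id;

    public MutableKey(Int32 id) { Id = id; }

    public override Boolean Equals(Object obj)
    {
        MutableKey other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    // Consistent with Equals: equal keys have equal hash codes.
    public override Int32 GetHashCode() { return Id; }
}

public static class HashCodeDemo
{
    // Changing the key in place changes its hash code, so the dictionary
    // can no longer find the entry it stored under the old hash code.
    public static Boolean LostAfterMutation()
    {
        Dictionary<MutableKey, String> table = new Dictionary<MutableKey, String>();
        MutableKey key = new MutableKey(1);
        table.Add(key, "value");
        key.Id = 2;
        return !table.ContainsKey(key);
    }

    // Remove the pair, modify the key, then add the pair back.
    public static Boolean FoundAfterRemoveModifyAdd()
    {
        Dictionary<MutableKey, String> table = new Dictionary<MutableKey, String>();
        MutableKey key = new MutableKey(1);
        table.Add(key, "value");
        table.Remove(key);
        key.Id = 2;
        table.Add(key, "value");
        return table.ContainsKey(key);
    }

    public static void Main()
    {
        Console.WriteLine(LostAfterMutation());           // True
        Console.WriteLine(FoundAfterRemoveModifyAdd());   // True
    }
}
```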

7.The dynamic Primitive Type

Type-safe programming language:

　　All expressions resolve into an instance of a type, and the compiler generates only code that attempts to perform an operation that is valid for this type.

　　Benefits:

　　1. Compared with a non-type-safe programming language, many programmer errors are detected at compile time, helping to ensure that the code is correct before you attempt to execute it.

　　2. Compile-time languages can typically produce smaller and faster code because they make more assumptions at compile time and bake those assumptions into the resulting IL and metadata.

　　Problem: there are also many occasions when a program has to act on information that it doesn’t know about until it is running, for example when:

　　1. Using reflection.

　　2. Communicating with components that are not implemented in C#.

　　Solution: to make this easier, the C# compiler offers you a way to mark an expression’s type as dynamic. The compiler then generates special IL code that describes the desired operation. This special code is referred to as the payload. At run time, the payload code determines the exact operation to execute based on the actual type of the object now referenced by the dynamic expression/variable.

　　For example, if a variable named value is dynamic and the code evaluates value + value, the C# compiler emits payload code that examines the actual type of value at run time and determines what the + operator should actually do.
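　　A minimal sketch of the + operator being resolved at run time; the DynamicPlusDemo class and Twice method are hypothetical names. With an Int32 argument the payload code performs integer addition, and with a String argument it performs concatenation.

```csharp
using System;

public static class DynamicPlusDemo
{
    // The meaning of + is decided at run time from the actual type of value.
    public static Object Twice(dynamic value)
    {
        return value + value;
    }

    public static void Main()
    {
        Console.WriteLine(Twice(5));     // 10
        Console.WriteLine(Twice("A"));   // AA
    }
}
```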

　　You can use dynamic when specifying generic type arguments to a generic class (reference type), a structure (value type), an interface, a delegate, or a method. When you do this, the compiler converts dynamic to Object and applies DynamicAttribute to the various pieces of metadata where it makes sense. Note that the generic code that you are using has already been compiled and will consider the type to be Object; no dynamic dispatch will be performed because the compiler did not produce any payload code in the generic code.

　　You cannot write methods whose signatures differ only by dynamic and Object.

　　Any expression can implicitly be cast to dynamic because all expressions result in a type that is derived from Object. Conversely, the compiler also lets you cast an expression from dynamic to another type by using implicit cast syntax; the cast is validated at run time to ensure that type safety is maintained, and an exception (such as InvalidCastException) is thrown if the object’s type is not compatible.

　　Declaring a local variable using var is just a syntactical shortcut that has the compiler infer the specific data type from an expression. The var keyword can be used only for declaring local variables inside a method, whereas the dynamic keyword can be used for local variables, fields, and arguments.

　　You must explicitly initialize a variable declared using var, whereas you do not have to initialize a variable declared with dynamic.

　　A dynamic expression is really the same type as System.Object. The compiler assumes that whatever operation you attempt on the expression is legal, so the compiler will not generate any warnings or errors. However, exceptions will be thrown at run time if you attempt to execute an invalid operation.

　　Simplification with dynamic:

　　Because excel.Cells[1, 1] is of type dynamic, you do not have to explicitly cast it to the Range type before its Value property can be accessed. Dynamification can greatly simplify code that interoperates with COM objects.

　　Another example is using reflection to call a method (“Contains”) on a String target (“Jeffrey Richter”), passing it a String argument (“ff”) and storing the Boolean result in a local variable (result).

　　With dynamic, the same call can be written with much simpler syntax.
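　　The two ways of making that call can be compared side by side in the sketch below. The target, argument, and result names come from the text above; the ContainsDemo class and its method names are assumptions.

```csharp
using System;
using System.Reflection;

public static class ContainsDemo
{
    // Explicit reflection: look up the method by name and argument types,
    // then invoke it with an argument array.
    public static Boolean ViaReflection()
    {
        Object target = "Jeffrey Richter";
        Object arg = "ff";

        Type[] argTypes = new Type[] { arg.GetType() };
        MethodInfo method = target.GetType().GetMethod("Contains", argTypes);

        Object[] arguments = new Object[] { arg };
        Boolean result = Convert.ToBoolean(method.Invoke(target, arguments));
        return result;
    }

    // dynamic: the runtime binder finds and calls Contains at run time.
    public static Boolean ViaDynamic()
    {
        dynamic target = "Jeffrey Richter";
        dynamic arg = "ff";
        Boolean result = target.Contains(arg);
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(ViaReflection());   // True
        Console.WriteLine(ViaDynamic());      // True
    }
}
```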

　　Different programming languages define their own runtime binders that encapsulate the rules of that language. The code for the C# runtime binder is in the Microsoft.CSharp.dll assembly, and you must reference this assembly when you build projects that use the dynamic keyword.

　　At run time, the Microsoft.CSharp.dll assembly will have to load into the AppDomain, which hurts your application’s performance and increases memory consumption. Microsoft.CSharp.dll also loads System.dll and System.Core.dll. If you are using dynamic to help you interoperate with COM components, then System.Dynamic.dll will also load.

　　Due to all the overhead associated with C#’s built-in dynamic evaluation feature, you should consciously decide that you are getting sufficient syntax simplification from the dynamic feature to make it worth the extra performance hit of loading all these assemblies and the extra memory that they consume.

　　If the type of the object being used in the dynamic expression does not implement the IDynamicMetaObjectProvider interface, then the C# compiler treats the object like an instance of an ordinary C#-defined type and performs operations on the object using reflection.

　　One of the limitations of dynamic is that you can only use it to access an object’s instance members, because the dynamic variable must refer to an object.
