CLR via C# Summary: Chapter 4, Type Fundamentals

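This chapter covers:
1. The basics of type conversion
2. The difference between is and as
3. Namespaces and how they relate to assemblies
4. How the thread stack and the heap are allocated at run time, and how an object instance finds the right method to call (static, virtual, nonvirtual) when types are related, mainly by inheritance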

Type conversion

All types derive from System.Object (as in Java), so every object shares a minimum set of behaviors defined by System.Object.
The CLR requires every object to be created with the new operator. Here is what new does:
1. Calculates the number of bytes required by all instance fields defined in the type and in all of its base types, up to and including System.Object. Every object also needs two additional overhead members (the type object pointer and the sync block index), which the CLR uses to manage the object.
2. Allocates memory for the object and sets all of its bytes to zero.
3. Initializes the object’s type object pointer and sync block index members.
4. Calls the type’s instance constructor. Each constructor calls its base class’s constructor first, so the chain runs all the way up to System.Object’s constructor.
One of the most important features of the CLR is type safety. You can always discover the exact type of an object by calling GetType(). Because this method is nonvirtual, a type cannot override it, so an object can never spoof its type.
Conversion to a base type is always safe, so it can be done implicitly, while conversion to a derived type requires an explicit cast; omitting the cast is a compile-time error (CTE). At run time, the CLR checks every cast to ensure that it is to the object’s actual type or to one of its base types, that is, that the types are compatible. If they are not compatible, the cast fails with a run-time error (RTE), specifically an InvalidCastException.
```
internal class B{}
internal class D : B{}

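Object o2 = new B();
B b1 = new B();
B b2 = new D();
D d1 = new D();      // declared so that b4 below compiles
B b3 = new Object(); // CTE: Object is not derived from B
B b4 = d1;           // OK: implicit conversion to a base type
D d3 = b2;           // CTE: conversion to a derived type needs a cast
D d5 = (D) b2;       // OK: b2 actually refers to a D
D d6 = (D) b1;       // RTE: b1 refers to a B, not a D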
```
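The run-time half of these rules can be checked with a small program. The sketch below reuses the B and D classes from the listing above; the CastDemo class and its TryCastToD helper are names invented for this illustration. It suggests that a cast succeeds or fails according to the object's actual type, not the declared type of the variable.

```csharp
using System;

internal class B { }
internal class D : B { }

public static class CastDemo
{
    // Hypothetical helper: returns true if the explicit cast to D succeeds.
    public static bool TryCastToD(object o)
    {
        try
        {
            D d = (D)o;   // the CLR checks the object's actual type here
            return true;
        }
        catch (InvalidCastException)
        {
            return false; // the object was not a D (or derived from D)
        }
    }

    public static void Main()
    {
        B b1 = new B();
        B b2 = new D();
        Console.WriteLine(TryCastToD(b2)); // True: b2 refers to a D
        Console.WriteLine(TryCastToD(b1)); // False: b1 refers to a B
    }
}
```

Both variables are declared as B, so the compiler accepts both casts; only the run-time check tells them apart.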
is: checks whether an object is compatible with a given type and evaluates to a Boolean. It never throws an exception; if the reference is null, the result is false. Typically, it is used as follows:
if (o is SomeClass) {
    SomeClass e = (SomeClass) o;
    // Use e within the remainder of the 'if' statement
}
In this case, however, the CLR checks the type twice: once for the is operator and once more for the cast inside the if statement. Because this programming paradigm is quite common, the redundant check is an avoidable performance cost.
as: solves this problem. If the object is compatible with the specified type, as returns a non-null reference to the same object; if not, it returns null. Like is, it never throws an exception, and the type is checked only once.
SomeClass e = o as SomeClass;
if (e != null) {
    // Use e within the 'if' statement
}
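A minimal runnable sketch of the as idiom follows. The SomeClass body, the IsAsDemo class, and both Describe methods are made up for illustration. The second method uses the is pattern form available since C# 7, which is not covered in the notes above; it tests the type and assigns the variable in one step.

```csharp
using System;

public class SomeClass
{
    public string Name = "some";
}

public static class IsAsDemo
{
    // Classic 'as' idiom: one type check, then a null test.
    public static string DescribeWithAs(object o)
    {
        SomeClass e = o as SomeClass;
        if (e != null)
        {
            return "SomeClass: " + e.Name;
        }
        return "not a SomeClass";
    }

    // C# 7+ pattern form: 'is' tests the type and assigns e together.
    public static string DescribeWithPattern(object o)
    {
        if (o is SomeClass e)
        {
            return "SomeClass: " + e.Name;
        }
        return "not a SomeClass";
    }

    public static void Main()
    {
        Console.WriteLine(DescribeWithAs(new SomeClass())); // SomeClass: some
        Console.WriteLine(DescribeWithAs("a string"));      // not a SomeClass
        Console.WriteLine(DescribeWithPattern(null));       // not a SomeClass
    }
}
```

Neither form throws for an incompatible or null reference; both simply take the fallback branch.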

Namespaces and Assemblies

To the developer, namespaces allow for the logical grouping of related types, which makes it easier to locate a particular type. To the compiler, a namespace is simply an easy way of making a type’s name longer, and therefore more likely to be unique, by preceding the name with some symbols separated by dots. This reduces name collisions to some degree but does not rule them out.
The C# using directive instructs the compiler to try prepending different prefixes to a type name until a match is found.
The CLR doesn’t know anything about namespaces. When accessing a type, the CLR needs to know the full name of the type and which assembly contains the definition of the type, so that the runtime can load the proper assembly, find the type, and manipulate it.

Instead of typing the full name every time, we can also create an alias for a single type or namespace. In the example below, assume that both the Microsoft and Wintellect namespaces define a type named Widget, so an unqualified reference to Widget would be ambiguous.

using Microsoft; // Try prepending "Microsoft."
using Wintellect; // Try prepending "Wintellect."

// Define WintellectWidget symbol as an alias to Wintellect.Widget
using WintellectWidget = Wintellect.Widget;
public sealed class Program {
    public static void Main() {
        WintellectWidget w = new WintellectWidget(); // No error now
    }
}
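The same idea in a self-contained form, as a sketch: the VendorA and VendorB namespaces, their Widget classes, and the AliasDemo class are hypothetical stand-ins, defined in one file so that the example compiles on its own.

```csharp
using System;
using VendorA;   // try prepending "VendorA."
using VendorB;   // try prepending "VendorB."

// Alias: BWidget always means VendorB.Widget.
using BWidget = VendorB.Widget;

namespace VendorA
{
    public class Widget { public string Origin() { return "VendorA"; } }
}

namespace VendorB
{
    public class Widget { public string Origin() { return "VendorB"; } }
}

public static class AliasDemo
{
    public static string Create()
    {
        // Widget w = new Widget(); // would not compile: ambiguous reference (CS0104)
        BWidget w = new BWidget();  // the alias removes the ambiguity
        return w.Origin();
    }

    public static void Main()
    {
        Console.WriteLine(Create()); // VendorB
    }
}
```

Writing VendorB.Widget in full would work equally well; the alias only saves typing when the name is used often.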

Create a namespace
```
namespace CompanyName{
    public class A{} // TypeDef: CompanyName.A

    namespace X{
        public class B{} // TypeDef: CompanyName.X.B
    }
}
```
The comments above indicate the real name of each type that the compiler emits into the type definition metadata table; this is the real name of the type from the CLR’s perspective.
Namespaces and assemblies are not necessarily related. That is to say, the various types belonging to a single namespace might be implemented in multiple assemblies, and a single assembly can define types from different namespaces.
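Both points can be observed through reflection. The sketch below reuses the CompanyName namespace from the listing above; the NameDemo class is an invented name. It prints the full name the CLR sees, then the assemblies that hold two types whose names both start with System. The assembly names depend on the runtime, so no output is claimed for those two lines.

```csharp
using System;

namespace CompanyName
{
    public class A { }

    namespace X
    {
        public class B { }
    }
}

public static class NameDemo
{
    public static void Main()
    {
        Type b = typeof(CompanyName.X.B);
        Console.WriteLine(b.FullName);   // CompanyName.X.B
        Console.WriteLine(b.Namespace);  // CompanyName.X

        // Two types from the System namespace; the assemblies that define
        // them vary by runtime and are not necessarily the same one.
        Console.WriteLine(typeof(string).Assembly.GetName().Name);
        Console.WriteLine(typeof(Uri).Assembly.GetName().Name);
    }
}
```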

How Things Relate at Run Time

When a thread is created, it is allocated an X-MB stack. This stack space is used for passing arguments to a method and for the local variables defined within a method.
All but the simplest methods contain some prologue code, which initializes a method before it starts doing its work, and epilogue code, which cleans up after the method’s work is done so that it can return to its caller.
The return address indicates where the called method should return to in the calling method. The CPU’s instruction pointer is set to this return address when the called method’s work is done.

An example: the relationship between source code, IL, and JITted code

A Windows process starts, the CLR is loaded into it, the managed heap is initialized, and a thread is created.
As the just-in-time (JIT) compiler converts method M3’s Intermediate Language (IL) code into native CPU instructions, it notices all the types referred to inside M3, and the CLR ensures that the assemblies that define these types are loaded. Then, using each assembly’s metadata, the CLR extracts information about these types and creates data structures to represent the types themselves. (The data structures for Employee and Manager, shown in the figure above, are called type objects.)
The contents of a type object:
- A type object pointer and a sync block index, the two overhead members that every object on the heap contains.
- The bytes that back the type’s static data fields, which are allocated within the type object itself.
- A method table with one entry per method defined within the type.
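After the required type objects are created and the code for M3 is compiled, the CLR allows the thread to execute M3’s native code. The prologue code executes to allocate the local variables, which are initialized to null or 0; no default constructor is called. The C# compiler also complains if you attempt to read a local variable that you have not explicitly initialized in source code. More specifically, reading an unassigned local variable inside a method is a compile-time error, whereas an unassigned class-level field of a reference type defaults to null, so dereferencing it throws a NullReferenceException at run time.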
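The difference between locals and fields can be seen in a short sketch. The Holder and DefaultsDemo classes are hypothetical; the commented-out lines show the code that the compiler would reject.

```csharp
using System;

public class Holder
{
    public string Text;   // reference-type field: defaults to null
    public int Count;     // value-type field: defaults to 0
}

public static class DefaultsDemo
{
    // Returns true if dereferencing the never-assigned field throws.
    public static bool ThrowsOnNullField()
    {
        Holder h = new Holder();
        try
        {
            int len = h.Text.Length; // Text is null, so this throws
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // int x;
        // Console.WriteLine(x);  // would not compile: CS0165, unassigned local

        Holder h = new Holder();
        Console.WriteLine(h.Count);             // 0
        Console.WriteLine(h.Text == null);      // True
        Console.WriteLine(ThrowsOnNullField()); // True
    }
}
```

The compiler may warn that the fields are never assigned; that warning is expected here, since leaving them unassigned is the point of the example.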
Then, M3 executes its code to construct a Manager object. The object has a type object pointer (initialized to refer to the object’s corresponding type object), a sync block index, and the bytes necessary to hold all of the instance data fields defined by the Manager type and by any of its base classes. Mind that all of these are allocated on the heap, NOT on the stack. The new operator returns the memory address of the Manager object, which is saved in the variable e on the thread’s stack.
For a static method: the JIT compiler locates the type object for the type that defines the method, then the entry in its method table that refers to the static method being called, JITs the method (if necessary), and calls the JITted code. In the example, the static method is Lookup and its return value is stored back into e, so at this point nothing refers to the first Manager object any more and it is a prime candidate for garbage collection.
For a nonvirtual instance method: the JIT compiler locates the type object that corresponds to the declared type of the variable used to make the call (not necessarily the type that the object’s type object pointer refers to). If that type defines the method, it is called; if not, the JIT compiler walks the class hierarchy toward System.Object until it finds the type that defines the method, and calls that.
For a virtual instance method: the JIT compiler produces additional code that runs on every call. This code follows the object’s type object pointer to the type object for the object’s actual type (for example, if the object returned by Lookup is a Manager, the pointer refers to the Manager type object), then locates the method table entry for the method and calls it. If the actual type does not override the method, the nearest base class implementation that does is called. Note that hiding a name is not overriding: a method that merely reuses the name without override is never reached through a base-type variable.

// demo:)

class A{
    public int fuck(){return 1;}            // nonvirtual
    public virtual int you(){return 10;}    // virtual
}

class B : A{
    public new int fuck(){return 2;}        // hides A.fuck (nonvirtual)
    public override int you(){return 20;}   // overrides A.you
}

class C : B{
    public new int you(){return 200;}       // hides B.you, does NOT override it
}

class Program{
    static void Main(string[] args){
        A a = new A();
        a.fuck(); // 1
        a.you(); // 10
        B b = new B();
        b.fuck(); // 2
        b.you(); // 20
        C c = new C();
        c.fuck(); // 2, inherited from B
        c.you(); // 200, the variable's type is C, so the hiding method is called
        A d = new C();
        d.fuck(); // 1, nonvirtual: chosen by the variable's type A
        d.you(); // 20, NOT 200!!! Name hiding is NOT overriding
        A e = new B();
        e.fuck(); // 1
        e.you(); // 20
    }
}

What is a type object

Every object’s type object pointer refers to the type object for the object’s type. The Employee and Manager type objects themselves contain type object pointer members, because type objects are actually objects themselves. Their type is System.Type, so their type object pointers refer to System.Type’s type object. As for the type object pointer of System.Type’s type object, it simply refers to itself, as the picture above shows. By the way, System.Object.GetType() simply returns the address stored in the specified object’s type object pointer member, that is, a pointer to the object’s type object.
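This chain can be approximated from managed code through System.Type references, which act as the managed view of the type objects described here. The sketch is only an illustration of the model: the empty Employee and Manager classes and the TypeObjectDemo class are placeholders.

```csharp
using System;

public class Employee { }
public class Manager : Employee { }

public static class TypeObjectDemo
{
    public static void Main()
    {
        Employee e = new Manager();

        // GetType reports the actual type, not the variable's declared type.
        Type t = e.GetType();
        Console.WriteLine(t == typeof(Manager));                               // True

        // Every Manager instance shares the same Type object.
        Console.WriteLine(object.ReferenceEquals(t, new Manager().GetType())); // True

        // A Type object is itself an object whose type derives from System.Type.
        Type typeOfType = t.GetType();
        Console.WriteLine(typeof(Type).IsAssignableFrom(typeOfType));          // True

        // Following the chain one more step leads back to the same object.
        Console.WriteLine(object.ReferenceEquals(typeOfType, typeOfType.GetType())); // True
    }
}
```

The last line mirrors the self-reference described above: asking the type of the type object's type yields the same object again.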