
Tuesday, January 04, 2005

Byte Alignment in .NET and Structure Layouts :: JIT and .NET 2.0



What is the difference between the following structures?

struct BadValueType
{
    char c1;
    int i;
    char c2;
}

struct GoodValueType
{
    int i;
    char c1;
    char c2;
}

Nothing much, except that the fields are declared in a different order, right?

Fine. Now let's look at the sizes of these structures.

The size of the BadValueType structure in:
.NET Framework 1.x : Managed sizeof = 8 bytes, Marshal.SizeOf = 12 bytes
.NET Framework 2.0 : Managed sizeof = 12 bytes, Marshal.SizeOf = 12 bytes

The size of the GoodValueType structure in:
.NET Framework 1.x : Managed sizeof = 8 bytes, Marshal.SizeOf = 8 bytes
.NET Framework 2.0 : Managed sizeof = 8 bytes, Marshal.SizeOf = 8 bytes

[Note: Size of int=4, char=2]
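
If you want to check these numbers on your own machine, here is a minimal sketch. It assumes a current .NET runtime, because Unsafe.SizeOf<T>() and the generic Marshal.SizeOf<T>() did not exist in the 1.x/2.0 days. The SizeReport class and its method names are made up for this illustration.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

struct BadValueType
{
    public char c1;
    public int i;
    public char c2;
}

struct GoodValueType
{
    public int i;
    public char c1;
    public char c2;
}

// Hypothetical helper, not part of the framework.
static class SizeReport
{
    // Managed size: what the runtime reserves for the value type itself.
    public static int ManagedSize<T>() where T : struct => Unsafe.SizeOf<T>();

    // Marshaled size: what the type occupies once copied to unmanaged memory.
    public static int MarshaledSize<T>() where T : struct => Marshal.SizeOf<T>();

    // One-line summary of both measurements for a given struct type.
    public static string Describe<T>() where T : struct =>
        $"{typeof(T).Name}: managed = {ManagedSize<T>()} bytes, marshaled = {MarshaledSize<T>()} bytes";
}
```

Calling Console.WriteLine(SizeReport.Describe<BadValueType>()) on a runtime that honors Sequential layout should report 12 bytes for both measurements, and 8 bytes for GoodValueType.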

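The reason behind these differences is byte alignment. As with the default packing in unmanaged C++, integers are laid out on four-byte boundaries. So while the first character uses two bytes (a char in managed code is a Unicode character, thus occupying two bytes), the integer moves up to the next 4-byte boundary, leaving two pad bytes after the character, and the second character uses the subsequent 2 bytes. The total is then rounded up to a multiple of four, so that the integer stays aligned when the structures sit back to back in an array: 2 + 2 (padding) + 4 + 2 + 2 (padding) = 12 bytes, matching the Marshal.SizeOf figure above. GoodValueType needs no padding at all: 4 + 2 + 2 = 8 bytes.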

32 bit microprocessors typically organize memory as shown below.

Address   Byte 0   Byte 1   Byte 2   Byte 3
0x1000
0x1004    A0       A1       A2       A3
0x1008
0x100C             B0       B1       B2
0x1010    B3


Most processor architectures cannot read multi-byte data efficiently from arbitrary addresses.
A 32-bit read is inefficient, and on some architectures not permitted at all, if it starts at an address that is not divisible by four.

Memory is accessed by performing 32-bit bus cycles, and a 32-bit bus cycle can only be performed at an address that is divisible by 4. So, for efficiency, compilers insert so-called pad bytes to keep data aligned.

The reasons for not permitting misaligned long word reads and writes are not difficult to see. For example, an aligned long word A at address 0x1004 would be written as bytes A0, A1, A2 and A3, all in the same row of the table above.

Thus the microprocessor can read the complete long word in a single bus cycle. If the same microprocessor now attempts to access a long word at address 0x100D, it will have to read bytes B0, B1, B2 and B3. Notice that this read cannot be performed in a single 32-bit bus cycle. The microprocessor will have to issue two different reads, at addresses 0x100C and 0x1010, to read the complete long word. Thus it takes at least twice as long to read a misaligned long word.
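
To make the bus-cycle counting concrete, here is a small sketch of the model described above. It is only a toy: the BusModel class and the CyclesToRead method are names I invented, and the one-aligned-word-per-cycle bus is the simplifying assumption from this post, not a description of any particular processor.

```csharp
using System;

// Hypothetical model of a 32-bit bus that moves one aligned 4-byte word per cycle.
static class BusModel
{
    const uint WordSize = 4;

    // Counts how many aligned words (and therefore bus cycles) are touched
    // when reading `size` bytes starting at `address`.
    public static uint CyclesToRead(uint address, uint size)
    {
        if (size == 0) return 0;
        uint firstWord = address / WordSize;              // word holding the first byte
        uint lastWord = (address + size - 1) / WordSize;  // word holding the last byte
        return lastWord - firstWord + 1;
    }
}
```

With this model, BusModel.CyclesToRead(0x1004, 4) gives 1 for the aligned long word A, while BusModel.CyclesToRead(0x100D, 4) gives 2 for the misaligned long word B.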

The following byte padding rules will generally work with most 32-bit processors.

a. Single-byte numbers can be aligned at any address
b. Two-byte numbers should be aligned to a two-byte boundary
c. Four-byte numbers should be aligned to a four-byte boundary

This is the cause of the size difference between the two structures.
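
The three rules are easy to turn into a tiny layout calculator. The sketch below is my own illustration (the SequentialLayout class is hypothetical and is not how the CLR computes layouts internally). It assumes every field is a 1, 2 or 4 byte primitive that is aligned to its own size, and that the struct is padded at the end to a multiple of its largest field.

```csharp
using System;

// Hypothetical calculator for rules a, b and c above.
static class SequentialLayout
{
    // Rounds offset up to the next multiple of alignment (a power of two).
    static int AlignUp(int offset, int alignment) =>
        (offset + alignment - 1) & ~(alignment - 1);

    // Total size of a struct whose fields have the given sizes, in declaration order.
    public static int SizeOf(params int[] fieldSizes)
    {
        int offset = 0;
        int maxAlignment = 1;
        foreach (int size in fieldSizes)
        {
            offset = AlignUp(offset, size);  // pad bytes before the field, if needed
            offset += size;                  // the field itself
            maxAlignment = Math.Max(maxAlignment, size);
        }
        return AlignUp(offset, maxAlignment); // trailing pad bytes
    }
}
```

SequentialLayout.SizeOf(2, 4, 2) models BadValueType (char, int, char) and comes out at 12, while SequentialLayout.SizeOf(4, 2, 2) models GoodValueType and comes out at 8.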

Fine... But why does this differ between .NET 1.x and .NET 2.0?

The answer is that, by default, the .NET compilers all apply a StructLayoutAttribute to structures, specifying a Sequential layout. This means that the fields are laid out in the type according to their order in the source file. However, in the .NET Framework 1.x, a request for a Sequential layout is not respected by the just-in-time compiler for the managed layout. That is why the managed sizeof of BadValueType is only 8 bytes there, while Marshal.SizeOf still reports 12.

Here is the IL. Note the sequential keyword.

.class nested private sequential ansi sealed beforefieldinit BadValueType
    extends [mscorlib]System.ValueType
{
    .field public char c1
    .field public int32 i
    .field public char c2
}


In the .NET Framework 2.0, the JIT does enforce a Sequential layout (if specified) for the managed layout of value types, though only if there are no reference type field members. Thus, the sizing of types is more important in the next version (2.0) of the Framework.
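
If you want to see the effect of the layout request yourself, one way is to spell out the attribute and compare it with LayoutKind.Auto, which leaves the field order up to the runtime. This is just a sketch under my own assumptions: it uses Unsafe.SizeOf<T>() from current .NET (not available in 2.0), and the struct and class names are invented for the example.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Same fields as BadValueType, with the default layout written out explicitly.
[StructLayout(LayoutKind.Sequential)]
struct SequentialBad
{
    public char c1;
    public int i;
    public char c2;
}

// Same fields again, but the runtime is allowed to reorder them.
[StructLayout(LayoutKind.Auto)]
struct AutoBad
{
    public char c1;
    public int i;
    public char c2;
}

// Hypothetical helper that exposes the managed size of each variant.
static class LayoutComparison
{
    public static int SequentialSize() => Unsafe.SizeOf<SequentialBad>();
    public static int AutoSize() => Unsafe.SizeOf<AutoBad>();
}
```

The Sequential variant should come out at 12 bytes on any runtime that honors the request. What the Auto variant reports is the runtime's decision, so treat whatever number you get as an observation rather than a guarantee.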

Bottom line: in simple English, the .NET 2.0 JIT lays out such structures exactly as declared.

Some time back I had a bit of an argument with a friend of mine, trying to prove that the above two structures actually differ in implementation, and I remembered this when I read an article along the same lines on MSDN today. A pretty neat article. In fact, this post is an elaboration of a point made as a one-liner in that article.

Watch out for structures when you create them next time, and think about playing around with ‘m’ structures of ‘n’ size... m x n = !!!


posted by Logu Krishnan : 2:14 AM

Comments:
Well, I have a basic doubt for which I could not get a satisfactory answer through Google. Why have the certification exams for VC++ been discontinued by Microsoft since last year? Will there be any Microsoft certification exams for C++ in the future, or is it dead?
Regards
UV


 
Last year there was a lot of branding happening for .NET, and there were also lots of cribs about VC++.NET(!). Many core C++ programmers did not like the way VC++.NET had evolved. Now we have C++/CLI evolving and catching up. Once this issue settles down, we would definitely have certifications on VC++ again :) Just a matter of time.


 
Thanks for the response.
Regards
UV


 
When it comes to individual variables, the CLR does the basic plumbing behind the scenes, and this issue is fairly insignificant for them. In the case of structs, however, this can become a huge memory hog.


 