Using Pointers, Arrays, Structures and Unions in 8051 C Compilers

by Olaf Pfieffer, based on the C51 Primer by Mike Beach, Hitex UK

Although both the Keil and Raisonance 8051 C compiler systems allow you to use pointers, arrays, structures and unions as in any PC-based C dialect, they add several important extensions that allow more efficient code to be generated.

Using Pointers And Arrays

One of C's greatest strengths can also be its greatest weakness - the pointer. The use and, more appropriately, the abuse of this language feature is largely why C is condemned by some as dangerous!

Pointers In Assembler

For an assembler programmer the C pointer equates closely to indirect addressing. In the 8051 this is achieved by the following instructions:

  MOV R0,#40        ; Put on-chip address to be indirectly
  MOV A,@R0         ; addressed in R0
  MOV R0,#40        ; Put off-chip address to be indirectly
  MOVX A,@R0        ; addressed in R0
  MOV DPTR,#0040    ; Put off-chip address to be indirectly
  MOVX A,@DPTR      ; addressed in DPTR
  CLR A
  MOV DPTR,#0040    ; Put code memory address to be indirectly
  MOVC A,@A+DPTR    ; addressed in DPTR

In each case the data is held in a memory location indicated by the value in registers to the right of the '@'.

Pointers In C

The C equivalent of the indirect instruction is the pointer. The register holding the address to be indirectly accessed in the assembler examples is a normal C type, except that its purpose is to hold an address rather than a variable or constant data value.

It is declared by:

  unsigned char *pointer0 ;

Note the asterisk prefix, indicating that the data held in this variable is an address rather than a piece of data that might be used in a calculation etc.

In all cases in the assembler example two distinct operations are required:

   1. Place address to be indirectly addressed in a register.
   2. Use the appropriate indirect addressing instruction to access data held at chosen address.

Fortunately in C the same procedure is necessary, although the indirect register must be explicitly defined, whereas in assembler the register exists in hardware.

  /* 1 - Define a variable which will hold an address */
  unsigned char *pointer ;
  /* 2 - Load pointer variable with address to be accessed indirectly */
  pointer = &c_variable ;
  /* 3 - Put data '0xff' indirectly into c variable via pointer */
  *pointer = 0xff ;

Taking each operation in turn...

  1. Reserve RAM to hold pointer. In practice the compiler attaches a symbolic name to a RAM location, just as with a normal variable.
  2. Load reserved RAM with address to be accessed, equivalent to 'MOV R0,#40'. In English this C statement means: "take the 'address of' c_variable and put it into the reserved RAM, i.e. the pointer". In this case the pointer's RAM corresponds to R0 and the '&' equates loosely to the assembler '#'.
  3. Move the data indirectly into the pointed-at C variable. As this is a write, it corresponds to the assembler 'MOV @R0,A', the mirror image of the 'MOV A,@R0' read shown earlier.

The ability to access data either directly, x = y, or indirectly, x = *y_ptr, is extremely useful. Here is a C example:

  /* Demonstration Of Using A Pointer */
  void function(void)
  {
    unsigned char c_variable ; // 1 - Declare a c variable
    unsigned char *ptr ;       // 2 - Declare a pointer (not pointing at anything yet!)
    c_variable = 0xff ;        // 3 - Set variable equal to 0xff directly
                               // OR, to do the same with pointers:
    ptr = &c_variable ;        // 4 - Force pointer to point at c_variable at run time
    *ptr = 0xff ;              // 5 - Move 0xff into c_variable indirectly
  }

Note: Step 4 causes the pointer to point at the variable. An alternative is to initialize the pointer in its declaration, thus:

  /* Demonstration Of Using A Pointer */
  void function (void)
  {
    unsigned char c_variable ;         // 1 - Declare a c variable
    unsigned char *ptr = &c_variable ; // 2 - Declare a pointer, initialized to point
                                       //     at c_variable in its declaration
    c_variable = 0xff ;   // 3 - Set variable equal to 0xff directly
                          // OR - use the pointer which is already initialized
    *ptr = 0xff ;         // 4 - Move 0xff into c_variable indirectly
  }

Pointers with their asterisk prefix can be used exactly as per normal data types. The statement:

  x = y + 3 ;

could equally well be performed with pointers, as per

  char x, y ;
  char *x_ptr = &x ;
  char *y_ptr = &y ;
  *x_ptr = *y_ptr + 3 ;

or:

  x = y * 25 ;
  *x_ptr = *y_ptr * 25 ;

The most important thing to understand about pointers is that

  *ptr = var ;

means "set the value of the pointed-at address to value var", whereas

  ptr = &var ;

means "make ptr point at var by putting the address of (&) var in ptr, but do not move any data out of var itself".

Thus the rule is, to initialize a pointer:

  ptr = &var ;

To access the data that ptr points at:

  var = *ptr ;
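
The two rules can be seen side by side in a small sketch. The function name write_through_pointer is purely illustrative and does not come from the original text; the code uses only standard C, so it should behave the same on a PC compiler.

```c
/* Illustrative helper (hypothetical name): the local variable is only
   ever written through the pointer, never directly. */
unsigned char write_through_pointer(unsigned char value)
{
    unsigned char var = 0 ;   /* ordinary variable */
    unsigned char *ptr ;      /* pointer, not yet pointing at anything */

    ptr = &var ;              /* make ptr point at var; var's data is untouched */
    *ptr = value ;            /* store value at the pointed-at address, i.e. in var */

    return var ;              /* var now holds value */
}
```

Calling write_through_pointer(0xff) returns 0xff, even though var is never assigned directly after its initialization.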

Pointers To Absolute Addresses

In embedded C, ROM, RAM and peripherals are at fixed addresses. This immediately raises the question of how to make pointers point at absolute addresses rather than just variables whose address is unknown (and largely irrelevant).

The simplest method is to determine the pointed-at address at compile time:

  char *abs_ptr = (char *) 0x8000 ; // Declare pointer and force to 0x8000

However if the address to be pointed at is only known at run time, an alternative approach is necessary. Simply, an uncommitted pointer is declared and then forced to point at the required address thus:

  char *abs_ptr ;                 // Declare uncommitted pointer
  abs_ptr = (char *) 0x8000 ;     // Initialize pointer to 0x8000
  *abs_ptr = 0xff ;               // Write 0xff to 0x8000
  abs_ptr++ ;                     // Make pointer point at next location in RAM
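
On a PC there is nothing useful at address 0x8000, so the sketch below takes the start address as a parameter. On the target the caller might pass (unsigned char *) 0x8000; on a host it has to be the address of ordinary memory such as an array. The name fill_region is an assumption made for illustration only.

```c
/* Hypothetical helper: write 'value' to 'length' consecutive locations
   starting at 'base'. On the target, base could be (unsigned char *) 0x8000;
   on a PC it must point at ordinary memory such as an array. */
void fill_region(unsigned char *base, unsigned int length, unsigned char value)
{
    unsigned char *abs_ptr = base ;   /* pointer forced to the start address */

    while (length != 0)
    {
        *abs_ptr = value ;            /* write to the current location */
        abs_ptr++ ;                   /* step to the next location */
        length-- ;
    }
}
```

Keeping the address out of the function body means the same routine can be checked on a desk before it is pointed at real hardware.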

Arrays And Pointers - Two Sides Of The Same Coin?

Uninitialized Arrays

The variables declared via

  unsigned char x ;
  unsigned char y ;

are single 8 bit memory locations. The declarations:

  unsigned int a ;
  unsigned int b ;

yield four memory locations, two allocated to 'a' and two to 'b'. In other programming languages it is possible to group similar types together in arrays. In BASIC an array is created by DIM a(10). Likewise C incorporates arrays, declared by:

  unsigned char a[10] ;

This has the effect of reserving ten sequential locations, starting at the address of 'a'. As there is nothing to the right of the declaration, no initial values are specified for the array. It therefore holds no meaningful data yet and serves only to reserve ten contiguous bytes.

Initialized Arrays

A more usual instance of arrays would be

  unsigned char test_array [] = { 0x00,0x40,0x80,0xC0,0xFF } ;

where the initial values are put in place before the program gets to "main()". Note that the size of this initialized array is not given in the square brackets - the compiler works out the size automatically upon compilation.
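
The size the compiler worked out can be recovered with the standard sizeof operator. A minimal sketch follows; test_array_length is a hypothetical name introduced here.

```c
unsigned char test_array [] = { 0x00,0x40,0x80,0xC0,0xFF } ;

/* Number of elements the compiler allocated for test_array. */
unsigned int test_array_length(void)
{
    return (unsigned int)(sizeof(test_array) / sizeof(test_array[0])) ;
}
```

Dividing by the size of one element keeps the calculation correct if the array's type is later changed from unsigned char to something wider.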

Another common instance of an array is analogous to the BASIC string as per:

  A$ = "HELLO!"

In C this equates to:

  char test_array[] = { "HELLO!" } ;

In C there is no real distinction between strings and arrays as a C array is just a series of sequential bytes occupied either by a string or a series of numbers. In fact the realms of pointers and arrays overlap with strings by virtue of :

  char test_array[] = { "HELLO!" } ;   // Case 1
  char *string_ptr = { "HELLO!" } ;    // Case 2

Case 1 creates a sequence of bytes containing the ASCII equivalent of "HELLO!", followed by a terminating zero byte. Likewise the second case allocates the same sequence of bytes but in addition creates a separate pointer called string_ptr to it. Notice that the "unsigned char" used previously has become "char", literally an ASCII character.

The second is really equivalent to:

  char test_array[] = { "HELLO!" } ;

Then at run time:

  char *arr_ptr = test_array ;      // Array treated as pointer - or;
  char *arr_ptr = &test_array[0] ;  // Put address of first element of array into pointer

This again shows the partial interchangeability of pointers and arrays. In English, the first means "transfer address of test_array into arr_ptr". Stating an array name in this context causes the array to be treated as a pointer to the first location of the array. Hence no "address of" (&) or '*' to be seen.

The second case reads as "get the address of the first element of the array name and put it into arr_ptr". No implied pointer conversion is employed, just the return of the address of the array base.

The new pointer arr_ptr now exactly corresponds to string_ptr, except that the physical "HELLO!" they point at is at a different address.
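
That both forms yield the same address can be checked directly. In this sketch the function names base_addresses_match and char_at are illustrative assumptions, not taken from the original.

```c
char test_array[] = { "HELLO!" } ;

/* Returns 1 when both ways of taking the array's base address agree. */
int base_addresses_match(void)
{
    char *by_name    = test_array ;       /* array name used as a pointer */
    char *by_element = &test_array[0] ;   /* explicit address of element 0 */

    return by_name == by_element ;
}

/* Reads one character through a pointer, using array-style indexing. */
char char_at(unsigned char index)
{
    char *arr_ptr = test_array ;

    return arr_ptr[index] ;               /* same as *(arr_ptr + index) */
}
```

Note that char_at(6) returns the terminating zero byte, which is why the string needs seven bytes of storage rather than six.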

Using Arrays

Arrays are typically used like this:

  /* Copy The String HELLO! Into An Empty Array */
  unsigned char source_array[] = { "HELLO!" } ;
  unsigned char dest_array[7] ;
  unsigned char array_index ;
  array_index = 0 ;          // First character index
  while(array_index < 7)     // Check for end of array
  {
      // Move character-by-character into destination array
      dest_array[array_index] = source_array[array_index] ;
      array_index++ ;        // Next character index
  }

The variable array_index shows the offset of the character to be fetched (and then stored) from the starts of the arrays.

As has been indicated, pointers and arrays are closely related. Indeed the above program could be re-written as:

  /* Copy The String HELLO! Into An Empty Array */
  char *string_ptr = { "HELLO!" } ;
  unsigned char dest_array[7] ;
  unsigned char array_index ;
  array_index = 0 ;          // First character index
  while(array_index < 7)     // Check for end of array
  {
      // Move character-by-character into destination array.
      dest_array[array_index] = string_ptr[array_index] ;
      array_index++ ;
  }

The point to note is that only the definition of string_ptr (previously source_array) changed. By removing the '*' on string_ptr and appending a '[ ]' pair, this pointer can be turned back into an array!

However in this case there is an alternative way of scanning along the HELLO! string, using the *ptr++ convention:

  /* Copy The String HELLO! Into An Empty Array */
  char *string_ptr = { "HELLO!" } ;
  unsigned char dest_array[7] ;
  unsigned char array_index ;
  array_index = 0 ;          // First character index
  while(array_index < 7)     // Check for end of array
  {
      // Move character-by-character into destination array.
      dest_array[array_index] = *string_ptr++ ;
      array_index++ ;
  }

This is an example of C being somewhat inconsistent; the expression *ptr++ does not mean "increment the thing being pointed at" but rather "increment the pointer itself", so causing it to point at the next sequential address. This is because the postfix '++' binds more tightly than the '*', so the expression is read as *(ptr++). Thus in the example the character is obtained and then the pointer moved along to point at the next higher address in memory.
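
The difference between the two readings can be made concrete. The sketch below is standard C; copy_string and increment_pointed_at are hypothetical names, and the copy loop stops at the string's terminating zero rather than at a fixed count of 7.

```c
/* Hypothetical helper: copy a zero-terminated string using *ptr++.
   Returns the number of characters copied, excluding the terminator. */
unsigned char copy_string(char *dest, const char *src)
{
    unsigned char count = 0 ;

    while ((*dest++ = *src++) != '\0')   /* fetch, store, then advance both pointers */
    {
        count++ ;
    }
    return count ;
}

/* By contrast, (*ptr)++ increments the data and leaves the pointer alone. */
void increment_pointed_at(unsigned char *ptr)
{
    (*ptr)++ ;
}
```

The parentheses in (*ptr)++ are what force the increment onto the pointed-at data; without them the pointer itself moves.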

Summary Of Arrays And Pointers

To summarize:

Create An Uncommitted Pointer

  unsigned char *x_ptr ;

Create A Pointer To A Normal C Variable

  unsigned char x ;
  unsigned char *x_ptr = &x ;

Create An Array With No Initial Values

  unsigned char x_arr[10] ;

Create An Array With Initialized Values

  unsigned char x_arr[] = { 0,1,2,3 } ;

Create An Array In The Form Of A String

  char x_arr[] = { "HELLO" } ;

Create A Pointer To A String

  char *string_ptr = { "HELLO" } ;

Create A Pointer To An Array

  char x_arr[] = { "HELLO" } ;
  char *x_ptr = x_arr ;

Force A Pointer To Point At The Next Location

  ptr++ ;   // or x = *ptr++ ; to fetch the data first and then advance

Structures

Structures are perhaps what makes C such a powerful language for creating very complex programs with huge amounts of data. They are basically a way of grouping together related data items under a single symbolic name.

Why Use Structures?

Here is an example: A piece of C51 software had to perform a linearization process on the raw signal from a variety of pressure sensors manufactured by the same company. For each sensor to be catered for there is an input signal with a span and offset, a temperature coefficient, and a signal conditioning amplifier with its own gain. The information for each sensor type could be held in "normal" constants thus:

  unsigned char sensor_type1_gain = 0x30 ;
  unsigned char sensor_type1_offset = 0x50 ;
  unsigned char sensor_type1_temp_coeff = 0x60 ;
  unsigned char sensor_type1_span = 0xC4 ;
  unsigned char sensor_type1_amp_gain = 0x21 ;
  unsigned char sensor_type2_gain = 0x32 ;
  unsigned char sensor_type2_offset = 0x56 ;
  unsigned char sensor_type2_temp_coeff = 0x56 ;
  unsigned char sensor_type2_span = 0xC5 ;
  unsigned char sensor_type2_amp_gain = 0x28 ;
  unsigned char sensor_type3_gain = 0x20 ;
  unsigned char sensor_type3_offset = 0x43 ;
  unsigned char sensor_type3_temp_coeff = 0x61 ;
  unsigned char sensor_type3_span = 0x89 ;
  unsigned char sensor_type3_amp_gain = 0x29 ;

As can be seen, the names conform to an easily identifiable pattern of:

  unsigned char sensor_typeN_gain = 0x20 ;
  unsigned char sensor_typeN_offset = 0x43 ;
  unsigned char sensor_typeN_temp_coeff = 0x61 ;
  unsigned char sensor_typeN_span = 0x89 ;
  unsigned char sensor_typeN_amp_gain = 0x29 ;

Where 'N' is the number of the sensor type. A structure is a neat way of condensing this type of related and repeating data. In fact the information needed to describe a sensor can be reduced to a generalized:

  unsigned char gain ;
  unsigned char offset ;
  unsigned char temp_coeff ;
  unsigned char span ;
  unsigned char amp_gain ;

The concept of a structure is based on this idea of a generalized "template" for related data. In this case, a structure template (or "component list") describing any of the manufacturer's sensors would be declared:

  struct SENSOR_DESC
  {
    unsigned char gain ;
    unsigned char offset ;
    unsigned char temp_coeff ;
    unsigned char span ;
    unsigned char amp_gain ;
  } ;

This does not physically do anything to memory. At this stage it merely creates a template which can now be used to put real data into memory.

This is achieved by:

  struct SENSOR_DESC sensor_database ;

This reads as "use the template SENSOR_DESC to lay out an area of memory named sensor_database, reflecting the mix of data types stated in the template". Thus a group of 5 unsigned chars will be created in the form of a structure.

The individual elements of the structure can now be accessed as:

  sensor_database.gain = 0x30 ;
  sensor_database.offset = 0x50 ;
  sensor_database.temp_coeff = 0x60 ;
  sensor_database.span = 0xC4 ;
  sensor_database.amp_gain = 0x21 ;

Arrays Of Structures

In the example though, information on many sensors is required and, as with individual chars and ints, it is possible to declare an array of structures. This allows many similar groups of data to have different sets of values.

  struct SENSOR_DESC sensor_database[4] ;

This creates four identical structures in memory, each with an internal layout determined by the structure template. Accessing this array is performed simply by appending an array index to the structure name:

  /* Operate On Elements In First Structure Describing Sensor 0 */
  sensor_database[0].gain = 0x30 ;
  sensor_database[0].offset = 0x50 ;
  sensor_database[0].temp_coeff = 0x60 ;
  sensor_database[0].span = 0xC4 ;
  sensor_database[0].amp_gain = 0x21 ;
  /* Operate On Elements In Second Structure Describing Sensor 1 */
  sensor_database[1].gain = 0x32 ;
  sensor_database[1].offset = 0x56 ;
  sensor_database[1].temp_coeff = 0x56 ;
  sensor_database[1].span = 0xC5 ;
  sensor_database[1].amp_gain = 0x28 ;
  // and so on...

Initialized Structures

As with arrays, a structure can be initialized at declaration time:

  struct SENSOR_DESC sensor_database = { 0x30, 0x50, 0x60, 0xC4, 0x21 } ;

so that here the structure is created in memory and pre-loaded with values.
The array case follows a similar form:

  struct SENSOR_DESC sensor_database[4] =
  {
    {0x20,0x40,0x50,0xA4,0x21},
    {0x33,0x52,0x65,0xB4,0x2F},
    {0x30,0x50,0x48,0xC4,0x3A},
    {0x32,0x56,0x56,0xC5,0x28}
  } ;
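
Once the table is initialized, a sensor type number can be used as an index into it. The following is only a sketch: the lookup function sensor_amp_gain, the SENSOR_COUNT macro and the choice to return 0 for an unknown type are assumptions made for illustration.

```c
struct SENSOR_DESC
{
    unsigned char gain ;
    unsigned char offset ;
    unsigned char temp_coeff ;
    unsigned char span ;
    unsigned char amp_gain ;
} ;

struct SENSOR_DESC sensor_database[4] =
{
    {0x20,0x40,0x50,0xA4,0x21},
    {0x33,0x52,0x65,0xB4,0x2F},
    {0x30,0x50,0x48,0xC4,0x3A},
    {0x32,0x56,0x56,0xC5,0x28}
} ;

/* Number of entries in the table, worked out by the compiler. */
#define SENSOR_COUNT (sizeof(sensor_database) / sizeof(sensor_database[0]))

/* Hypothetical lookup: return the amplifier gain for a sensor type,
   or 0 if the type number is outside the table. */
unsigned char sensor_amp_gain(unsigned char sensor_type)
{
    if (sensor_type >= SENSOR_COUNT)
    {
        return 0 ;
    }
    return sensor_database[sensor_type].amp_gain ;
}
```

The range check matters on a small target, where an out-of-range index would silently read whatever happens to follow the table in memory.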

Placing Structures At Absolute Addresses

It is sometimes necessary to place a structure at an absolute address. Typical examples are CAN interfaces or other peripheral chips that offer arrays of data groups.

For example, the registers of a memory-mapped real time clock chip are to be grouped together as a structure. The template in this instance might be

  // Contents Of RTCBYTES.C Module
  struct RTC
  {
    unsigned char seconds ;
    unsigned char minutes ;
    unsigned char hours ;
    unsigned char days ;
  } ;
  struct RTC xdata RTC_chip ; // Create xdata structure

A trick using the linker is required here so the structure creation must be placed in a dedicated module. This module's XDATA segment, containing the RTC structure, is then fixed at the required address at link time.

Using the absolute structure could look like this:

  /* MAIN.C Module */
  /* Structure located at base of RTC Chip; the struct RTC template */
  /* must also be visible in this module */
  extern struct RTC xdata RTC_chip ;
  /* Other XDATA Objects */
  xdata unsigned char time_secs, time_mins ;
  void main(void)
  {
    time_secs = RTC_chip.seconds ;
    time_mins = RTC_chip.minutes ;
  }

The linker input file to locate the RTC_chip structure over the real RTC registers is:

  l51 main.obj,rtcbytes.obj XDATA(?XD?RTCBYTES(0h))

Pointers To Structures

Pointers can be used to access structures, just as with simple data items. Here is an example:

  /* Define structure and a pointer to it */
  struct SENSOR_DESC sensor ;
  struct SENSOR_DESC *sensor_database = &sensor ;
  /* Use Pointer To Access Structure Elements */
  sensor_database->gain = 0x30 ;
  sensor_database->offset = 0x50 ;
  sensor_database->temp_coeff = 0x60 ;
  sensor_database->span = 0xC4 ;
  sensor_database->amp_gain = 0x21 ;

Note that the '*' which normally indicates a pointer has been replaced by appending '->' to the pointer name. Thus '(*name).member' and 'name->member' are equivalent.

Passing Structure Pointers To Functions

A common use for structure pointers is to allow them to be passed to functions without huge amounts of parameter passing. A typical structure might contain 20 data bytes, and passing it to a function by value would require all 20 bytes either to be pushed onto the stack or to be copied into an abnormally large parameter passing area. By using a pointer to the structure, only the two or three bytes that constitute the pointer need be passed. This approach is recommended for C51 as the overhead of passing whole structures can tie the poor old 8051 CPU in knots!

This would be achieved by:

  struct SENSOR_DESC sensor ;   // The structure itself

  void test_function(struct SENSOR_DESC *received_struct_pointer)
  {
    // Write directly into the caller's structure
    received_struct_pointer->gain = 0x20 ;
    received_struct_pointer->temp_coeff = 0x40 ;
  }

  void main(void)
  {
    struct SENSOR_DESC *sensor_database = &sensor ;
    sensor_database->gain = 0x30 ;
    sensor_database->offset = 0x50 ;
    sensor_database->temp_coeff = 0x60 ;
    sensor_database->span = 0xC4 ;
    sensor_database->amp_gain = 0x21 ;
    test_function(sensor_database) ;   // Pass the pointer, not the structure
  }

Advanced Note: Using a structure pointer will cause the called function to operate directly on the structure rather than on a copy made during the parameter passing process.
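
The effect described in the note can be demonstrated with two otherwise identical functions. This is a sketch in standard C; set_gain_by_value and set_gain_by_pointer are hypothetical names.

```c
struct SENSOR_DESC
{
    unsigned char gain ;
    unsigned char offset ;
    unsigned char temp_coeff ;
    unsigned char span ;
    unsigned char amp_gain ;
} ;

/* Receives a copy: the change is lost when the function returns. */
void set_gain_by_value(struct SENSOR_DESC received_struct)
{
    received_struct.gain = 0x20 ;
}

/* Receives a pointer: the change lands in the caller's structure. */
void set_gain_by_pointer(struct SENSOR_DESC *received_struct_pointer)
{
    received_struct_pointer->gain = 0x20 ;
}
```

After calling set_gain_by_value on a structure, its gain is unchanged; after calling set_gain_by_pointer with the structure's address, its gain is 0x20.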

Structure Pointers To Absolute Addresses

It is sometimes necessary to place a structure at an absolute address. This might occur if, for example, a memory-mapped real time clock chip is to be handled as a structure. An alternative approach to that given earlier is to address the clock chip via a structure pointer.

The important difference is that in this case no memory is reserved for the structure - only an "image" of it appears to be at the address.

The template in this instance might be:

  /* Define Real Time Clock Structure */
  struct RTC
  {
      char seconds ;
      char mins ;
      char hours ;
      char days ;
  } ;

  /* Create A Pointer To Structure */
  struct RTC xdata *rtc_ptr ;  // 'xdata' tells C51 that the pointer
                               // targets the external data space,
                               // where the memory-mapped device lives.
  void main(void)
  {
      rtc_ptr = (struct RTC xdata *) 0x8000 ;  // Move structure
                              // pointer to address of real-time
                              // clock at 0x8000 in xdata
      rtc_ptr->seconds = 0 ;  // Operate on elements
      rtc_ptr->mins = 0x01 ;
  }

This general technique can be used in any situation where a pointer-addressed structure needs to be placed over a specific IO device. However it is the user's responsibility to make sure that the address given is not likely to be allocated by the linker as general variable RAM!

To summarize, the procedure is:

  1. Define template
  2. Declare structure pointer as normal
  3. At run time, force pointer to required absolute address in the normal way.
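
The technique relies on the template's members lining up with the device's registers. Assuming the register order seconds, mins, hours, days used in the template above (the real order depends on the chip), a layout check might look like the sketch below. offsetof is the standard macro from <stddef.h>; rtc_layout_is_packed is a hypothetical name.

```c
#include <stddef.h>

struct RTC
{
    char seconds ;
    char mins ;
    char hours ;
    char days ;
} ;

/* Returns 1 if the members sit at consecutive byte offsets 0..3, so that
   rtc_ptr->hours really lands on (base address + 2), and so on. */
int rtc_layout_is_packed(void)
{
    return offsetof(struct RTC, seconds) == 0
        && offsetof(struct RTC, mins)    == 1
        && offsetof(struct RTC, hours)   == 2
        && offsetof(struct RTC, days)    == 3
        && sizeof(struct RTC)            == 4 ;
}
```

A check of this kind becomes more valuable once the template mixes chars with ints or longs, where a compiler may insert padding between members.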

Unions

Unions allow you to define different datatype references for the same physical address. This way you can address a 32-bit word as a "long" OR as 2 different "ints" OR as an array of 4 bytes.

A union is similar in concept to a structure except that rather than creating sequential locations to represent each of the items in the template, it places each item at the same address. A union specifying four char members may therefore still occupy only a single byte. A union may consist of a combination of longs, chars and ints all based at the same physical address.

The number of bytes of RAM used by a union is simply determined by the size of the largest element, so:

  union test
  {
    char x ;
    int y ;
    char a[3] ;
    long z ;
  } ;

requires 4 bytes, this being the size of a long. The physical location of each element is the base address plus the following offsets:

  Offset   x      y           a      z
   0       byte   high byte   a[0]   highest byte
  +1              low byte    a[1]   mid byte
  +2                          a[2]   mid byte
  +3                                 lowest byte


In embedded C the commonest use of a union is to allow fast access to individual bytes of longs or ints. These might be 16 or 32 bit real time counters, as in this example:

  /* Declare Union */
  union clock
  {
      unsigned long real_time_count ;     // Reserve four bytes
      unsigned int real_time_words[2] ;   // Reserve four bytes as
                                          // int array
      unsigned char real_time_bytes[4] ;  // Reserve four bytes as
                                          // char array; unsigned so that
                                          // 0x80 can compare equal
  } clock ;                               // Create the union variable itself
  /* Real Time Interrupt */
  void timer0_int(void) interrupt 1 using 1
  {
      clock.real_time_count++ ;       // Increment clock

      if(clock.real_time_words[1] == 0x8000)
      {    // Check/compare lower word only
      /* Do something! */
      }
      if(clock.real_time_bytes[3] == 0x80)
      {    // Check/compare least significant byte only
      /* Do something! */
      }
  }
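
Which array index holds which byte depends on the byte order of the machine. The offsets given earlier show the highest byte first, but a PC typically stores it last. The sketch below, in standard C with the fixed-width types from <stdint.h>, detects the order at run time; the names clock32, msb_is_stored_first and most_significant_byte are assumptions for illustration.

```c
#include <stdint.h>

union clock32
{
    uint32_t real_time_count ;       /* whole 32-bit counter */
    uint8_t  real_time_bytes[4] ;    /* same four bytes, one at a time */
} ;

/* Returns 1 if the most significant byte is stored at the lowest
   address (as in the offsets shown earlier), 0 otherwise. */
int msb_is_stored_first(void)
{
    union clock32 probe ;

    probe.real_time_count = 0x01020304UL ;
    return probe.real_time_bytes[0] == 0x01 ;
}

/* Fetch the most significant byte of a count via the union,
   choosing the index to suit the byte order found above. */
uint8_t most_significant_byte(uint32_t count)
{
    union clock32 c ;

    c.real_time_count = count ;
    return c.real_time_bytes[msb_is_stored_first() ? 0 : 3] ;
}
```

On a target whose byte order is fixed and known, the index would normally be hard-coded; the run-time probe is only there so the sketch gives the same answer on any host.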

Generic Pointers

C51 offers two basic types of pointer, the spaced (memory-specific) and the generic.

As has been mentioned, the 8051 has many physically separate memory spaces, each addressed by special assembler instructions. Such characteristics are not peculiar to the 8051 - for example, the 8086 has data instructions which operate on a 16 bit (within segment) and a 20 bit basis.

For the sake of simplicity, and to hide the real structure of the 8051 from the programmer, C51 uses three byte pointers, rather than the single or two bytes that might be expected. The end result is that pointers can be used without regard to the actual location of the data.

For example:

  xdata char buffer[10] ;
  code char message[] = { "HELLO" } ;
  void main(void)
  {
      char *s ;
      char *d ;

      s = message ;
      d = buffer ;
      while(*s != '\0')
      {
          *d++ = *s++ ;
      }
  }

This yields the following code:

      RSEG  ?XD?T1
  buffer:            DS  10
      RSEG  ?CO?T1
  message:
      DB  'H' ,'E' ,'L' ,'L' ,'O' ,000H
  ;
  ;
  ; xdata char buffer[10] ;
  ; code char message[] = { "HELLO" } ;
  ;
  ;    void main(void) {
      RSEG  ?PR?main?T1
      USING    0
  main:
              ; SOURCE LINE # 6
  ;
  ;       char *s ;
  ;       char *d ;
  ;
  ;       s = message ;
              ; SOURCE LINE # 11
      MOV      s?02,#05H
      MOV      s?02+01H,#HIGH message
      MOV      s?02+02H,#LOW message
  ;       d = buffer ;
              ; SOURCE LINE # 12
      MOV      d?02,#02H
      MOV      d?02+01H,#HIGH buffer
      MOV      d?02+02H,#LOW buffer
  ?C0001:
  ;
  ;       while(*s != '\0') {
              ; SOURCE LINE # 14
      MOV      R3,s?02
      MOV      R2,s?02+01H
      MOV      R1,s?02+02H
      LCALL    ?C_CLDPTR
      JZ       ?C0003
  ;          *d++ = *s++ ;
              ; SOURCE LINE # 15
      INC      s?02+02H
      MOV      A,s?02+02H
      JNZ      ?C0004
      INC      s?02+01H
  ?C0004:
      DEC      A
      MOV      R1,A
      LCALL    ?C_CLDPTR
      MOV      R7,A
      MOV      R3,d?02
      INC      d?02+02H
      MOV      A,d?02+02H
      MOV      R2,d?02+01H
      JNZ      ?C0005
      INC      d?02+01H
  ?C0005:
      DEC      A
      MOV      R1,A
      MOV      A,R7
      LCALL    ?C_CSTPTR
  ;          }
              ; SOURCE LINE # 16
      SJMP     ?C0001
  ;       }
              ; SOURCE LINE # 17
  ?C0003:
      RET
  ; END OF main
      END

As can be seen, the pointers 's' and 'd' are composed of three bytes, not two as might be expected. In making s point at the message in the code space, an '05' is loaded into the first byte of s ahead of the actual address to be pointed at. In the case of d, '02' is loaded. These additional bytes are how C51 knows which assembler addressing mode to use. The library function ?C_CLDPTR checks the value of the first byte and loads the data, using the addressing instructions appropriate to the memory space being used.
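
The idea behind such a library routine can be modelled on a PC. The following is a conceptual sketch only, not Keil's actual library code: the two space codes are the ones visible in the listing, while the struct layout, the array names and the function generic_load are assumptions made for illustration, with small arrays standing in for the 8051's separate memory spaces.

```c
/* Memory space codes as they appear in the listing above. */
#define SPACE_XDATA 0x02
#define SPACE_CODE  0x05

/* Small arrays standing in for the 8051's separate memory spaces. */
unsigned char fake_xdata[16] = { 0x11, 0x22, 0x33 } ;
const unsigned char fake_code[16] = { 'H', 'E', 'L', 'L', 'O', 0 } ;

/* Three-byte generic pointer: one space byte plus a 16-bit address. */
struct generic_ptr
{
    unsigned char space ;
    unsigned char addr_high ;
    unsigned char addr_low ;
} ;

/* Host-side stand-in for a generic load: pick the access method from
   the space byte. Unknown codes and out-of-range addresses read as 0. */
unsigned char generic_load(struct generic_ptr p)
{
    unsigned int addr = ((unsigned int) p.addr_high << 8) | p.addr_low ;

    if (addr >= 16)
    {
        return 0 ;
    }
    switch (p.space)
    {
    case SPACE_XDATA: return fake_xdata[addr] ;   /* would be MOVX on the 8051 */
    case SPACE_CODE:  return fake_code[addr] ;    /* would be MOVC on the 8051 */
    default:          return 0 ;
    }
}
```

The run-time decision in the switch statement is the cost that a generic pointer pays on every access.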

This means that every access via a generic pointer requires a library function such as ?C_CLDPTR to be called. In the listing above, the memory space codes are 05H for CODE and 02H for XDATA.

Spaced Pointers In C51

Considerable run time savings are possible by using spaced pointers. By restricting a pointer to only being able to point into one of the 8051's memory spaces, the need for the memory space "code" byte is eliminated, along with the library routines needed to interpret it.

A spaced pointer is created by:

  char xdata *ext_ptr ;

to produce an uncommitted pointer into the XDATA space or

  char code *const_ptr ;

which gives a pointer solely into the CODE space. Note that in both cases the pointers themselves are located in the memory space given by the current memory model. A pointer to xdata which is to be itself located in PDATA would be declared as:

  pdata char xdata *ext_ptr ;

pdata = location of pointer, xdata = memory space pointed to.

In the following example, strings are always copied from the CODE area into an XDATA buffer. By customizing the library function "strcpy()" to use a CODE source pointer and an XDATA destination pointer, the runtime for the string copy was reduced by 50%. The new strcpy has been named strcpy_x_c().

The function prototype is:

  extern char xdata *strcpy_x_c(char xdata *, char code *) ;

Here is the code produced by the spaced pointer strcpy_x_c():

  ; char xdata *strcpy_x_c(char xdata *s1, char code *s2) {
  _strcpy_x_c:
      MOV s2?10,R4
      MOV s2?10+01H,R5
  ;__ Variable 's1?10' assigned to Register 'R6/R7' __
  ; unsigned char i = 0;
  ;__ Variable 'i?11' assigned to Register 'R1' __
      CLR A
      MOV R1,A
  ?C0004:
  ;
  ; while ((s1[i++] = *s2++) != 0);
      INC s2?10+01H
      MOV A,s2?10+01H
      MOV R4,s2?10
      JNZ ?C0008
      INC s2?10
  ?C0008:
      DEC A
      MOV DPL,A
      MOV DPH,R4
      CLR A
      MOVC A,@A+DPTR
      MOV R5,A
      MOV R4,AR1
      INC R1
      MOV A,R7
      ADD A,R4
      MOV DPL,A
      CLR A
      ADDC A,R6
      MOV DPH,A
      MOV A,R5
      MOVX @DPTR,A
      JNZ ?C0004
  ?C0005:
  ; return (s1);
  ; }
  ?C0006:
      END

Notice that no library functions are used to determine which memory spaces are intended. The function prototype tells C51 only to look in code for the string and xdata for the RAM buffer.
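
The C source of strcpy_x_c() appears only as comments in the listing. Gathered together it would look roughly like the sketch below. The preprocessor guard is an addition made here so that the same text also compiles on a PC, where the memory-space keywords do not exist; it assumes the compiler defines __C51__, as Keil C51 does, and other compilers may need a different test.

```c
/* On a PC compiler the C51 memory-space keywords do not exist, so they
   are defined away here; under Keil C51 (__C51__ defined) they are kept. */
#ifndef __C51__
#define xdata
#define code
#endif

/* Copy a zero-terminated string from CODE space into an XDATA buffer. */
char xdata *strcpy_x_c(char xdata *s1, char code *s2)
{
    unsigned char i = 0 ;

    while ((s1[i++] = *s2++) != 0) ;   /* copy up to and including the terminator */

    return (s1) ;
}
```

Because the index i is an unsigned char, as in the listing, this version is only suitable for strings shorter than 256 bytes.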