Monday, October 21, 2013

Fastest digitalRead / digitalWrite Alternative

Arduino's standard digitalRead/digitalWrite is well known for two things: its simplicity and ease of use, and its extraordinarily slow speed.

The fastest alternative is direct port manipulation. For example, the alternative to digitalWrite( 13, HIGH ) is PORTB |= (1 << 5). The compiler translates that statement into a single 2-cycle sbi instruction. At 16 MHz, it executes in 125 nanoseconds.
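As a minimal sketch of the idea, the snippet below sets and clears bit 5 of PORTB, which is digital pin 13 on an Uno. The helper names (led13AsOutput, led13On, led13Off) are made up for illustration, and the #else branch is only a stand-in so the code can also be compiled off-target; on a real board, <avr/io.h> supplies PORTB and DDRB.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>   // real PORTB / DDRB register definitions
#else
// Stand-ins (assumption) so the sketch can be compiled and checked on a PC.
static volatile uint8_t DDRB = 0, PORTB = 0;
#endif

// Hypothetical helper names, used here only for illustration.
// PB5 is digital pin 13 on ATmega8/168/328-based boards.
static inline void led13AsOutput() { DDRB  |= (uint8_t)(1 << 5);  }  // pin 13 as output
static inline void led13On()       { PORTB |= (uint8_t)(1 << 5);  }  // set bit 5 only
static inline void led13Off()      { PORTB &= (uint8_t)~(1 << 5); }  // clear bit 5 only
```

Each helper touches only bit 5, so the other pins on the same port keep their state.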

However, it is not an exact equivalent of digitalWrite. Besides setting the corresponding pin to the specified value, digitalWrite also checks whether the pin is PWM-capable and, if so, turns its PWM output off. If you have previously used PWM on the pin (i.e., by calling analogWrite), direct port manipulation alone won't work, because the timer still controls the pin.

In practice, this situation is rarely encountered. Usually, once a pin is assigned as a PWM-driven pin, it is never reverted to a "normal" (non-PWM) pin. So rather than wasting execution time on unnecessary code, we should take a little more control and explicitly turn PWM off only when it's really necessary.
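If a pin really does need to go back to plain digital output after analogWrite, the PWM output can be disconnected by hand. The sketch below assumes an ATmega328-based Uno, where digital pin 9 is the OC1A output of Timer1 and is switched by the COM1A1 bit of TCCR1A; other pins use other timers and bits, so check the datasheet for your pin. The function name pwmOffPin9ThenHigh is hypothetical, and the #else branch is only a stand-in for off-target compilation.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>   // real TCCR1A / COM1A1 / PORTB definitions
#else
// Stand-ins (assumption) for compiling on a PC.
// COM1A1 is bit 7 of TCCR1A on the ATmega328.
static volatile uint8_t TCCR1A = 0, PORTB = 0;
#define COM1A1 7
#endif

// Disconnect Timer1 channel A from pin 9 (PB1), then drive the pin high directly.
static inline void pwmOffPin9ThenHigh() {
  TCCR1A &= (uint8_t)~(1 << COM1A1);  // stop the timer from driving the pin
  PORTB  |= (uint8_t)(1 << 1);        // PB1 = digital pin 9
}
```

Doing this once, at the point where the pin changes role, avoids paying for the check on every write.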

To keep it simple and easy to use, we'll use the following macros (note that this code only works with ATmega8/168/328-based boards such as the Arduino Uno. Other MCUs might have different pin numbering!):

#define portOfPin(P)\
  (((P)>=0&&(P)<8)?&PORTD:(((P)>7&&(P)<14)?&PORTB:&PORTC))
#define ddrOfPin(P)\
  (((P)>=0&&(P)<8)?&DDRD:(((P)>7&&(P)<14)?&DDRB:&DDRC))
#define pinOfPin(P)\
  (((P)>=0&&(P)<8)?&PIND:(((P)>7&&(P)<14)?&PINB:&PINC))
#define pinIndex(P) ((uint8_t)((P)>13?(P)-14:(P)&7))
#define pinMask(P)((uint8_t)(1<<pinIndex(P)))

#define pinAsInput(P) (*(ddrOfPin(P))&=~pinMask(P))
#define pinAsInputPullUp(P) do{*(ddrOfPin(P))&=~pinMask(P);digitalHigh(P);}while(0)
#define pinAsOutput(P) (*(ddrOfPin(P))|=pinMask(P))
#define digitalLow(P) (*(portOfPin(P))&=~pinMask(P))
#define digitalHigh(P) (*(portOfPin(P))|=pinMask(P))
#define isHigh(P)((*(pinOfPin(P))& pinMask(P))>0)
#define isLow(P)((*(pinOfPin(P))& pinMask(P))==0)
#define digitalState(P)((uint8_t)isHigh(P))

Thus, you can save valuable code space and get dramatically faster execution by replacing:
  • pinMode( pin, INPUT ); with pinAsInput( pin );
  • pinMode( pin, OUTPUT ); with pinAsOutput( pin );
  • pinMode( pin, INPUT_PULLUP); with pinAsInputPullUp( pin );
  • digitalWrite( pin, LOW ); with digitalLow( pin );
  • digitalWrite( pin, HIGH ); with digitalHigh( pin );
  • digitalRead( pin ) with digitalState( pin )

Additionally, rather than typing if( digitalState( pin ) == HIGH ) you can type if( isHigh( pin ) ) for clearer code. Likewise, use isLow( pin ) rather than digitalState( pin ) == LOW.

Now let's try it in action. Load the Blink.pde example sketch and compile it. You'll get 1,084 bytes of compiled code. Now insert our new macros at the beginning of the file, and replace the calls according to the replacement guide above.

Your source code will look like this (comments removed, newly inserted macros not shown):

int led = 13;

void setup() {              
  pinAsOutput(led);
}

void loop() {
  digitalHigh( led );
  delay( 1000 );
  digitalLow( led );
  delay( 1000 );
}

After compiling, the size drops to 956 bytes. Not much fat loss, eh? Actually, you can get much smaller code by changing the way you define the led symbol.

First, it's defined as int (with a range from -32,768 to 32,767), which takes 2 bytes. Pin numbers on the Arduino Uno run from 0 to 19, so declaring it as int is a waste. If you really need to put it in a variable, you should declare it as byte (uint8_t).

Second, and most importantly, since the LED won't change its pin assignment in the middle of execution, you should define it as a constant with the const keyword. This way, the compiler evaluates the macro's conditions at compile time (instead of generating actual run-time code to evaluate variable arguments).
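One way to see the compile-time evaluation at work is a static_assert, which only accepts constant expressions. The sketch below copies just the two mask macros from the listing above (with the argument parenthesized) and assumes a C++11-capable compiler, as used by newer Arduino IDE releases; ledVar and runtimeMask are hypothetical names added for contrast.

```cpp
#include <stdint.h>

// The two mask macros from the listing above, with the argument parenthesized.
#define pinIndex(P) ((uint8_t)((P) > 13 ? (P) - 14 : (P) & 7))
#define pinMask(P)  ((uint8_t)(1 << pinIndex(P)))

const uint8_t led = 13;   // constant: usable in constant expressions

// Compiles only because pinMask(led) can be evaluated at compile time.
static_assert(pinMask(led) == 0x20, "pin 13 is bit 5 of its port");

int ledVar = 13;          // plain variable: the mask has to be computed at run time
// static_assert(pinMask(ledVar) == 0x20, "");  // would not compile

// Hypothetical helper: evaluates the same macro on a run-time value.
uint8_t runtimeMask() { return pinMask(ledVar); }
```

With the constant, the comparison chain inside the macro disappears from the generated code; with the variable, the compiler has to keep it.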

Take careful note of this. Use a variable only if you need to change its value (variable ⇒ able to vary) during execution. Otherwise, always use const (constant ⇒ always the same, never changes). You'll save a lot of code space and execution time by following this simple rule.

So, try replacing the int led = 13; statement with const byte led = 13; (or simply #define led 13) and recompile your code. Now your slim program takes only 674 bytes, about 38% smaller than the original 1,084 bytes!

How fast is digitalHigh / digitalLow versus digitalWrite at the common 16 MHz clock rate? For digitalWrite, it depends on whether the specified pin has PWM capability (from about 3.6 µs to 4.8 µs). For digitalHigh / digitalLow with a constant pin, it is 125 ns (2 cycles), so it's roughly 29 to 38 times faster.
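As a quick sanity check on the cycle arithmetic, a two-cycle instruction at 16 MHz takes 2 / 16,000,000 s = 125 ns. The helper below, whose name cyclesToNs is hypothetical, does the same conversion for any cycle count and clock frequency; it is plain arithmetic, not a measurement, and assumes a C++11-capable compiler.

```cpp
#include <stdint.h>

// Duration of `cycles` CPU cycles in nanoseconds at a clock of `hz` hertz.
// The 64-bit intermediate avoids overflow in cycles * 1e9.
constexpr uint32_t cyclesToNs(uint32_t cycles, uint32_t hz) {
  return (uint32_t)((uint64_t)cycles * 1000000000ULL / hz);
}

// Two cycles (one sbi or cbi instruction) at 16 MHz.
static_assert(cyclesToNs(2, 16000000UL) == 125, "2 cycles at 16 MHz is 125 ns");
```

Dividing the digitalWrite timings quoted above by that figure gives 3,600 / 125 ≈ 29 and 4,800 / 125 ≈ 38.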

