Sniffing the i2C traffic of a NunChuk

March 3, 2009

There are several articles on the web about how to use a NunChuk with an Arduino.  There are also articles about how to get NunChuk data via communication to the WiiRemote.  But what I haven’t found is a sampling of the actual interchange between the NunChuk and WiiRemote during their initialization and use.  I developed this program to help me understand that communication traffic.

The NunChuk and the Remote communicate over i2C at 400 kHz.  I studied the ATmega168 datasheet and couldn’t determine how to use the Arduino’s i2C hardware purely for sniffing, rather than as a master or slave device, so I decided to write my own i2C protocol parsing algorithm.

i2C is a simple communication protocol.  Only three wire connections are needed: SCL, SDA and a ground reference.  SCL is the data clock.  SDA primarily carries data, but it also signals the start and end of a message: SDA falling while SCL is high is a start condition, and SDA rising while SCL is high is a stop condition.  The state relationship is shown in the following diagram.

capture91

There are six fundamental operations: 1) Idle, 2) Start condition, 3) Sample SDA, 4) Process Sampled SDA Bit, 5) Stop condition, and 6) Repeated Start condition.  The sniffer starts in the idle state.  A byte of data is formed by accumulating the first 8 sampled SDA bits, MSB first.  The ninth bit is then processed as the byte acknowledge, with “0” meaning acknowledged and “1” meaning not acknowledged.  The next byte + ack is then processed unless there is either a “stop condition” or a “repeated start” condition.  A message is a collection of bytes that starts with a “start condition” or “repeated start” and ends with a “stop condition” or “repeated start”.  The first byte of a message holds the device address and the read/write bit.
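The operations above can be sketched as a small decoder that is fed one (SCL, SDA) sample at a time.  This is only a host-side illustration in plain C, not the sniffer program itself: the names (i2c_decoder, i2c_decoder_feed, i2c_decoder_send, MAX_BYTES) are made up for this sketch, and it samples SDA on the rising edge of SCL and simply discards a partial byte when a start or stop arrives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BYTES 32u  /* hypothetical capacity for this sketch */

typedef struct {
    int prev_scl, prev_sda;    /* previous sample of each line */
    int in_message;            /* nonzero between a start and a stop */
    int bit_count;             /* data bits collected for the current byte */
    uint8_t shift;             /* bits accumulated so far, MSB first */
    uint8_t bytes[MAX_BYTES];  /* decoded bytes */
    uint8_t acked[MAX_BYTES];  /* 1 = acknowledged (ninth bit was 0) */
    size_t n_bytes;
    int starts, stops;         /* starts include repeated starts */
} i2c_decoder;

void i2c_decoder_init(i2c_decoder *d) {
    assert(d != NULL);
    memset(d, 0, sizeof *d);
    d->prev_scl = 1;  /* idle bus: both lines are high */
    d->prev_sda = 1;
}

/* Feed one simultaneous sample of SCL and SDA (each 0 or 1). */
void i2c_decoder_feed(i2c_decoder *d, int scl, int sda) {
    assert(d != NULL);
    if (d->prev_scl && scl) {
        if (d->prev_sda && !sda) {         /* SDA fell, SCL high: (repeated) start */
            d->starts++;
            d->in_message = 1;
            d->bit_count = 0;
            d->shift = 0;
        } else if (!d->prev_sda && sda) {  /* SDA rose, SCL high: stop */
            d->stops++;
            d->in_message = 0;
        }
    } else if (!d->prev_scl && scl && d->in_message) {
        /* Rising SCL edge: sample SDA. */
        if (d->bit_count < 8) {
            d->shift = (uint8_t)((d->shift << 1) | (sda ? 1 : 0));
            d->bit_count++;
        } else {
            /* Ninth bit: store the byte with its acknowledge flag. */
            if (d->n_bytes < MAX_BYTES) {
                d->bytes[d->n_bytes] = d->shift;
                d->acked[d->n_bytes] = sda ? 0 : 1;
                d->n_bytes++;
            }
            d->bit_count = 0;
            d->shift = 0;
        }
    }
    d->prev_scl = scl;
    d->prev_sda = sda;
}

/* Helper for experiments: clock one byte plus its acknowledge bit in. */
void i2c_decoder_send(i2c_decoder *d, uint8_t value, int ack) {
    for (int bit = 7; bit >= -1; bit--) {
        int sda = (bit >= 0) ? ((value >> bit) & 1) : (ack ? 0 : 1);
        i2c_decoder_feed(d, 0, sda);  /* SCL low: SDA may change */
        i2c_decoder_feed(d, 1, sda);  /* SCL high: SDA is sampled */
    }
}
```

A start is generated by feeding (1, 0) while the bus is idle, and a stop by feeding (0, 0), (1, 0), (1, 1).  The bit that gets sampled just before a stop or repeated start is thrown away when the condition is recognized, which is why sampling and processing are separate operations in the diagram.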

At the i2C data rate, the Arduino USB serial interface is too slow and has too much overhead to transmit the data as it is received.  The bytes therefore have to be stored in Arduino memory, of which there isn’t very much (<<1024 bytes), so it fills rather quickly.  This full condition is checked every time a new data byte has been accumulated.  When “full” is detected, the process shown above is aborted and the program outputs all of the accumulated messages that were received.
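One way to picture the store-then-dump approach is the sketch below.  The storage layout is an assumption (one value byte and one flag byte per captured byte), and the names capture_log, capture_add, capture_format and the CAPTURE_MAX capacity are hypothetical; the real program may pack its data differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CAPTURE_MAX 256u   /* hypothetical capacity for this sketch */
#define FLAG_ACK    0x01u  /* byte was acknowledged */
#define FLAG_FIRST  0x02u  /* byte is the first of a message */

typedef struct {
    uint8_t value[CAPTURE_MAX];
    uint8_t flags[CAPTURE_MAX];
    size_t count;
} capture_log;

void capture_init(capture_log *log) {
    assert(log != NULL);
    memset(log, 0, sizeof *log);
}

/* Store one byte.  Returns 1 once the log is full (time to stop capturing
   and dump), 0 otherwise. */
int capture_add(capture_log *log, uint8_t value, int acked, int first) {
    assert(log != NULL);
    if (log->count < CAPTURE_MAX) {
        log->value[log->count] = value;
        log->flags[log->count] =
            (uint8_t)((acked ? FLAG_ACK : 0u) | (first ? FLAG_FIRST : 0u));
        log->count++;
    }
    return log->count >= CAPTURE_MAX;
}

/* Format the message that begins at index start as "(n):XX+ YY-".
   Returns the index of the first byte of the next message. */
size_t capture_format(const capture_log *log, size_t start,
                      char *out, size_t out_size) {
    assert(log != NULL && out != NULL && start < log->count);
    size_t end = start + 1;
    while (end < log->count && !(log->flags[end] & FLAG_FIRST)) {
        end++;
    }
    int n = snprintf(out, out_size, "(%u):", (unsigned)(end - start));
    for (size_t i = start; i < end && n > 0 && (size_t)n < out_size; i++) {
        n += snprintf(out + n, out_size - (size_t)n, "%s%02X%c",
                      i == start ? "" : " ",
                      (unsigned)log->value[i],
                      (log->flags[i] & FLAG_ACK) ? '+' : '-');
    }
    return end;
}
```

The capture loop would call capture_add for every completed byte and break out as soon as it returns 1; only then, with timing no longer critical, is capture_format called repeatedly to print each message over the serial port.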

Now for some of the messages.  The following stream of messages occurs immediately upon powerup of both the NunChuk + Remote (with comments).  The syntax of each line is the message byte count (e.g. “(2):” for two bytes) followed by each of the message bytes (in hex).  After each byte is either a “+” or “-” signifying acknowledge/not-acknowledge respectively.  The first byte of a message always contains the address of the device (bits 7:1), and bit 0 is either read (=1) or write (=0).  Messages whose first byte is 0xA4 or 0xA5 (7-bit address 0x52, write and read respectively) have been previously identified as NunChuk messages by others on their web sites.  I don’t know what is being addressed by all of the other addresses.  Keep in mind that several of the values in these messages are particular to my NunChuk, and other NunChuks will likely have different values.  I’m assuming at this point that the sequence of messages won’t change with the values, however.  It would be nice to get any confirmation of that assumption.
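As a small aid for reading the listing, the first byte of each message can be split into its 7-bit address and read/write bit as follows.  The function names are made up for this sketch; the arithmetic is just the bit layout described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 7-bit device address carried in bits 7:1 of the first message byte. */
uint8_t i2c_address7(uint8_t first_byte) {
    return (uint8_t)(first_byte >> 1);
}

/* Bit 0 of the first message byte: 1 = read, 0 = write. */
int i2c_is_read(uint8_t first_byte) {
    return first_byte & 1;
}

/* Write a short description such as "0x52 write" into out. */
void i2c_describe(uint8_t first_byte, char *out, size_t out_size) {
    assert(out != NULL && out_size > 0);
    snprintf(out, out_size, "0x%02X %s",
             (unsigned)i2c_address7(first_byte),
             i2c_is_read(first_byte) ? "read" : "write");
}
```

With this, 0xA4 and 0xA5 both map to address 0x52, 0xA0 and 0xA1 to 0x50, and 0xB0 and 0xB1 to 0x58, so the listing below involves three distinct device addresses.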

… looks like a non-NunChuk addressed device
(2):A0+ 00-
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(2):A0+ 00-
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(2):A0+ 00-
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(2):A0+ 00-
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(2):A0+ 00-
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(2):A0+ 00-
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(2):A0+ 00-
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(2):A0+ 02+
(8):A1+ 00+ 00+ 00+ 00+ 00+ 00+ 00+
(2):A0+ 00+
(8):A1+ 1E- F5+ 5A+ 46+ 09+ CE+ 80-
(8):A1+ 2C+ E5+ E2+ F5- A8- D2- 30+
(2):A0+ 00-
(8):A1+ 2E+ CC+ 1A+ 63- 29- 0F- A0+
(8):A1+ FE+ 15+ C2+ B4- 01+ 88+ 3D-
(2):A0+ 00-
(8):A1+ FE- 04- F8+ 54- 09- D3+ 04-
(2):A1+ 80-

… something happening with NunChuk addresses
(3):A4+ F0+ 55+
(3):A4+ FB+ 00+
(2):A4+ FA+
(7):A5+ 00+ 00+ A4+ 20+ 00+ 00-
(3):A4+ F0+ AA+
(8):A4+ 40+ AC+ B9+ 8B+ 35+ 1F+ C8+
(8):A4+ 46+ 28+ BE+ B2+ 89+ 07+ 18+
(6):A4+ 4C+ 19+ FF+ 01+ C6+

… the following looks like the calibration data being read
… this is the parameters for my NunChuk
(2):A4+ 20+
(9):A5+ 3F+ 6B+ 0C+ 67+ BD+ D2+ 5F+ 72-
(9):A5+ 3F+ 6B+ 0C+ 67+ BD+ D2+ 5F+ 72-
(2):A4+ 30+
(9):A5+ 3F+ 6B+ 0C+ 67+ BD+ D2+ 5F+ 72-
(9):A5+ 3F+ 6B+ 0C+ 67+ BD+ D2+ 5F+ 72-

… some more non-NunChuk data
(2):B0+ 60+
(8):B0+ 00+ 08+ 00+ 03- 20+ 40+ 2A-
(3):B0+ 0E+ 00-
(3):B0+ 34+ 8C+
(2):B0+ 66+
(2):B0+ 60+
(2):B0+ 60+
(8):B0+ 00+ 08+ 00+ 03- 20+ 40+ 2A-
(3):B0+ 0E+ 00-
(3):B0+ 34+ 8C+
(2):B0+ 66+
(2):B0+ 60+
(1):B0+
(8):B1+ FE- FD- FB- F7- EF- DF- BF-
(2):B1+ FE-

… the following 5 messages are then repeated, which is the polling of the accelerations and buttons
… of course the NunChuk data varies with changes of position of the NunChuk device
    (2):A4+ 00+
    (7):A5+ B0+ EB+ B3+ 1F+ 00+ B7-

… non-NunChuk messages (part of the 5 repeating messages)
    (1):B0+
    (8):B1+ FE- FD- FB- F7- EF- DF- BF-
    (2):B1+ FE-

————————————————————————————————-

Here is the program.  The “loop” routine is written the way it is to make it possible to process the i2C messages at 400 kHz.  It does not follow normal structured programming rules: there are lots of “goto” statements instead of structured for/while/do loops.  I found that normal structured C code executed too slowly and missed the SDA data.

i2c_sniffer2

The overhead of Arduino Interrupts

March 2, 2009

I wanted to build an application that could sniff i2C transactions.  In particular, I was interested in the actual data transactions between the Wii NunChuk and WiiRemote.  This data interchange uses the i2C protocol at 400 kHz.  I thought this was a good opportunity to use Arduino interrupts, especially interrupts triggered on a pin change.  What I envisioned was a state machine that would perform work on transitions of either SCL or SDA, or so I thought.

What I found was that the overhead of the simple way of attaching interrupts was too great to do anything useful at the 400 kHz data rate.  Switching to edge-sensitive interrupts brought no improvement.  I had to find out what was going on.

I started looking at the actual assembly code generated from the “attachInterrupt” function of the Arduino reference library.  Here’s what I found.

Here’s a simple code example:

#define LED_PIN 5  // digital pin #13 (portb)
#define LED_ON() PORTB |= _BV(LED_PIN)
#define LED_OFF() PORTB &= ~_BV(LED_PIN)

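// interrupt callback: turn the LED off as soon as the rising edge is seen
void myfunc() {
  LED_OFF();
}

void setup() {
  pinMode(13, OUTPUT);
  // interrupt 0 is on digital pin #2
  attachInterrupt(0, myfunc, RISING);
}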

void loop() {
  LED_ON();
  delayMicroseconds(20);
}

I tied the LED output back to digital pin #2, and this is running on a 16MHz Arduino.  Notice that all this program does is turn the LED on; the interrupt, which responds to the rising LED voltage, then turns the LED off.  Here is a picture of the LED “on” strobe:

img_25261

As can be seen, the width of the LED “on” pulse is 3.435usec, which is considerably longer than the 600ns that SCL is high in the i2C protocol!  So, this won’t work for responding to SCL changes.  This is about 55 clock cycles @ 16MHz (3.435usec x 16 cycles per usec)!  What is going on?  Also, what about the 20usec delay in the loop?

img_2527

Notice that the total period for the LED is 26.60usec.  That means the off time for the LED is 23.165usec, which is 3.165usec longer than the delay statement requested.  Let’s now look at what happens under the hood.  To do that we have to look at the assembled instructions produced by compiling the program.

The following is what is generated for the interrupt handler for INT0.  Using the ATmega168 datasheet and its instruction timing information, we can count the number of instruction cycles.  There are 45 cycles before the call to “myfunc”, and 35 cycles between the return from “myfunc” and the return from the interrupt handler.  There are also 3 cycles in the pin synchronizer and another 3 cycles for a JMP instruction (which is in the interrupt vector table).  So, we have a total of 45 + 3 + 3 = 51 cycles before we start to execute “myfunc”.

capture4

Now “myfunc” itself has some delay before it sets the LED off.  As can be seen, the overhead for the “cbi” instruction is about 1 cycle, so we’ve accounted for 52 cycles.  There are another 2 cycles in the “sbi” instruction in the “loop” that actually turns the LED on, and this instruction also blocks the interrupt until after it completes.  Now we’ve accounted for 54 cycles.  The remaining cycle is likely due to the fact that we are setting the LED synchronously with the same clock as the interrupt sampling; this will add another cycle for synchronization.  There we are, all 55 cycles accounted for!
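The accounting above can be double-checked with a few lines of plain C.  The cycle counts are the ones quoted in the text; the table name, the labels and the helper functions are just illustrative choices for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define CPU_MHZ 16u  /* 16 MHz Arduino, as in the measurement */

typedef struct {
    const char *what;
    unsigned cycles;
} budget_item;

/* Cycle counts as given in the text. */
const budget_item isr_budget[] = {
    { "handler code before the call to myfunc", 45u },
    { "pin synchronizer", 3u },
    { "JMP in the interrupt vector table", 3u },
    { "cbi that turns the LED off", 1u },
    { "sbi that turns the LED on", 2u },
    { "synchronization with the sampling clock", 1u },
};
const size_t isr_budget_len = sizeof isr_budget / sizeof isr_budget[0];

/* Sum the cycles of all entries. */
unsigned budget_total(const budget_item *items, size_t n) {
    assert(items != NULL);
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) {
        total += items[i].cycles;
    }
    return total;
}

/* Convert a duration in nanoseconds to CPU cycles, rounded to nearest. */
unsigned long ns_to_cycles(unsigned long ns) {
    return (ns * CPU_MHZ + 500u) / 1000u;
}
```

Calling budget_total(isr_budget, isr_budget_len) gives 55, and ns_to_cycles(3435) for the measured 3.435usec pulse also gives 55, so the counted and measured figures agree.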

capture51

So, the simple-to-use “attachInterrupt” has too much overhead for the application that I would like to build.  I’ll have to find another solution.

I made changes to the program.  I got rid of the generic “attachInterrupt” and manually enabled the interrupt.  I also used the “ISR” construct to declare the interrupt handler.  To compile this I unfortunately had to modify Arduino’s “WInterrupts.c” source (located in the \hardware\cores\arduino folder): I commented out the SIGNAL operations for INT0_vect, as they conflict with the use of the “ISR” statement in my application.

#define LED_PIN 5  // digital pin #13 (portb)
#define LED_ON() PORTB |= _BV(LED_PIN)
#define LED_OFF() PORTB &= ~_BV(LED_PIN)

ISR(INT0_vect) {
  LED_OFF();
}
void setup() {
  pinMode(13, OUTPUT);
  // enable the INT0 interrupt manually
  EICRA = 0x03;  // ISC01:ISC00 = 11, INT0 on rising edge
  EIMSK = 0x01;  // enable only INT0
}

void loop() {
  LED_ON();
  delayMicroseconds(20);
}

The changes to the actual waveform on LED were remarkable:

img_2528

The overhead is now 1.118usec, only about 18 clock cycles.  Unfortunately, this is still too much for me to use interrupts to process i2C messages.  In addition, it may now be obvious that the background interrupts that keep the delay clock alive also add cycles every time one of them occurs.  This extra delay, occurring at random places in your application, could be significant and cause the occasional strange behavior.
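To see why 18 cycles is still too many, it helps to express the i2C timing in CPU cycles.  The sketch below is plain C with made-up helper names; the 600ns SCL high time and the 16MHz clock come from the text, and the 2500ns bit period is simply 1/400 kHz.

```c
#include <assert.h>

#define CPU_MHZ 16ul  /* 16 MHz Arduino */

/* Whole CPU cycles that fit inside a time window given in nanoseconds. */
unsigned long cycles_in_window(unsigned long window_ns) {
    return window_ns * CPU_MHZ / 1000ul;
}

/* Nonzero if a latency of latency_cycles fits inside the window. */
int fits_in_window(unsigned long latency_cycles, unsigned long window_ns) {
    return latency_cycles * 1000ul <= window_ns * CPU_MHZ;
}

/* Self-check with the figures from the text. */
void check_i2c_budget(void) {
    assert(cycles_in_window(600ul) == 9ul);    /* SCL high time */
    assert(cycles_in_window(2500ul) == 40ul);  /* one bit period at 400 kHz */
    assert(!fits_in_window(55ul, 600ul));      /* attachInterrupt version */
    assert(!fits_in_window(18ul, 600ul));      /* hand-written ISR version */
}
```

Only about 9 cycles fit inside the 600ns that SCL is high, and a whole bit period is just 40 cycles, so even the 18-cycle interrupt latency leaves no room to catch an SDA change while SCL is high.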

This is a good example of using the instruction and cycle counting methodology to better understand how to improve critical timing in an application.

cheers!