CO2 reading of -3 constantly

hi all! just set up the DIY Basic kit last night and the co2 sensor only returns -3. looking at AirGradient::getCO2_Raw() in the code, it seems the D1 mini is timing out while waiting for a response from the co2 sensor. any ideas how to fix this?

i will check all my solder joints later. i’m new to electronics, so i was wondering if this could be related to the fact that i accidentally pulled some header pins out of the plastic casing and just stuck them back in while i was resoldering them to the D1 mini (i mistakenly soldered the entire thing upside-down at first).

thanks in advance!

I was able to fix this eventually by simply changing the command in AirGradient::getCO2_Raw() to the example from https://www.driesen-kern.de/downloads/ba-senseair-s8-modbus.pdf (“CO2 read sequence”) and setting the baud rate to 9600.

Interesting that I ran across this: I also had the co2 level read as -3 during one reading (and one only). I have a nice sensible graph over the past 12 hours except for a single dip to -3 (and one odd upward spike, though not a huge one). I wonder what the underlying cause is.

Mine used to work fine, but for the last several days it has been constantly showing -3 :(

I’ve also noticed occasional -3 readings. Seems like not a reading, but some error value. I’ve put in a quick piece of code to toggle IO16 (D0 pin) whenever that CO2 value comes back -3. I’ll use that to trigger a scope and see what’s on the UART at the time.

Interim Update: After collecting data for a few hours, I think I see exactly what’s happening. TL;DR at the end.

I was working on a few leads in parallel. I have a scope connected to the Tx and Rx lines, as well as modified code to toggle IO16 (D0 pin) whenever that CO2 value comes back -3. I’m also logging the serial monitor output to a file with PuTTY. Below is a breakdown as I worked through each of these leads mostly in parallel.

"CO2:-3" = Timeout

Looking at the airgradient.cpp, it was clear where the -3 was coming from. It’s the timeout error code in the AirGradient::getCO2_Raw() function.

  // attempt to read response
  int timeoutCounter = 0;
  while (_SoftSerial_CO2->available() < commandSize) {
      timeoutCounter++;
      if (timeoutCounter > 10) {
        // timeout when reading response
        return -3;
      }
      delay(50);
  }

My first conclusion is that the sketch code could handle this better. Instead of blindly assuming that whatever is returned is the CO2 reading, it could check for negative values, which indicate an error. It could then print the error to the serial monitor, skip updating the CO2 reading, and just wait for the next reading.
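As a rough illustration of that idea (not what the library or the example sketch actually does), the check could look something like the following. The helper names co2ErrorText and acceptCo2Reading are made up for this example. The -2 and -3 codes are the ones visible in the getCO2_Raw() code quoted in this thread, and any other negative value is simply treated as unknown.

```cpp
#include <string>

// Map the return value of getCO2_Raw() to a short description.
// -2 and -3 come from the library code quoted in this thread;
// every other negative value is reported as "unknown error".
std::string co2ErrorText(int raw) {
  if (raw >= 0)  return "ok";
  if (raw == -2) return "failed to write request";
  if (raw == -3) return "timeout when reading response";
  return "unknown error";
}

// Hypothetical helper: only overwrite the last good reading when the
// new value is not an error code. Returns true if lastGood was updated.
bool acceptCo2Reading(int raw, int &lastGood) {
  if (raw < 0) {
    return false;  // keep the previous reading and wait for the next one
  }
  lastGood = raw;
  return true;
}
```

In a sketch, the false branch would be the place to print co2ErrorText(raw) to the serial monitor instead of displaying or uploading the value.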

Moving on, I can see with the logic analyzer inputs of my scope that the sketch calls getCO2_Raw() to read the sensor every ~5 seconds, as expected.

From the code, the timeout loop checks SoftwareSerial’s available() to see whether the expected number of bytes (7) has been received. It checks up to 10 times with 50 milliseconds between checks, so it waits 500 milliseconds in total before hitting the timeout condition. I can confirm this with the scope.

Notice the trigger point at time t=0 and the serial bus request and response approximately 500 ms prior. I can see the corresponding CO2: -3 output in the serial monitor output as well.
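The 500 ms figure can be reproduced by replaying the wait loop with a sensor that never answers. This is only a host-side model of the loop quoted above: worstCaseWaitMs is a made-up name, and the delay is counted rather than actually waited.

```cpp
// Replays the wait loop from getCO2_Raw() for the case where available()
// never reaches the expected byte count, and returns the total time that
// would be spent in delay(), in milliseconds.
int worstCaseWaitMs(int maxChecks = 10, int delayMs = 50) {
  int waitedMs = 0;
  int timeoutCounter = 0;
  while (true) {                       // stands in for available() < commandSize
    timeoutCounter++;
    if (timeoutCounter > maxChecks) {
      return waitedMs;                 // the library returns -3 at this point
    }
    waitedMs += delayMs;               // stands in for delay(50)
  }
}
```

With the defaults, the counter reaches 11 on the eleventh pass, after ten 50 ms delays, which is where the 500 ms comes from.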

The scope decodes the request and response packets, and they look complete (7 bytes in each direction) and valid (the value seems reasonable and the CRC checks out), so it should not have timed out.

Undocumented Function Code

I was looking at the data being sent to the S8 sensor and couldn’t make sense of it. According to the specification (https://rmtplusstoragesenseair.blob.core.windows.net/docs/Dev/publicerat/TDE2067.pdf), the proper request should be a 1-byte address, followed by the ModBus PDU, followed by a 2-byte CRC. The ModBus PDU for reading CO2 data uses function code 0x04. It consists of the 1-byte function code (0x04), 2 bytes for the IR starting address (the register to read is IR4 at location 0x0003) and a 2-byte read quantity (0x0001 for a single register).
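For anyone who wants to see how such a request is put together, here is a minimal sketch that assembles the 8 bytes described above and appends the CRC. The CRC routine is the standard Modbus RTU CRC-16. buildReadInputRegisters is a name invented for this example and is not part of the AirGradient library.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Standard Modbus RTU CRC-16: initial value 0xFFFF, reflected polynomial 0xA001.
uint16_t modbusCrc16(const uint8_t *data, std::size_t len) {
  uint16_t crc = 0xFFFF;
  for (std::size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      const bool lsbSet = (crc & 1) != 0;
      crc = static_cast<uint16_t>(crc >> 1);
      if (lsbSet) crc ^= 0xA001;
    }
  }
  return crc;
}

// Hypothetical helper: address + function 0x04 + start register + quantity + CRC.
std::array<uint8_t, 8> buildReadInputRegisters(uint8_t address,
                                               uint16_t startRegister,
                                               uint16_t quantity) {
  std::array<uint8_t, 8> frame = {
      address,
      0x04,
      static_cast<uint8_t>(startRegister >> 8),
      static_cast<uint8_t>(startRegister & 0xFF),
      static_cast<uint8_t>(quantity >> 8),
      static_cast<uint8_t>(quantity & 0xFF),
      0,
      0};
  const uint16_t crc = modbusCrc16(frame.data(), 6);
  frame[6] = static_cast<uint8_t>(crc & 0xFF);  // CRC low byte is sent first
  frame[7] = static_cast<uint8_t>(crc >> 8);    // CRC high byte is sent last
  return frame;
}
```

Calling buildReadInputRegisters(0xFE, 0x0003, 0x0001) gives FE 04 00 03 00 01 D5 C5, the same bytes as the modified CO2Command used further down in this thread.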


But the AirGradient library sends the following request:

0xFE 44 00 08 02 9F 25

That’s only 5 bytes + 2-byte CRC = 7 bytes instead of the expected 8. And it seems this may be function code 0x44 and I’m guessing location at 0x0008 and reading 2 bytes?

A typical response is:

0xFE 44 02 05 E3 FA 3D, which I think is a CO2 reading of 0x05E3 = 1507

So it seems to be working and reading something that is within expectations.
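The arithmetic behind that reading, written out. This assumes the 7-byte response is laid out as address, function code, byte count, data high byte, data low byte, then the two CRC bytes, which is the standard ModBus layout. Whether the undocumented 0x44 response really follows it is a guess, as noted above. co2FromResponse is a made-up name.

```cpp
#include <cstdint>

// Combine the two data bytes of a 7-byte response into a ppm value.
// Assumed layout: [address][function][byte count][data hi][data lo][crc lo][crc hi]
int co2FromResponse(const uint8_t response[7]) {
  return response[3] * 256 + response[4];
}
```

For the response above, that is 0x05 * 256 + 0xE3 = 1280 + 227 = 1507.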

@AirGradientBlog : Any idea why we’re sending an undocumented request to the S8? Is there some documentation you have to support this? I know it has been discussed on another thread here on the forums: https://forum.airgradient.com/t/where-can-i-find-the-s8-documentation/292/5

The latest update from @dreamdevil is that Senseair stated that 0x44 should not be used.

So I thought it was worth changing the request PDU to use the documented function code, and I modified the code to send the new 8-byte request from the example in the specification document. The code needed modifications in a few areas because the expected response is still 7 bytes.

  while(_SoftSerial_CO2->available())  // flush whatever we might have
      _SoftSerial_CO2->read();

  //const byte CO2Command[] = {0xFE, 0X44, 0X00, 0X08, 0X02, 0X9F, 0X25};
  const byte CO2Command[] = {0XFE, 0X04, 0X00, 0X03, 0X00, 0X01, 0XD5, 0XC5}; //KEN
  byte CO2Response[] = {0,0,0,0,0,0,0};
 
  // tt
  int datapos = -1;
  //

  //const int commandSize = 7;
  const int commandSize = 8; //KEN

  int numberOfBytesWritten = _SoftSerial_CO2->write(CO2Command, commandSize);

  if (numberOfBytesWritten != commandSize) {
    // failed to write request
    return -2;
  }

  // attempt to read response
  int timeoutCounter = 0;
  //while (_SoftSerial_CO2->available() < commandSize) {
  while (_SoftSerial_CO2->available() < (commandSize-1)) {   //KEN
      timeoutCounter++;
      if (timeoutCounter > 10) {
        // timeout when reading response
        return -3;
      }
      delay(50);
  }

  // we have 7 bytes ready to be read
  //for (int i=0; i < commandSize; i++) {
  for (int i=0; i < (commandSize-1); i++) { //KEN
    CO2Response[i] = _SoftSerial_CO2->read();

    // tt
    if ((CO2Response[i] == 0xFE) && (datapos == -1)) {
      datapos = i;
    }
    Serial.print(CO2Response[i], HEX);
    Serial.print(":");
    //
  }

With this new code in place, I verified I was still receiving reasonable CO2 readings from the sensor. And so I let it run for a few hours and the scope still triggered on the timeout condition.

ModBus CRC & the last byte issue
Throughout all of this work, something stood out to me immediately from the very beginning: every time my scope triggers on a timeout condition, the decoded response from the S8 sensor has a CRC MSByte of 0xFF! And since we’re in little endian mode, this will be the last byte of the response.

At first, I thought it could be an error condition from the S8 sensor, but there was no documentation to support this. Looking closer, the CO2 reading in the response data looks reasonable. So, I captured a bunch more data and saw that there are a few different readings that also end with 0xFF. I calculated the ModBus CRC myself to verify that indeed the CRC is correct. In the last example above, the 0x05E5 is a reading of 1509.

So how can this possibly cause the timeout condition when all the code does is wait for the correct number of bytes to be sitting in the SoftwareSerial buffer? The code is simple enough that I am confident it is 100% rock-solid.

This made me very suspicious of SoftwareSerial. With my scope watching the bus for >12 hours, I have captured every single timeout event (the hardware counter matches my PuTTY log file, so I know I didn’t miss any), and I am confident that all 7 bytes were put onto the serial bus by the sensor each and every time. Yet each and every time the timeout occurred, the last byte on the bus was 0xFF. The only way this condition can be triggered is if SoftwareSerial’s available() doesn’t return a number greater than or equal to 7. This must be a bug.

A Google search turned up this gem: SoftwareSerial fails to deliver last byte of Nextion End-Of-Packet until more data received · Issue #226 · plerup/espsoftwareserial · GitHub. This is an issue report on the SoftwareSerial library’s GitHub. The user was reporting the exact same behavior we’re seeing here: the last byte of their packet is 0xFF and they never get it, but it does show up at the beginning of the next packet. In our case, we flush the receive buffer before each request, so there are no left-over bytes and the late byte simply gets lost.

The SoftwareSerial maintainers have addressed this bug and fixed it in release 6.15.1 in November 2021. I have been running various mish-mash versions of the Arduino Core and the SoftwareSerial library, but most recently I decided to start with a clean slate with stock Arduino Core 3.0.2, which uses SoftwareSerial 6.12.7, which predates the fix for this bug.

I’ve since updated to the latest Arduino Core 3.1.1, which includes SoftwareSerial v7, which causes Exception 0 crashes. I then manually replaced SoftwareSerial with the prior v6.17.1, which should have the fix for the last byte 0xFF issue, but not have the breaking-change for exception 0 crashes. I have now run this for about 2 hours and don’t see any timeouts yet. So far, in the log file, I see 3 occurrences of receiving CO2 response packets that end in 0xFF whereas previous logs have never seen this condition because when it occurs, it returns -3 instead of the packet data.

The next release of the Arduino Core reverts to SoftwareSerial v6.17.1, so when that is released, we should have a fix without having to manually replace SoftwareSerial.

TL;DR
If you are running Arduino Core v3.1.x, you will have SoftwareSerial v7 and you will encounter Exception 0 crashes. If you downgraded to Arduino Core v3.0.2, you will have SoftwareSerial v6.12.7 and you will encounter false serial timeout events. I recommend running SoftwareSerial v6.17.1 to avoid both issues.
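To put that recommendation in checkable form, here is a tiny helper that encodes only what has been reported in this thread: the last-byte fix landed in 6.15.1, and the v7 shipped with Arduino Core 3.1.1 crashes. softwareSerialLooksSafe is a hypothetical name, and treating every 7.x release as unsafe is an assumption, since only the one bundled with Core 3.1.1 was tried here.

```cpp
#include <tuple>

// Hypothetical helper based only on the versions discussed in this thread:
// 6.15.1 and later contain the last-byte fix, and 7.x is treated as unsafe
// because the v7 bundled with Arduino Core 3.1.1 caused Exception 0 crashes.
bool softwareSerialLooksSafe(int major, int minor, int patch) {
  const auto version = std::make_tuple(major, minor, patch);
  return version >= std::make_tuple(6, 15, 1) && major < 7;
}
```

By this rule 6.12.7 fails the check, 6.17.1 passes, and anything in the 7.x line fails.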

I’ll let this run for a few more hours to confirm it’s solid, but this feels like the fix.

Okay, my testing is now at the 9-hour mark with zero -3 timeouts and a total of 52 occurrences where the reading came back with a 0xFF as the last byte. I’m confident this is the root cause and the fix.

@ken830 as usual, many many thanks for digging into this!

I actually do not remember why we use this request and I will update the library on the next update. I also want to integrate the code for slowing down the i2c frequency.

thanks all for looking into this. in case it helps this is what i had to change to get it to work: mods · diracdeltas/airgradient@21e8825 · GitHub.

i’ve been running this for weeks now without the -3 error.

@Achim_AirGradient I just noticed on your github, your latest release is 2.3.0 from Dec 15, 2022, but in the Arduino IDE, the latest is 2.2.0. I don’t know how the Arduino library works with github or if there is something you need to do to push over a new release. I realize I have been working with 2.2.0 this whole time.

I’ve just forked the repository and made the changes in getCO2_Raw() :

// <<>>
int AirGradient::getCO2_Raw() {

  while(_SoftSerial_CO2->available())  // flush whatever we might have
      _SoftSerial_CO2->read();

  const byte CO2Command[] = {0XFE, 0X04, 0X00, 0X03, 0X00, 0X01, 0XD5, 0XC5};
  byte CO2Response[] = {0,0,0,0,0,0,0};

  // tt
  int datapos = -1;
  //

  const int commandSize = 8;
  const int responseSize = 7;

  int numberOfBytesWritten = _SoftSerial_CO2->write(CO2Command, commandSize);

  if (numberOfBytesWritten != commandSize) {
    // failed to write request
    return -2;
  }

  // attempt to read response
  int timeoutCounter = 0;
  while (_SoftSerial_CO2->available() < responseSize) {
      timeoutCounter++;
      if (timeoutCounter > 10) {
        // timeout when reading response
        return -3;
      }
      delay(50);
  }

  // we have 7 bytes ready to be read
  for (int i=0; i < responseSize; i++) {
    CO2Response[i] = _SoftSerial_CO2->read();

    // tt
    if ((CO2Response[i] == 0xFE) && (datapos == -1)) {
      datapos = i;
    }
    Serial.print(CO2Response[i], HEX);
    Serial.print(":");
    //
  }
  // return CO2Response[3]*256 + CO2Response[4];
  // tt
  return CO2Response[datapos + 3]*256 + CO2Response[datapos + 4];
  //

}

I’ve tested this new function in my code and I don’t think it affects anything else. But because mine is v2.2.0 from the Arduino library and not v2.3.0 from GitHub, there is an extremely low probability that it could break something I haven’t tested with my setup.

I’ve added responseSize = 7; to separate the difference in size between the PDU request and the PDU response.

I’ll put in a pull-request for this and maybe for the i2c bus speed change. For that one, I’m still trying to see if I have the time and motivation to chase down the root cause for the speed discrepancy between what is set and what is actually happening. I suspect another issue with some code in some library somewhere. But that is low-priority for me as i2c speeds just aren’t that important, as long as it’s not too fast.

@azuki : From what I can tell, changing the PDU and function code doesn’t resolve the issue, because the root cause is a bug in older versions (<6.15.1) of the SoftwareSerial library. However, it does change which specific CO2 readings return -3, because the CRC is calculated over different bytes, so you may have seen the error more or less often after making the change.

For instance, a CO2 reading of 869 will result in the S8 responding with: 0xFE 04 02 03 65 6D FF. The CRC is 0xFF6D, so the resulting last byte will be 0xFF and cause the -3 timeout issue. The same is true for a reading of 1892, resulting in a response of 0xFE 04 02 07 64 AE FF. In the original PDU/function code, however, a reading of 869 will result in 0xFE 44 02 03 65 78 3F which will not result in a -3 timeout, but a different CO2 value will.
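To check other readings, the same calculation can be done in a few lines. This is a minimal sketch using the standard Modbus RTU CRC-16. It assumes the response is always FE, the function code, a byte count of 02, and the two data bytes, as in the examples above. responseEndsWithFF is a name made up for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Standard Modbus RTU CRC-16: initial value 0xFFFF, reflected polynomial 0xA001.
uint16_t modbusCrc16(const uint8_t *data, std::size_t len) {
  uint16_t crc = 0xFFFF;
  for (std::size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      const bool lsbSet = (crc & 1) != 0;
      crc = static_cast<uint16_t>(crc >> 1);
      if (lsbSet) crc ^= 0xA001;
    }
  }
  return crc;
}

// Hypothetical helper: true when the 7-byte response carrying `ppm` would
// end in 0xFF, i.e. when the high byte of its CRC (sent last) is 0xFF.
bool responseEndsWithFF(uint8_t functionCode, uint16_t ppm) {
  const uint8_t body[5] = {0xFE, functionCode, 0x02,
                           static_cast<uint8_t>(ppm >> 8),
                           static_cast<uint8_t>(ppm & 0xFF)};
  return (modbusCrc16(body, 5) >> 8) == 0xFF;
}
```

responseEndsWithFF(0x04, 869) and responseEndsWithFF(0x04, 1892) are both true, while responseEndsWithFF(0x44, 869) is false, matching the three responses above. Looping ppm over the range you normally see would list every reading that triggers the -3 on an affected SoftwareSerial version.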

This change, however, should be better because it’s using the documented features of the device by the manufacturer.

By the way, in your code, you missed one line that needed to be changed on line 696:

 for (int i=0; i < commandSize; i++) {

should be :

 for (int i=0; i < commandSize-1; i++) {

or

 for (int i=0; i < 7; i++) {

Right now, your commandSize is 8 and that means it will read 8 bytes from the buffer when there will only be 7 bytes present. That won’t fail directly, but the last byte will read as 0xFF and the serial monitor printout will reflect this error’s artifact. Anyway, I think it’s better if you define a separate responseSize = 7 like I have in my code.
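The 0xFF comes from the way read() signals an empty buffer: Arduino’s Stream::read() returns -1 when no data is available, and storing -1 in a byte gives 0xFF. A host-side illustration, where readFromEmptyBuffer only stands in for a read() call on an empty receive buffer and both names are made up:

```cpp
#include <cstdint>

// Stand-in for read() on an empty receive buffer:
// Arduino's Stream::read() returns -1 when there is nothing to read.
int readFromEmptyBuffer() {
  return -1;
}

// Mirrors "CO2Response[i] = _SoftSerial_CO2->read();", where the array
// element is a byte, so the int result is truncated to 8 bits.
uint8_t storeAsByte(int readResult) {
  return static_cast<uint8_t>(readResult);
}
```

storeAsByte(readFromEmptyBuffer()) is 0xFF, which is the extra "FF" that shows up at the end of the serial monitor printout.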

I have tested these changes and also did not see any negative impact, so I updated the Arduino AirGradient Library with the above changes and the u8g2 frequency change. Version 2.4.0 should be available in the next 24 hours (Arduino syncs the libraries with our GitHub at least once per day).

By the way, the tag for 2.3.0 was not updated in the Arduino library properties so it was not pulled.

I see 2.4.0 in the Arduino library (still no 2.3.0). I updated and started testing about an hour ago. So far, so good. I can confirm the new function code is being sent to the S8.

I have this problem too, and switching the Wemos for another one I had lying around with a similar or same firmware for Airgradient makes the CO2 sensor work and not display -3. With the Wemos that doesn’t display the CO2, I tried updating the firmware to the latest but it still displays -3. I’m assuming the newer version broke this somehow?

I have the exact same problem and, to be honest, I have read the whole thread and still have no idea how to solve it. Could anyone tell me how to fix the problem and what exactly I need to change in the DIY example in the Arduino IDE to get it to work? (I’m very new to coding, so please help.)

Exact same problem as whom? Do you have constant -3 or occasional -3 reading?

Can you downgrade the AirGradient Arduino library to a previous version, e.g. 2.2.0 and see if that makes a difference?

Constant reading of -3

I see you’ve written the code that I need to input into the sketch but where and how would I do this?

It did not, but it turned out to be just a fried board or something. I tried the sensor with the old-firmware board that I knew worked, and the sensor worked. Then I tried it with a brand new board I had with the latest firmware, which also worked. I’ve also confirmed that I soldered everything on the board where the CO2 sensor doesn’t work, and all the solder joints look good.

This is like the 4th board I’ve supposedly fried/broke, these seem super easy to break. Everything is working now for me.

One question, when I was testing the board with other firmware versions and such, it always connected back to my central WiFi network without me having to enter the password which I thought was strange because it should have wiped everything on it when I reflashed the firmware.

Please post some pictures of your build. The -3 can also happen if the CO2 sensor is connected incorrectly.