AirGradient Pro with ESPHome, display keeps turning back on

Hi @ken830, thanks a lot for your work on the SoftwareSerial bug. I read it with great interest.

When I compile with verbose I see framework-arduinoespressif8266 @ 3.30002.0 (3.0.2). I think v3.0.2 contains the SoftwareSerial version before the bug, correct?

So for the particular issue I was commenting on it’s probably something else, just this once. But I think there are multiple problems around that look similar in symptoms (random reboots). This time mine must be caused by an out of memory exception from the graphs+fonts+mqtt+logger, I believe. I’ve been trying to slim down my esphome yaml for a while, glad @MallocArray found one more thing to remove!
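One way to check the out-of-memory suspicion is to publish the free heap and watch whether it trends down before a reboot. The sketch below uses ESPHome's debug component; the sensor names and the 5s interval are placeholders I picked, not something from the configs in this thread, and the component itself costs a little memory.

```yaml
# Sketch only: names and interval are illustrative placeholders.
debug:
  update_interval: 5s

sensor:
  - platform: debug
    free:
      name: "Heap Free"            # bytes of free heap
    fragmentation:
      name: "Heap Fragmentation"   # ESP8266 only
    block:
      name: "Heap Max Block"       # largest allocatable block
```

If the free heap or the largest block shrinks steadily in the hours before a restart, that points toward memory rather than SoftwareSerial.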

I’m not sure how you can confirm the version, but if it really is using Arduino Core 3.0.2, it is including SoftwareSerial 6.12.7, which predates the fix for this bug. If you install the latest Arduino Core 3.1.1, then it will use SoftwareSerial 7.x.x, which will have the fix for the bug, but will introduce the breaking change that causes exception 0 crashes. That means, as of now, you cannot have a working version of SoftwareSerial without manually replacing it with a version that sits in between those two. Ideally, you should use the latest pre-v7 version, which is SoftwareSerial v6.17.1.

Upcoming release of the Arduino core will revert SoftwareSerial to <7.x.x, and I believe it’s going to be v6.17.1. But until then, there’s no other way except to manually put in v6.17.1.

EDIT: I just want everyone to understand that no matter what version of Arduino Core you use as of today, you will have one of the two problems with SoftwareSerial. For the default AirGradient code, the bug just causes a “-3” reading, but no crashing. For ESPHome, I don’t know how the code behaves. Before you waste a lot of time chasing down “unknowns,” you should fix all the “knowns”. If you notice improvements after changing update rates or making similar changes, know that it could easily be related to this bug: it happens when reading from the sensor, so the more often you read, the more likely you’ll get a response that ends in 0xFF. That’s just statistics. It also depends on the environment you’re in. Since the last byte from the S8 is part of a checksum, there are certain specific CO2 levels that will always result in the last byte being 0xFF. This means you can have a perfectly working system for many, many weeks and then suddenly see it fail frequently because the physical environment is different and the CO2 levels happen to be around one of the values that causes the last byte of the checksum to be 0xFF.
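For ESPHome users, the "update rate" in question is the update_interval on the S8 sensor. A minimal sketch of where that lives, assuming the S8 is wired to D4/D3 as on typical AirGradient boards (the pins, the uart id, and the 60s value are my assumptions, so check them against your own config):

```yaml
# Sketch only: pins, id, and interval are assumptions.
uart:
  - id: uart_s8
    rx_pin: D4
    tx_pin: D3
    baud_rate: 9600

sensor:
  - platform: senseair
    uart_id: uart_s8
    co2:
      name: "CO2"
    update_interval: 60s   # fewer reads means fewer chances of a 0xFF last byte
```

A longer interval only makes the symptom rarer. It does not fix the underlying SoftwareSerial bug.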

@ken830 Thanks! I understand now. That is a great find and very important. Thanks for pointing it out in such explicit terms to me (us)! It’s much appreciated.

@ken830 Thank you for this clear writeup about the situation. I had been confused about where we stand, but this helped clear it up and hopefully a permanent resolution is available soon.

For what it is worth, with ESPHome 2023.2.2 and only setting it up for the D1 mini:

esphome:
  name: airgradient-basement
  platform: ESP8266
  board: d1_mini

It looks like mine is using version 3.2.0 which is different from what argafal had mentioned.

INFO Reading configuration /config/airgradient-basement.yaml...
INFO Generating C++ source...
INFO Compiling app...
Processing airgradient-basement (board: d1_mini; framework: arduino; platform: platformio/espressif8266 @ 3.2.0)

Okay… I see the confusion… ESPHome is compiled with PlatformIO, right? For PlatformIO, you’re not selecting the Arduino core directly; it is included as a dependency of the platform. In turn, the version of SoftwareSerial is included in the Arduino core. As I highlighted in my post a month back, these are the last four versions of the PlatformIO espressif8266 platform that I have tested and the corresponding version of the Arduino 8266 core that each includes:

PlatformIO platform-espressif8266:

  • v4.1.0 (2023-01-16), Arduino Core v3.1.0, SoftwareSerial v7.0.0 <== Exception 0 Crashes
  • v4.0.1 (2022-01-01), Arduino Core v3.0.2, SoftwareSerial v6.12.7 <== Last-byte 0xFF Bug
  • v4.0.0 (2022-05-31), Arduino Core v3.0.2, SoftwareSerial v6.12.7 <== Last-byte 0xFF Bug
  • v3.2.0 (2021-08-13), Arduino Core v3.0.2, SoftwareSerial v6.12.7 <== Last-byte 0xFF Bug

Hope this helps. If you are using the latest version of the PlatformIO ESP8266 platform, you will see crashes. All previous versions have a much-too-old SoftwareSerial, and you will be hit with the last-byte 0xFF bug. No way around it without directly replacing SoftwareSerial. For now.

EDIT: Fixed typos.

Oh… It’s not difficult to replace the SoftwareSerial version. On Windows, you should find it here:

C:\Users\Ken\.platformio\packages\framework-arduinoespressif8266\libraries\SoftwareSerial

The user name in the path (Ken here) needs to be changed to your own user’s directory.

You can download the ESP SoftwareSerial v6.17.1 (or any version) here:
https://github.com/plerup/espsoftwareserial/releases

Just open the zip file and you will see that you can simply replace your existing directory with the contents of the zip file. You will probably need to restart VS Code (or whatever editor you use) to have a good chance of clearing any cached version of the libraries.

I’m finally trying a different SoftwareSerial in ESPHome. It’s not too hard to configure.
I’m staying on the latest Arduino framework supported by ESPHome because newer ones gave me Wi-Fi compilation issues.

esphome:
  name: ${devicename}
  libraries:
    - uart=https://github.com/plerup/espsoftwareserial.git#6.17.1

esp8266:
  board: d1_mini
  framework:
    version: latest
    platform_version: 4.1.0

Nice! Even easier! Is this with framework-arduinoespressif8266 3.1.1 as well? At least you’ll know you’ve eliminated a few known issues now. Hope your testing goes well.

Uptime is touching 20 hours now, but I’ve had 48 hours before and it eventually restarted, so let’s wait.

Latest (and also recommended) pulls version 3.0.2 now. I tested with 3.1.0/3.1.1, but they changed dhcpsoftap and then ESPHome breaks.

I dug a little more into how the dependencies work. Each Arduino framework release bundles the SoftwareSerial release that was current when that framework version came out. So the issue on ESPHome could be mitigated just by pulling a different SoftwareSerial version, I guess. I was using 6.12.7 before (Arduino 3.0.2), and that one crashes; now I’m on 6.17.1. I never used 7.0.0 because I have never used a newer Arduino framework version.
In the future, updating the Arduino framework (once they revert to the older SoftwareSerial) should work if ESPHome supports it. That will probably take a long time, so meanwhile we can only pull a different SoftwareSerial to mitigate the issue(s) on ESPHome.

Nope, still irregular restarts after 20, 46 and 28 hours. So nothing changed for me using SoftwareSerial 6.17.1.

I’ll test again when a new Arduino release with the changed SoftwareSerial comes out.

If it still restarts, then it’s likely not related to (or not 100% caused by) SoftwareSerial. That’s good to know. It eliminates a big piece.

I tried adding the same libraries lines to mine, and I haven’t had reboots on my AG Pro where I originally found this issue. But now that I’m watching uptimes, I’m also not seeing reboots on my original AirGradient DIY or the IKEA monitor with an SGP40 and D1. It seems like the whole house starts rebooting or being stable around the same times, so I’m wondering if it is related to the data returned, such as when I hit certain temperatures or TVOC levels or something.

So not certain it solved anything for me at the moment, since my untouched configs are stable right now as well, but I’ll keep monitoring.

I follow this thread with great interest. I had similar problems. Most recently I had an uptime of 66 hours. I posted my esphome file in this thread: Esphome with graphs - #5 by argafal

If you try it I’d be most curious to hear if/how it works for you.

@MallocArray You make an interesting observation here. I follow this thread since I have (had?) a similar issue with my own esphome yaml. My random reboots have also disappeared the last few days. I cannot understand why. I guess I should simply be happy. :wink:

On the off chance that some external condition is a contributing factor, I will document what I observed below:

  • In a post on Feb 28 I published an esphome yaml with graphs: Esphome with graphs - #4 by argafal
  • I commented that it worked well for me but randomly reboots every 18-24 hours. My only local change to the published file: I ran with mqtt enabled.
  • A day or two later, I added an uptime counter as an additional sensor.
  • My random reboots have disappeared.
  • In my post today Mar 5 (Esphome with graphs - #5 by argafal) I posted the most recent configuration I use. Given that I always had MQTT enabled, it is identical to the Feb 28 one except for the additional uptime counter (and some updated comments).
  • I do not use a TVOC sensor, I do not have one.
  • I looked at my graphs for temperature, humidity, particle density, and CO2 for the last two weeks. I do not see any obvious change in trend.

@MallocArray Did you always have an uptime sensor in your definition or did you recently add it too, by any chance?

I did not always have an uptime sensor. My SGP40 for TVOC needs 12 hours to self-calibrate and I couldn’t get that because of the reboots, which is when I started tracking uptime, but only on my AG Pro. After fiddling with it for months, I added uptime sensors to my other two (AG DIY and IKEA) and discovered they were rebooting too, even the IKEA with a different PMS sensor and not an AirGradient board at all.
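For anyone who wants to track reboots the same way, an uptime sensor in ESPHome is only a few lines. This is a minimal sketch; the name and interval are placeholders, not copied from any config in this thread.

```yaml
# Sketch only: name and interval are placeholders.
sensor:
  - platform: uptime
    name: "Uptime"
    update_interval: 60s
```

Each time the graph drops back to zero, the device has restarted, which makes it easy to line up reboots against temperature, CO2, or TVOC readings.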

I opened an issue on the ESPHome GitHub and they asked for logs, so I moved my AG Pro to my ESPHome server to collect decrypted logs, and then everything went stable for me. I also had a power outage in my house that rebooted everything, so I wasn’t sure whether I had been having a network issue, but after a few days everything started rebooting again. That is what leads me to think it may be a particular reading, as Ken mentioned, with certain ending conditions causing the serial library to force a reboot.

My random reboots are back now. It was nice while it lasted :smiley:

I’ll be curious to try the new PCB and see if that gives any improvements. With my PCB v3.3 I had to change the I2C frequency depending on the MCU (D1 vs C3), otherwise the display doesn’t work. I’m also still getting occasional error messages from the particle sensor and from the CO2 sensor. I have changed the version of SoftwareSerial already.
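For reference, the I2C frequency is set on the i2c bus in the ESPHome YAML. A minimal sketch for a D1 mini, assuming the usual D2/D1 pins; the 100kHz value is only a placeholder, not the value that worked on any particular board here, and a C3 would need its own pin numbers:

```yaml
# Sketch only: pins are the D1 mini defaults, frequency is a placeholder.
i2c:
  sda: D2
  scl: D1
  frequency: 100kHz
```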

How is it going for you, @MallocArray ?

About the same. I’m not getting reboots as frequently, but they do happen sometimes. I also see the unmodified ESPHome install with similar stability, so I don’t know what to think.

Same here. Occasional reboots every other day. It would be best to find out where each crash is coming from, but that requires logging everything from serial and possibly removing and re-adding every single sensor to narrow down the possibilities. At this moment it isn’t really convenient for me to do that, and I don’t have a second board to test with. I don’t really want to stop logging data for too long.

Another way to look into these issues is to look here: GitHub - nkitanov/iaq_board: IAQ Board is a DIY (Do-It-Yourself) device for measuring internal air quality
It looks like most of the hardware is the same, and they also encountered the sgp31 issues.

Or just put all the sensors on a breadboard and see if the same issues arise after a while.
If someone has a good test strategy, we can try to collectively find the main issue(s).

It’s either an individual, hardware-specific issue (not likely, since there are multiple of you), or there is an issue with ESPHome. I say that because I’ve been running mostly-stock AirGradient code with OLED, SHT, TVOC, S8, PMS5003 on a modified v3.3 PCB with OLED on 5V and SHT on 3.3V, I2C at 100kHz, Arduino Core 3.1.0, SoftwareSerial v6.12.7, and it has been running non-stop for 17 days, 21 hours, and 9 minutes with zero reboots and zero sensor dropouts/timeouts. Rock solid.