@ken830 Thanks! I understand now. That is a great find and very important. Thanks for pointing it out in such explicit terms to me (us)! It’s much appreciated.
@ken830 Thank you for this clear writeup about the situation. I had been confused about where we stand, but this helped clear it up and hopefully a permanent resolution is available soon.
For what it is worth, this is with ESPHome 2023.2.2, only setting it up for the D1 mini:
esphome:
  name: airgradient-basement
  platform: ESP8266
  board: d1_mini
It looks like mine is using version 3.2.0 which is different from what argafal had mentioned.
INFO Reading configuration /config/airgradient-basement.yaml...
INFO Generating C++ source...
INFO Compiling app...
Processing airgradient-basement (board: d1_mini; framework: arduino; platform: platformio/espressif8266 @ 3.2.0)
Okay… I see the confusion… ESPHome is compiled with PlatformIO, right? With PlatformIO, you’re not selecting the Arduino core directly – it is included as a dependency. And in turn, the version of SoftwareSerial is included in the Arduino core. As I highlighted in my post a month back, these are the last four versions of the PlatformIO espressif8266 platform that I have tested and the corresponding version of the Arduino ESP8266 core that each includes:
PlatformIO platform-espressif8266:
- v4.1.0 (2023-01-16), Arduino Core v3.1.0, SoftwareSerial v7.0.0 <== Exception 0 Crashes
- v4.0.1 (2022-01-01), Arduino Core v3.0.2, SoftwareSerial v6.12.7 <== Last-byte 0xFF Bug
- v4.0.0 (2022-05-31), Arduino Core v3.0.2, SoftwareSerial v6.12.7 <== Last-byte 0xFF Bug
- v3.2.0 (2021-08-13), Arduino Core v3.0.2, SoftwareSerial v6.12.7 <== Last-byte 0xFF Bug
Hope this helps. If you are using the latest version of PlatformIO ESP8266 library, then you will see crashes. And all previous versions will have a much-too-old SoftwareSerial and you will be hit with the last-byte 0xFF bug. No way around it without directly replacing SoftwareSerial. For now.
EDIT: Fixed typos.
Oh… It’s not difficult to replace the SoftwareSerial version. On Windows, you should find it here:
C:\Users\Ken\.platformio\packages\framework-arduinoespressif8266\libraries\SoftwareSerial
The user name part of the path (Ken in my case) needs to be changed to your specific user’s directory.
You can download the ESP SoftwareSerial v6.17.1 (or any version) here:
https://github.com/plerup/espsoftwareserial/releases
Just open the zip file and you will see that you can simply replace your existing directory with the contents of the zip file. You will probably need to restart VS Code (or whatever you use) to have a good chance of clearing any cached version of the libraries.
I’m finally trying a different SoftwareSerial in ESPHome. It’s not too hard to configure.
I’m staying on the latest Arduino framework that ESPHome supports, because there were WiFi compilation issues otherwise.
esphome:
  name: ${devicename}
  libraries:
    - uart=https://github.com/plerup/espsoftwareserial.git#6.17.1
esp8266:
  board: d1_mini
  framework:
    version: latest
    platform_version: 4.1.0
Nice! Even easier! Is this with framework-arduinoespressif8266 3.1.1 as well? At least you’ll know you’ve eliminated a few known issues now. Hope your testing goes well.
Touching 20 hours now, but I’ve had 48 hours before and it eventually restarted, so let’s wait.
Latest (and also recommended) pulls version 3.0.2 now. I tested with 3.1.0/3.1.1, but they changed dhcpsoftap and then ESPHome breaks.
I dug a little more into how the dependencies work. Each Arduino framework version bundles the SoftwareSerial release that was current when that framework version was released. So the issue on ESPHome could be mitigated by only pulling a different SoftwareSerial version, I guess. I was using 6.12.7 before (Arduino 3.0.2), and that one crashes; now I’m on 6.17.1. I never used 7.0.0 because I have never used any newer Arduino framework version.
In the future, updating the Arduino framework (when they revert to the older SoftwareSerial) should work if ESPHome supports it. That will probably take a long time, so meanwhile we can only pull a different SoftwareSerial to mitigate the issue(s) on ESPHome.
Nope, still irregular restarts after 20, 46 and 28 hours. So nothing changed for me using SoftwareSerial 6.17.1.
I’ll test again when a new Arduino release with a changed SoftwareSerial comes out.
If it still restarts, then it’s likely not related (or not 100% caused by) SoftwareSerial. That’s good to know. Eliminates a big piece.
I tried adding the same libraries lines to mine and I haven’t had reboots on my AG Pro, where I originally found this issue. But now that I’m watching uptimes, I’m also not seeing reboots on my original AirGradient DIY or on my IKEA monitor with an SGP40 and D1. It seems like the whole house starts rebooting or being stable around the same times, so I’m wondering if it is related to the data returned, such as when I hit certain temperatures or TVOC levels or something.
So not certain it solved anything for me at the moment, since my untouched configs are stable right now as well, but I’ll keep monitoring.
I follow this thread with great interest. I had similar problems. Most recently I had an uptime of 66 hours. I posted my esphome file in this thread: Esphome with graphs - #5 by argafal
If you try it I’d be most curious to hear if/how it works for you.
@MallocArray You make an interesting observation here. I follow this thread since I have (had?) a similar issue with my own esphome yaml. My random reboots have also disappeared the last few days. I cannot understand why. I guess I should simply be happy.
On the off chance that some external condition could be a contributing factor, I will document what I observed below:
- In a post on Feb 28 I published an esphome yaml with graphs: Esphome with graphs - #4 by argafal
- I commented that it worked well for me but randomly reboots every 18-24 hours. My only local change to the published file: I ran with mqtt enabled.
- A day or two later, I added an uptime counter as an additional sensor.
- My random reboots have disappeared.
- In my post today Mar 5 (Esphome with graphs - #5 by argafal) I posted the most recent configuration I use. Given that I always had MQTT enabled, it is identical to the Feb 28 one except for the additional uptime counter (and some updated comments).
- I do not use a TVOC sensor, I do not have one.
- I looked at my graphs for temperature, humidity, particle density, and CO2 for the last two weeks. I do not see any obvious change in trend.
@MallocArray Did you always have an uptime sensor in your definition or did you recently add it too, by any chance?
I did not always have an uptime, but my SGP40 for TVOC needs 12 hours to self-calibrate and I couldn’t get that due to the reboots, which is when I started tracking uptime, but I only did it on my AG Pro. After fiddling with it for months, I added uptime sensors to my other two (AG DIY and IKEA) and discovered they were rebooting too, even the IKEA with a different PMS sensor and not an air gradient board at all.
I opened an issue on the ESPHome GitHub and they asked for logs, so I moved my AG Pro to my ESPHome server to collect decrypted logs, and then everything went stable for me. I also had a power outage in my house that rebooted everything, so I wasn’t sure whether I had been having a network issue, but after a few days everything started rebooting again. That is what leads me to think it may be a particular reading, such as the certain ending conditions Ken mentioned that cause the serial library to force a reboot.
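For anyone wanting to watch for reboots the same way, a minimal sketch of an uptime sensor in ESPHome YAML could look like this. The name and update interval are placeholders I picked, not values taken from anyone's config in this thread:

```yaml
sensor:
  # Reports the time since boot; a drop back towards zero means the device restarted.
  - platform: uptime
    name: "Uptime"            # placeholder name
    update_interval: 60s      # placeholder interval
```

Each reboot then shows up as a sawtooth drop in the Home Assistant history graph for that entity.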
My random reboots are back now. It was nice while it lasted.
I’ll be curious to try the new PCB and see if that gives any improvements. With my PCB v3.3 I had to change the I2C frequency depending on the MCU (D1 vs C3), otherwise the display doesn’t work. I’m also still getting occasional error messages from the particle sensor and occasionally from the CO2 sensor. I have already changed the version of SoftwareSerial.
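For reference, the I2C bus speed in ESPHome is set on the i2c component. A minimal sketch, assuming the usual D1 mini I2C pins; the frequency shown is only an example value, not necessarily the one that works for a given MCU or display:

```yaml
i2c:
  sda: D2            # assumed D1 mini SDA pin
  scl: D1            # assumed D1 mini SCL pin
  scan: true         # log the devices found on the bus at boot
  frequency: 100kHz  # example value; adjust per MCU
```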
How is it going for you, @MallocArray ?
About the same. I’m not getting reboots as frequently, but they still happen sometimes. I also see the unmodified ESPHome install with similar stability, so I don’t know what to think.
Same here. Occasional reboots every other day. It would be best to find out where each crash is coming from, but that requires logging everything from serial and possibly removing and re-adding every single sensor to narrow down the possibilities. At this moment it isn’t really convenient for me to do that and I don’t have a second board to test with. I don’t really want to stop logging data for too long.
Another way to look into these issues is to look here GitHub - nkitanov/iaq_board: IAQ Board is a DIY (Do-It-Yourself) device for measuring internal air quality
It looks like most of the hardware is the same and they encountered the sgp31 issues also.
Or just put all the sensors on a breadboard and see if the same issues arise after a while.
If someone has a good test strategy we can try to collectively find the main issue(s)
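As one possible piece of such a test strategy, ESPHome's debug component can publish the reason for the last reset, which might help tell a crash from a watchdog reset or a deliberate restart. A minimal sketch, assuming an ESPHome version that includes the debug text sensor and sensor platforms; the entity names are placeholders:

```yaml
debug:
  update_interval: 5s

text_sensor:
  - platform: debug
    reset_reason:
      name: "Reset Reason"    # placeholder name

sensor:
  - platform: debug
    free:
      name: "Heap Free"       # placeholder name; a steady decline may hint at a memory leak
```

If everyone affected logged the same two entities, the reset reasons could be compared across boards without anyone having to stay attached to serial.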
It’s either an individual, hardware-specific issue (not likely, since there are several of you), or there is an issue with ESPHome. I say that because I’ve been running mostly-stock AirGradient code with OLED, SHT, TVOC, S8, PMS5003 on a modified v3.3 PCB with OLED on 5V and SHT on 3.3V, I2C at 100kHz, Arduino Core 3.1.0, SoftwareSerial v6.12.7, and it has been running non-stop for 17 days, 21 hours, and 9 minutes with zero reboots and zero sensor dropouts/timeouts. Rock solid.
I’d concur that there is a problem with ESPHome. Having reduced the pair of devices I have to just literally the ESPHome WiFi stack and an uptime sensor, they still reboot every few hours.
It’s worth noting that ESPHome has several built-in reboots e.g. WiFi drops for a period, though detecting these is tricky when the presence of debug logging seems to sometimes be enough to cause more instability!
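To rule those built-in reboots in or out, the relevant timeouts can be disabled while testing. A minimal sketch, assuming the native API is in use and the WiFi credentials live in secrets.yaml (the secret names are placeholders):

```yaml
wifi:
  ssid: !secret wifi_ssid          # placeholder secret name
  password: !secret wifi_password  # placeholder secret name
  reboot_timeout: 0s               # 0s disables the reboot when WiFi stays disconnected

api:
  reboot_timeout: 0s               # 0s disables the reboot when no API client connects
```

The mqtt component has a reboot_timeout option as well. These are test settings only: with them disabled, a device that really loses WiFi will not try to recover by rebooting.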
What I notice about @argafal’s approach is that the Home Assistant native API is not in use. I might have to try the MQTT approach. Despite the native one being ‘recommended’, I generally have more faith in MQTT.
Thanks all by the way for chipping in thoughts, this is both frustrating and interesting!
Having trialed some changes for a while, I am fairly confident in saying that this set of ESPHome lines is causing the problem for me. No sensors needed; just enabling UART seems to start the reboot cycles. Remove the uart block and the device is rock solid (albeit doing very little!)
esp8266:
  board: d1_mini
  framework:
    version: recommended
    platform_version: 3.2.0
esphome:
  name: "${devicename}"
  libraries:
    - uart=https://github.com/plerup/espsoftwareserial.git#6.17.1
# Remove this, reboots solved
uart:
  - rx_pin: D5
    tx_pin: D6
    baud_rate: 9600
    id: uart1
  - rx_pin: D4
    tx_pin: D3
    baud_rate: 9600
    id: uart2
So this does seem to point back to some instability around the UART code again, but I was under the impression 6.17.1 did not have the major errors the new versions did. Very odd; I’ll see if I can carve out some time to see what the actual reboot failures are.
The downloaded dependency graph confirms that, unless dependency management in the build is really broken somehow, it has the right libraries:
Dependency Graph
|-- ESPAsyncTCP-esphome @ 1.2.3
|-- EspSoftwareSerial @ 6.17.1+sha.12f8480
|-- ESPAsyncWebServer-esphome @ 2.1.0
| |-- ESPAsyncTCP-esphome @ 1.2.3
| |-- Hash @ 1.0
| |-- ESP8266WiFi @ 1.0
|-- DNSServer @ 1.1.1
|-- ESP8266WiFi @ 1.0
|-- ESP8266mDNS @ 1.2
|-- AsyncMqttClient-esphome @ 0.8.6
| |-- ESPAsyncTCP-esphome @ 1.2.3
|-- ArduinoJson @ 6.18.5