Airgradient rebooting itself?

Is SoftwareSerial 8.0.1 relevant to this topic, or do we still need to wait?
I’m replying to your message since it seems you are very familiar with this topic :slight_smile:

thanks
bye
Marco

Yes! SoftwareSerial 8.0.1 was released a couple of weeks ago, and esp8266 3.1.2, which includes the new version, was released yesterday, so I guess it’s time for me (or someone else) to test. I can start that tonight when I get home.

Looking forward to reading your comments about that.
In case I also want to give it a try, would the following be the correct YAML config?

esphome:
  name: "${devicename}"
  libraries:
    - uart=https://github.com/plerup/espsoftwareserial.git#8.0.1

esp8266:
  board: d1_mini
  framework:
    version: latest
    platform_version: 3.1.2

Can you confirm, or correct me if needed, please?
thanks
bye
Marco

I don’t think that platform_version will be correct. According to the documentation:
ESP8266 Platform — ESPHome

So those are not the same release numbers: platform_version refers to the platform-espressif8266 project, whose most recent release is 4.1.0, and that only references Arduino core 3.1.1.

The version: option under framework: does map to the Arduino release for esp8266, and you could try setting it to 3.1.2, but since the platform will still reference the old one, I’m not sure which one will take precedence.

I would try without the libraries section so you aren’t also trying to push a separate software serial and then set the esp8266 section to look like:

esp8266:
  board: d1_mini
  framework:
    version: 3.1.2

Although this will still use an older platform-espressif8266 version.
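For illustration, both version numbers live under the same framework: key. This is an untested sketch, and pairing platform 4.1.0 with Arduino core 3.1.2 is only an assumption based on the release numbers mentioned above, so it may well not build:

```yaml
esp8266:
  board: d1_mini
  framework:
    # Arduino core release (the esp8266 Arduino project numbering)
    version: 3.1.2
    # platform-espressif8266 release (a separate numbering; 4.1.0 is assumed here)
    platform_version: 4.1.0
```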

Go with @MallocArray’s advice because I have no experience with ESPHome. But if you are somehow already getting the recent 3.1.2 release of the Arduino Core library, then the new SoftwareSerial should already be included.

I based my configuration attempt on a post in another thread of this forum, also related to the SoftwareSerial issue with esp8266.
I’m referring to https://forum.airgradient.com/t/airgradient-pro-with-esphome-display-keeps-turning-back-on/641/31?u=marco

I don’t have experience with this kind of configuration, so that’s why I was asking :slight_smile:

This is what I get with my suggestion:

INFO Reading configuration /config/ag-pro.yaml...
WARNING The selected Arduino framework version is not the recommended one. If there are connectivity or build issues please remove the manual version.
WARNING The selected Arduino framework version is not the recommended one. If there are connectivity or build issues please remove the manual version.
INFO Generating C++ source...
INFO Compiling app...
Processing ag-pro (board: d1_mini; framework: arduino; platform: platformio/espressif8266 @ 3.2.0)
--------------------------------------------------------------------------------
Tool Manager: Installing platformio/framework-arduinoespressif8266 @ ~3.30102.0
INFO Installing platformio/framework-arduinoespressif8266 @ ~3.30102.0
Downloading  [####################################]  100%   

So it is using the recently released Arduino framework, but still an older version of platform-espressif8266, which has not been updated to match yet, so I’m not sure what the outcome will be.

Edit: it failed to compile for me

src/esphome/components/wifi/wifi_component_esp8266.cpp: In member function 'bool esphome::wifi::WiFiComponent::wifi_ap_ip_config_(esphome::optional<esphome::wifi::ManualIP>)':
src/esphome/components/wifi/wifi_component_esp8266.cpp:697:3: error: 'dhcpSoftAP' was not declared in this scope
  697 |   dhcpSoftAP.begin(&info);
      |   ^~~~~~~~~~
*** [.pioenvs/ag-pro/src/esphome/components/wifi/wifi_component_esp8266.cpp.o] Error 1
========================= [FAILED] Took 115.19 seconds =========================

@MallocArray I can reproduce the same issue.

I made a workaround to get Arduino 3.1.x to compile on ESPHome, just to test whether this helps with the crashes. I’m not responsible for any issues on your devices, so use at your own risk. To work around the softAP changes, I removed some bits of the wifi module which I don’t use.

esphome:
  name: "${devicename}"
  libraries:
    - uart=https://github.com/plerup/espsoftwareserial.git#8.0.1

external_components:
  - source: github://eavyon/esphome@dev
    components: [ wifi ]
    refresh: 0s

esp8266:
  board: d1_mini
  framework:
    version: 3.1.2

Good news. Last night I re-compiled my custom AirGradient (non-ESPHome) FW using the new ESP8266 Arduino library v3.1.2, which includes SoftwareSerial 8.0.1. It’s been running with no exception crashes or any other issues for more than 15 hours now, so I’m reasonably confident the issue is resolved.

Clocking in at 66 hours, one of my highest uptimes so far. Let’s continue with this version for now.

I’m not sure if ESPHome will make 3.1.2 the recommended version soon, so maybe some testing could be done with only the new espsoftwareserial 8.0.1 on 3.0.2. If this also works, then maybe we should request it for the ESPHome pmsx003 code.

I’m sad to report it just crashed after 71 hours.

Do you know what kind of crash? Exception 0? Mine is at 2D,16h,49m so almost 65 hours.

How nice to read all your efforts.

We are also mixing up a few things at the same time, e.g. we discuss AirGradient’s own software but also ESPHome in the same place. We are also mixing different versions/configurations. So we have to be careful not to draw the wrong conclusions in the end. :slight_smile:

Personally, I use esphome with graphs and mqtt. I have posted my configuration here: Esphome with graphs - #5 by argafal

I have two problems:

  • Wifi re-connection frequently fails. I have found reports of other ESPHome projects with similar symptoms, where the cause was i2c timing. This may or may not be the reason for my particular issue; it’s hard to narrow it down. I currently have to modify the i2c settings for my AirGradient board (v3.3) to work, depending on which MCU I use (D1 or C3). I will be curious to see how the v4 prototype performs.

  • In addition, I have random reboots with an Out of Memory exception. I believe these are caused by heap fragmentation; the values are around 30-40%. I have added the following debug sensors to my ESPHome YAML to understand the problem better:

sensor:
  - platform: debug
    free:
      name: "Heap Free"
    fragmentation:
      name: "Heap Fragmentation"
    block:
      name: "Heap Max Block"
    loop_time:
      name: "Loop Time"
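For anyone copying this: as far as I know, these sensors only report values when the top-level debug component is enabled as well. A minimal sketch, where the 5s update interval is an assumption for illustration and not taken from my actual config:

```yaml
# Enables the debug component that the debug sensors read from
debug:
  update_interval: 5s

# The debug component logs at debug level, which is the default
logger:
  level: DEBUG
```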

@Hendrik Would you be able to catch the exact exception you are getting? Is it the same for both of us? And might it be worth starting to record heap (fragmentation) values, too?

I’m aware that ESPHome and Arduino can behave differently, but they do use the same libraries, which is most probably where the error occurs.

@ken830 @argafal
I will try to connect the D1 mini to a PC to catch any errors. I’ve had the debug sensors enabled for a long time, but in the end I couldn’t find any correlation with my crashes. The only thing that definitely caused immediate crashes was having multiple graphs drawn, because of memory limitations. I therefore removed all graphs.

@argafal
It seems like the i2c frequency has something to do with the wifi (re)connection issues. Mine didn’t even connect when i2c was at 50kHz or below. I have the feeling that too low a frequency causes too much wait time, so the wifi process times out, especially when more sensors are on the bus and the cumulative wait time builds up. A higher speed (100kHz) sorted that out for me.
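A minimal sketch of that setting, assuming the usual D1 mini i2c pins (SDA on D2, SCL on D1); the pins are an assumption, so adjust them if your wiring differs:

```yaml
i2c:
  sda: D2
  scl: D1
  # 100kHz sorted out the reconnection problem for me; at 50kHz or below it did not connect
  frequency: 100kHz
```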

I just had another reset and caught only the reset cause.
ets Jan 8 2013,rst cause:4, boot mode:(3,6)

This is an interesting one because it’s a hardware reset and I have no stack trace. It could be a one-off. I still have memory pressure, with the max free heap block sometimes as low as 200 bytes while 3k of heap is free. Does anyone else see that too? EDIT: I removed the web server, which ‘only’ doubled the free space, but the max block size is now 4k. That makes it far easier for ESPHome to do things like generating JSON for the API.

I also found out that with an ESP connected directly by serial to the ESPHome Docker container and the log view open, it decodes the stack trace. So I’ll keep it connected to catch any new crashes.


Definitely either ESPHome-caused crash or your specific hardware (power?) because with the new libraries, mine has been running non-stop for 5 Days, 2 Hours, and 59 minutes.


At this moment I’m at a loss. I can only get hardware resets as described before, so there is no stack trace to debug.

Common causes for this seem to be a bad power supply, the wrong wifi library being called, or wrong pin numbers in the config. The power has changed from a USB adapter to a USB port on a computer, but the latter two are more or less defined by ESPHome. And the power rail already has added capacitors for stability. So there is not much more I could do.

Is anyone else with ESPHome also getting rst cause:4 (hardware reset) with the latest versions?

With the following configuration:

esphome:
  name: "${devicename}"
  libraries:
    - uart=https://github.com/plerup/espsoftwareserial.git#8.0.1

esp8266:
  board: d1_mini

text_sensor:
  - platform: debug
    device:
      name: "Device Info"
    reset_reason:
      name: "Reset Reason"

sensor:
  - platform: debug
    free:
      name: "Heap Free"
    fragmentation:
      name: "Heap Fragmentation"
    block:
      name: "Heap Max Block"
    loop_time:
      name: "Loop Time"

I got an unexpected reset this morning with reason “Hardware Watchdog”:

Device Info changed to 2023.3.2|Flash: 4096kB Speed:40MHz Mode:DOUT|Chip: 0x008a6cec|SDK: 2.2.2-dev(38a443e)|Core: 3.0.2|Boot: 31|Mode: 1|CPU: 80|Flash: 0x0016405e|Reset: Hardware Watchdog|Fatal exception:4 flag:1 (Hardware Watchdog) epc1:0x40103b35 epc2:0x00000000 epc3:0x00000

At that time, there is also a big spike in:
heap free
heap max block
loop time

Hope it helps.

If I had more time, I would look into ESPHome… but for now, I did a quick look through the documentation and according to Espressif, during power-on, the ROM will print out a reset cause. Reset cause 4 is the watchdog timer. Both of you are probably seeing the same basic reset cause.

[Image: table of reset causes printed by the ROM at power-on]

If the user program (ESPHome FW in this case) supports this, it can also get reset cause information:

[Image: table of reset cause information available to the user program]

These tables were pulled from: https://www.espressif.com/sites/default/files/documentation/esp8266_reset_causes_and_common_fatal_exception_causes_en.pdf

Unfortunately, a watchdog timer expiration doesn’t tell you what went wrong, but it does mean the SoC was busy doing something that took so long, it didn’t have the chance to reset the WDT – an infinite loop, for example.

I believe the Arduino core library has support for a software watchdog, which can be set to expire earlier than the hardware watchdog. In the case a software watchdog expires, you will get a stack dump on the terminal that you can put in the decoder to pinpoint exactly which part of the code is stuck.