dilectric
Server v3 kills connection

I am about to build a Home Assistant environment and want to integrate car data there, so I recently activated Server v3 to send MQTT over a TLS-secured connection. After a bit of playing around with the "update intervals" settings I left them at the defaults (i.e. left the configuration fields blank), since they don't seem to change anything. This means there is quite a big flood of messages going from OVMS to the MQTT server, as mentioned in another thread here.

Server v2 is also active to keep the OVMS Android app informed.

After about half a day, the OVMS module's connection stops, i.e. no more traffic from servers v2 or v3 and no accessibility via WiFi (OVMS being a client in my home network).

This did not happen before I activated Server v3.

I'm using firmware 3.3.003 on a Renault Twizy.

Any advice on how to proceed here?

I'll be happy to provide further information if needed.

anthonws

I'm having the same issue. As soon as I get v3 activated the whole thing crashes. I've factory reset it too many times already.

To be honest, given how "dead" this forum and community are, I have given up. I appreciate everyone's effort in sharing such a thing with peasants like me, who have neither the time nor the baseline knowledge to even dream of doing such a thing. I have tried to provide feedback, scenarios and info to improve my own and others' experience, but ultimately we get no answers or feedback.

It's one of those things that is awesome if you have the time to debug and eventually build support for your vehicle. But for those who are not electronics and programming geniuses, I would advise finding an alternative (feel free to DM me if you want suggestions).

Best of luck,
anthonws.

Edit: You can see this post where I reported the same behavior along with some data points: https://www.openvehicles.com/comment/9305#comment-9305

markwj

Happy to look at it, if you can provide logs.

Note that in general the mqtt protocol used by server v3 is very data intensive (at least an order of magnitude over the v2 protocol), so I would suggest to only run it when the car is on wifi back in the garage. That would be achieved by simple one line command scripts to start/stop the v3 server depending on wifi status.

dilectric

@markwj Since I am a humble user with regard to OVMS: For the log files I would insert an SD card, make sure logging is enabled and then wait for the crash to happen, right?

As for the high data volume of Server v3: to my understanding MQTT is one of the most lightweight protocols out there. From what I have seen, I would attribute that order of magnitude of traffic it produces over v2 to a very suboptimal implementation of message triggers (including completely dysfunctional config settings for the update intervals). But I may be wrong.

markwj

@markwj Since I am a humble user with regard to OVMS: For the log files I would insert an SD card, make sure logging is enabled and then wait for the crash to happen, right?

Yes, that should be a starting point at least. More detailed logging may be required, for one particular area, but the basic logging should at least give us an indication of the source of the problem.

As for the high data volume of Server v3: to my understanding MQTT is one of the most lightweight protocols out there. From what I have seen, I would attribute that order of magnitude of traffic it produces over v2 to a very suboptimal implementation of message triggers (including completely dysfunctional config settings for the update intervals). But I may be wrong.

Nothing is as lightweight as a protocol specifically crafted for one specific job. I will give you a (partial) example:

Here is how v2 protocol transmits SOC, units, line voltage, and charge current:

S80,M,218,32

There is more passed in that 'S' message, but you get the idea.

Here is how MQTT protocol transmits the same information:

PUB OVMS/MARKJOHNSON/VEHICLEID/metric/v.b.soc = 80
PUB OVMS/MARKJOHNSON/VEHICLEID/metric/v.b.voltage = 218
PUB OVMS/MARKJOHNSON/VEHICLEID/metric/v.c.current = 32

13 bytes for v2 protocol vs 163 for MQTT (just a very rough estimate for indication only).
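
The rough figures above can be checked by counting bytes. The sketch below assumes MQTT 3.1.1 PUBLISH packets at QoS 0 (fixed header, two-byte topic length, topic, payload) and ignores TCP and TLS overhead. The helper name `publish_packet_size` is made up for illustration, and the totals depend on QoS and on the real topic names, so they will not match the rough estimate exactly.

```python
def publish_packet_size(topic: str, payload: str, qos: int = 0) -> int:
    """Approximate size in bytes of one MQTT 3.1.1 PUBLISH packet."""
    topic_bytes = topic.encode("utf-8")
    payload_bytes = payload.encode("utf-8")
    # Variable header: 2-byte topic length + topic, plus a 2-byte packet id for QoS 1/2.
    remaining = 2 + len(topic_bytes) + (2 if qos > 0 else 0) + len(payload_bytes)
    # Fixed header: 1 type/flags byte + a variable-length "remaining length" field
    # (7 bits of length per byte).
    length_field = 1
    n = remaining
    while n > 127:
        n //= 128
        length_field += 1
    return 1 + length_field + remaining


# The v2 line and the three MQTT publishes quoted in the post.
v2_message = "S80,M,218,32"
mqtt_messages = [
    ("OVMS/MARKJOHNSON/VEHICLEID/metric/v.b.soc", "80"),
    ("OVMS/MARKJOHNSON/VEHICLEID/metric/v.b.voltage", "218"),
    ("OVMS/MARKJOHNSON/VEHICLEID/metric/v.c.current", "32"),
]

v2_size = len(v2_message.encode("utf-8"))
mqtt_size = sum(publish_packet_size(t, p) for t, p in mqtt_messages)
print(f"v2: {v2_size} bytes, MQTT: {mqtt_size} bytes, ratio: {mqtt_size / v2_size:.1f}x")
```

Either way the ratio comes out at roughly an order of magnitude, and almost all of it is topic text.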

You can see the issue is that the topic takes up a lot of space, and the OVMS v2 protocol is optimised to have single character topics sending a large amount of information.

Now MQTT v5 will improve on this, as it allows us to register aliases with the topic's first use, and thereafter provide just a short 2-byte integer to refer to the topic rather than the full topic name. That will have a dramatic impact on our use case. But support for that is still some way off (the library we use doesn't support v5).

There is a very good presentation on the feature here: https://www.emqx.com/en/blog/mqtt5-topic-alias
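
To get a feel for the saving, here is a rough size model of an MQTT v5 PUBLISH at QoS 0 that carries only a Topic Alias property (one identifier byte plus a two-byte alias). The function name `v5_publish_size` is hypothetical, and the model only covers small packets; it is a back-of-the-envelope sketch, not anything OVMS implements today.

```python
TOPIC_ALIAS_PROPERTY = 3  # 1-byte property identifier (0x23) + 2-byte alias value


def v5_publish_size(topic: str, payload: str, alias_established: bool) -> int:
    """Approximate size of a small MQTT v5 QoS 0 PUBLISH using a topic alias."""
    # Once the alias is registered, the topic name is sent as an empty string.
    topic_len = 0 if alias_established else len(topic.encode("utf-8"))
    # 2-byte topic length + topic + 1-byte property length + alias property + payload
    remaining = 2 + topic_len + 1 + TOPIC_ALIAS_PROPERTY + len(payload.encode("utf-8"))
    if remaining > 127:
        raise ValueError("sketch only models a 1-byte remaining-length field")
    # Fixed header: 1 type/flags byte + 1 remaining-length byte.
    return 2 + remaining


topic = "OVMS/MARKJOHNSON/VEHICLEID/metric/v.b.soc"
first_send = v5_publish_size(topic, "80", alias_established=False)
later_send = v5_publish_size(topic, "80", alias_established=True)
print(f"first publish: {first_send} bytes, later publishes: {later_send} bytes")
```

Under these assumptions, every publish after the first shrinks to about the size of the v2 line, which is why topic aliases matter so much for this use case.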

dilectric

OK, I'll give you the factor of 10 for the highly condensed proprietary format.

I'm still wondering why Mosquitto kept transferring exactly identical MQTT messages from OVMS over and over again. But maybe it's just something I don't understand here.

Anyway, I activated logging in its default configuration (info level) and am now awaiting the crash ;-)

markwj

I'm still wondering why Mosquitto kept transferring exactly identical MQTT messages from OVMS over and over again. But maybe it's just something I don't understand here.

The logic is:

  1. When we connect to the mqtt server, we send all metrics (as we don't know what the server will have missed since we last connected).
  2. There is a config setting (server.v3, updatetime.sendall) which specifies the number of seconds between transmissions of all metrics, but that defaults to zero (i.e. don't send all).
  3. Otherwise, it should follow the other update time config settings to delay transmitting modified metrics.

I think if you are seeing a lot of unmodified metrics being re-transmitted to the server, it is most likely because the server connection is being lost and we are reconnecting.

The default values for configuration are:

  • streaming: 0
  • updatetime.connected: 60 seconds (modified metrics)
  • updatetime.idle: 600 seconds (modified metrics)
  • updatetime.awake: 600 seconds (modified metrics)
  • updatetime.on: 600 seconds (modified metrics)
  • updatetime.charging: 600 seconds (modified metrics)
  • updatetime.sendall: 0 (disabled, but if enabled will transmit all metrics)

With the exception of updatetime.sendall, the purpose of these update time settings is to reduce the number of metric transmissions, as some metrics are updated extremely frequently (like speed when you are driving).
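
The logic described above can be modelled in a few lines. This is a simplified sketch of the behaviour as explained in this post, not the firmware's code; the class `MetricThrottle` and its method names are invented for illustration, and a single interval stands in for the per-state update times.

```python
class MetricThrottle:
    """Toy model: send everything on connect, then only modified metrics per interval."""

    def __init__(self, interval_s: float, sendall_s: float = 0):
        self.interval_s = interval_s  # e.g. 60 when connected, 600 otherwise
        self.sendall_s = sendall_s    # 0 disables the periodic "send all"
        self.values = {}
        self.modified = set()
        self.last_send = 0.0
        self.last_sendall = 0.0

    def set(self, name, value):
        # Only a real change marks the metric as modified.
        if self.values.get(name) != value:
            self.values[name] = value
            self.modified.add(name)

    def on_connect(self, now):
        # The server may have missed anything, so transmit all metrics.
        self.modified.clear()
        self.last_send = self.last_sendall = now
        return dict(self.values)

    def tick(self, now):
        # Returns the metrics to transmit at time `now` (possibly none).
        if self.sendall_s > 0 and now - self.last_sendall >= self.sendall_s:
            return self.on_connect(now)
        if self.modified and now - self.last_send >= self.interval_s:
            batch = {n: self.values[n] for n in sorted(self.modified)}
            self.modified.clear()
            self.last_send = now
            return batch
        return {}


throttle = MetricThrottle(interval_s=60)
throttle.set("v.b.soc", 80)
throttle.set("v.b.voltage", 218)
on_connect = throttle.on_connect(now=0)  # both metrics
throttle.set("v.b.soc", 79)
too_early = throttle.tick(now=30)        # nothing yet, interval not reached
due = throttle.tick(now=60)              # only the modified metric
quiet = throttle.tick(now=120)           # nothing changed since
```

In this model, calling `on_connect` repeatedly is the only way unmodified metrics get re-sent, which matches the suspicion that repeated identical messages point to reconnects.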

dilectric

According to the OVMS Android app, the crash happened around 1:45 CEST tonight, i.e. around 23:45 GMT. Around that time, the cellular modem was apparently powered off (OVMS definitely was connected via my home WLAN):

2023-03-29 23:42:27.071 GMT I (8251) command: OpenLogfile: now logging to file '/sd/checkv3_info_level.log'
2023-03-29 23:42:41.051 GMT I (22231) cellular: State: Enter CheckPowerOff state
2023-03-29 23:42:56.051 GMT I (37231) cellular: State: Enter PoweredOff state
2023-03-29 23:44:20.051 GMT I (121231) housekeeping: System considered stable (RAM: 8b=113788-138712 32b=12344 SPI=3996964-3997648)
2023-03-29 23:47:20.061 GMT I (301241) housekeeping: 2023-03-29 23:47:19 GMT (RAM: 8b=113788-138656 32b=12344 SPI=3996964-3997680)

As described before, no connectivity was active after that time. The OVMS Android app showed no data after that point and showed "6 hours" where you would usually expect the "live" information. Again, browser access was of course not possible either, so I rebooted OVMS the hard way.

One thing that might cause the problem is that OVMS gets assigned a 10.xx.xx.xx/32 address by the cellular provider, and my home network is 10.0.0.0/8. I wonder if I should fire up a different subnet to test if the issue continues with a different home subnet.
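
Whether two address ranges collide can be checked with Python's standard `ipaddress` module. The cellular address below is a made-up placeholder (the real one is not given in this thread), and the alternative subnet is just an example of a range outside 10.0.0.0/8.

```python
import ipaddress

home_lan = ipaddress.ip_network("10.0.0.0/8")

# Hypothetical placeholder for the provider-assigned 10.xx.xx.xx/32 address.
cellular = ipaddress.ip_interface("10.1.2.3/32")

# Any 10.x.x.x address falls inside a 10.0.0.0/8 home network.
overlaps_home = cellular.ip in home_lan

# An example home subnet outside 10.0.0.0/8 that would avoid the overlap.
alternative_lan = ipaddress.ip_network("192.168.50.0/24")
overlaps_alternative = cellular.ip in alternative_lan

print(f"overlaps 10.0.0.0/8: {overlaps_home}, overlaps alternative: {overlaps_alternative}")
```

This only shows that the ranges overlap; whether the overlap actually causes the connection loss is still an open question.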

Complete Log: https://drive.google.com/file/d/1aF8zKJxpklVzWrqiW0PEIq3qBfxI4YtW/view?usp=sharing

markwj

There seems to be way too much being sent via mqtt, but as your module is not using api.openvehicles.com server, I can't check.

Can you open a support ticket here, to provide the following information:

  1. Output of 'module summary'
  2. Output of 'boot status', after crash
  3. V2 server logs around crash (can obtain from the v2 server, or dexter if you are using that server)

Without your module on the server it will be difficult to help, but I am willing to try.

werdnum

Now MQTT v5 will improve on this, as it allows us to register aliases with the topic's first use, and thereafter provide just a short 2-byte integer to refer to the topic rather than the full topic name. That will have a dramatic impact on our use case. But support for that is still some way off (the library we use doesn't support v5).

It looks like you use the Mongoose library. It appears that this library added support for MQTTv5 in version 7.8 released last August.

markwj

It looks like you use the Mongoose library. It appears that this library added support for MQTTv5 in version 7.8 released last August.

Yes, they added support for connecting to v5 servers, but no support for the extra functionality (such as the topic aliases we need).

werdnum

It looks like you use the Mongoose library. It appears that this library added support for MQTTv5 in version 7.8 released last August.

Yes, they added support for connecting to v5 servers, but no support for the extra functionality (such as the topic aliases we need).

OVMS also seems to use a 5-year-old fork of that codebase, so I guess there's plenty of work to do even if they had topic alias support.

jetpax

Out of interest, why do you use Mongoose and not the esp-idf built-in MQTT client, which does support v5 topic aliases, I believe?

markwj

The reason we use mongoose is that it is async so more than one connection can be on one task. The connections can even be completely independent.

Using the native APIs (like MQTT) typically requires blocking calls, which means one task per connection. Each task requires a sizeable amount of on-chip RAM (which is at the moment our most limiting factor).
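
As a loose analogy only (Mongoose is a C library and the firmware runs FreeRTOS tasks, not Python), the following asyncio sketch shows several independent "connections" being serviced by a single thread. The connection names and delays are invented for illustration.

```python
import asyncio
import threading


async def fake_connection(name: str, delay: float, log: list) -> None:
    # Stand-in for network I/O: awaiting yields control instead of blocking the thread.
    await asyncio.sleep(delay)
    log.append((name, threading.get_ident()))


async def main() -> list:
    log = []
    # Three independent connections multiplexed on one event loop.
    await asyncio.gather(
        fake_connection("v2", 0.02, log),
        fake_connection("v3", 0.01, log),
        fake_connection("web", 0.03, log),
    )
    return log


log = asyncio.run(main())
thread_ids = {tid for _, tid in log}
print(f"{len(log)} connections handled on {len(thread_ids)} thread(s)")
```

With blocking calls, each of these would need its own thread (or task) and its own stack, which is the RAM cost described above.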

werdnum

FWIW, I've run the v3 server pretty much exclusively since I set up OVMS because I have an abiding mistrust of 3rd party servers that aren't maintained by big well-resourced companies.

It's generally worked totally fine for me, and the data usage for the last ~6 months has been 55 MiB: about 3.5 MiB on days when I have done long drives and had ABRP reporting enabled, and 600 KiB on days when I've not driven much but otherwise spent the whole day away from home. I generally find this acceptable. I try to mitigate it by having my phone auto-hotspot whenever it's plugged into the car, though I haven't taken the time to verify that it's working properly.

I do generally find that OVMS isn't as well maintained as I'd like. But in the absence of other good options for my car (Kia Niro EV, outside the US where KiaConnect is available), it's basically my only option.

markwj

Yeah, that pretty much matches what I would expect. V2 protocol is about 60 KB/day.

anthonws

Just reporting that this continues to happen consistently.
Server3-Start-Lost-Connection

Edit:

Module summary: https://pastebin.com/zehzhEP9

OVMS# boot status
Last boot was 662 second(s) ago
Time at boot: 2023-04-01 21:45:31 WEST
  This is reset #5 since last power cycle
  Detected boot reason: SoftReset (12/12)
  Reset reason: esp_restart (3)
  Crash counters: 0 total, 0 early

markwj

A few things noticed:

  1. You have server.v3 disabled in the auto section of the config
  2. You have updatetime.connected configured in server.v3, but blank. I suggest you remove that configuration setting if not required.
  3. You have updatetime.idle configured in server.v3, but blank. I suggest you remove that configuration setting if not required.

anthonws

Thanks for the feedback.

1. It was not configured because I simply was testing whether it would work or not manually (starting it manually)

2. and 3. This is after a factory reset. I did not configure anything outside of the server settings. I can repro again if needed.

But how would those settings cause a connectivity reset?

markwj

But how would those settings cause a connectivity reset?

My concern is the two timer settings. Not sure of the behaviour if those are blank (i.e. defined but empty, as opposed to undefined).

The issue here is how to repeat the problem. Once repeated, it can be solved. My best guess is something to do with the network stack being overloaded (memory exhaustion?) due to so many metrics being sent out over cellular data. Wifi should be able to cope, but SSL/TLS does add quite some overhead.

anthonws

Understood.

Please note that LTE was not active since the antennas were not in place. WiFi was the connection method and the router is 40-50 cm away.

I will factory reset again this weekend and will report back.

But according to your info, it seems that it is not recommended to go to very low values on those timers and that eventually the system itself should not allow zero as a value.

Thanks again. Will report back asap.

Edit: and that also blank values should not be allowed.

dexter
blank values should not be allowed

If you use the web UI to configure the intervals, it will not allow invalid intervals.

If you set the config directly, e.g. by some script/shell command, no validation is done.

A blank (i.e. empty) config instance means applying the defaults.
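
In other words, a blank entry and a missing entry should behave the same. A minimal sketch of that rule, using a hypothetical helper `effective_interval` (the firmware's real parsing code is not shown in this thread and may differ, for example in how it treats non-numeric input):

```python
from typing import Optional


def effective_interval(raw: Optional[str], default: int) -> int:
    """Return the interval in seconds, falling back to the default when unset or blank."""
    if raw is None or raw.strip() == "":
        return default
    return int(raw)


# Defaults as listed earlier in the thread: 60 s when connected, 600 s when idle.
connected = effective_interval("", default=60)   # blank -> default
idle = effective_interval(None, default=600)     # undefined -> default
custom = effective_interval("120", default=600)  # explicit value wins
```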

anthonws

I have never set MQTT values from the shell, only via the UI. So something happened in between :/

FYI, I've reset to factory default both partitions and updated both to the latest firmware and could not repro.

I will report back if I find anything relevant.

Thank you for the feedback!
