This is the first time this has happened. I’ve been running some data analysis scripts for a while without any problems, but for the past few days I’ve been trying to fetch messages from about a month ago and I randomly get an HTTP 500 error from the API.
The JSON response I get is:
{
"success": false,
"error": "Error looking up messages. Try again later"
}
Fetching data from the past day or so works without any problem, but if I ask for older data it returns a 500 error … most of the time.
Here is the query I send: https://dashboard.hologram.io/api/1/csr/rdm?topicnames=_TAG_5107_&timestart=1563933600&timeend=1564020000&limit=3&orgid=<ORGID>&apikey=<APIKEY>
There are ~30 devices reporting once a minute and I’m requesting 24 hours’ worth of messages, which should be around 43,000 messages.
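For reference, that estimate can be checked with a few lines of Python. This is only a rough sketch: the device count is the approximate figure above, and the page size of 1000 is an assumption about how the results would be paged.

```python
import math

DEVICES = 30             # approximate device count from the post
REPORTS_PER_MINUTE = 1   # each device reports once a minute
TIMESTART = 1563933600   # timestart from the sample query
TIMEEND = 1564020000     # timeend from the sample query


def expected_messages(devices, per_minute, window_seconds):
    """Estimate how many messages a time window should contain."""
    return devices * per_minute * (window_seconds // 60)


def pages_needed(total, limit):
    """Number of requests needed if each request returns at most `limit` messages."""
    return math.ceil(total / limit)


total = expected_messages(DEVICES, REPORTS_PER_MINUTE, TIMEEND - TIMESTART)
print(total)                      # 43200
print(pages_needed(total, 1000))  # 44 (assumes a page size of 1000)
```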
So … is there a limit on how far back in time I can request messages from a device? Or is it a performance issue, where looking farther back in time takes too long and the request fails?
[EDIT]
I’ve now tried many combinations and fetching data between timestart-timeend works for:
timestart = 1564015000
timeend = 1564026400, up to 1564099999
But it doesn’t work for:
timestart = 1564020000
timeend = 1564100000
So it doesn’t seem to be an issue with fetching old messages; it seems to be an issue with fetching MANY old messages.
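Since the shorter ranges succeed, one possible workaround is to split a long range into smaller windows and request each one separately. The sketch below only computes the windows; `split_range` and the 6-hour window size are my own illustrative choices, and I have not confirmed whether the API treats timeend as inclusive, so messages on a window boundary may need de-duplicating.

```python
def split_range(timestart, timeend, window_seconds):
    """Yield (start, end) Unix-timestamp pairs covering timestart..timeend.

    Hypothetical helper: each pair can be used as the timestart/timeend
    of one smaller request instead of a single large one.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    start = timestart
    while start < timeend:
        end = min(start + window_seconds, timeend)
        yield start, end
        start = end


# The range that failed above, cut into 6-hour windows (arbitrary size).
windows = list(split_range(1564020000, 1564100000, 6 * 3600))
for start, end in windows:
    print(start, end)
```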
Thoughts?
We do store over a year’s worth of messages right now but there can be some performance issues with pulling older ones depending on how they are distributed in the database. It’s a known issue that we’re looking into.
One thing that might actually help is pulling MORE messages at a time with a higher limit parameter (try 1000, for example), due to a quirk of the query optimizer in the database engine that we’re using.
Thanks for the info. The sample query had a limit of 3 because I was just testing for quick responses, but the one I’m actually trying to use has a limit of 1000 (which is the maximum the limit parameter allows).
Yes, it seems to be a DB performance issue. For my queries, it looks like messages from before the 25th of July are stored somewhere that takes a little too long to fetch together with messages from after the 26th, and maybe that causes the queries to fail. I’ve now tried several other combinations and it only fails “reliably” if I put a start time before the 25th and an end time after the 26th (UTC). I can work around that, so it’s not critical, but it did make me pull my hair out for a few hours while trying to re-create some analysis.
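To see which UTC dates the Unix timestamps from my earlier edit fall on, a small conversion like this is enough. It uses only the Python standard library; `to_utc` is just a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_utc(ts):
    """Format a Unix timestamp (in seconds) as a UTC date-time string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# Timestamps taken from the working and failing queries in the edit above.
for label, ts in [
    ("works, timestart", 1564015000),
    ("works, timeend  ", 1564099999),
    ("fails, timestart", 1564020000),
    ("fails, timeend  ", 1564100000),
]:
    print(label, ts, to_utc(ts))

# works, timestart 1564015000 2019-07-25 00:36:40
# works, timeend   1564099999 2019-07-26 00:13:19
# fails, timestart 1564020000 2019-07-25 02:00:00
# fails, timeend   1564100000 2019-07-26 00:13:20
```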
Yes, whatever patch was applied helped with the query.
However, before you pushed that, I modified the queries to also filter by “deviceid” or “deviceids”. That seems to help: I could run the same query successfully by going device by device.
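Roughly, the device-by-device approach looks like the sketch below. It only builds the request URLs, using the endpoint and parameters from my original query plus “deviceid”. The device IDs, `build_query`, and the ORGID/APIKEY placeholders are made up for illustration, and sending the requests is left out.

```python
from urllib.parse import urlencode

# Endpoint taken from the query in the original post.
BASE_URL = "https://dashboard.hologram.io/api/1/csr/rdm"


def build_query(deviceid, timestart, timeend, orgid, apikey,
                limit=1000, topicnames="_TAG_5107_"):
    """Build one query URL restricted to a single device (hypothetical helper)."""
    params = {
        "topicnames": topicnames,
        "deviceid": deviceid,
        "timestart": timestart,
        "timeend": timeend,
        "limit": limit,
        "orgid": orgid,
        "apikey": apikey,
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Placeholder device IDs; real ones would come from your own device list.
device_ids = [11111, 22222, 33333]
urls = [
    build_query(d, 1564020000, 1564100000, orgid="ORGID", apikey="APIKEY")
    for d in device_ids
]
for url in urls:
    print(url)
```

Each URL can then be fetched with whatever HTTP client the analysis scripts already use, one device at a time.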
I’ll keep an eye on these “historical” queries and maybe have some scheduled “backup” to one of my servers for old messages.