Breaking changes???

alano · May 21, 2019, 8:02pm

I’ve been sending data to my phone via an SMS route from an Arduino MKR GSM 1400 using the following commands for months:

client.print("{\"k\":\"" + key + "\",\"d\":\"");
client.print(msg); 
client.println("\",\"t\":\"" + HOLOGRAM_TOPIC + "\"}");

And now I’m just getting gibberish. Did you guys change anything?
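
For reference, the three print calls above appear to emit one JSON object with the keys `k`, `d`, and `t`. Below is a minimal host-side sketch of that assembly in standard C++, useful for logging the exact payload and comparing it with what arrives. It assumes `std::string` in place of the Arduino `String` type; `buildPayload` is a hypothetical helper name, and it leaves out the line ending that `println` appends.

```cpp
#include <string>

// Hypothetical helper: concatenates the same pieces the three
// client.print()/println() calls send, minus the trailing line ending.
// No escaping is performed, so msg must not contain quotes or backslashes.
std::string buildPayload(const std::string& key,
                         const std::string& msg,
                         const std::string& topic) {
    return "{\"k\":\"" + key + "\",\"d\":\"" + msg +
           "\",\"t\":\"" + topic + "\"}";
}
```

With placeholder values, `buildPayload("abcd1234", "hello", "SMS_TOPIC")` yields `{"k":"abcd1234","d":"hello","t":"SMS_TOPIC"}`.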

Reuben · May 21, 2019, 8:32pm

Is it still broken? There was a deploy that caused a problem, but we already put out a fix.

alano · May 21, 2019, 8:41pm

Still broken!

Reuben · May 21, 2019, 8:42pm

Ok investigating now

Reuben · May 21, 2019, 9:35pm

Give it a try now

alano · May 21, 2019, 9:43pm

Yes, it’s working now.

Don’t you guys perform regression testing before rolling out changes?

I know you guys are skilled developers, but come on, changes should be verified before going live.

Reuben · May 21, 2019, 9:48pm

Yes, sorry for the trouble. We do have automated regression testing but we had a hole in our coverage of this case. We’ve added a test for this so it shouldn’t have an issue going forward.

Titus · May 21, 2019, 10:03pm

Same here, I’ve been receiving strange text via SMS for the last 4-5 hours.

Reuben · May 21, 2019, 10:23pm

Should be good now. If you’re still having trouble, let me know.

alano · May 22, 2019, 7:56pm

Reuben,

It’s great that you were able to fix it, but having a hole in your coverage points to a much larger issue that perhaps you guys aren’t addressing? Why was there a hole in the first place? Do you guys perform code reviews? Usually an event like this triggers a CAPA, whereby your company needs to look at its processes, determine the root cause, and perhaps make changes to avoid this happening in the future. I’m not sure you guys have critically examined why there was a hole in the first place and what you can do to avoid something like this happening again. Any thoughts beyond simply putting out this fire?

HologramPat · May 22, 2019, 9:42pm

Hi,

Thank you for being our customer and for being active on this forum. We take testing extremely seriously, and reliability is of utmost importance to our engineering culture and, of course, to our customers. We test extensively, both at the unit level and by staging changes to see how they will behave in a production environment. As a result, we ship changes across our offering on an almost daily basis, and the vast majority have no noticeable effect.

In this incident, we encountered an unanticipated edge case in production that testing did not adequately prevent. This incident was immediately escalated to the highest levels of our org, and a fix was built, underwent testing, and was deployed. I, personally, reviewed the root cause and fix, and our testing processes will be amended to ensure prevention in the future.

Again, apologies for the inconvenience and frustration this caused. We have implemented the fixes and process changes necessary to address the reason this incident occurred.

Best,
PFW

system · June 21, 2019, 9:59pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.