Movebubble Sprint: Day 10

Standup starts late, as Laura was at a ‘women in tech’ breakfast event at the Google campus. Jack is having problems testing the router bug, because new viewings are generating new chats (the old behaviour) rather than being routed to an existing chat with the same agent, and he needs the multiple viewings behaviour in order to reproduce the bug he is trying to fix. It seems this is something that I broke with all of my ‘fixes’ the previous evening, but I don’t have time to look at it immediately because I have a planning meeting with Will and Amy to work out what we’re doing next week, so I just roll back the dev environment temporarily.

Planning goes pretty quickly – Tom is on holiday already, I will be on holiday from Monday and Valerio is off for a week from Wednesday, so we discuss the kind of things that it makes sense for us to work on over the Easter break. From a Water Camels perspective, there are a number of small improvements to the suggested properties and chat system messages which we couldn’t get done this sprint and it would make sense for Jordi and Laura to get on with those; Amy suggests a number of small improvements that Adrian has wanted to work on for a while. We decide that, rather than try and work on stuff as a single team, we should maintain the two separate team workflows for that week.

I head back to my desk to look at the broken chat routing Jack identified but I’ve barely been sat there for 5 minutes before Andrew calls regarding an issue with the app. Apparently it just hangs when he tries to enter certain screens. It was doing this yesterday as well; I wish I’d known about it yesterday as I could have investigated it properly, but he is going into a meeting in 10 minutes and I’m not sure there is much I can do here.

I take a quick look at the data on the server to check that there is nothing obviously wrong and I also take a look at the logs to see if I can find out where the problem is occurring. Nothing. It’s possible that the chat service and the main API are out of sync, so I suggest to Andrew that I can redeploy one or the other and see if that fixes it. He doesn’t want to gamble on that making things worse though. I tell him he can’t use the test environment as we’re on release day and it’s going to be high churn. ‘Never mind, I’ll just wing it’, he tells me.

Before I get started on my code again, I head over to Laura’s desk as she is sat there with Valerio looking glum. She doesn’t think that the badges issue is going to be done in time for the release. The pair of them take a few minutes to explain the problem to me, with the aid of a conveniently located whiteboard. It seems that they’ve tried a couple of different approaches but the problem is that we’re trying to maintain two data structures that are fundamentally at odds and whichever way they do it something is going to be not quite right.

Valerio seems to have a solution that involves rewriting one of the data stores (at least, that is what I understand) and I ask him how long he thinks it will take to do that. 2 to 4 hours, is the response, if he has no interruptions. But he is on dev support today, so he will probably have interruptions.

I volunteer Laura and me to cover his dev support if he can work on fixing the issue and Laura seems relieved. It’s always demoralising to pick up an issue that looks simple but turns out to be a lot more difficult than imagined due to the way a particular feature is written. It is particularly demoralising when you are relatively new to development, because you feel like you should be doing better and you are left with the nagging feeling that it’s not the code that is being difficult but you. I hope that the fact that Valerio was also struggling with the problem is enough to assure Laura that this was just a difficult problem. She does seem happier now that she doesn’t have to deal with it, at least.

I finally get to start looking at the chat initialisation bug, but it’s a challenging one. There’s no obvious reason why I have broken that function and when I try to reproduce it locally it all works correctly, regardless of whether it is the renter or the agent who initialises. I need some better diagnostics, so I update the code with some minor logging additions and while I’m waiting for it to deploy I help Laura with a dev support issue.

We have a negotiator who is complaining that she keeps getting logged out of the app and that she is missing leads as a result. However, when we look in the database it’s clear that she hasn’t had to log in for a couple of months and the events on Mixpanel show she has successfully been retrieving her profile and leads on a regular basis. What’s more, the events clearly show that she is accessing the app immediately after receiving push notifications and there is less than a couple of minutes delay, so that bit of the function is also working.

It turns out that the real reason she is missing leads is that she has a colleague who is consistently responding to them a lot faster than she is – within 30 seconds! On the one hand, we’re pleased, because this is exactly the kind of thing we are trying to achieve with the two apps, but on the other it feels like there could be a usability problem here in that it doesn’t seem clear to the unhappy negotiator that this is why the leads are being missed.

We need to call her anyway to ascertain if the password issue is still a problem or whether she was conflating a previous issue she had (this was a known problem a few weeks ago) with the more recent one, so it feels like we should try and ask her some open ended questions about the experience as she’s probably the first example we have in the field of an agent being out-responded by a colleague as in most cases we’ve only given the app to one negotiator per branch.

I get back to the chat bug. I have some log diagnostics now and I can reproduce some of the calls between the services in Postman. I still can’t see the problem though – when I run through end to end it all seems fine. It’s not until I’m about to go for lunch that I spot it – there’s been a change of case.

RavenDB stores IDs in the form [Collection]/[IdNumber], where [Collection] is saved in Pascal case. But some of our apps use camel case or lower case when they call the endpoint, so e.g. UserInformation/123 becomes userInformation/123. We can ensure consistency in the code by loading the document and then using the id from the loaded entity rather than the incoming string. As part of my refactoring, I had failed to do this, and now the search for an existing chat was failing to match the UserInformation tag due to a case mismatch.
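To make the case mismatch concrete, here is a minimal sketch in Python, purely for illustration. Everything in it is a hypothetical stand-in: `FakeStore`, `canonical_id` and `find_chat` are not our real services or the RavenDB client API, and the sketch assumes that loading a document by ID tolerates a difference in case while the chat lookup compares strings exactly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    id: str  # the ID exactly as stored, e.g. "UserInformation/123"


class FakeStore:
    """Hypothetical in-memory stand-in for a document store whose
    load-by-ID lookup ignores case (an assumption of this sketch)."""

    def __init__(self, docs):
        self._docs = {doc.id.lower(): doc for doc in docs}

    def load(self, doc_id: str) -> Optional[Document]:
        return self._docs.get(doc_id.lower())


def canonical_id(store: FakeStore, incoming_id: str) -> Optional[str]:
    """Return the ID as stored, rather than trusting the caller's casing."""
    doc = store.load(incoming_id)
    return doc.id if doc is not None else None


def find_chat(chats, user_id):
    """Exact, case-sensitive match on participant IDs."""
    return next((c for c in chats if user_id in c["participants"]), None)


store = FakeStore([Document("UserInformation/123")])
chats = [
    {"id": "Chats/1",
     "participants": ["UserInformation/123", "UserInformation/456"]},
]

incoming = "userInformation/123"  # camel case, as sent by one of the apps

# Using the incoming string directly misses the existing chat...
missed = find_chat(chats, incoming)

# ...while the ID taken from the loaded entity matches it.
found = find_chat(chats, canonical_id(store, incoming))
```

With the raw incoming string, `missed` is `None`, so a new chat would be created; with the ID taken from the loaded document, `found` is the existing chat.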

Having finally identified the problem, the fix is pretty simple, and I get it into our test environment. But it’s 13:20 and I feel like I should have fixed this earlier, to be honest. We’re supposed to have a planning session and a retro this afternoon, but I reckon we’re going to be struggling to fit the retro in and release on time. I don’t want to cancel the retro as we’ve not been particularly good at having them (we’ve probably done 3 over the last 3 months) and I think it’s an important part of our continuous improvement loop, but it is what it is and it’s less important (at least, to the business) than getting the app submitted.

I wonder whether we should move to Wednesday to Wednesday sprints. It’s something that we did in my previous job and it meant that we could do a retro the morning after with the sprint work still fresh in our minds. It also meant that any last minute scrabbling tended to occur mid-week rather than at half-past-pub on a Friday when people would rather be anywhere other than the office. I make a note on my task list to bring this up at the next dev lunch (something else we missed this week). NB should amend Wednesday’s post with this info. 

After lunch, I test that my fix for the chat routing actually works (it does), and then I’m able to reel off and verify about 5 other backend tickets which were all at least partially solved by this and the work that I did the previous evening refactoring and commonising our request viewing code. After that, it’s 14:40 and I have around 40 minutes before our planning session – I do a quick check where everybody is at. Laura thinks she’ll have finished the styling work in the renter app; Valerio is looking good for the badges fix; Jordi is still struggling with a specific case of the router bug in the negotiator app related to offers.

We decide that the router issue is not a show stopper if it is only reproducible in relation to offers – we have so few offers going through the app at the moment that it’s very unlikely to surface in the field. That means Jordi can focus on tidying up a couple of offer message related issues. The same for Laura – there’s still some back end work to be done for the offer messages, but as long as the styling and handling is there then it can be done after we submit for release.

Then it’s planning and Jordi and Laura and myself all sit round and go through what will be on their plate for the following week when I’ll be on holiday. It’s going to be a micro-sprint tidying up some of the work that got descoped this week and as there are only two of them and the Friday is a bank holiday I reckon we have a capacity of about 20 points. That doesn’t get us very far, enough to finish off some of the smaller items that got descoped this time round and also for Jordi to do some work on chat performance improvement.

The latter is something that we’ve wanted to do for a while, since we currently have an MVP implementation of chat that polls a lot of data from the server, and now seems like we may finally be able to make some time for it. Jordi’s a little unsure about it but I think it will be a good exercise for him and it will also help us spread a bit more knowledge about the chat back end around the team.

I let them both know that they’ve done a great job this sprint – we’ve had the usual issues with scope creep and changing goalposts, but after committing to 63 points in the sprint we’ve actually done 77.5 and we’re on track to deliver exactly what we said we would – releases for both renter and negotiator apps with a suggested properties MVP. And we’ve done this despite both Jordi and I taking sick days and a wasted day I spent wrestling with Android Studio. We discuss what we’re going to do to celebrate; one suggestion is a team outing to Barcelona to visit Jordi, who will be moving back there and working remotely next week. I reckon we can make it happen.

We’re out of planning by 15:50. In the meantime, Valerio has finished his work on the renter app and so we can merge that in; the main thing left to do is some last minute regression testing. Jack and Will volunteer to help with that as well while Laura and Valerio handle the merge and getting a suitable version of the renter app onto test flight.

We also need to release the API, which is risky this late on a Friday but we need to ensure that it’s consistent with the app so that there are no problems with the Apple review process. I volunteer to check on it later in the evening and the following day. That way, if there are any visible errors or problems I can roll it back easily enough. I also handle the release, which consists mostly of clicking the ‘deploy’ button on our Octopus installation, running through some basic tests in the app to check that the happy paths still work and watching for any errors in the live logs.

At 17:15 we have TGIF, which consists of going round the room and telling everybody how we helped a renter. It’s a weekly ritual that has been going since long before I joined the company. The idea is that everybody maintains a focus on what the company’s mission is. It’s become easier to contribute over the last couple of months; I didn’t used to be allowed to include work I’d done as a developer, which meant I was always trying to think of some person I’d spoken to or given advice to and it was often pretty spurious. But as we’ve moved more towards focussed, data driven decisions, one of the changes is that we’re now allowed to contribute anything as long as it is aligned with the company’s current focus. In this case, several of the things that we’ve done over the last week should directly impact offer conversion, but I decide to focus on offer communications themselves.

Then it is also a chance for people to shout out particularly good work by colleagues and for Aidan (or Jack if Aidan is not around) to reflect on the week and talk about how things are going. We’ve shaken it up a little recently, with weekly figures now being shared on a Monday morning when we can take in the weekend as well, so the speech at the end is a bit shorter. Jack wants to shout out the development teams this week for delivering releases on both apps on time (though we haven’t actually released yet) and I second that, particularly calling out Jordi, Laura and Valerio for their work over the last couple of weeks.

Finally, it’s back to our desks to release the apps. Jordi takes charge of the negotiator app release and Adrian starts the renter app release, though we discuss whether it should be somebody else who manages the full release process. Everybody is pretty keen to get down to the pub, so in the end both Jordi and Adrian grab their laptops and decide to release from there. I have a couple of things I want to tidy up before I go, since I won’t be in on Monday, so I form the rear guard with Will and Nicola.

The apps are finally submitted to the app store at around 18:30. All that’s left then is to chill out with a drink.


