Cross-posted from https://eng.hudya.io/how-we-launched-electricity-services-in-eight-weeks/, original post from Sept 2017
Hudya Engineering is all about building a digital platform that delivers multiple services to regular people’s homes and families, and doing so for Norway, Sweden, and Denmark (phase 1) in parallel. However, “a digital platform” is not something you magically pull out of your sleeve. It is purpose-built and carefully designed, step by step, and it takes time. Along the way, you can solve the wrong problems in the wrong way, and there are many, many choices to make.
Pretend…
A typical approach when you start a new company is to pretend: pretend that things are automated, pretend that there are more people in the company than there really are, pretend that your processes are more complete than they really are. At Hudya, we want our customers and partners to trust us, so we believe we need to be open and honest, including about our shortcomings. I will therefore share in more detail, under the hood, how we went about launching our electricity service in eight weeks.
We really wanted to be as lean as possible in everything we did, that is, not spend time coding things we didn’t really need. This is easy to say and hard to do: as you learn more, what you actually need, or the right solution to a specific problem, often turns out to be different from what you first believed.
Key Components
We identified the following things as necessary to build right away:
- Infrastructure and services to host the microservices platform we had decided to build
- A build pipeline and tests to allow code to be deployed often with high quality
- A set of backend services to store data about new customers and the electricity service
- A signup application to allow customers to sign up from PCs and mobiles
- Authentication and authorisation
- A way of interacting with customers and tracking the interactions
- A backend integration with our energy partner
- A way for our customers to pay
Note that this was after I had drawn up the initial v1 of our microservices-based architecture, so we had already decided to deploy small, standalone code components in Docker containers, use React (JavaScript), and only interact with the backend using REST-based services.
I wanted us to be very focused on solving customer-specific needs. You see, ask an engineer to build a platform, and you get more than you thought you asked for… I thus focused the work on solving one specific thing: allowing new customers to sign up from the web. You can see the (current, not original) signup at https://signup.hudya.io/.
Minimum Workable Product
To focus everybody on solving only the problems we needed to solve, I wrote a specification of the signup process and what was supposed to happen at each step, a bit similar to what you might call an MVP (Minimum Viable Product). Then I applied a technique inspired by Occam’s razor: for every piece of functionality, could I take it out and still have something that could be tested or tried? What you end up with is not something usable to a customer, but rather what I call an MWP, a Minimum Workable Product: something you can work with and improve.
Scaffolding
With this bare-bones functionality, everybody could work on their pieces and do only the absolute minimum necessary to have something running, something we could touch and use. To increase the velocity even further, we used a technique I call scaffolding. Instead of writing full production code for all tasks, we simplified and wrote scripts that allowed us to do what we wanted in the beginning, learn critical things for our production code, and avoid solving problems we didn’t need to solve at that point. The example below shows a snippet from a script that adds a new customer to Stripe (for credit card payments, number 8 in our list) and registers an event in Intercom (how we track customer interactions, number 6 in our list).
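    import sys

    import stripe

    # Snippet: subscription, list_cust, cust, cust_id, plan_data, trial_days,
    # tax, intercom, intercom_token and intercom_event_plan_added are all
    # set up earlier in the script.
    if not subscription and not list_cust:
        print("Adding plan to customer...")
        try:
            # Create the Stripe subscription for the chosen plan.
            subscription = stripe.Subscription.create(
                customer=cust_id,
                plan=plan_data['id'],
                trial_period_days=trial_days,
                tax_percent=tax
            )
            print("Added plan to customer.")
        except Exception as err:
            # Stop here so that no Intercom event is sent for a failed plan.
            print("Failed adding plan to customer: %s" % err)
            sys.exit(1)
        # Register in Intercom that the plan has been added.
        intercom.send_event(
            token=intercom_token, email=cust['email'],
            event=intercom_event_plan_added,
            meta_data={'trial_period_days': trial_days}
        )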
So, this script was not run automatically when a new customer was added, but manually once our energy partner had confirmed the electricity service. Obviously, with lots of new customers, automation is needed, but in the beginning we wanted to stay close to how things work and not add unnecessary automation too early.
Use Partners With APIs
We also decided quite early on to use partners whose services are exposed through good APIs, to get as far as possible as quickly as possible:
- Use Stormpath as an external identity and authentication/authorisation provider
- Use Intercom as a way to track customer status, events, and interactions
- Use Stripe for payment processing
- Use Bisnode for address lookup
Two of these (Stormpath and Bisnode) turned out to give us quite a lot of trouble, while Stripe and Intercom have worked out really well. Stormpath was acquired/hijacked by Okta, and we basically had to write identity, authentication, and authorisation twice (a setback of approximately four weeks). Bisnode launched a new REST API that turned out to be very immature, and it has cost us lots of extra work to identify where the issues came from and to work around them.
Set The Target, Implement a Quick-Fix
One of the big things in our architecture is privacy, and what we call a self-adaptive engagement platform (more about that in another post). We are designing privacy into the core. However, we cannot build the fully secure, audited, multi-product, multi-country compliant platform from day 1. So, for customers approving our terms and conditions, we have a simple solution: we decided that for the first product (electricity) we could use the same terms and conditions across all countries (something we try to do anyway to simplify things). This allowed us to store each customer’s acceptance of the terms and conditions as a timestamp and a commit hash of the terms and conditions text from the code repository (a commit hash is a unique key identifying one specific version of a text). For example, the hash ececb95 identifies the revision that contains the update specified in issue HD-116 in our Jira tracker.
The terms and conditions are also generated and made available online directly from this code repository. Thus, we could then postpone the audit and acceptance service that we planned!
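The timestamp-plus-commit-hash idea described above could be sketched roughly as follows. This is a minimal illustration and not our production code: the function names, the record fields, and the customer id are hypothetical, and it assumes the terms text lives in a local git checkout.

```python
import subprocess
from datetime import datetime, timezone


def current_terms_hash(repo_dir, terms_path):
    """Return the short hash of the last commit that touched the terms file.

    Hypothetical helper: assumes repo_dir is a git checkout that contains
    terms_path and that the git command-line tool is installed.
    """
    out = subprocess.check_output(
        ["git", "log", "-n", "1", "--format=%h", "--", terms_path],
        cwd=repo_dir,
    )
    return out.decode("utf-8").strip()


def acceptance_record(customer_id, terms_hash, accepted_at=None):
    """Build the record to store when a customer accepts the terms.

    The field names are assumptions made for this sketch.
    """
    if accepted_at is None:
        accepted_at = datetime.now(timezone.utc)
    return {
        "customer_id": customer_id,
        "terms_commit": terms_hash,
        "accepted_at": accepted_at.isoformat(),
    }


# Example usage with the hash mentioned above (the customer id is made up):
# record = acceptance_record("cust_123", "ececb95")
```

Because the hash pins one exact version of the text, the record stays meaningful even after the terms are updated: the accepted wording can always be retrieved from the repository.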
Key Learnings
We managed to launch electricity services in eight weeks, so we were quite happy with that result. However, there are things to be careful with when using such a technique. First of all, your technical debt (stuff that you need to fix or improve to have “the right quality”) can quickly get out of hand. The non-technical people in your company will be super-happy that things move so quickly, and then, when you slow down feature work to consolidate, they don’t understand. You thus need to manage both the technical debt and people’s expectations. If you don’t, you will hit the wall at some point.
When using partners, you are quite vulnerable to their quality of service (and to whether they stay in business, as we saw with Stormpath). This is hard to manage. You need to do due diligence up front, but even if you do, you can get hit like we did with Stormpath.
Another thing that can be a bit tricky to handle is when your assumptions are wrong. I described above how we postponed the audit service because electricity needed only one set of approvals. This approach (read more about using pivotal assumptions on my personal blog) gave us velocity that we leveraged to get other, more important things done. However, we have now learned that we may have to store records of a second type of approval. We then have to either implement another quick-fix (which rapidly becomes a “hack on a hack” and technical debt), or implement the audit service at a bad time. If we choose the latter, other things we had planned will be pushed out, and it’s just difficult to explain this to non-technical people: “It’s just another check box! How can it be so hard?!”