The HubSpot Coding Challenge and Me

I sent a copy of my resume to HubSpot, and one of their talent acquisition people got me on the phone to talk. It sounded like a potentially good fit, and I was invited to take the HubSpot coding challenge.

Which I did.

I have written about how badly coding challenges are often designed and how to do it better. It seems that HubSpot also took a hard look at coding challenges a few years ago, came to the same conclusion, and revamped their process.

The new HubSpot coding assessment is a small coding project, which you can perform using whatever tools you’d like. It has a 3-hour time limit, but they’ve designed it, they say, to fit into an hour or two. (More on this later.) Your solution does an HTTP GET at a specific URL, using a personalized API key, and the server sends a data structure encoded in JSON. Then your code needs to do some computation on the data and to POST the result to another URL.

This does not raise any of my red flags. It sounded pretty exciting, so I was looking forward to completing it.

Beforehand prep

I did as much as I could before beginning, using whatever information I had.

I knew the solution would be a REST client, so I chose Node.js as an environment. JavaScript grew up doing REST clients. (Back in the day, we used to call it “AJAX,” even after we exchanged XML for JSON.) And Node now brings all of that to the command line.

I spent a few days refreshing my JavaScript, delving into Node, and developing a minimal REST-client app that GETs data from a URL and POSTs it back, complete with an integration test. I even created a package that can be used to quickly spin up a minimal REST server in Node, for automated integration testing.

After I felt I had done all I could to master the background, I took a break over the long Fourth of July weekend.

The following Tuesday, I clicked on the challenge link.

Tackling the challenge

Without getting into details that HubSpot wants to remain private, my solution involved several processing steps:

  1. GET the input dataset from a specified URL, using a personal API key.
  2. Rearrange the input data into a more workable form.
  3. Calculate the desired answer to the challenge problem. (This is the business logic of the application.)
  4. Format the answer per the API requirements.
  5. POST the result to another specified URL, using the API key.

All pretty standard stuff.

Everything except for step 3 is coordination: I/O and rearranging data. Most of it can be expressed semi-declaratively in JavaScript, that is, without mutable state, and my top-level function (which I called main) has no mutable state. The code that implements the core algorithm (step 3) went into its own module, and everything else went into the main script.
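A minimal sketch of how such a five-step pipeline can be wired together, not the submitted solution. The endpoint paths (/dataset, /result), the userKey query parameter, the dataset shape, and the placeholder solve function are all assumptions made up for illustration, since the real API and business logic are private. It assumes Node 18 or later, where fetch is a global.

```javascript
'use strict';

// Hypothetical endpoint paths and query parameter; the real ones are private.
const datasetUrl = (baseUrl, apiKey) =>
  `${baseUrl}/dataset?userKey=${encodeURIComponent(apiKey)}`;
const resultUrl = (baseUrl, apiKey) =>
  `${baseUrl}/result?userKey=${encodeURIComponent(apiKey)}`;

// Step 1: GET the input dataset and decode the JSON body.
async function getDataset(baseUrl, apiKey, fetchFn = fetch) {
  const response = await fetchFn(datasetUrl(baseUrl, apiKey));
  if (!response.ok) throw new Error(`GET failed: ${response.status}`);
  return response.json();
}

// Steps 2-4 are pure functions. These are placeholders, not the real logic.
const rearrange = (dataset) => dataset.items ?? [];  // step 2 (assumed shape)
const solve = (items) => items.length;               // step 3 (stand-in)
const formatResult = (answer) => ({ answer });       // step 4 (assumed shape)

// Step 5: POST the formatted result as JSON.
async function postResult(baseUrl, apiKey, result, fetchFn = fetch) {
  const response = await fetchFn(resultUrl(baseUrl, apiKey), {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(result),
  });
  if (!response.ok) throw new Error(`POST failed: ${response.status}`);
  return response.status;
}

// Top-level coordination: only const bindings, no mutable state.
async function main(baseUrl, apiKey, fetchFn = fetch) {
  const dataset = await getDataset(baseUrl, apiKey, fetchFn);
  const result = formatResult(solve(rearrange(dataset)));
  return postResult(baseUrl, apiKey, result, fetchFn);
}
```

Passing fetchFn as a parameter is one possible design choice here: it lets the coordination code be exercised with a fake in a unit test, while the default still uses the real fetch.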

After setting up a skeleton project, the first thing I did was to write an integration test for the main script. This test spins up an in-memory HTTP server, with mocks that implement the API’s GET and POST methods. Then it spawns the main script, waits for it to exit, and checks whether it fetched the input dataset and provided the correct result. It does so using the sample data that HubSpot provided in the project requirements.
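The shape of such a test harness might look roughly like the following sketch, which uses only Node's built-in http and child_process modules. The paths /dataset and /result and the helper names startMockServer and runNode are hypothetical, chosen for illustration. They are not the real API or the actual test code.

```javascript
'use strict';
const http = require('node:http');
const { spawn } = require('node:child_process');

// Start a mock API on an ephemeral local port. The /dataset and /result
// paths are hypothetical stand-ins for the real (private) endpoints.
function startMockServer(sampleDataset) {
  const posted = []; // JSON bodies received by the mocked POST endpoint
  const server = http.createServer((req, res) => {
    const { pathname } = new URL(req.url, 'http://localhost');
    if (req.method === 'GET' && pathname === '/dataset') {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(sampleDataset));
    } else if (req.method === 'POST' && pathname === '/result') {
      let body = '';
      req.setEncoding('utf8');
      req.on('data', (chunk) => { body += chunk; });
      req.on('end', () => {
        posted.push(JSON.parse(body));
        res.writeHead(200).end();
      });
    } else {
      res.writeHead(404).end();
    }
  });
  return new Promise((resolve) => {
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      resolve({
        baseUrl: `http://127.0.0.1:${port}`,
        posted,
        close: () => new Promise((done) => server.close(done)),
      });
    });
  });
}

// Spawn a separate Node process and resolve with its exit code.
function runNode(args) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, args, { stdio: 'inherit' });
    child.on('error', reject);
    child.on('exit', (code) => resolve(code));
  });
}
```

A test would then start the server with the sample dataset, call something like runNode(['main.js', '--base-url', mock.baseUrl, '--api-key', 'test-key']) (the script name and flags are assumptions), and compare mock.posted against the expected result before closing the server.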

Most of this integration test was boilerplate adapted from the previous week’s experiments with my minimal REST client. Even so, it took almost an hour to complete (and contained a couple of bugs that would come back to bite me later). This test code would be instrumental in testing and debugging the final app and would become part of the final package.

So I was over a third of the way into the allotted time, and all I had to show for it was a not-yet-used integration test. I think you can sense where this is going.

I developed the business-logic module using test-first programming. This actually proceeded quickly and smoothly. The core algorithm requires some data structures and clear thinking, but it isn’t terribly clever—nor does it need to be. I wish I could say more because I’m really quite proud of it. And in the real world, if needed, the code could easily be adapted to run under a map-reduce or streaming framework, in order to scale to handle as much data as you want.

With the business logic fully and correctly implemented, I returned to the main script. By this time, I had used more than half my allotted time, and I was starting to get worried. I implemented the main script and hit “run” on my integration test. And…

Bugs, bugs, bugs

Lots of typos. You know that phase of debugging in which you run, observe some syntax or other stupid error, fix it, rerun, observe the next error, fix it, rerun again, repeat ad nauseam?

At one point, the base URL and API key weren’t getting parsed correctly from the command line. WTF? I’m using commander; it’s foolproof, or so I thought. Turns out, I misspelled the names of the properties used to access those values, and JavaScript (being a loosely typed language) doesn’t complain about a property that doesn’t exist. It simply returns undefined.
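To illustrate this failure mode, and one possible guard against it, here is a small sketch. The options object and the requireOption helper are hypothetical: the object merely stands in for whatever a command-line parser hands back, and the helper is not part of commander.

```javascript
'use strict';

// Look up a parsed option by name and fail loudly if it is missing,
// instead of letting a misspelled property quietly evaluate to undefined.
function requireOption(options, name) {
  if (options[name] === undefined) {
    const known = Object.keys(options).join(', ') || '(none)';
    throw new Error(`Missing option "${name}"; available options: ${known}`);
  }
  return options[name];
}

// Hypothetical parsed options, standing in for what a CLI parser returns.
const options = { baseUrl: 'http://localhost:8080', apiKey: 'test-key' };

console.log(options.apikey);                   // undefined: wrong case, no error
console.log(requireOption(options, 'apiKey')); // test-key
```

With a guard like this, the misspelling requireOption(options, 'apikey') throws immediately and lists the option names that do exist, rather than passing undefined further down the pipeline.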

I finally got the thing running, then discovered that my getDataset function (which uses fetch to GET the input dataset) was now returning undefined.

Huh? This is f*ing boilerplate. What could possibly have gone wrong?

At this point, I had all the pieces in place, and I only had a few minutes remaining in my allotted 3 hours. So I ran the script against the real service. It correctly fetched the (massive amount of production) input data, calculated an answer, and sent back a result.

But it was the wrong result.

Nonetheless, at least I knew the undefined input dataset was a problem with my test, not with my code. Hacking with it for several more minutes, I discovered that I had—yes—accomplished another typo. Shrug.

By this time, the 3 hours had expired, and I was feeling much less stress.

I finally got the test running to a point where I could debug the main script. I had originally misformatted the result data. More debugging. More fixes.

The test still didn’t correctly detect success, but this was because the result data includes several arrays that are order-insensitive. So in my test assertions, I needed to first convert those arrays to Sets before comparing them.
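A sketch of that kind of order-insensitive assertion, using Node's built-in assert module, whose deep-equality checks compare Set items without regard to order. The result shape here (groups with a members array) is invented for illustration and is not the real result format.

```javascript
'use strict';
const { deepStrictEqual, notDeepStrictEqual } = require('node:assert');

// Replace each order-insensitive array with a Set so that element order is
// ignored when comparing. The groups/members shape is a made-up example.
function normalize(result) {
  return result.groups.map(({ name, members }) => ({
    name,
    members: new Set(members),
  }));
}

const expected = { groups: [{ name: 'a', members: ['x', 'y', 'z'] }] };
const actual = { groups: [{ name: 'a', members: ['z', 'x', 'y'] }] };

// Compared as plain arrays, the two results differ because of ordering.
notDeepStrictEqual(actual, expected);

// Compared as Sets, they are equal.
deepStrictEqual(normalize(actual), normalize(expected));
```

One caveat with this approach: a Set also collapses duplicates, so it only fits arrays whose elements are expected to be unique.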

However, I was confident that I had worked through all the issues. I ran the script against the real service, and it succeeded.

Time limits and effort estimation

“The coding project has a 3-hour time limit, but we’ve designed it to ideally be completable in 1-2 hours.”

I “finished” in 3 hours 20 minutes, then spent another hour performing minor refactorings and fleshing out the documentation—the spit and polish. Oh, and I fixed that final assertion in the integration test.

We used to say: In order to get a realistic estimate of how much effort your software project will take, you should start with what you think it will take and then multiply by 3.

Now, to be fair, I don’t know how long candidates usually take to complete this project. (And I also don’t know the quality of their code.) I may have taken longer to POST an initial correct result because:

  • I wrote my tests first, including the integration test.
  • I wrote some of my documentation first, when it helped to solidify my thoughts and document my logic.
  • I also snagged several times on bugs. Working through these did not take hours, though they did push me over the 3-hour limit.

But color me unconvinced.

Tests and documentation are not afterthoughts, but rather part of the process. You either acknowledge this and write them up-front, or else you spend even more time patching things up afterward. I used my tests in order to get my code working, as in each case they were key to helping me test (duh) and debug my code. I documented what I was doing so that I wouldn’t lose my train of thought as I worked through the phases of the project. And all of this became part of the final package.

To HubSpot’s credit, they did allow me to complete the project and submit it, even after the 3 hours were up. And they gave me as much time as I wanted after submitting a correct result to “clean up the code afterwards.” These are all positives.

And I admit that I enjoyed the project.

However…

The only concern I have is reflected in one piece of advice given me before I even started:

“I’d suggest being pragmatic about it rather than trying to be perfect, at least on your first pass.”

I’m not sure what that means.

So-called “quick and dirty” may be dirty, but it’s hardly quick. I mean, yeah, if I had just flung down code without tests, I might’ve gotten lucky and hacked together a working solution. And there are techniques to hack code faster (which we used to use before we had test frameworks). I could have, for example, hard-coded the sample input data in a getDataset stub, debugged my application without a test, and then written the test afterward.

Is that the kind of developer you want to hire?

Tests and documentation are part of the process, not an afterthought.

I’ll have to see what they think of my solution.

May all your bars turn from red to green.
Tim
