How I became a Data Scientist during COVID (as a Doctor): 5 takeaways from an unconventional career transition


Background

3 years ago, I had just finished medical school and started working full-time as a doctor in the UK’s National Health Service (NHS). Now, I work full-time as a data scientist at dunnhumby, writing code for “Big Data” analytics with Python and Spark.

More and more people are making the transition towards data science, or related technical roles, from a variety of disciplines. So in this article I’m going to share my experiences and advice for making a (perhaps) unconventional career transition into a technical role. I can break these down into five main learnings:

(1) Find technical friends

Coming from a medical background, I didn’t have the first clue about how to develop coding skills or data science understanding. And neither did anybody around me.

This made it really important for me to branch out and find people who did. I quickly saw the benefits of doing so.

Early inefficiencies

When I initially started out, I’d have to resort to Google or StackOverflow to try and solve my problems. These are great resources, but it’s hard to find what you want when you don’t really know what you’re looking for.

One of the top skills a developer needs is knowing what to search for to solve the problem in front of them. Without that skill, I would spend ages stuck at a relatively simple hurdle, like trying to manipulate a pandas DataFrame in a particular way, or working out how to install and import the package I needed.

I’d heard that it’s best to think of your own projects as a means to learn. However, without insight into what it takes to build a project, and what’s possible, I would typically come up with over-ambitious projects with too many moving parts. I remember an early project idea was to build a chatbot patient for doctors to practice with, and I even started collecting transcripts from real conversations to help make this. In hindsight, this type of task was way too ambitious for someone of my technical level at that time (having just completed an Intro to Python course).

Getting by with a little help from some friends

Having technical friends is great for overcoming both of these sources of inefficiency.

If you have a relatively simple technical issue, but don’t know where to go to solve it, a technical friend can point you in the right direction pretty quickly. This saves a lot of time and frustration.

Likewise, if you come up with a project idea, you can run it by a technical friend. They’ll be able to break it down into stages and ultimately advise you whether it makes sense to do and how best to go about it. This can save you a lot of time from barking up the wrong tree.

How to make technical friends

I don’t think there’s a “right” way to find technical friends and establish a relationship where you can ask them for advice. Here are a few principles that I find helpful.

Firstly, being open and honest about my intentions (“I’m learning to code and would love someone I could ping a message to when I get stuck”).

Secondly, being respectful of their time. I pushed myself to only ask for help if I’d truly searched for the solution and spent time trying to solve it myself. (To be honest, I think you also learn better this way.)

This wasn’t always easy. Sometimes I’d hit the initial wall of frustration and feel an urge to send off multiple messages, hoping for a quick solution. I tried to have a good crack myself first, but I’ll admit I sometimes caved in.

It was also helpful to have multiple people I could go to. That way I didn’t have to keep bugging the same person, which reduced the risk of annoying them.

I personally met these friends in multiple places: from attending events that interested me (such as data science and machine learning meet-ups), from working on projects together (more on that in Sections 2 and 3) and from a smattering of formal ‘networking’, friends-of-friends and random LinkedIn messages.

At some point, when it felt comfortable, I’d reach out for advice on a specific problem or project I was working on. Sometimes the problems were simple or the project ideas were bad, so I had to put my pride to the side and seek out the constructive criticism.

If you’re starting out, and looking for a technical friend to help you get started, feel free to reach out (email at hi@chrislovejoy.me, Twitter [@ChrisLovejoy__](https://twitter.com/ChrisLovejoy)).

(2) Build a portfolio: the rule of 3

One of the first pieces of advice I received from a mentor is one that has stuck with me since:

Have a portfolio of three great projects.

Initially, for me, this just meant striving to get three projects under my belt. I went to hackathons, interned at companies and designed my own projects.

Once I achieved that, I kept looking at how I could build on them or replace them with new and cooler projects.

It’s so simple, but I’ve found it a useful way to frame it. You’re only as good as your top three projects.

To this day, the lower right-hand side of my CV is dedicated to my top three projects at that moment in time.

If a round of job applications comes up, or if somebody asks me about work I’ve done, I know what to talk about. These three projects are always in mind.

Getting projects

I’m a big advocate of designing our own projects, both for learning and for potentially contributing to the community. But it’s not always easy, particularly when starting out.

A good source of ‘ready-made’ projects to work on is Kaggle. They provide a dataset and often specific challenges. You can also see other people’s solutions, which is a great source of learning.

A great way to devise a project within a team is to attend hackathons. These are typically weekend sprints to develop a solution to a problem and are held in most major cities around the world.

One thing I found really helpful was attending a project-based course. So much so, that I’ll devote the next section to it.

(3) Go on a project-based course or bootcamp if you can

Even with all the motivation in the world and a great team of technical mentors, it can still be challenging to build great projects and learn new skills off your own bat. There’s a lot to be said for being in the right environment, and having good projects defined for you.

The best place for this would be to get a full-time job in a technical role. However, it’s not always possible to jump straight into this.

A really great intermediate step can be to go on a project-based course.

These typically range from around 5 weeks to a few months and are centred around a group project that produces a tangible output. There’s typically a partnership with a commercial client who has a genuine interest in what you are building.

I went on the “Science to Data Science” (S2DS) virtual course. I found it really helpful having a defined project and having responsive technical mentors to go to for any problems that arose.

I learnt a huge amount during the project; in particular I learnt how to structure source code, became more familiar with GitHub, gained a better understanding of regression performance metrics and learnt the PEP 8 Python coding guidelines. (I’ll be sharing a full post on this in future.)

Another course I’ve heard good things about is the ‘ASI Data Science’ course (who I think have now re-branded as ‘faculty.ai’).

Note: I’m based in the UK. S2DS is international. I’m not sure about ASI. I’m sure there are programs abroad that I’m not familiar with.

One word of warning is that there are a lot of courses which, in my opinion, over-charge. This is a reflection of the area having become popular, but it isn’t necessarily a reflection of the value that courses offer. The S2DS course I attended cost £800, which felt like fantastic value after attending, but I’ve seen many courses in the £5,000+ range.

(4) Nail down understanding of core concepts

Data science is a really big (and expanding) field. There’s a huge amount that you could learn and it’s easy to be overwhelmed when starting out.

My approach has been to work towards a solid understanding of (i) core concepts and (ii) my specific areas of interest.

So I guess the question is: what constitutes a ‘core concept’?

I can’t claim to be an authority on what you should know, but these pages appear useful:

If I were to suggest core concepts and skills, it would be something like:

PROGRAMMING:

  • comfortable with Python, pandas, NumPy and scikit-learn
  • familiar with GitHub
  • familiar with the command line
  • familiar with installing packages

THEORY:

  • classification algorithms (SVMs, random forest, logistic regression, AdaBoost): higher level principles of how they work
  • regression algorithms (linear/OLS regression, lasso and ridge regression, regression trees): higher level principles of how they work, and common considerations
  • performance measures for classification and for regression algorithms
  • a familiarity with neural networks and deep learning
  • clustering algorithms: k-means and hierarchical clustering
  • familiar with dimensionality reduction techniques such as PCA
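
To make “performance measures” a little more concrete, here is a minimal sketch in plain Python of a few common ones: MAE, RMSE and R² for regression, and accuracy, precision and recall for binary classification. The function names and the toy numbers are made up for illustration. In practice you would more likely reach for the equivalents in scikit-learn’s metrics module, but writing them out by hand is a reasonable way to check you understand what each one measures.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n           # average absolute error
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    rmse = math.sqrt(ss_res / n)                    # penalises large errors more
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when every true value is identical
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # of everything predicted positive, how much really was positive?
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # of everything really positive, how much did we find?
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy data, invented purely for this example
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(reg)
print(clf)
```

Which measure matters most depends on the problem: recall tends to matter more when missing a positive case is costly, for example, while RMSE is a natural choice when large errors are much worse than small ones.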

I’d suggest using online courses to build up your understanding in each of these key areas. Ones I found helpful were Brilliant.org and Khan Academy for principles and maths, plus various courses on Coursera and Udemy for the more technical aspects.

As for the skills on top of this, I’d argue it depends on the industry of interest and type of work you’ll be doing. If working with Big Data, Apache Spark will be helpful. If working with time series, familiarity with ARIMA models will be helpful.

(5) A Master’s Degree (or other formal qualification) isn’t essential, but can help

There are a lot of data science roles that don’t specify a master’s degree as a formal requirement, and I think it’s possible to get a job without one.

However, for me, and coming from an unconventional background, I found it hard to get taken seriously until I started working towards one.

I think if my bachelor’s degree had been more directly relevant to data science (rather than Medicine), I might not have needed to. A good bachelor’s + proof of practical experience is sufficient in many cases.

I ended up choosing a master’s degree in Data Science and Machine Learning at UCL in London. Even before starting it, I was pretty confident that I had a good grasp of key concepts from my own self-study.

But a lot of job applications led to straight-out rejection, so I never had the chance to prove myself. And I can completely understand why. If you have a lot of applications for a role, the one that says “I’m a full-time doctor, but have done loads of data science self-study” is a pretty easy one to remove from the pile.

The master’s didn’t guarantee I’d progress to an interview, but I definitely felt it helped me get a foot in the door.

(Whether or not I’m fully behind paying X thousand for a master’s degree in our current climate of remote courses is a matter for another day, however…)

Final thoughts

I’m absolutely loving work as a data scientist. It’s really satisfying to see the hard work pay off, even if the journey was tough at times. It feels great to have two skillsets in my portfolio (as both a doctor and a data scientist), and it’s one step towards a more ‘portfolio’ approach to work in future. I hope that the experiences and suggestions I’ve shared here help you with your transition, too.

Best of luck! :)

Many thanks to Abdel Mahmoud and Luke Harries for reviewing this article.