Data engineering combines elements of software engineering and data science and is one of the fastest-growing roles in IT. According to Indeed.com, data engineers develop and maintain the architecture used in data science projects. They are responsible for ensuring that data flows uninterrupted between servers and applications.
Data engineers must be familiar with a range of operating systems and databases and be able to write software. They are experienced with data warehousing and data analysis and must possess excellent critical thinking and communication skills. Data engineers may learn their skills through a combination of education, on-the-job training, and ongoing certifications. Indeed notes that acquiring a certification is an excellent way to showcase abilities and move ahead in the field.
To find out what’s involved in becoming a data engineer, we spoke with Lance Miles, a data engineer at unitQ.
Early education and employment
Miles earned a Bachelor of Science degree in neuroscience from the University of California, Santa Cruz, in 2013; a certification in data science from the University of Washington in 2017; and a Master of Information and Data Science degree from the University of California, Berkeley, in 2020.
“When I reflect on my career and the steps I have taken, one particular experience had a great deal of influence on me,” Miles says. “In the final quarter of college, a Python course, Programming for Biologists, set the groundwork for a new passion.”
Miles spent every day writing code to extract information from massive sequencing data sets, developing methods to calculate the physical-chemical properties of protein sequences, identify the length and location of genes, and characterize viral DNA.
“The ability to distill unwieldy data sets into concise results highlighted the power of pairing programming with biology,” Miles says. “This course challenged me in new ways, and I found myself completely hooked. At a personal level, the simple act of coding brought me happiness and gratification.”
From health sciences to data analytics
Although Miles had always been interested in technology, he started his career in the healthcare sector at Gilead Sciences, a pharmaceuticals company.
“My journey to becoming a data engineer has been far from straightforward, but what has connected it all together has no doubt been my interest in utilizing data to change how teams and companies look at their work and the impact it has,” Miles says.
At Gilead Sciences, Miles worked as a senior research associate in in-vitro biology, identifying clinically translatable biomarkers indicative of cardiovascular health. Each experiment he worked on yielded thousands of data points, but the data analysis was time-consuming.
“I saw an opportunity to streamline the analysis, creating Excel macros that efficiently parsed the data and extracted vital information,” Miles says. “This allowed the team to focus on digesting the results and deciding on the subsequent experiments. Seeing the impact of my work in identifying effective biomarkers, I sought to focus on projects where the impact on patients was clear and immediate.”
Upon completing his pre-clinical projects, Miles transferred to the clinical pharmacology group as a bioanalytical operations lead for antiviral clinical studies. “Tapping into my data analytics roots, I worked with clinical enrollment information to forecast when we would have pharmacokinetic data available for drug submissions,” he says. “In addition, I had opportunities to work with clinical data, where I compiled, cleaned, and analyzed patient data across multiple clinical studies to assess data quality.”
Becoming a data engineer
With the encouragement and help of senior leadership, Miles enrolled in a data science certificate program through the University of Washington. “This was my first exposure to machine learning and what ultimately cemented my desire to switch careers,” he says.
For the next two years, Miles worked as a consultant for Vir Biotechnology while getting his master’s degree. “Our coursework focused on the lifecycle of a data science project, from data collection and cleaning to model development and deployment,” he says. “Working with data day in and day out taught me the fundamentals of data science; I gained a deep appreciation for data engineering and machine learning alike.”
After earning the master’s degree, Miles worked as the lead data science engineer for a startup called Popdog. “As startups go, there was a lot of work to be done; the projects were extensive and the impact immediate,” he says.
Projects ranged from data analytics and data engineering to data science and machine learning. “Throughout my time at Popdog, I found myself gravitating toward data engineering projects, and I started to buy books and take online classes to bolster my understanding of the field,” Miles says. “As projects came, I began to focus on the data engineering aspects, given how critical this piece is to the success of data science projects.”
By the end of his tenure at Popdog, Miles had led a team of engineers to develop an end-to-end computer vision system, which ran millions of predictions on video data each day. “This involved a lot of data engineering work while integrating new technologies into our data stack,” he says.
In 2021, Miles accepted an opportunity at unitQ, where he currently works as a data engineer for the data team. The company “is the epitome of everything one could hope to have in a tech startup—an environment that fosters collaboration, innovation, and growth,” he says. “Using really novel ways to look at data, unitQ has developed cutting-edge machine learning solutions that address very clear problems for really interesting customers like Spotify, Quizlet, and Pandora in a novel way.”
Now, the data engineering problems Miles looks to solve are particularly challenging, “and I am truly excited for the road ahead,” he says. “This has been a great opportunity to learn and apply myself.”
A typical work week
“At unitQ we focus on the five V’s of data (volume, value, variety, velocity, and veracity), and the data team’s initiatives are centered around solving problems that address these categories,” Miles says. “We surface data-driven insights from user feedback, so that companies can improve product quality. Our systems ingest millions of pieces of feedback across dozens of data sources daily, and the list of supported data sources continues to grow each month.”
Projects and tasks revolve around things such as building out new capabilities to clean and prepare data for services, Miles says. “In addition, we are scaling out our microservices to handle much larger volumes of data, so that we can continue to meet the needs of our clients as our customer base grows,” he says. “A typical work week pulls from all of these initiatives.”
A memorable career moment
One of the more memorable moments in Miles’ career happened just a few months ago. “We had a unitQ customer interested in categorizing all of their feedback into a set of unique buckets,” he says. “Folks from all parts of the organization came together to build out an exciting new solution. We rapidly prototyped several machine learning models to see if we could build a system that could reliably address this problem.”
After a lot of testing and internal review, the team developed a solution for the customer in just one month. “Our customer was pleased with our results,” Miles says. “In addition, the feature we built proved valuable to many other customers, a true testament to our leadership, teamwork, and capabilities.”
Ongoing education and career development
To keep building knowledge in his field, Miles reads books, takes online courses, and draws on other online resources.
“I firmly believe that learning new technologies and staying up to date on the latest and greatest is crucial—especially because the tech sector is continually evolving,” he says. “My current interests are around scaling our data streaming services.”
The data science certificate program through the University of Washington was a critical component for getting into graduate school, Miles says. “And my master’s degree in information and data science was crucial for switching into the tech industry,” he says. “Having had little coding experience in my undergraduate career, this gave me the experience I was looking for.”
Inspirations and advice for others
A mentor once advised, “Don’t settle. I don’t want to see you here in a year. I want to see you pursue your passions.” This sounded harsh and somewhat clichéd, Miles says, “but it was an important theme I kept in mind as I worked towards switching career paths. It was scary to move from something that I was good at into a new field where I would be starting from square one. But this is what I wanted for myself, so I took the advice to heart.”
“There are a million ways to get into data engineering,” Miles says. “There are traditional paths and non-traditional paths like my own. If you are interested and passionate, find ways to gain experience and start digging into data. Whether through on-the-job experience, going back to school, taking a certificate program, or reading books and watching YouTube videos, there are many ways to get relevant experience and knowledge in data engineering.”