Company Description
Tamr is the enterprise data mastering company trusted by large enterprises like Blackstone, the US Air Force, Google, and GSK. The company’s patented software platform uses machine learning supplemented with human feedback to master and prepare data across myriad silos to deliver previously unavailable business-changing insights. With a co-founding team led by Andy Palmer (founding CEO of Vertica) and Mike Stonebraker (Turing Award winner) and backed by top-tier investors such as NEA and GV, Tamr is transforming how companies get value from their data.
You will work on the enrichment and data products team at Tamr, playing an active role in product development. You’ll gain new skills related to data engineering and analysis through hands-on exposure to real challenges facing our enterprise customers. You’ll learn how multibillion-dollar companies are using machine learning and leading-edge technologies to modernize their infrastructure and turn their data into a competitive advantage. Over the summer, you will get broad exposure to teams and leaders across Tamr, giving you insight into the operations of a growth-stage company and the range of potential opportunities.
Responsibilities
Perform exploratory data analysis on new data sources
Develop algorithms for cleaning data and engineering features
Build data pipelines to feed data to machine learning models
Collaborate with software engineers on developing applications to deliver data
Challenges that make this job interesting:
The problem we’re solving is hard: enterprise data is messy and there is a lot of it. It’s our job to derive value from this data in a flexible and scalable way.
We’re working at the cutting edge: we’re responsible for the innovation, experimentation, and development that go into building new products that our customers find useful.
Qualifications
All undergrad and graduate students are welcome to apply
Major or minor in a technical field
Demonstrated interest in and aptitude for working with data
You’ve built data pipelines to prepare data for analysis
You have experience with data analysis, data cleaning, and feature engineering
You’ve written code in Python and SQL
Other Preferred Qualifications / Nice to have:
You have experience with any of the following technologies: Spark, GCP, AWS, Azure, GitHub
You’ve built, trained, and tested machine learning models on a variety of datasets