Job Detail

Data Engineer at
Palo Alto, CA, US

About: We are a leading digital business payments company with a network of over 3 million members, managing more than $60 billion annually. Making it simple to connect and do business, our cloud-based Payment Management Platform automates, simplifies, and controls the payments process, saving more than 50 percent of the time typically spent on it. We partner with seven of the leading U.S. financial institutions, more than 70 of the top 100 accounting firms, and major accounting software providers including NetSuite, Intacct, QuickBooks, and Xero, and we are the preferred provider of digital payments solutions for the technology arm of the American Institute of CPAs (AICPA). Winner of more than 70 awards, we are recognized as one of San Francisco Business Times’ and Silicon Valley Business Journal’s “2018 Best Places to Work.”

Mission: We move over $60 billion per year and have 10 years' worth of customer data. We are leveraging this data to make data-driven decisions and to apply data science and machine learning to a variety of tough problems. We are in the middle of a large-scale migration to the public cloud and are building data pipelines, a data warehouse, and machine learning infrastructure in AWS.

Data engineers here will be responsible for building the data pipelines and infrastructure that enable data science, data analytics, and machine learning at scale in AWS. Some of the problems we are currently working on include detecting payment fraud, extracting semantic data from customer documents, and increasing customer acquisition through advanced analytics. Data engineers will own and build the data platform that makes all of this possible. We have multiple positions available at different levels of seniority.

Professional Experience/Background to be successful in this role:

  • 5+ years of experience owning and building data pipelines
  • Extensive knowledge of data engineering tools, technologies, and approaches
  • Ability to absorb business problems and translate them into data requirements
  • Design and operation of robust distributed systems
  • Proven experience building data platforms from scratch for data consumption across a wide variety of use cases (e.g., data science, ML, scalability)
  • Demonstrated ability to build complex, scalable systems with high quality
  • Experience with multiple data technologies and concepts, such as Airflow, Kafka, Hadoop, Hive, Spark, MapReduce, SQL, NoSQL, and columnar databases
  • Experience with specific AWS technologies (such as S3, Redshift, EMR, and Kinesis) a plus
  • Experience with SQL and one or more of Python, Java, and Scala

Expected Outcomes:

  • Design and implement data infrastructure and processing workflows required to support data science, machine learning, BI and reporting in AWS
  • Build robust, efficient, and reliable data pipelines that draw on diverse data sources
  • Design and develop real-time streaming and batch processing pipeline solutions
  • Own the data expertise and data quality for the pipelines
  • Drive the collection of new data and refinement of existing data sources
  • Identify shared data needs across the organization, understand their specific requirements, and build efficient and scalable pipelines to meet them
  • Build data stores for feature variables required for machine learning

Culture:

  • Humble – No ego
  • Fun – Celebrate the moments
  • Authentic – We are who we are
  • Passionate – Love what you do
  • Dedicated – To each other and the customer