Responsibilities include, but are not limited to:
- Design, develop, and program methods, processes, and systems to consolidate and analyze unstructured, diverse “big data” sources to generate actionable insights and solutions for client services and product enhancement.
- Interact with product and service teams to identify questions and issues for data analysis and experiments.
- Develop and code software programs, algorithms, and automated processes to cleanse, integrate, and evaluate large datasets from multiple disparate sources.
- Identify meaningful insights from large data and metadata sources.
- Interpret and communicate insights and findings from analysis and experiments to product, service, and business managers.
Minimum Qualifications:
- Strong desire to grow a career as a Data Scientist in highly automated industrial manufacturing.
- Experience in statistical modeling, feature extraction and analysis, and supervised, unsupervised, and semi-supervised learning.
- Exposure to the semiconductor industry is a plus but not a requirement.
- Strong verbal and written communication skills.
Minimum Experience & Skills:
- Ability to extract data from different databases via SQL and other query languages and apply data cleansing, outlier identification, and missing data techniques.
- Strong software development skills.
- Experience with or desire to learn:
  - Machine learning and other advanced analytical methods.
  - Fluency in Python and/or R.
  - PySpark and/or SparkR and/or sparklyr.
  - Hadoop (Hive, Spark, HBase).
  - Teradata and/or other SQL databases.
  - TensorFlow and/or other statistical software, including scripting capability for automating analyses.
  - SSIS, ETL.
  - JavaScript, AngularJS 2.0, Tableau.
- Experience working with time-series data, images, semi-supervised learning, and data with frequently changing distributions is a plus.
- Experience working with Manufacturing Execution Systems (MES) is a plus.
- Existing papers from CVPR, NIPS, ICML, KDD, and other key conferences are a plus, but this is not a research position.