Data Engineers work with data storage, access, and transformation. Common tools include Informatica and Pentaho Kettle for data warehousing, and Apache Spark, Hive, and Sqoop for big data lake management. Data Engineers work closely with Data Scientists and Business Analysts to support insight generation from the data.
Role Levels
Level 1
Level 2
Level 3
Junior
Experience Level: HKD 12k-18k per month
SKILLS NEEDED TO QUALIFY FOR ROLE
Big Data Stores and Pipelines
Rudimentary knowledge of data processing techniques, but missing either strong programming ability or enterprise solution knowledge.
When given a set of data to process, may be able to handle it in a tool of their choice (see the sketch below).
Programmatically fluent individuals tend not to appreciate enterprise DWH / Hadoop / pipeline solutions, while enterprise-solution specialists tend not to be fluent in programmatic approaches.
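To make this concrete, here is a minimal sketch of processing a given dataset, assuming pandas as the tool of choice; the file name and column names (orders.csv, customer_id, amount, date) are invented for illustration.

```python
# Hypothetical example: aggregate monthly spend per customer from a raw CSV.
# The input file and its columns (customer_id, amount, date) are assumptions.
import pandas as pd

# Extract: read the raw data, parsing the date column.
orders = pd.read_csv("orders.csv", parse_dates=["date"])

# Transform: drop incomplete rows and aggregate spend per customer per month.
clean = orders.dropna(subset=["customer_id", "amount"])
monthly = (
    clean.assign(month=clean["date"].dt.to_period("M"))
         .groupby(["customer_id", "month"], as_index=False)["amount"]
         .sum()
)

# Load: write the result for downstream consumers.
monthly.to_csv("monthly_spend.csv", index=False)
```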
Web Analytics
Reads news in social channels and periodically posts content there.
Knows how to add tracking (such as Google Analytics) to a web page and then read its reports.
Modern Scripting and Command Line
Can SSH into remote servers.
Basic knowledge of Unix commands: ls, sort, uniq, join.
Can execute other programs from the command line (a sketch of driving these tools from Python follows below).
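As an illustration of executing other programs from a script, the following Python sketch reproduces the shell pipeline `sort data.txt | uniq -c`; the input file name and the SSH host are placeholders.

```python
# Run standard Unix tools from Python, equivalent to: sort data.txt | uniq -c
import subprocess

# Sort the file, then feed the sorted output to uniq to count duplicate lines.
sorted_out = subprocess.run(
    ["sort", "data.txt"], capture_output=True, text=True, check=True
)
counted = subprocess.run(
    ["uniq", "-c"], input=sorted_out.stdout,
    capture_output=True, text=True, check=True
)
print(counted.stdout)

# Remote servers can be reached the same way, e.g.:
# subprocess.run(["ssh", "user@example-host", "df -h"], check=True)
```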
Machine Learning
Uses standard Python or R libraries for machine learning, including scikit-learn, pandas, and matplotlib (see the sketch below).
Basic Hadoop understanding and interaction: can list files, upload and download them, and knows the different storage formats.
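The following is a minimal sketch of the workflow implied here, using scikit-learn's bundled iris dataset together with pandas and matplotlib; the dataset and model choice are arbitrary examples, not a prescribed stack.

```python
# Load a small dataset, fit a simple classifier, and plot two features.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris(as_frame=True)   # bundled dataset, returned as a DataFrame
df = iris.frame

X_train, X_test, y_train, y_test = train_test_split(
    df[iris.feature_names], df["target"], test_size=0.3, random_state=0
)

model = LogisticRegression(max_iter=200).fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))

# Quick visual check of two features, coloured by class label.
df.plot.scatter(x="sepal length (cm)", y="petal length (cm)",
                c="target", colormap="viridis")
plt.savefig("iris_scatter.png")
```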
Databases and Queues
Has used both SQL and NoSQL databases in practice and is able to design well-structured relational and flat schemas (a minimal relational example follows below).
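For the relational side, here is a minimal sketch using Python's built-in sqlite3 module; the schema (customers and orders tables) is purely illustrative.

```python
# Create a small relational schema, insert sample rows, and run a join query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL,
        placed_at   TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Acme Ltd')")
conn.execute(
    "INSERT INTO orders (customer_id, amount, placed_at) VALUES (1, 250.0, '2024-01-15')"
)

# Total spend per customer via a join.
for name, total in conn.execute(
    "SELECT c.name, SUM(o.amount) FROM customers c "
    "JOIN orders o USING (customer_id) GROUP BY c.name"
):
    print(name, total)
```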
Dashboards and Visualization
Able to present results in Excel with charts, using the default color scheme (see the sketch below).
Can read information from multiple dashboards.
Understands simple dashboard setup with QlikView, Tableau, Spotfire, or similar tools.
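As one concrete illustration of charting results for Excel, here is a short sketch assuming the openpyxl library; the workbook contents and file name are invented, and BI tools such as QlikView or Tableau are configured through their own interfaces rather than code.

```python
# Build a small Excel report with a bar chart (openpyxl assumed installed).
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active
for row in [("Region", "Sales"), ("North", 120), ("South", 90), ("East", 150)]:
    ws.append(row)

chart = BarChart()
chart.title = "Sales by region"
data = Reference(ws, min_col=2, min_row=1, max_row=4)        # values, incl. header
categories = Reference(ws, min_col=1, min_row=2, max_row=4)  # region names
chart.add_data(data, titles_from_data=True)
chart.set_categories(categories)
ws.add_chart(chart, "D2")

wb.save("sales_report.xlsx")
```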
Assessments
The following assessments award a Data Engineer badge: