
Data Scientist: Key Points of Responsibility

Skills and Qualifications:

 Experience working with Python or R. Java or C# experience is
a plus.
 Working knowledge of Jupyter notebooks, Spyder and Google
Colab.
 Sound knowledge of statistics and linear algebra, and of how they
provide the basis for most machine learning algorithms.
 Understanding of data pre-processing techniques such as
feature extraction and dimensionality reduction.
 Solid understanding of machine learning techniques and
algorithms, such as k-nearest neighbours, Naive Bayes, k-means
clustering and SVM.
 Experience with common data science libraries such as NumPy,
scikit-learn, Matplotlib and pandas.
 Good understanding of relational databases like MySQL, as well
as NoSQL (big data) systems like MongoDB and HBase
(preferred).
 Understanding of ETL pipelines and protocols. Data engineering
skills are a plus.
 Good understanding of artificial neural networks such as
autoencoders, convolutional neural networks (CNNs) and long
short-term memory (LSTM) networks.
 Some experience working with the Keras and TensorFlow APIs.
 Good analytical and programming skills.
 The candidate must have a Bachelor's degree in Computer
Science or a related discipline, and preferably a Master's
degree in Computer Science, Data Science, Artificial
Intelligence or a related field.
 Minimum of 1 year of relevant experience in software
development or a database-related domain. Data analytics or
business intelligence experience is a plus.
 Relevant certifications in any of the tools mentioned above would
be a plus.

Key Responsibilities

 In-depth knowledge of SQL and other database solutions
 Data Engineers need to understand database management, so
in-depth knowledge of SQL is required. Familiarity with other
database solutions, such as MongoDB, Cassandra or Bigtable, is
also valuable.

 Data warehouse architecture and ETL tools
 Data warehousing and ETL experience is essential to this
position. Experience with data warehousing solutions like Redshift
or Panoply, as well as familiarity with ETL tools such as StitchData
or Segment, is hugely valuable. Experience with data storage and
retrieval is equally vital, given the very large volumes of data
involved.
 Hadoop-based analytics (HBase, Hive, MapReduce, etc.)
 A strong understanding of Apache Hadoop-based analytics is a
very common requirement in this space, and knowledge of
HBase, Hive and MapReduce is often expected.
 Coding
 Expertise in Python, C/C++, Java, Perl, Golang, or other such
languages is required.
 Machine learning
 While this is mainly the focus of the Data Scientist, some
understanding of how to act upon the data is also invaluable for
Data Engineers. For this reason, some knowledge of statistical
analysis and the basics of data modeling is hugely valuable.
 While machine learning is technically the domain of the Data
Scientist, knowledge in this area helps you construct solutions
that your colleagues can use. This knowledge has the added
benefit of making you extremely marketable in this space, as
being able to "put on both hats" makes you a formidable asset.

Preferred Qualifications:

 1+ years of experience in industry as a Data Scientist, Machine
Learning Engineer, in Business Intelligence, or in a related field.
 Experience with both software engineering and machine
learning.
 Adept with the basics of Python.
 Fluency in most of the following topics: probability and
statistics; Python; manipulation of large data sets; data
visualization techniques; algorithms; machine learning.
 Teaching Data Basics, Sampling, Study Design, Exploratory
Data Analysis, Descriptive Statistics, Statistical Inference,
Hypothesis Testing, Supervised & Unsupervised Machine
Learning, Association Rule Mining, Principal Component
Analysis, Predictive Modelling, Univariate/Multivariate
Regression, Decision Trees, Random Forests, XGBoost,
Clustering (K-means, K-modes, DBSCAN), etc.
 Hands-on experience building production models with Python,
Jupyter, Keras, Matplotlib, NumPy, pandas, Plotly, PyTorch,
scikit-learn, SciPy, Seaborn, TensorFlow and XGBoost.

Skills:

 Knowledge of handling databases is required.
 At least 1 year of hands-on Python programming and ...
 Must have good hands-on experience with: ...
 BS/MS in Computer Science / Data Science or ...
 Demonstrated commitment to learning about AI.
 Experience building AI models in platforms such ...
 Must be hands-on with Linux OS.
 Strong foundation in a statistical platform such ...
 Working experience on Hadoop is a big plus.
 Demonstrated proficiency in multiple programming