
Sampath Polishetty Phone: 9176035588

Email: sampath.polishetty@gmail.com

Summary:
• 9+ years of IT experience, including over 6 years of experience in the analysis, design, development, and implementation of large-scale applications in a Big Data/Hadoop environment using technologies such as Spark, MapReduce, Cassandra, Hive, Pig, Sqoop, Oozie, and HDFS.
• Good experience in Big Data related technologies such as the Hadoop framework, MapReduce, Hive, HBase, Pig, Sqoop, Spark, Kafka, Flume, Oozie, and Storm.
• Extensively used ETL, data integration, and data migration methodologies to support data extraction, transformation, and loading processing in a corporate-wide ETL solution using Ab Initio 3.1.16, Talend, and CloverETL.
• Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of the MapReduce programming paradigm.
• Worked with Apache Spark, a fast, general-purpose engine for large-scale data processing, together with the functional programming language Scala.
• Hands-on experience with the Spark framework for both batch and real-time data processing.
• Handled large datasets of structured, semi-structured, and unstructured data using Pig and Hive.
• Hands-on experience in writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie.
• Very good experience with partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
• Experience in migrating data using Sqoop between HDFS and relational database and mainframe systems, in either direction as required.
• Extended Hive and Pig core functionality with custom User Defined Functions (UDFs).
• Experience in converting SQL queries into Spark transformations using Spark RDDs, DataFrames, and Datasets in Scala, and in performing map-side joins on RDDs (see the sketch after this summary).
• Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase, Cassandra, and MongoDB.
• Very good knowledge of implementing web service layers and prototyping user interfaces for intranet and web applications and websites using HTML5, XML, CSS, CSS3, AJAX, JavaScript, jQuery, Bootstrap, AngularJS, Ext JS, JSP, and JSF.
• Expertise in database design, creation and management of schemas, and writing stored procedures, functions, and DDL/DML SQL queries.
• Extensive knowledge in using SQL queries for backend database analysis.
• Experience in database development using SQL and PL/SQL, working on databases such as Oracle 9i/10g/11g, Informix, and SQL Server.
• Worked with BI tools like Tableau for report creation and further analysis from the front end.
• Good experience working in an Agile development environment, including Scrum methodology.
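As an illustration of the SQL-to-Spark conversion work referenced above, the following is a minimal Scala sketch; the table name, columns, and query are hypothetical placeholders rather than details of any project listed below.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object SqlToSparkSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("SqlToSparkSketch")
          .enableHiveSupport()
          .getOrCreate()

        // SQL being converted (illustrative):
        //   SELECT region, SUM(amount) AS total
        //   FROM transactions WHERE status = 'SETTLED' GROUP BY region
        val transactions = spark.table("transactions") // hypothetical Hive table

        val totals = transactions
          .filter(col("status") === "SETTLED")
          .groupBy("region")
          .agg(sum("amount").alias("total"))

        totals.show()
        spark.stop()
      }
    }

The same result could also be obtained by passing the original query text to spark.sql; the DataFrame form simply makes the individual transformation steps explicit.
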
Technical Forte:
Big Data: Apache Spark, Scala, MapReduce, HDFS, HBase, Hive, Pig, Sqoop
Databases: Oracle 9i/11g, DB2, PostgreSQL, Hive, Pig, HBase, MongoDB, Cassandra
Hadoop Distributions: Cloudera, Hortonworks, AWS
DWH (Reporting): OBIEE 10.1.3.2.0/11g
DWH (ETL): Ab Initio 3.1.16, Talend, CloverETL
Languages: SQL, PL/SQL, Java
UI: HTML, CSS, JavaScript
Defect Tracking Tools: Quality Center, JIRA
Tools: SQL Tools, TOAD
Version Control: Tortoise SVN, GitHub
Operating Systems: Windows[...], Linux/Unix

Educational Qualifications
Degree: Bachelor of Technology
Institute: University, Hyderabad
Major and Specialization: Computer Science and Information Technology

Employment Summary:

Employer | From | To | Designation/Role
Virtusa India Pvt Ltd | Dec 2009 | June 2013 | Engineer/Developer
Tech Mahindra Pvt Ltd | June 2013 | March 2014 | Associate Technical Specialist
Cognizant Technology Solutions | March 2014 | September 2017 | Associate
IBM India Pvt Ltd | September 2017 | Till Date | Advisor Technical Specialist
Professional Experience:
Project 1:
Company Name: IBM India Pvt Ltd, Pune, India
Client Name: SunTrust
Role: Senior Big Data Consultant Sep 2017 to Present
Project Name: Atlas – Data Lake

The main purpose of the Atlas Data Lake project is to build a Data Lake system on the Hadoop platform to replace the existing DIME ETL platform. The Data Lake is used for data integration and data analytics across all SunTrust systems. Data is ingested into the Data Lake from various database, mainframe, and MDM source systems. Avro schemas are designed for writing the data, and the data is integrated into multiple target databases.

Responsibilities:
• Responsible for design and development of Spark SQL scripts based on functional specifications.
• Responsible for Spark Streaming configuration based on the type of input source.
• Wrote MapReduce jobs to parse web logs stored in HDFS.
• Developed services to run the MapReduce jobs on an as-needed basis.
• Imported and exported data into HDFS and Hive using Sqoop.
• Responsible for managing data coming from different sources.
• Responsible for loading data from UNIX file systems into HDFS. Installed and configured Hive and wrote Hive UDFs.
• Experienced with AWS services for smoothly managing applications in the cloud and creating or modifying instances.
• Collected JSON data from an HTTP source and developed Spark APIs to perform inserts and updates in Hive tables (see the sketch after this list).
• Involved in performance tuning and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
• Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
• Implemented workflows using the Apache Oozie framework to automate tasks.
• Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
• Developed design documents considering all possible approaches and identifying the best of them.
• Explored Spark for improving the performance and optimization of existing algorithms in Hadoop.
• Experienced with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
• Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
• Involved in gathering the requirements, designing, development and testing.
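The JSON-to-Hive ingestion pattern referenced in the list above could look roughly like the following Scala sketch; the landing path, database, table, and column names are assumptions for illustration, not details taken from the Atlas project.

    import org.apache.spark.sql.{SaveMode, SparkSession}

    object JsonToHiveIngest {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("JsonToHiveIngest")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical HDFS landing path for JSON collected from the HTTP source
        val rawJson = spark.read.json("hdfs:///data/landing/events/*.json")

        // Append the parsed records into a Hive table
        // (database, table, and columns are illustrative)
        rawJson
          .select("event_id", "payload", "event_ts")
          .write
          .mode(SaveMode.Append)
          .saveAsTable("datalake.events")

        spark.stop()
      }
    }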

Environment: Spark, Hive, Flume, Java, Maven, Impala, Oozie, Oracle, YARN, GitHub, JUnit, Tableau, Unix, Cloudera, Sqoop, HDFS, Tomcat, Scala, HBase, Ab Initio.
Project 2:
Company Name: Cognizant India Pvt Ltd, Chennai, India
Client Name: Visa, Singapore
Role: Big Data Developer / Ab Initio ETL Developer    Mar 2014 to Sep 2017
Project Name: VROL

Visa Resolve Online (VROL) is a web-based Dispute Management System that supports
customer service representatives and back-office analysts responsible for managing and resolving
disputed transactions through the dispute life cycle. It is used to send transaction inquiries, send financial exception processing messages for both BASE II and SMS, request copies, report fraud, and manage exception files. In addition, it is a mandated service used to exchange electronic
dispute documentation and information, RFC fulfilments, pre-filings, and case filings between Members
and Visa during the dispute cycle. VROL is accessible through Visa Online (VOL) via a standard user
interface (UI) or a custom system interface. Members can use dispute resolution questionnaires to
capture information in place of a cardholder or merchant letter. The questionnaires lead the user to ask
the right questions to support the dispute in an intuitive way, making resolution faster and easier,
resulting in a dispute process that is clear to all parties. Questionnaires are valid on Visa, Interlink, and
Plus networks; representments and pre-filings can also be done for MasterCard and Cirrus networks.

Responsibilities:
• Configured different topologies for the Spark cluster and deployed them on a regular basis.
• Loaded and transformed large sets of structured, semi-structured, and unstructured data.
• Imported and exported data into HDFS and Hive using Sqoop.
• Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
• Developed scripts to perform business transformations on the data using Hive and Pig.
• Developed Spark scripts by using Scala shell commands as per the requirement.
• Created and updated Hive tables for offline data storage.
• Loaded offline data from the Hive database and joined it with the transformed RDD to generate the required dataset.
• Replaced and complemented existing Spark batch jobs with Spark Streaming jobs to enable near real-time data analysis (see the sketch after this list).
• Optimized existing algorithms using SparkContext, Spark SQL, and DataFrames.
• Supported and Monitored Map Reduce Programs running on the cluster.
• Monitored logs and responded accordingly to any warning or failure conditions.
• Requirement analysis, design document preparation, and development of ETL jobs using CloverETL and Ab Initio.
• Expertise in fixing production support issues related to multi-provider mapping, data
inconsistency, and job failures.
• Developed scripts for building the CloverETL environment.
• Converted complex Ab Initio plans and graphs into CloverETL job flows and graphs.
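As a rough illustration of moving a Spark batch computation to a Spark Streaming micro-batch job, as mentioned in the list above, the following Scala sketch uses the DStream API; the source directory, batch interval, and word-count style transformation are assumptions, not details of the actual VROL jobs.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object NearRealTimeJob {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("NearRealTimeJob")

        // 30-second micro-batches; the real interval and source are not stated here
        val ssc = new StreamingContext(conf, Seconds(30))

        // Hypothetical HDFS landing directory the former batch job used to scan
        val lines = ssc.textFileStream("hdfs:///data/incoming/")

        // The batch job's transformation, now applied per micro-batch
        val counts = lines
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1L))
          .reduceByKey(_ + _)

        counts.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }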

Environment: Spark, Spark SQL, Spark Streaming, Scala, JavaScript, Cassandra, MongoDB, Linux/Unix, CDH5, Ab Initio, CloverETL, DB2, ESP Job Scheduler, Sqoop, Mainframes, Perl.
Project 3:
Company Name: Tech Mahindra Pvt Ltd, Bangalore, India
Client Name: Vodafone, Qatar
Role: Ab Initio/Big Data Developer    June 2013 to March 2014
Project Name: VFQ Data Migrations.
Migration of customer and billing data for all Fixed Line, Prepaid, and Postpaid (Mobile) business customers. Business data needs to be migrated from Siebel CRM 8.0 to the new-stack Siebel CRM 8.1.
The project mainly focuses on the mappings, validations, and accurate data transformation on the new stack, since the data differs in product hierarchy and physical structure and the Mobile and Fixed Line stacks are merged. The project has two main parts: MC (Migration Controller), which covers customer account migration from the source to the target stack, and DAN (Data Analysis), which selects eligible accounts.

Responsibilities:
• Requirement analysis and Design Document preparation.
• Involved in data migration audit, reconciliation, and exception reporting.
• Involved in implementing the data migration strategy.
• Environment setup, sandbox creation, and migration.
• Migration of graphs, scripts, DMLs, and other sandbox objects.
• Field mapping design.
• Involved in Designing and Implementing Data Profiling rules for Data Migration.
• Involved in Designing and Implementing Data Quality rules for Data Cleansing.
• Established data cleansing, mapping and enrichment.
• Developed graphs using Ab Initio according to the HLD.

Environment: HP Unix, Windows NT, Ab Initio 3.1.16, Oracle 9i database, Toad, PL/SQL Developer

Project 4:
Company Name: Virtusa Pvt Ltd, Chennai, India
Client: British Telecom
Role: Ab Initio Developer / UNIX Shell Scripting Developer    August 2011 to June 2013
Project: UKBSS & Migration

The project deals with the conversion of a tactical loading process called OST into a strategic BAU load that loads the Wholesale and Global Services customers' contact details into a Siebel system called "One Siebel". The new loading process will contain the functionality of SIM and should also accommodate the OST functionality; thus, the existing OST functionality needs to be switched off and implemented in UKBSS.

Responsibilities:

• Requirement analysis and design document preparation.


• Environmental setup
• Sandbox creation and migration
• Migration of graphs, scripts, DMLs, and other sandbox objects.
• Participated in cost estimation.
• Executed UKB BAU test loads and fixed any bugs encountered.
• Analyzed the HLD and prepared the LLD.
• Developed graphs using Ab Initio according to the HLD.
• Change-requirement-driven development.
• Created and devised the CIT approach.
• Updated the client in daily status calls.

Environment: HP UNIX, Windows NT, Ab Initio 3.1.16, Oracle 9i database, Toad, PL/SQL Developer

Project 5:
Company Name: Virtusa Pvt Ltd, Chennai, India
Client: British Telecom
Role: Ab Initio Developer    December 2009 to July 2011
Project: CMF SIM

The project deals with the contact information of BT's retail and small business customers. It is a warehouse repository developed for loading old-stack legacy system data into a strategic new-stack Siebel system called "One view" after multi-mastering it. The data is unloaded from various source systems such as CSS, COSMOSS, CISP, SWIFT, SMERF, CCAT, and CRS, each of which contains part of the customers' contact details. This data is extracted to a common location called the DAA (data acquisition area), taken as a single source, multi-mastered with different sets of rules built by SIM, and loaded into SIM entity tables. The data then undergoes a true-diff process against the existing Siebel data to derive insert, update, and delete records, which are finally loaded into Siebel. Along with these BAU loads, a set of extract feeds is occasionally generated for downstream systems.

Responsibilities:

• Analyzed the HLD and developed graphs using Ab Initio according to it.


• Involved in Change requirement driven development
• Involved in creating detailed data flows with source and target mappings and converting data requirements into low-level design templates.
• Developed highly generic graphs to serve the instant requests from the business.
• Developed complicated graphs using various Ab Initio components such as Join, Rollup, Lookup, Partition by Key, Round Robin, Gather, Merge, Dedup Sorted, Scan, and Validate.
• Worked with Infrastructure team to write some custom shell scripts to serve the daily needs like
collecting logs and cleaning up data.
• Migrated scripts from DEV to SIT and UAT environment to test & validate data.
• Gathered knowledge of existing operational sources for future enhancements and performance optimization of graphs.
• Extensively involved in Ab Initio Graph Design, development and Performance tuning.
• Used sandbox parameters to check graphs in and out of repository systems.
Environment: HP Unix, Windows NT, Ab Initio 3.1.16, Oracle 9i database, Toad, PL/SQL Developer
