Software Engineer 2 @ Working with the data team, which provides data services to other business teams such as mobile, search, analytics, and advertising.
• Wrote MapReduce code to join data from two data sets and build a container that downstream users consume to analyze customer behavior data. Documented the process for technical and business users of the service. (A minimal join sketch follows this entry.)
• Worked with the Hadoop admin team on the cluster upgrade as the data team's point of contact; tested the MapReduce jobs and the Pig and Hive scripts.
• Resolved bugs and worked on multiple tasks to improve data quality.
• Developed an ETL pipeline using Teradata utilities to automatically pull data from a customer database and merge it into production data on a daily basis.
• Developed an Android app prototype for occasion-based shopping.
• Developed a proof of concept using Apache Kafka, Storm, and Redis that finds the top N keywords searched on eBay in near real time; the code is currently the base for multiple other projects. (A sketch of the Redis counting core also follows this entry.)
• Pitched an idea to the CEO and received funding to implement a pilot.
From January 2014 to Present (2 years), San Francisco Bay Area
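
For illustration, a minimal sketch of the kind of reduce-side join the MapReduce bullet above describes, assuming hypothetical tab-separated inputs keyed by a shared customer ID; the actual schemas, class names, and output format are assumptions, not details from the source.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CustomerJoin {

  // Tags each profile record with "P\t" so the reducer can tell the two sources apart.
  public static class ProfileMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      String[] f = value.toString().split("\t", 2); // customerId \t rest-of-record
      if (f.length == 2) ctx.write(new Text(f[0]), new Text("P\t" + f[1]));
    }
  }

  // Tags each activity record with "A\t".
  public static class ActivityMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      String[] f = value.toString().split("\t", 2);
      if (f.length == 2) ctx.write(new Text(f[0]), new Text("A\t" + f[1]));
    }
  }

  // Buffers both sides for a customer ID and emits one joined line per pairing.
  public static class JoinReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context ctx)
        throws IOException, InterruptedException {
      List<String> profiles = new ArrayList<>();
      List<String> activities = new ArrayList<>();
      for (Text v : values) {
        String s = v.toString();
        if (s.startsWith("P\t")) profiles.add(s.substring(2));
        else activities.add(s.substring(2));
      }
      for (String p : profiles)
        for (String a : activities)
          ctx.write(key, new Text(p + "\t" + a));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "customer-join");
    job.setJarByClass(CustomerJoin.class);
    MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, ProfileMapper.class);
    MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, ActivityMapper.class);
    job.setReducerClass(JoinReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path(args[2]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}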
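
Likewise, a minimal sketch of the Redis side of the top-N keyword counter, using the Jedis client and a sorted set; the key name, host, and the wiring into the Kafka/Storm topology are assumptions. A sorted set keeps counts ordered, so reading the current top N is a single range query.

import redis.clients.jedis.Jedis;

public class TopKeywords {
  // Hypothetical key name; the source does not specify the Redis schema.
  private static final String KEY = "search:keyword:counts";

  // Called once per observed search term, e.g. from a Storm bolt's execute().
  public static void record(Jedis jedis, String keyword) {
    jedis.zincrby(KEY, 1.0, keyword.toLowerCase());
  }

  // Returns the current top-N terms, highest count first.
  public static Iterable<String> topN(Jedis jedis, int n) {
    return jedis.zrevrange(KEY, 0, n - 1);
  }

  public static void main(String[] args) {
    try (Jedis jedis = new Jedis("localhost", 6379)) {
      for (String kw : new String[] {"iphone", "lego", "iphone", "guitar", "iphone", "lego"}) {
        record(jedis, kw);
      }
      for (String kw : topN(jedis, 3)) {
        System.out.println(kw); // iphone, lego, guitar
      }
    }
  }
}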

Graduate Teaching Assistant: 95-869 Hadoop and MapReduce @
• Assisted the professor in grading assignments.
• Helped students understand the concepts discussed in class and the requirements for each assignment.
From October 2013 to December 2013 (3 months), Greater Pittsburgh Area

Software Development Intern @
• Developed code to automate the classification of webpages into different page families.
• The model follows the nearest-neighbor classification technique from data mining (a minimal sketch follows this entry).
• Completed Hadoop vendor training by Hortonworks and wrote four MapReduce jobs in Java for data processing; also used Pig for ETL processing.
• Developed a shell script to parameterize all hard-coded database names across multiple scripts to improve performance and implement best practices.
From June 2013 to August 2013 (3 months), Greater Seattle Area
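
A minimal sketch of the nearest-neighbor classifier mentioned in the entry above, assuming hypothetical numeric feature vectors for each webpage; how features and labels were actually derived is not given in the source.

import java.util.List;

public class NearestNeighborClassifier {

  // A labeled training example: numeric features plus its page-family label.
  public record Example(double[] features, String label) {}

  private final List<Example> training;

  public NearestNeighborClassifier(List<Example> training) {
    this.training = training;
  }

  // Classifies a page by copying the label of its closest training example
  // (1-NN under Euclidean distance).
  public String classify(double[] features) {
    Example best = null;
    double bestDist = Double.POSITIVE_INFINITY;
    for (Example e : training) {
      double d = squaredDistance(e.features(), features);
      if (d < bestDist) {
        bestDist = d;
        best = e;
      }
    }
    return best == null ? null : best.label();
  }

  private static double squaredDistance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  public static void main(String[] args) {
    var clf = new NearestNeighborClassifier(List.of(
        new Example(new double[] {0.9, 0.1}, "product-page"),
        new Example(new double[] {0.2, 0.8}, "search-results")));
    System.out.println(clf.classify(new double[] {0.85, 0.2})); // product-page
  }
}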

Software Engineer @
• Developed database scripts using Teradata utilities such as MultiLoad and FastLoad to extract, cleanse, and transform terabytes of customer data and populate business databases for reporting.
• Performed data analysis to check data quality for correctness and completeness.
• Part of a team performing SQL code optimization and performance tuning.
• Collaborated with the reporting team to ensure the reports met the client's evolving needs.
• Automated the backup of all database objects in Teradata using UNIX shell scripting and the Teradata BTEQ utility.
Efficiency: saved the client approximately $2,000 and 1 hour per database backup.
• Automated the replication of database objects across multiple databases, maintaining data model consistency, using UNIX shell scripting and the Teradata BTEQ utility.
Efficiency: saved the client approximately $5,000 and 2 hours per database replication.
From August 2010 to July 2012 (2 years), Mumbai Area, India

Associate Software Engineer @
• Implemented database queries to check the client's sales and orders.
• Developed and deployed graphs using the Ab Initio ETL tool to transform CRM data and provide global customer insight to the client.
• Worked in a team performing data analysis and building data models for the project.
From August 2010 to July 2011 (1 year), Mumbai Area, India

Master's degree @ Carnegie Mellon University, From 2012 to 2013

Pritish Karanjkar is skilled in: Databases, SQL, Data Analysis, Oracle, Java, ETL, Hadoop, Data Mining, Unix Shell Scripting, Software Development, Teradata, Eclipse, HTML, Windows, Microsoft Excel