Building systems for extracting, transforming, retrieving, and analyzing data is what I love. I believe in iterating, and I always strive to push the envelope, taking a system to the next level until it's time to move on to the next problem.
Successfully architected and built:
- Real-time/batch processing system handling data at scale, roughly 2.5-3 billion events per day.
- Data layer capable of supporting different SLAs. This layer acts as deep storage for the processing platform (real-time and batch) and for other databases such as Cassandra, Druid, MonetDB, Vertica, Oracle, and MySQL.
- Analytical cube processing capable of generating millions of metrics in real time and in batch.
- Data onboarding and distribution.
- Segmentation insights platform.
- Predictive modeling framework.
So far I have engaged in:
- Algorithms and distributed system design & development.
- Batch and real-time processing systems at scale.
- Creative products around data.
- Cross-team synchronization.
- Thought leadership.
Specialties:
Distributed systems, NoSQL, Columnar Databases, Real-time data processing at scale, Software architecture, Analytics, Data Mining, Information retrieval, Big data, Engineering Management, Technical Leadership.
Senior Software Engineer (Data Architect) @ Following my passion and dream to architect and build the next-generation data analytics/processing platform. From July 2014 to Present (1 year 6 months), San Francisco Bay Area

Principal Architect @ Principal architect for Acxiom's AOS platform, which houses most of AOS data. Having previously architected Data Depot (a core component of AOS), now responsible for the overall architecture of AOS, including:
- Data onboarding
- Audience analytics and segmentation
- Data distribution
- Analytical platform and data layer
From April 2014 to July 2014 (4 months), San Francisco Bay Area

Architect & Team Lead @ Team lead and architect of Data Depot, one of the core components of the Acxiom AOS platform, housing most of AOS data. Data Depot is a real-time and batch framework processing 2.5 to 3 billion events per day.
Data Depot features real-time event processing using Storm/Cassandra, custom compact serialization with backward compatibility, and fast information retrieval in both real time and batch; it enables complex joins over Hadoop and efficient, fast processing of terabytes of data.
Tech stack:
Cassandra, Storm, Kestrel/Kafka, Hadoop, MySQL, Sarasvati (DAG-based workflow engine), Java, Hive
From October 2013 to March 2014 (6 months)

Enterprise Architect @ Architect and generalist building big data solutions such as ETL, reporting, and a cookie store for analytical processing.
Built, from the ground up, a framework (Data Depot) capable of processing terabytes of information.
Data Depot features custom compact serialization with backward compatibility, enables complex joins over Hadoop, and supports efficient, fast processing of terabytes of data.
From May 2012 to October 2013 (1 year 6 months)

Principal Lead Distributed Computing Engineer @ Principal lead distributed engineer at AOL Advertising, responsible for all the data, from ETL to building the user profiles that form the data layer for all downstream reporting and predictive modeling applications, on a 600+ node Hadoop cluster.
Single-handedly led an effort to build real-time predictive scoring with Naïve Bayes and k-means on a 40-node cluster.
Designed and built:
- Reporting platform engine that displays advertiser/campaign performance, heat maps, and segmentation lifts for better optimization and sales.
- Price-volume prediction engine that predicts the required impression volume at various bid prices by simulating the ad-server auction.
From July 2010 to May 2012 (1 year 11 months)

Sr. Software Engineer & Researcher @ Designed and built:
- Reporting platform engine that displays advertiser/campaign performance, heat maps, and segmentation lifts for better optimization and sales.
- Price-volume prediction engine that predicts the required impression volume at various bid prices by simulating the ad-server auction.
From May 2008 to July 2010 (2 years 3 months)

Software Engineer @ Developed software for categorized search engines using open-source technologies such as Nutch, Hadoop, and Lucene.
Developed a social network using the Spring MVC framework, Hibernate, and MySQL.
From June 2006 to April 2008 (1 year 11 months)

Software Engineer @ Worked as a software developer building features and applications in the telecom domain. From 2003 to 2004 (1 year)
Graduate Certificate, Data Mining @ Stanford University, From 2010 to 2012
MS, Computer Science @ The University of Texas at Dallas, From 2004 to 2006
BS, Computer Engineering @ University of Pune, From 1999 to 2003

Rohan Mehta is skilled in: MapReduce, Distributed Systems, Hive, Hadoop, Apache Pig, Analytics, Mahout, Big Data, MySQL, Data Mining, Java, Machine Learning, Algorithms, Scalability, Lucene