Mehul was a principal scientist at HP (2004-2011), where his work spanned large-scale data management, distributed systems, and energy-efficient computing. Most recently, he worked with HP's new Cloud Services unit to develop a geographically distributed large object store. Mehul currently serves on the Sort Benchmark committee in the database community. Before joining HP, he completed his doctoral thesis (U.C. Berkeley, 2004), which brought robustness and scale to the TelegraphCQ data-stream processing system. He has also done development work on the IBM DB2/UDB database. His master's thesis was ReferralWeb, an engine to automatically discover and extract social networks, co-developed at AT&T Labs and MIT (1997). He received his MEng in 1997 and his BS in Computer Science and Physics in 1996, both from MIT.
Specialties: Large-scale data management, distributed systems, and energy-efficient computing
Co-founder / CEO @ Businesses are awash with data of all kinds. Our mission is to move that data to where it is most valuable to them. Amiato is a real-time integration service in the cloud. It automatically bridges the gap between unstructured data and the world of structured business intelligence (BI) tools. Run reports on MongoDB or Couchbase, use Tableau for ad-hoc analysis of event logs, and get visibility across disparate silos.
We've got a fantastic team, and we're growing fast. Contact me if you'd like to know more.
From September 2011 to Present (4 years 2 months) Palo Alto, CA
Principal Research Scientist @ My work at HP spanned large-scale data management, distributed computing, and energy efficiency. Below I list my most significant projects in reverse chronological order.
Armonia: Principal investigator for the Armonia project, a scalable, distributed, main-memory data management platform that offers strongly consistent low-latency operations and complex on-the-fly analytics. Applications include financial trading and social networking.
HP-KVS: Built a highly available, low-cost key-value service for the cloud. HP-KVS is an eventually consistent, erasure-coded, large object store that spans multiple geographies.
Sinfonia: A highly scalable, distributed, memory-based transactional store for building data-center infrastructure applications. Built a large-scale distributed B-tree, clustered file-system, and group communication using Sinfonia. Basis for the Armonia project. Won best paper, SOSP 2007.
Energy-efficient systems: Characterized and optimized the energy use of computer systems as a whole, from storage to memory to compute. Inventor and maintainer of the JouleSort benchmark, the first holistic energy-efficiency benchmark, which has inspired efficient server designs and influenced other benchmarks. Investigated the energy efficiency of DB workloads.
Other work: Designed software and hardware for non-volatile RAM technologies like NAND flash and memristors. Developed methods for long-term preservation of digital information.
From September 2004 to September 2011 (7 years 1 month) Palo Alto, CA
Graduate Student @ Thesis: “Flux: A Mechanism for Building Highly-Available, Fault-Tolerant, Scalable Dataflows”: In the TelegraphCQ system, my dissertation focused on making parallel CQ dataflows (computations that analyze high-throughput streaming data in real time) highly available, fault-tolerant, and automatically load-balancing.
Continuously Adaptive Continuous Queries (CACQ): Developed an adaptive query processing system that executes numerous long-running queries simultaneously over streaming data.
AMDB: A debugger and profiler for search indexes on non-traditional data types like audio and images. Designed a UI for navigating high-fanout search trees. (Released as open source.)
From 1997 to 2004 (7 years) Berkeley, CA
Research Intern @ Investigated alternative strategies for implementing collection types in IBM DB2/UDB. Designed language extensions for querying collection types. Gained experience with administration and software development in DB2/UDB.
From January 1999 to October 1999 (10 months) Almaden
Intern @ MEng Thesis (jointly done at MIT): "ReferralWeb: A Resource Location System Guided by Personal Relations." The first system to automatically discover and extract social networks by mining publicly available data on the web. ReferralWeb also automatically finds experts on user-specified topics and recommends paths in the extracted social graph to connect users with those experts.
From June 1996 to January 1997 (8 months) Murray Hill, NJ
Intern @ Built a prototype of a content-based image search and retrieval system.
From June 1995 to August 1995 (3 months) Murray Hill, NJ
Intern @ Built tools to integrate simulated navigation systems with context-relevant websites.
From June 1994 to August 1994 (3 months) Holmdel, NJ
PhD, Computer Science (Databases) @ University of California, Berkeley From 1997 to 2004
MEng, Electrical Engineering and Computer Science @ Massachusetts Institute of Technology From 1996 to 1997
BS, Physics, Computer Science @ Massachusetts Institute of Technology From 1992 to 1996
Montgomery Blair High School
Mehul Shah is skilled in: C, C++, Java, Python, Perl, PostgreSQL, DB2, Distributed Systems, Data Management
Websites:
http://www.mshah.org/
http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/s/Shah:Mehul_A=.html