Big Data is more than storing all available data. Data management, data analysis, data visualisation and identifying new business value in the data are new challenges that Big Data projects have to solve. Technologies for storing the data are available, but efficient concepts for interesting Big Data analyses, such as prediction models, are not yet available for production systems. In my view, centralised IT technologies on the Internet (the Cloud) are the future of efficient and economical computing. As many projects have already shown, Cloud computing is the best available accelerator for most Big Data projects.
Through my education, my PhD and my Post-Doc position I have gained broad knowledge of acquiring and managing research projects, supervising students and publishing research results. My professional career has given me a broad overview of, and experience with, real-world scenarios running Big Data Analytics and Machine Learning. Using technologies from the Hadoop ecosystem is my daily business, R is my favorite programming language, and I switch between structured and unstructured data models daily. My preferred computing environment is Amazon Web Services.
In the future I would like to become more user- and business-oriented, and to push industry and research towards using the right computing environment for their data analyses and applications.
Senior Big Data Consultant, Professional Services @ I work as a Senior Big Data consultant in AWS Professional Services in the EMEA region, together with many AWS customers and AWS partners in Germany and Europe. My preferred Amazon Web Services are Amazon Redshift, Amazon EMR, Amazon DynamoDB, Amazon S3 and Amazon Kinesis. Based on my analytical background, I prefer to set up Big Data analysis environments on AWS using tools such as R, Python, Datameer, MapR, Tableau and Jaspersoft, and to evaluate new analysis and visualization software. From June 2014 to Present (1 year 7 months), Munich Area, Germany
Big Data Analyst & Cloud Engineer @ Consultant for development projects in the telecommunications sector with customers such as Telefonica Dynamic Insights. Responsible for big data modelling and storage concepts, and for data analysis and data quality concepts with R and AWS EMR. Also responsible for the AWS Cloud-based operation of applications and analysis infrastructure. From October 2012 to May 2014 (1 year 8 months), Munich Area, Germany
Cloud and HPC Consultant @ HPC and statistics consultant for European companies such as Gazprom Germania GmbH, especially for using R in HPC environments including the AWS EC2 Cloud. From February 2010 to October 2012 (2 years 9 months), Germany
Senior Community Manager @ My main task was delivering high-quality technical output to the community and the marketing team. I also acted as an in-house consultant, developing ideas for new features and training new team members. For new customers I served as a pre-sales engineer, getting them onto the cloud-based analysis environment as quickly and efficiently as possible. From June 2011 to December 2011 (7 months), Berlin Area, Germany
Post-Doc @ Big Data Analyst working with protein data. My research focused on analysing protein data for unstructured regions. I was also responsible for operating the lab's IT infrastructure. From January 2010 to May 2011 (1 year 5 months)
PhD-student @ Worked in the 'Department for Biometrics and Bioinformatic' on my PhD thesis "Parallel Computing for Biological Data". From August 2006 to January 2010 (3 years 6 months), Munich Area, Germany
Research Assistant @ R programmer for the Bioconductor project and PhD student working on parallel computing concepts for R and algorithms in Bioinformatics. From 2008 to 2008 (less than a year)
Dr. rer. nat., Bioinformatics @ Ludwig-Maximilians Universität München. From 2006 to 2009
Diplom, Technical Mathematics @ Technical University Munich. From 2001 to 2006
Abitur, Mathematics and Physics @ Gymnasium Weilheim. From 1991 to 2000
Markus Schmidberger is skilled in: Big Data, R, Cloud Computing, High Performance Computing, Hadoop, Machine Learning, Bioinformatics, Algorithms, Parallel Computing, Amazon Web Services (AWS), Python, Linux, Data Mining, Programming, Statistics