Senior Data Scientist @ Aruba, a Hewlett Packard Enterprise company
Data Scientist @ Niara, Inc.
Research Engineer @ Cisco
Master of Science (M.S.) @ National Tsing Hua University
I am a data scientist specializing in the analysis, management, visualization, and interpretation of large-scale data. With a background in the physical sciences, I enjoy hands-on problem solving, which often yields novel solutions to unsolved problems while making efficient use of existing resources. I also enjoy acting as the interface between quantitative and non-quantitative people in the organization.
Specialties: Hadoop, MapR, HDFS, Big Data, MapReduce, Data Analysis, Data Visualization, and Mathematical Modeling
Data Scientist @
Raw data may look dirty, confusing, and hard to deal with. We see opportunities in the raw data. And yes, lots and lots of them.
From March 2014 to Present (1 year 10 months)

Research Engineer @
My job involved applied research and fast prototyping of applications in network security (web and email), combining big data analytics and machine learning. I managed massive amounts of historical data using HDFS and its derivatives, and performed high-performance analysis with Hadoop. Machine learning methods employed include Bayesian network models, naive Bayes classifiers, graph-based data mining, graph-based clustering, topic modeling, and expectation-maximization for parameter estimation.
From February 2011 to March 2014 (3 years 2 months)

Staff Engineer @
I collaborated with system chemists, system integration, and marketing people on the analysis of massive sequencing data generated by the SOLiD sequencer. My responsibilities were primarily exploratory data analysis, data mining, and algorithm development for processing and correcting the sequencing data. Performed various non-routine analyses to discover performance metrics for a sequencer. Set the performance specifications for SOLiD instruments.
From June 2008 to February 2011 (2 years 9 months)

Postdoctoral Associate @
Designed a novel statistical potential for the prediction and assessment of protein structures, based on an analytically derived reference state that accounts for the finite and spherical shape of protein structures. It has a Wikipedia entry created by an independent user:
Created a random-packing theory to explain the observed domain size of proteins. Collaborated with various scientists on many aspects of protein structure prediction; for example, applied machine learning algorithms (SVM and logistic regression) to the fold assessment problem.
From February 2003 to June 2008 (5 years 5 months)

Research Assistant @
Applied mode-coupling theory to describe the non-equilibrium dynamics of short peptides and synthetic polymers. Developed an implicit solvent model and a model of atomistic friction for protein solvation. Performed long-time Langevin dynamics simulations with a novel implicit solvent model that directly observed the early folding events of the HP-36 domain. Invented a fast linearization algorithm for non-bonded force evaluations. Experimented with parallelizing molecular simulations using OpenMP.
From November 1998 to December 2002 (4 years 2 months)
Postdoctoral Associate, Bioinformatics @ University of California, San Francisco
From 2003 to 2008

Doctor of Philosophy (Ph.D.), Theoretical Chemistry, 4.0 @ University of Chicago
From 1998 to 2002

BS, Chemistry @ National Tsing Hua University
From 1990 to 1996

Min-Yi Shen is skilled in: Machine Learning, Data Mining, Data Analysis, MapReduce, Hadoop, MRJob, Hive, Parallel Computing, Mathematical Modeling, Bayesian statistics, Data Visualization, Network Security, Computer Security, Computational Biology, Genomics