Xingwei Yang is a Machine Learning Scientist on the Machine Learning Core Team at Amazon, Seattle. He is responsible not only for designing and implementing solutions but also for the product design of multiple products. He is now collaborating with the Customer Review Team within Amazon to provide next-generation review services.
Prior to that, he was a computer vision scientist and project lead at GE Global Research Center, responsible for designing and implementing end-to-end solutions for various computer vision and machine learning projects across different GE businesses.
He is an experienced and highly self-motivated scientist in machine learning, computer vision and natural language processing with excellent engineering skills. He is enthusiastic about large-scale machine learning and deep learning and has extensive project experience.
Dr. Yang received his PhD from Temple University and his B.S. from Huazhong University of Science and Technology. He has authored over 20 refereed publications, with over 700 citations in total (details: http://scholar.google.com/citations?user=QJpb480AAAAJ&hl=en).
Technical Skill Highlights: Hadoop, Apache Spark, Caffe, OpenCV, FACTORIE, Stanford NLP, scikit-learn, Theano, Java, C/C++, Python, Scala, SQL, Matlab.
Scientific Expertise: Deep Learning, Supervised Learning, Unsupervised Learning, Online Learning, Bayesian Data Analysis, Graphical Model, Representation Learning, Recommendation System.
Machine Learning Scientist @
I am working with the Customer Review Team at Amazon to provide a next-generation review service that is more reliable and surfaces product summaries to customers more efficiently. I am responsible for end-to-end algorithm development, including product feature design, data collection, algorithm design and implementation, results evaluation and final deployment.
• Large-scale text analysis with EMR (Amazon's Hadoop service) over all reviews within Amazon (Java);
• Design, implement, evaluate and deploy the New Star Rating Algorithm for all products on Amazon, using regression-based and Bayesian approaches (Java & Python);
• Theme discovery with a hierarchical LDA model (Scala);
• Design, implement and launch crowdsourcing jobs using Amazon Mechanical Turk and analyze the collected data (HTML/SQL);
• Implement, evaluate and integrate sentiment analysis capabilities into Amazon’s internal system (Java);
• Implement and evaluate a regular-expression-based system for quote selection (Java);
• Implement and evaluate word2vec-based sentiment analysis algorithms (Python);
• Implement and evaluate large-scale sentence clustering algorithms on Apache Spark over an EC2 cluster (Java & Scala);
• Implement, evaluate and deliver fraud detection algorithms (Python);
• Contributor to Amazon’s internal NLP toolkits (Java);
• Consultant for a computer vision project in Amazon Instant Video;
• Involved in product design and management for Amazon’s products;
• Regularly give technical presentations on machine learning to audiences within Amazon;
• Organizer of the Machine Learning Technical Forum and Machine Learning Office Hours within Amazon;
From March 2014 to Present (1 year 10 months) Greater Seattle Area
Computer Scientist @
During my time at GE Global Research Center, my main role was independent contributor: I contributed to multiple projects that reached pilot testing or production. I was also project lead for several projects, responsible for project management, communicating with customers and reporting to the leadership team. Four patent applications were submitted; one has been issued and the other three are still pending.
My main responsibilities were discovering business problems, designing plausible solutions, implementing them, evaluating the resulting systems and presenting them to customers.
Here are summaries of some of the projects:
• Object Recognition: Develop object recognition capabilities within GE, supported as a GE CEO Project.
My Role: Design, implement and evaluate the object recognition system using feature extraction with random forests (C/C++);
• 3D Cell Segmentation: Provide 3D cell segmentation and analysis capabilities for GE Healthcare.
My Role: Design, implement and evaluate the 3D cell segmentation system using wavelets and level sets (C/C++);
• Real-Time Operational Monitoring: Provide real-time operational monitoring capabilities for gas turbines in GE Energy.
My Role: Lead, design, implement and evaluate the real-time operational monitoring system using projective geometry and an anomaly detection model (Matlab);
• Quality Control: Provide semi-automatic defect measurement capabilities for GE Aviation.
My Role: Lead, design, implement and evaluate the interactive manufacturing quality control system (C/C++);
From June 2011 to March 2014 (2 years 10 months) Albany, New York Area
Deep Learning for Object Recognition @
In this project, I collaborated with a group of engineers and scientists to provide a service for automatic object recognition. I was responsible for algorithm development and for building up the service.
• Use Caffe to implement a deep learning system for object recognition (C/C++);
• Deploy this system on AWS as an object recognition service (SQS/SNS, EC2, RDS, S3);
From 2014 to 2014 (less than a year)
Internet File Analysis @
In this project, I independently developed a system to transform unstructured data obtained from the internet into structured data. To handle the large scale of the problem, I built the system on EC2 clusters and ran the data analysis with the Apache Spark framework.
• Use a web crawler to fetch targeted HTML files from the internet (Java);
• Transform unstructured HTML files into structured data and save it to a database (Java & MySQL);
From 2014 to 2014 (less than a year)
PhD @
Published more than 20 peer-reviewed papers in top-tier conferences and journals (IEEE PAMI, IJCV, CVPR, ICCV, ECCV, NIPS, Pattern Recognition, etc.).
Total citations exceed 700 (see Google Scholar):
http://scholar.google.com/citations?user=QJpb480AAAAJ&hl=en
From August 2006 to June 2011 (4 years 11 months)
Summer Research Intern @
The quantization of unsupervised low-level image segmentation is nearly impossible to determine. Using hierarchical information wisely can therefore improve segmentation results, both unsupervised and supervised. This is the motivation for my current work, and I am working in this direction right now.
From May 2010 to August 2010 (4 months)
Undergraduate student @
From 2002 to 2006 (4 years)
Doctor of Philosophy (PhD), Computer Science @ Temple University From 2006 to 2010
B.E., Electronics and Information Science @ Huazhong University of Science and Technology From 2002 to 2006
Xingwei Yang is skilled in: Machine Learning, Computer Vision, Pattern Recognition, Matlab, Data Mining, Image Processing, OpenCV, Artificial Intelligence, C++, Algorithms, Scala, Spark, JavaScript, Node.js, Java
Websites:
http://happyyxw.googlepages.com/xingweiyang