Master of Science (M.S.) @
Indiana University Bloomington
Chief Data Scientist @ Heading the data science efforts at Banjo (Ban.jo), where I will grow a data science team and tackle incredibly challenging problems in the social network arena. I will be building models to detect events as they happen across the world in real time as well as predicting their topics. Furthermore, I will be building models that will be able to predict the impact of news stories before they are news.
Other equally exciting problems will also be part of my work: recommendation engines, personalization, social graph exploration, and models to explain the likes and dislikes of populations down to the individual's level. The datasets available will include massive (big data) collections of images, video, text, social graphs and social paths which will enable the use of topic modeling, image, video, and sound analysis, graph algorithms, population simulations and a variety of feature engineering techniques on historical data and user patterns.
I believe that these models, allied with Banjo's platform, have the power to change a diverse set of industries, including news, advertising, finance, and sports, to name a few. From May 2014 to Present (1 year 8 months)
Sr. Data Scientist III @ Designing overall data mining approaches to improve outcomes through model definitions, target specifications and evaluation metrics.
Providing data mining expertise in various areas to increase efficiency and effectiveness of modeling efforts, such as the examples below.
Feature extraction from a rich set of data including images, free-form text, personal profiles, and online behavior, using techniques such as restricted Boltzmann machines, deep belief networks, clustering, and dimensionality reduction.
Developing, in close collaboration with engineers, parallel implementations of the particle swarm optimization algorithm that will allow for fast feature selection and the creation of predictive models that can optimize non-differentiable loss functions. This implementation will allow the mimicking of various machine learning algorithms, such as logistic regression, neural networks, and RBF networks, as well as ensembling methods like stacking, grading, voting, boosting, bagging, gated ensembles, greedy ensembles and cascading. From February 2014 to May 2014 (4 months) San Francisco Bay Area
Scientist @ This position involved performing data science on a variety of healthcare-related problems. I was part of a small team that was responsible for creating products that would boost the overall value of the company to its clients. Work was done independently, from project design, discovery phase, specification and requirements, research, data acquisition (databases), feature engineering and selection, and model training through to final deliverables (report, specifications, code, SQL, description, model) and knowledge transfer to engineers.
I have worked on projects for predicting: hospital readmissions, future health insurance costs at the individual level, health metrics, imputation of missing data, visualization of high-dimensional space, injury detection through tracking jumping mechanics, sports (soccer) strategy effectiveness evaluation, transfers to the pediatric intensive care unit, and others. From February 2012 to February 2014 (2 years 1 month)
Doctor of Philosophy (PhD) (currently being completed), Computational Biology @ Yale University From 2007 to 2012
Master of Science (M.S.), Bioinformatics @ Indiana University Bloomington From 2005 to 2007
Bachelor of Science (BS), Computer Science @ University of Notre Dame From 2001 to 2004
Pedro Alves is skilled in: Machine Learning, Data Mining, Bioinformatics, Computational Biology, R, Algorithms, Big Data, Programming, Matlab, Artificial Intelligence, Genomics, SQL, ChIP-seq, Molecular Biology, Genetics