Broadly, I have spent the last 14 years developing, testing and deploying a wide range of novel statistical techniques to solve complex modelling problems involving very large datasets.
Specialist in:
● Machine learning methodologies: neural networks, support vector machines/regression, logistic regression, Bayesian networks, random forests, fast exact/approximate nearest-neighbour search, clustering methods, feature selection, the over-fitting problem (a priori residual variance estimates, cross-validation, Monte Carlo methods, White's Reality Check) and the multiple-testing problem (Bonferroni, False Discovery Rate; see the examples after this list).
● Optimisation theory and implementation: quasi-Newton methods, simulated annealing, genetic algorithms, tabu search (see the sketch after this list).
● Time series modelling: stochastic processes, chaotic time series, transfer functions, neural networks, LLR, input selection.
● Data preparation: denoising, feature extraction, unbiased estimators, data cleansing, dimension reduction, exploratory data analysis (EDA), sampling and stratified sampling.
● 14 years' experience programming in R and C++; also proficient in C#, Python and Java.
● Parallel algorithms for analysing datasets too large to fit in memory (RAM), i.e. Parallel External Memory Algorithms (PEMA); a minimal sketch follows this list.
● Modern data architectures: Hadoop, YARN, MapReduce, Spark, Hive, Pig.
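As a quick illustration of the multiple-testing corrections named above, base R's p.adjust covers both; the p-values below are invented purely for the example:

    # Hypothetical p-values from screening six candidate effects
    p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060)
    p.adjust(p, method = "bonferroni")  # family-wise error rate control
    p.adjust(p, method = "BH")          # Benjamini-Hochberg false discovery rate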
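Similarly, quasi-Newton and simulated-annealing optimisation are both available through base R's optim; a minimal sketch on a standard test function:

    # Rosenbrock banana function, a classic optimisation test problem
    rosenbrock <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
    optim(c(-1.2, 1), rosenbrock, method = "BFGS")$par  # quasi-Newton
    optim(c(-1.2, 1), rosenbrock, method = "SANN")$par  # simulated annealing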
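The external-memory idea behind PEMA can be shown in a single-process base-R sketch: stream a file too large for RAM in fixed-size chunks and accumulate sufficient statistics. The file name "big.csv", the column x and the chunk size are placeholders; a real PEMA additionally distributes the chunks across cores or nodes:

    con <- file("big.csv", open = "r")
    hdr <- strsplit(readLines(con, n = 1), ",")[[1]]   # header row
    s <- 0; n <- 0
    repeat {
      chunk <- tryCatch(
        read.csv(con, header = FALSE, col.names = hdr, nrows = 1e4),
        error = function(e) NULL)                      # connection exhausted
      if (is.null(chunk)) break
      s <- s + sum(chunk$x)                            # per-chunk statistic
      n <- n + nrow(chunk)
    }
    close(con)
    s / n                                              # exact mean, chunk by chunk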
Technology Solutions Professional - EMEA @
I am the technical/data science lead for client engagements involving the advanced analytics software portfolio and associated partner integration. More specifically:
● Provide solutions to clients moving from legacy statistical software to R(evolution), or building new analytical capability in R(evolution), via proof-of-concept sprints and longer-term engagements.
● Create end-to-end predictive modelling solutions, e.g. operationalise R models using a combination of Azure Data Factory, Azure Stream Analytics/HDInsight Storm, Azure Machine Learning and Power BI.
● Help clients integrate and exploit R(evolution) on modern data architectures, e.g. Hadoop, YARN, Spark, Hive, HBase.
● Develop .NET- and Java-based solutions for clients to expose R scripts as a service via DeployR, an integration technology that turns R scripts into analytics web services executed on a secure server, so R code can be called from web, desktop, mobile and dashboard applications as well as backend systems (see the sketch following this role).
● Develop R packages to complement the R(evolution) product offering and bespoke packages for customers requiring specific analytical capability.
● Design and lead training courses on: Open Source R, Big Data Analytics with ScaleR, Data Science, R for SAS Users, Using R in Hadoop, Parallel R, DeployR.
From September 2014 to Present (1 year 2 months), London, United Kingdom
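To make the DeployR point above concrete: a repository script becomes a plain REST endpoint. A hedged sketch of invoking one from R with httr follows; the host, port, endpoint path and parameter names are from memory and vary by DeployR version, so treat them as placeholders:

    library(httr)
    resp <- POST(
      "http://deployr-host:7400/deployr/r/repository/script/execute",
      body = list(filename = "scoreModel",   # hypothetical script name
                  author   = "testuser",     # repository owner
                  inputs   = '{"x": 42}',    # JSON inputs to the script
                  format   = "json"),
      encode = "form")
    content(resp)  # parsed JSON containing the script's outputs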
Risk Management Consultant @
Delivered a holistic view of financial institutions' global trading operations and controls, enabling detection of unusual and unauthorised trading at the earliest possible point.
From May 2014 to September 2014 (5 months), Guildford, United Kingdom
Senior Analyst @
Worked in a front-office trading environment designing and implementing advanced computational statistics to solve a number of problems pertinent to quantitative trading:
● Systematic strategies (trend following, short-term pattern recognition, trend-reversal prediction, multi-factor models) across FX, stock indices, fixed income, interest rates, energies and commodities.
● Volatility- and liquidity-based position-sizing algorithms.
● Algorithmic/electronic execution: implemented execution 'algos' informed by market microstructure, with tactics to minimise slippage.
● Risk management systems: robust VaR (sketched below), consistency/stability metrics, benchmarking (smart beta), stress testing, market impact, execution risk, event risk, trading-limit setting, slippage.
● Operational systems: real-time trade reconciliation, CFTC-compliant trade allocation, an electronic trading interface with multi-broker support, order management systems, a back-testing framework.
● Co-authored the S&P Systematic Global Macro Index (SGMI), a CTA-replicating smart-beta model created with S&P Indices.
● Portfolio optimisation: risk parity under both homogeneous and heterogeneous correlation assumptions, most-diverse, 1/n and minimum-variance portfolios (worked example below).
From July 2009 to April 2014 (4 years 10 months), London, United Kingdom
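As a sketch of the simplest (historical) flavour of the VaR work referenced above, using simulated stand-in returns rather than a real P&L series:

    set.seed(1)
    r <- rnorm(1000, mean = 0, sd = 0.01)  # stand-in daily returns
    VaR99 <- -quantile(r, probs = 0.01)    # 1-day 99% historical VaR
    VaR99                                  # loss exceeded on roughly 1% of days

The robust variants mentioned replace this raw empirical quantile with estimators less sensitive to outliers and sample noise.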
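The minimum-variance case of the portfolio-optimisation bullet has a closed form, w = S^{-1}1 / (1'S^{-1}1); a worked example with an illustrative 3-asset covariance matrix:

    S <- matrix(c(0.04, 0.01, 0.00,
                  0.01, 0.09, 0.02,
                  0.00, 0.02, 0.16), nrow = 3)  # illustrative covariances
    w <- solve(S, rep(1, 3))                    # S^{-1} 1
    w / sum(w)                                  # fully-invested min-variance weights

The 1/n, most-diverse and risk-parity schemes change the objective but consume the same covariance input.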
Senior Research Analyst @
● Created the software for real-time intraday and end-of-day quantitative analysis of futures and FX markets, which was used by leading hedge funds and investment banks.
● Developed the mathematical framework for intra-day price signal generation that is utilised by institutional clients.
● Interacted with clients to produce bespoke statistical research.
From September 2006 to June 2009 (2 years 10 months)
Research Assistant @
● Worked on the development of AI techniques to forecast UK house prices.
● Researched metaheuristics (genetic algorithms, tabu search and simulated annealing) to solve the container-ship stowage problem.
● Gave undergraduate seminars/labs and supervised several students through their coursework.
From October 2002 to August 2003 (11 months)
Doctor of Philosophy (PhD), Mathematics @ The University of South Wales
From 2003 to 2006
Bachelor of Science (BSc), Computer Science @ Cardiff University / Prifysgol Caerdydd
From 1999 to 2002
Samuel Kemp is skilled in: R, Machine Learning, Hadoop, Data Mining, Quantitative Analytics, Data Analysis, Computational..., C++, Python, Algorithms, Java, Applied Mathematics, Quantitative Finance, Risk Management, Portfolio Management, Bloomberg, SQL, Software Engineering, Financial Modeling