Master of Science, Computer Science @
University of New England (AU)
I am a data scientist at Uncharted Software ("Tough problems solved visually"). Much of my work involves doing natural language processing (NLP) research with 'big data' aimed at identifying & quantifying crises & terrorist activity on social media (especially Twitter). I also lecture in computer science at The University of Toronto. I am an intellectually curious life-long learner who enjoys tackling tough problems.
• Graduate degrees in applied linguistics, computer science and applied computing.
• Well-honed communication and leadership skills. I have ~15 years of teaching experience (over
10,000 hours in the classroom) and have spoken at hundreds of public events.
• Security clearance under the Canadian Controlled Goods Program.
Natural Language Processing on text
• Sentiment analysis
• Entity detection
• Syntactic & Semantic writing similarity
• Recommendation engine
Apache Spark cluster computing
• Data wrangling (cleaning, parsing, exploring)
• Statistical analysis of big data
• Model building
Scala & Python
Bringing structure to unstructured data
• Discovery and communication of meaningful patterns in data
Undefined problems and huge amounts of unstructured data don't scare me, but I am not a 'rock star' or 'guru', nor am I interested in positions looking for one. Feel free to contact me, but I love the work I am currently doing and probably won't respond to other offers.
Things I like:
• Functional programming
• Apache Spark
• Machine Learning
• Natural Language Processing / Computational Linguistics
• a work-from-home policy
• a healthy work-life balance
Data Scientist @ Data scientist & NLP researcher doing analyses on highly sensitive data.
I collect, filter and curate textual data, build and refine representative NLP models and then distill and communicate the results.
Wrote a language detector capable of quickly identifying the main language used in a text sample.
Wrote linguistic analysis software for non-European languages:
• Arabic: transliteration, morphological categorization, lemmatization
• Russian: lemmatization
Implemented a multi-lingual sentiment analysis engine
• Analysed lexicon-based vs machine learning approaches to sentiment analysis
• Identified discriminative sentiment terms (positive and negative) in languages for which no such lists exist, using various correlation measures (Mutual Information, Pearson's R correlation coefficient, Dunning log likelihood, Damerau's relative frequency ratio, TF-IDF)
Explored entity detection to identify and track ISIL supporters and opponents on social media.
This work contributed to the DARPA Xdata project to identify people at risk of being recruited into terrorist networks.
Researched and compared novel ways of classifying and annotating raw (unlabeled, multilingual) data using machine learning vs heuristic rule-based approaches
• texts with sentiment vs texts with no sentiment
• news texts vs non-news texts
• tweets in support of ISIL vs tweets opposed to ISIL
• texts about a topic of interest vs texts not about a topic of interest
Lots of data munging, data wrangling, data cleaning, data exploration, data mining using Scala and Apache Spark on a Hadoop cluster along with Python and R.
From January 2015 to Present (11 months)
Lecturer @ Sessional university lecturer teaching csc108: Introduction to Computer Programming (Python)
Responsible for creating and delivering lectures.
Created programming assignments, tests and final exam.
Administered a Coursera online version of the course.
Supervised 5 teaching assistants, assigned duties and held grading meetings.
From May 2014 to May 2015 (1 year 1 month)
Data Scientist @ Built a knowledge graph using Freebase for use in entity detection and classification.
Researched solutions to the 'long tail' and 'cold start' problems of recommender systems: how to recommend content for which there is little user activity and how to make suggestions to new users.
• created an algorithm to identify authorship and writing style similarity
• explored alternative ways of measuring semantic and topical similarity between texts
Used statistical similarity metrics to quantify differences and improvements in the NLP algorithms explored (Jaccard's coefficient, TF-IDF, Pearson's R coefficient, Kullback-Leibler divergence)
Used supervised and unsupervised machine learning for classification and clustering
• K-means clustering
• Linear/Logistic Regression
• K-Nearest Neighbors classification
• Support Vector Machine classification
• Principal Component Analysis dimensionality reduction
Did data exploration, created ETL tasks, and curated data on Amazon Web Services (using Hadoop, Hive, Pig, PostgreSQL)
From May 2014 to December 2014 (8 months)
Master of Science in Applied Computing student @ I was a student in the professional CS master's (MScAC) program at the University of Toronto, focusing on natural language processing and machine learning
From September 2013 to December 2014 (1 year 4 months)
Assistant Professor @ Lectured in applied linguistics with a focus on second language reading and writing development. (Faculty of Letters)
Researched Virtual Learning Environments
Created an Automated Essay Evaluation software suite for computer assessment of writing.
Wrote a speed-reading module for the Moodle learning management system
From April 2012 to September 2013 (1 year 6 months)
Lecturer @ Sessional lecturer in applied linguistics
From April 2010 to September 2013 (3 years 6 months)
Lecturer @ Lectured in applied linguistics and civil liberties.
• Automated writing evaluation
• Statistical part-of-speech tagging on non-native vs. native texts
• Language policy and planning
• The use of mobile applications in education
From September 2005 to March 2012 (6 years 7 months)
Assistant Academic Director @ Responsible for hiring and supervising faculty, and curriculum development
Assisted the owners and business director with business development, education fairs, and marketing presentations
From February 2002 to February 2004 (2 years 1 month)
Master of Science (Professional Masters), Applied Computing @ University of Toronto From 2013 to 2014
Master of Science, Computer Science @ University of New England (AU) From 2008 to 2011
Master of Arts, Applied Linguistics @ University of New England (AU) From 2003 to 2005
B.A., Philosophy / Literature @ Mount Allison University From 1987 to 1993
High School @ Warwick Academy From 1980 to 1985
Craig Hagerman is skilled in: Research, Python, Higher Education, Teaching, Natural language processing, Machine learning, Objective-C, Data Analysis, RDF, Apache Pig, Entity Extraction, Big Data, SQL, Ontologies, Gensim