The Art of Feature Engineering

Essentials for Machine Learning

by Pablo Duboue, PhD

This book is structured into two parts. The first part presents feature engineering ideas and approaches that are as domain independent as feature engineering can possibly be. The second part exemplifies different techniques in key domains through case studies.

Order on Amazon Download through Cambridge Core


Covers feature engineering material seldom presented in book format.

Summarizes in one place dozens of blogs, videos and forum posts under a unified view and nomenclature. The book references more than 300 sources.

Focuses on end-to-end performance rather than model performance.

Helps the practitioner obtain better end-to-end performance than tuning model parameters alone would.

Has a full section on variable-length feature data.

Helps the practitioner work with sets, lists, trees and graphs, which are traditionally problematic for statistical machine learning.

Uses multiple domains to teach a topic that is domain dependent.

Practitioners working on new domains can study solutions in other domains to help build new ones on their own. Note that each domain uses a different language, and the book bridges these interdisciplinary barriers.

Puts together a first-of-its-kind publicly available dataset for teaching feature engineering and uses it in four chapters.

It helps readers compare techniques across domains as different as text and images. Instructors can reuse the dataset for their class examples.

Contains the source code for all case studies, released open source.

Readers can look at the code for lower-level details; instructors can extend it and adapt it for their own classroom use.

Content Overview


Part I

This part focuses on domain-independent techniques and the overall process, where careful data analysis can steer practitioners away from bad assumptions and yield high-performing models.

Chapter 1


Definitions and processes

Topics: machine learning cycle, f-measure, precision, recall, error analysis, feature ideation, feature creation, feature extraction, feature engineering, domain modelling, data preparation

Chapter 2

Features, combined

Normalization, histograms and outliers

Topics: normalization, binning, outliers, outlier detection, histogram, descriptive statistics, whitening, zca whitening, scaling, standardization

Chapter 3

Features, expanded

Computable features, imputation and kernels

Topics: computable features, feature imputation, kernels, target rate encoding, one hot encoding, training expansion, tidy data

Chapter 4

Features, reduced

Feature selection, dimensionality reduction and embeddings

Topics: feature selection, feature utility, recursive feature elimination, ablation study, dimensionality reduction, lasso, elasticnet, embeddings, word2vec, non-negative matrix factorization

Chapter 5

Advanced topics

Variable-length vectors and automated feature engineering

Topics: variable-length feature vector, encoding lists, encoding sets, automated feature engineering, featuretools, deep learning, autoencoders


Case Studies

Part II

Tapping into domain expertise allows practitioners to avoid known problems in a target domain. This part seeks to learn from well-understood domains to help practitioners tackle new, less understood domains.

Chapter 6


Population prediction from DBpedia

Topics: graph data, machine learning on graphs, variable-length feature vector, one hot encoding examples, error analysis examples, exploratory data analysis examples, dbpedia, population prediction

Chapter 7

Timestamped data

Population prediction from history

Topics: timestamped data, time series, machine learning for time series, lagged features, autoregressive models, moving averages, windowing features, historical population prediction, arma models, arch models

Chapter 8


Population prediction from Wikipedia

Topics: natural language processing, information extraction, feature selection examples, mutual information, stemming, word embeddings, tsne, tf-idf, feature weighting

Chapter 9


Population prediction from satellite images

Topics: image processing, satellite images, non-photographic satellite images, image feature extraction, histograms, image gradients, histogram of gradients, local feature extractors, corner detection, nuisance variations

Chapter 10

Other domains

Video, geographic information and preferences

Topics: video feature engineering, GIS feature engineering, feature engineering for preferences, geographical information system, high performance feature extraction, keypoints, preference imputation
