Page 538 - Python Data Science Handbook

discriminative classification, 405-407
documentation, accessing
    IPython, 3-8, 98
    Pandas, 98
double question mark (??), 5
dropna() method, 125
dynamic typing, 34

E
eigenfaces, 442-445
ensemble estimator/method, 421
    (see also random forests)
ensemble learner, 421
equidistant cylindrical projection, 301
errors, visualizing
    basic errorbars, 238
    continuous quantities, 239
    Matplotlib, 237-240
Estimator API, 346-359
    basics, 347
    Iris classification example, 351
    Iris clustering example, 353
    Iris dimensionality example, 352
    simple linear regression example, 347-354
eval() function, 210-211
    DataFrame.eval() method and, 211-213
    pd.eval() function and, 210-211
    when to use, 214
exceptions, controlling, 20-22
expectation-maximization (E-M) algorithm
    caveats, 467-470
    GMM as generalization of, 480-484
    k-means clustering and, 465-476
exponentials, 55
external code, magic commands for running, 12

F
face recognition
    HOG, 506-514
    Isomap, 456-460
    PCA, 442-445
    SVMs, 416-420
faceted histograms, 318
factor plots, 319
fancy indexing, 78-85
    basics, 79
    binning data, 83
    combined with other indexing schemes, 80
    modifying values with, 82
    selection of random points, 81
feature engineering, 375-382
    categorical features, 376
    derived features, 378-380
    image features, 378
    imputation of missing data, 381
    processing pipeline, 381
    text features, 377
feature, data point, 334
features matrix, 344
fillna() method, 126
filter() method, 166
FiveThirtyEight stylesheet, 287
fixed-type arrays, 38

G
Gaussian basis functions, 394-396
Gaussian mixture models (GMMs), 476-491
    choosing covariance type, 484
    clustering with, 353
    density estimation algorithm, 484-488
    E–M generalization, 480-484
    handwritten data generation example, 488-491
    k-means weaknesses addressed by, 477-480
    KDE and, 491
Gaussian naive Bayes classification, 351, 357, 383-386, 510
Gaussian process regression (GPR), 239
generative models, 383
geographic data, 298
    Basemap toolkit for, 298
    California city population example, 308
    drawing a map background, 304-307
    map projections, 300-304
    plotting data on maps, 307
    surface temperature data example, 309
get() operation, 183
get_dummies() method, 183
ggplot stylesheet, 287
graphics libraries, 330
GroupBy aggregation, 170
GroupBy object, 163-165
    aggregate() method, 166
    apply() method, 167
    column indexing, 163
    dispatch methods, 164
    filter() method, 166

520 | Index
