Page 541 - Python Data Science Handbook

  basics, 331-342
  categories of, 332
  classification task, 333-335
  clustering, 338-339
  decision trees and random forests, 421
  defined, 332
  dimensionality reduction, 340-342
  educational resources, 514
  face detection pipeline, 506-514
  feature engineering, 375-382
  GMM (see Gaussian mixture models)
  hyperparameters and model validation, 359-375
  KDE (see kernel density estimation)
  linear regression (see linear regression)
  manifold learning (see manifold learning)
  naive Bayes classification, 382-390
  PCA (see principal component analysis)
  qualitative examples, 333-342
  regression task, 335-338
  Scikit-Learn basics, 343
  supervised, 332
  SVMs (see support vector machines)
  unsupervised, 332
magic commands
  code block pasting, 11
  code execution timing, 12
  help commands, 13
  IPython input/output history, 16
  running external code, 12
  shell-related, 19
manifold learning, 445-462
  "HELLO" function, 446
  advantages/disadvantages, 455
  applying Isomap on faces data, 456-460
  defined, 446
  k-means clustering (see k-means clustering)
  multidimensional scaling, 450-452
  PCA vs., 455
  visualizing structure in digits, 460-462
many-to-one joins, 148
map projections, 300-304
  conic, 303
  cylindrical, 301
  perspective, 302
  pseudo-cylindrical, 302
maps, geographic (see geographic data)
margins, maximizing, 407-416
masking, 114 (see also Boolean masks)
  Boolean arrays, 75-78
  Boolean masks, 70-78
MATLAB-style interface, 222
Matplotlib, 217, 329
  axes limits for line plots, 228-230
  changing defaults via rcParams, 284
  colorbar customization, 255-262
  configurations and stylesheets, 282-290
  density and contour plots, 241-245
  error visualization, 237-240
  general tips, 218-222
  geographic data with Basemap toolkit, 298
  gotchas, 232
  histograms, binnings, and density, 245-249
  importing, 218
  interfaces, 222
  labeling simple line plots, 230-232
  line colors and styles, 226-228
  MATLAB-style interfaces, 222
  multiple subplots, 262-268
  object hierarchy of plots, 275
  object-oriented interfaces, 223
  plot customization, 282-284
  plot display contexts, 218-220
  plot legend customization, 249-255
  plotting from a script, 219
  plotting from IPython notebook, 220
  plotting from IPython shell, 219
  resources and documentation for, 329
  saving figures to file, 221
  Seaborn vs., 311-313
  setting styles, 218
  simple line plots, 224-232
  stylesheets, 285-290
  text and annotation, 268-275
  three-dimensional function visualization, 241-245
  three-dimensional plotting, 290-298
  tick customization, 275-282
max() function, 59
maximum margin estimator, 408 (see also support vector machines (SVMs))
memory use, profiling, 29
merge key
  on keyword, 149
  specification of, 149-152
merging, 146-158 (see also joins)

Index | 523
