Page 542 - Python Data Science Handbook


  key specification, 149-152
  relational algebra and, 146
  US state population data example, 154-158
min() function, 59
Miniconda, xiv
missing data, 120-124
  feature engineering and, 381
  handling, 119-120
  NaN and None, 123
  operating on null values in Pandas, 124-127
Möbius strip, 296-298
model (defined), 334
model parameters (defined), 334
model selection
  bias–variance trade-off, 364-366
  validation curves in Scikit-Learn, 366-370
model validation, 359-375
  bias–variance trade-off, 364-366
  cross-validation, 361-370
  grid search example, 373
  holdout sets, 360
  learning curves, 370-373
  naive approach to, 359
  validation curves, 366-370
modules, IPython, 6-7
Mollweide projection, 302
multi-indexing (see hierarchical indexing)
multidimensional scaling (MDS), 450-452
  basics, 447-450
  locally linear embedding and, 453-455
  nonlinear embeddings, 452
MultiIndex type, 129-131
  creation methods, 131-134
  data aggregations on, 140
  explicit constructors for, 132
  extra dimension of data with, 130
  for columns, 133
  index setting/resetting, 139
  indexing and slicing, 134-137
  keys option, 144
  level names, 133
  multiply indexed DataFrames, 136
  multiply indexed Series, 134
  rearranging, 137-140
  sorted/unsorted indices with, 137
  stacking/unstacking indices, 138
multinomial naive Bayes classification, 386-389

N

naive Bayes classification, 382-390
  advantages/disadvantages, 389
  Bayesian classification and, 383
  Gaussian, 383-386
  multinomial, 386-389
  text classification example, 386-389
NaN value, 104, 116, 122
navigation shortcuts, 8
neural networks, 513
noise filter, PCA as, 440-442
None object, 121, 123
nonlinear embeddings, MDS and, 452
notnull() method, 124
np.argsort() function, 86
np.concatenate() function, 48, 143
np.sort() function, 86
null values, 124-127
  detecting, 124
  dropping, 125
  filling, 126
NumPy, 33
  aggregations, 58-63
  array attributes, 42
  array basics, 42
  array indexing: accessing single elements, 43
  array slicing: accessing subarrays, 44
  Boolean masks, 70-78
  broadcasting, 63-69
  comparison operators as ufuncs, 71-73
  computation on arrays, 50-58
  data types in Python, 34
  datetime64 dtype, 189
  documentation, 34
  fancy indexing, 78-85
  keywords and/or vs. operators &/|, 77
  sorting arrays, 85-92
  standard data types, 41
  structured arrays, 92-96
  universal functions, 50-58

O

object-oriented interface, 223
offsets, time series, 196
on keyword, 149
one-hot encoding, 376
one-to-one joins, 147
optical character recognition
  digit classification, 357-358

524 | Index
